OptimWrapperDict
- class mmengine.optim.OptimWrapperDict(**optim_wrapper_dict)[source]
A dictionary container of OptimWrapper.

If the runner trains with multiple optimizers, all optimizer wrappers should be managed by OptimWrapperDict, which is built by CustomOptimWrapperConstructor. OptimWrapperDict will load and save the state dictionaries of all optimizer wrappers.

Considering the semantic ambiguity of calling update_params or backward() on all optimizer wrappers at once, OptimWrapperDict does not implement these methods.

Example
>>> import torch.nn as nn
>>> from torch.optim import SGD
>>> from mmengine.optim import OptimWrapperDict, OptimWrapper
>>> model1 = nn.Linear(1, 1)
>>> model2 = nn.Linear(1, 1)
>>> optim_wrapper1 = OptimWrapper(SGD(model1.parameters(), lr=0.1))
>>> optim_wrapper2 = OptimWrapper(SGD(model2.parameters(), lr=0.1))
>>> optim_wrapper_dict = OptimWrapperDict(model1=optim_wrapper1,
...                                       model2=optim_wrapper2)
Note

The optimizer wrappers contained in OptimWrapperDict can be accessed in the same way as a dict (see the sketch below).
- Parameters:
**optim_wrapper_dict – A dictionary of OptimWrapper instances.
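As a quick illustration of the note above, here is a minimal sketch of dict-style access, reusing the models and optimizer wrappers from the class-level example; the keys model1 and model2 are simply the keyword names passed to the constructor:

>>> import torch.nn as nn
>>> from torch.optim import SGD
>>> from mmengine.optim import OptimWrapperDict, OptimWrapper
>>> model1, model2 = nn.Linear(1, 1), nn.Linear(1, 1)
>>> optim_wrapper_dict = OptimWrapperDict(
...     model1=OptimWrapper(SGD(model1.parameters(), lr=0.1)),
...     model2=OptimWrapper(SGD(model2.parameters(), lr=0.1)))
>>> wrapper1 = optim_wrapper_dict['model1']  # dict-style lookup by name
>>> wrapper1.zero_grad()                     # call OptimWrapper methods on it directly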
- backward(loss, **kwargs)[source]
Since OptimWrapperDict does not know which optimizer wrapper's backward method should be called (loss_scaler may differ between AmpOptimWrapper instances), this method is not implemented. The corresponding optimizer wrapper should be accessed from OptimWrapperDict and its backward called directly, as sketched below.
- Parameters:
loss (Tensor) –
- Return type:
None
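Because backward() is deliberately left unimplemented here, the intended pattern is to fetch each wrapper and drive its own backward, step and zero_grad. A hedged sketch, assuming the optim_wrapper_dict, model1 and model2 from the class-level example:

>>> import torch
>>> x = torch.randn(4, 1)
>>> loss1 = model1(x).mean()
>>> loss2 = model2(x).mean()
>>> optim_wrapper_dict['model1'].backward(loss1)  # drive each wrapper yourself
>>> optim_wrapper_dict['model2'].backward(loss2)
>>> for wrapper in optim_wrapper_dict.values():   # then step and reset gradients
...     wrapper.step()
...     wrapper.zero_grad()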
- initialize_count_status(model, cur_iter, max_iters)[source]
Do nothing but provide a unified interface with OptimWrapper. Since OptimWrapperDict does not know the correspondence between models and optimizer wrappers, initialize_count_status does nothing here, and each optimizer wrapper should call initialize_count_status separately, as shown in the sketch below.
- Parameters:
model (Module) –
- Return type:
None
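A sketch of the per-wrapper call suggested above, assuming the setup from the class-level example; the pairing of each wrapper name with its model, and the iteration counts, are illustrative assumptions:

>>> for name, model in (('model1', model1), ('model2', model2)):
...     # each wrapper tracks its own iteration/accumulation status for the
...     # model its optimizer actually updates (pairing assumed for this sketch)
...     optim_wrapper_dict[name].initialize_count_status(model, 0, 1000)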
- items()[source]
A generator to get the names and corresponding OptimWrapper instances.
- keys()[source]
A generator to get the names of the OptimWrapper instances.
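A brief sketch of iterating over the container with keys() and items(), assuming the setup from the class-level example (dict insertion order is preserved, so the expected output follows the construction order):

>>> list(optim_wrapper_dict.keys())
['model1', 'model2']
>>> for name, wrapper in optim_wrapper_dict.items():
...     print(name, type(wrapper).__name__)
model1 OptimWrapper
model2 OptimWrapper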
- load_state_dict(state_dict)[source]
Load the state dictionaries from state_dict.
- Parameters:
state_dict (dict) – Each key-value pair in state_dict represents the name and the state dictionary of the corresponding OptimWrapper.
- Return type:
None
- optim_context(model)[source]
optim_context should be called on each optimizer wrapper separately; see the sketch below.
- Parameters:
model (Module) –
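Since optim_context is a context manager on each OptimWrapper (mainly relevant to gradient accumulation and AMP), a sketch of the per-wrapper usage, assuming the setup from the class-level example:

>>> import torch
>>> x = torch.randn(4, 1)
>>> with optim_wrapper_dict['model1'].optim_context(model1):
...     loss1 = model1(x).mean()
>>> optim_wrapper_dict['model1'].update_params(loss1)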
- property param_groups
Returns the parameter groups of each OptimWrapper.
- state_dict()[source]
Get the state dictionary of all optimizer wrappers.
- Returns:
Each key-value pair in the dictionary represents the name and state dictionary of the corresponding OptimWrapper.
- Return type:
dict
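A minimal round-trip sketch combining state_dict() and load_state_dict(); the checkpoint file name is purely illustrative:

>>> import torch
>>> ckpt = optim_wrapper_dict.state_dict()    # {'model1': {...}, 'model2': {...}}
>>> torch.save(ckpt, 'optim_wrappers.pth')    # hypothetical file name
>>> optim_wrapper_dict.load_state_dict(torch.load('optim_wrappers.pth'))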
- step(**kwargs)[source]
Since the backward method is not implemented, step is not implemented either.
- Return type:
None
- update_params(loss, step_kwargs=None, zero_kwargs=None)[source]
Updating all optimizer wrappers at once would lead to duplicate backward errors, and OptimWrapperDict does not know which optimizer wrapper should be updated. Therefore, this method is not implemented. The corresponding optimizer wrapper should be accessed from OptimWrapperDict and its update_params called directly, as sketched below.
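A sketch of the per-wrapper update suggested above, assuming the setup from the class-level example; on a single OptimWrapper, update_params performs backward and, when an update is due, step and zero_grad:

>>> import torch
>>> x = torch.randn(4, 1)
>>> optim_wrapper_dict['model1'].update_params(model1(x).mean())
>>> optim_wrapper_dict['model2'].update_params(model2(x).mean())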
- values()[source]
A generator to get the OptimWrapper instances.