BaseStrategy¶
- class mmengine._strategy.BaseStrategy(*, work_dir='work_dirs', experiment_name=None, env_kwargs=None, log_kwargs=None, auto_scale_lr=None)[source]¶
Base class for all strategies.
In the process of supporting FSDP, DeepSpeed, and ColossalAI, the scalability of the Runner faced challenges, which led to the redefinition of the Runner’s responsibilities. The Strategy abstraction was split out, which is responsible for constructing, initializing, and saving/loading the state of training components such as models, optimizers, and parameter schedulers.
Warning
This is an experimental feature, and its interface is subject to change.
- Keyword Arguments:
work_dir (str) – The working directory to save checkpoints. The logs will be saved in a subdirectory of work_dir named after the timestamp. Defaults to 'work_dirs'.
experiment_name (str, optional) – Name of the current experiment. If not specified, the timestamp will be used as the experiment_name. Defaults to None.
env_kwargs (dict, optional) – Environment config passed to setup_env(). Defaults to None.
log_kwargs (dict, optional) – Logger config passed to build_logger(). Defaults to None.
auto_scale_lr (dict, optional) – Config to scale the learning rate automatically. It includes base_batch_size and enable. base_batch_size is the batch size that the optimizer lr is based on, and enable is the switch to turn the feature on and off.
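A minimal construction sketch is given below. BaseStrategy itself is abstract, so the keyword arguments are shown being passed to a concrete subclass; SingleDeviceStrategy and the experiment name are used purely for illustration.
>>> from mmengine._strategy import SingleDeviceStrategy
>>> strategy = SingleDeviceStrategy(
...     work_dir='work_dirs',
...     experiment_name='example_experiment',  # illustrative name
...     log_kwargs=dict(log_level='INFO'),
...     auto_scale_lr=dict(enable=True, base_batch_size=256))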
- build_logger(log_level='INFO', log_file=None, **kwargs)[source]¶
Build a globally accessible MMLogger.
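A small usage sketch on a concrete strategy instance (the log file path is illustrative):
>>> logger = strategy.build_logger(log_level='DEBUG', log_file='work_dirs/run.log')
>>> logger.info('logger ready')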
- build_model(model)[source]¶
Build model.
If model is a dict, it will be used to build an nn.Module object. Otherwise, if model is an nn.Module object, it will be returned directly.
An example of model:
model = dict(type='ResNet')
- Parameters:
model (nn.Module or dict) – An nn.Module object or a dict to build an nn.Module object. If model is an nn.Module object, it is returned as-is.
- Return type:
nn.Module
Note
The returned model must implement train_step if runner.train will be called, and test_step if runner.test will be called. If runner.val will be called or val_cfg is configured, the model must also implement val_step.
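A usage sketch, assuming 'ResNet' is registered in the model registry as in the example above:
>>> model = strategy.build_model(dict(type='ResNet'))
>>> import torch.nn as nn
>>> strategy.build_model(nn.Linear(8, 2))  # an existing nn.Module is returned unchanged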
- build_optim_wrapper(optim_wrapper, model=None)[source]¶
Build optimizer wrapper.
If optim_wrapper is a config dict for only one optimizer, the keys must contain optimizer, and type is optional. It will build an OptimWrapper by default.
If optim_wrapper is a config dict for multiple optimizers, i.e., it has multiple keys and each key is for one optimizer wrapper, the constructor must be specified, since DefaultOptimizerConstructor cannot handle building training with multiple optimizers.
If optim_wrapper is a dict of pre-built optimizer wrappers, i.e., each value of optim_wrapper is an OptimWrapper instance, build_optim_wrapper will directly build an OptimWrapperDict instance from optim_wrapper.
- Parameters:
optim_wrapper (OptimWrapper or dict) – An OptimWrapper object, a config dict for one or more optimizers, or a dict of pre-built OptimWrapper instances, as described above.
model (nn.Module, optional) – The model whose parameters the optimizer wrapper will update. Defaults to None.
- Return type:
BaseOptimWrapper
Note
For single-optimizer training, if optim_wrapper is a config dict, type is optional (defaults to OptimWrapper) and it must contain optimizer to build the corresponding optimizer.
Examples
>>> # build an optimizer
>>> optim_wrapper_cfg = dict(type='OptimWrapper', optimizer=dict(
...     type='SGD', lr=0.01))
>>> # optim_wrapper_cfg = dict(optimizer=dict(type='SGD', lr=0.01))
>>> # is also valid.
>>> optim_wrapper = runner.build_optim_wrapper(optim_wrapper_cfg)
>>> optim_wrapper
Type: OptimWrapper
accumulative_counts: 1
optimizer:
SGD (
Parameter Group 0
    dampening: 0
    lr: 0.01
    momentum: 0
    nesterov: False
    weight_decay: 0
)
>>> # build optimizer without `type`
>>> optim_wrapper_cfg = dict(optimizer=dict(type='SGD', lr=0.01))
>>> optim_wrapper = runner.build_optim_wrapper(optim_wrapper_cfg)
>>> optim_wrapper
Type: OptimWrapper
accumulative_counts: 1
optimizer:
SGD (
Parameter Group 0
    dampening: 0
    lr: 0.01
    maximize: False
    momentum: 0
    nesterov: False
    weight_decay: 0
)
>>> # build multiple optimizers
>>> optim_wrapper_cfg = dict(
...     generator=dict(type='OptimWrapper', optimizer=dict(
...         type='SGD', lr=0.01)),
...     discriminator=dict(type='OptimWrapper', optimizer=dict(
...         type='Adam', lr=0.001)),
...     # need to customize a multiple optimizer constructor
...     constructor='CustomMultiOptimizerConstructor')
>>> optim_wrapper = runner.build_optim_wrapper(optim_wrapper_cfg)
>>> optim_wrapper
name: generator
Type: OptimWrapper
accumulative_counts: 1
optimizer:
SGD (
Parameter Group 0
    dampening: 0
    lr: 0.01
    momentum: 0
    nesterov: False
    weight_decay: 0
)
name: discriminator
Type: OptimWrapper
accumulative_counts: 1
optimizer:
Adam (
Parameter Group 0
    betas: (0.9, 0.999)
    eps: 1e-08
    lr: 0.001
    weight_decay: 0
)
Important
If you need to build multiple optimizers, you should implement a MultiOptimWrapperConstructor which gets the parameters passed to the corresponding optimizers and composes the OptimWrapperDict. More details about how to customize an OptimizerConstructor can be found in the optimizer documentation.
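A minimal sketch of such a constructor, assuming a model with generator and discriminator submodules; the class name and the exact config layout handed to the constructor are assumptions here, not MMEngine's fixed API:
>>> from mmengine.optim import OptimWrapperDict, build_optim_wrapper
>>> from mmengine.registry import OPTIM_WRAPPER_CONSTRUCTORS
>>> @OPTIM_WRAPPER_CONSTRUCTORS.register_module()
... class CustomMultiOptimizerConstructor:
...     def __init__(self, optim_wrapper_cfg, paramwise_cfg=None):
...         # assume optim_wrapper_cfg maps submodule names to OptimWrapper configs
...         self.optim_wrapper_cfg = optim_wrapper_cfg
...     def __call__(self, model):
...         # build one OptimWrapper per submodule and compose them
...         wrappers = {
...             name: build_optim_wrapper(getattr(model, name), cfg)
...             for name, cfg in self.optim_wrapper_cfg.items()}
...         return OptimWrapperDict(**wrappers)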
- build_param_scheduler(scheduler, optim_wrapper, default_args=None)[source]¶
Build parameter schedulers.
build_param_scheduler should be called after build_optim_wrapper because the building logic changes according to the number of optimizers built by the runner. The cases are as below:
Single optimizer: When only one optimizer is built and used in the runner, build_param_scheduler will return a list of parameter schedulers.
Multiple optimizers: When two or more optimizers are built and used in the runner, build_param_scheduler will return a dict containing the same keys as the optimizers, where each value is a list of parameter schedulers. Note that if you want different optimizers to use different parameter schedulers to update their hyper-parameters, the input parameter scheduler also needs to be a dict whose keys are consistent with the optimizer keys (a dict-config sketch is given at the end of this entry). Otherwise, the same parameter schedulers will be used to update the hyper-parameters of all optimizers.
- Parameters:
scheduler (_ParamScheduler or dict or list) – A Param Scheduler object or a dict or list of dict to build parameter schedulers.
optim_wrapper (BaseOptimWrapper) –
default_args (dict | None) –
Examples
>>> # build one scheduler
>>> optim_cfg = dict(optimizer=dict(type='SGD', lr=0.01))
>>> runner.optim_wrapper = runner.build_optim_wrapper(optim_cfg)
>>> scheduler_cfg = dict(type='MultiStepLR', milestones=[1, 2])
>>> schedulers = runner.build_param_scheduler(scheduler_cfg)
>>> schedulers
[<mmengine.optim.scheduler.lr_scheduler.MultiStepLR at 0x7f70f6966290>]
>>> # build multiple schedulers
>>> scheduler_cfg = [
...     dict(type='MultiStepLR', milestones=[1, 2]),
...     dict(type='StepLR', step_size=1)
... ]
>>> schedulers = runner.build_param_scheduler(scheduler_cfg)
>>> schedulers
[<mmengine.optim.scheduler.lr_scheduler.MultiStepLR at 0x7f70f60dd3d0>,
 <mmengine.optim.scheduler.lr_scheduler.StepLR at 0x7f70f6eb6150>]
The above examples only cover the case of one optimizer with one or more schedulers. If you want to know how to set parameter schedulers when using multiple optimizers, you can find more examples in the optimizer documentation.
- Returns:
List of parameter schedulers or a dictionary containing lists of parameter schedulers built from scheduler.
- Return type:
list[_ParamScheduler] or dict[str, list[_ParamScheduler]]
- Parameters:
scheduler (_ParamScheduler | Dict | List) –
optim_wrapper (BaseOptimWrapper) –
default_args (dict | None) –
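A hedged sketch of the multi-optimizer case mentioned above: if the optimizer wrappers were built with generator and discriminator keys, the scheduler config can mirror those keys so that each optimizer gets its own schedulers (the key names are illustrative):
>>> scheduler_cfg = dict(
...     generator=dict(type='MultiStepLR', milestones=[1, 2]),
...     discriminator=dict(type='StepLR', step_size=1))
>>> schedulers = strategy.build_param_scheduler(scheduler_cfg, optim_wrapper)
>>> # schedulers is then a dict: {'generator': [...], 'discriminator': [...]}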
- abstract load_checkpoint(filename, *, map_location='cpu', strict=False, revise_keys=[('^module.', '')], callback=None)[source]¶
Load a checkpoint from the given filename.
- Parameters:
filename (str) – Path of the checkpoint file to load.
- Keyword Arguments:
map_location (str or callable) – A string or a callable function to specifying how to remap storage locations. Defaults to ‘cpu’.
strict (bool) – Whether to allow different params for the model and checkpoint.
revise_keys (list) – A list of customized keywords to modify the state_dict in the checkpoint. Each item is a (pattern, replacement) pair of regular expression operations. Defaults to stripping the prefix ‘module.’ via [(r’^module.’, ‘’)].
callback (callable, optional) – Callback function to modify the checkpoint after loading it. Defaults to None.
- Return type:
dict
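A usage sketch on a concrete strategy (the checkpoint path is illustrative):
>>> ckpt = strategy.load_checkpoint(
...     'work_dirs/epoch_10.pth',
...     map_location='cpu',
...     revise_keys=[(r'^module\.', '')])  # strip the 'module.' prefix from DDP weights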
- load_model_state_dict(state_dict, *, strict=False, revise_keys=[('^module.', '')])[source]¶
Load model state from dict.
- load_optim_state_dict(state_dict)[source]¶
Load optimizer state from dict.
- Parameters:
state_dict (dict) –
- Return type:
None
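A hedged sketch of restoring states from a raw checkpoint dict; the 'state_dict' and 'optimizer' keys follow the usual MMEngine checkpoint layout and are assumptions here:
>>> import torch
>>> checkpoint = torch.load('work_dirs/epoch_10.pth', map_location='cpu')
>>> strategy.load_model_state_dict(checkpoint['state_dict'], strict=False)
>>> strategy.load_optim_state_dict(checkpoint['optimizer'])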
- load_or_resume(*, load_from=None, resume=False)[source]¶
Load checkpoint or resume from checkpoint.
- Parameters:
load_from (str, optional) – The checkpoint file to load from. Defaults to None.
resume (bool or str) – Whether to resume training. Defaults to False. If resume is True and load_from is None, the latest checkpoint in work_dir is found and resumed from automatically; if none is found, resuming does nothing. If resume is a string, it will be treated as the checkpoint file to resume from.
- Return type:
dict | None
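A usage sketch covering both modes (the path is illustrative):
>>> strategy.load_or_resume(load_from='work_dirs/epoch_10.pth', resume=False)  # load weights only
>>> strategy.load_or_resume(resume=True)  # auto-resume from the latest checkpoint in work_dir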
- abstract prepare(model, *, optim_wrapper=None, param_scheduler=None, compile=False, dispatch_kwargs=None)[source]¶
Prepare the model and other components such as the optimizer wrapper and parameter schedulers.
- Parameters:
model (torch.nn.Module or dict) – The model to be run. It can be a dict used for building a model.
optim_wrapper (BaseOptimWrapper | dict | None) –
param_scheduler (_ParamScheduler | Dict | List | None) –
dispatch_kwargs (dict | None) –
- Keyword Arguments:
optim_wrapper (BaseOptimWrapper or dict, optional) – Computes the gradients of model parameters and updates them. Defaults to None. See build_optim_wrapper() for examples.
param_scheduler (_ParamScheduler or dict or list, optional) – Parameter scheduler for updating optimizer hyper-parameters. If specified, optim_wrapper should also be specified. Defaults to None. See build_param_scheduler() for examples.
compile (dict or bool, optional) – Config to compile the model. Defaults to False. Requires PyTorch >= 2.0.
dispatch_kwargs (dict, optional) – Kwargs to be passed to other methods of Strategy. Defaults to None.
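A hedged sketch of a typical call on a concrete strategy; the model and optimizer configs mirror the build_* examples above, and the dispatch_kwargs keys are illustrative and depend on the concrete strategy:
>>> strategy.prepare(
...     dict(type='ResNet'),
...     optim_wrapper=dict(type='OptimWrapper', optimizer=dict(type='SGD', lr=0.01)),
...     param_scheduler=dict(type='MultiStepLR', milestones=[1, 2]),
...     compile=False,
...     dispatch_kwargs=dict(max_iters=1000))  # illustrative keys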
- abstract resume(filename, *, resume_optimizer=True, resume_param_scheduler=True, map_location='default', callback=None)[source]¶
Resume training from the given filename.
Four types of states will be resumed:
model state
optimizer state
scheduler state
randomness state
- Parameters:
filename (str) – Path of the checkpoint file to resume from.
- Keyword Arguments:
resume_optimizer (bool) – Whether to resume optimizer state. Defaults to True.
resume_param_scheduler (bool) – Whether to resume param scheduler state. Defaults to True.
map_location (str or callable) – A string or a callable function to specifying how to remap storage locations. Defaults to ‘default’.
callback (callable, optional) – Callback function to modify the checkpoint after loading it. Defaults to None.
- Return type:
dict
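A usage sketch (the path is illustrative):
>>> checkpoint = strategy.resume(
...     'work_dirs/epoch_10.pth',
...     resume_optimizer=True,
...     resume_param_scheduler=True)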
- abstract save_checkpoint(filename, *, save_optimizer=True, save_param_scheduler=True, extra_ckpt=None, callback=None)[source]¶
Save a checkpoint to the given filename.
- Parameters:
filename (str) – Path of the file to save the checkpoint to.
- Keyword Arguments:
save_optimizer (bool) – Whether to save the optimizer to the checkpoint. Defaults to True.
save_param_scheduler (bool) – Whether to save the param_scheduler to the checkpoint. Defaults to True.
extra_ckpt (dict, optional) – Extra checkpoint to save. Defaults to None.
callback (callable, optional) – Callback function to modify the checkpoint before saving it. Defaults to None.
- Return type:
None
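A usage sketch; the extra_ckpt contents are illustrative:
>>> strategy.save_checkpoint(
...     'work_dirs/epoch_11.pth',
...     save_optimizer=True,
...     save_param_scheduler=True,
...     extra_ckpt=dict(meta=dict(epoch=11)))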