BaseStrategy

class mmengine._strategy.BaseStrategy(*, work_dir='work_dirs', experiment_name=None, env_kwargs=None, log_kwargs=None, auto_scale_lr=None)[source]

Base class for all strategies.

While adding support for FSDP, DeepSpeed, and ColossalAI, the Runner proved hard to extend, which led to a redefinition of its responsibilities. The Strategy abstraction was split out of the Runner; it is responsible for constructing, initializing, and saving/loading the state of training components such as models, optimizers, and parameter schedulers.

Warning

This is an experimental feature, and its interface is subject to change.

Keyword Arguments

work_dir (str) – The working directory in which checkpoints are saved. Logs are saved in a subdirectory of work_dir named after the timestamp. Defaults to 'work_dirs'.
experiment_name (str, optional) – Name of the current experiment. If not specified, the timestamp is used as experiment_name. Defaults to None.
env_kwargs (dict, optional) – Environment config passed to setup_env(). Defaults to None.
log_kwargs (dict, optional) – Logger config passed to build_logger(). Defaults to None.
auto_scale_lr (dict, optional) – Config for scaling the learning rate automatically. It contains two keys, base_batch_size and enable: base_batch_size is the batch size that the optimizer lr is based on, and enable switches the feature on or off. Defaults to None.

Parameters

work_dir (str) –
experiment_name (Optional[str]) –
env_kwargs (Optional[dict]) –
log_kwargs (Optional[dict]) –
auto_scale_lr (Optional[dict]) –
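
The auto_scale_lr description reads like the linear scaling rule: the learning rate is multiplied by the ratio of the actual batch size to base_batch_size. The sketch below illustrates that reading only. scale_lr is a hypothetical helper, not part of the library, and real_batch_size is assumed to be the total batch size across all devices.

```python
def scale_lr(lr, real_batch_size, auto_scale_lr=None):
    """Return the learning rate after (assumed) linear scaling."""
    # No config, or the feature is switched off: keep lr unchanged.
    if not auto_scale_lr or not auto_scale_lr.get('enable', False):
        return lr
    # base_batch_size is the batch size the configured lr was tuned for.
    base_batch_size = auto_scale_lr['base_batch_size']
    return lr * real_batch_size / base_batch_size


cfg = dict(base_batch_size=16, enable=True)
scaled = scale_lr(0.01, real_batch_size=64, auto_scale_lr=cfg)
unchanged = scale_lr(0.01, real_batch_size=64, auto_scale_lr=dict(
    base_batch_size=16, enable=False))
print(scaled, unchanged)
```

With enable set to False, or with no config at all, the helper leaves the learning rate untouched, which matches the description of enable as an on/off switch.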

build_logger(log_level='INFO', log_file=None, **kwargs)[source]

Build a globally accessible MMLogger.

Parameters

log_level (int or str) – The log level of MMLogger handlers. Defaults to 'INFO'.
log_file (str, optional) – Path of the file to save logs to. Defaults to None.
**kwargs – Remaining parameters passed to MMLogger.

Returns

An MMLogger object built from the given arguments.

Return type

MMLogger

build_model(model)[source]

Build model.

If model is a dict, it will be used to build an nn.Module object. Otherwise, if model is already an nn.Module object, it will be returned directly.

An example of model:

model = dict(type='ResNet')

Parameters: model (nn.Module or dict) – An nn.Module object or a dict used to build one. If model is an nn.Module object, it is returned as-is.
Return type: torch.nn.modules.module.Module

Note

The returned model must implement train_step and test_step if runner.train or runner.test will be called. If runner.val will be called or val_cfg is configured, the model must also implement val_step.

Returns: Model built from model.
Return type: nn.Module
Parameters: model (Union[torch.nn.modules.module.Module, dict]) –
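
The dict-or-instance behaviour described above can be sketched without any dependencies. This is an illustration of the pattern, not the library's implementation: MODEL_REGISTRY and this build_model are hypothetical stand-ins, and the ResNet class here is a plain Python placeholder rather than an nn.Module.

```python
class ResNet:
    """Placeholder model class; a real model would subclass nn.Module."""

    def __init__(self, depth=50):
        self.depth = depth


# Hypothetical registry mapping the `type` string to a class.
MODEL_REGISTRY = {'ResNet': ResNet}


def build_model(model):
    # A dict is treated as a config: `type` selects the class and the
    # remaining keys become constructor arguments.
    if isinstance(model, dict):
        cfg = dict(model)  # copy so the caller's config is not mutated
        cls = MODEL_REGISTRY[cfg.pop('type')]
        return cls(**cfg)
    # Anything else is assumed to be an already-built model.
    return model


built = build_model(dict(type='ResNet', depth=18))
same = build_model(built)  # an instance is passed through untouched
```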

build_optim_wrapper(optim_wrapper, model=None)[source]

Build optimizer wrapper.

If optim_wrapper is a config dict for a single optimizer, it must contain the key optimizer, and type is optional. An OptimWrapper is built by default.

If optim_wrapper is a config dict for multiple optimizers, i.e., it has multiple keys and each key corresponds to an optimizer wrapper, the constructor must be specified, since DefaultOptimizerConstructor cannot build multiple optimizers.

If optim_wrapper is a dict of pre-built optimizer wrappers, i.e., each value of optim_wrapper is an OptimWrapper instance, build_optim_wrapper will build the OptimWrapperDict instance directly from optim_wrapper.
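
The three input shapes described above can be told apart by inspecting the argument. The following is a rough, dependency-free sketch of that dispatch and not the library's actual code: FakeOptimWrapper and classify_optim_wrapper are hypothetical names, and the order of the checks is an assumption.

```python
class FakeOptimWrapper:
    """Hypothetical stand-in for a pre-built optimizer wrapper."""


def classify_optim_wrapper(optim_wrapper):
    """Name the case an `optim_wrapper` argument falls into (sketch only)."""
    if isinstance(optim_wrapper, FakeOptimWrapper):
        return 'prebuilt wrapper'
    if not isinstance(optim_wrapper, dict):
        raise TypeError('expected a wrapper instance or a dict')
    # A config for one optimizer must carry the `optimizer` key.
    if 'optimizer' in optim_wrapper:
        return 'single-optimizer config'
    # A config for several optimizers must name a custom constructor.
    if 'constructor' in optim_wrapper:
        return 'multi-optimizer config'
    # Otherwise every value has to be an already-built wrapper.
    values = list(optim_wrapper.values())
    if values and all(isinstance(v, FakeOptimWrapper) for v in values):
        return 'dict of prebuilt wrappers'
    raise ValueError('multi-optimizer configs must specify a constructor')


single = classify_optim_wrapper(dict(optimizer=dict(type='SGD', lr=0.01)))
multi = classify_optim_wrapper(dict(
    generator=dict(type='OptimWrapper', optimizer=dict(type='SGD', lr=0.01)),
    discriminator=dict(type='OptimWrapper',
                       optimizer=dict(type='Adam', lr=0.001)),
    constructor='CustomMultiOptimizerConstructor',
))
prebuilt = classify_optim_wrapper(dict(
    generator=FakeOptimWrapper(), discriminator=FakeOptimWrapper()))
```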

Parameters

optim_wrapper (BaseOptimWrapper or dict) – An OptimWrapper object or a dict used to build OptimWrapper objects. If optim_wrapper is already an OptimWrapper, it is returned directly.
model (Optional[torch.nn.modules.module.Module]) –

Return type

mmengine.optim.optimizer.base.BaseOptimWrapper

Note

For single-optimizer training, if optim_wrapper is a config dict, type is optional (defaults to OptimWrapper) and the dict must contain optimizer to build the corresponding optimizer.

Examples

>>> # build an optimizer
>>> optim_wrapper_cfg = dict(type='OptimWrapper', optimizer=dict(
...     type='SGD', lr=0.01))
>>> # optim_wrapper_cfg = dict(optimizer=dict(type='SGD', lr=0.01))
>>> # is also valid.
>>> optim_wrapper = runner.build_optim_wrapper(optim_wrapper_cfg)
>>> optim_wrapper
Type: OptimWrapper
accumulative_counts: 1
optimizer:
SGD (
Parameter Group 0
    dampening: 0
    lr: 0.01
    momentum: 0
    nesterov: False
    weight_decay: 0
)
>>> # build optimizer without `type`
>>> optim_wrapper_cfg = dict(optimizer=dict(type='SGD', lr=0.01))
>>> optim_wrapper = runner.build_optim_wrapper(optim_wrapper_cfg)
>>> optim_wrapper
Type: OptimWrapper
accumulative_counts: 1
optimizer:
SGD (
Parameter Group 0
    dampening: 0
    lr: 0.01
    maximize: False
    momentum: 0
    nesterov: False
    weight_decay: 0
)
>>> # build multiple optimizers
>>> optim_wrapper_cfg = dict(
...    generator=dict(type='OptimWrapper', optimizer=dict(
...        type='SGD', lr=0.01)),
...    discriminator=dict(type='OptimWrapper', optimizer=dict(
...        type='Adam', lr=0.001)),
...    # need to customize a multiple optimizer constructor
...    constructor='CustomMultiOptimizerConstructor',
... )
>>> optim_wrapper = runner.build_optim_wrapper(optim_wrapper_cfg)
>>> optim_wrapper
name: generator
Type: OptimWrapper
accumulative_counts: 1
optimizer:
SGD (
Parameter Group 0
    dampening: 0
    lr: 0.01
    momentum: 0
    nesterov: False
    weight_decay: 0
)
name: discriminator
Type: OptimWrapper
accumulative_counts: 1
optimizer:
'discriminator': Adam (
Parameter Group 0
    dampening: 0
    lr: 0.001
    momentum: 0
    nesterov: False
    weight_decay: 0
)

Important

If you need to build multiple optimizers, you should implement a MultiOptimWrapperConstructor that collects the parameters to pass to the corresponding optimizers and composes the OptimWrapperDict. More details about how to customize OptimizerConstructor can be found in optimizer-docs.

Returns

Optimizer wrapper built from optim_wrapper.

Return type

BaseOptimWrapper

Parameters

optim_wrapper (Union[torch.optim.optimizer.Optimizer, mmengine.optim.optimizer.base.BaseOptimWrapper, dict]) –
model (Optional[torch.nn.modules.module.Module]) –

build_param_scheduler(scheduler, optim_wrapper, default_args=None)[source]

Build parameter schedulers.

build_param_scheduler should be called after build_optim_wrapper because the building logic depends on the number of optimizers built by the runner. There are two cases:

Single optimizer: When only one optimizer is built and used in the runner, build_param_scheduler returns a list of parameter schedulers.
Multiple optimizers: When two or more optimizers are built and used in the runner, build_param_scheduler returns a dict with the same keys as the optimizers, where each value is a list of parameter schedulers. Note that if you want different optimizers to use different parameter schedulers, the input scheduler must also be a dict whose keys match those of the optimizers. Otherwise, the same parameter schedulers are used to update the hyper-parameters of every optimizer.
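
The shape of the return value in the two cases above can be sketched with plain config dicts. This is only an illustration of the list-versus-dict structure: plan_schedulers is a hypothetical helper that returns scheduler configs rather than built scheduler objects, and matching the dict keys against the optimizer names is an assumed way of detecting per-optimizer configs.

```python
import copy


def plan_schedulers(scheduler_cfg, optimizer_names):
    """Arrange scheduler configs as a list or a dict of lists (sketch)."""

    def as_list(cfg):
        # Normalize one config or a list of configs to a fresh list.
        return copy.deepcopy(cfg if isinstance(cfg, list) else [cfg])

    # Single optimizer: a flat list of schedulers.
    if len(optimizer_names) == 1:
        return as_list(scheduler_cfg)
    # Multiple optimizers with per-optimizer configs: keys must match.
    if (isinstance(scheduler_cfg, dict)
            and set(scheduler_cfg) == set(optimizer_names)):
        return {name: as_list(scheduler_cfg[name])
                for name in optimizer_names}
    # Multiple optimizers sharing the same scheduler configs.
    return {name: as_list(scheduler_cfg) for name in optimizer_names}


names = ['generator', 'discriminator']
single = plan_schedulers(
    dict(type='MultiStepLR', milestones=[1, 2]), ['default'])
shared = plan_schedulers(dict(type='StepLR', step_size=1), names)
separate = plan_schedulers(
    dict(generator=dict(type='MultiStepLR', milestones=[1, 2]),
         discriminator=[dict(type='StepLR', step_size=1)]),
    names)
```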

Parameters

scheduler (_ParamScheduler or dict or list) – A parameter scheduler object, a dict, or a list of dicts used to build parameter schedulers.
optim_wrapper (mmengine.optim.optimizer.base.BaseOptimWrapper) –
default_args (Optional[dict]) –

Return type

Union[List[mmengine.optim.scheduler.param_scheduler._ParamScheduler], Dict[str, List[mmengine.optim.scheduler.param_scheduler._ParamScheduler]]]

Examples

>>> # build one scheduler
>>> optim_cfg = dict(optimizer=dict(type='SGD', lr=0.01))
>>> runner.optim_wrapper = runner.build_optim_wrapper(
...     optim_cfg)
>>> scheduler_cfg = dict(type='MultiStepLR', milestones=[1, 2])
>>> schedulers = runner.build_param_scheduler(scheduler_cfg)
>>> schedulers
[<mmengine.optim.scheduler.lr_scheduler.MultiStepLR at 0x7f70f6966290>]

>>> # build multiple schedulers
>>> scheduler_cfg = [
...    dict(type='MultiStepLR', milestones=[1, 2]),
...    dict(type='StepLR', step_size=1)
... ]
>>> schedulers = runner.build_param_scheduler(scheduler_cfg)
>>> schedulers
[<mmengine.optim.scheduler.lr_scheduler.MultiStepLR at 0x7f70f60dd3d0>,
<mmengine.optim.scheduler.lr_scheduler.StepLR at 0x7f70f6eb6150>]

The examples above only cover a single optimizer with one or multiple schedulers. For examples of setting parameter schedulers with multiple optimizers, see optimizer-docs.

Returns

A list of parameter schedulers, or a dict containing lists of parameter schedulers, built from scheduler.

Return type

list[_ParamScheduler] or dict[str, list[_ParamScheduler]]

Parameters

scheduler (Union[mmengine.optim.scheduler.param_scheduler._ParamScheduler, Dict, List]) –
optim_wrapper (mmengine.optim.optimizer.base.BaseOptimWrapper) –
default_args (Optional[dict]) –

collect_env()[source]

Collect information about the running environment.

Return type: Tuple[dict, dict]

compile_model(model, compile=False)[source]

Compile model.

Parameters

model (nn.Module) – Model to compile.
compile (Union[dict, bool]) –

Returns

Compiled model.

Return type

nn.Module

abstract load_checkpoint(filename, *, map_location='cpu', strict=False, revise_keys=[('^module.', '')], callback=None)[source]

Load checkpoint from given filename.

Parameters

filename (str) – Accepts a local filepath, URL, torchvision://xxx, or open-mmlab://xxx.
map_location (Union[str, Callable]) –
strict (bool) –
revise_keys (list) –
callback (Optional[Callable]) –

Keyword Arguments

map_location (str or callable) – A string or a callable specifying how to remap storage locations. Defaults to 'cpu'.
strict (bool) – Whether to strictly enforce that the keys in the checkpoint match the keys of the model. Defaults to False.
revise_keys (list) – A list of customized keywords used to modify the state_dict in the checkpoint. Each item is a (pattern, replacement) pair for a regular-expression substitution. Defaults to [(r'^module.', '')], which strips the prefix 'module.'.
callback (callable, optional) – Callback function to modify the checkpoint after loading the checkpoint. Defaults to None.
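
To make the effect of revise_keys concrete, the sketch below applies each (pattern, replacement) pair to the keys of a state dict with re.sub. It assumes that this is how the pairs are interpreted; revise_state_dict_keys is a hypothetical helper, not a library function.

```python
import re


def revise_state_dict_keys(state_dict, revise_keys=((r'^module.', ''),)):
    """Rename state-dict keys with (pattern, replacement) regex pairs."""
    revised = dict(state_dict)
    for pattern, replacement in revise_keys:
        # Rewrite every key; values are carried over unchanged.
        revised = {re.sub(pattern, replacement, key): value
                   for key, value in revised.items()}
    return revised


checkpoint_state = {
    'module.conv.weight': 1,  # prefix as left by a wrapped model
    'module.fc.bias': 2,
    'head.weight': 3,         # no prefix, left untouched
}
cleaned = revise_state_dict_keys(checkpoint_state)
print(sorted(cleaned))
```

Note that the dot in the default pattern is unescaped, so it matches any single character after "module", not only a literal dot.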

Return type

dict

load_model_state_dict(state_dict, *, strict=False, revise_keys=[('^module.', '')])[source]

Load model state from dict.

Parameters

state_dict (dict) –
strict (bool) –
revise_keys (list) –

Return type

None

load_optim_state_dict(state_dict)[source]

Load optimizer state from dict.

Parameters: state_dict (dict) –
Return type: None

load_or_resume(*, load_from=None, resume=False)[source]

Load checkpoint or resume from checkpoint.

Parameters

load_from (str, optional) – The checkpoint file to load from. Defaults to None.
resume (bool or str) – Whether to resume training. Defaults to False. If resume is True and load_from is None, the latest checkpoint is searched for in work_dir automatically; if none is found, resuming does nothing. If resume is a string, it is treated as the checkpoint file to resume from.

Return type

Optional[dict]
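
The interaction between load_from and resume can be summarized as a small decision function. This is a sketch of the rules stated above, not the library's code: plan_load_or_resume and latest_checkpoint are hypothetical names, and the branch where resume is True and load_from is also given (resume from load_from) is an assumption that the description does not spell out.

```python
def plan_load_or_resume(load_from=None, resume=False, latest_checkpoint=None):
    """Return an (action, path) pair describing what would happen."""
    # A string `resume` names the checkpoint to resume from.
    if isinstance(resume, str):
        return ('resume', resume)
    if resume:
        # Assumption: an explicit load_from is the file to resume from.
        if load_from is not None:
            return ('resume', load_from)
        # Otherwise fall back to the latest checkpoint in work_dir.
        if latest_checkpoint is not None:
            return ('resume', latest_checkpoint)
        # Nothing found: resuming does nothing.
        return ('nothing', None)
    # Not resuming: only load weights if a file was given.
    if load_from is not None:
        return ('load', load_from)
    return ('nothing', None)


print(plan_load_or_resume(load_from='epoch_3.pth'))
print(plan_load_or_resume(resume=True, latest_checkpoint='epoch_5.pth'))
print(plan_load_or_resume(resume='epoch_4.pth'))
```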

load_scheduler_state_dict(state_dict)[source]

Load scheduler state from dict.

Parameters: state_dict (Union[dict, list]) –
Return type: None

model_state_dict()[source]

Get model state dict.

Return type: dict

optim_state_dict()[source]

Get optimizer state dict.

Return type: dict

abstract prepare(model, *, optim_wrapper=None, param_scheduler=None, compile=False, dispatch_kwargs=None)[source]

Prepare model and some components.

Parameters

model (torch.nn.Module or dict) – The model to be run. It can be a dict used for building a model.
optim_wrapper (Optional[Union[mmengine.optim.optimizer.base.BaseOptimWrapper, dict]]) –
param_scheduler (Optional[Union[mmengine.optim.scheduler.param_scheduler._ParamScheduler, Dict, List]]) –
compile (Union[dict, bool]) –
dispatch_kwargs (Optional[dict]) –

Keyword Arguments

optim_wrapper (BaseOptimWrapper or dict, optional) – Computes the gradients of model parameters and updates them. Defaults to None. See build_optim_wrapper() for examples.
param_scheduler (_ParamScheduler or dict or list, optional) – Parameter scheduler for updating optimizer parameters. If specified, optim_wrapper must also be specified. Defaults to None. See build_param_scheduler() for examples.
compile (dict or bool, optional) – Config to compile model. Defaults to False. Requires PyTorch>=2.0.
dispatch_kwargs (dict, optional) – Kwargs to be passed to other methods of Strategy. Defaults to None.

abstract resume(filename, *, resume_optimizer=True, resume_param_scheduler=True, map_location='default', callback=None)[source]

Resume training from given filename.

Four types of states will be resumed.

model state
optimizer state
scheduler state
randomness state

Parameters

filename (str) – Accepts a local filepath, URL, torchvision://xxx, or open-mmlab://xxx.
resume_optimizer (bool) –
resume_param_scheduler (bool) –
map_location (Union[str, Callable]) –
callback (Optional[Callable]) –

Keyword Arguments

resume_optimizer (bool) – Whether to resume optimizer state. Defaults to True.
resume_param_scheduler (bool) – Whether to resume param scheduler state. Defaults to True.
map_location (str or callable) – A string or a callable specifying how to remap storage locations. Defaults to 'default'.
callback (callable, optional) – Callback function to modify the checkpoint after loading the checkpoint. Defaults to None.

Return type

dict

abstract save_checkpoint(filename, *, save_optimizer=True, save_param_scheduler=True, extra_ckpt=None, callback=None)[source]

Save checkpoint to given filename.

Parameters

filename (str) – Filename to save checkpoint.
save_optimizer (bool) –
save_param_scheduler (bool) –
extra_ckpt (Optional[dict]) –
callback (Optional[Callable]) –

Keyword Arguments

save_optimizer (bool) – Whether to save the optimizer to the checkpoint. Defaults to True.
save_param_scheduler (bool) – Whether to save the param_scheduler to the checkpoint. Defaults to True.
extra_ckpt (dict, optional) – Extra checkpoint to save. Defaults to None.
callback (callable, optional) – Callback function to modify the checkpoint before saving the checkpoint. Defaults to None.

Return type

None
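
A callback that modifies the checkpoint before it is written might look like the sketch below. It assumes the callback receives the checkpoint dict and edits it in place; that calling convention, the helper save_checkpoint_sketch, and the key names used here are illustrative assumptions rather than the documented checkpoint layout.

```python
def drop_optimizer(checkpoint):
    """Example callback: remove optimizer state to shrink the file."""
    checkpoint.pop('optimizer', None)


def save_checkpoint_sketch(state, extra_ckpt=None, callback=None):
    """Assemble the dict that would be written to disk (no I/O here)."""
    checkpoint = dict(state)
    # Merge any extra entries supplied by the caller.
    if extra_ckpt is not None:
        checkpoint.update(extra_ckpt)
    # Give the callback a last chance to edit the checkpoint.
    if callback is not None:
        callback(checkpoint)
    return checkpoint


state = {'state_dict': {'conv.weight': 1}, 'optimizer': {'lr': 0.01}}
saved = save_checkpoint_sketch(
    state, extra_ckpt={'note': 'demo'}, callback=drop_optimizer)
print(sorted(saved))
```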

scheduler_state_dict()[source]

Get parameter scheduler state dict.

Return type: Union[dict, list]