Migrate Hook from MMCV to MMEngine

Introduction

As our architecture design evolved and user demands kept growing, the existing hook mount points in MMCV could no longer meet the requirements. We therefore redesigned the mount points in MMEngine and adjusted the functions of the hooks accordingly. Reading the Hook Design tutorial before you migrate is strongly recommended.

This tutorial compares hooks in MMCV v1.6.0 and MMEngine v0.5.0 in terms of function, mount points, usage, and implementation.

Function Comparison

Function	MMCV	MMEngine
Backpropagation and gradient update	OptimizerHook	Backpropagation and gradient update are unified in OptimWrapper rather than implemented as hooks
Backpropagation and gradient update	GradientCumulativeOptimizerHook	Backpropagation and gradient update are unified in OptimWrapper rather than implemented as hooks
Learning rate adjustment	LrUpdaterHook	ParamSchedulerHook and subclasses of _ParamScheduler adjust the optimizer hyperparameters
Momentum adjustment	MomentumUpdaterHook	ParamSchedulerHook and subclasses of _ParamScheduler adjust the optimizer hyperparameters
Saving model weights at specified interval	CheckpointHook	CheckpointHook saves weights at the specified interval and also saves the optimal weights; the model evaluation function of EvalHook is delegated to ValLoop or TestLoop
Model evaluation and optimal weights saving	EvalHook	CheckpointHook saves weights at the specified interval and also saves the optimal weights; the model evaluation function of EvalHook is delegated to ValLoop or TestLoop
Log printing	LoggerHook and its subclasses can print logs, save logs and visualize data	LoggerHook
Visualization		NaiveVisualizationHook
Adding runtime information		RuntimeInfoHook
Model weights exponential moving average (EMA)	EMAHook	EMAHook
Ensuring that the shuffle functionality of the distributed Sampler takes effect	DistSamplerSeedHook	DistSamplerSeedHook
Synchronizing model buffer	SyncBufferHook	SyncBufferHook
Emptying the PyTorch CUDA cache	EmptyCacheHook	EmptyCacheHook
Measuring the time taken by each iteration	IterTimerHook	IterTimerHook
Analyzing bottlenecks of training time	ProfilerHook	Not yet available
Providing the most concise way to register a function	ClosureHook	Not yet available
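
The first rows of the table can be illustrated with plain config dictionaries. The fragment below is a sketch, not a complete config: the accumulation count of 4, the clipping values, and `save_best='auto'` are example choices, and the parameter names (`cumulative_iters`, `accumulative_counts`, `clip_grad`, `save_best`) should be checked against the API reference of the version you use.

```python
# MMCV v1.x style: gradient accumulation and clipping are configured on a hook,
# and the learning rate schedule is configured through lr_config.
mmcv_optimizer_config = dict(
    type='GradientCumulativeOptimizerHook',
    cumulative_iters=4,
    grad_clip=dict(max_norm=35, norm_type=2),
)
mmcv_lr_config = dict(policy='step', step=[2, 3])

# MMEngine style: the same responsibilities move into the optimizer wrapper
# and a parameter scheduler; no optimizer hook is involved.
mmengine_optim_wrapper = dict(
    type='OptimWrapper',
    optimizer=dict(type='SGD', lr=0.001, momentum=0.9),
    accumulative_counts=4,
    clip_grad=dict(max_norm=35, norm_type=2),
)
mmengine_param_scheduler = dict(type='MultiStepLR', milestones=[2, 3])

# Saving the best weights is requested on CheckpointHook instead of EvalHook.
mmengine_default_hooks = dict(
    param_scheduler=dict(type='ParamSchedulerHook'),
    checkpoint=dict(type='CheckpointHook', interval=1, save_best='auto'),
)
```

Evaluation itself is not configured here; as the table notes, it is delegated to ValLoop or TestLoop.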

Mount Point Comparison

Category	Mount point	MMCV	MMEngine
Global mount points	before run	before_run	before_run
Global mount points	after run	after_run	after_run
Checkpoint related	after loading checkpoints	None	after_load_checkpoint
Checkpoint related	before saving checkpoints	None	before_save_checkpoint
Training related	before training	None	before_train
Training related	after training	None	after_train
Training related	before each epoch	before_train_epoch	before_train_epoch
Training related	after each epoch	after_train_epoch	after_train_epoch
Training related	before each iteration	before_train_iter	before_train_iter, with additional args: batch_idx and data_batch
Training related	after each iteration	after_train_iter	after_train_iter, with additional args: batch_idx, data_batch and outputs
Validation related	before validation	None	before_val
Validation related	after validation	None	after_val
Validation related	before each epoch	before_val_epoch	before_val_epoch
Validation related	after each epoch	after_val_epoch	after_val_epoch
Validation related	before each iteration	before_val_iter	before_val_iter, with additional args: batch_idx and data_batch
Validation related	after each iteration	after_val_iter	after_val_iter, with additional args: batch_idx, data_batch and outputs
Test related	before test	None	before_test
Test related	after test	None	after_test
Test related	before each epoch	None	before_test_epoch
Test related	after each epoch	None	after_test_epoch
Test related	before each iteration	None	before_test_iter, with additional args: batch_idx and data_batch
Test related	after each iteration	None	after_test_iter, with additional args: batch_idx, data_batch and outputs
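
To make the training-related mount points concrete, here is a small self-contained sketch. `RecordingHook` and `ToyRunner` are hypothetical stand-ins written for this illustration and are not MMEngine classes; they only mimic the method names and extra arguments listed in the table, and the dispatch order shown is an assumption about a typical training run without validation.

```python
class RecordingHook:
    """Records which mount points fire; a stand-in, not an MMEngine class."""

    def __init__(self):
        self.calls = []

    def before_run(self, runner):
        self.calls.append('before_run')

    def before_train(self, runner):
        self.calls.append('before_train')

    def before_train_epoch(self, runner):
        self.calls.append('before_train_epoch')

    def before_train_iter(self, runner, batch_idx, data_batch=None):
        self.calls.append(f'before_train_iter[{batch_idx}]')

    def after_train_iter(self, runner, batch_idx, data_batch=None, outputs=None):
        self.calls.append(f'after_train_iter[{batch_idx}]')

    def after_train_epoch(self, runner):
        self.calls.append('after_train_epoch')

    def after_train(self, runner):
        self.calls.append('after_train')

    def after_run(self, runner):
        self.calls.append('after_run')


class ToyRunner:
    """Hypothetical minimal runner that only dispatches training mount points."""

    def __init__(self, hooks, batches, max_epochs):
        self.hooks = hooks
        self.batches = batches
        self.max_epochs = max_epochs

    def call_hook(self, name, **kwargs):
        # Call the named mount point on every registered hook, in order.
        for hook in self.hooks:
            getattr(hook, name)(self, **kwargs)

    def train(self):
        self.call_hook('before_run')
        self.call_hook('before_train')
        for _ in range(self.max_epochs):
            self.call_hook('before_train_epoch')
            for batch_idx, data_batch in enumerate(self.batches):
                self.call_hook('before_train_iter',
                               batch_idx=batch_idx, data_batch=data_batch)
                outputs = dict(loss=0.0)  # placeholder for a real train step
                self.call_hook('after_train_iter', batch_idx=batch_idx,
                               data_batch=data_batch, outputs=outputs)
            self.call_hook('after_train_epoch')
        self.call_hook('after_train')
        self.call_hook('after_run')


hook = RecordingHook()
ToyRunner(hooks=[hook], batches=['b0', 'b1'], max_epochs=1).train()
print(hook.calls)
```

A hook written for MMCV that overrides `before_train_iter(self, runner)` or `after_train_iter(self, runner)` would need its signature extended with the additional arguments to be dispatched this way.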

Usage Comparison

In MMCV, you register hooks by calling the Runner’s register_training_hooks method. In MMEngine, you register hooks by passing them as arguments to the Runner’s initialization method.

MMCV

import torch.optim as optim
from mmcv.runner import EpochBasedRunner

model = ResNet18()
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
lr_config = dict(policy='step', step=[2, 3])
optimizer_config = dict(grad_clip=None)
checkpoint_config = dict(interval=5)
log_config = dict(interval=100, hooks=[dict(type='TextLoggerHook')])
custom_hooks = [dict(type='NumClassCheckHook')]
runner = EpochBasedRunner(
    model=model,
    optimizer=optimizer,
    work_dir='./work_dir',
    max_epochs=3,
    xxx,
)
runner.register_training_hooks(
    lr_config=lr_config,
    optimizer_config=optimizer_config,
    checkpoint_config=checkpoint_config,
    log_config=log_config,
    custom_hooks_config=custom_hooks,
)
runner.run([trainloader], [('train', 1)])

MMEngine

from mmengine.runner import Runner

model = ResNet18()
optim_wrapper = dict(
    type='OptimWrapper',
    optimizer=dict(type='SGD', lr=0.001, momentum=0.9))
param_scheduler = dict(type='MultiStepLR', milestones=[2, 3])
default_hooks = dict(
    logger=dict(type='LoggerHook'),
    param_scheduler=dict(type='ParamSchedulerHook'),
    checkpoint=dict(type='CheckpointHook', interval=5),
)
custom_hooks = [dict(type='NumClassCheckHook')]
runner = Runner(
    model=model,
    work_dir='./work_dir',
    optim_wrapper=optim_wrapper,
    param_scheduler=param_scheduler,
    train_cfg=dict(by_epoch=True, max_epochs=3),
    default_hooks=default_hooks,
    custom_hooks=custom_hooks,
    xxx,
)
runner.train()

For more details of MMEngine hooks, please refer to Usage of Hooks.
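
The mapping between the two snippets above can be sketched as a small conversion function. `migrate_hook_configs` is a hypothetical helper written for this illustration and is not part of MMCV or MMEngine. It only handles the `step` policy with a list of milestones, and it drops the `hooks` entry of `log_config`, on the assumption that logging backends are configured elsewhere in MMEngine.

```python
def migrate_hook_configs(lr_config, checkpoint_config, log_config, custom_hooks):
    """Hypothetical helper: build MMEngine Runner kwargs from MMCV hook configs."""
    step = lr_config.get('step')
    if lr_config.get('policy') != 'step' or not isinstance(step, (list, tuple)):
        raise ValueError('this sketch only handles policy="step" with a list of steps')

    # MMCV's step list corresponds to the milestones of MultiStepLR.
    param_scheduler = dict(type='MultiStepLR', milestones=list(step))
    if 'gamma' in lr_config:
        param_scheduler['gamma'] = lr_config['gamma']

    # Copy intervals only when the MMCV config sets them explicitly.
    logger = dict(type='LoggerHook')
    if 'interval' in log_config:
        logger['interval'] = log_config['interval']
    checkpoint = dict(type='CheckpointHook')
    if 'interval' in checkpoint_config:
        checkpoint['interval'] = checkpoint_config['interval']

    return dict(
        param_scheduler=param_scheduler,
        default_hooks=dict(
            logger=logger,
            param_scheduler=dict(type='ParamSchedulerHook'),
            checkpoint=checkpoint,
        ),
        custom_hooks=list(custom_hooks),
    )


runner_kwargs = migrate_hook_configs(
    lr_config=dict(policy='step', step=[2, 3]),
    checkpoint_config=dict(interval=5),
    log_config=dict(interval=100, hooks=[dict(type='TextLoggerHook')]),
    custom_hooks=[dict(type='NumClassCheckHook')],
)
print(runner_kwargs)
```

The returned dictionary holds the `param_scheduler`, `default_hooks` and `custom_hooks` arguments that the MMEngine example passes to `Runner`; the optimizer settings are left out because they move into `optim_wrapper` rather than into a hook.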

Implementation Comparison

Take CheckpointHook as an example. Unlike CheckpointHook in MMCV, CheckpointHook in MMEngine also implements the after_val_epoch method, because it supports saving the optimal weights. In MMCV, that function is provided by EvalHook.
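
Before looking at the two class skeletons, the role of after_val_epoch can be sketched in isolation. `BestMetricHook` below is a hypothetical, simplified hook written for this illustration; it is not MMEngine's CheckpointHook and writes nothing to disk. The `key` and `rule` parameter names, and the metric values, are assumptions made for the example.

```python
class Hook:
    """Stand-in for the hook base class so the sketch runs without MMEngine."""


class BestMetricHook(Hook):
    """Hypothetical hook that remembers the best validation metric seen so far."""

    def __init__(self, key, rule='greater'):
        if rule not in ('greater', 'less'):
            raise ValueError("rule must be 'greater' or 'less'")
        self.key = key
        self.rule = rule
        self.best_score = None
        self.saved = []  # stand-in for checkpoints written to disk

    def _is_better(self, score):
        if self.best_score is None:
            return True
        if self.rule == 'greater':
            return score > self.best_score
        return score < self.best_score

    def after_val_epoch(self, runner, metrics=None):
        # Ignore validation rounds that did not report the tracked metric.
        if not metrics or self.key not in metrics:
            return
        score = metrics[self.key]
        if self._is_better(score):
            self.best_score = score
            self.saved.append(score)  # a real hook would save a checkpoint here


hook = BestMetricHook(key='accuracy')
for epoch_metrics in ({'accuracy': 0.71}, {'accuracy': 0.68}, {'accuracy': 0.75}):
    hook.after_val_epoch(runner=None, metrics=epoch_metrics)
print(hook.best_score, hook.saved)
```

The point of the sketch is that the hook receives the validation metrics as an argument at the after_val_epoch mount point, so it can decide whether to save without running evaluation itself.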

MMCV

class CheckpointHook(Hook):
    def before_run(self, runner):
        """Initialize out_dir and file_client"""

    def after_train_epoch(self, runner):
        """Synchronize buffer and save model weights, for tasks trained in epochs"""

    def after_train_iter(self, runner):
        """Synchronize buffers and save model weights for tasks trained in iterations"""

MMEngine

class CheckpointHook(Hook):
    def before_run(self, runner):
        """Initialize out_dir and file_client"""

    def after_train_epoch(self, runner):
        """Synchronize buffer and save model weights, for tasks trained in epochs"""

    def after_train_iter(self, runner, batch_idx, data_batch, outputs):
        """Synchronize buffers and save model weights for tasks trained in iterations"""

    def after_val_epoch(self, runner, metrics):
        """Save optimal weights according to metrics"""