Better performance optimizers
This document introduces third-party optimizers supported by MMEngine, which may bring faster convergence or better performance.
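Every example below uses the same skeleton: a model, a training dataloader, and an optim_wrapper whose optimizer dict names the optimizer by its registered type string. The ResNet18 model and train_dataloader_cfg are not defined in this document; a minimal, hypothetical stand-in (assuming torchvision is installed) could look like this:
import torch.nn.functional as F
import torchvision
import torchvision.transforms as T
from torch.utils.data import DataLoader
from mmengine.model import BaseModel
from mmengine.runner import Runner

class ResNet18(BaseModel):
    # Hypothetical stand-in for the model used in the examples below.
    def __init__(self):
        super().__init__()
        self.resnet = torchvision.models.resnet18(num_classes=10)

    def forward(self, imgs, labels, mode='loss'):
        x = self.resnet(imgs)
        if mode == 'loss':
            return {'loss': F.cross_entropy(x, labels)}
        return x

# Any dataset yielding (image, label) pairs works; FakeData keeps the sketch
# self-contained. Runner also accepts a dict config instead of a DataLoader.
train_dataloader_cfg = DataLoader(
    batch_size=32,
    shuffle=True,
    dataset=torchvision.datasets.FakeData(
        size=128, num_classes=10, transform=T.ToTensor()))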
D-Adaptation
D-Adaptation provides the DAdaptAdaGrad, DAdaptAdam, and DAdaptSGD optimizers.
Note
If you use the optimizers provided by D-Adaptation, you need to upgrade mmengine to 0.6.0 or later.
Installation
pip install dadaptation
Usage
Take DAdaptAdaGrad as an example.
runner = Runner(
    model=ResNet18(),
    work_dir='./work_dir',
    train_dataloader=train_dataloader_cfg,
    # To view the input parameters for DAdaptAdaGrad, you can refer to
    # https://github.com/facebookresearch/dadaptation/blob/main/dadaptation/dadapt_adagrad.py
    optim_wrapper=dict(optimizer=dict(type='DAdaptAdaGrad', lr=0.001, momentum=0.9)),
    train_cfg=dict(by_epoch=True, max_epochs=3),
)
runner.train()
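For reference, the same optimizer class can also be driven directly in plain PyTorch. A minimal sketch, assuming dadaptation is installed; the toy model is hypothetical and mirrors the hyperparameters above.
import torch
from dadaptation import DAdaptAdaGrad

model = torch.nn.Linear(4, 2)  # hypothetical toy model
optimizer = DAdaptAdaGrad(model.parameters(), lr=0.001, momentum=0.9)

loss = model(torch.randn(8, 4)).sum()
loss.backward()
optimizer.step()
optimizer.zero_grad()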
Lion-Pytorch
lion-pytorch provides the Lion optimizer.
Note
If you use the optimizer provided by Lion-Pytorch, you need to upgrade mmengine to 0.6.0 or later.
Installation
pip install lion-pytorch
Usage
runner = Runner(
    model=ResNet18(),
    work_dir='./work_dir',
    train_dataloader=train_dataloader_cfg,
    # To view the input parameters for Lion, you can refer to
    # https://github.com/lucidrains/lion-pytorch/blob/main/lion_pytorch/lion_pytorch.py
    optim_wrapper=dict(optimizer=dict(type='Lion', lr=1e-4, weight_decay=1e-2)),
    train_cfg=dict(by_epoch=True, max_epochs=3),
)
runner.train()
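A direct PyTorch sketch for comparison, assuming lion-pytorch is installed; the toy model is hypothetical and the hyperparameters mirror the config above.
import torch
from lion_pytorch import Lion

model = torch.nn.Linear(4, 2)  # hypothetical toy model
optimizer = Lion(model.parameters(), lr=1e-4, weight_decay=1e-2)

loss = model(torch.randn(8, 4)).sum()
loss.backward()
optimizer.step()
optimizer.zero_grad()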
Sophia
Sophia provides the Sophia, SophiaG, DecoupledSophia, and Sophia2 optimizers.
Note
If you use the optimizers provided by Sophia, you need to upgrade mmengine to 0.7.4 or later.
Installation
pip install Sophia-Optimizer
Usage
runner = Runner(
    model=ResNet18(),
    work_dir='./work_dir',
    train_dataloader=train_dataloader_cfg,
    # To view the input parameters for SophiaG, you can refer to
    # https://github.com/kyegomez/Sophia/blob/main/Sophia/Sophia.py
    optim_wrapper=dict(optimizer=dict(type='SophiaG', lr=2e-4, betas=(0.965, 0.99),
                                      rho=0.01, weight_decay=1e-1)),
    train_cfg=dict(by_epoch=True, max_epochs=3),
)
runner.train()
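A direct PyTorch sketch for comparison. The import path is an assumption inferred from the repository layout linked above (Sophia/Sophia.py), and the toy model is hypothetical. Note that Sophia-style optimizers also maintain a Hessian estimate internally; see the repository for the complete training recipe.
import torch
from Sophia.Sophia import SophiaG  # assumed import path; check the installed package

model = torch.nn.Linear(4, 2)  # hypothetical toy model
optimizer = SophiaG(model.parameters(), lr=2e-4, betas=(0.965, 0.99),
                    rho=0.01, weight_decay=1e-1)

loss = model(torch.randn(8, 4)).sum()
loss.backward()
optimizer.step()
optimizer.zero_grad()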
bitsandbytes
bitsandbytes provides the AdamW8bit, Adam8bit, Adagrad8bit, PagedAdam8bit, PagedAdamW8bit, LAMB8bit, LARS8bit, RMSprop8bit, Lion8bit, PagedLion8bit, and SGD8bit optimizers.
Note
If you use the optimizers provided by bitsandbytes, you need to upgrade mmengine to 0.9.0 or later.
Installation
pip install bitsandbytes
Usage
Take AdamW8bit as an example.
runner = Runner(
    model=ResNet18(),
    work_dir='./work_dir',
    train_dataloader=train_dataloader_cfg,
    # To view the input parameters for AdamW8bit, you can refer to
    # https://github.com/TimDettmers/bitsandbytes/blob/main/bitsandbytes/optim/adamw.py
    optim_wrapper=dict(optimizer=dict(type='AdamW8bit', lr=1e-4, weight_decay=1e-2)),
    train_cfg=dict(by_epoch=True, max_epochs=3),
)
runner.train()
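A direct PyTorch sketch for comparison, assuming bitsandbytes is installed and a CUDA device is available (its 8-bit optimizers target GPU training); the toy model is hypothetical.
import torch
import bitsandbytes as bnb

model = torch.nn.Linear(4, 2).cuda()  # hypothetical toy model
optimizer = bnb.optim.AdamW8bit(model.parameters(), lr=1e-4, weight_decay=1e-2)

loss = model(torch.randn(8, 4, device='cuda')).sum()
loss.backward()
optimizer.step()
optimizer.zero_grad()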
transformers
transformers provides the Adafactor optimizer.
Note
If you use the optimizer provided by transformers, you need to upgrade mmengine to 0.9.0 or later.
Installation
pip install transformers
Usage
Take Adafactor as an example.
runner = Runner(
    model=ResNet18(),
    work_dir='./work_dir',
    train_dataloader=train_dataloader_cfg,
    # To view the input parameters for Adafactor, you can refer to
    # https://github.com/huggingface/transformers/blob/v4.33.2/src/transformers/optimization.py#L492
    optim_wrapper=dict(
        optimizer=dict(type='Adafactor', lr=1e-5, weight_decay=1e-2,
                       scale_parameter=False, relative_step=False)),
    train_cfg=dict(by_epoch=True, max_epochs=3),
)
runner.train()
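A direct PyTorch sketch for comparison, assuming transformers is installed; the toy model is hypothetical and the hyperparameters mirror the config above.
import torch
from transformers import Adafactor

model = torch.nn.Linear(4, 2)  # hypothetical toy model
# When an explicit lr is given, Adafactor requires relative_step=False.
optimizer = Adafactor(model.parameters(), lr=1e-5, weight_decay=1e-2,
                      scale_parameter=False, relative_step=False)

loss = model(torch.randn(8, 4)).sum()
loss.backward()
optimizer.step()
optimizer.zero_grad()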