mmengine.dist.all_reduce_dict

mmengine.dist.all_reduce_dict(data, op='sum', group=None)

Reduces the tensors in a dict across all processes so that every process ends up with the final result. The dict is updated in place.

The code is modified from https://github.com/Megvii-BaseDetection/YOLOX/blob/main/yolox/utils/allreduce_norm.py.

Parameters:
  • data (dict[str, Tensor]) – Data to be reduced.

  • op (str) – Operation to reduce data. Defaults to ‘sum’. Optional values are ‘sum’, ‘mean’, ‘product’, ‘min’, ‘max’, ‘band’, ‘bor’ and ‘bxor’.

  • group (ProcessGroup, optional) – The process group to work on. If None, the default process group will be used. Defaults to None.
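
To illustrate what the op names mean, here is a minimal pure-Python sketch that simulates an element-wise reduction over plain lists, one list per rank. It does not use torch or mmengine. The helper `simulate_all_reduce` and the `_OPS` table are hypothetical names introduced for illustration, and treating 'mean' as a sum followed by true division by the number of ranks is an assumption about the semantics, not a statement of how the library implements it.

```python
from functools import reduce
import operator

# Hypothetical mapping from op names to Python operators (illustration only).
# 'product' multiplies values; 'band', 'bor' and 'bxor' are bitwise AND, OR, XOR.
_OPS = {
    'sum': operator.add,
    'product': operator.mul,
    'min': min,
    'max': max,
    'band': operator.and_,
    'bor': operator.or_,
    'bxor': operator.xor,
}


def simulate_all_reduce(per_rank, op='sum'):
    """Combine one equal-length list per simulated rank, element by element."""
    if op == 'mean':
        # Assumption: mean = sum across ranks divided by the number of ranks.
        totals = simulate_all_reduce(per_rank, 'sum')
        return [t / len(per_rank) for t in totals]
    return [reduce(_OPS[op], column) for column in zip(*per_rank)]


ranks = [[1, 2, 3], [4, 0, 3]]  # values held by 2 simulated ranks
sum_result = simulate_all_reduce(ranks, 'sum')    # [5, 2, 6]
mean_result = simulate_all_reduce(ranks, 'mean')  # [2.5, 1.0, 3.0]
max_result = simulate_all_reduce(ranks, 'max')    # [4, 2, 3]
bxor_result = simulate_all_reduce(ranks, 'bxor')  # [5, 2, 0]
```

In a real all-reduce, every rank would receive the same combined list; the sketch returns it once for brevity.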

Return type:

None

Examples

>>> import torch
>>> import mmengine.dist as dist
>>> # non-distributed environment
>>> data = {
...     'key1': torch.arange(2, dtype=torch.int64),
...     'key2': torch.arange(3, dtype=torch.int64)
... }
>>> dist.all_reduce_dict(data)
>>> data
{'key1': tensor([0, 1]), 'key2': tensor([0, 1, 2])}
>>> # distributed environment
>>> # Assume 2 ranks, each running the code below.
>>> data = {
...     'key1': torch.arange(2, dtype=torch.int64),
...     'key2': torch.arange(3, dtype=torch.int64)
... }
>>> dist.all_reduce_dict(data)
>>> data
{'key1': tensor([0, 2]), 'key2': tensor([0, 2, 4])}  # Rank 0
{'key1': tensor([0, 2]), 'key2': tensor([0, 2, 4])}  # Rank 1
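
The distributed result above can be reproduced without any process group using a small simulation. The sketch below assumes a flatten, reduce, split strategy: each rank's values are concatenated in sorted key order into one buffer, the buffers are summed element-wise in a single step, and the result is split back into the dicts. This strategy is an assumption for illustration; `all_reduce_dict_sketch` is a hypothetical helper, plain lists stand in for tensors, and only 'sum' is covered.

```python
def all_reduce_dict_sketch(per_rank_dicts):
    """Sum-reduce dicts of flat lists across simulated ranks, in place."""
    # Sorting gives every rank the same key order.
    keys = sorted(per_rank_dicts[0])
    sizes = [len(per_rank_dicts[0][k]) for k in keys]

    # 1. Flatten each rank's values into a single buffer.
    buffers = [[x for k in keys for x in d[k]] for d in per_rank_dicts]

    # 2. One element-wise sum over all buffers (stands in for all_reduce).
    reduced = [sum(column) for column in zip(*buffers)]

    # 3. Split the buffer and write the results back into every rank's dict.
    for d in per_rank_dicts:
        offset = 0
        for k, size in zip(keys, sizes):
            d[k] = reduced[offset:offset + size]
            offset += size


# Two simulated ranks holding the same data as in the example above.
rank0 = {'key1': [0, 1], 'key2': [0, 1, 2]}
rank1 = {'key1': [0, 1], 'key2': [0, 1, 2]}
all_reduce_dict_sketch([rank0, rank1])
print(rank0)  # {'key1': [0, 2], 'key2': [0, 2, 4]}
print(rank1)  # {'key1': [0, 2], 'key2': [0, 2, 4]}
```

Packing all values into one buffer means a single collective call instead of one per key, which is the usual motivation for this pattern.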