BaseMetric

class mmengine.evaluator.BaseMetric(collect_device='cpu', prefix=None, collect_dir=None)

Base class for a metric.

The metric first processes each batch of data_samples and predictions, and appends the processed results to the results list. Then it collects all results together from all ranks if distributed training is used. Finally, it computes the metrics of the entire dataset.

A subclass of BaseMetric should assign a meaningful value to the class attribute default_prefix. See the prefix argument for details.

Parameters:

collect_device (str) – Device name used for collecting results from different ranks during distributed training. Must be ‘cpu’ or ‘gpu’. Defaults to ‘cpu’.
prefix (str, optional) – The prefix added to the metric names to disambiguate metrics that share a name across different evaluators. If prefix is not provided, self.default_prefix is used instead. Defaults to None.
collect_dir (str, optional) – Directory used to synchronize the results collected from different ranks. This argument should only be configured when collect_device is ‘cpu’. Defaults to None. New in version 0.7.3.
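
The effect of prefix can be sketched in plain Python. The helper apply_prefix below is hypothetical and the "/" separator is an assumption made for illustration; neither is taken from this page. It only shows the fallback to default_prefix and how two metrics that both report "accuracy" stop colliding once prefixed.

```python
from typing import Optional


def apply_prefix(metrics: dict, prefix: Optional[str],
                 default_prefix: Optional[str]) -> dict:
    """Hypothetical helper: prepend the effective prefix to each metric name."""
    # Fall back to the class-level default when no prefix argument is given.
    effective = prefix if prefix is not None else default_prefix
    if not effective:
        return dict(metrics)
    # The "/" separator is an assumption made for this sketch.
    return {f"{effective}/{name}": value for name, value in metrics.items()}


# Two metrics that both report "accuracy" are kept apart by their prefixes.
merged = {}
merged.update(apply_prefix({"accuracy": 0.91}, prefix=None, default_prefix="cls"))
merged.update(apply_prefix({"accuracy": 0.78}, prefix="seg", default_prefix="cls"))
print(merged)
```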

abstract compute_metrics(results)

Compute the metrics from processed results.

Parameters: results (list) – The processed results of each batch.
Returns: The computed metrics. The keys are the names of the metrics, and the values are the corresponding results.
Return type: dict
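
As a minimal sketch of the contract (a list of processed results in, a dict of named metrics out), the standalone function below aggregates per-batch counts. The "tp", "fp" and "fn" keys are hypothetical; what each entry of results contains depends on what the subclass stored in process.

```python
def compute_metrics(results: list) -> dict:
    """Sketch of a compute_metrics body: turn per-batch counts into metrics."""
    # Sum the raw counts first so batches of different sizes are weighted
    # correctly; averaging per-batch precision would give a different answer.
    true_pos = sum(r["tp"] for r in results)
    false_pos = sum(r["fp"] for r in results)
    false_neg = sum(r["fn"] for r in results)

    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    return {"precision": precision, "recall": recall}


# Hypothetical processed results from two batches.
results = [
    {"tp": 5, "fp": 1, "fn": 0},
    {"tp": 1, "fp": 1, "fn": 2},
]
metrics = compute_metrics(results)
print(metrics)
```

Storing counts rather than finished per-batch scores keeps the final numbers independent of how the dataset was split into batches.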

property dataset_meta: dict | None

Meta info of the dataset.

Type: Optional[dict]

evaluate(size)

Evaluate the model performance of the whole dataset after processing all batches.

Parameters: size (int) – Length of the entire validation dataset. When batch size > 1, the dataloader may pad some data samples so that all ranks receive dataset slices of the same length. The collect_results function drops the padded data based on this size.
Returns: Evaluation metrics dict on the val dataset. The keys are the names of the metrics, and the values are the corresponding results.
Return type: dict
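
Why size is needed can be illustrated without any distributed setup. The function gather_and_trim below is a hypothetical stand-in for the collection step, not the library's code. It assumes each rank r processed samples r, r + world_size, and so on, and that padding repeats samples from the start of the dataset, as PyTorch's DistributedSampler does by default.

```python
from itertools import chain, zip_longest


def gather_and_trim(per_rank_results: list, size: int) -> list:
    """Hypothetical stand-in: merge per-rank results and drop padded entries."""
    # Interleave the ranks to restore the original sample order
    # (assumes rank r holds samples r, r + world_size, ...).
    interleaved = chain.from_iterable(zip_longest(*per_rank_results))
    merged = [item for item in interleaved if item is not None]
    # Padded duplicates end up at the tail, so truncating to `size` removes them.
    return merged[:size]


# 5 samples on 2 ranks: rank 1 is padded with a repeat of sample 0.
rank0 = ["s0", "s2", "s4"]
rank1 = ["s1", "s3", "s0"]
collected = gather_and_trim([rank0, rank1], size=5)
print(collected)
```

Without the truncation, sample s0 would be counted twice and bias the metric.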

abstract process(data_batch, data_samples)

Process one batch of data samples and predictions. The processed results should be stored in self.results, which will be used to compute the metrics when all batches have been processed.

Parameters:

data_batch (Any) – A batch of data from the dataloader.
data_samples (Sequence[dict]) – A batch of outputs from the model.

Return type:

None
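
Putting the pieces together, a subclass might look like the sketch below. ToyAccuracy and the "pred" and "label" keys are hypothetical. The fallback BaseMetric class exists only so the snippet runs where mmengine is not installed; it is an assumption-laden stand-in for single-process behaviour, not the library's implementation.

```python
from typing import Any, Optional, Sequence

try:
    from mmengine.evaluator import BaseMetric
except ImportError:
    # Minimal stand-in so the sketch runs without mmengine. It mimics only
    # the single-process flow described on this page; the "/" separator in
    # the prefixed metric names is an assumption.
    class BaseMetric:
        default_prefix: Optional[str] = None

        def __init__(self, collect_device="cpu", prefix=None, collect_dir=None):
            self.results = []
            self.prefix = prefix or self.default_prefix

        def evaluate(self, size):
            metrics = self.compute_metrics(self.results[:size])
            self.results.clear()
            if self.prefix:
                metrics = {f"{self.prefix}/{k}": v for k, v in metrics.items()}
            return metrics


class ToyAccuracy(BaseMetric):
    """Hypothetical metric: fraction of samples whose prediction matches the label."""

    default_prefix = "toy"

    def process(self, data_batch: Any, data_samples: Sequence[dict]) -> None:
        # Store one small record per sample in self.results; the "pred" and
        # "label" keys are assumptions about what the model outputs contain.
        for sample in data_samples:
            self.results.append({"correct": sample["pred"] == sample["label"]})

    def compute_metrics(self, results: list) -> dict:
        correct = sum(r["correct"] for r in results)
        return {"accuracy": correct / len(results)}


metric = ToyAccuracy()
batches = [
    [{"pred": 1, "label": 1}, {"pred": 0, "label": 1}],
    [{"pred": 2, "label": 2}, {"pred": 3, "label": 3}],
]
for data_samples in batches:
    metric.process(data_batch=None, data_samples=data_samples)

# size is the true dataset length (4 samples here, no padding).
metrics = metric.evaluate(size=4)
print(metrics)
```

Note that process returns nothing: its only job is to append to self.results, and all aggregation is deferred to compute_metrics.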