# Inference
When developing with MMEngine, we usually define a configuration file for a specific algorithm, use the file to build a runner, execute the training and testing processes, and save the trained weights. When performing inference based on the trained model, the following steps are usually required:
1. Build the model based on the configuration file
2. Load the model weights
3. Set up the data preprocessing pipeline
4. Perform the forward inference of the model
5. Visualize the inference results
6. Return the inference results
For such a standard inference workflow, MMEngine provides a unified inference interface and recommends that users develop inference applications based on this interface specification.
## Usage
### Defining an Inferencer
Implement a custom inferencer based on `BaseInferencer`:

```python
from mmengine.infer import BaseInferencer


class CustomInferencer(BaseInferencer):
    ...
```
For specific details, please refer to the Development Specification.
### Building an Inferencer
Build an inferencer from a configuration file path:

```python
cfg = 'path/to/config.py'
weight = 'path/to/weight.pth'
inferencer = CustomInferencer(model=cfg, weight=weight)
```
Build an inferencer from a `Config` object:

```python
from mmengine import Config

cfg = Config.fromfile('path/to/config.py')
weight = 'path/to/weight.pth'
inferencer = CustomInferencer(model=cfg, weight=weight)
```
Build an inferencer from a model name defined in the model-index. Taking the ATSS detector in MMDetection as an example, the model name is `atss_r50_fpn_1x_coco`. Since the weight path is already defined in the model-index, the `weight` argument does not need to be configured:

```python
inferencer = CustomInferencer(model='atss_r50_fpn_1x_coco')
```
### Performing Inference
Infer on a single image:

```python
import cv2

# Input as image path
img = 'path/to/img.jpg'
result = inferencer(img)

# Input as loaded image (type: np.ndarray)
img = cv2.imread('path/to/img.jpg')
result = inferencer(img)

# Input as URL
img = 'https://xxx.com/img.jpg'
result = inferencer(img)
```
Infer on multiple images:

```python
img_dir = 'path/to/directory'
result = inferencer(img_dir)
```
Note

OpenMMLab requires `inferencer(img)` to return a `dict` containing two fields: `visualization: list` and `predictions: list`, representing the visualization results and prediction results, respectively.
## Development Specification of Inference Interface
When performing inference, the following steps are typically executed:
- `preprocess`: input data preprocessing, including data reading, data preprocessing, data format conversion, etc.
- `forward`: execute `model.forward`.
- `visualize`: visualization of predicted results.
- `postprocess`: post-processing of predicted results, including result format conversion, exporting predicted results, etc.
To improve the user experience of the inferencer, we do not want users to have to configure parameters for each step. In other words, we hope that users can simply pass parameters to the `__call__` interface and complete the inference without being aware of the above process.

The `__call__` interface executes the aforementioned steps in order, but it does not by itself know which step each user-provided parameter should be dispatched to. Therefore, when developing a `CustomInferencer`, developers need to define four class attributes: `preprocess_kwargs`, `forward_kwargs`, `visualize_kwargs`, and `postprocess_kwargs`. Each attribute is a set of strings specifying which step the corresponding parameters of the `__call__` interface belong to:
```python
class CustomInferencer(BaseInferencer):
    preprocess_kwargs = {'a'}
    forward_kwargs = {'b'}
    visualize_kwargs = {'c'}
    postprocess_kwargs = {'d'}

    def preprocess(self, inputs, batch_size=1, a=None):
        pass

    def forward(self, inputs, b=None):
        pass

    def visualize(self, inputs, preds, show, c=None):
        pass

    def postprocess(self, preds, visualization, return_datasample=False, d=None):
        pass

    def __call__(
            self,
            inputs,
            batch_size=1,
            show=True,
            return_datasample=False,
            a=None,
            b=None,
            c=None,
            d=None):
        return super().__call__(
            inputs, batch_size, show, return_datasample, a=a, b=b, c=c, d=d)
```
In the code above, `a`, `b`, `c`, and `d` in the `preprocess`, `forward`, `visualize`, and `postprocess` functions are additional parameters that can be passed in by the user (`inputs`, `preds`, and the other parameters are filled in automatically during the execution of `__call__`). Therefore, developers need to list these parameters in the `preprocess_kwargs`, `forward_kwargs`, `visualize_kwargs`, and `postprocess_kwargs` class attributes, so that the parameters passed by the user to `__call__` can be correctly dispatched to the corresponding steps. The dispatch itself is implemented by `BaseInferencer.__call__`, so developers do not need to be concerned with it.
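The dispatch idea can be illustrated with a minimal, self-contained sketch. This is an illustration of the mechanism only, not mmengine's actual implementation; `MiniInferencer` and `dispatch` are hypothetical names.

```python
# Minimal sketch: split user kwargs into per-step groups based on the
# four *_kwargs class attributes, as BaseInferencer.__call__ does
# conceptually. Illustrative only; the real mmengine code differs.
class MiniInferencer:
    preprocess_kwargs = {'a'}
    forward_kwargs = {'b'}
    visualize_kwargs = {'c'}
    postprocess_kwargs = {'d'}

    def dispatch(self, **kwargs):
        groups = {}
        for step in ('preprocess', 'forward', 'visualize', 'postprocess'):
            allowed = getattr(self, step + '_kwargs')
            groups[step] = {k: v for k, v in kwargs.items() if k in allowed}
        return groups


groups = MiniInferencer().dispatch(a=1, b=2, c=3, d=4)
# 'a' is routed to preprocess, 'b' to forward, 'c' to visualize,
# and 'd' to postprocess.
```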
In addition, the `CustomInferencer` needs to be registered to a custom registry or to MMEngine's registry:

```python
from mmseg.registry import INFERENCERS
# It can also be registered to MMEngine's registry:
# from mmengine.registry import INFERENCERS


@INFERENCERS.register_module()
class CustomInferencer(BaseInferencer):
    ...
```
Note
In OpenMMLab’s algorithm repositories, the Inferencer must be registered to the downstream repository’s registry instead of the root registry of MMEngine to avoid naming conflicts.
## Core Interface Explanation
### `__init__()`

`BaseInferencer.__init__` already implements the inferencer-building logic shown in the section above, so in most cases there is no need to override the `__init__` method. However, if you need custom logic for loading configuration files, initializing weights, initializing the pipeline, etc., `__init__` can be overridden.
### `_init_pipeline()`

Note

This is an abstract method that must be implemented by the subclass.

Initializes and returns the pipeline required by the inferencer. The pipeline processes a single image, similar to the `train_pipeline` and `test_pipeline` defined in the OpenMMLab algorithm libraries. Each element of the `inputs` passed to the `__call__` interface is processed by the pipeline and then collated into batch data, which is passed to the `forward` method.
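As a sketch of the idea, a pipeline can be thought of as a callable that maps one raw input to a model-ready sample, for example a composition of per-sample transforms. The transforms below are toy placeholders, not mmengine APIs:

```python
# Toy pipeline: compose per-sample transforms into a single callable.
# str.strip and str.lower stand in for real data transforms.
def compose(transforms):
    def pipeline(data):
        for transform in transforms:
            data = transform(data)
        return data
    return pipeline


pipeline = compose([str.strip, str.lower])
sample = pipeline('  IMG.JPG  ')
# sample == 'img.jpg'
```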
### `_init_collate()`

Initializes and returns the `collate_fn` required by the inferencer, which is equivalent to the `collate_fn` of the DataLoader in the training process. By default, `BaseInferencer` obtains the `collate_fn` from the `test_dataloader` configuration, so it is generally not necessary to override the `_init_collate` method.
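Conceptually, a `collate_fn` turns a list of per-sample results into one batch. A toy version, assuming each sample is a flat dict, might look like this (real collate functions such as mmengine's are considerably more general):

```python
# Toy collate_fn: merge a list of per-sample dicts into one dict of lists.
def simple_collate(samples):
    batch = {}
    for sample in samples:
        for key, value in sample.items():
            batch.setdefault(key, []).append(value)
    return batch


batch = simple_collate([{'img': 'a.jpg'}, {'img': 'b.jpg'}])
# batch == {'img': ['a.jpg', 'b.jpg']}
```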
### `_init_visualizer()`

Initializes and returns the `visualizer` required by the inferencer, which is equivalent to the `visualizer` used in the training process. By default, `BaseInferencer` obtains the visualizer from the `visualizer` configuration, so it is usually not necessary to override the `_init_visualizer` method.
### `preprocess()`

Input arguments:

- `inputs`: input data passed into `__call__`, usually a list of image paths or image data.
- `batch_size`: batch size, passed in by the user when calling `__call__`.
- Other parameters: passed in by the user and specified in `preprocess_kwargs`.

Return:

- A generator that yields one batch of data per iteration.

By default, `preprocess` is a generator function that applies the `pipeline` and `collate_fn` to the input data and yields the preprocessed batches. In general, subclasses do not need to override this function.
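The default behavior can be sketched as a generator that runs each input through the pipeline and yields collated batches. This is a simplified illustration, not mmengine's actual code; `pipeline` and `collate_fn` stand in for whatever `_init_pipeline`/`_init_collate` returned:

```python
# Simplified sketch of the default preprocess step: apply the pipeline
# to each input and yield collated batches of size batch_size.
def preprocess(inputs, pipeline, collate_fn, batch_size=1):
    chunk = []
    for item in inputs:
        chunk.append(pipeline(item))
        if len(chunk) == batch_size:
            yield collate_fn(chunk)
            chunk = []
    if chunk:  # the last batch may be smaller than batch_size
        yield collate_fn(chunk)


batches = list(preprocess([1, 2, 3], pipeline=lambda x: x * 10,
                          collate_fn=list, batch_size=2))
# batches == [[10, 20], [30]]
```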
### `forward()`

Input arguments:

- `inputs`: the batch data processed by the `preprocess` function.
- Other parameters: passed in by the user and specified in `forward_kwargs`.

Return:

- The prediction results, of type `List[BaseDataElement]` by default.

Calls `model.test_step` to perform forward inference and returns the inference results. Subclasses typically do not need to override this method.
### `visualize()`

Note

This is an abstract method that must be implemented by the subclass.

Input arguments:

- `inputs`: the input data, i.e. the raw data without preprocessing.
- `preds`: the predicted results of the model.
- `show`: whether to display the visualization.
- Other parameters: passed in by the user and specified in `visualize_kwargs`.

Return:

- The visualization results, usually of type `List[np.ndarray]`. Taking object detection as an example, each element in the list should be an image with detection boxes drawn on it, which can be displayed with `cv2.imshow`. The visualization process varies across tasks, and `visualize` should return results suitable for the common visualization workflow of the task at hand.
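As a task-agnostic illustration of the contract, the toy `visualize` below draws box borders onto a copy of each input "image" (a nested list standing in for an `np.ndarray`) and returns one visualization per input. A real implementation would draw with the repository's visualizer or OpenCV instead:

```python
# Toy visualize step for a detection-style task: draw the border of each
# predicted box (x1, y1, x2, y2) onto a copy of the input image.
def visualize(inputs, preds, show=False):
    results = []
    for img, boxes in zip(inputs, preds):
        canvas = [row[:] for row in img]  # do not modify the input
        for x1, y1, x2, y2 in boxes:
            for x in range(x1, x2 + 1):
                canvas[y1][x] = canvas[y2][x] = 1
            for y in range(y1, y2 + 1):
                canvas[y][x1] = canvas[y][x2] = 1
        results.append(canvas)
    return results


img = [[0] * 4 for _ in range(4)]
vis = visualize([img], preds=[[(0, 0, 2, 2)]])
# vis[0] contains a 3x3 box border; img itself is left unchanged.
```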
### `postprocess()`

Note

This is an abstract method that must be implemented by the subclass.

Input arguments:

- `preds`: the predicted results of the model, a `list` in which each element is the prediction result for a single data item. In the OpenMMLab algorithm libraries, each element is of type `BaseDataElement`.
- `visualization`: the visualization results.
- `return_datasample`: whether to keep the datasample in the return value. When set to `False`, the returned results are converted to `dict`.
- Other parameters: passed in by the user and specified in `postprocess_kwargs`.

Return:

- A dictionary containing both the visualization and prediction results. OpenMMLab requires the returned dictionary to have two keys: `predictions` and `visualization`.
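A minimal sketch of a `postprocess` implementation following this contract. The `dict(p)` call is a placeholder for a real conversion such as turning a `BaseDataElement` into a plain dict:

```python
# Sketch of a postprocess step: package predictions and visualizations
# into the dict format required by OpenMMLab. dict(p) is a placeholder
# for converting a prediction object to a plain dict.
def postprocess(preds, visualization, return_datasample=False):
    if not return_datasample:
        preds = [dict(p) for p in preds]
    return {'predictions': preds, 'visualization': visualization}


out = postprocess([{'label': 1}], visualization=[None])
# out == {'predictions': [{'label': 1}], 'visualization': [None]}
```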
### `__call__()`

Input arguments:

- `inputs`: the input data, usually a list of image paths or image data. Each element in `inputs` can also be another type of data, as long as it can be processed by the `pipeline` returned by `_init_pipeline`. When `inputs` contains only a single item, it does not have to be a `list`; `__call__` will internally wrap it into a list for further processing.
- `return_datasample`: whether to convert datasamples to `dict` for return.
- `batch_size`: the batch size for inference, which is further passed to the `preprocess` function.
- Other parameters: additional parameters dispatched to the `preprocess`, `forward`, `visualize`, and `postprocess` methods.

Return:

- The visualization and prediction results returned by `postprocess`, in the form of a dictionary. OpenMMLab requires the returned dictionary to contain two keys: `predictions` and `visualization`.