Benchmarking tool#
Aidge v0.6.0 (and later v0.6.1) introduced a set of scripts for cross-library testing of the time performance and correctness of single operators.
The main script, benchmark.py, can be called through its own CLI entry point, aidge_benchmark.
In its current state, this script can be used for:
Measuring time performance, for documentation and to guide development
Cross-comparison of results between modules and some external libraries
Let’s dive into the benchmarking process with a complete, step-by-step benchmark of the Conv operator.
Benchmarking the Conv Operator#
Benchmarking an operator always involves two steps:
Choosing a test configuration file
Running the benchmark script
1. Choosing a test configuration#
All tests need to be listed and described in a JSON configuration file. You can either:
Use one of the existing configuration files from
aidge/aidge_core/aidge_core/benchmark/operator_config
Create your own custom configuration file
Configuration structure#
Each test configuration file is made of 3 sections:
┌───────────────────┐
│ 1. Metadata │
├───────────────────┤
│ 2. Default config │
│ (optional) │
├───────────────────┤
│ 3. Test configs │
│ ┣━ config_1 │
│ ┣━ ... │
│ ┗━ config_n │
└───────────────────┘
1. Metadata#
Description of the operator to create (e.g. its type, or the opset_version if generating an ONNX operator).
Warning: Currently, benchmarking is only supported for ONNX operators. Support for benchmarking native Aidge operators (without using ONNX as an intermediate) is planned for future versions.
2. Default configuration (optional)#
The default configuration sets baseline values for the operator’s inputs and attributes. These defaults will apply to all tests unless explicitly overridden in a specific test configuration. This is particularly useful for testing how a single parameter impacts performance while keeping others fixed.
Input specifications (input_properties)
Each input can be defined in one of two ways:
Using explicit values:
When an array of values is provided, it is treated as a constant input for the test. In this case, the input dimensions are automatically inferred from the shape of the array, so the dims field is optional.
Using dimensions only:
If only dims are specified (and no values), the benchmarking tool will generate a random input tensor with those dimensions at runtime.
Missing both values and dimensions:
If neither values nor dims are provided for an input, or if both are set to null, the configuration is considered invalid and will raise an error during execution.
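To make these cases concrete, here is a minimal sketch of hypothetical input_properties entries (the names and shapes are illustrative only, not taken from a real configuration file):
# Hypothetical "input_properties" entries illustrating the cases above.
explicit_values_input = {
    "name": "input_0",
    # Constant input: dims are inferred from the nested list, so "dims" is optional.
    "values": [[1.0, 2.0], [3.0, 4.0]],
}

dims_only_input = {
    "name": "weight_1",
    # No "values": a random tensor of this shape is generated at runtime.
    "dims": [10, 10, 3, 3],
}

invalid_input = {
    "name": "bias_2",
    # Neither "values" nor "dims": the configuration is rejected at execution time.
    "dims": None,
    "values": None,
}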
Attribute specifications (attributes)
Attributes behave similarly:
If a value is provided, it will be used as-is.
If not specified, the attribute is treated as null in the configuration.
Any attribute not defined in either the test case or the default config will be left unset, which may lead to runtime errors depending on the operator’s requirements.
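As a small illustration (hypothetical values, not from a real config), an attributes block can mix the three states described above:
# Hypothetical "attributes" block: kernel_shape is set explicitly,
# strides is explicitly null, and any attribute not listed at all stays unset,
# which may lead to runtime errors depending on the operator's requirements.
attributes = {
    "kernel_shape": [3, 3],
    "strides": None,
}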
3. Test parameters#
This section contains the actual test cases. Each one overrides selected inputs or attributes from the default configuration. Unspecified values fall back to the defaults defined earlier.
This structure allows you to focus on the heart of your test while reusing a shared base configuration to minimize repetition and potential errors.
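The override-and-fallback rule can be pictured as a simple dictionary merge. The helper below only illustrates the semantics described above; it is not the tool's actual implementation:
# Illustrative only: approximate the "test config overrides base config" rule
# with a plain dictionary merge.
def resolve_test_case(base_configuration: dict, test_case: dict) -> dict:
    resolved = {
        # Attributes from the test case override the defaults with the same name.
        "attributes": {**base_configuration.get("attributes", {}),
                       **test_case.get("attributes", {})},
        # Index default inputs by name so they can be replaced individually.
        "input_properties": {p["name"]: p
                             for p in base_configuration.get("input_properties", [])},
    }
    # Inputs redefined by the test case replace the default entry with the same name.
    for prop in test_case.get("input_properties", []):
        resolved["input_properties"][prop["name"]] = prop
    resolved["input_properties"] = list(resolved["input_properties"].values())
    return resolved

Applied to the conv2d.json example shown below, resolving the "10" case of feature_map_size would keep the default weight, bias and attribute values and only replace the dims of input_0.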
Overview of an example config file#
Let’s look at an example configuration file, conv2d.json, which is already available.
{
// 1. Meta parameters
"operator": "Conv", // Select ONNX operator type
"opset_version": 21,
"initializer_rank": 1,
"test_meta_data": {
"multiple_batchs": false
},
// 2. Default configuration (optional)
"base_configuration": { // Select default config
"input_properties": [ // Default input dimensions or values
{"name": "input_0", "dims": [1, 10, 200, 200]},
{"name": "weight_1", "dims": [10, 10, 3, 3]},
{"name": "bias_2", "dims": [10]}
],
"attributes": { // Default attributes
"kernel_shape": [3, 3],
"strides": [1, 1],
"dilations": [1, 1]
}
},
// 3. Tested parameters
"test_configurations": {
"feature_map_size": {
"10": {
"attributes": {},
"input_properties": [
{"name": "input_0", "dims": [1, 10, 10, 10]}
]
},
"100": {
// ...
},
"1000": {
// ...
}
},
"kernel_shape": {
"[1,1]": {
"attributes": {"kernel_shape": [1, 1]},
"input_properties": [
{"name": "weight_1", "dims": [10, 10, 1, 1]}
]
}
}
}
}
1.1 Available pre-made config files#
To see a list of all pre-made configurations, you can run:
[1]:
!aidge_benchmark --show-available-config
Available configuration files
├─── add.json
├─── atan.json
├─── batchnorm2d.json
├─── broadcasted_add.json
├─── broadcasted_div.json
├─── broadcasted_sub.json
├─── concat.json
├─── conv2d.json
├─── div.json
├─── exp.json
├─── fc.json
├─── matmul.json
├─── mul.json
├─── relu.json
├─── reshape.json
├─── sigmoid.json
├─── softmax.json
└─── sub.json
1.2 Config file template generation#
If you want to create your own configuration file, you can use the built-in template generator to quickly create a well-structured draft for any ONNX operator.
The generated file includes:
All expected input names
All available attributes
Default values, if known
Command format:
!aidge_benchmark --generate-template <operator-type>:<opset-version>:<initializer_rank>
<operator-type>: the name of the ONNX operator (e.g., Conv)
<opset-version>: the ONNX opset version to use
<initializer_rank>: the number of inputs not considered parameters (e.g., actual data inputs, excluding weights and biases)
Example: Generating a Conv Template#
Here’s how to generate a template for a Conv operator using opset 21, where only the first input is not a parameter:
[2]:
!aidge_benchmark --generate-template Conv:21:1 --save-directory operator_template
template configuration saved as 'operator_template/Conv_opset21_template.json'.
[3]:
# You can load and inspect the file like this:
import json
with open("operator_template/Conv_opset21_template.json", 'r') as f:
data = json.load(f)
print(json.dumps(data, indent=4))
{
"operator": "Conv",
"opset_version": 21,
"initializer_rank": 1,
"test_meta_data": {
"multiple_batchs": true
},
"base_configuration": {
"attributes": {
"group": 1,
"pads": null,
"auto_pad": "NOTSET",
"strides": null,
"dilations": null,
"kernel_shape": null
},
"input_properties": [
{
"name": "X",
"dims": null,
"values": null
},
{
"name": "W",
"dims": null,
"values": null
},
{
"name": "B",
"dims": null,
"values": null
}
]
},
"test_configurations": {}
}
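Once generated, the template's null placeholders must be replaced with concrete values before the file can be used for benchmarking. Here is a minimal sketch; the file name comes from the generator output above, while the shapes, attribute values and output file name are arbitrary examples:
import json

# Load the generated template and replace its null placeholders with concrete values.
with open("operator_template/Conv_opset21_template.json", "r") as f:
    config = json.load(f)

# Arbitrary example shapes for the Conv inputs X, W and B.
example_dims = {"X": [1, 10, 200, 200], "W": [10, 10, 3, 3], "B": [10]}
for prop in config["base_configuration"]["input_properties"]:
    prop["dims"] = example_dims[prop["name"]]

config["base_configuration"]["attributes"].update(
    {"kernel_shape": [3, 3], "strides": [1, 1], "dilations": [1, 1]}
)

# Add at least one test case; anything left unspecified falls back to the defaults above.
config["test_configurations"] = {
    "kernel_shape": {
        "[1,1]": {
            "attributes": {"kernel_shape": [1, 1]},
            "input_properties": [{"name": "W", "dims": [10, 10, 1, 1]}],
        }
    }
}

with open("operator_template/conv_custom.json", "w") as f:
    json.dump(config, f, indent=4)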
2. Choosing our test options#
We want to evaluate both time performance (inference time) and correctness (output consistency) across several modules:
aidge_backend_cpu
aidge_backend_cuda
aidge_export_cpp
We’ll compare them against a reliable reference: onnxruntime. For performance context, we’ll also include torch in the timing benchmarks.
To explore the available command-line options, you can use the --help option, which displays the full list:
$ aidge_benchmark --help
usage: aidge_benchmark [-h] [--show-available-config] [--generate-template GENERATE_TEMPLATE]
(--config-file CONFIG_FILE | --onnx-file ONNX_FILE) --modules MODULES
[MODULES ...] [--time] [--nb-iterations NB_ITERATIONS]
[--nb-warmups NB_WARMUPS] [--compare] [--ref REF]
[--save-directory SAVE_DIRECTORY] [--results-filename RESULTS_FILENAME] [-v]
Operator Kernel Performance Benchmarking across multiple inference modules.
options:
-h, --help show this help message and exit
--show-available-config
show JSON configuration files stored in the standard configuration directory.
--generate-template GENERATE_TEMPLATE
Create a well-structured draft that includes all the necessary inputs and
attributes for the chosen ONNX operator, along with their default values
(when available). Format: 'OperatorName:opset_version:initializer_rank'
==> --config-file CONFIG_FILE, -cf CONFIG_FILE
Path to a JSON configuration file containing an ONNX operator description
with reference and tested parameter values. A new ONNX model will
automatically be generated for each test case. Cannot be specified with '--
onnx-file' option
--onnx-file ONNX_FILE, -of ONNX_FILE
Path to an existing ONNX file that will be used for benchmarking. Cannot be
specified with '--config-file' option.
==> --modules MODULES [MODULES ...], -m MODULES [MODULES ...]
List of inference module names to benchmark (e.g., 'torch', 'onnxruntime').
==> --time, -t Measure inference time for each module.
--nb-iterations NB_ITERATIONS
Number of iterations to run for the 'time' test (default: 50).
--nb-warmups NB_WARMUPS
Number of warmup steps to run for the 'time' test (default: 10).
==> --compare, -c Compare the inference outputs of each module against a reference
implementation.
--ref REF Reference module used for comparing results (default: 'onnxruntime').
--save-directory SAVE_DIRECTORY
Directory used for saving any output from the script.
--results-filename RESULTS_FILENAME
Name of the saved result file. If not provided, it will default to the
'<operator_name>_<module_to_bench>.json'. If a file with that name and at that
location already exists, it will be overrided with elements individually
replaced only if new ones are computed
-v, --verbose Set the verbosity level of the console output. Use -v to increase verbosity,
with the following levels in ascending order: default: WARN - Only warnings
and higher (WARN, ERROR, FATAL) are displayed. -v: NOTICE - Notices and
higher (NOTICE, WARN, ERROR, FATAL) are displayed. -vv INFO - Informational
messages and higher (INFO, NOTICE, WARN, ERROR, FATAL) are displayed. -vvv:
DEBUG - All messages, including debug information, are displayed. Available
levels i
With these options, you can run a full benchmark like this:
[ ]:
!aidge_benchmark --config-file conv2d.json --time --compare --modules aidge_backend_cpu torch aidge_backend_cuda onnxruntime --save-directory benchmark_results
Loading modules...
├───aidge_backend_cpu [ ok ]
├───torch [ ok ]
├───aidge_backend_cuda [ xx ]
For each module that loaded successfully, every test ran and returned the expected result.
Time and comparison details have been saved to the benchmark_results directory passed via --save-directory. You can inspect it with:
[5]:
!tree benchmark_results/
benchmark_results/
0 directories, 0 files
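Assuming the run completed successfully, the directory contains one JSON result file per benchmarked module (by default named '<operator_name>_<module_to_bench>.json', as noted in the --results-filename help above). The sketch below loads whatever result files are present and prints their top-level structure; the exact file names and internal layout depend on the run:
import json
from pathlib import Path

results_dir = Path("benchmark_results")

# Print each result file's name and its top-level keys to get a feel for the layout.
for result_file in sorted(results_dir.glob("*.json")):
    with result_file.open() as f:
        results = json.load(f)
    top_level = list(results) if isinstance(results, dict) else type(results).__name__
    print(result_file.name, "->", top_level)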
Viewing the results#
Time performance data can be visualized in a bar plot. To do this, run the plotting script manually:
python aidge/aidge_core/aidge_core/benchmark/generate_graph.py --help
usage: generate_graph.py [-h] [--config-file CONFIG_FILE] [--results-directory RESULTS_DIRECTORY]
--ref REF --libs LIBS [LIBS ...]
Compare time performance of operator kernels by plotting relative differences.
options:
-h, --help show this help message and exit
--config-file CONFIG_FILE, -cf CONFIG_FILE
Path to a JSON configuration file containing an ONNX operator description
with reference and tested parameter values.
--results-directory RESULTS_DIRECTORY
Directory to add to the search path for results.
--ref REF, -r REF Path to the JSON file with reference results
--libs LIBS [LIBS ...], -l LIBS [LIBS ...]
[ ]:
!python ../../../aidge/aidge_core/aidge_core/benchmark/generate_graph.py --config-file conv2d.json --results-directory benchmark_results/ --ref conv_onnxruntime.json --libs benchmark_results/*
Traceback (most recent call last):
File "/data1/is156025/cm264821/aidge_packages/aidge/examples/tutorials/Benchmarking/../../../aidge/aidge_core/aidge_core/benchmark/generate_graph.py", line 6, in <module>
import pandas as pd
ModuleNotFoundError: No module named 'pandas'
[7]:
from IPython.display import SVG, display
display(SVG(filename='Conv_inference_time_comparison.svg'))
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
Cell In[7], line 3
1 from IPython.display import SVG, display
----> 3 display(SVG(filename='Conv_inference_time_comparison.svg'))
File /data1/is156025/cm264821/miniforge3/envs/aidge/lib/python3.12/site-packages/IPython/core/display.py:331, in DisplayObject.__init__(self, data, url, filename, metadata)
328 elif self.metadata is None:
329 self.metadata = {}
--> 331 self.reload()
332 self._check_data()
File /data1/is156025/cm264821/miniforge3/envs/aidge/lib/python3.12/site-packages/IPython/core/display.py:357, in DisplayObject.reload(self)
355 if self.filename is not None:
356 encoding = None if "b" in self._read_flags else "utf-8"
--> 357 with open(self.filename, self._read_flags, encoding=encoding) as f:
358 self.data = f.read()
359 elif self.url is not None:
360 # Deferred import
FileNotFoundError: [Errno 2] No such file or directory: 'Conv_inference_time_comparison.svg'
3. Future developments#
Benchmarking full models, not just individual operators.
Unified test structure, where benchmarking and correctness tests are described using a single JSON file.
Selective test execution using labels within configuration files, allowing users to group and selectively run a subset of tests.