Aidge demonstration#


Install requirements#

Ensure that the Aidge modules are properly installed in the current environment. If they already are, the following setup step can be skipped.
Note: When running this notebook on Binder, all required components are pre-installed.
[ ]:
%pip install aidge-core \
    aidge-backend-cpu \
    aidge-learning \
    aidge-export-cpp \
    aidge-onnx \
    aidge-quantization \
    aidge-model-explorer

Introduction#

Aidge is a collaborative open source deep learning library optimized for exporting and processing deep learning algorithms on embedded devices. With Aidge, one can create or import a computational graph from popular frameworks, apply modifications to its structure, train it and export its architecture to various embedded devices. Aidge provides optimized functions for both inference and training, as well as many custom functionalities for the target device.

This notebook provides an overview of the toolchain used to import a deep neural network from an ONNX model and enable its inference in Aidge. The demonstrated toolchain includes:

pipeline(0)

To demonstrate this toolchain, we use the MNIST digit recognition task.

MNIST

Setting up the notebook#

[ ]:
# First import some utility methods used in the tutorial:
import sys, os

sys.path.append(os.path.abspath(os.path.join("..")))
import tuto_utils

Import Aidge#

To provide a collaborative environment on the platform, Aidge is built around a core library that interfaces with multiple modules bound to Python libraries. Key modules include:

  • aidge_core is the core library that offers all the basic functionalities for creating and manipulating the internal computational graph representation;

  • aidge_backend_cpu is a module providing generic C++ implementations for each component of the computational graph;

  • aidge_onnx is a module for importing ONNX models into the Aidge framework;

  • aidge_export_cpp is a module dedicated to the generation of optimized C++ code.

In this way, Aidge’s core library remains free of any dependencies, allowing users to install whatever they need based on their use cases.

modular

[ ]:
import aidge_core

# The Conv2D operator is supported, but only the "export_serialize" backend is available.
# This backend allows generating C++ code but not running inference.
# For inference, the "cpu" backend is needed.
print(f"Available backends:\n{aidge_core.get_keys_Conv2DOp()}")

# note: Tensor is a special case as 'cpu' backend is provided in the core
# module to guarantee basic functionalities such as data access
print(f"Available backends for Tensor:\n{aidge_core.Tensor.get_available_backends()}")

# Set the log level
aidge_core.Log.set_console_level(aidge_core.Level.Notice)

As shown, only one backend is available for the Conv2D class: export_serialize, which is an export backend. The aidge_backend_cpu module must therefore be imported; it automatically registers with aidge_core and provides access to a backend capable of running inference.

[ ]:
import aidge_backend_cpu

print(f"Available backends:\n{aidge_core.get_keys_Conv2DOp()}")
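The registration mechanism can be pictured with a small, self-contained sketch. It is not Aidge's actual implementation: the `REGISTRY` dictionary and the `register_backend` and `available_backends` helpers are hypothetical names, used only to illustrate how importing a backend module can add entries to a table owned by the core.

```python
# Toy illustration of backend registration (hypothetical names, not the Aidge API).
from collections import defaultdict

# Maps an operator type to the backends that provide an implementation for it.
REGISTRY = defaultdict(dict)


def register_backend(op_type, backend, impl):
    """Record `impl` as the implementation of `op_type` for `backend`."""
    REGISTRY[op_type][backend] = impl


def available_backends(op_type):
    """Return the sorted backend names registered for `op_type`."""
    return sorted(REGISTRY[op_type])


# The "core" only ships a code-generation backend for Conv2D.
register_backend("Conv2D", "export_serialize", lambda: "generate C++ code")
backends_before = available_backends("Conv2D")

# Importing a backend module would run a registration like this as a side effect.
register_backend("Conv2D", "cpu", lambda: "run inference on CPU")
backends_after = available_backends("Conv2D")

print(backends_before)
print(backends_after)
```

Keeping the table in the core and the implementations in separate modules is what lets the core stay dependency-free while backends are added on demand.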

For the continuation of this tutorial, it is necessary to import aidge_onnx to load ONNX files, numpy to load data, and matplotlib to display images.

[ ]:
import aidge_onnx
import numpy as np
import matplotlib.pyplot as plt

Download the ONNX model (if needed)#

If git-lfs is not installed, the model and data can be downloaded using the following code snippet.
Reminder: This step is not needed when running in Binder.
[ ]:
BASE_URL = "https://gitlab.eclipse.org/eclipse/aidge/aidge/-/raw/main/examples/tutorials/101_first_step/"

# Download the model, input and output data files
files_to_download = ["MLP_MNIST.onnx", "input_digit.npy", "output_digit.npy"]

for file_name in files_to_download:
    aidge_core.utils.download_file(
        file_path=file_name, file_url=f"{BASE_URL}{file_name}"
    )

The following sections describe the main steps of the aforementioned toolchain, from model import to code generation.

ONNX Import#

pipeline(1)

[ ]:
model = aidge_onnx.load_onnx("MLP_MNIST.onnx")

In this example, Aidge was able to find an implementation for every node in the ONNX model. However, in some cases, certain nodes may be imported as a GenericOperator — the Flatten operator, for example, used to be handled this way.

The GenericOperator is part of a fallback mechanism that allows Aidge to load the entire ONNX graph without failing, even when it encounters nodes that are not yet supported by the framework.

It acts as a stub that retrieves the node’s type and attributes from ONNX. This enables users to either implement the missing nodes in a custom script or modify them using Aidge’s recipe system, as explained below.
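The fallback idea can be sketched in a few lines of plain Python. Everything here is hypothetical (the `Node` class, the `CONVERTERS` table and the `import_node` function are illustration names, not part of aidge_onnx); the point is only that an unknown operator type is kept as a stub carrying its original type and attributes instead of aborting the import.

```python
# Toy illustration of a fallback import (hypothetical names, not the Aidge API).
from dataclasses import dataclass, field


@dataclass
class Node:
    op_type: str                      # operator type used inside the framework
    attributes: dict = field(default_factory=dict)
    is_generic: bool = False          # True when no native converter was found


# Converters known to this toy importer, keyed by ONNX operator type.
CONVERTERS = {
    "Gemm": lambda attrs: Node("FC", attrs),
    "Relu": lambda attrs: Node("ReLU", attrs),
}


def import_node(onnx_type, attrs):
    """Convert one ONNX node, falling back to a generic stub if unsupported."""
    converter = CONVERTERS.get(onnx_type)
    if converter is not None:
        return converter(attrs)
    # Fallback: keep the original type and attributes so nothing is lost.
    return Node(onnx_type, dict(attrs), is_generic=True)


imported = [
    import_node("Gemm", {"transB": 1}),
    import_node("MyCustomOp", {"alpha": 0.5}),
]
for node in imported:
    print(node.op_type, node.is_generic, node.attributes)
```

A later pass can then look for nodes flagged as generic and either attach a custom implementation or rewrite them.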

The imported graph can now be visualized using the visualize method from aidge_model_explorer. This package extends the ai_edge_model_explorer project by Google, and has been adapted for Aidge’s GraphViews.

[ ]:
import aidge_model_explorer

aidge_model_explorer.visualize(model, "MLP_MNIST", embed=True)

Graph transformation#

pipeline(2)

To enable inference, all operators in the graph must be supported. In this example, the imported model contains a Flatten operator preceding the Gemm operator. However, Aidge's FC operator already handles flattening internally, so the Flatten operator is redundant. A graph transformation is therefore applied to adapt the graph for inference by removing it.

Aidge's graph transformations are provided as recipe functions, which are available in aidge_core.

Examples include:

  • fuse_batchnorm: Fuse BatchNorm inside Conv or FC operator;

  • matmul_to_fc: Fuse MatMul and Add operators into a FC operator;

  • conv_horizontal_tiling: Replace a Conv with a horizontally tiled version;

  • remove_flatten: Remove Flatten if it is before an FC operator;

  • adapt_to_backend: Adapt graph to the current backend by adding Transpose operators to match expected input/output data format;

  • constant_folding: Compute constant parts of the graph and replace them by pre-computed values.

Apply the remove_flatten recipe to eliminate the redundant Flatten operator:

[ ]:
# Use the remove_flatten recipe
aidge_core.remove_flatten(model)

The Flatten node is removed by the recipe. The model can now be visualized:

[ ]:
aidge_model_explorer.visualize(model, "MLP_MNIST", embed=True)
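To make the effect of such a recipe concrete, here is a minimal sketch of the same rewrite on a toy graph represented as a linear list of operator types. The `remove_flatten_before_fc` function and the example chain are hypothetical; Aidge's recipe works on a real GraphView and is not implemented this way.

```python
# Toy graph rewrite on a linear chain of operator types (not the Aidge API).
def remove_flatten_before_fc(ops):
    """Drop every "Flatten" that is immediately followed by an "FC"."""
    result = []
    for index, op in enumerate(ops):
        next_op = ops[index + 1] if index + 1 < len(ops) else None
        if op == "Flatten" and next_op == "FC":
            continue  # FC flattens its input itself, so this node is redundant
        result.append(op)
    return result


# Hypothetical operator chain; the actual layers of MLP_MNIST.onnx may differ.
chain = ["Flatten", "FC", "ReLU", "FC", "ReLU", "FC"]
print(remove_flatten_before_fc(chain))
```

A Flatten that is not directly followed by an FC is left in place, which mirrors the condition stated for the recipe.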

Static analysis#

pipeline(3)

Static analysis can be applied to a graph at any time to assess its complexity in terms of memory usage and computational operations.

[ ]:
import aidge_core.static_analysis

# Dimensions must be forwarded for static analysis!
model.forward_dims(dims=[[1, 1, 28, 28]], allow_data_dependency=True)

model_stats = aidge_core.static_analysis.StaticAnalysis(model)
model_stats.summary()
[ ]:
model_stats.log_nb_ops_by_type("stats_ops.png", log_scale=True)
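As a sanity check on this kind of report, the parameter and operation counts of fully connected layers can be computed by hand. The sketch below assumes a hypothetical two-layer MLP (784 → 64 → 10); these sizes are not taken from MLP_MNIST.onnx, and the `fc_stats` helper is an illustration name, not an Aidge function.

```python
# Hand-computed complexity of fully connected layers (layer sizes are assumed).
def fc_stats(in_features, out_features, bias=True):
    """Return (parameter count, multiply-accumulate count) for one FC layer."""
    params = in_features * out_features + (out_features if bias else 0)
    macs = in_features * out_features  # one MAC per weight, for a batch of 1
    return params, macs


# Hypothetical MLP for 28x28 inputs; the real MLP_MNIST.onnx sizes may differ.
layers = [(28 * 28, 64), (64, 10)]

total_params = 0
total_macs = 0
for in_features, out_features in layers:
    params, macs = fc_stats(in_features, out_features)
    total_params += params
    total_macs += macs

print(f"parameters: {total_params}, MACs: {total_macs}")
```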

Inference#

pipeline(4)

Create an input tensor#

To perform an inference pass, an image from the MNIST dataset is loaded using numpy.

[ ]:
digit = np.load("input_digit.npy")
plt.imshow(digit[0][0], cmap="gray")

To validate the result produced by the model, the output generated by a PyTorch model for the same image is also loaded.

[ ]:
output_model = np.load("output_digit.npy")
print(output_model)

Thanks to numpy interoperability, an Aidge Tensor can be created directly from the numpy array storing the image.

[ ]:
input_tensor = aidge_core.Tensor(digit)
print(f"Aidge Input Tensor dimensions: \n{input_tensor.dims}")

Configure the model for inference#

Currently, the model has no implementation and serves only as a data structure. To set an implementation, a data type and backend must be specified.

[ ]:
# Configure the model
model.compile("cpu", aidge_core.dtype.float32, dims=[[1, 1, 28, 28]])
# This is equivalent to calling set_datatype(), set_backend() and forward_dims()

Check graph is valid for inference (Optional)#

You can check that the model is compatible with inference using aidge_core.check_graph_validity.

This function returns a boolean indicating whether inference can be run on the graph. With the option report=True, the function also displays a node-wise summary of the errors it encountered.

This tool is useful for quickly diagnosing issues with data types and data formats.

[ ]:
aidge_core.check_graph_validity(model, report=True)

Create a scheduler and run inference#

The graph is now ready for execution. To schedule the execution, a Scheduler object is created, which takes the graph and generates an optimized schedule using a consumer-producer heuristic.

[ ]:
# Create scheduler
scheduler = aidge_core.SequentialScheduler(model)

# Run inference!
scheduler.forward(data=[input_tensor])
[ ]:
# Assert results
for outNode in model.get_output_nodes():
    output_aidge = np.array(outNode.get_operator().get_output(0))
    print(output_aidge)
    print("Aidge prediction = ", np.argmax(output_aidge[0]))
    assert np.allclose(output_aidge, output_model, rtol=1e-04)
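The tolerance check used in the assertion compares each pair of values with the rule |actual - expected| <= atol + rtol * |expected|. The pure-Python sketch below reproduces that rule with the same default tolerances as numpy.allclose; the `all_close` helper and the sample values are made up for illustration.

```python
# Pure-Python version of the element-wise test performed by numpy.allclose.
def all_close(actual, expected, rtol=1e-05, atol=1e-08):
    """Return True if |a - e| <= atol + rtol * |e| for every pair of values."""
    return all(
        abs(a - e) <= atol + rtol * abs(e)
        for a, e in zip(actual, expected)
    )


reference = [0.10, -2.50, 7.00]           # made-up reference values
close_enough = [0.100005, -2.5001, 7.0005]
too_far = [0.10, -2.50, 7.01]

print(all_close(close_enough, reference, rtol=1e-04))
print(all_close(too_far, reference, rtol=1e-04))
```

Because the relative term scales with the reference value, small outputs are compared much more strictly than large ones.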

It is possible to save the scheduling in Mermaid format and then visualize it using:

[ ]:
scheduler.save_scheduling_diagram("schedulingSequential")
tuto_utils.visualize_mmd("schedulingSequential_forward.mmd")
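The idea behind a sequential schedule can be illustrated with a plain topological sort, in which a node becomes ready once all of its producers have run. This is a simplified sketch and not Aidge's actual consumer-producer heuristic; the `sequential_schedule` function and the example graph are hypothetical.

```python
# Toy sequential scheduling: a node runs once all of its producers have run.
# This is a plain topological sort, not Aidge's actual scheduling heuristic.
from collections import deque


def sequential_schedule(consumers):
    """Order nodes so that every producer precedes its consumers.

    `consumers` maps each node name to the list of nodes consuming its output.
    """
    pending_inputs = {node: 0 for node in consumers}
    for targets in consumers.values():
        for target in targets:
            pending_inputs[target] += 1

    ready = deque(node for node, count in pending_inputs.items() if count == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for target in consumers[node]:
            pending_inputs[target] -= 1
            if pending_inputs[target] == 0:
                ready.append(target)

    if len(order) != len(consumers):
        raise ValueError("the graph contains a cycle")
    return order


# Hypothetical three-layer MLP graph.
graph = {
    "fc1": ["relu1"],
    "relu1": ["fc2"],
    "fc2": ["relu2"],
    "relu2": ["fc3"],
    "fc3": [],
}
print(sequential_schedule(graph))
```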

Optimize network#

pipeline(5)

The next steps involve optimizing the model through a quantization workflow.

[ ]:
quantized_model = model.clone()
[ ]:
# Optional: if not using git-lfs, download data stored in the PTQ tutorial
BASE_URL = "https://gitlab.eclipse.org/eclipse/aidge/aidge/-/raw/main/examples/tutorials/PTQ_tutorial/"
file_name = "mnist_samples.npy.gz"

aidge_core.utils.download_file(file_path=file_name, file_url=f"{BASE_URL}{file_name}")
[ ]:
import gzip

NB_SAMPLES = 100  # Number of samples to use for PTQ

# Load data from the PTQ tutorial, either downloaded manually or via git-lfs
path = next(
    p
    for p in ["./mnist_samples.npy.gz", "../PTQ_tutorial/mnist_samples.npy.gz"]
    if os.path.exists(p)
)
samples = np.load(gzip.GzipFile(path, "r"))

for i in range(10):
    plt.subplot(1, 10, i + 1)
    plt.axis("off")
    plt.tight_layout()
    plt.imshow(samples[i], cmap="gray")

tensors = []
for sample in samples[0:NB_SAMPLES]:
    sample = np.reshape(sample, (1, 1, 28, 28)).astype(np.float32)
    tensor = aidge_core.Tensor(sample)
    tensors.append(tensor)
[ ]:
import aidge_quantization

# Set the precision
aidge_quantization.auto_assign_node_precision(
    graphview=quantized_model,
    act_precision=aidge_core.dtype.int8,
    weight_precision=aidge_core.dtype.int8,
    bias_precision=aidge_core.dtype.int32,
)

aidge_quantization.quantize_network(
    network=quantized_model,
    calibration_set=tensors,
    clipping_mode=aidge_quantization.Clipping.MSE,
    no_quant=False,
    optimize_signs=False,
    single_shift=False,
)
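The role of a clipping threshold selected by mean squared error can be shown on a toy example. The sketch below is not the algorithm implemented by aidge_quantization: the `quantize`, `dequantize` and `mse` helpers, the candidate thresholds and the calibration values are all made up to illustrate symmetric int8 quantization.

```python
# Toy symmetric int8 quantization with a clipping threshold chosen by MSE.
# This only illustrates the idea; it is not the algorithm used by aidge_quantization.
def quantize(values, clip):
    """Map floats to int8 codes using the scale clip / 127."""
    scale = clip / 127
    return [max(-127, min(127, round(v / scale))) for v in values], scale


def dequantize(codes, scale):
    """Map int8 codes back to floats."""
    return [c * scale for c in codes]


def mse(a, b):
    """Mean squared error between two equally sized sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Made-up calibration values with a single outlier.
calibration = [0.02 * i for i in range(-50, 51)] + [8.0]

best_clip, best_error = None, float("inf")
for clip in (1.0, 2.0, 4.0, 8.0):
    codes, scale = quantize(calibration, clip)
    error = mse(calibration, dequantize(codes, scale))
    if error < best_error:
        best_clip, best_error = clip, error

print(f"best clipping threshold: {best_clip}, MSE: {best_error:.6f}")
```

In this small sample the single outlier dominates the error, so the widest threshold is selected. Clipping only pays off when outliers are rare compared with the rounding error that a wide range imposes on all other values.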
[ ]:
aidge_model_explorer.visualize(quantized_model, "MLP_MNIST_quantized", embed=True)

Export#

After testing the imported graph, one of Aidge’s main features can be explored: exporting the computational graph to a hardware target through code generation.

pipeline(6)

Generate an export in C++#

In this example, a generic C++ export is generated. This export is independent of the previously configured backend_cpu. It produces a standalone output, abstracted from the Aidge platform.

[ ]:
!rm -r my_export
[ ]:
!ls my_export

Let’s first import the functions needed to export the model. The export process goes as follows:

  1. Create or import the hardware model of the target we want to export the model on;

  2. Adapt the graph to the chosen target;

  3. Generate the export files from the adapted graph.

[ ]:
from pathlib import Path
import shutil

import aidge_export_cpp
from aidge_core.mem_info import generate_optimized_memory_info
from aidge_core.export_utils import (
    scheduler_export,
    generate_main_cpp,
    adapt_graph,
    graph_evaluator,
    get_export_libs_from_names,
    copy_folder,
)

1. Getting the target hardware model#

Several hardware models are natively supported in Aidge.

A hardware model contains several pieces of information useful for the export:

  • The number of memories and their size;

  • The number of processing elements and their supported export libraries.

In this example, we will use the default one, called “Host”, which is mainly used to test models on your local machine.
It has:
  • A 1 GB memory;

  • A CPU (supporting both aidge_export_cpp and aidge_export_xnnpack modules).

Note: The Host device is defined in aidge/aidge/aidge_core/aidge_core/hw_model/targets/host/Host.py

[ ]:
from aidge_core.hw_model import Host

target = Host(xnn_pack=False)
export_libs = get_export_libs_from_names(target.cpu._lib)
print(target.cpu._lib)
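The kind of information carried by a hardware model can be pictured with a few dataclasses. The classes below are hypothetical and do not reflect the structure of aidge_core.hw_model; they only show how memories and processing elements, with their supported export libraries, could be described and queried.

```python
# Toy description of a hardware target (hypothetical classes, not aidge_core.hw_model).
from dataclasses import dataclass, field


@dataclass
class Memory:
    name: str
    size_bytes: int


@dataclass
class ProcessingElement:
    name: str
    export_libs: list = field(default_factory=list)


@dataclass
class HardwareModel:
    memories: list
    processing_elements: list

    def fits(self, required_bytes):
        """Return True if at least one memory can hold `required_bytes`."""
        return any(m.size_bytes >= required_bytes for m in self.memories)


host = HardwareModel(
    memories=[Memory("ram", 1024**3)],  # 1 GB, as for the Host target
    processing_elements=[ProcessingElement("cpu", ["aidge_export_cpp"])],
)
print(host.fits(50_000 * 4))  # e.g. 50,000 float32 parameters
```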

2. Adapt the graph to the chosen target#

The export process mainly consists of matching Aidge operators with available implementations from the list of ExportLibs compatible with the chosen target.
Each implementation often comes with constraints.
For instance, ExportLibCpp provides a convolution implementation that also handles padding. Hence, before exporting the model, we need to fuse every convolution of the graph with the Padding node that precedes it.
As this implementation only supports the NHWC data format, we need to adapt the graph accordingly and add Transpose nodes where needed.
The adapt_graph() function provides an automated way to perform all these steps at once and to find the best possible topology for exporting the graph to the desired target.
It provides some arguments to guide the optimization:
  • heuristic: The fusion heuristic can be changed to handle specific cases. The default one tries to fuse as many operators as possible into each MetaOperator;

  • wanted_in_dformat: Sets a constraint on the input data format. This is useful to limit the number of possibilities;

  • constant_fold: Automatically folds the constant operators.

[ ]:
adapt_graph(
    model,
    export_libs,
    heuristic=graph_evaluator.evaluate_nb_ops,
    wanted_in_dformat=aidge_core.dformat.nhwc,
    constant_fold=True,
)
[ ]:
aidge_model_explorer.visualize(model, "adapted_model", embed=True)
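What a Transpose node does when switching data formats can be shown on nested lists. This sketch is independent of Aidge; the `nchw_to_nhwc` helper is a made-up name, and the values are chosen so that each number encodes its own channel, row and column.

```python
# Converting a tensor from NCHW to NHWC layout with nested lists (illustration only).
def nchw_to_nhwc(tensor):
    """Reorder a 4-D nested list from [N][C][H][W] to [N][H][W][C]."""
    return [
        [
            [[image[c][h][w] for c in range(len(image))]
             for w in range(len(image[0][0]))]
            for h in range(len(image[0]))
        ]
        for image in tensor
    ]


# One 2-channel image of size 2x3; each value encodes its (channel, row, column).
nchw = [[[[100 * c + 10 * h + w for w in range(3)] for h in range(2)] for c in range(2)]]
nhwc = nchw_to_nhwc(nchw)

print(nhwc[0][1][2])  # channel values at row 1, column 2
```

In NHWC, all channel values of a given pixel are adjacent, which is why an implementation written for that layout needs the input reordered first.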

Again, we can check the validity of the graph, this time according to the export libraries.

[ ]:
aidge_core.check_graph_validity(model, report=True)

3. Generate export files#

The last part generates the export project from the scheduled graph.
As the graph has probably been changed by the adapt_graph() function, we first need to regenerate the scheduling.
[ ]:
# Regenerate the scheduler
scheduler.reset_scheduling()
scheduler.generate_scheduling()

# Export the graph
scheduler_export(
    scheduler,
    "my_export",
    memory_manager=generate_optimized_memory_info,
    memory_manager_args={"stats_folder": "my_export/stats"},
)

The scheduler_export function generates the following files:

  • dnn/include/forward.hpp - defines an API function to use the export;

  • dnn/include/kernels - contains kernel function declarations;

  • dnn/include/layers - contains layer configuration headers;

  • dnn/include/parameters - stores parameter definitions;

  • dnn/src/forward.cpp - implements the forward function that calls the generated kernels.

[ ]:
!tree my_export
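As an alternative to reading the tree output, the presence of the entries listed above can be checked programmatically. This is a minimal sketch: the `missing_entries` helper is a hypothetical name, and the expected paths are simply the ones enumerated in this section.

```python
# Check that an export folder contains the entries listed above.
from pathlib import Path

EXPECTED_ENTRIES = [
    "dnn/include/forward.hpp",
    "dnn/include/kernels",
    "dnn/include/layers",
    "dnn/include/parameters",
    "dnn/src/forward.cpp",
]


def missing_entries(export_root):
    """Return the expected files or folders that are absent from `export_root`."""
    root = Path(export_root)
    return [entry for entry in EXPECTED_ENTRIES if not (root / entry).exists()]


missing = missing_entries("my_export")
print("export looks complete" if not missing else f"missing: {missing}")
```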

Generate main file#

The scheduler_export function generates only the kernel exports and a forward function that invokes them in the order defined by the scheduler.

From this point, application development can begin. Aidge provides a utility function named generate_main_cpp, which generates a simple main.cpp file capable of performing an inference pass using an input tensor supplied by the user.

[ ]:
generate_main_cpp("my_export", model)
!cat my_export/main.cpp

Generate the Makefile#

The board_files_path attribute of the target usually contains board drivers and/or Makefiles used to compile the generated project.

[ ]:
copy_folder(target.board_files_path, "my_export")
!cat my_export/Makefile

Generate an input file for tests (optional)#

To test the export, input data must be provided. The generate_main_cpp function automatically generates an input file based on the tensor set as input.

In this case, since an input tensor was already provided during the forward pass, executing the following cell is not necessary. However, if no input tensor has been defined, the input file must be generated manually. This can be done by exporting a numpy array using:

aidge_core.export_utils.generate_input_file(export_folder="my_export", array_name="fc1_Gemm_input_0", tensor=aidge_core.Tensor(digit.reshape(-1)))

Compile the export#

Once the export has been generated, it can be compiled using a simple make command:

[ ]:
!cd my_export && make

Execute the export#

[ ]:
!./my_export/bin/run_export