Aidge demonstration#
Aidge is a collaborative open-source deep learning library optimized for export to and processing on embedded devices. With Aidge, you can create or import a computational graph from common frameworks, edit its structure, train it and export its architecture to many embedded targets. Aidge provides optimized functions for inference as well as training, along with many custom functionalities for the target device.
This notebook walks through the tool chain used to import a deep neural network from an ONNX model and run inference on it with Aidge, from ONNX import and graph transformation to inference and C++ code export.
The MNIST digit recognition task is used to demonstrate this tool chain.
Setting up the notebook#
(if needed) Download the model#
If you don’t have git-lfs, you can download the model and data with the following piece of code:
[1]:
import os
import requests
def download_material(path: str) -> None:
if not os.path.isfile(path):
response = requests.get("https://gitlab.eclipse.org/eclipse/aidge/aidge/-/raw/dev/examples/tutorials/101_first_step/"+path+"?ref_type=heads")
if response.status_code == 200:
with open(path, 'wb') as f:
f.write(response.content)
print("File downloaded successfully.")
else:
print("Failed to download file. Status code:", response.status_code)
# Download onnx model file
download_material("MLP_MNIST.onnx")
# Download input data
download_material("input_digit.npy")
# Download output data for later comparison
download_material("output_digit.npy")
Define mermaid visualizer function#
Aidge saves graphs using the mermaid format. In order to visualize the graph live in the notebook, we will set up the following function:
[2]:
import base64
import zlib
import json
from IPython.display import Image, display
import matplotlib.pyplot as plt
def visualize_mmd(path_to_mmd):
with open(path_to_mmd, "r") as file_mmd:
graph_mmd = file_mmd.read()
jGraph = {
"code": graph_mmd,
"mermaid": {"theme": "default"}
}
byteStr = bytes(json.dumps(jGraph), 'ascii')
compress = zlib.compressobj(9, zlib.DEFLATED, 15, 8,zlib.Z_DEFAULT_STRATEGY)
deflated = compress.compress(byteStr)
deflated += compress.flush()
dEncode = base64.urlsafe_b64encode(deflated)
display(Image(url='https://mermaid.ink/img/pako:' + dEncode.decode('ascii')))
Import Aidge#
In order to provide a collaborative environment on the platform, Aidge is structured around a core library that interfaces with multiple modules bound to Python libraries.
- aidge_core is the core library and offers all the basic functionalities to create and manipulate the internal graph representation;
- aidge_backend_cpu is a C++ module providing a generic C++ implementation for each component of the graph;
- aidge_onnx is a module allowing ONNX models to be imported into the Aidge framework;
- aidge_export_cpp is a module dedicated to the generation of optimized C++ code.
This way, aidge_core is free of any dependency, and users can install only the modules required for their use case.
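Since each module is optional, you can quickly check which ones are installed in the current environment before running the rest of the notebook (a small helper that is not part of Aidge itself, using only the Python standard library):

import importlib.util

# Check which optional Aidge modules are importable in this environment
for module in ("aidge_core", "aidge_backend_cpu", "aidge_onnx", "aidge_export_cpp", "aidge_quantization"):
    found = importlib.util.find_spec(module) is not None
    print(f"{module}: {'available' if found else 'not installed'}")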
[3]:
import aidge_core
# For the Conv2D operator, only the "export_serialize" backend is available
print(f"Available backends:\n{aidge_core.get_keys_Conv2DOp()}")
# note: Tensor is a special case as 'cpu' backend is provided in the core
# module to guarantee basic functionalities such as data access
print(f"Available backends for Tensor:\n{aidge_core.Tensor.get_available_backends()}")
Available backends:
{'export_serialize'}
Available backends for Tensor:
{'cpu'}
As you can see, only an export backend, export_serialize, is available for the Conv2D class. We need to import the aidge_backend_cpu module, which registers itself automatically with aidge_core, in order to have access to a backend able to run an inference.
[4]:
import aidge_backend_cpu
print(f"Available backends:\n{aidge_core.get_keys_Conv2DOp()}")
Available backends:
{'cpu', 'export_serialize'}
For this tutorial, we will need to import aidge_onnx in order to load ONNX files, numpy in order to load data and matplotlib to display images.
[5]:
import aidge_onnx
import numpy as np
import matplotlib.pyplot as plt
ONNX Import#
[6]:
model = aidge_onnx.load_onnx("MLP_MNIST.onnx")
- Flatten (Flatten | GenericOperator)
- axis : 1
- fc1_Gemm (Gemm)
- Relu (Relu)
- fc2_Gemm (Gemm)
- Relu_1 (Relu)
- fc3_Gemm (Gemm)
As you can see in the logs, Aidge imported a node as a GenericOperator:
- Flatten (Flatten | GenericOperator)
This is a fallback mechanism which allows Aidge to load an ONNX graph without failing, even when it encounters a node that is not available. The GenericOperator acts as a stub, retrieving the node type and attributes from ONNX. This allows you to provide an implementation in a user script or, as we will see, to remove/replace such nodes using Aidge recipes. You can visualize the graph using the save method and the mermaid visualizer we have set up.
[7]:
model.save("myModel")
visualize_mmd("myModel.mmd")
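Besides the visual rendering, the nodes of the graph can also be inspected programmatically, which is handy before applying transformations (a small sketch, assuming the get_nodes(), name() and type() accessors of the Aidge Python API):

# Print every node of the imported graph together with its type,
# e.g. to spot the nodes imported through the GenericOperator fallback
for node in model.get_nodes():
    print(f"{node.name()} -> {node.type()}")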
Graph transformation#
In order to run this graph for inference, every operator must be supported. The imported model contains a Flatten operator before the Gemm operator. Since the Aidge FC operator already handles the flatten operation, a graph transformation is required to make the graph runnable, i.e. removing the Flatten operator.
Aidge graph transformations are embedded inside recipes functions. These recipes are available in aidge_core; some of them are:
- fuse_batchnorm: fuse BatchNorm inside the Conv or FC operator;
- matmul_to_fc: fuse MatMul and Add operators into an FC operator;
- conv_horizontal_tiling: replace a Conv by a horizontally tiled version;
- remove_flatten: remove Flatten if it is placed before an FC operator;
- adapt_to_backend: adapt the graph to the current backend by adding Transpose layers to match the expected input/output data format;
- constant_folding: compute the constant parts of the graph and replace them by pre-computed values.
Let’s apply the remove_flatten recipe:
[8]:
# Use the remove_flatten recipe
aidge_core.remove_flatten(model)
The Flatten operator has been removed by the recipe; let’s visualize the model:
[9]:
model.save("mySupportedModel")
visualize_mmd("mySupportedModel.mmd")
Static analysis#
Static analysis can be applied at any time to a graph in order to measure its complexity in terms of memory and operations.
[10]:
import aidge_core.static_analysis
# Dims must be forwarded for static analysis!
model.forward_dims(dims=[[1, 1, 28, 28]], allow_data_dependency=True)
model_stats = aidge_core.static_analysis.StaticAnalysisExt(model)
model_stats.summary()
--------------------------------------------------------------------------------
Layer (type) Output Shape Param #
================================================================================
fc1_Gemm (FC#0) [1, 50] 39250
Relu (ReLU#0) [1, 50] 0
fc2_Gemm (FC#1) [1, 50] 2550
Relu_1 (ReLU#1) [1, 50] 0
fc3_Gemm (FC#2) [1, 10] 510
================================================================================
Total params: 42310
--------------------------------------------------------------------------------
Input size (MB): 0.00299072265625
Forward/backward pass size (MB): 0.00080108642578125
Params size (MB): 0.16139984130859375
Estimated Total Size (MB): 0.165191650390625
--------------------------------------------------------------------------------
[11]:
model_stats.log_nb_ops_by_type("stats_ops.png", log_scale=True)
[11]:
[['fc1_Gemm (FC#0)', [0, 78400, 0, 0, 0]],
['Relu (ReLU#0)', [0, 0, 0, 50, 0]],
['fc2_Gemm (FC#1)', [0, 5000, 0, 0, 0]],
['Relu_1 (ReLU#1)', [0, 0, 0, 50, 0]],
['fc3_Gemm (FC#2)', [0, 1000, 0, 0, 0]]]
Inference#
Create an input tensor#
In order to perform an inference, we will load an image from the MNIST dataset using NumPy.
[12]:
## Load input data & its output from the MNIST_model
digit = np.load("input_digit.npy")
plt.imshow(digit[0][0], cmap='gray')
[12]:
<matplotlib.image.AxesImage at 0x7f3b3e7a16a0>
And in order to validate the result our model will produce, we will also load the output that the PyTorch model provided for this image:
[13]:
output_model = np.load("output_digit.npy")
print(output_model)
[[[ -1.3114135 -1.3960878 5.118178 5.338807 -8.182431
-0.612254 -11.45598 13.0557165 -3.0393667 2.6212344]]]
Thanks to NumPy interoperability, we can create an Aidge Tensor directly from the NumPy array storing the image.
[14]:
input_tensor = aidge_core.Tensor(digit)
print(f"Aidge Input Tensor dimensions: \n{input_tensor.dims()}")
Aidge Input Tensor dimensions:
[1, 1, 28, 28]
Configure the model for inference#
At the moment the model has no implementation; it is only a data structure. To set an implementation, we will set a data type and a backend.
[15]:
# Configure the model
model.compile("cpu", aidge_core.dtype.float32, dims=[[1,1,28,28]])
# equivalent to set_datatype(), set_backend() and forward_dims()
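For reference, the compile() call above is a shortcut; here is a sketch of the equivalent explicit calls mentioned in the comment (assuming the GraphView set_backend/set_datatype/forward_dims methods):

# Explicit equivalent of model.compile(...):
model.set_backend("cpu")                      # pick the implementation registered by aidge_backend_cpu
model.set_datatype(aidge_core.dtype.float32)  # set the data type of the graph tensors
model.forward_dims(dims=[[1, 1, 28, 28]])     # propagate the input dimensions through the graph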
Create a scheduler and run inference#
The graph is ready to run! We just need to schedule the execution. To do this, we will create a Scheduler object, which takes the graph and generates an optimized scheduling using a consumer-producer heuristic.
[16]:
# Create SCHEDULER
scheduler = aidge_core.SequentialScheduler(model)
# Run inference !
scheduler.forward(data=[input_tensor])
Context: Consumer node fc1_Gemm (FC#0) input #0
No producer node attached to input#0 for node fc1_Gemm (FC)
Context: Consumer node fc1_Gemm (FC#0) input #0
No producer node attached to input#0 for node fc1_Gemm (FC)
[17]:
# Assert results
for outNode in model.get_output_nodes():
output_aidge = np.array(outNode.get_operator().get_output(0))
print(output_aidge)
print('Aidge prediction = ', np.argmax(output_aidge[0]))
assert(np.allclose(output_aidge, output_model,rtol=1e-04))
[[ -1.3114134 -1.3960874 5.1181774 5.3388066 -8.182431 -0.6122534
-11.455979 13.055716 -3.0393662 2.6212344]]
Aidge prediction = 7
It is possible to save the scheduling in a mermaid format using:
[18]:
scheduler.save_scheduling_diagram("schedulingSequential")
visualize_mmd("schedulingSequential.mmd")
Optimize network#
We now apply post-training quantization (PTQ) to the network. We first clone the graph, so that quantization is performed on a copy and the original floating-point model remains available for the export section below.
[19]:
quantized_model = model.clone()
[20]:
import gzip
NB_SAMPLES = 100 # Number of samples to use for PTQ
# Use data stored in PTQ tutorial, make sure to download them using git lfs
samples = np.load(gzip.GzipFile('../PTQ_tutorial/mnist_samples.npy.gz', "r"))
for i in range(10):
plt.subplot(1, 10, i + 1)
plt.axis('off')
plt.tight_layout()
plt.imshow(samples[i], cmap='gray')
tensors = []
for sample in samples[0:NB_SAMPLES]:
sample = np.reshape(sample, (1, 1, 28, 28)).astype(np.float32)
tensor = aidge_core.Tensor(sample)
tensors.append(tensor)
[ ]:
import aidge_quantization
aidge_quantization.quantize_network(
quantized_model,
8,
tensors,
clipping_mode = aidge_quantization.Clipping.MSE,
no_quantization = False,
optimize_signs = True,
single_shift = False,
use_cuda = False)
=== QUANT PTQ 0.2.19 ===
Preparing the network for the PTQ ...
Inserting the scaling nodes ...
Caution: The [Scaling] operator is now deprecated and should no longer be used.
It has been replaced by the MetaOperator [Quantizer] (located directly in aidge_quantization).
Notice: the 0-th Parent of the child node Relu (of type ReLU) already existed
Filling a Tensor already attributed.
You are replacing an existing parent for node Relu (of type ReLU).
Caution: The [Scaling] operator is now deprecated and should no longer be used.
It has been replaced by the MetaOperator [Quantizer] (located directly in aidge_quantization).
Notice: the 0-th Parent of the child node Relu_1 (of type ReLU) already existed
Filling a Tensor already attributed.
You are replacing an existing parent for node Relu_1 (of type ReLU).
Caution: The [Scaling] operator is now deprecated and should no longer be used.
It has been replaced by the MetaOperator [Quantizer] (located directly in aidge_quantization).
Applying the Cross-Layer Equalization ...
Normalizing the parameters ...
Computing the value ranges ...
Context: Consumer node fc1_Gemm (FC#0) input #0
No producer node attached to input#0 for node fc1_Gemm (FC)
Context: Consumer node fc1_Gemm (FC#0) input #0
No producer node attached to input#0 for node fc1_Gemm (FC)
Optimizing the clipping values ...
Context: Consumer node fc1_Gemm (FC#0) input #0
No producer node attached to input#0 for node fc1_Gemm (FC)
Context: Consumer node fc1_Gemm (FC#0) input #0
No producer node attached to input#0 for node fc1_Gemm (FC)
Context: Consumer node fc1_Gemm (FC#0) input #0
No producer node attached to input#0 for node fc1_Gemm (FC)
Context: Consumer node fc1_Gemm (FC#0) input #0
No producer node attached to input#0 for node fc1_Gemm (FC)
Normalizing the activations ...
Context: Consumer node fc1_Gemm (FC#0) input #0
No producer node attached to input#0 for node fc1_Gemm (FC)
Context: Consumer node fc1_Gemm (FC#0) input #0
No producer node attached to input#0 for node fc1_Gemm (FC)
Context: Consumer node fc1_Gemm (FC#0) input #0
No producer node attached to input#0 for node fc1_Gemm (FC)
Context: Consumer node fc1_Gemm (FC#0) input #0
No producer node attached to input#0 for node fc1_Gemm (FC)
Quantizing the normalized network ...
Context: Consumer node fc1_Gemm (FC#0) input #0
No producer node attached to input#0 for node fc1_Gemm (FC)
Context: Consumer node fc1_Gemm (FC#0) input #0
No producer node attached to input#0 for node fc1_Gemm (FC)
Context: Consumer node fc1_Gemm (FC#0) input #0
No producer node attached to input#0 for node fc1_Gemm (FC)
Context: Consumer node fc1_Gemm (FC#0) input #0
No producer node attached to input#0 for node fc1_Gemm (FC)
Context: Consumer node fc1_Gemm (FC#0) input #0
No producer node attached to input#0 for node fc1_Gemm (FC)
Context: Consumer node fc1_Gemm (FC#0) input #0
No producer node attached to input#0 for node fc1_Gemm (FC)
Resetting the scheduler ...
Network is quantized !
[22]:
quantized_model.save("quantizedModel")
visualize_mmd("quantizedModel.mmd")
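The quantization pass rewrites the cloned graph in place, inserting scaling/quantizer nodes around the original operators. A quick way to see what changed, reusing the same graph accessors as before (a small sketch, not part of the original tutorial):

# Compare the node types of the original and the quantized graphs
print("Original  :", sorted(node.type() for node in model.get_nodes()))
print("Quantized :", sorted(node.type() for node in quantized_model.get_nodes()))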
Export#
Now that we have tested the imported graph, we can look at one of the main features of Aidge: exporting a computational graph to a hardware target using code generation.
Generate an export in C++#
In this example we will generate a generic C++ export. This export is not based on the cpu backend we set up earlier: we will create a standalone export, abstracted from the Aidge platform.
[23]:
! rm -r myexport
[24]:
!ls myexport
ls: cannot access 'myexport': No such file or directory
Generating a cpu export requires the aidge_export_cpp module.
Once the module is imported, you only need one line to generate an export of the graph.
[25]:
import aidge_export_cpp
# Configuration for the model + forward dimensions
model.compile("cpu", aidge_core.dtype.float32, dims=[[1, 1, 28, 28]])
# Export the model in C++ standalone
aidge_core.export_utils.scheduler_export(
scheduler,
"myexport",
aidge_export_cpp.ExportLibCpp,
memory_manager=aidge_core.mem_info.generate_optimized_memory_info,
memory_manager_args={"stats_folder": "myexport/stats", "wrapping": False }
)
gnuplot 5.2 patchlevel 8
The scheduler_export function will generate:
- dnn/include/forward.hpp: defines the API function used to call the export;
- dnn/include/kernels: folder with the kernels;
- dnn/include/layers: layer configurations;
- dnn/include/parameters: folder with the parameters;
- dnn/src/forward.cpp: source code of the forward function, which calls the kernels;
- Makefile: to compile the main.cpp.
[26]:
!tree myexport
myexport
├── Makefile
├── dnn
│ ├── include
│ │ ├── forward.hpp
│ │ ├── kernels
│ │ │ ├── activation.hpp
│ │ │ ├── fullyconnected.hpp
│ │ │ ├── macs.hpp
│ │ │ └── rescaling.hpp
│ │ ├── layers
│ │ │ ├── Relu.h
│ │ │ ├── Relu_1.h
│ │ │ ├── fc1_Gemm.h
│ │ │ ├── fc2_Gemm.h
│ │ │ └── fc3_Gemm.h
│ │ ├── network
│ │ │ ├── typedefs.hpp
│ │ │ └── utils.hpp
│ │ └── parameters
│ │ ├── fc1_bias.h
│ │ ├── fc1_weight.h
│ │ ├── fc2_bias.h
│ │ ├── fc2_weight.h
│ │ ├── fc3_bias.h
│ │ └── fc3_weight.h
│ └── src
│ └── forward.cpp
└── stats
└── graph
├── memory_info
└── memory_info_plot.png
9 directories, 22 files
Generate main file#
The scheduler export only generates the kernels and a forward function which calls the kernels in the order described by the scheduler.
From this point we can start to build an application. In order to do so, Aidge provides a utility function, generate_main_cpp, which generates a simple main.cpp performing an inference on a tensor provided by the user.
[27]:
aidge_core.export_utils.generate_main_cpp("myexport", model)
gen : myexport/fc1_Gemm_input_0.h
[28]:
!cat myexport/main.cpp
#include <iostream>
#include "forward.hpp"
#include "fc1_Gemm_input_0.h"
int main()
{
// Initialize the output arrays
float* fc3_Gemm_output_0 = nullptr;
// Call the forward function
model_forward(fc1_Gemm_input_0, &fc3_Gemm_output_0);
// Print the results of each output
printf("fc3_Gemm_output_0:\n");
for (int o = 0; o < 10; ++o) {
printf("%f ", fc3_Gemm_output_0[o]);
}
printf("\n");
return 0;
}
(Optional) Generate an input file for tests#
To test the export we need to provide data. The generate_main_cpp function automatically generates an input file from the tensor set as an input of the graph.
This is the case here, as we set an input tensor when running the forward pass, so we don’t need to execute the following cell.
However, if no input has been set, you need to generate the input file manually; to do so, we can export the numpy array using:
aidge_core.export_utils.generate_input_file(export_folder="myexport", array_name="fc1_Gemm_input_0", tensor=aidge_core.Tensor(digit.reshape(-1)))
Compile the export#
Once the generation has been done, we can compile the export with a simple make command:
[29]:
!cd myexport && make
g++ -O2 -Wall -Wextra -MMD -fopenmp -I. -I./dnn -I./dnn/include -I./dnn/layers -I./dnn/parameters -c dnn/src/forward.cpp -o build/./dnn/src/forward.o
g++ -O2 -Wall -Wextra -MMD -fopenmp -I. -I./dnn -I./dnn/include -I./dnn/layers -I./dnn/parameters -c main.cpp -o build/./main.o
g++ build/./dnn/src/forward.o build/./main.o -fopenmp -o bin/run_export
Run the export#
[30]:
!./myexport/bin/run_export
fc3_Gemm_output_0:
-1.311413 -1.396087 5.118177 5.338807 -8.182431 -0.612253 -11.455979 13.055716 -3.039366 2.621234
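As a last check, the values printed by the export can be compared against the reference output loaded earlier from output_digit.npy (a sketch that is not part of the original notebook; it assumes the executable prints a label line followed by the ten output values, as shown above):

import subprocess

# Run the compiled export, parse the printed values and compare them to the reference output
result = subprocess.run(["./myexport/bin/run_export"], capture_output=True, text=True)
values = np.array([float(v) for v in result.stdout.split()[1:]])  # skip the "fc3_Gemm_output_0:" label
print("Export prediction =", np.argmax(values))
assert np.allclose(values, output_model.flatten(), atol=1e-3)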