Perform static analysis on a neural network model#
In this tutorial, we employ Aidge’s static analysis
module to evaluate the complexity of the DINOv2 model.
Setting up the notebook#
[1]:
# Import Aidge modules
import aidge_core
import aidge_onnx
# Reduce verbosity: only log errors
aidge_core.Log.set_console_level(aidge_core.Level.Error)
# Import module to show images in the notebook
from IPython.display import Image
# Import some utility modules
import os
import requests
Download the model (if needed)#
[2]:
# Download the DINOv2 model ONNX file, if this has not been done yet
file_url = "https://huggingface.co/EclipseAidge/dinov2/resolve/main/dinov2.onnx?download=true"
file_path = "dinov2.onnx"
aidge_core.utils.download_file(file_path, file_url)
dinov2.onnx: 100%|██████████| 82.7M/82.7M [00:01<00:00, 53.5MB/s]
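If aidge_core.utils.download_file is not available in your Aidge version, a minimal fallback sketch using the requests and os modules imported above could be:

# Hypothetical fallback: stream the ONNX file to disk with requests
if not os.path.exists(file_path):
    response = requests.get(file_url, stream=True)
    response.raise_for_status()
    with open(file_path, "wb") as f:
        for chunk in response.iter_content(chunk_size=8192):
            f.write(chunk)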
Import the ONNX model into Aidge#
[3]:
# We rely on the Aidge ONNX module, which provides an interface between ONNX operators and Aidge's internal representation
dinov2_model = aidge_onnx.load_onnx("dinov2.onnx")
# Verify Aidge's native operator coverage for this model
aidge_onnx.native_coverage_report(dinov2_model)
Native operators: 824 (17 types)
- Add: 159
- Concat: 1
- Conv2D: 1
- Div: 49
- Erf: 12
- Gather: 1
- MatMul: 72
- Mul: 73
- Pow: 25
- Producer: 209
- ReduceMean: 50
- Reshape: 49
- Softmax: 12
- Split: 12
- Sqrt: 25
- Sub: 25
- Transpose: 49
Generic operators: 0 (0 types)
Native types coverage: 100.0% (17/17)
Native operators coverage: 100.0% (824/824)
[3]:
(defaultdict(int,
{'Add': 159,
'Reshape': 49,
'Transpose': 49,
'Erf': 12,
'MatMul': 72,
'Split': 12,
'Mul': 73,
'Div': 49,
'Sqrt': 25,
'ReduceMean': 50,
'Pow': 25,
'Sub': 25,
'Softmax': 12,
'Gather': 1,
'Producer': 209,
'Concat': 1,
'Conv2D': 1}),
defaultdict(int, {}))
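As the output above shows, native_coverage_report also returns its counts as a pair of dictionaries (native and generic operator counts), which can be inspected programmatically:

# Unpack the returned dictionaries and list operator types by frequency
native_ops, generic_ops = aidge_onnx.native_coverage_report(dinov2_model)
for op_type, count in sorted(native_ops.items(), key=lambda kv: -kv[1]):
    print(f"{op_type}: {count}")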
Explore a handful of graph transformations#
In this section we use some of Aidge’s recipes to simplify the model graph.
[4]:
# Create a clone of the original model to be used for comparison later
clone_dinov2 = dinov2_model.clone()
# Simplify the model using meta-operators via the ``fuse_to_metaops`` recipe
# Each call takes a Graph Regex query specifying the sequence of operators to be replaced by the given meta-operator
aidge_core.fuse_to_metaops(dinov2_model, "MatMul-*>Add", "Linear")
aidge_core.fuse_to_metaops(dinov2_model, "ReduceMean-*>Sub#1~>(Pow#1->ReduceMean-*>Add#1->Sqrt)-*>Div#1-*>Mul#1-*>Add#2;"
"Sub#1~*>Div#1;"
"Pow#1<1~Producer;"
"Add#1<*~Producer;"
"Mul#1<*~Producer;"
"Add#2<*~Producer;"
"Sub#1~>$", "LayerNorm")
aidge_core.fuse_to_metaops(dinov2_model, "MatMul->Div#1->Softmax-*>MatMul;"
"Div#1<1~Producer", "ScaledDotProductAttention")
aidge_core.fuse_to_metaops(dinov2_model, "ScaledDotProductAttention#1->Transpose->Reshape#1->Linear;"
"Reshape#1<1~Producer;"
"ScaledDotProductAttention#1<0-(Transpose<-Reshape#2<-Add#1);"
"ScaledDotProductAttention#1<1-(Transpose<-Reshape#3<-Add#2);"
"ScaledDotProductAttention#1<2-(Transpose<-Reshape#4<-Add#3);"
"Reshape#2<1~Producer;"
"Add#1<*-0-Split#1;"
"Add#2<*-1-Split#1;"
"Add#3<*-2-Split#1;"
"Split#1<-MatMul;"
"Split#1<1~Producer", "MultiHeadAttention")
aidge_core.fuse_to_metaops(dinov2_model, "Div#1->Erf->Add#1-*>Mul->Mul#2;"
"Div#1<1~Producer;"
"Add#1<*~Producer;"
"Mul#2<*~Producer", "GeLU")
dinov2_model.set_ordered_outputs([dinov2_model.get_ordered_outputs()[0][0].inputs()[0], dinov2_model.get_ordered_outputs()[0]])
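To double-check that the fusions took effect, the graph's nodes can be iterated directly; a small sketch, assuming the GraphView and Node interfaces expose get_nodes(), type() and name() as in recent Aidge releases:

# List the meta-operator nodes created by the fusion recipes
for node in dinov2_model.get_nodes():
    if node.type() in ("Linear", "LayerNorm", "MultiHeadAttention", "GeLU"):
        print(node.type(), node.name())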
[5]:
# After creating meta-operators, we can verify the updated total number of operators in the graph
_ = aidge_onnx.native_coverage_report(dinov2_model)
Native operators: 277 (12 types)
- Add: 25
- Concat: 1
- Conv2D: 1
- Gather: 1
- GeLU: 12
- LayerNorm: 25
- Linear: 24
- Mul: 24
- MultiHeadAttention: 12
- Producer: 150
- Reshape: 1
- Transpose: 1
Generic operators: 0 (0 types)
Native types coverage: 100.0% (12/12)
Native operators coverage: 100.0% (277/277)
The number of operators has been reduced from 824 (17 types) to 277 (12 types), as the report above shows.
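The reduction can also be verified programmatically against the clone made earlier:

# Compare operator counts before (clone) and after fusion
before, _ = aidge_onnx.native_coverage_report(clone_dinov2)
after, _ = aidge_onnx.native_coverage_report(dinov2_model)
print(f"Operators: {sum(before.values())} -> {sum(after.values())}")   # 824 -> 277
print(f"Operator types: {len(before)} -> {len(after)}")                # 17 -> 12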
Run static analysis#
[6]:
import aidge_core.static_analysis
# Input dimensions must be forwarded for static analysis
dinov2_model.forward_dims(dims=[[1,3,224,224]], allow_data_dependency=True)
dinov2_stats = aidge_core.static_analysis.StaticAnalysis(dinov2_model)
dinov2_stats.summary()
--------------------------------------------------------------------------------
Layer (type) Output Shape Param #
================================================================================
embeddings_patch_embeddings_projection_Conv (Conv2D#0) [1, 384, 16, 16] 226176
embeddings_patch_embeddings_Reshape (Reshape#0) [1, 384, 256] 3
embeddings_patch_embeddings_Transpose (Transpose#0) [1, 256, 384] 0
embeddings_Concat (Concat#0) [1, 257, 384] 384
embeddings_Add (Add#0) [1, 257, 384] 98688
(LayerNorm#0) [1, 257, 384] 770
(MultiHeadAttention#0) [1, 257, 384] 591370
encoder_layer_0_layer_scale1_Mul (Mul#0) [1, 257, 384] 384
encoder_layer_0_Add (Add#1) [1, 257, 384] 0
(LayerNorm#1) [1, 257, 384] 770
(Linear#0) [1, 257, 1536] 591360
(GeLU#0) [1, 257, 1536] 3
(Linear#1) [1, 257, 384] 590208
encoder_layer_0_layer_scale2_Mul (Mul#1) [1, 257, 384] 384
encoder_layer_0_Add_1 (Add#2) [1, 257, 384] 0
(LayerNorm#2) [1, 257, 384] 770
(MultiHeadAttention#1) [1, 257, 384] 591370
encoder_layer_1_layer_scale1_Mul (Mul#2) [1, 257, 384] 384
encoder_layer_1_Add (Add#3) [1, 257, 384] 0
(LayerNorm#3) [1, 257, 384] 770
(Linear#2) [1, 257, 1536] 591360
(GeLU#1) [1, 257, 1536] 3
(Linear#3) [1, 257, 384] 590208
encoder_layer_1_layer_scale2_Mul (Mul#3) [1, 257, 384] 384
encoder_layer_1_Add_1 (Add#4) [1, 257, 384] 0
(LayerNorm#4) [1, 257, 384] 770
(MultiHeadAttention#2) [1, 257, 384] 591370
encoder_layer_2_layer_scale1_Mul (Mul#4) [1, 257, 384] 384
encoder_layer_2_Add (Add#5) [1, 257, 384] 0
(LayerNorm#5) [1, 257, 384] 770
(Linear#4) [1, 257, 1536] 591360
(GeLU#2) [1, 257, 1536] 3
(Linear#5) [1, 257, 384] 590208
encoder_layer_2_layer_scale2_Mul (Mul#5) [1, 257, 384] 384
encoder_layer_2_Add_1 (Add#6) [1, 257, 384] 0
(LayerNorm#6) [1, 257, 384] 770
(MultiHeadAttention#3) [1, 257, 384] 591370
encoder_layer_3_layer_scale1_Mul (Mul#6) [1, 257, 384] 384
encoder_layer_3_Add (Add#7) [1, 257, 384] 0
(LayerNorm#7) [1, 257, 384] 770
(Linear#6) [1, 257, 1536] 591360
(GeLU#3) [1, 257, 1536] 3
(Linear#7) [1, 257, 384] 590208
encoder_layer_3_layer_scale2_Mul (Mul#7) [1, 257, 384] 384
encoder_layer_3_Add_1 (Add#8) [1, 257, 384] 0
(LayerNorm#8) [1, 257, 384] 770
(MultiHeadAttention#4) [1, 257, 384] 591370
encoder_layer_4_layer_scale1_Mul (Mul#8) [1, 257, 384] 384
encoder_layer_4_Add (Add#9) [1, 257, 384] 0
(LayerNorm#9) [1, 257, 384] 770
(Linear#8) [1, 257, 1536] 591360
(GeLU#4) [1, 257, 1536] 3
(Linear#9) [1, 257, 384] 590208
encoder_layer_4_layer_scale2_Mul (Mul#9) [1, 257, 384] 384
encoder_layer_4_Add_1 (Add#10) [1, 257, 384] 0
(LayerNorm#10) [1, 257, 384] 770
(MultiHeadAttention#5) [1, 257, 384] 591370
encoder_layer_5_layer_scale1_Mul (Mul#10) [1, 257, 384] 384
encoder_layer_5_Add (Add#11) [1, 257, 384] 0
(LayerNorm#11) [1, 257, 384] 770
(Linear#10) [1, 257, 1536] 591360
(GeLU#5) [1, 257, 1536] 3
(Linear#11) [1, 257, 384] 590208
encoder_layer_5_layer_scale2_Mul (Mul#11) [1, 257, 384] 384
encoder_layer_5_Add_1 (Add#12) [1, 257, 384] 0
(LayerNorm#12) [1, 257, 384] 770
(MultiHeadAttention#6) [1, 257, 384] 591370
encoder_layer_6_layer_scale1_Mul (Mul#12) [1, 257, 384] 384
encoder_layer_6_Add (Add#13) [1, 257, 384] 0
(LayerNorm#13) [1, 257, 384] 770
(Linear#12) [1, 257, 1536] 591360
(GeLU#6) [1, 257, 1536] 3
(Linear#13) [1, 257, 384] 590208
encoder_layer_6_layer_scale2_Mul (Mul#13) [1, 257, 384] 384
encoder_layer_6_Add_1 (Add#14) [1, 257, 384] 0
(LayerNorm#14) [1, 257, 384] 770
(MultiHeadAttention#7) [1, 257, 384] 591370
encoder_layer_7_layer_scale1_Mul (Mul#14) [1, 257, 384] 384
encoder_layer_7_Add (Add#15) [1, 257, 384] 0
(LayerNorm#15) [1, 257, 384] 770
(Linear#14) [1, 257, 1536] 591360
(GeLU#7) [1, 257, 1536] 3
(Linear#15) [1, 257, 384] 590208
encoder_layer_7_layer_scale2_Mul (Mul#15) [1, 257, 384] 384
encoder_layer_7_Add_1 (Add#16) [1, 257, 384] 0
(LayerNorm#16) [1, 257, 384] 770
(MultiHeadAttention#8) [1, 257, 384] 591370
encoder_layer_8_layer_scale1_Mul (Mul#16) [1, 257, 384] 384
encoder_layer_8_Add (Add#17) [1, 257, 384] 0
(LayerNorm#17) [1, 257, 384] 770
(Linear#16) [1, 257, 1536] 591360
(GeLU#8) [1, 257, 1536] 3
(Linear#17) [1, 257, 384] 590208
encoder_layer_8_layer_scale2_Mul (Mul#17) [1, 257, 384] 384
encoder_layer_8_Add_1 (Add#18) [1, 257, 384] 0
(LayerNorm#18) [1, 257, 384] 770
(MultiHeadAttention#9) [1, 257, 384] 591370
encoder_layer_9_layer_scale1_Mul (Mul#18) [1, 257, 384] 384
... (summary truncated)
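Individual figures from this summary can also be recomputed by hand; for instance, a rough total parameter count can be obtained by summing the element counts of all Producer outputs (a sketch, assuming the tensor API used below):

# Rough parameter count: sum the element counts of every Producer tensor
total_params = 0
for node in dinov2_model.get_nodes():
    if node.type() == "Producer":
        total_params += node.get_operator().get_output(0).size()
print(f"Total parameters: {total_params}")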
[7]:
# Display statistics about the number of operations per operator type
_ = dinov2_stats.log_nb_ops_by_type("stats_ops.png", log_scale=True)
(Figure: stats_ops.png, number of operations per operator type on a log scale)
Configure the model for inference#
Currently, the model has no implementation and exists only as a data structure. To set an implementation, we will specify a backend and a data type.
[8]:
import aidge_backend_cpu
dinov2_model.set_backend("cpu")
dinov2_model.set_datatype(aidge_core.dtype.float32)
dinov2_model.forward_dims([[1,3,224,224]], True)
[8]:
True
Finally, to run inference, we need to schedule the execution. To do so, we create a Scheduler
object, which takes the graph and generates an optimized schedule using a consumer-producer (C-P) heuristic.
[9]:
s = aidge_core.SequentialScheduler(dinov2_model)
s.generate_scheduling()
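With the scheduling generated, a forward pass can be launched by feeding an input tensor to the scheduler; a minimal sketch with a random input:

import numpy as np

# Run a single inference pass on a random 224x224 RGB input
input_tensor = aidge_core.Tensor(np.random.rand(1, 3, 224, 224).astype(np.float32))
s.forward(data=[input_tensor])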
In addition, we can inspect the memory usage of the different nodes composing the graph.
[10]:
_ = aidge_core.generate_optimized_memory_info(s, "mem_strategy_dino", wrapping=False, display_names=False)
Image(filename="./mem_strategy_dino/memory_info.png")
[10]:
(Figure: mem_strategy_dino/memory_info.png, memory usage of the fused model over the schedule)
We can then compare the modified model with the original one, whose operators have not been fused:
[11]:
# Compile the model
clone_dinov2.set_backend("cpu")
clone_dinov2.set_datatype(aidge_core.dtype.float32)
clone_dinov2.forward_dims([[1,3,224,224]], True)
# Generate scheduling
s = aidge_core.SequentialScheduler(clone_dinov2)
s.generate_scheduling()
# Visualize memory usage
_ = aidge_core.generate_optimized_memory_info(s, "mem_strategy_og_dino", wrapping=False, display_names=False)
Image(filename="./mem_strategy_og_dino/memory_info.png")
[11]:
(Figure: mem_strategy_og_dino/memory_info.png, memory usage of the original, unfused model)
In this tutorial, the following concepts were studied:

- Graph transformations, in particular the fuse_to_metaops recipe;
- Static analysis, to measure the graph's complexity in terms of number of operations;
- Memory information generation, to visualize the graph's memory usage over inference time.