Perform static analysis on a neural network model#

In this tutorial, we use Aidge’s static analysis module to evaluate the complexity and memory usage of the DINOv2 model.

Setting up the notebook#

[1]:
# Import Aidge modules
import aidge_core
import aidge_onnx

# Set a low verbosity level (print errors only)
aidge_core.Log.set_console_level(aidge_core.Level.Error)

# Import module to show images in the notebook
from IPython.display import Image

# Import some utility modules
import os
import requests

Download the model (if needed)#

[2]:
# Download the DINOv2 model ONNX file, if this has not been done yet
file_url = "https://huggingface.co/EclipseAidge/dinov2/resolve/main/dinov2.onnx?download=true"
file_path = "dinov2.onnx"
aidge_core.utils.download_file(file_path, file_url)
dinov2.onnx: 100%|██████████| 82.7M/82.7M [00:01<00:00, 53.5MB/s]
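As the comment above suggests, the aidge_core.utils.download_file helper is expected to skip the download when the file is already present. If that helper is not available in your installation, a plain fallback using the requests and os modules imported earlier could look like this (a minimal sketch, not part of the original notebook):

# Hypothetical fallback: download the ONNX file with plain requests
if not os.path.exists(file_path):
    response = requests.get(file_url, stream=True, timeout=60)
    response.raise_for_status()
    with open(file_path, "wb") as f:
        for chunk in response.iter_content(chunk_size=1 << 20):
            f.write(chunk)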

Import the ONNX model into Aidge#

[3]:
# We rely on the Aidge ONNX module, which provides an interface between ONNX operators and Aidge's internal representation
dinov2_model = aidge_onnx.load_onnx("dinov2.onnx")

# Check Aidge's native operator coverage for this model
aidge_onnx.native_coverage_report(dinov2_model)
Native operators: 824 (17 types)
- Add: 159
- Concat: 1
- Conv2D: 1
- Div: 49
- Erf: 12
- Gather: 1
- MatMul: 72
- Mul: 73
- Pow: 25
- Producer: 209
- ReduceMean: 50
- Reshape: 49
- Softmax: 12
- Split: 12
- Sqrt: 25
- Sub: 25
- Transpose: 49
Generic operators: 0 (0 types)
Native types coverage: 100.0% (17/17)
Native operators coverage: 100.0% (824/824)
[3]:
(defaultdict(int,
             {'Add': 159,
              'Reshape': 49,
              'Transpose': 49,
              'Erf': 12,
              'MatMul': 72,
              'Split': 12,
              'Mul': 73,
              'Div': 49,
              'Sqrt': 25,
              'ReduceMean': 50,
              'Pow': 25,
              'Sub': 25,
              'Softmax': 12,
              'Gather': 1,
              'Producer': 209,
              'Concat': 1,
              'Conv2D': 1}),
 defaultdict(int, {}))

Explore a handful of graph transformations#

In this section we use some of Aidge’s recipes to simplify the model graph.

[4]:
# Create a clone of the original model to be used for comparison later
clone_dinov2 = dinov2_model.clone()

# Simplify the model by fusing sequences of operators into meta-operators, using the ``fuse_to_metaops`` recipe
# Each call uses a Graph Regex query to specify which pattern of operators should be replaced by a given meta-operator
aidge_core.fuse_to_metaops(dinov2_model, "MatMul-*>Add", "Linear")

aidge_core.fuse_to_metaops(dinov2_model, "ReduceMean-*>Sub#1~>(Pow#1->ReduceMean-*>Add#1->Sqrt)-*>Div#1-*>Mul#1-*>Add#2;"
                                        "Sub#1~*>Div#1;"
                                        "Pow#1<1~Producer;"
                                        "Add#1<*~Producer;"
                                        "Mul#1<*~Producer;"
                                        "Add#2<*~Producer;"
                                        "Sub#1~>$", "LayerNorm")

aidge_core.fuse_to_metaops(dinov2_model, "MatMul->Div#1->Softmax-*>MatMul;"
                                        "Div#1<1~Producer", "ScaledDotProductAttention")

aidge_core.fuse_to_metaops(dinov2_model, "ScaledDotProductAttention#1->Transpose->Reshape#1->Linear;"
                                        "Reshape#1<1~Producer;"
                                        "ScaledDotProductAttention#1<0-(Transpose<-Reshape#2<-Add#1);"
                                        "ScaledDotProductAttention#1<1-(Transpose<-Reshape#3<-Add#2);"
                                        "ScaledDotProductAttention#1<2-(Transpose<-Reshape#4<-Add#3);"
                                        "Reshape#2<1~Producer;"
                                        "Add#1<*-0-Split#1;"
                                        "Add#2<*-1-Split#1;"
                                        "Add#3<*-2-Split#1;"
                                        "Split#1<-MatMul;"
                                        "Split#1<1~Producer", "MultiHeadAttention")

aidge_core.fuse_to_metaops(dinov2_model, "Div#1->Erf->Add#1-*>Mul->Mul#2;"
                                        "Div#1<1~Producer;"
                                        "Add#1<*~Producer;"
                                        "Mul#2<*~Producer", "GeLU")

dinov2_model.set_ordered_outputs([dinov2_model.get_ordered_outputs()[0][0].inputs()[0], dinov2_model.get_ordered_outputs()[0]])
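Before re-running the coverage report, a quick way to check that the meta-operators were actually created is to count node types directly on the graph. The sketch below assumes the get_nodes() and type() methods of Aidge’s GraphView and Node Python bindings:

from collections import Counter

# Count node types in the transformed graph (sketch; assumes GraphView.get_nodes() and Node.type())
type_counts = Counter(node.type() for node in dinov2_model.get_nodes())
print(type_counts.most_common())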
[5]:
# After creating meta-operators, we can verify the updated total number of operators in the graph
_ = aidge_onnx.native_coverage_report(dinov2_model)
Native operators: 277 (12 types)
- Add: 25
- Concat: 1
- Conv2D: 1
- Gather: 1
- GeLU: 12
- LayerNorm: 25
- Linear: 24
- Mul: 24
- MultiHeadAttention: 12
- Producer: 150
- Reshape: 1
- Transpose: 1
Generic operators: 0 (0 types)
Native types coverage: 100.0% (12/12)
Native operators coverage: 100.0% (277/277)

The number of operators is reduced from 824 (17 types) to 277 (12 types), as depicted in the following image.

./static/dino_sim.png

Run static analysis#

[6]:
import aidge_core.static_analysis

# Input dimensions must be propagated (forwarded) through the graph before running static analysis
dinov2_model.forward_dims(dims=[[1,3,224,224]], allow_data_dependency=True)

dinov2_stats = aidge_core.static_analysis.StaticAnalysis(dinov2_model)
dinov2_stats.summary()
--------------------------------------------------------------------------------
                        Layer (type)               Output Shape         Param #
================================================================================
embeddings_patch_embeddings_projection_Conv (Conv2D#0)           [1, 384, 16, 16]          226176
embeddings_patch_embeddings_Reshape (Reshape#0)              [1, 384, 256]               3
embeddings_patch_embeddings_Transpose (Transpose#0)              [1, 256, 384]               0
        embeddings_Concat (Concat#0)              [1, 257, 384]             384
              embeddings_Add (Add#0)              [1, 257, 384]           98688
                       (LayerNorm#0)              [1, 257, 384]             770
              (MultiHeadAttention#0)              [1, 257, 384]          591370
encoder_layer_0_layer_scale1_Mul (Mul#0)              [1, 257, 384]             384
         encoder_layer_0_Add (Add#1)              [1, 257, 384]               0
                       (LayerNorm#1)              [1, 257, 384]             770
                          (Linear#0)             [1, 257, 1536]          591360
                            (GeLU#0)             [1, 257, 1536]               3
                          (Linear#1)              [1, 257, 384]          590208
encoder_layer_0_layer_scale2_Mul (Mul#1)              [1, 257, 384]             384
       encoder_layer_0_Add_1 (Add#2)              [1, 257, 384]               0
                       (LayerNorm#2)              [1, 257, 384]             770
              (MultiHeadAttention#1)              [1, 257, 384]          591370
encoder_layer_1_layer_scale1_Mul (Mul#2)              [1, 257, 384]             384
         encoder_layer_1_Add (Add#3)              [1, 257, 384]               0
                       (LayerNorm#3)              [1, 257, 384]             770
                          (Linear#2)             [1, 257, 1536]          591360
                            (GeLU#1)             [1, 257, 1536]               3
                          (Linear#3)              [1, 257, 384]          590208
encoder_layer_1_layer_scale2_Mul (Mul#3)              [1, 257, 384]             384
       encoder_layer_1_Add_1 (Add#4)              [1, 257, 384]               0
                       (LayerNorm#4)              [1, 257, 384]             770
              (MultiHeadAttention#2)              [1, 257, 384]          591370
encoder_layer_2_layer_scale1_Mul (Mul#4)              [1, 257, 384]             384
         encoder_layer_2_Add (Add#5)              [1, 257, 384]               0
                       (LayerNorm#5)              [1, 257, 384]             770
                          (Linear#4)             [1, 257, 1536]          591360
                            (GeLU#2)             [1, 257, 1536]               3
                          (Linear#5)              [1, 257, 384]          590208
encoder_layer_2_layer_scale2_Mul (Mul#5)              [1, 257, 384]             384
       encoder_layer_2_Add_1 (Add#6)              [1, 257, 384]               0
                       (LayerNorm#6)              [1, 257, 384]             770
              (MultiHeadAttention#3)              [1, 257, 384]          591370
encoder_layer_3_layer_scale1_Mul (Mul#6)              [1, 257, 384]             384
         encoder_layer_3_Add (Add#7)              [1, 257, 384]               0
                       (LayerNorm#7)              [1, 257, 384]             770
                          (Linear#6)             [1, 257, 1536]          591360
                            (GeLU#3)             [1, 257, 1536]               3
                          (Linear#7)              [1, 257, 384]          590208
encoder_layer_3_layer_scale2_Mul (Mul#7)              [1, 257, 384]             384
       encoder_layer_3_Add_1 (Add#8)              [1, 257, 384]               0
                       (LayerNorm#8)              [1, 257, 384]             770
              (MultiHeadAttention#4)              [1, 257, 384]          591370
encoder_layer_4_layer_scale1_Mul (Mul#8)              [1, 257, 384]             384
         encoder_layer_4_Add (Add#9)              [1, 257, 384]               0
                       (LayerNorm#9)              [1, 257, 384]             770
                          (Linear#8)             [1, 257, 1536]          591360
                            (GeLU#4)             [1, 257, 1536]               3
                          (Linear#9)              [1, 257, 384]          590208
encoder_layer_4_layer_scale2_Mul (Mul#9)              [1, 257, 384]             384
      encoder_layer_4_Add_1 (Add#10)              [1, 257, 384]               0
                      (LayerNorm#10)              [1, 257, 384]             770
              (MultiHeadAttention#5)              [1, 257, 384]          591370
encoder_layer_5_layer_scale1_Mul (Mul#10)              [1, 257, 384]             384
        encoder_layer_5_Add (Add#11)              [1, 257, 384]               0
                      (LayerNorm#11)              [1, 257, 384]             770
                         (Linear#10)             [1, 257, 1536]          591360
                            (GeLU#5)             [1, 257, 1536]               3
                         (Linear#11)              [1, 257, 384]          590208
encoder_layer_5_layer_scale2_Mul (Mul#11)              [1, 257, 384]             384
      encoder_layer_5_Add_1 (Add#12)              [1, 257, 384]               0
                      (LayerNorm#12)              [1, 257, 384]             770
              (MultiHeadAttention#6)              [1, 257, 384]          591370
encoder_layer_6_layer_scale1_Mul (Mul#12)              [1, 257, 384]             384
        encoder_layer_6_Add (Add#13)              [1, 257, 384]               0
                      (LayerNorm#13)              [1, 257, 384]             770
                         (Linear#12)             [1, 257, 1536]          591360
                            (GeLU#6)             [1, 257, 1536]               3
                         (Linear#13)              [1, 257, 384]          590208
encoder_layer_6_layer_scale2_Mul (Mul#13)              [1, 257, 384]             384
      encoder_layer_6_Add_1 (Add#14)              [1, 257, 384]               0
                      (LayerNorm#14)              [1, 257, 384]             770
              (MultiHeadAttention#7)              [1, 257, 384]          591370
encoder_layer_7_layer_scale1_Mul (Mul#14)              [1, 257, 384]             384
        encoder_layer_7_Add (Add#15)              [1, 257, 384]               0
                      (LayerNorm#15)              [1, 257, 384]             770
                         (Linear#14)             [1, 257, 1536]          591360
                            (GeLU#7)             [1, 257, 1536]               3
                         (Linear#15)              [1, 257, 384]          590208
encoder_layer_7_layer_scale2_Mul (Mul#15)              [1, 257, 384]             384
      encoder_layer_7_Add_1 (Add#16)              [1, 257, 384]               0
                      (LayerNorm#16)              [1, 257, 384]             770
              (MultiHeadAttention#8)              [1, 257, 384]          591370
encoder_layer_8_layer_scale1_Mul (Mul#16)              [1, 257, 384]             384
        encoder_layer_8_Add (Add#17)              [1, 257, 384]               0
                      (LayerNorm#17)              [1, 257, 384]             770
                         (Linear#16)             [1, 257, 1536]          591360
                            (GeLU#8)             [1, 257, 1536]               3
                         (Linear#17)              [1, 257, 384]          590208
encoder_layer_8_layer_scale2_Mul (Mul#17)              [1, 257, 384]             384
      encoder_layer_8_Add_1 (Add#18)              [1, 257, 384]               0
                      (LayerNorm#18)              [1, 257, 384]             770
              (MultiHeadAttention#9)              [1, 257, 384]          591370
encoder_layer_9_layer_scale1_Mul (Mul#18)              [1, 257, 384]             384
...
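A few of the reported parameter counts can be checked by hand from the dimensions visible in the summary (a 16x16 patch grid over a 224x224 input, hence 14x14 patches; 384-dimensional embeddings; a 1536-dimensional MLP hidden layer). A minimal sanity check, derived from the table above rather than from the Aidge API:

# Recompute a few parameter counts reported by summary()
patch_conv = 384 * 3 * 14 * 14 + 384   # Conv2D#0: 14x14 patch-embedding kernel, weights + biases
pos_embed  = 257 * 384                 # embeddings_Add#0: 256 patch tokens + 1 class token
mlp_in     = 384 * 1536 + 1536         # Linear#0: first MLP projection, weights + biases
print(patch_conv, pos_embed, mlp_in)   # 226176 98688 591360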
[7]:
# Plot statistics on the number of operations per operator type (log scale)
_ = dinov2_stats.log_nb_ops_by_type("stats_ops.png", log_scale=True)

../../_images/source_Tutorial_static_analysis_14_0.png

Configure the model for inference#

At this point the model has no implementation: it exists only as a data structure. To attach an implementation, we specify a backend and a data type.

[8]:
import aidge_backend_cpu

dinov2_model.set_backend("cpu")
dinov2_model.set_datatype(aidge_core.dtype.float32)
dinov2_model.forward_dims([[1,3,224,224]], True)


[8]:
True

Finally, to run inference, we need to schedule the execution. To do so, we create a Scheduler object, which takes the graph and generates an optimized schedule using a consumer-producer (C-P) heuristic.

[9]:
s = aidge_core.SequentialScheduler(dinov2_model)
s.generate_scheduling()
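The resulting execution order can then be inspected from the scheduler. The sketch below assumes the get_static_scheduling() method of the Scheduler Python API, which returns the nodes in scheduled order:

# Sketch: print the first few nodes of the generated static schedule
for node in s.get_static_scheduling()[:5]:
    print(node.type(), node.name())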

In addition, it is possible to inspect the memory usage of the different nodes composing the graph.

[10]:
_ = aidge_core.generate_optimized_memory_info(s, "mem_strategy_dino", wrapping=False, display_names=False)
Image(filename="./mem_strategy_dino/memory_info.png")
[10]:
../../_images/source_Tutorial_static_analysis_20_0.png

We can then compare the modified model with the original one, whose operators have not been fused:

[11]:
# Configure the original (unfused) model for inference
clone_dinov2.set_backend("cpu")
clone_dinov2.set_datatype(aidge_core.dtype.float32)
clone_dinov2.forward_dims([[1,3,224,224]], True)

# Generate scheduling
s = aidge_core.SequentialScheduler(clone_dinov2)
s.generate_scheduling()

# Visualize memory usage
_ = aidge_core.generate_optimized_memory_info(s, "mem_strategy_og_dino", wrapping=False, display_names=False)
Image(filename="./mem_strategy_og_dino/memory_info.png")
[11]:
../../_images/source_Tutorial_static_analysis_22_0.png

In this tutorial, the following concepts were covered:

  • Graph transformations, in particular the fuse_to_metaops recipe;

  • Static analysis, to measure the graph’s complexity in terms of number of operations;

  • Memory information generation, to visualize the graph’s memory usage over inference time.