TensorRT export#
In this tutorial, we’ll walk through the process of performing 8-bit quantization on a simple model using TensorRT and Aidge. The steps include:
exporting the model
modifying the test script for quantization
preparing calibration data
running the quantization and profiling the quantized model
Furthermore, although not demonstrated in this tutorial, Aidge also allows the user to:
Add custom operators via the plugin interface
Facilitate the transformation of user data into calibration data
0. Requirements for this tutorial#
To complete this tutorial, we highly recommend that you meet the following requirements:
To have completed the Aidge 101 tutorial
To have installed the aidge_export_tensorrt module
In order to compile the export on your machine, please make sure that one of these two conditions is met:
To have installed Docker (the export compilation chain can use Docker)
To have installed the correct packages to support TensorRT 8.6
1. Exporting the model#
In this tutorial, we will export MobileNetV2, a lightweight convolutional neural network.
[1]:
!wget -c https://github.com/onnx/models/raw/main/validated/vision/classification/mobilenet/model/mobilenetv2-7.onnx
For visualizing the model structure, we recommend using Netron. If you haven’t installed Netron yet, you can do so by executing the following command:
[2]:
# !pip install netron
Once installed, you can launch Netron to visualize the model:
[3]:
# import netron
# netron.start('mobilenetv2-7.onnx', 8080)
Then let’s export the model using the aidge_export_tensorrt module.
[4]:
# First, be sure that any previous exports are removed
!rm -rf export_trt
[5]:
import aidge_export_tensorrt
# Generate export for your model
# This function takes as argument the name of the export folder
# and the onnx file or the graphview of your model
aidge_export_tensorrt.export("export_trt", "mobilenetv2-7.onnx")
Generating TensorRT export in export_trt.
The export provides a Makefile with several options to use the export on your machine. You can generate a C++ export or a Python export.
You also have the possibility to compile the export and/or the Python library using Docker if your host machine doesn’t have the correct packages. In this tutorial, we generate the Python library of the export and use it in a Python script.
All of these options are summarized in the Makefile’s help (run make help in the export folder for more details).
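For instance, you can list the available targets before picking one:

!cd export_trt/ && make help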
[6]:
# Compile the export Python library by using docker
# and the Makefile provided in the export
!cd export_trt/ && make build_lib_python_docker
2. Modifying the test script for quantization#
Next, you have to modify test.py by adding nb_bits=8 in the graph constructor and calling model.calibrate().
calibrate() can accept three arguments (see the sketch after this list):
calibration_folder_path: to specify the path to your calibration folder
cache_file_path: to use your pre-built calibration cache
batch_size: to specify the batch size for calibration data
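As a sketch, a call pointing at a custom calibration folder might look like this (the argument names follow the list above; the path and batch size are illustrative):

# Illustrative: calibrate from a specific folder, one sample per batch
model.calibrate(calibration_folder_path="./calibration_folder", batch_size=1)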
[7]:
%%writefile export_trt/test.py
"""Example test file for the TensorRT Python API."""
import build.lib.aidge_trt as aidge_trt
import numpy as np
if __name__ == '__main__':
# Load the model
model = aidge_trt.Graph("model.onnx", nb_bits=8)
# Calibrate the model
model.calibrate()
# Initialize the model
model.initialize()
# Profile the model with 10 iterations
model.profile(10)
# Example of running inference
# img: numpy.array = np.load("PATH TO NPY file")
# output: numpy.array = model.run_sync([img])
Writing export_trt/test.py
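If you want the script to actually run an inference, the commented lines at the end can be fleshed out along these lines. This is a minimal sketch: the 1x3x224x224 shape matches MobileNetV2’s expected input, and the random array is only a stand-in for a real preprocessed image.

import numpy as np
# Stand-in for a real preprocessed image (float32, NCHW layout)
img = np.random.rand(1, 3, 224, 224).astype(np.float32)
output = model.run_sync([img])  # output: numpy array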
3. Preparing the calibration dataset#
To ensure accurate calibration, it’s essential to select representative samples. In this example, we will use a 224x224 RGB image from the ImageNet dataset.
However, for practical applications, TensorRT suggests that “The amount of input data required is application-dependent, but experiments indicate that approximately 500 images are adequate for calibrating ImageNet classification networks”.
[8]:
# Create calibration folder
!cd export_trt/ && mkdir calibration_folder
[9]:
%matplotlib inline
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
demo_img_path = './data/0.jpg'
img = mpimg.imread(demo_img_path)
imgplot = plt.imshow(img)
plt.show()
This image has been preprocessed and stored in data/ as the 0.batch file. Information about the image’s shape is stored in the .info file.
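If you want to build calibration files from your own images, a preprocessing sketch could look like the one below. Note that the exact binary layout expected for .batch files (and the contents of the .info file) is defined by the export’s calibrator, so the raw float32 dump here is an assumption to verify against your export; the normalization values are the standard ImageNet statistics used by MobileNetV2.

import numpy as np
from PIL import Image

# Standard ImageNet preprocessing for a 224x224 RGB input
img = Image.open("data/0.jpg").convert("RGB").resize((224, 224))
x = np.asarray(img, dtype=np.float32) / 255.0
x = (x - np.array([0.485, 0.456, 0.406], dtype=np.float32)) \
    / np.array([0.229, 0.224, 0.225], dtype=np.float32)
x = x.transpose(2, 0, 1)[None]  # NCHW with batch dimension

# Assumption: the calibrator reads raw float32 bytes from .batch files
x.astype(np.float32).tofile("export_trt/calibration_folder/0.batch")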
[10]:
import shutil
shutil.copy("data/.info", "export_trt/calibration_folder/.info")
shutil.copy("data/0.batch", "export_trt/calibration_folder/0.batch")
4. Generating the quantized model#
Finally, run the test script to quantize the model with the export’s Python library and profile it.
[11]:
!cd export_trt/ && make test_lib_python_docker
Following these steps has enabled you to perform 8-bit quantization on your model. Once calibration is complete, the calibration data can be reused whenever a calibration_cache file exists, saving computational resources.
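For instance, pointing calibrate() at an existing cache skips recomputing the calibration tables (the path is illustrative):

# Reuse a previously generated calibration cache instead of recalibrating
model.calibrate(cache_file_path="./calibration_cache")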
[12]:
!tail -n +0 export_trt/calibration_cache
After quantization, feel free to save the generated TensorRT engine using model.save("name_of_your_model"). The method will save the engine into a .trt file.
To load the engine for further applications, use model.load("name_of_your_model.trt") after instantiating a model.
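Putting these together, a save-and-reload sketch might look like this; the engine name is illustrative, and the constructor call follows the earlier cells:

# Save the calibrated engine; this writes mobilenetv2_int8.trt
model.save("mobilenetv2_int8")

# Later: instantiate a model, then load the serialized engine
model = aidge_trt.Graph("model.onnx")
model.load("mobilenetv2_int8.trt")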