CPS-IoT 2026 - Aidge Tutorial#

YOLOv8n for Runway Detection with Aidge#

In aeronautics, VBRD (Visual-Based Runway Detection) is the task of detecting a runway in the image recorded by a camera deployed in the airplane’s nose. The expected result is a bounding box around the runway.

In this tutorial, we use a pre-trained YOLOv8 model, available from the LARD project, to run inference in several ways. We also show how to export the model as an ONNX file, load it in Aidge, and finally generate standalone C code with ACETONE.

1) Preliminaries#

[ ]:

import urllib.request

def wget(url, filepath):
    # Download the file from `url` and save it locally under `filepath`:
    with urllib.request.urlopen(url) as response, open(filepath, 'wb') as out_file:
        data = response.read() # a `bytes` object
        out_file.write(data)

1.1) Download runway image and trained YOLOv8 VBRD model#

[ ]:

# runway image
IMG_URL = "https://raw.githubusercontent.com/deel-ai/LARD/refs/heads/LARD_V2/docs/assets/Mosaic_lard_V1.png"
IMG_FILEPATH = "runway.png"

#!wget {IMG_URL} -O {IMG_FILEPATH}
wget(IMG_URL, IMG_FILEPATH)

# YOLOv8 model for VBRD available from LARD2
MODEL_URL = "https://github.com/deel-ai-papers/Yolo_models_LARD_V2/raw/refs/heads/main/yolo_v8_models/yolov8detect_IN_ODD_best.pt"
MODEL_FILEPATH_TORCH = "model.pt"

#!wget {MODEL_URL} -O {MODEL_FILEPATH_TORCH}
wget(MODEL_URL, MODEL_FILEPATH_TORCH)

1.2) Show the runway image using NumPy, cv2, and Matplotlib#

[ ]:

import numpy as np

import cv2

from matplotlib import pyplot as plt

%matplotlib inline

[ ]:

# YOLOv8n accepts 640x640 images
IMG_SIZE = 640

# preprocessing - load image as numpy tensor
img = cv2.imread(IMG_FILEPATH)
img = cv2.resize(img, (IMG_SIZE, IMG_SIZE))  # resize to 640x640 (compatible with YOLOv8n)
img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # BGR to RGB

# show image
plt.imshow(img_rgb)
plt.axis("off")
plt.show()

# prepare image as a data tensor
img_input = img_rgb.astype(np.float32) / 255.0  # [0, 255] to [0, 1]
img_input = np.transpose(img_input, (2, 0, 1))[None]  # HWC to CHW, then add batch axis: NCHW
img_input = np.ascontiguousarray(img_input)  # ensure contiguity

print("shape", img_input.shape)
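The NCHW layout and the contiguity of the buffer matter later in the tutorial, when the same data is handed to C code as a flat float array. As a minimal sketch (the helper name `nchw_offset` is hypothetical and not part of any library), the position of one value inside a contiguous, row-major NCHW buffer can be computed like this:

```python
def nchw_offset(n, c, h, w, channels, height, width):
    """Flat index of element (n, c, h, w) in a contiguous, row-major NCHW buffer."""
    return ((n * channels + c) * height + h) * width + w


# For a 1x3x640x640 input, each channel plane holds 640 * 640 = 409600 values.
C, H, W = 3, 640, 640
first_red = nchw_offset(0, 0, 0, 0, C, H, W)
first_green = nchw_offset(0, 1, 0, 0, C, H, W)
last_blue = nchw_offset(0, 2, H - 1, W - 1, C, H, W)

print(first_red, first_green, last_blue)  # 0 409600 1228799
```

All values of one color channel are therefore stored next to each other, which is the ordering a C function reading the raw pointer would see.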

2) Predict using the Ultralytics YOLO wrapper (PyTorch backend)#

YOLOv8n is the smallest ("nano") variant of YOLOv8, a lightweight and efficient object detection model. It is part of the YOLO family of models maintained by Ultralytics, and it is known for its real-time detection capabilities. The ultralytics package provides a wrapper class, YOLO, built on top of PyTorch.

[ ]:

# install ultralytics package to get YOLOv8 class
%pip install ultralytics torch

[ ]:

from ultralytics import YOLO

import time  # for evaluating inference time

[ ]:

# minimum confidence for detections
CONF_THRESHOLD = 0.2

# load the model
yolo_model = YOLO(MODEL_FILEPATH_TORCH)

# make an inference
start_time = time.perf_counter()
results = yolo_model.predict(source=IMG_FILEPATH, conf=CONF_THRESHOLD, save=False)
print("Inference time (in s):", time.perf_counter() - start_time)

result = results[0]  # get the first and only result (yolo accepts a batch of images)

# get annotated image
annotated_img = result.plot()
# convert to RGB
annotated_img_rgb = cv2.cvtColor(annotated_img, cv2.COLOR_BGR2RGB)
# plot
plt.imshow(annotated_img_rgb)
plt.axis("off")
plt.show()

# get all the returned detection boxes
for box in result.boxes:
    xyxy = box.xyxy[0].int().tolist()  # pixel coords, matrix tensor to list
    xyxyn = box.xyxyn[0].tolist()  # ratio coords, matrix tensor to list
    conf = float(box.conf[0])  # detection confidence
    cls = int(box.cls[0])  # detected object category (there is only one, "runway")

    print("Box corners:", xyxy, "pixels")
    print("Confidence:", conf)
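The `xyxy` and `xyxyn` attributes describe the same box in pixel and in ratio coordinates. The sketch below shows the relation between the two, assuming both are expressed in the frame of the image passed to `predict`. The helper name `xyxyn_to_xyxy` and the 1280x720 image size are illustrative assumptions, not part of the Ultralytics API:

```python
def xyxyn_to_xyxy(box_n, img_width, img_height):
    """Convert ratio corners [x1, y1, x2, y2] in [0, 1] to integer pixel corners."""
    x1, y1, x2, y2 = box_n
    # int() truncates, mirroring the `.int()` call used on the pixel tensor
    return [
        int(x1 * img_width),
        int(y1 * img_height),
        int(x2 * img_width),
        int(y2 * img_height),
    ]


# hypothetical detection covering the central part of a 1280x720 image
pixel_box = xyxyn_to_xyxy([0.25, 0.5, 0.75, 1.0], img_width=1280, img_height=720)
print(pixel_box)  # [320, 360, 960, 720]
```

Ratio coordinates are convenient when the detection has to be drawn on a resized copy of the image, since they do not depend on the resolution.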

4) Predict directly using torch (without ultralytics wrapper)#

[ ]:

%pip install torchvision

import torch
from torchvision.ops import nms  # this is the method for filtering overlapping boxes

4.1) Define a function to interpret and plot the results#

[ ]:

# define a function to plot detections
def plot_inference_resut(output, img_rgb, conf_threshold=0.2, iou_threshold=0.5):

    # create a copy of original image
    img_rgb_plot = img_rgb.copy()

    # output is the tensor resulting from inference
    if output is not None:

        boxes = output[:, :4]  # first 4 values of a row: x_center, y_center, width, height
        scores = output[:, 4]  # remaining values are confidence per class (here, only one)

        # filter by confidence threshold
        mask = scores > conf_threshold

        boxes = boxes[mask]
        scores = scores[mask]

        # convert [x_center, y_center, width, height] to corner coordinates [x1, y1, x2, y2]
        xyxy = np.zeros_like(boxes)

        xyxy[:, 0] = boxes[:, 0] - boxes[:, 2] / 2  # x1
        xyxy[:, 1] = boxes[:, 1] - boxes[:, 3] / 2  # y1
        xyxy[:, 2] = boxes[:, 0] + boxes[:, 2] / 2  # x2
        xyxy[:, 3] = boxes[:, 1] + boxes[:, 3] / 2  # y2

        # to use torch nms filter, convert numpy tensors to torch
        boxes_tensor = torch.tensor(xyxy)
        scores_tensor = torch.tensor(scores)

        # non-maximum suppression: drop boxes that overlap a higher-scoring box (IoU above threshold)
        keep = nms(boxes_tensor, scores_tensor, iou_threshold=iou_threshold)

        # get filtered results back to numpy
        final_boxes = boxes_tensor[keep].numpy()
        final_scores = scores_tensor[keep].numpy()

        # plot each box
        for box, score in zip(final_boxes, final_scores):

            # corners
            x1, y1, x2, y2 = box.astype(int)
            print(
                x1,
                y1,
                x2,
                y2,
                "xyxy pixel corners,",
                "Confidence=",
                str(round(score, 2)),
            )

            # plot detection rectangle on image
            cv2.rectangle(img_rgb_plot, (x1, y1), (x2, y2), (0, 255, 0), 2)
            cv2.putText(
                img_rgb_plot,
                f"runway {score:.2f}",
                (x1, y1 - 5),
                cv2.FONT_HERSHEY_SIMPLEX,
                0.5,
                (0, 255, 0),
                2,
            )

    # plot the resulting image
    plt.imshow(img_rgb_plot)
    plt.axis("off")
    plt.show()
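To make the post-processing above less of a black box, here is a pure-Python sketch of its two key steps: the conversion from center/size boxes to corner boxes, and a greedy non-maximum suppression based on intersection over union (IoU). It is a simplified illustration written for readability, not the torchvision implementation, and the helper names and sample boxes are made up for the example:

```python
def xywh_to_xyxy(box):
    """[x_center, y_center, width, height] -> (x1, y1, x2, y2)."""
    xc, yc, w, h = box
    return (xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2)


def iou(a, b):
    """Intersection over union of two corner boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Keep the best-scoring boxes, dropping any box whose IoU with a kept box exceeds the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# two nearly identical candidates and one distinct candidate
candidates = [(100, 100, 50, 50), (102, 100, 50, 50), (300, 300, 40, 40)]
confidences = [0.9, 0.8, 0.7]
corners = [xywh_to_xyxy(b) for b in candidates]
kept = greedy_nms(corners, confidences)
print(kept)  # [0, 2]
```

The second candidate is discarded because it overlaps the first one almost entirely (IoU of about 0.92), while the third box does not overlap anything and survives.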

4.2) Get the torch model from the yolo ultralytics wrapper and run an inference#

[ ]:

# get torch model inside yolo wrapper
torch_model = yolo_model.model  # yolo_model.ckpt["model"]

# image as torch tensor
img_input_torch = torch.from_numpy(img_input)

# inference mode
torch_model.eval()
with torch.inference_mode():  # like torch.no_grad():

    # do the inference
    start_time = time.perf_counter()
    preds = torch_model(img_input_torch)
    print("Inference time (in s):", time.perf_counter() - start_time)

    # yolov8 has two outputs: preds[0] are the predictions, preds[1] are feature maps (grids)
    # get detections, remove batch, and transpose: (1, 5, 8400) --> (8400, 5)
    pred = preds[0][0].T

# back to numpy
output_torch_np = pred.numpy()
print(output_torch_np.shape)

# call our method to plot the detection boxes over image
plot_inference_resut(output_torch_np, img_rgb)
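The inference times printed in this tutorial come from a single call, which often includes one-off costs such as lazy initialization or memory allocation. When comparing runtimes, a slightly more robust measurement can help. The helper below is a hypothetical sketch (`time_call` is not part of any library used here) that runs a warm-up call and then reports the median and best time over several repetitions:

```python
import statistics
import time


def time_call(fn, repeats=5, warmup=1):
    """Return (median, best) wall-clock duration in seconds of calling fn()."""
    for _ in range(warmup):
        fn()  # warm-up calls are executed but not measured
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return statistics.median(durations), min(durations)


# stand-in workload; in the notebook this could be a lambda wrapping an inference call
median_s, best_s = time_call(lambda: sum(range(10_000)))
print(f"median: {median_s:.6f} s, best: {best_s:.6f} s")
```

In the notebook, the workload could for instance be `lambda: torch_model(img_input_torch)`, wrapped in the same inference-mode context as above.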

5) Export the model as ONNX using the yolo wrapper#

[ ]:

# install onnx packages that yolo requires to export an onnx file
%pip install onnx onnxruntime onnxslim onnxscript

[ ]:

# onnx export parameters
EXPORT_ONNX_OPSET = 20  # today, onnxruntime is limited to opset 20
EXPORT_ONNX_DYNAMIC = False
EXPORT_ONNX_SIMPLIFY = True
EXPORT_ONNX_DEVICE = "cpu"

MODEL_FILEPATH_ONNX = yolo_model.export(
    format="onnx",
    opset=EXPORT_ONNX_OPSET,
    imgsz=IMG_SIZE,
    device=EXPORT_ONNX_DEVICE,  # cuda or cpu
    dynamic=EXPORT_ONNX_DYNAMIC,  # dynamic dimensions
    simplify=EXPORT_ONNX_SIMPLIFY,  # simplify the graph using onnxslim
)

[ ]:

import onnx

[ ]:

# VERIFY ONNX INTEGRITY

print("\nLoading the onnx file:", MODEL_FILEPATH_ONNX)
model = onnx.load(MODEL_FILEPATH_ONNX)
opset_version = model.opset_import[0].version if len(model.opset_import) > 0 else None
onnx.checker.check_model(MODEL_FILEPATH_ONNX, full_check=True)
print("opset", opset_version)

6) Predict using ONNX Runtime#

[ ]:

import onnxruntime as ort

[ ]:

# load model in onnxrt
session_ort = ort.InferenceSession(MODEL_FILEPATH_ONNX)

# inference
start_time = time.perf_counter()
outputs_ort = session_ort.run(None, {"images": img_input})
print("Inference time (in s):", time.perf_counter() - start_time)

# get YOLOv8n results:
#   1 image
#   5 values per box  [x_center, y_center, width, height, confidence]
# since image size 640×640 and 3 feature maps:
#   80×80 = 6400
#   40×40 = 1600
#   20×20 = 400
# then:
#   6400+1600+400 = 8400 candidates
# single output with shape (batch, channels, anchors) == (1, 5, 8400)
output_ort_np = outputs_ort[0]
output_ort_np = output_ort_np.squeeze(0)  # (5, 8400)
output_ort_np = output_ort_np.transpose(1, 0)  # (8400, 5)
print(output_ort_np.shape)

# print the largest absolute difference compared to the torch inference
print("Max Error:", np.abs(output_ort_np - output_torch_np).max())

# plot the detections on image
plot_inference_resut(output_ort_np, img_rgb)
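The count of 8400 candidates given in the comments above can be reproduced with a few lines of arithmetic. The sketch assumes the three detection heads work at strides 8, 16, and 32, which matches the 80x80, 40x40, and 20x20 grids listed above. The function name `count_candidates` is only illustrative:

```python
def count_candidates(img_size, strides=(8, 16, 32)):
    """Number of candidate boxes: one per cell of each square feature map."""
    grid_sizes = [img_size // s for s in strides]  # cells per side of each grid
    return grid_sizes, sum(g * g for g in grid_sizes)


grids, total = count_candidates(640)
print(grids, total)  # [80, 40, 20] 8400
```

The same computation shows how the output shape would change with the input resolution: a 320x320 input would give 40x40 + 20x20 + 10x10 = 2100 candidates.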

7) Load the ONNX model in Aidge#

[ ]:

%pip install aidge-core aidge_backend_cpu aidge-onnx

[ ]:

import aidge_core as aic
import aidge_onnx as aix

[ ]:

# Check onnx file with AIDGE
aix.check_onnx_validity(MODEL_FILEPATH_ONNX)

# Load onnx file with AIDGE
aidge_model = aix.load_onnx(MODEL_FILEPATH_ONNX)

# Verify operators coverage
if aix.has_native_coverage(aidge_model):
    print("[OK] The graph is fully supported by aidge!")
else:
    print("[WARNING] The graph is not fully supported by aidge!")

# print a report about operators
aix.native_coverage_report(aidge_model)

8) Predict using Aidge (backend CPU)#

[ ]:

import aidge_backend_cpu as aib_cpu

[ ]:

# Image to Aidge Tensor
aidge_input_tensor = aic.Tensor(img_input)

# Set Backend, DataType, and Dimensions
aidge_model.set_backend("cpu")
aidge_model.set_datatype(aic.dtype.float32)
aidge_model.forward_dims(dims=[img_input.shape], allow_data_dependency=True)

# aic.constant_shape_folding(model)

# Create scheduler
scheduler = aic.SequentialScheduler(aidge_model)
scheduler.generate_scheduling()

# Inference
start_time = time.perf_counter()
scheduler.forward(forward_dims=True, data=[aidge_input_tensor])
print("Inference time (in s):", time.perf_counter() - start_time)

# Get outputs
outs = list(aidge_model.get_output_nodes())

# Get single output
output_node = outs[0]
output_aidge = output_node.get_operator().get_output(0)

# back to numpy
output_aidge_np = np.array(output_aidge)

# squeeze and transpose
output_aidge_np = output_aidge_np.squeeze(0)  # (1, 5, 8400) --> (5, 8400)
output_aidge_np = output_aidge_np.transpose(1, 0)  # (5, 8400) --> (8400, 5)
print(output_aidge_np.shape)

# print the largest absolute difference against torch and ort
print("Max Error (torch):", np.abs(output_aidge_np - output_torch_np).max())
print("Max Error (onnx):", np.abs(output_aidge_np - output_ort_np).max())

# plot detections
plot_inference_resut(output_aidge_np, img_rgb)
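When two runtimes are compared numerically, the usual metric is the largest absolute difference between their outputs. A signed maximum can hide deviations in the negative direction, as this small pure-Python sketch with made-up values shows (`max_abs_error` is a hypothetical helper):

```python
def max_abs_error(candidate, reference):
    """Largest absolute element-wise difference between two equally long sequences."""
    return max(abs(c - r) for c, r in zip(candidate, reference))


reference = [0.5, 0.25, 0.75]
candidate = [0.5, 0.25, 0.25]  # last value is 0.5 too low

signed_max = max(c - r for c, r in zip(candidate, reference))
abs_max = max_abs_error(candidate, reference)

print(signed_max)  # 0.0, the deviation goes unnoticed
print(abs_max)     # 0.5
```

With NumPy arrays, the same metric is `np.abs(a - b).max()`.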

9) Export ACETONE#

ACETONE is an Aidge module that generates C source code from a DNN model, with certifiability of the generated code as a design goal.

[ ]:

%pip install git+https://gitlab.eclipse.org/eclipse/aidge/aidge_export_acetone.git

[ ]:

import aidge_export_acetone as aib_ace
from aidge_export_acetone import ExportLibAcetone

import shutil  # management of files and folders

9.1) Generate the code from the model#

[ ]:

EXPORT_FOLDER = "my_export_acetone"

# Load onnx file with AIDGE
aidge_model_acetone = aix.load_onnx(MODEL_FILEPATH_ONNX)

# Prepare the model for the backend
aidge_model_acetone.set_backend(ExportLibAcetone._name)
# Check operators and implementation coverage
ExportLibAcetone.check_coverage(aidge_model_acetone)
ExportLibAcetone.check_implementation_coverage(aidge_model_acetone)

# adapt and check graph dimensions based on input dimensions
aidge_model_acetone.forward_dims(allow_data_dependency=True, dims=[img_input.shape])

# create a scheduling
scheduler_acetone = aic.SequentialScheduler(aidge_model_acetone)
scheduler_acetone.generate_scheduling()

# clean folder if it already exists
shutil.rmtree(EXPORT_FOLDER, ignore_errors=True)

# generate code
aib_ace.export.scheduler_export(
    scheduler_acetone,
    EXPORT_FOLDER,
    ExportLibAcetone,
    # memory_manager=aic.mem_info.compute_default_mem_info
    memory_manager=aic.mem_info.generate_optimized_memory_info,
    memory_manager_args={"stats_folder": f"{EXPORT_FOLDER}/stats", "wrapping": False},
)

9.2) Generate files for DLL/SO wrapper#

[ ]:

# generate a wrapper dll/so
aib_ace.export.generate_libdnn(
    export_folder=EXPORT_FOLDER,
    outputs_name=["result"],
    outputs_dtype=["float"],
    inputs_name=["input"],
    inputs_dtype=["float"],
)

9.3) Compile#

[ ]:

# compilation parameters
FLAGS = ""  # "-Wall -Wextra "   #-MMD  (for make)  #-fopenmp (openmp directives for parallel optim)    -v (verbose)
OPT_FLAG = "-O0"  # "-O2"

# .c and .h files
INCLUDE_DIRS = "-I./dnn -I./dnn/include"
SOURCE_FILES = "dnn/src/*.c"

# dll/so filename
DLL_FILENAME = "mylib.so.dll"

# compiler
COMPILER = "gcc"  # "g++"

# complete command
command = f"{COMPILER} {FLAGS} {OPT_FLAG} {INCLUDE_DIRS} -fPIC -shared {SOURCE_FILES} -o {DLL_FILENAME}"
print(command)

# call (run in a subshell, so the notebook's working directory is unchanged)
!cd {EXPORT_FOLDER} && {command}

9.4) Call the inference implementation generated by ACETONE through the DLL/SO#

[ ]:

from ctypes import cdll, CDLL, POINTER, pointer, c_float

[ ]:

# define c type for python wrapper
C_DTYPE = c_float

# load dll/so with the model inference function
print("Loading DLL")
SO_FILEPATH = f"{EXPORT_FOLDER}/{DLL_FILENAME}"
aidge_so_model_lib = cdll.LoadLibrary(SO_FILEPATH)

# reference to the inference function
model_forward = aidge_so_model_lib.forward
model_forward.argtypes = [POINTER(C_DTYPE), POINTER(POINTER(C_DTYPE))]
model_forward.restype = None

# inputs and outputs
result_c_ptr = POINTER(C_DTYPE)()  # float* result
result_c_ptr_ptr = pointer(result_c_ptr)  # float**

# C pointer to the first element of the contiguous numpy input
input_c_ptr = img_input.ctypes.data_as(POINTER(C_DTYPE))

print("Calling DLL function")

start_time = time.perf_counter()
model_forward(input_c_ptr, result_c_ptr_ptr)
print("Inference time (in s):", time.perf_counter() - start_time)

print("Getting result")
output_acetone_dll_np = np.ctypeslib.as_array(result_c_ptr, shape=(5, 8400))  # .copy()
output_acetone_dll_np = output_acetone_dll_np.transpose(1, 0)  # (8400, 5)
print(output_acetone_dll_np.shape)

print("Max Error (torch):", np.abs(output_acetone_dll_np - output_torch_np).max())
print("Max Error (onnx):", np.abs(output_acetone_dll_np - output_ort_np).max())

plot_inference_resut(output_acetone_dll_np, img_rgb, conf_threshold=0.2)
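The call above passes a `float**` so that the C function can hand back a pointer to its result buffer. The sketch below imitates that pattern with ctypes alone, without any compiled library. It assumes the C side simply executes `*out = buffer;` on a buffer it owns, which is an assumption about the generated code rather than something verified here. The tiny 2x3 buffer stands in for the (5, 8400) output:

```python
from ctypes import POINTER, c_float, cast, pointer

ROWS, COLS = 2, 3  # tiny stand-in for the (5, 8400) output

# stand-in for a result buffer owned by the C library
static_buffer = (c_float * (ROWS * COLS))(0.0, 1.0, 2.0, 3.0, 4.0, 5.0)

result_ptr = POINTER(c_float)()       # float* result = NULL
result_ptr_ptr = pointer(result_ptr)  # float** out = &result

# what the C function would do with `*out = buffer;`
result_ptr_ptr[0] = cast(static_buffer, POINTER(c_float))

# read the flat buffer back as ROWS x COLS, then transpose to COLS x ROWS
rows = [[result_ptr[r * COLS + c] for c in range(COLS)] for r in range(ROWS)]
transposed = [list(col) for col in zip(*rows)]

print(rows)        # [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]
print(transposed)  # [[0.0, 3.0], [1.0, 4.0], [2.0, 5.0]]
```

Because the Python side only receives a view on memory owned by the library, copying the array (for example with `.copy()` on the NumPy view) is a reasonable precaution if the result must outlive the next call.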