Testing and demonstrating the functionality of the TensorRT framework in a C++ application. In particular, I show how the NMS plugin can be used.
The goal of this repo is to show how one can implement deep-learning-based detection inference workloads (for example, those using YOLO). The main library is largely based on cyrusbehr's repo, but I removed the engine-building functions; use trtexec to build the engine instead.
This repo aims to provide three things:
- Lay out how to prepare an ONNX-based detection model and attach the TensorRT NMS plugin: `add_nms_to_graph.py` (you should edit this file to fit your model)
- Provide a C++ TensorRT execution library: `trt_infer_engine.h`, `trt_infer_engine.cpp`
- Demonstrate how this library can be used in a real scenario (WIP)
In `add_nms_to_graph.py`, onnxsim is used to simplify the graph and apply some basic operation fusion (Conv+BN -> Conv, trimming out constants). Although running onnxsim doesn't actually improve the resulting TensorRT engine, it makes visualization much simpler, so I consider it best practice.
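The simplification step itself is just a call to onnxsim. A minimal sketch follows; the input and output paths are placeholders, not the exact names used in this repo:

```python
import onnx
from onnxsim import simplify

# Load the exported detector ONNX, fold constants and fuse Conv+BN,
# then save the simplified graph (mainly for easier visualization).
model = onnx.load("detector.onnx")  # placeholder path
model_simp, ok = simplify(model)
assert ok, "onnxsim could not validate the simplified model"
onnx.save(model_simp, "detector_simplified.onnx")  # placeholder path
```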
Then, the EfficientNMS plugin is attached to the appropriate output nodes. TensorRT plugins such as BatchedNMS and EfficientNMS enable NMS inside the TensorRT engine. Unfortunately, other popular NMS variants such as Matrix NMS and Soft-NMS do not have plugins yet, as far as I know. Since Soft-NMS is widely used in industry, implementing a Soft-NMS plugin could be a worthwhile exercise.
Although the developers state that the EfficientNMS plugin will be deprecated and users should use INMSLayer instead (source), it is far simpler to just use the EfficientNMS plugin, so I use it here.
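For reference, attaching EfficientNMS_TRT is typically done with onnx_graphsurgeon, roughly as sketched below. The output tensor order, thresholds, max_output_boxes, and box coding are assumptions; `add_nms_to_graph.py` must match your model's actual output heads.

```python
import numpy as np
import onnx
import onnx_graphsurgeon as gs

# Sketch: append an EfficientNMS_TRT node to the graph outputs.
# Assumes the graph already exposes two outputs in this order:
#   boxes  [batch, num_boxes, 4]  and  scores [batch, num_boxes, num_classes].
graph = gs.import_onnx(onnx.load("detector_simplified.onnx"))  # placeholder path
boxes, scores = graph.outputs

max_dets = 100  # example value
nms_outputs = [
    gs.Variable("num_detections", dtype=np.int32, shape=["batch", 1]),
    gs.Variable("detection_boxes", dtype=np.float32, shape=["batch", max_dets, 4]),
    gs.Variable("detection_scores", dtype=np.float32, shape=["batch", max_dets]),
    gs.Variable("detection_classes", dtype=np.int32, shape=["batch", max_dets]),
]
nms = gs.Node(
    op="EfficientNMS_TRT",
    name="efficient_nms",
    attrs={
        "plugin_version": "1",
        "background_class": -1,       # no background class
        "max_output_boxes": max_dets,
        "score_threshold": 0.25,      # example threshold
        "iou_threshold": 0.5,         # example threshold
        "score_activation": 0,        # scores are already probabilities
        "box_coding": 0,              # boxes given as [x1, y1, x2, y2]
    },
    inputs=[boxes, scores],
    outputs=nms_outputs,
)
graph.nodes.append(nms)
graph.outputs = nms_outputs
graph.cleanup().toposort()
onnx.save(gs.export_onnx(graph), "detector_w_trt_nms.onnx")  # placeholder path
```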
An example trtexec command for building the engine would look something like this:
trtexec --onnx=ppyoloe_plus_crn_l_80e_coco_w_trt_nms.onnx --saveEngine=ppyoloe_plus_crn_l_80e_coco_w_trt_nms.trt --fp16 --infStreams=1 --memPoolSize=workspace:2048 --iterations=100
You are encouraged to try multiple values of infStreams and memPoolSize to see which combination maximizes inference speed on your system.
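If you want to sweep these settings systematically, a small driver script can build several engines in one go. The sketch below reuses the flags from the command above; the ONNX path and the sweep values are arbitrary examples:

```python
import itertools
import subprocess

# Build (and benchmark) engines for a few trtexec configurations.
onnx_path = "ppyoloe_plus_crn_l_80e_coco_w_trt_nms.onnx"
for streams, workspace_mib in itertools.product([1, 2, 4], [1024, 2048, 4096]):
    engine_path = f"engine_s{streams}_w{workspace_mib}.trt"
    subprocess.run(
        [
            "trtexec",
            f"--onnx={onnx_path}",
            f"--saveEngine={engine_path}",
            "--fp16",
            f"--infStreams={streams}",
            f"--memPoolSize=workspace:{workspace_mib}",
            "--iterations=100",
        ],
        check=True,
    )
```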