PaddleYOLO

Introduction

Introduction
ModelZoo
- PP-YOLOE
- YOLOX
- YOLOv5
- YOLOv6
- YOLOv7
UserGuide
- Pipeline
- CustomDataset

Introduction

PaddleYOLO is a YOLO series toolbox based on PaddleDetection. It contains only the code relevant to YOLO series models and supports YOLOv3, PP-YOLO, PP-YOLOv2, PP-YOLOE, PP-YOLOE+, YOLOX, YOLOv5, YOLOv6, YOLOv7 and others. You are welcome to use it and to help build it.

Updates

[2022/09/26] Released PaddleYOLO.
[2022/09/19] Added support for the new version of YOLOv6, including the n/t/s/m/l models.
[2022/08/23] Released the YOLOSeries codebase: supports YOLOv3, PP-YOLOE, PP-YOLOE+, YOLOX, YOLOv5, YOLOv6 and YOLOv7; supports a ConvNeXt backbone for high-precision versions of PP-YOLOE, YOLOX and YOLOv5; supports PaddleSlim accelerated quantization training for PP-YOLOE, YOLOv5, YOLOv6 and YOLOv7. For details, please read this article.

Notes:

PaddleYOLO is licensed under GPL 3.0. The code for YOLOv5, YOLOv6 and YOLOv7 will not be merged into PaddleDetection. For all other YOLO models, PaddleDetection is the recommended codebase; it will be the first to release the latest progress on the PP-YOLO series of detection models.
PaddlePaddle 2.3.2 or above is recommended for PaddleYOLO; please refer to the official website to download the appropriate version. **On Windows platforms, please install the Paddle develop version.**

Exchanges

If you have any questions or suggestions, please let us know via GitHub Issues.

You are welcome to join the PaddleDetection user groups on WeChat (scan the QR code, add the assistant, and reply "D").

ModelZoo

PP-YOLOE, PP-YOLOE+

Baseline

Model	Input Size	images/GPU	Epoch	TRT-FP16-Latency(ms)	mAP^val 0.5:0.95	mAP^val 0.5	Params(M)	FLOPs(G)	download	config
PP-YOLOE-s	640	32	400e	2.9	43.4	60.0	7.93	17.36	model	config
PP-YOLOE-s	640	32	300e	2.9	43.0	59.6	7.93	17.36	model	config
PP-YOLOE-m	640	28	300e	6.0	49.0	65.9	23.43	49.91	model	config
PP-YOLOE-l	640	20	300e	8.7	51.4	68.6	52.20	110.07	model	config
PP-YOLOE-x	640	16	300e	14.9	52.3	69.5	98.42	206.59	model	config
PP-YOLOE-tiny ConvNeXt	640	16	36e	-	44.6	63.3	33.04	13.87	model	config
PP-YOLOE+_s	640	8	80e	2.9	43.7	60.6	7.93	17.36	model	config
PP-YOLOE+_m	640	8	80e	6.0	49.8	67.1	23.43	49.91	model	config
PP-YOLOE+_l	640	8	80e	8.7	52.9	70.1	52.20	110.07	model	config
PP-YOLOE+_x	640	8	80e	14.9	54.7	72.0	98.42	206.59	model	config

Deploy Models

Model	Input Size	Exported weights(w/o NMS)	ONNX(w/o NMS)
PP-YOLOE-s (400 epochs)	640	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)
PP-YOLOE-s	640	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)
PP-YOLOE-m	640	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)
PP-YOLOE-l	640	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)
PP-YOLOE-x	640	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)
PP-YOLOE+_s	640	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)
PP-YOLOE+_m	640	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)
PP-YOLOE+_l	640	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)
PP-YOLOE+_x	640	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)

YOLOX

Baseline

Model	Input Size	images/GPU	Epoch	TRT-FP16-Latency(ms)	mAP^val 0.5:0.95	mAP^val 0.5	Params(M)	FLOPs(G)	download	config
YOLOX-nano	416	8	300e	2.3	26.1	42.0	0.91	1.08	model	config
YOLOX-tiny	416	8	300e	2.8	32.9	50.4	5.06	6.45	model	config
YOLOX-s	640	8	300e	3.0	40.4	59.6	9.0	26.8	model	config
YOLOX-m	640	8	300e	5.8	46.9	65.7	25.3	73.8	model	config
YOLOX-l	640	8	300e	9.3	50.1	68.8	54.2	155.6	model	config
YOLOX-x	640	8	300e	16.6	51.8	70.6	99.1	281.9	model	config
YOLOX-cdn-tiny	416	8	300e	1.9	32.4	50.2	5.03	6.33	model	config
YOLOX-crn-s	640	8	300e	3.0	40.4	59.6	7.7	24.69	model	config
YOLOX-s ConvNeXt	640	8	36e	-	44.6	65.3	36.2	27.52	model	config

Deploy Models

Model	Input Size	Exported weights(w/o NMS)	ONNX(w/o NMS)
YOLOX-nano	416	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)
YOLOX-tiny	416	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)
YOLOX-s	640	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)
YOLOX-m	640	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)
YOLOX-l	640	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)
YOLOX-x	640	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)

YOLOv5

Baseline

Model	Input Size	images/GPU	Epoch	TRT-FP16-Latency(ms)	mAP^val 0.5:0.95	mAP^val 0.5	Params(M)	FLOPs(G)	download	config
YOLOv5-n	640	16	300e	2.6	28.0	45.7	1.87	4.52	model	config
YOLOv5-s	640	8	300e	3.2	37.0	55.9	7.24	16.54	model	config
YOLOv5-m	640	5	300e	5.2	45.3	63.8	21.19	49.08	model	config
YOLOv5-l	640	3	300e	7.9	48.6	66.9	46.56	109.32	model	config
YOLOv5-x	640	2	300e	13.7	50.6	68.7	86.75	205.92	model	config
YOLOv5-s ConvNeXt	640	8	36e	-	42.4	65.3	34.54	17.96	model	config

Deploy Models

Model	Input Size	Exported weights(w/o NMS)	ONNX(w/o NMS)
YOLOv5-n	640	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)
YOLOv5-s	640	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)
YOLOv5-m	640	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)
YOLOv5-l	640	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)
YOLOv5-x	640	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)

YOLOv6

Baseline

Model	Input Size	images/GPU	Epoch	TRT-FP16-Latency(ms)	mAP^val 0.5:0.95	mAP^val 0.5	Params(M)	FLOPs(G)	download	config
*YOLOv6-n	416	32	400e	1.0	31.1	45.3	4.74	5.16	model	config
*YOLOv6-n	640	32	400e	1.3	36.1	51.9	4.74	12.21	model	config
*YOLOv6-t	640	32	400e	2.1	40.7	57.4	10.63	27.29	model	config
*YOLOv6-s	640	32	400e	2.6	43.4	60.5	18.87	48.35	model	config
*YOLOv6-m	640	32	300e	5.0	49.0	66.5	37.17	88.82	model	config
*YOLOv6-l	640	32	300e	7.9	51.0	68.9	63.54	155.89	model	config
*YOLOv6-l-silu	640	32	300e	9.6	51.7	69.6	58.59	142.66	model	config

Deploy Models

Model	Input Size	Exported weights(w/o NMS)	ONNX(w/o NMS)
YOLOv6-n	416	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)
YOLOv6-n	640	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)
YOLOv6-t	640	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)
YOLOv6-s	640	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)
YOLOv6-m	640	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)
YOLOv6-l	640	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)
YOLOv6-l-silu	640	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)

YOLOv7

Baseline

Model	Input Size	images/GPU	Epoch	TRT-FP16-Latency(ms)	mAP^val 0.5:0.95	mAP^val 0.5	Params(M)	FLOPs(G)	download	config
YOLOv7-L	640	32	300e	7.4	51.0	70.2	37.62	106.08	model	config
*YOLOv7-X	640	32	300e	12.2	53.0	70.8	71.34	190.08	model	config
*YOLOv7P6-W6	1280	16	300e	25.5	54.4	71.8	70.43	360.26	model	config
*YOLOv7P6-E6	1280	10	300e	31.1	55.7	73.0	97.25	515.4	model	config
*YOLOv7P6-D6	1280	8	300e	37.4	56.1	73.3	133.81	702.92	model	config
*YOLOv7P6-E6E	1280	6	300e	48.7	56.5	73.7	151.76	843.52	model	config
YOLOv7-tiny	640	32	300e	-	37.3	54.5	6.23	6.90	model	config
YOLOv7-tiny	416	32	300e	-	33.3	49.5	6.23	2.91	model	config
YOLOv7-tiny	320	32	300e	-	29.1	43.8	6.23	1.73	model	config

Deploy Models

Model	Input Size	Exported weights(w/o NMS)	ONNX(w/o NMS)
YOLOv7-L	640	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)
YOLOv7-X	640	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)
YOLOv7P6-W6	1280	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)
YOLOv7P6-E6	1280	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)
YOLOv7P6-D6	1280	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)
YOLOv7P6-E6E	1280	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)
YOLOv7-tiny	640	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)
YOLOv7-tiny	416	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)
YOLOv7-tiny	320	(w/ nms) | (w/o nms)	(w/ nms) | (w/o nms)

Notes:

All models are trained on the COCO train2017 dataset and evaluated on the val2017 dataset. A * in front of a model name indicates that its training is still being updated.
For accuracy and speed details, please see the PP-YOLOE, YOLOX, YOLOv5, YOLOv6 and YOLOv7 model pages. Note that YOLOv5, YOLOv6 and YOLOv7 do not use multi_label in evaluation.
TRT-FP16-Latency(ms) is the inference time measured under TensorRT FP16, excluding data preprocessing and model output post-processing (NMS). Tests use a single V100 GPU with batch size 1; the test environment is paddlepaddle-2.3.0, CUDA 11.2, CUDNN 8.2, GCC-8.2 and TensorRT 8.0.3.4. Please refer to the respective model homepage for details.
To obtain FLOPs(G), first install PaddleSlim (pip install paddleslim), then set print_flops: True in runtime.yml. Make sure a single input scale such as 640x640 is used. The printed values are MACs, and FLOPs = 2 * MACs.
With PaddleSlim, quantization training of YOLO series models is essentially lossless in accuracy and generally improves speed by more than 30%. For details, please refer to auto_compression.
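
The MACs-to-FLOPs conversion mentioned in the notes can be applied with a small helper. This is a minimal sketch: the function name macs_to_flops is hypothetical and is not part of PaddleYOLO or PaddleSlim, and the 8.68 input is an illustrative MACs reading, not a value taken from a real log.

```shell
# Hypothetical helper (not part of PaddleYOLO): convert a MACs count,
# given in G, to FLOPs in G using the relation FLOPs = 2 * MACs.
macs_to_flops() {
  awk -v macs="$1" 'BEGIN { printf "%.2f\n", 2 * macs }'
}

# Illustrative call: a printed value of 8.68 G MACs.
macs_to_flops 8.68   # prints 17.36
```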

UserGuide

Download the MS-COCO dataset from the official website. The download links are: annotations, train2017, val2017, test2017. The PaddleDetection team also provides download links: coco (about 22G) and test2017. Note that test2017 is optional; evaluation is based on val2017.
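
After downloading, it can help to confirm that the dataset is laid out as the configs expect before starting a long training job. The sketch below assumes the dataset lives under dataset/coco with the standard COCO 2017 file names; that root path and the function name check_coco_layout are assumptions, so adjust them to match the dataset settings in your config.

```shell
# Hypothetical helper: verify an assumed COCO 2017 layout.
# Assumed layout (check against your dataset config):
#   <root>/annotations/instances_train2017.json
#   <root>/annotations/instances_val2017.json
#   <root>/train2017/
#   <root>/val2017/
check_coco_layout() {
  root="${1:-dataset/coco}"
  status=0
  for f in annotations/instances_train2017.json annotations/instances_val2017.json; do
    if [ ! -f "${root}/${f}" ]; then
      echo "missing file: ${root}/${f}" >&2
      status=1
    fi
  done
  for d in train2017 val2017; do
    if [ ! -d "${root}/${d}" ]; then
      echo "missing directory: ${root}/${d}" >&2
      status=1
    fi
  done
  return "${status}"
}

# Usage: check_coco_layout dataset/coco && echo "COCO layout looks complete"
```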

Pipeline

model_type=ppyoloe # can modify to 'yolov7'
job_name=ppyoloe_crn_l_300e_coco # can modify to 'yolov7_l_300e_coco'

config=configs/${model_type}/${job_name}.yml
log_dir=log_dir/${job_name}
# weights=https://bj.bcebos.com/v1/paddledet/models/${job_name}.pdparams
weights=output/${job_name}/model_final.pdparams

# 1. training (single GPU / multi GPU)
# CUDA_VISIBLE_DEVICES=0 python3.7 tools/train.py -c ${config} --eval --amp
python3.7 -m paddle.distributed.launch --log_dir=${log_dir} --gpus 0,1,2,3,4,5,6,7 tools/train.py -c ${config} --eval --amp

# 2.eval
CUDA_VISIBLE_DEVICES=0 python3.7 tools/eval.py -c ${config} -o weights=${weights} --classwise

# 3.infer
CUDA_VISIBLE_DEVICES=0 python3.7 tools/infer.py -c ${config} -o weights=${weights} --infer_img=demo/000000014439_640x640.jpg --draw_threshold=0.5

# 4.export
CUDA_VISIBLE_DEVICES=0 python3.7 tools/export_model.py -c ${config} -o weights=${weights} # exclude_nms=True trt=True

# 5.deploy infer
CUDA_VISIBLE_DEVICES=0 python3.7 deploy/python/infer.py --model_dir=output_inference/${job_name} --image_file=demo/000000014439_640x640.jpg --device=GPU

# 6.deploy speed
CUDA_VISIBLE_DEVICES=0 python3.7 deploy/python/infer.py --model_dir=output_inference/${job_name} --image_file=demo/000000014439_640x640.jpg --device=GPU --run_benchmark=True # --run_mode=trt_fp16

# 7.export onnx
paddle2onnx --model_dir output_inference/${job_name} --model_filename model.pdmodel --params_filename model.pdiparams --opset_version 12 --save_file ${job_name}.onnx

# 8.onnx speed
/usr/local/TensorRT-8.0.3.4/bin/trtexec --onnx=${job_name}.onnx --workspace=4096 --avgRuns=10 --shapes=input:1x3x640x640 --fp16
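
Step 7 reads model.pdmodel and model.pdiparams from the export directory, so it fails if step 4 has not been run. A small guard, sketched below, can make that failure explicit. The function name check_export is hypothetical; the two file names are the ones passed to paddle2onnx above.

```shell
# Hypothetical guard: confirm the exported model files exist before
# running paddle2onnx on them (step 7).
check_export() {
  model_dir="$1"
  for f in model.pdmodel model.pdiparams; do
    if [ ! -f "${model_dir}/${f}" ]; then
      echo "missing ${model_dir}/${f}; run step 4 (export) first" >&2
      return 1
    fi
  done
  return 0
}

# Usage: check_export "output_inference/${job_name}" before the paddle2onnx command.
```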

Note:

Write the commands above into a script file such as run.sh and run it with sh run.sh. You can also run the commands one at a time.
To switch models, just modify the first two lines, for example:
```
model_type=yolov7
job_name=yolov7_l_300e_coco
```
To obtain FLOPs(G), first install PaddleSlim (pip install paddleslim), then set print_flops: True in runtime.yml. Make sure a single input scale such as 640x640 is used. The printed values are MACs, and FLOPs = 2 * MACs.
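
Instead of editing the first two lines by hand, the derived variables can be computed from two arguments. The sketch below is one possible arrangement: the function name set_job is hypothetical, and the path patterns are copied from the pipeline script above.

```shell
# Hypothetical helper: derive the pipeline variables from a model type
# and a job name, defaulting to the values used in the script above.
set_job() {
  model_type="${1:-ppyoloe}"
  job_name="${2:-ppyoloe_crn_l_300e_coco}"
  config="configs/${model_type}/${job_name}.yml"
  log_dir="log_dir/${job_name}"
  weights="output/${job_name}/model_final.pdparams"
}

# Usage: set_job yolov7 yolov7_l_300e_coco
# then run the numbered steps, which read ${config}, ${log_dir} and ${weights}.
```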

CustomDataset

Preparation:

1. For annotating a custom dataset, please refer to DetAnnoTools;

2. For preparing a custom dataset for training, please refer to PrepareDataSet.

Finetune:

In addition to changing the dataset path, it is generally recommended to load the COCO pre-trained weights of the corresponding model for finetuning, which converges faster and achieves higher accuracy. For example:

# finetune with a single GPU:
# CUDA_VISIBLE_DEVICES=0 python3.7 tools/train.py -c ${config} --eval --amp -o pretrain_weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams

# finetune with multiple GPUs:
python3.7 -m paddle.distributed.launch --log_dir=./log_dir --gpus 0,1,2,3,4,5,6,7 tools/train.py -c ${config} --eval --amp -o pretrain_weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_300e_coco.pdparams

Note:

During finetuning, a message will report that the channels of the last layer of the head classification branch do not match. This is normal, because the number of categories in a custom dataset generally differs from that of the COCO dataset.
In general, finetuning needs fewer epochs and a smaller learning rate, for example 1/10 of the original. The highest accuracy may occur in one of the middle epochs.
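
The 1/10 learning-rate suggestion can be turned into a pair of -o overrides. The sketch below is only illustrative: the function name finetune_overrides is hypothetical, and the keys epoch and LearningRate.base_lr are assumed to match your config files, so verify them before use. The inputs 0.01 and 30 are example values, not recommended settings.

```shell
# Hypothetical helper: build "-o" overrides for finetuning from the
# original base learning rate and a shortened epoch count.
# The key names are assumptions; check them against your config.
finetune_overrides() {
  base_lr="$1"   # learning rate of the original schedule
  epochs="$2"    # reduced number of epochs for finetuning
  awk -v lr="${base_lr}" -v ep="${epochs}" \
    'BEGIN { printf "epoch=%d LearningRate.base_lr=%g\n", ep, lr / 10 }'
}

# Illustrative call with example values.
finetune_overrides 0.01 30   # prints: epoch=30 LearningRate.base_lr=0.001
```

The printed string can be appended after -o in the finetune commands above, next to pretrain_weights.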

Predict and export:

When predicting with or exporting a model trained on a custom dataset, the 80 COCO categories are used by default if the path of the TestDataset dataset is set incorrectly.

Besides setting the TestDataset path correctly, you can also create or modify a label_list.txt file (one category per line) and point anno_path in TestDataset to it. anno_path can also be set to an absolute path, for example:

TestDataset:
  !ImageFolder
    anno_path: label_list.txt # if dataset_dir is not set, anno_path is relative to the PaddleDetection root directory
    # dataset_dir: dataset/my_coco # if dataset_dir is set, anno_path resolves to dataset_dir/anno_path

Each line in label_list.txt records one category:

person
vehicle
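
A label list like the one above can be generated and sanity-checked from the shell. This is a minimal sketch; the function names write_label_list and count_classes are hypothetical, and the two category names are the ones from the example.

```shell
# Hypothetical helpers: write one category per line, then count the
# categories so the number can be compared with the training setup.
write_label_list() {
  out="$1"
  shift
  : > "${out}"                      # create or truncate the file
  for name in "$@"; do
    printf '%s\n' "${name}" >> "${out}"
  done
}

count_classes() {
  grep -c . "$1"                    # counts non-empty lines
}

# Usage:
#   write_label_list label_list.txt person vehicle
#   count_classes label_list.txt
```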
