diff --git a/README.md b/README.md
index 9ba283a..2df5817 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
# DeepStream-Yolo
-NVIDIA DeepStream SDK 7.0 / 6.4 / 6.3 / 6.2 / 6.1.1 / 6.1 / 6.0.1 / 6.0 / 5.1 configuration for YOLO models
+NVIDIA DeepStream SDK 7.1 / 7.0 / 6.4 / 6.3 / 6.2 / 6.1.1 / 6.1 / 6.0.1 / 6.0 / 5.1 configuration for YOLO models
--------------------------------------------------------------------------------------------------
### For now, I am limited in the updates I can make. Thank you for understanding.
@@ -12,19 +12,13 @@ NVIDIA DeepStream SDK 7.0 / 6.4 / 6.3 / 6.2 / 6.1.1 / 6.1 / 6.0.1 / 6.0 / 5.1 c
### Important: please export the ONNX model with the new export file, generate the TensorRT engine again with the updated files, and use the new config_infer_primary file according to your model
--------------------------------------------------------------------------------------------------
-### Future updates
-
-* DeepStream tutorials
-* Updated INT8 calibration
-* Support for classification models
-
### Improvements on this repository
* Support for INT8 calibration
* Support for non square models
* Models benchmarks
* Support for Darknet models (YOLOv4, etc) using cfg and weights conversion with GPU post-processing
-* Support for RT-DETR, YOLO-NAS, PPYOLOE+, PPYOLOE, DAMO-YOLO, YOLOX, YOLOR, YOLOv8, YOLOv7, YOLOv6 and YOLOv5 using ONNX conversion with GPU post-processing
+* Support for RT-DETR, YOLO-NAS, PPYOLOE+, PPYOLOE, DAMO-YOLO, Gold-YOLO, RTMDet (MMYOLO), YOLOX, YOLOR, YOLOv9, YOLOv8, YOLOv7, YOLOv6 and YOLOv5 using ONNX conversion with GPU post-processing
* GPU bbox parser
* Custom ONNX model parser
* Dynamic batch-size
@@ -47,8 +41,11 @@ NVIDIA DeepStream SDK 7.0 / 6.4 / 6.3 / 6.2 / 6.1.1 / 6.1 / 6.0.1 / 6.0 / 5.1 c
* [YOLOv6 usage](docs/YOLOv6.md)
* [YOLOv7 usage](docs/YOLOv7.md)
* [YOLOv8 usage](docs/YOLOv8.md)
+* [YOLOv9 usage](docs/YOLOv9.md)
* [YOLOR usage](docs/YOLOR.md)
* [YOLOX usage](docs/YOLOX.md)
+* [RTMDet (MMYOLO) usage](docs/RTMDet.md)
+* [Gold-YOLO usage](docs/GoldYOLO.md)
* [DAMO-YOLO usage](docs/DAMOYOLO.md)
* [PP-YOLOE / PP-YOLOE+ usage](docs/PPYOLOE.md)
* [YOLO-NAS usage](docs/YOLONAS.md)
@@ -62,6 +59,16 @@ NVIDIA DeepStream SDK 7.0 / 6.4 / 6.3 / 6.2 / 6.1.1 / 6.1 / 6.0.1 / 6.0 / 5.1 c
### Requirements
+#### DeepStream 7.1 on x86 platform
+
+* [Ubuntu 22.04](https://releases.ubuntu.com/22.04/)
+* [CUDA 12.6 Update 2](https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=22.04&target_type=runfile_local)
+* [TensorRT 10.3 GA (10.3.0.26)](https://developer.nvidia.com/nvidia-tensorrt-8x-download)
+* [NVIDIA Driver 535.183.06 (Data center / Tesla series) / 560.35.03 (TITAN, GeForce RTX / GTX series and RTX / Quadro series)](https://www.nvidia.com/Download/index.aspx)
+* [NVIDIA DeepStream SDK 7.1](https://catalog.ngc.nvidia.com/orgs/nvidia/resources/deepstream/files?version=7.1)
+* [GStreamer 1.20.3](https://gstreamer.freedesktop.org/)
+* [DeepStream-Yolo](https://github.com/marcoslucianops/DeepStream-Yolo)
+
#### DeepStream 7.0 on x86 platform
* [Ubuntu 22.04](https://releases.ubuntu.com/22.04/)
@@ -142,6 +149,12 @@ NVIDIA DeepStream SDK 7.0 / 6.4 / 6.3 / 6.2 / 6.1.1 / 6.1 / 6.0.1 / 6.0 / 5.1 c
* [GStreamer 1.14.5](https://gstreamer.freedesktop.org/)
* [DeepStream-Yolo](https://github.com/marcoslucianops/DeepStream-Yolo)
+#### DeepStream 7.1 on Jetson platform
+
+* [JetPack 6.1](https://developer.nvidia.com/embedded/jetpack-sdk-61)
+* [NVIDIA DeepStream SDK 7.1](https://catalog.ngc.nvidia.com/orgs/nvidia/resources/deepstream/files?version=7.1)
+* [DeepStream-Yolo](https://github.com/marcoslucianops/DeepStream-Yolo)
+
#### DeepStream 7.0 on Jetson platform
* [JetPack 6.0](https://developer.nvidia.com/embedded/jetpack-sdk-60)
@@ -201,10 +214,13 @@ NVIDIA DeepStream SDK 7.0 / 6.4 / 6.3 / 6.2 / 6.1.1 / 6.1 / 6.0.1 / 6.0 / 5.1 c
* [YOLOv6](https://github.com/meituan/YOLOv6)
* [YOLOv7](https://github.com/WongKinYiu/yolov7)
* [YOLOv8](https://github.com/ultralytics/ultralytics)
+* [YOLOv9](https://github.com/WongKinYiu/yolov9)
* [YOLOR](https://github.com/WongKinYiu/yolor)
* [YOLOX](https://github.com/Megvii-BaseDetection/YOLOX)
+* [RTMDet (MMYOLO)](https://github.com/open-mmlab/mmyolo/tree/main/configs/rtmdet)
+* [Gold-YOLO](https://github.com/huawei-noah/Efficient-Computing/tree/master/Detection/Gold-YOLO)
* [DAMO-YOLO](https://github.com/tinyvision/DAMO-YOLO)
-* [PP-YOLOE / PP-YOLOE+](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyoloe)
+* [PP-YOLOE / PP-YOLOE+](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/ppyoloe)
* [YOLO-NAS](https://github.com/Deci-AI/super-gradients/blob/master/YOLONAS.md)
##
@@ -231,6 +247,7 @@ export CUDA_VER=XY.Z
* x86 platform
```
+ DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 = 12.1
DeepStream 6.2 = 11.8
@@ -243,6 +260,7 @@ export CUDA_VER=XY.Z
* Jetson platform
```
+ DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
@@ -297,14 +315,14 @@ config-file=config_infer_primary_yoloV2.txt
* x86 platform
```
- nvcr.io/nvidia/deepstream:7.0-gc-triton-devel
- nvcr.io/nvidia/deepstream:7.0-triton-multiarch
+ nvcr.io/nvidia/deepstream:7.1-gc-triton-devel
+ nvcr.io/nvidia/deepstream:7.1-triton-multiarch
```
* Jetson platform
```
- nvcr.io/nvidia/deepstream:7.0-triton-multiarch
+ nvcr.io/nvidia/deepstream:7.1-triton-multiarch
```
**NOTE**: To compile the `nvdsinfer_custom_impl_Yolo`, you need to install the g++ inside the container
@@ -313,7 +331,7 @@ config-file=config_infer_primary_yoloV2.txt
apt-get install build-essential
```
-**NOTE**: With DeepStream 7.0, the docker containers do not package libraries necessary for certain multimedia operations like audio data parsing, CPU decode, and CPU encode. This change could affect processing certain video streams/files like mp4 that include audio track. Please run the below script inside the docker images to install additional packages that might be necessary to use all of the DeepStreamSDK features:
+**NOTE**: With DeepStream 7.1, the docker containers do not package libraries necessary for certain multimedia operations like audio data parsing, CPU decode, and CPU encode. This change could affect processing certain video streams/files like mp4 that include audio track. Please run the below script inside the docker images to install additional packages that might be necessary to use all of the DeepStreamSDK features:
```
/opt/nvidia/deepstream/deepstream/user_additional_install.sh
diff --git a/config_infer_primary_damoyolo.txt b/config_infer_primary_damoyolo.txt
index 4c55b21..98cb89e 100644
--- a/config_infer_primary_damoyolo.txt
+++ b/config_infer_primary_damoyolo.txt
@@ -16,8 +16,8 @@ network-type=0
cluster-mode=2
maintain-aspect-ratio=0
#workspace-size=2000
-parse-bbox-func-name=NvDsInferParseYoloE
-#parse-bbox-func-name=NvDsInferParseYoloECuda
+parse-bbox-func-name=NvDsInferParseYolo
+#parse-bbox-func-name=NvDsInferParseYoloCuda
custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
engine-create-func-name=NvDsInferYoloCudaEngineGet
diff --git a/config_infer_primary_goldyolo.txt b/config_infer_primary_goldyolo.txt
new file mode 100644
index 0000000..6805de5
--- /dev/null
+++ b/config_infer_primary_goldyolo.txt
@@ -0,0 +1,28 @@
+[property]
+gpu-id=0
+net-scale-factor=0.0039215697906911373
+model-color-format=0
+onnx-file=Gold_s_pre_dist.onnx
+model-engine-file=model_b1_gpu0_fp32.engine
+#int8-calib-file=calib.table
+labelfile-path=labels.txt
+batch-size=1
+network-mode=0
+num-detected-classes=80
+interval=0
+gie-unique-id=1
+process-mode=1
+network-type=0
+cluster-mode=2
+maintain-aspect-ratio=1
+symmetric-padding=1
+#workspace-size=2000
+parse-bbox-func-name=NvDsInferParseYolo
+#parse-bbox-func-name=NvDsInferParseYoloCuda
+custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
+engine-create-func-name=NvDsInferYoloCudaEngineGet
+
+[class-attrs-all]
+nms-iou-threshold=0.45
+pre-cluster-threshold=0.25
+topk=300
diff --git a/config_infer_primary_ppyoloe.txt b/config_infer_primary_ppyoloe.txt
index e1f0558..712eee0 100644
--- a/config_infer_primary_ppyoloe.txt
+++ b/config_infer_primary_ppyoloe.txt
@@ -17,8 +17,8 @@ network-type=0
cluster-mode=2
maintain-aspect-ratio=0
#workspace-size=2000
-parse-bbox-func-name=NvDsInferParseYoloE
-#parse-bbox-func-name=NvDsInferParseYoloECuda
+parse-bbox-func-name=NvDsInferParseYolo
+#parse-bbox-func-name=NvDsInferParseYoloCuda
custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
engine-create-func-name=NvDsInferYoloCudaEngineGet
diff --git a/config_infer_primary_ppyoloe_plus.txt b/config_infer_primary_ppyoloe_plus.txt
index 9180720..224336f 100644
--- a/config_infer_primary_ppyoloe_plus.txt
+++ b/config_infer_primary_ppyoloe_plus.txt
@@ -16,8 +16,8 @@ network-type=0
cluster-mode=2
maintain-aspect-ratio=0
#workspace-size=2000
-parse-bbox-func-name=NvDsInferParseYoloE
-#parse-bbox-func-name=NvDsInferParseYoloECuda
+parse-bbox-func-name=NvDsInferParseYolo
+#parse-bbox-func-name=NvDsInferParseYoloCuda
custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
engine-create-func-name=NvDsInferYoloCudaEngineGet
diff --git a/config_infer_primary_rtmdet.txt b/config_infer_primary_rtmdet.txt
new file mode 100644
index 0000000..08882fe
--- /dev/null
+++ b/config_infer_primary_rtmdet.txt
@@ -0,0 +1,29 @@
+[property]
+gpu-id=0
+net-scale-factor=0.0173520735727919486
+offsets=103.53;116.28;123.675
+model-color-format=1
+onnx-file=rtmdet_s_syncbn_fast_8xb32-300e_coco_20221230_182329-0a8c901a.onnx
+model-engine-file=model_b1_gpu0_fp32.engine
+#int8-calib-file=calib.table
+labelfile-path=labels.txt
+batch-size=1
+network-mode=0
+num-detected-classes=80
+interval=0
+gie-unique-id=1
+process-mode=1
+network-type=0
+cluster-mode=2
+maintain-aspect-ratio=1
+symmetric-padding=1
+#workspace-size=2000
+parse-bbox-func-name=NvDsInferParseYolo
+#parse-bbox-func-name=NvDsInferParseYoloCuda
+custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
+engine-create-func-name=NvDsInferYoloCudaEngineGet
+
+[class-attrs-all]
+nms-iou-threshold=0.45
+pre-cluster-threshold=0.25
+topk=300
diff --git a/config_infer_primary_yoloV9.txt b/config_infer_primary_yoloV9.txt
new file mode 100644
index 0000000..8db48a0
--- /dev/null
+++ b/config_infer_primary_yoloV9.txt
@@ -0,0 +1,28 @@
+[property]
+gpu-id=0
+net-scale-factor=0.0039215697906911373
+model-color-format=0
+onnx-file=yolov9-c.onnx
+model-engine-file=model_b1_gpu0_fp32.engine
+#int8-calib-file=calib.table
+labelfile-path=labels.txt
+batch-size=1
+network-mode=0
+num-detected-classes=80
+interval=0
+gie-unique-id=1
+process-mode=1
+network-type=0
+cluster-mode=2
+maintain-aspect-ratio=1
+symmetric-padding=1
+#workspace-size=2000
+parse-bbox-func-name=NvDsInferParseYolo
+#parse-bbox-func-name=NvDsInferParseYoloCuda
+custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
+engine-create-func-name=NvDsInferYoloCudaEngineGet
+
+[class-attrs-all]
+nms-iou-threshold=0.45
+pre-cluster-threshold=0.25
+topk=300
diff --git a/config_infer_primary_yolonas.txt b/config_infer_primary_yolonas.txt
index 5f32a87..4f3a5f0 100644
--- a/config_infer_primary_yolonas.txt
+++ b/config_infer_primary_yolonas.txt
@@ -17,8 +17,8 @@ cluster-mode=2
maintain-aspect-ratio=1
symmetric-padding=0
#workspace-size=2000
-parse-bbox-func-name=NvDsInferParseYoloE
-#parse-bbox-func-name=NvDsInferParseYoloECuda
+parse-bbox-func-name=NvDsInferParseYolo
+#parse-bbox-func-name=NvDsInferParseYoloCuda
custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
engine-create-func-name=NvDsInferYoloCudaEngineGet
diff --git a/config_infer_primary_yolonas_custom.txt b/config_infer_primary_yolonas_custom.txt
index 5c194bc..046a375 100644
--- a/config_infer_primary_yolonas_custom.txt
+++ b/config_infer_primary_yolonas_custom.txt
@@ -17,8 +17,8 @@ cluster-mode=2
maintain-aspect-ratio=1
symmetric-padding=0
#workspace-size=2000
-parse-bbox-func-name=NvDsInferParseYoloE
-#parse-bbox-func-name=NvDsInferParseYoloECuda
+parse-bbox-func-name=NvDsInferParseYolo
+#parse-bbox-func-name=NvDsInferParseYoloCuda
custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
engine-create-func-name=NvDsInferYoloCudaEngineGet
diff --git a/docs/DAMOYOLO.md b/docs/DAMOYOLO.md
index 694d2f7..42b8a6d 100644
--- a/docs/DAMOYOLO.md
+++ b/docs/DAMOYOLO.md
@@ -16,7 +16,7 @@
git clone https://github.com/tinyvision/DAMO-YOLO.git
cd DAMO-YOLO
pip3 install -r requirements.txt
-pip3 install onnx onnxsim onnxruntime
+pip3 install onnx onnxslim onnxruntime
```
**NOTE**: It is recommended to use Python virtualenv.
@@ -107,6 +107,7 @@ export CUDA_VER=XY.Z
* x86 platform
```
+ DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 = 12.1
DeepStream 6.2 = 11.8
@@ -119,6 +120,7 @@ export CUDA_VER=XY.Z
* Jetson platform
```
+ DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
@@ -139,11 +141,11 @@ Edit the `config_infer_primary_damoyolo.txt` file according to your model (examp
```
[property]
...
-onnx-file=damoyolo_tinynasL25_S.onnx
+onnx-file=damoyolo_tinynasL25_S_477.pth.onnx
...
num-detected-classes=80
...
-parse-bbox-func-name=NvDsInferParseYoloE
+parse-bbox-func-name=NvDsInferParseYolo
...
```
diff --git a/docs/GoldYOLO.md b/docs/GoldYOLO.md
new file mode 100644
index 0000000..e3b0060
--- /dev/null
+++ b/docs/GoldYOLO.md
@@ -0,0 +1,179 @@
+# Gold-YOLO usage
+
+* [Convert model](#convert-model)
+* [Compile the lib](#compile-the-lib)
+* [Edit the config_infer_primary_goldyolo file](#edit-the-config_infer_primary_goldyolo-file)
+* [Edit the deepstream_app_config file](#edit-the-deepstream_app_config-file)
+* [Testing the model](#testing-the-model)
+
+##
+
+### Convert model
+
+#### 1. Download the Gold-YOLO repo and install the requirements
+
+```
+git clone https://github.com/huawei-noah/Efficient-Computing.git
+cd Efficient-Computing/Detection/Gold-YOLO
+pip3 install -r requirements.txt
+pip3 install onnx onnxslim onnxruntime
+```
+
+**NOTE**: It is recommended to use Python virtualenv.
+
+#### 2. Copy the converter
+
+Copy the `export_goldyolo.py` file from the `DeepStream-Yolo/utils` directory to the `Gold-YOLO` folder.
+
+#### 3. Download the model
+
+Download the `pt` file from [Gold-YOLO](https://github.com/huawei-noah/Efficient-Computing/tree/master/Detection/Gold-YOLO) releases
+
+**NOTE**: You can use your custom model.
+
+#### 4. Convert model
+
+Generate the ONNX model file (example for Gold-YOLO-S)
+
+```
+python3 export_goldyolo.py -w Gold_s_pre_dist.pt --dynamic
+```
+
+**NOTE**: To change the inference size (default: 640)
+
+```
+-s SIZE
+--size SIZE
+-s HEIGHT WIDTH
+--size HEIGHT WIDTH
+```
+
+Example for 1280
+
+```
+-s 1280
+```
+
+or
+
+```
+-s 1280 1280
+```
+
+**NOTE**: To simplify the ONNX model (DeepStream >= 6.0)
+
+```
+--simplify
+```
+
+**NOTE**: To use dynamic batch-size (DeepStream >= 6.1)
+
+```
+--dynamic
+```
+
+**NOTE**: To use static batch-size (example for batch-size = 4)
+
+```
+--batch 4
+```
+
+**NOTE**: If you are using the DeepStream 5.1, remove the `--dynamic` arg and use opset 12 or lower. The default opset is 13.
+
+```
+--opset 12
+```
+
+#### 5. Copy generated files
+
+Copy the generated ONNX model file and labels.txt file (if generated) to the `DeepStream-Yolo` folder.
+
+##
+
+### Compile the lib
+
+1. Open the `DeepStream-Yolo` folder and compile the lib
+
+2. Set the `CUDA_VER` according to your DeepStream version
+
+```
+export CUDA_VER=XY.Z
+```
+
+* x86 platform
+
+ ```
+ DeepStream 7.1 = 12.6
+ DeepStream 7.0 / 6.4 = 12.2
+ DeepStream 6.3 = 12.1
+ DeepStream 6.2 = 11.8
+ DeepStream 6.1.1 = 11.7
+ DeepStream 6.1 = 11.6
+ DeepStream 6.0.1 / 6.0 = 11.4
+ DeepStream 5.1 = 11.1
+ ```
+
+* Jetson platform
+
+ ```
+ DeepStream 7.1 = 12.6
+ DeepStream 7.0 / 6.4 = 12.2
+ DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
+ DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
+ ```
+
+3. Make the lib
+
+```
+make -C nvdsinfer_custom_impl_Yolo clean && make -C nvdsinfer_custom_impl_Yolo
+```
+
+##
+
+### Edit the config_infer_primary_goldyolo file
+
+Edit the `config_infer_primary_goldyolo.txt` file according to your model (example for Gold-YOLO-S with 80 classes)
+
+```
+[property]
+...
+onnx-file=Gold_s_pre_dist.pt.onnx
+...
+num-detected-classes=80
+...
+parse-bbox-func-name=NvDsInferParseYolo
+...
+```
+
+**NOTE**: The **Gold-YOLO** resizes the input with center padding. To get better accuracy, use
+
+```
+[property]
+...
+maintain-aspect-ratio=1
+symmetric-padding=1
+...
+```
+
+##
+
+### Edit the deepstream_app_config file
+
+```
+...
+[primary-gie]
+...
+config-file=config_infer_primary_goldyolo.txt
+```
+
+##
+
+### Testing the model
+
+```
+deepstream-app -c deepstream_app_config.txt
+```
+
+**NOTE**: The TensorRT engine file may take a very long time to generate (sometimes more than 10 minutes).
+
+**NOTE**: For more information about custom models configuration (`batch-size`, `network-mode`, etc), please check the [`docs/customModels.md`](customModels.md) file.
diff --git a/docs/INT8Calibration.md b/docs/INT8Calibration.md
index b752cab..a80e3f4 100644
--- a/docs/INT8Calibration.md
+++ b/docs/INT8Calibration.md
@@ -17,6 +17,7 @@ export CUDA_VER=XY.Z
* x86 platform
```
+ DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 = 12.1
DeepStream 6.2 = 11.8
@@ -29,6 +30,7 @@ export CUDA_VER=XY.Z
* Jetson platform
```
+ DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
diff --git a/docs/PPYOLOE.md b/docs/PPYOLOE.md
index 53dc414..5071810 100644
--- a/docs/PPYOLOE.md
+++ b/docs/PPYOLOE.md
@@ -1,6 +1,6 @@
# PP-YOLOE / PP-YOLOE+ usage
-**NOTE**: You can use the release/2.6 branch of the PPYOLOE repo to convert all model versions.
+**NOTE**: You can use the develop branch of the PPYOLOE repo to convert all model versions.
* [Convert model](#convert-model)
* [Compile the lib](#compile-the-lib)
@@ -14,7 +14,7 @@
#### 1. Download the PaddleDetection repo and install the requirements
-https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.7/docs/tutorials/INSTALL.md
+https://github.com/PaddlePaddle/PaddleDetection/blob/develop/docs/tutorials/INSTALL.md
**NOTE**: It is recommended to use Python virtualenv.
@@ -24,7 +24,7 @@ Copy the `export_ppyoloe.py` file from `DeepStream-Yolo/utils` directory to the
#### 3. Download the model
-Download the `pdparams` file from [PP-YOLOE](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyoloe) releases (example for PP-YOLOE+_s)
+Download the `pdparams` file from [PP-YOLOE](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/ppyoloe) releases (example for PP-YOLOE+_s)
```
wget https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_s_80e_coco.pdparams
@@ -37,7 +37,7 @@ wget https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_s_80e_coco.pdparams
Generate the ONNX model file (example for PP-YOLOE+_s)
```
-pip3 install onnx onnxsim onnxruntime paddle2onnx
+pip3 install onnx onnxslim onnxruntime paddle2onnx
python3 export_ppyoloe.py -w ppyoloe_plus_crn_s_80e_coco.pdparams -c configs/ppyoloe/ppyoloe_plus_crn_s_80e_coco.yml --dynamic
```
@@ -84,6 +84,7 @@ export CUDA_VER=XY.Z
* x86 platform
```
+ DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 = 12.1
DeepStream 6.2 = 11.8
@@ -96,6 +97,7 @@ export CUDA_VER=XY.Z
* Jetson platform
```
+ DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
@@ -116,11 +118,11 @@ Edit the `config_infer_primary_ppyoloe_plus.txt` file according to your model (e
```
[property]
...
-onnx-file=ppyoloe_plus_crn_s_80e_coco.onnx
+onnx-file=ppyoloe_plus_crn_s_80e_coco.pdparams.onnx
...
num-detected-classes=80
...
-parse-bbox-func-name=NvDsInferParseYoloE
+parse-bbox-func-name=NvDsInferParseYolo
...
```
diff --git a/docs/RTDETR_Paddle.md b/docs/RTDETR_Paddle.md
index c3c1009..5560e25 100644
--- a/docs/RTDETR_Paddle.md
+++ b/docs/RTDETR_Paddle.md
@@ -14,13 +14,13 @@
#### 1. Download the PaddleDetection repo and install the requirements
-https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.7/docs/tutorials/INSTALL.md
+https://github.com/PaddlePaddle/PaddleDetection/blob/develop/docs/tutorials/INSTALL.md
```
git clone https://github.com/lyuwenyu/RT-DETR.git
cd RT-DETR/rtdetr_paddle
pip3 install -r requirements.txt
-pip3 install onnx onnxsim onnxruntime paddle2onnx
+pip3 install onnx onnxslim onnxruntime paddle2onnx
```
**NOTE**: It is recommended to use Python virtualenv.
@@ -90,6 +90,7 @@ export CUDA_VER=XY.Z
* x86 platform
```
+ DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 = 12.1
DeepStream 6.2 = 11.8
@@ -102,6 +103,7 @@ export CUDA_VER=XY.Z
* Jetson platform
```
+ DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
@@ -122,7 +124,7 @@ Edit the `config_infer_primary_rtdetr.txt` file according to your model (example
```
[property]
...
-onnx-file=rtdetr_r50vd_6x_coco.onnx
+onnx-file=rtdetr_r50vd_6x_coco.pdparams.onnx
...
num-detected-classes=80
...
diff --git a/docs/RTDETR_PyTorch.md b/docs/RTDETR_PyTorch.md
index 06f7cb1..fa0fad9 100644
--- a/docs/RTDETR_PyTorch.md
+++ b/docs/RTDETR_PyTorch.md
@@ -18,7 +18,7 @@
git clone https://github.com/lyuwenyu/RT-DETR.git
cd RT-DETR/rtdetr_pytorch
pip3 install -r requirements.txt
-pip3 install onnx onnxsim onnxruntime
+pip3 install onnx onnxslim onnxruntime
```
**NOTE**: It is recommended to use Python virtualenv.
@@ -109,6 +109,7 @@ export CUDA_VER=XY.Z
* x86 platform
```
+ DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 = 12.1
DeepStream 6.2 = 11.8
@@ -121,6 +122,7 @@ export CUDA_VER=XY.Z
* Jetson platform
```
+ DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
@@ -141,7 +143,7 @@ Edit the `config_infer_primary_rtdetr.txt` file according to your model (example
```
[property]
...
-onnx-file=rtdetr_r50vd_6x_coco_from_paddle.onnx
+onnx-file=rtdetr_r50vd_6x_coco_from_paddle.pth.onnx
...
num-detected-classes=80
...
diff --git a/docs/RTDETR_Ultralytics.md b/docs/RTDETR_Ultralytics.md
index 8a4857b..8801241 100644
--- a/docs/RTDETR_Ultralytics.md
+++ b/docs/RTDETR_Ultralytics.md
@@ -17,9 +17,8 @@
```
git clone https://github.com/ultralytics/ultralytics.git
cd ultralytics
-pip3 install -r requirements.txt
-python3 setup.py install
-pip3 install onnx onnxsim onnxruntime
+pip3 install -e .
+pip3 install onnx onnxslim onnxruntime
```
**NOTE**: It is recommended to use Python virtualenv.
@@ -30,17 +29,17 @@ Copy the `export_rtdetr_ultralytics.py` file from `DeepStream-Yolo/utils` direct
#### 3. Download the model
-Download the `pt` file from [Ultralytics](https://github.com/ultralytics/assets/releases/) releases (example for RT-DETR-l)
+Download the `pt` file from [Ultralytics](https://github.com/ultralytics/assets/releases/) releases (example for RT-DETR-L)
```
-wget https://github.com/ultralytics/assets/releases/download/v0.0.0/rtdetr-l.pt
+wget https://github.com/ultralytics/assets/releases/download/v8.2.0/rtdetr-l.pt
```
**NOTE**: You can use your custom model.
#### 4. Convert model
-Generate the ONNX model file (example for RT-DETR-l)
+Generate the ONNX model file (example for RT-DETR-L)
```
python3 export_rtdetr_ultralytics.py -w rtdetr-l.pt --dynamic
@@ -110,6 +109,7 @@ export CUDA_VER=XY.Z
* x86 platform
```
+ DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 = 12.1
DeepStream 6.2 = 11.8
@@ -122,6 +122,7 @@ export CUDA_VER=XY.Z
* Jetson platform
```
+ DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
@@ -137,12 +138,12 @@ make -C nvdsinfer_custom_impl_Yolo clean && make -C nvdsinfer_custom_impl_Yolo
### Edit the config_infer_primary_rtdetr file
-Edit the `config_infer_primary_rtdetr.txt` file according to your model (example for RT-DETR-l with 80 classes)
+Edit the `config_infer_primary_rtdetr.txt` file according to your model (example for RT-DETR-L with 80 classes)
```
[property]
...
-onnx-file=rtdetr-l.onnx
+onnx-file=rtdetr-l.pt.onnx
...
num-detected-classes=80
...
diff --git a/docs/RTMDet.md b/docs/RTMDet.md
new file mode 100644
index 0000000..8a70192
--- /dev/null
+++ b/docs/RTMDet.md
@@ -0,0 +1,209 @@
+# RTMDet (MMYOLO) usage
+
+* [Convert model](#convert-model)
+* [Compile the lib](#compile-the-lib)
+* [Edit the config_infer_primary_rtmdet file](#edit-the-config_infer_primary_rtmdet-file)
+* [Edit the deepstream_app_config file](#edit-the-deepstream_app_config-file)
+* [Testing the model](#testing-the-model)
+
+##
+
+### Convert model
+
+#### 1. Download the RTMDet (MMYOLO) repo and install the requirements
+
+```
+git clone https://github.com/open-mmlab/mmyolo.git
+cd mmyolo
+pip3 install openmim
+mim install "mmengine>=0.6.0"
+mim install "mmcv>=2.0.0rc4,<2.1.0"
+mim install "mmdet>=3.0.0,<4.0.0"
+pip3 install -r requirements/albu.txt
+mim install -v -e .
+pip3 install onnx onnxslim onnxruntime
+```
+
+**NOTE**: It is recommended to use Python virtualenv.
+
+#### 2. Copy the converter
+
+Copy the `export_rtmdet.py` file from the `DeepStream-Yolo/utils` directory to the `mmyolo` folder.
+
+#### 3. Download the model
+
+Download the `pth` file from [RTMDet (MMYOLO)](https://github.com/open-mmlab/mmyolo/tree/main/configs/rtmdet) releases (example for RTMDet-s*)
+
+```
+wget https://download.openmmlab.com/mmrazor/v1/rtmdet_distillation/kd_s_rtmdet_m_neck_300e_coco/kd_s_rtmdet_m_neck_300e_coco_20230220_140647-446ff003.pth
+```
+
+**NOTE**: You can use your custom model.
+
+#### 4. Convert model
+
+Generate the ONNX model file (example for RTMDet-s*)
+
+```
+python3 export_rtmdet.py -w kd_s_rtmdet_m_neck_300e_coco_20230220_140647-446ff003.pth -c configs/rtmdet/distillation/kd_s_rtmdet_m_neck_300e_coco.py --dynamic
+```
+
+**NOTE**: To change the inference size (default: 640)
+
+```
+-s SIZE
+--size SIZE
+-s HEIGHT WIDTH
+--size HEIGHT WIDTH
+```
+
+Example for 1280
+
+```
+-s 1280
+```
+
+or
+
+```
+-s 1280 1280
+```
+
+**NOTE**: To simplify the ONNX model (DeepStream >= 6.0)
+
+```
+--simplify
+```
+
+**NOTE**: To use dynamic batch-size (DeepStream >= 6.1)
+
+```
+--dynamic
+```
+
+**NOTE**: To use static batch-size (example for batch-size = 4)
+
+```
+--batch 4
+```
+
+**NOTE**: If you are using the DeepStream 5.1, remove the `--dynamic` arg and use opset 12 or lower. The default opset is 17.
+
+```
+--opset 12
+```
+
+#### 5. Copy generated files
+
+Copy the generated ONNX model file and labels.txt file (if generated) to the `DeepStream-Yolo` folder.
+
+##
+
+### Compile the lib
+
+1. Open the `DeepStream-Yolo` folder and compile the lib
+
+2. Set the `CUDA_VER` according to your DeepStream version
+
+```
+export CUDA_VER=XY.Z
+```
+
+* x86 platform
+
+ ```
+ DeepStream 7.1 = 12.6
+ DeepStream 7.0 / 6.4 = 12.2
+ DeepStream 6.3 = 12.1
+ DeepStream 6.2 = 11.8
+ DeepStream 6.1.1 = 11.7
+ DeepStream 6.1 = 11.6
+ DeepStream 6.0.1 / 6.0 = 11.4
+ DeepStream 5.1 = 11.1
+ ```
+
+* Jetson platform
+
+ ```
+ DeepStream 7.1 = 12.6
+ DeepStream 7.0 / 6.4 = 12.2
+ DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
+ DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
+ ```
+
+3. Make the lib
+
+```
+make -C nvdsinfer_custom_impl_Yolo clean && make -C nvdsinfer_custom_impl_Yolo
+```
+
+##
+
+### Edit the config_infer_primary_rtmdet file
+
+Edit the `config_infer_primary_rtmdet.txt` file according to your model (example for RTMDet-s* with 80 classes)
+
+```
+[property]
+...
+onnx-file=kd_s_rtmdet_m_neck_300e_coco_20230220_140647-446ff003.pth.onnx
+...
+num-detected-classes=80
+...
+parse-bbox-func-name=NvDsInferParseYolo
+...
+```
+
+**NOTE**: The **RTMDet (MMYOLO)** resizes the input with center padding. To get better accuracy, use
+
+```
+[property]
+...
+maintain-aspect-ratio=1
+symmetric-padding=1
+...
+```
+
+**NOTE**: The **RTMDet (MMYOLO)** uses the BGR color format for the image input. It is important to set the `model-color-format` accordingly.
+
+```
+[property]
+...
+model-color-format=1
+...
+```
+
+**NOTE**: The **RTMDet (MMYOLO)** normalizes the image during preprocessing. It is important to set the `net-scale-factor` and `offsets` according to the values used in training.
+
+Default: `mean = 0.485, 0.456, 0.406` and `std = 0.229, 0.224, 0.225`
+
+```
+[property]
+...
+net-scale-factor=0.0173520735727919486
+offsets=103.53;116.28;123.675
+...
+```
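+
+As a quick sanity check (a sketch, assuming DeepStream computes `(pixel - offsets) * net-scale-factor` per channel and supports only a single scale factor, so the three std values are averaged), in Python:
+
+```
+mean = [0.485, 0.456, 0.406]  # RGB order
+std = [0.229, 0.224, 0.225]
+offsets = [m * 255 for m in reversed(mean)]  # BGR order -> 103.53; 116.28; 123.675
+net_scale_factor = 1 / (255 * sum(std) / len(std))  # ~= 0.0173520...
+```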
+
+##
+
+### Edit the deepstream_app_config file
+
+```
+...
+[primary-gie]
+...
+config-file=config_infer_primary_rtmdet.txt
+```
+
+##
+
+### Testing the model
+
+```
+deepstream-app -c deepstream_app_config.txt
+```
+
+**NOTE**: The TensorRT engine file may take a very long time to generate (sometimes more than 10 minutes).
+
+**NOTE**: For more information about custom models configuration (`batch-size`, `network-mode`, etc), please check the [`docs/customModels.md`](customModels.md) file.
diff --git a/docs/YOLONAS.md b/docs/YOLONAS.md
index d53bfe9..06f7dce 100644
--- a/docs/YOLONAS.md
+++ b/docs/YOLONAS.md
@@ -19,7 +19,7 @@ git clone https://github.com/Deci-AI/super-gradients.git
cd super-gradients
pip3 install -r requirements.txt
python3 setup.py install
-pip3 install onnx onnxsim onnxruntime
+pip3 install onnx onnxslim onnxruntime
```
**NOTE**: It is recommended to use Python virtualenv.
@@ -140,6 +140,7 @@ export CUDA_VER=XY.Z
* x86 platform
```
+ DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 = 12.1
DeepStream 6.2 = 11.8
@@ -152,6 +153,7 @@ export CUDA_VER=XY.Z
* Jetson platform
```
+ DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
@@ -172,11 +174,11 @@ Edit the `config_infer_primary_yolonas.txt` file according to your model (exampl
```
[property]
...
-onnx-file=yolo_nas_s_coco.onnx
+onnx-file=yolo_nas_s_coco.pth.onnx
...
num-detected-classes=80
...
-parse-bbox-func-name=NvDsInferParseYoloE
+parse-bbox-func-name=NvDsInferParseYolo
...
```
diff --git a/docs/YOLOR.md b/docs/YOLOR.md
index 2678789..d0653fe 100644
--- a/docs/YOLOR.md
+++ b/docs/YOLOR.md
@@ -20,7 +20,7 @@
git clone https://github.com/WongKinYiu/yolor.git
cd yolor
pip3 install -r requirements.txt
-pip3 install onnx onnxsim onnxruntime
+pip3 install onnx onnxslim onnxruntime
```
**NOTE**: It is recommended to use Python virtualenv.
@@ -125,6 +125,7 @@ export CUDA_VER=XY.Z
* x86 platform
```
+ DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 = 12.1
DeepStream 6.2 = 11.8
@@ -137,6 +138,7 @@ export CUDA_VER=XY.Z
* Jetson platform
```
+ DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
@@ -157,7 +159,7 @@ Edit the `config_infer_primary_yolor.txt` file according to your model (example
```
[property]
...
-onnx-file=yolor_csp.onnx
+onnx-file=yolor_csp.pt.onnx
...
num-detected-classes=80
...
diff --git a/docs/YOLOX.md b/docs/YOLOX.md
index db6f974..b610c1a 100644
--- a/docs/YOLOX.md
+++ b/docs/YOLOX.md
@@ -19,7 +19,7 @@ git clone https://github.com/Megvii-BaseDetection/YOLOX.git
cd YOLOX
pip3 install -r requirements.txt
python3 setup.py develop
-pip3 install onnx onnxsim onnxruntime
+pip3 install onnx onnxslim onnxruntime
```
**NOTE**: It is recommended to use Python virtualenv.
@@ -89,6 +89,7 @@ export CUDA_VER=XY.Z
* x86 platform
```
+ DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 = 12.1
DeepStream 6.2 = 11.8
@@ -101,6 +102,7 @@ export CUDA_VER=XY.Z
* Jetson platform
```
+ DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
@@ -121,7 +123,7 @@ Edit the `config_infer_primary_yolox.txt` file according to your model (example
```
[property]
...
-onnx-file=yolox_s.onnx
+onnx-file=yolox_s.pth.onnx
...
num-detected-classes=80
...
diff --git a/docs/YOLOv5.md b/docs/YOLOv5.md
index 901c3eb..57859b3 100644
--- a/docs/YOLOv5.md
+++ b/docs/YOLOv5.md
@@ -20,7 +20,7 @@
git clone https://github.com/ultralytics/yolov5.git
cd yolov5
pip3 install -r requirements.txt
-pip3 install onnx onnxsim onnxruntime
+pip3 install onnx onnxslim onnxruntime
```
**NOTE**: It is recommended to use Python virtualenv.
@@ -117,6 +117,7 @@ export CUDA_VER=XY.Z
* x86 platform
```
+ DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 = 12.1
DeepStream 6.2 = 11.8
@@ -129,6 +130,7 @@ export CUDA_VER=XY.Z
* Jetson platform
```
+ DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
@@ -149,7 +151,7 @@ Edit the `config_infer_primary_yoloV5.txt` file according to your model (example
```
[property]
...
-onnx-file=yolov5s.onnx
+onnx-file=yolov5s.pt.onnx
...
num-detected-classes=80
...
diff --git a/docs/YOLOv6.md b/docs/YOLOv6.md
index 9a58901..d3de8b1 100644
--- a/docs/YOLOv6.md
+++ b/docs/YOLOv6.md
@@ -20,7 +20,7 @@
git clone https://github.com/meituan/YOLOv6.git
cd YOLOv6
pip3 install -r requirements.txt
-pip3 install onnx onnxsim onnxruntime
+pip3 install onnx onnxslim onnxruntime
```
**NOTE**: It is recommended to use Python virtualenv.
@@ -117,6 +117,7 @@ export CUDA_VER=XY.Z
* x86 platform
```
+ DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 = 12.1
DeepStream 6.2 = 11.8
@@ -129,6 +130,7 @@ export CUDA_VER=XY.Z
* Jetson platform
```
+ DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
@@ -149,7 +151,7 @@ Edit the `config_infer_primary_yoloV6.txt` file according to your model (example
```
[property]
...
-onnx-file=yolov6s.onnx
+onnx-file=yolov6s.pt.onnx
...
num-detected-classes=80
...
diff --git a/docs/YOLOv7.md b/docs/YOLOv7.md
index 2b22cd8..c035e8c 100644
--- a/docs/YOLOv7.md
+++ b/docs/YOLOv7.md
@@ -18,7 +18,7 @@
git clone https://github.com/WongKinYiu/yolov7.git
cd yolov7
pip3 install -r requirements.txt
-pip3 install onnx onnxsim onnxruntime
+pip3 install onnx onnxslim onnxruntime
```
**NOTE**: It is recommended to use Python virtualenv.
@@ -119,6 +119,7 @@ export CUDA_VER=XY.Z
* x86 platform
```
+ DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 = 12.1
DeepStream 6.2 = 11.8
@@ -131,6 +132,7 @@ export CUDA_VER=XY.Z
* Jetson platform
```
+ DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
@@ -151,7 +153,7 @@ Edit the `config_infer_primary_yoloV7.txt` file according to your model (example
```
[property]
...
-onnx-file=yolov7.onnx
+onnx-file=yolov7.pt.onnx
...
num-detected-classes=80
...
diff --git a/docs/YOLOv8.md b/docs/YOLOv8.md
index f206cea..3e13aac 100644
--- a/docs/YOLOv8.md
+++ b/docs/YOLOv8.md
@@ -17,9 +17,8 @@
```
git clone https://github.com/ultralytics/ultralytics.git
cd ultralytics
-pip3 install -r requirements.txt
-python3 setup.py install
-pip3 install onnx onnxsim onnxruntime
+pip3 install -e .
+pip3 install onnx onnxslim onnxruntime
```
**NOTE**: It is recommended to use Python virtualenv.
@@ -33,7 +32,7 @@ Copy the `export_yoloV8.py` file from `DeepStream-Yolo/utils` directory to the `
Download the `pt` file from [YOLOv8](https://github.com/ultralytics/assets/releases/) releases (example for YOLOv8s)
```
-wget https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8s.pt
+wget https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8s.pt
```
**NOTE**: You can use your custom model.
@@ -85,7 +84,7 @@ or
--batch 4
```
-**NOTE**: If you are using the DeepStream 5.1, remove the `--dynamic` arg and use opset 12 or lower. The default opset is 16.
+**NOTE**: If you are using the DeepStream 5.1, remove the `--dynamic` arg and use opset 12 or lower. The default opset is 17.
```
--opset 12
@@ -110,6 +109,7 @@ export CUDA_VER=XY.Z
* x86 platform
```
+ DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 = 12.1
DeepStream 6.2 = 11.8
@@ -122,6 +122,7 @@ export CUDA_VER=XY.Z
* Jetson platform
```
+ DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
@@ -142,7 +143,7 @@ Edit the `config_infer_primary_yoloV8.txt` file according to your model (example
```
[property]
...
-onnx-file=yolov8s.onnx
+onnx-file=yolov8s.pt.onnx
...
num-detected-classes=80
...
diff --git a/docs/YOLOv9.md b/docs/YOLOv9.md
new file mode 100644
index 0000000..2dda3e4
--- /dev/null
+++ b/docs/YOLOv9.md
@@ -0,0 +1,185 @@
+# YOLOv9 usage
+
+**NOTE**: The yaml file is not required.
+
+* [Convert model](#convert-model)
+* [Compile the lib](#compile-the-lib)
+* [Edit the config_infer_primary_yoloV9 file](#edit-the-config_infer_primary_yolov9-file)
+* [Edit the deepstream_app_config file](#edit-the-deepstream_app_config-file)
+* [Testing the model](#testing-the-model)
+
+##
+
+### Convert model
+
+#### 1. Download the YOLOv9 repo and install the requirements
+
+```
+git clone https://github.com/WongKinYiu/yolov9.git
+cd yolov9
+pip3 install -r requirements.txt
+pip3 install onnx onnxslim onnxruntime
+```
+
+**NOTE**: It is recommended to use Python virtualenv.
+
+#### 2. Copy the converter
+
+Copy the `export_yoloV9.py` file from the `DeepStream-Yolo/utils` directory to the `yolov9` folder.
+
+#### 3. Download the model
+
+Download the `pt` file from [YOLOv9](https://github.com/WongKinYiu/yolov9/releases/) releases (example for YOLOv9-S)
+
+```
+wget https://github.com/WongKinYiu/yolov9/releases/download/v0.1/yolov9-s-converted.pt
+```
+
+**NOTE**: You can use your custom model.
+
+#### 4. Convert model
+
+Generate the ONNX model file (example for YOLOv9-S)
+
+```
+python3 export_yoloV9.py -w yolov9-s-converted.pt --dynamic
+```
+
+**NOTE**: To change the inference size (default: 640)
+
+```
+-s SIZE
+--size SIZE
+-s HEIGHT WIDTH
+--size HEIGHT WIDTH
+```
+
+Example for 1280
+
+```
+-s 1280
+```
+
+or
+
+```
+-s 1280 1280
+```
+
+**NOTE**: To simplify the ONNX model (DeepStream >= 6.0)
+
+```
+--simplify
+```
+
+**NOTE**: To use dynamic batch-size (DeepStream >= 6.1)
+
+```
+--dynamic
+```
+
+**NOTE**: To use static batch-size (example for batch-size = 4)
+
+```
+--batch 4
+```
+
+**NOTE**: If you are using the DeepStream 5.1, remove the `--dynamic` arg and use opset 12 or lower. The default opset is 17.
+
+```
+--opset 12
+```
+
+#### 5. Copy generated files
+
+Copy the generated ONNX model file and labels.txt file (if generated) to the `DeepStream-Yolo` folder.
+
+##
+
+### Compile the lib
+
+1. Open the `DeepStream-Yolo` folder and compile the lib
+
+2. Set the `CUDA_VER` according to your DeepStream version
+
+```
+export CUDA_VER=XY.Z
+```
+
+* x86 platform
+
+ ```
+ DeepStream 7.1 = 12.6
+ DeepStream 7.0 / 6.4 = 12.2
+ DeepStream 6.3 = 12.1
+ DeepStream 6.2 = 11.8
+ DeepStream 6.1.1 = 11.7
+ DeepStream 6.1 = 11.6
+ DeepStream 6.0.1 / 6.0 = 11.4
+ DeepStream 5.1 = 11.1
+ ```
+
+* Jetson platform
+
+ ```
+ DeepStream 7.1 = 12.6
+ DeepStream 7.0 / 6.4 = 12.2
+ DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
+ DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
+ ```
+
+3. Make the lib
+
+```
+make -C nvdsinfer_custom_impl_Yolo clean && make -C nvdsinfer_custom_impl_Yolo
+```
+
+##
+
+### Edit the config_infer_primary_yoloV9 file
+
+Edit the `config_infer_primary_yoloV9.txt` file according to your model (example for YOLOv9-S with 80 classes)
+
+```
+[property]
+...
+onnx-file=yolov9-s-converted.pt.onnx
+...
+num-detected-classes=80
+...
+parse-bbox-func-name=NvDsInferParseYolo
+...
+```
+
+**NOTE**: The **YOLOv9** resizes the input with center padding. To get better accuracy, use
+
+```
+[property]
+...
+maintain-aspect-ratio=1
+symmetric-padding=1
+...
+```
+
+##
+
+### Edit the deepstream_app_config file
+
+```
+...
+[primary-gie]
+...
+config-file=config_infer_primary_yoloV9.txt
+```
+
+##
+
+### Testing the model
+
+```
+deepstream-app -c deepstream_app_config.txt
+```
+
+**NOTE**: The TensorRT engine file may take a very long time to generate (sometimes more than 10 minutes).
+
+**NOTE**: For more information about custom models configuration (`batch-size`, `network-mode`, etc), please check the [`docs/customModels.md`](customModels.md) file.
diff --git a/docs/customModels.md b/docs/customModels.md
index 5f21d80..b3ff858 100644
--- a/docs/customModels.md
+++ b/docs/customModels.md
@@ -34,6 +34,7 @@ export CUDA_VER=XY.Z
* x86 platform
```
+ DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 = 12.1
DeepStream 6.2 = 11.8
@@ -46,6 +47,7 @@ export CUDA_VER=XY.Z
* Jetson platform
```
+ DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
diff --git a/docs/dGPUInstalation.md b/docs/dGPUInstalation.md
index 1e0f557..807a050 100644
--- a/docs/dGPUInstalation.md
+++ b/docs/dGPUInstalation.md
@@ -29,6 +29,157 @@ sudo apt-get install linux-headers-$(uname -r)
sudo reboot
```
+DeepStream 7.1
+
+### 1. Dependencies
+
+```
+sudo apt-get install dkms
+sudo apt-get install libssl3 libssl-dev libgles2-mesa-dev libgstreamer1.0-0 gstreamer1.0-tools gstreamer1.0-plugins-good gstreamer1.0-plugins-bad gstreamer1.0-plugins-ugly gstreamer1.0-libav libgstreamer-plugins-base1.0-dev libgstrtspserver-1.0-0 libjansson4 libyaml-cpp-dev libjsoncpp-dev protobuf-compiler
+```
+
+### 2. CUDA Keyring
+
+```
+wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.0-1_all.deb
+sudo dpkg -i cuda-keyring_1.0-1_all.deb
+sudo apt-get update
+```
+
+### 3. GCC 12
+
+```
+sudo apt-get install gcc-12 g++-12
+sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-12 12
+sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-12 12
+sudo update-initramfs -u
+```
+
+### 4. NVIDIA Driver
+
+TITAN, GeForce RTX / GTX series and RTX / Quadro series
+
+- Download
+
+ ```
+ wget https://us.download.nvidia.com/XFree86/Linux-x86_64/560.35.03/NVIDIA-Linux-x86_64-560.35.03.run
+ ```
+
+Laptop
+
+* Run
+
+ ```
+ sudo sh NVIDIA-Linux-x86_64-560.35.03.run --no-cc-version-check --silent --disable-nouveau --dkms --install-libglvnd
+ ```
+
+ **NOTE**: This step will disable the nouveau drivers.
+
+* Reboot
+
+ ```
+ sudo reboot
+ ```
+
+* Install
+
+ ```
+ sudo sh NVIDIA-Linux-x86_64-560.35.03.run --no-cc-version-check --silent --disable-nouveau --dkms --install-libglvnd
+ ```
+
+**NOTE**: If you are using a laptop with NVIDIA Optimus, run
+
+```
+sudo apt-get install nvidia-prime
+sudo prime-select nvidia
+```
+
+Desktop
+
+* Run
+
+ ```
+ sudo sh NVIDIA-Linux-x86_64-560.35.03.run --no-cc-version-check --silent --disable-nouveau --dkms --install-libglvnd --run-nvidia-xconfig
+ ```
+
+ **NOTE**: This step will disable the nouveau drivers.
+
+* Reboot
+
+ ```
+ sudo reboot
+ ```
+
+* Install
+
+ ```
+ sudo sh NVIDIA-Linux-x86_64-560.35.03.run --no-cc-version-check --silent --disable-nouveau --dkms --install-libglvnd --run-nvidia-xconfig
+ ```
+
+Data center / Tesla series
+
+ - Download
+
+ ```
+ wget https://us.download.nvidia.com/tesla/535.183.06/NVIDIA-Linux-x86_64-535.183.06.run
+ ```
+
+ * Run
+
+ ```
+ sudo sh NVIDIA-Linux-x86_64-535.183.06.run --no-cc-version-check --silent --disable-nouveau --dkms --install-libglvnd --run-nvidia-xconfig
+ ```
+
+### 5. CUDA
+
+```
+wget https://developer.download.nvidia.com/compute/cuda/12.6.2/local_installers/cuda_12.6.2_560.35.03_linux.run
+sudo sh cuda_12.6.2_560.35.03_linux.run --silent --toolkit
+```
+
+* Export environment variables
+
+ ```
+ echo $'export PATH=/usr/local/cuda-12.6/bin${PATH:+:${PATH}}\nexport LD_LIBRARY_PATH=/usr/local/cuda-12.6/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}' >> ~/.bashrc && source ~/.bashrc
+ ```
+
+### 6. TensorRT
+
+```
+sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/3bf863cc.pub
+sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/ /"
+sudo apt-get update
+sudo apt-get install libnvinfer-dev=10.3.0.26-1+cuda12.5 libnvinfer-dispatch-dev=10.3.0.26-1+cuda12.5 libnvinfer-dispatch10=10.3.0.26-1+cuda12.5 libnvinfer-headers-dev=10.3.0.26-1+cuda12.5 libnvinfer-headers-plugin-dev=10.3.0.26-1+cuda12.5 libnvinfer-lean-dev=10.3.0.26-1+cuda12.5 libnvinfer-lean10=10.3.0.26-1+cuda12.5 libnvinfer-plugin-dev=10.3.0.26-1+cuda12.5 libnvinfer-plugin10=10.3.0.26-1+cuda12.5 libnvinfer-vc-plugin-dev=10.3.0.26-1+cuda12.5 libnvinfer-vc-plugin10=10.3.0.26-1+cuda12.5 libnvinfer10=10.3.0.26-1+cuda12.5 libnvonnxparsers-dev=10.3.0.26-1+cuda12.5 libnvonnxparsers10=10.3.0.26-1+cuda12.5 tensorrt-dev=10.3.0.26-1+cuda12.5 libnvinfer-samples=10.3.0.26-1+cuda12.5 libnvinfer-bin=10.3.0.26-1+cuda12.5 libcudnn9-cuda-12=9.3.0.75-1 libcudnn9-dev-cuda-12=9.3.0.75-1
+sudo apt-mark hold libnvinfer* libnvparsers* libnvonnxparsers* libcudnn9* python3-libnvinfer* uff-converter-tf* onnx-graphsurgeon* graphsurgeon-tf* tensorrt*
+```
+
+### 7. DeepStream SDK
+
+DeepStream 7.1 for Servers and Workstations
+
+```
+wget --content-disposition 'https://api.ngc.nvidia.com/v2/resources/org/nvidia/deepstream/7.1/files?redirect=true&path=deepstream-7.1_7.1.0-1_amd64.deb' -O deepstream-7.1_7.1.0-1_amd64.deb
+sudo apt-get install ./deepstream-7.1_7.1.0-1_amd64.deb
+rm ${HOME}/.cache/gstreamer-1.0/registry.x86_64.bin
+sudo ln -snf /usr/local/cuda-12.6 /usr/local/cuda
+```
+
+### 8. Reboot
+
+```
+sudo reboot
+```
+
DeepStream 7.0
### 1. Dependencies
diff --git a/docs/multipleGIEs.md b/docs/multipleGIEs.md
index 13faccd..2c5631d 100644
--- a/docs/multipleGIEs.md
+++ b/docs/multipleGIEs.md
@@ -59,6 +59,7 @@ export CUDA_VER=XY.Z
* x86 platform
```
+ DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 = 12.1
DeepStream 6.2 = 11.8
@@ -71,6 +72,7 @@ export CUDA_VER=XY.Z
* Jetson platform
```
+ DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
diff --git a/nvdsinfer_custom_impl_Yolo/Makefile b/nvdsinfer_custom_impl_Yolo/Makefile
index c800460..5b5f08d 100644
--- a/nvdsinfer_custom_impl_Yolo/Makefile
+++ b/nvdsinfer_custom_impl_Yolo/Makefile
@@ -1,5 +1,5 @@
################################################################################
-# Copyright (c) 2018-2023, NVIDIA CORPORATION. All rights reserved.
+# Copyright (c) 2018-2024, NVIDIA CORPORATION. All rights reserved.
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
@@ -56,10 +56,15 @@ endif
CUFLAGS:= -I/opt/nvidia/deepstream/deepstream/sources/includes -I/usr/local/cuda-$(CUDA_VER)/include
-LIBS+= -lnvinfer_plugin -lnvinfer -lnvparsers -lnvonnxparser -L/usr/local/cuda-$(CUDA_VER)/lib64 -lcudart -lcublas -lstdc++fs
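+# Link -lnvparsers only when it is available on the system (newer TensorRT releases no longer ship it)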
+ifeq ($(shell ldconfig -p | grep -q libnvparsers && echo 1 || echo 0), 1)
+ LIBS+= -lnvparsers
+endif
+
+LIBS+= -lnvinfer_plugin -lnvinfer -lnvonnxparser -L/usr/local/cuda-$(CUDA_VER)/lib64 -lcudart -lcublas -lstdc++fs
LFLAGS:= -shared -Wl,--start-group $(LIBS) -Wl,--end-group
-INCS:= $(wildcard *.h)
+INCS:= $(wildcard layers/*.h)
+INCS+= $(wildcard *.h)
SRCFILES:= $(filter-out calibrator.cpp, $(wildcard *.cpp))
diff --git a/nvdsinfer_custom_impl_Yolo/calibrator.cpp b/nvdsinfer_custom_impl_Yolo/calibrator.cpp
index 2eba320..d8ef4d3 100644
--- a/nvdsinfer_custom_impl_Yolo/calibrator.cpp
+++ b/nvdsinfer_custom_impl_Yolo/calibrator.cpp
@@ -8,9 +8,10 @@
#include
#include
-Int8EntropyCalibrator2::Int8EntropyCalibrator2(const int& batchSize, const int& channels, const int& height, const int& width,
- const float& scaleFactor, const float* offsets, const std::string& imgPath, const std::string& calibTablePath) :
- batchSize(batchSize), inputC(channels), inputH(height), inputW(width), scaleFactor(scaleFactor), offsets(offsets),
+Int8EntropyCalibrator2::Int8EntropyCalibrator2(const int& batchSize, const int& channels, const int& height,
+ const int& width, const float& scaleFactor, const float* offsets, const int& inputFormat,
+ const std::string& imgPath, const std::string& calibTablePath) : batchSize(batchSize), inputC(channels),
+ inputH(height), inputW(width), scaleFactor(scaleFactor), offsets(offsets), inputFormat(inputFormat),
calibTablePath(calibTablePath), imageIndex(0)
{
inputCount = batchSize * channels * height * width;
@@ -54,7 +55,7 @@ Int8EntropyCalibrator2::getBatch(void** bindings, const char** names, int nbBind
return false;
}
-  std::vector<float> inputData = prepareImage(img, inputC, inputH, inputW, scaleFactor, offsets);
+  std::vector<float> inputData = prepareImage(img, inputC, inputH, inputW, scaleFactor, offsets, inputFormat);
size_t len = inputData.size();
memcpy(ptr, inputData.data(), len * sizeof(float));
@@ -93,32 +94,46 @@ Int8EntropyCalibrator2::writeCalibrationCache(const void* cache, std::size_t len
}
std::vector<float>
-prepareImage(cv::Mat& img, int input_c, int input_h, int input_w, float scaleFactor, const float* offsets)
+prepareImage(cv::Mat& img, int inputC, int inputH, int inputW, float scaleFactor, const float* offsets, int inputFormat)
{
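+  // inputFormat: 0 = convert to RGB, 2 = convert to GRAY, anything else keeps OpenCV's native BGR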
cv::Mat out;
- cv::cvtColor(img, out, cv::COLOR_BGR2RGB);
+ if (inputFormat == 0) {
+ cv::cvtColor(img, out, cv::COLOR_BGR2RGB);
+ }
+ else if (inputFormat == 2) {
+ cv::cvtColor(img, out, cv::COLOR_BGR2GRAY);
+ }
+ else {
+ out = img;
+ }
- int image_w = img.cols;
- int image_h = img.rows;
+ int imageW = img.cols;
+ int imageH = img.rows;
- if (image_w != input_w || image_h != input_h) {
- float resizeFactor = std::max(input_w / (float) image_w, input_h / (float) img.rows);
+ if (imageW != inputW || imageH != inputH) {
+ float resizeFactor = std::max(inputW / (float) imageW, inputH / (float) imageH);
cv::resize(out, out, cv::Size(0, 0), resizeFactor, resizeFactor, cv::INTER_CUBIC);
- cv::Rect crop(cv::Point(0.5 * (out.cols - input_w), 0.5 * (out.rows - input_h)), cv::Size(input_w, input_h));
+ cv::Rect crop(cv::Point(0.5 * (out.cols - inputW), 0.5 * (out.rows - inputH)), cv::Size(inputW, inputH));
out = out(crop);
}
out.convertTo(out, CV_32F, scaleFactor);
- cv::subtract(out, cv::Scalar(offsets[2] / 255, offsets[1] / 255, offsets[0] / 255), out, cv::noArray(), -1);
-  std::vector<cv::Mat> input_channels(input_c);
-  cv::split(out, input_channels);
-  std::vector<float> result(input_h * input_w * input_c);
+ if (inputFormat == 2) {
+ cv::subtract(out, cv::Scalar(offsets[0] / 255), out);
+ }
+ else {
+    cv::subtract(out, cv::Scalar(offsets[0] / 255, offsets[1] / 255, offsets[2] / 255), out);
+ }
+
+  std::vector<cv::Mat> inputChannels(inputC);
+  cv::split(out, inputChannels);
+  std::vector<float> result(inputH * inputW * inputC);
auto data = result.data();
- int channelLength = input_h * input_w;
- for (int i = 0; i < input_c; ++i) {
- memcpy(data, input_channels[i].data, channelLength * sizeof(float));
+ int channelLength = inputH * inputW;
+ for (int i = 0; i < inputC; ++i) {
+ memcpy(data, inputChannels[i].data, channelLength * sizeof(float));
data += channelLength;
}
diff --git a/nvdsinfer_custom_impl_Yolo/calibrator.h b/nvdsinfer_custom_impl_Yolo/calibrator.h
index 1a92100..bd7685b 100644
--- a/nvdsinfer_custom_impl_Yolo/calibrator.h
+++ b/nvdsinfer_custom_impl_Yolo/calibrator.h
@@ -12,18 +12,19 @@
#include "NvInfer.h"
#include "opencv2/opencv.hpp"
-#define CUDA_CHECK(status) { \
- if (status != 0) { \
- std::cout << "CUDA failure: " << cudaGetErrorString(status) << " in file " << __FILE__ << " at line " << __LINE__ << \
- std::endl; \
- abort(); \
- } \
+#define CUDA_CHECK(status) { \
+ if (status != 0) { \
+ std::cout << "CUDA failure: " << cudaGetErrorString(status) << " in file " << __FILE__ << " at line " << \
+ __LINE__ << std::endl; \
+ abort(); \
+ } \
}
class Int8EntropyCalibrator2 : public nvinfer1::IInt8EntropyCalibrator2 {
public:
Int8EntropyCalibrator2(const int& batchSize, const int& channels, const int& height, const int& width,
- const float& scaleFactor, const float* offsets, const std::string& imgPath, const std::string& calibTablePath);
+ const float& scaleFactor, const float* offsets, const int& inputFormat, const std::string& imgPath,
+ const std::string& calibTablePath);
virtual ~Int8EntropyCalibrator2();
@@ -43,6 +44,7 @@ class Int8EntropyCalibrator2 : public nvinfer1::IInt8EntropyCalibrator2 {
int letterBox;
float scaleFactor;
const float* offsets;
+ int inputFormat;
std::string calibTablePath;
size_t imageIndex;
size_t inputCount;
@@ -53,7 +55,7 @@ class Int8EntropyCalibrator2 : public nvinfer1::IInt8EntropyCalibrator2 {
std::vector<char> calibrationCache;
};
-std::vector<float> prepareImage(cv::Mat& img, int input_c, int input_h, int input_w, float scaleFactor,
-  const float* offsets);
+std::vector<float> prepareImage(cv::Mat& img, int inputC, int inputH, int inputW, float scaleFactor,
+  const float* offsets, int inputFormat);
#endif //CALIBRATOR_H
diff --git a/nvdsinfer_custom_impl_Yolo/layers/activation_layer.cpp b/nvdsinfer_custom_impl_Yolo/layers/activation_layer.cpp
index ce3d9a1..e795e2d 100644
--- a/nvdsinfer_custom_impl_Yolo/layers/activation_layer.cpp
+++ b/nvdsinfer_custom_impl_Yolo/layers/activation_layer.cpp
@@ -14,8 +14,9 @@ activationLayer(int layerIdx, std::string activation, nvinfer1::ITensor* input,
{
nvinfer1::ITensor* output;
- if (activation == "linear")
+ if (activation == "linear") {
output = input;
+ }
else if (activation == "relu") {
nvinfer1::IActivationLayer* relu = network->addActivation(*input, nvinfer1::ActivationType::kRELU);
assert(relu != nullptr);
diff --git a/nvdsinfer_custom_impl_Yolo/layers/batchnorm_layer.cpp b/nvdsinfer_custom_impl_Yolo/layers/batchnorm_layer.cpp
index 0b1fce2..89bdc92 100644
--- a/nvdsinfer_custom_impl_Yolo/layers/batchnorm_layer.cpp
+++ b/nvdsinfer_custom_impl_Yolo/layers/batchnorm_layer.cpp
@@ -21,6 +21,11 @@ batchnormLayer(int layerIdx, std::map& block, std::vec
int filters = std::stoi(block.at("filters"));
std::string activation = block.at("activation");
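+  // Default batch-norm epsilon; overridden when the cfg block provides an "eps" key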
+ float eps = 1.0e-5;
+ if (block.find("eps") != block.end()) {
+ eps = std::stof(block.at("eps"));
+ }
+
std::vector<float> bnBiases;
std::vector<float> bnWeights;
std::vector<float> bnRunningMean;
@@ -39,7 +44,7 @@ batchnormLayer(int layerIdx, std::map& block, std::vec
++weightPtr;
}
for (int i = 0; i < filters; ++i) {
- bnRunningVar.push_back(sqrt(weights[weightPtr] + 1.0e-5));
+ bnRunningVar.push_back(sqrt(weights[weightPtr] + eps));
++weightPtr;
}
@@ -47,18 +52,25 @@ batchnormLayer(int layerIdx, std::map& block, std::vec
nvinfer1::Weights shift {nvinfer1::DataType::kFLOAT, nullptr, size};
nvinfer1::Weights scale {nvinfer1::DataType::kFLOAT, nullptr, size};
nvinfer1::Weights power {nvinfer1::DataType::kFLOAT, nullptr, size};
+
float* shiftWt = new float[size];
- for (int i = 0; i < size; ++i)
+ for (int i = 0; i < size; ++i) {
shiftWt[i] = bnBiases.at(i) - ((bnRunningMean.at(i) * bnWeights.at(i)) / bnRunningVar.at(i));
+ }
shift.values = shiftWt;
+
float* scaleWt = new float[size];
- for (int i = 0; i < size; ++i)
+ for (int i = 0; i < size; ++i) {
scaleWt[i] = bnWeights.at(i) / bnRunningVar[i];
+ }
scale.values = scaleWt;
+
float* powerWt = new float[size];
- for (int i = 0; i < size; ++i)
+ for (int i = 0; i < size; ++i) {
powerWt[i] = 1.0;
+ }
power.values = powerWt;
+
trtWeights.push_back(shift);
trtWeights.push_back(scale);
trtWeights.push_back(power);
diff --git a/nvdsinfer_custom_impl_Yolo/layers/convolutional_layer.cpp b/nvdsinfer_custom_impl_Yolo/layers/convolutional_layer.cpp
index 65fc65a..83bac7a 100644
--- a/nvdsinfer_custom_impl_Yolo/layers/convolutional_layer.cpp
+++ b/nvdsinfer_custom_impl_Yolo/layers/convolutional_layer.cpp
@@ -15,7 +15,7 @@ convolutionalLayer(int layerIdx, std::map& block, std:
{
nvinfer1::ITensor* output;
- assert(block.at("type") == "convolutional" || block.at("type") == "c2f");
+ assert(block.at("type") == "conv" || block.at("type") == "convolutional");
assert(block.find("filters") != block.end());
assert(block.find("pad") != block.end());
assert(block.find("size") != block.end());
@@ -28,27 +28,35 @@ convolutionalLayer(int layerIdx, std::map& block, std:
std::string activation = block.at("activation");
int bias = filters;
- bool batchNormalize = false;
+ int batchNormalize = 0;
+ float eps = 1.0e-5;
if (block.find("batch_normalize") != block.end()) {
bias = 0;
batchNormalize = (block.at("batch_normalize") == "1");
+ if (block.find("eps") != block.end()) {
+ eps = std::stof(block.at("eps"));
+ }
}
if (block.find("bias") != block.end()) {
bias = std::stoi(block.at("bias"));
- if (bias == 1)
+ if (bias == 1) {
bias = filters;
+ }
}
int groups = 1;
- if (block.find("groups") != block.end())
+ if (block.find("groups") != block.end()) {
groups = std::stoi(block.at("groups"));
+ }
int pad;
- if (padding)
+ if (padding) {
pad = (kernelSize - 1) / 2;
- else
+ }
+ else {
pad = 0;
+ }
int size = filters * inputChannels * kernelSize * kernelSize / groups;
std::vector<float> bnBiases;
@@ -58,7 +66,7 @@ convolutionalLayer(int layerIdx, std::map& block, std:
nvinfer1::Weights convWt {nvinfer1::DataType::kFLOAT, nullptr, size};
nvinfer1::Weights convBias {nvinfer1::DataType::kFLOAT, nullptr, bias};
- if (batchNormalize == false) {
+ if (batchNormalize == 0) {
float* val;
if (bias != 0) {
val = new float[filters];
@@ -91,7 +99,7 @@ convolutionalLayer(int layerIdx, std::map& block, std:
++weightPtr;
}
for (int i = 0; i < filters; ++i) {
- bnRunningVar.push_back(sqrt(weights[weightPtr] + 1.0e-5));
+ bnRunningVar.push_back(sqrt(weights[weightPtr] + eps));
++weightPtr;
}
float* val;
@@ -110,40 +118,49 @@ convolutionalLayer(int layerIdx, std::map& block, std:
}
convWt.values = val;
trtWeights.push_back(convWt);
- if (bias != 0)
+ if (bias != 0) {
trtWeights.push_back(convBias);
+ }
}
- nvinfer1::IConvolutionLayer* conv = network->addConvolutionNd(*input, filters, nvinfer1::Dims{2, {kernelSize, kernelSize}},
- convWt, convBias);
+ nvinfer1::IConvolutionLayer* conv = network->addConvolutionNd(*input, filters,
+ nvinfer1::Dims{2, {kernelSize, kernelSize}}, convWt, convBias);
assert(conv != nullptr);
std::string convLayerName = "conv_" + layerName + std::to_string(layerIdx);
conv->setName(convLayerName.c_str());
conv->setStrideNd(nvinfer1::Dims{2, {stride, stride}});
conv->setPaddingNd(nvinfer1::Dims{2, {pad, pad}});
- if (block.find("groups") != block.end())
+ if (block.find("groups") != block.end()) {
conv->setNbGroups(groups);
+ }
output = conv->getOutput(0);
- if (batchNormalize == true) {
+ if (batchNormalize == 1) {
size = filters;
nvinfer1::Weights shift {nvinfer1::DataType::kFLOAT, nullptr, size};
nvinfer1::Weights scale {nvinfer1::DataType::kFLOAT, nullptr, size};
nvinfer1::Weights power {nvinfer1::DataType::kFLOAT, nullptr, size};
+
float* shiftWt = new float[size];
- for (int i = 0; i < size; ++i)
+ for (int i = 0; i < size; ++i) {
shiftWt[i] = bnBiases.at(i) - ((bnRunningMean.at(i) * bnWeights.at(i)) / bnRunningVar.at(i));
+ }
shift.values = shiftWt;
+
float* scaleWt = new float[size];
- for (int i = 0; i < size; ++i)
+ for (int i = 0; i < size; ++i) {
scaleWt[i] = bnWeights.at(i) / bnRunningVar[i];
+ }
scale.values = scaleWt;
+
float* powerWt = new float[size];
- for (int i = 0; i < size; ++i)
+ for (int i = 0; i < size; ++i) {
powerWt[i] = 1.0;
+ }
power.values = powerWt;
+
trtWeights.push_back(shift);
trtWeights.push_back(scale);
trtWeights.push_back(power);
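
With these changes the convolutional parser accepts both "conv" and "convolutional" as the block type and reads an optional eps key inside batch-normalized blocks, falling back to 1.0e-5 when the key is absent. A hypothetical Darknet-style cfg block listing the keys the parser reads (the values here are illustrative only):

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
groups=1
eps=0.001
activation=relu
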
diff --git a/nvdsinfer_custom_impl_Yolo/layers/convolutional_layer.h b/nvdsinfer_custom_impl_Yolo/layers/convolutional_layer.h
index 7329eb9..08ec079 100644
--- a/nvdsinfer_custom_impl_Yolo/layers/convolutional_layer.h
+++ b/nvdsinfer_custom_impl_Yolo/layers/convolutional_layer.h
@@ -13,8 +13,8 @@
#include "activation_layer.h"
-nvinfer1::ITensor* convolutionalLayer(int layerIdx, std::map<std::string, std::string>& block, std::vector<float>& weights,
- std::vector<nvinfer1::Weights>& trtWeights, int& weightPtr, int& inputChannels, nvinfer1::ITensor* input,
- nvinfer1::INetworkDefinition* network, std::string layerName = "");
+nvinfer1::ITensor* convolutionalLayer(int layerIdx, std::map<std::string, std::string>& block,
+ std::vector<float>& weights, std::vector<nvinfer1::Weights>& trtWeights, int& weightPtr, int& inputChannels,
+ nvinfer1::ITensor* input, nvinfer1::INetworkDefinition* network, std::string layerName = "");
#endif
diff --git a/nvdsinfer_custom_impl_Yolo/layers/deconvolutional_layer.cpp b/nvdsinfer_custom_impl_Yolo/layers/deconvolutional_layer.cpp
index 5c6db36..c645318 100644
--- a/nvdsinfer_custom_impl_Yolo/layers/deconvolutional_layer.cpp
+++ b/nvdsinfer_custom_impl_Yolo/layers/deconvolutional_layer.cpp
@@ -6,6 +6,7 @@
#include "deconvolutional_layer.h"
#include
+#include <cmath>
nvinfer1::ITensor*
deconvolutionalLayer(int layerIdx, std::map<std::string, std::string>& block, std::vector<float>& weights,
@@ -14,7 +15,7 @@ deconvolutionalLayer(int layerIdx, std::map& block, st
{
nvinfer1::ITensor* output;
- assert(block.at("type") == "deconvolutional");
+ assert(block.at("type") == "deconv" || block.at("type") == "deconvolutional");
assert(block.find("filters") != block.end());
assert(block.find("pad") != block.end());
assert(block.find("size") != block.end());
@@ -24,20 +25,38 @@ deconvolutionalLayer(int layerIdx, std::map& block, st
int padding = std::stoi(block.at("pad"));
int kernelSize = std::stoi(block.at("size"));
int stride = std::stoi(block.at("stride"));
+ std::string activation = block.at("activation");
int bias = filters;
- int groups = 1;
- if (block.find("groups") != block.end())
- groups = std::stoi(block.at("groups"));
+ int batchNormalize = 0;
+ float eps = 1.0e-5;
+ if (block.find("batch_normalize") != block.end()) {
+ bias = 0;
+ batchNormalize = (block.at("batch_normalize") == "1");
+ if (block.find("eps") != block.end()) {
+ eps = std::stof(block.at("eps"));
+ }
+ }
- if (block.find("bias") != block.end())
+ if (block.find("bias") != block.end()) {
bias = std::stoi(block.at("bias"));
+ if (bias == 1) {
+ bias = filters;
+ }
+ }
+
+ int groups = 1;
+ if (block.find("groups") != block.end()) {
+ groups = std::stoi(block.at("groups"));
+ }
int pad;
- if (padding)
+ if (padding) {
pad = (kernelSize - 1) / 2;
- else
+ }
+ else {
pad = 0;
+ }
int size = filters * inputChannels * kernelSize * kernelSize / groups;
std::vector<float> bnBiases;
@@ -47,23 +66,62 @@ deconvolutionalLayer(int layerIdx, std::map& block, st
nvinfer1::Weights convWt {nvinfer1::DataType::kFLOAT, nullptr, size};
nvinfer1::Weights convBias {nvinfer1::DataType::kFLOAT, nullptr, bias};
- float* val;
- if (bias != 0) {
- val = new float[filters];
- for (int i = 0; i < filters; ++i) {
+ if (batchNormalize == 0) {
+ float* val;
+ if (bias != 0) {
+ val = new float[filters];
+ for (int i = 0; i < filters; ++i) {
+ val[i] = weights[weightPtr];
+ ++weightPtr;
+ }
+ convBias.values = val;
+ trtWeights.push_back(convBias);
+ }
+ val = new float[size];
+ for (int i = 0; i < size; ++i) {
val[i] = weights[weightPtr];
++weightPtr;
}
- convBias.values = val;
- trtWeights.push_back(convBias);
+ convWt.values = val;
+ trtWeights.push_back(convWt);
}
- val = new float[size];
- for (int i = 0; i < size; ++i) {
+ else {
+ for (int i = 0; i < filters; ++i) {
+ bnBiases.push_back(weights[weightPtr]);
+ ++weightPtr;
+ }
+ for (int i = 0; i < filters; ++i) {
+ bnWeights.push_back(weights[weightPtr]);
+ ++weightPtr;
+ }
+ for (int i = 0; i < filters; ++i) {
+ bnRunningMean.push_back(weights[weightPtr]);
+ ++weightPtr;
+ }
+ for (int i = 0; i < filters; ++i) {
+ bnRunningVar.push_back(sqrt(weights[weightPtr] + eps));
+ ++weightPtr;
+ }
+ float* val;
+ if (bias != 0) {
+ val = new float[filters];
+ for (int i = 0; i < filters; ++i) {
+ val[i] = weights[weightPtr];
+ ++weightPtr;
+ }
+ convBias.values = val;
+ }
+ val = new float[size];
+ for (int i = 0; i < size; ++i) {
val[i] = weights[weightPtr];
++weightPtr;
+ }
+ convWt.values = val;
+ trtWeights.push_back(convWt);
+ if (bias != 0) {
+ trtWeights.push_back(convBias);
+ }
}
- convWt.values = val;
- trtWeights.push_back(convWt);
nvinfer1::IDeconvolutionLayer* conv = network->addDeconvolutionNd(*input, filters,
nvinfer1::Dims{2, {kernelSize, kernelSize}}, convWt, convBias);
@@ -73,10 +131,49 @@ deconvolutionalLayer(int layerIdx, std::map& block, st
conv->setStrideNd(nvinfer1::Dims{2, {stride, stride}});
conv->setPaddingNd(nvinfer1::Dims{2, {pad, pad}});
- if (block.find("groups") != block.end())
+ if (block.find("groups") != block.end()) {
conv->setNbGroups(groups);
+ }
output = conv->getOutput(0);
+ if (batchNormalize == 1) {
+ size = filters;
+ nvinfer1::Weights shift {nvinfer1::DataType::kFLOAT, nullptr, size};
+ nvinfer1::Weights scale {nvinfer1::DataType::kFLOAT, nullptr, size};
+ nvinfer1::Weights power {nvinfer1::DataType::kFLOAT, nullptr, size};
+
+ float* shiftWt = new float[size];
+ for (int i = 0; i < size; ++i) {
+ shiftWt[i] = bnBiases.at(i) - ((bnRunningMean.at(i) * bnWeights.at(i)) / bnRunningVar.at(i));
+ }
+ shift.values = shiftWt;
+
+ float* scaleWt = new float[size];
+ for (int i = 0; i < size; ++i) {
+ scaleWt[i] = bnWeights.at(i) / bnRunningVar[i];
+ }
+ scale.values = scaleWt;
+
+ float* powerWt = new float[size];
+ for (int i = 0; i < size; ++i) {
+ powerWt[i] = 1.0;
+ }
+ power.values = powerWt;
+
+ trtWeights.push_back(shift);
+ trtWeights.push_back(scale);
+ trtWeights.push_back(power);
+
+ nvinfer1::IScaleLayer* batchnorm = network->addScale(*output, nvinfer1::ScaleMode::kCHANNEL, shift, scale, power);
+ assert(batchnorm != nullptr);
+ std::string batchnormLayerName = "batchnorm_" + layerName + std::to_string(layerIdx);
+ batchnorm->setName(batchnormLayerName.c_str());
+ output = batchnorm->getOutput(0);
+ }
+
+ output = activationLayer(layerIdx, activation, output, network, layerName);
+ assert(output != nullptr);
+
return output;
}
diff --git a/nvdsinfer_custom_impl_Yolo/layers/deconvolutional_layer.h b/nvdsinfer_custom_impl_Yolo/layers/deconvolutional_layer.h
index 7e79b6c..607d40e 100644
--- a/nvdsinfer_custom_impl_Yolo/layers/deconvolutional_layer.h
+++ b/nvdsinfer_custom_impl_Yolo/layers/deconvolutional_layer.h
@@ -8,12 +8,13 @@
#include