Compare commits: ea9f4cb7e8...master (10 commits)
| SHA1 |
|---|
| c11c3e4a78 |
| e5d994e2d7 |
| db3a21133e |
| 0cc84af9b9 |
| f5c486cb7f |
| 1b96a01ec3 |
| b451b036b2 |
| bca9e59d07 |
| 79273786b8 |
| 4aadab16d1 |
README.md (61 changed lines)
@@ -1,6 +1,6 @@
 # DeepStream-Yolo
 
-NVIDIA DeepStream SDK 7.0 / 6.4 / 6.3 / 6.2 / 6.1.1 / 6.1 / 6.0.1 / 6.0 / 5.1 configuration for YOLO models
+NVIDIA DeepStream SDK 7.1 / 7.0 / 6.4 / 6.3 / 6.2 / 6.1.1 / 6.1 / 6.0.1 / 6.0 / 5.1 configuration for YOLO models
 
 --------------------------------------------------------------------------------------------------
 
 ### For now, I am limited for some updates. Thank you for understanding.
@@ -12,19 +12,13 @@ NVIDIA DeepStream SDK 7.0 / 6.4 / 6.3 / 6.2 / 6.1.1 / 6.1 / 6.0.1 / 6.0 / 5.1 c
 ### Important: please export the ONNX model with the new export file, generate the TensorRT engine again with the updated files, and use the new config_infer_primary file according to your model
 
 --------------------------------------------------------------------------------------------------
 
-### Future updates
-
-* DeepStream tutorials
-* Updated INT8 calibration
-* Support for classification models
-
 ### Improvements on this repository
 
 * Support for INT8 calibration
 * Support for non square models
 * Models benchmarks
 * Support for Darknet models (YOLOv4, etc) using cfg and weights conversion with GPU post-processing
-* Support for RT-DETR, YOLO-NAS, PPYOLOE+, PPYOLOE, DAMO-YOLO, YOLOX, YOLOR, YOLOv8, YOLOv7, YOLOv6 and YOLOv5 using ONNX conversion with GPU post-processing
+* Support for RT-DETR, CO-DETR (MMDetection), YOLO-NAS, PPYOLOE+, PPYOLOE, DAMO-YOLO, Gold-YOLO, RTMDet (MMYOLO), YOLOX, YOLOR, YOLOv9, YOLOv8, YOLOv7, YOLOv6 and YOLOv5 using ONNX conversion with GPU post-processing
 * GPU bbox parser
 * Custom ONNX model parser
 * Dynamic batch-size
@@ -47,11 +41,15 @@ NVIDIA DeepStream SDK 7.0 / 6.4 / 6.3 / 6.2 / 6.1.1 / 6.1 / 6.0.1 / 6.0 / 5.1 c
 * [YOLOv6 usage](docs/YOLOv6.md)
 * [YOLOv7 usage](docs/YOLOv7.md)
 * [YOLOv8 usage](docs/YOLOv8.md)
+* [YOLOv9 usage](docs/YOLOv9.md)
 * [YOLOR usage](docs/YOLOR.md)
 * [YOLOX usage](docs/YOLOX.md)
+* [RTMDet (MMYOLO) usage](docs/RTMDet.md)
+* [Gold-YOLO usage](docs/GoldYOLO.md)
 * [DAMO-YOLO usage](docs/DAMOYOLO.md)
 * [PP-YOLOE / PP-YOLOE+ usage](docs/PPYOLOE.md)
 * [YOLO-NAS usage](docs/YOLONAS.md)
+* [CO-DETR (MMDetection) usage](docs/CODETR.md)
 * [RT-DETR PyTorch usage](docs/RTDETR_PyTorch.md)
 * [RT-DETR Paddle usage](docs/RTDETR_Paddle.md)
 * [RT-DETR Ultralytics usage](docs/RTDETR_Ultralytics.md)
@@ -62,6 +60,16 @@ NVIDIA DeepStream SDK 7.0 / 6.4 / 6.3 / 6.2 / 6.1.1 / 6.1 / 6.0.1 / 6.0 / 5.1 c
 
 ### Requirements
 
+#### DeepStream 7.1 on x86 platform
+
+* [Ubuntu 22.04](https://releases.ubuntu.com/22.04/)
+* [CUDA 12.6 Update 2](https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=22.04&target_type=runfile_local)
+* [TensorRT 10.3 GA (10.3.0.26)](https://developer.nvidia.com/nvidia-tensorrt-8x-download)
+* [NVIDIA Driver 535.183.06 (Data center / Tesla series) / 560.35.03 (TITAN, GeForce RTX / GTX series and RTX / Quadro series)](https://www.nvidia.com/Download/index.aspx)
+* [NVIDIA DeepStream SDK 7.1](https://catalog.ngc.nvidia.com/orgs/nvidia/resources/deepstream/files?version=7.1)
+* [GStreamer 1.20.3](https://gstreamer.freedesktop.org/)
+* [DeepStream-Yolo](https://github.com/marcoslucianops/DeepStream-Yolo)
+
 #### DeepStream 7.0 on x86 platform
 
 * [Ubuntu 22.04](https://releases.ubuntu.com/22.04/)
@@ -142,6 +150,12 @@ NVIDIA DeepStream SDK 7.0 / 6.4 / 6.3 / 6.2 / 6.1.1 / 6.1 / 6.0.1 / 6.0 / 5.1 c
 * [GStreamer 1.14.5](https://gstreamer.freedesktop.org/)
 * [DeepStream-Yolo](https://github.com/marcoslucianops/DeepStream-Yolo)
 
+#### DeepStream 7.1 on Jetson platform
+
+* [JetPack 6.1](https://developer.nvidia.com/embedded/jetpack-sdk-61)
+* [NVIDIA DeepStream SDK 7.1](https://catalog.ngc.nvidia.com/orgs/nvidia/resources/deepstream/files?version=7.1)
+* [DeepStream-Yolo](https://github.com/marcoslucianops/DeepStream-Yolo)
+
 #### DeepStream 7.0 on Jetson platform
 
 * [JetPack 6.0](https://developer.nvidia.com/embedded/jetpack-sdk-60)
@@ -201,11 +215,16 @@ NVIDIA DeepStream SDK 7.0 / 6.4 / 6.3 / 6.2 / 6.1.1 / 6.1 / 6.0.1 / 6.0 / 5.1 c
 * [YOLOv6](https://github.com/meituan/YOLOv6)
 * [YOLOv7](https://github.com/WongKinYiu/yolov7)
 * [YOLOv8](https://github.com/ultralytics/ultralytics)
+* [YOLOv9](https://github.com/WongKinYiu/yolov9)
 * [YOLOR](https://github.com/WongKinYiu/yolor)
 * [YOLOX](https://github.com/Megvii-BaseDetection/YOLOX)
+* [RTMDet (MMYOLO)](https://github.com/open-mmlab/mmyolo/tree/main/configs/rtmdet)
+* [Gold-YOLO](https://github.com/huawei-noah/Efficient-Computing/tree/master/Detection/Gold-YOLO)
 * [DAMO-YOLO](https://github.com/tinyvision/DAMO-YOLO)
-* [PP-YOLOE / PP-YOLOE+](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyoloe)
+* [PP-YOLOE / PP-YOLOE+](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.8/configs/ppyoloe)
 * [YOLO-NAS](https://github.com/Deci-AI/super-gradients/blob/master/YOLONAS.md)
+* [CO-DETR (MMDetection)](https://github.com/open-mmlab/mmdetection/tree/main/projects/CO-DETR)
+* [RT-DETR](https://github.com/lyuwenyu/RT-DETR)
 
 ##
@@ -231,6 +250,7 @@ export CUDA_VER=XY.Z
 * x86 platform
 
 ```
+DeepStream 7.1 = 12.6
 DeepStream 7.0 / 6.4 = 12.2
 DeepStream 6.3 = 12.1
 DeepStream 6.2 = 11.8
@@ -243,6 +263,7 @@ export CUDA_VER=XY.Z
 * Jetson platform
 
 ```
+DeepStream 7.1 = 12.6
 DeepStream 7.0 / 6.4 = 12.2
 DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
 DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
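The version-to-CUDA tables above lend themselves to a small lookup helper; a sketch under the assumption that the README's x86 mapping is authoritative (the `cuda_ver_x86` function name is invented for illustration):

```shell
# Hypothetical helper: map a DeepStream version to CUDA_VER (x86 values above)
cuda_ver_x86() {
  case "$1" in
    7.1) echo 12.6 ;;
    7.0|6.4) echo 12.2 ;;
    6.3) echo 12.1 ;;
    6.2) echo 11.8 ;;
    6.1.1) echo 11.7 ;;
    6.1) echo 11.6 ;;
    6.0.1|6.0) echo 11.4 ;;
    5.1) echo 11.1 ;;
    *) echo "unsupported DeepStream version: $1" >&2; return 1 ;;
  esac
}

export CUDA_VER="$(cuda_ver_x86 7.1)"   # CUDA_VER=12.6 for DeepStream 7.1
```

The Jetson mapping from the second table would need its own branch, since several versions share one CUDA release there.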
@@ -297,27 +318,27 @@ config-file=config_infer_primary_yoloV2.txt
 * x86 platform
 
 ```
-nvcr.io/nvidia/deepstream:7.0-gc-triton-devel
-nvcr.io/nvidia/deepstream:7.0-triton-multiarch
+nvcr.io/nvidia/deepstream:7.1-gc-triton-devel
+nvcr.io/nvidia/deepstream:7.1-triton-multiarch
 ```
 
 * Jetson platform
 
 ```
-nvcr.io/nvidia/deepstream:7.0-triton-multiarch
+nvcr.io/nvidia/deepstream:7.1-triton-multiarch
 ```
 
 **NOTE**: To compile the `nvdsinfer_custom_impl_Yolo`, you need to install the g++ inside the container
 
 ```
 apt-get install build-essential
 ```
 
-**NOTE**: With DeepStream 7.0, the docker containers do not package libraries necessary for certain multimedia operations like audio data parsing, CPU decode, and CPU encode. This change could affect processing certain video streams/files like mp4 that include audio track. Please run the below script inside the docker images to install additional packages that might be necessary to use all of the DeepStreamSDK features:
+**NOTE**: With DeepStream 7.1, the docker containers do not package libraries necessary for certain multimedia operations like audio data parsing, CPU decode, and CPU encode. This change could affect processing certain video streams/files like mp4 that include audio track. Please run the below script inside the docker images to install additional packages that might be necessary to use all of the DeepStreamSDK features:
 
 ```
 /opt/nvidia/deepstream/deepstream/user_additional_install.sh
 ```
 
 ##
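The 7.1 tags listed above come from NVIDIA NGC; a typical way to launch one is sketched below. The block only prints the command so it can be reviewed first, and the `--gpus`/`-v` flags are illustrative, not taken from the README:

```shell
# Compose (and print) a typical launch command for the DeepStream 7.1 image.
# The image tag is from the list above; mount paths and flags are examples.
DS_IMAGE="nvcr.io/nvidia/deepstream:7.1-triton-multiarch"
echo docker run --gpus all -it --rm -v "$PWD/DeepStream-Yolo:/workspace" "$DS_IMAGE"
```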
config_infer_primary_codetr.txt (new file, 28 lines)

@@ -0,0 +1,28 @@
+[property]
+gpu-id=0
+net-scale-factor=0.0039215697906911373
+model-color-format=0
+onnx-file=co_dino_5scale_r50_1x_coco-7481f903.onnx
+model-engine-file=model_b1_gpu0_fp32.engine
+#int8-calib-file=calib.table
+labelfile-path=labels.txt
+batch-size=1
+network-mode=0
+num-detected-classes=80
+interval=0
+gie-unique-id=1
+process-mode=1
+network-type=0
+cluster-mode=2
+maintain-aspect-ratio=1
+symmetric-padding=0
+#workspace-size=2000
+parse-bbox-func-name=NvDsInferParseYolo
+#parse-bbox-func-name=NvDsInferParseYoloCuda
+custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
+engine-create-func-name=NvDsInferYoloCudaEngineGet
+
+[class-attrs-all]
+nms-iou-threshold=0.45
+pre-cluster-threshold=0.25
+topk=300
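A note on the new config above: `net-scale-factor=0.0039215697906911373` is just 1/255 at float precision, i.e. 8-bit pixel values rescaled to [0,1] with no mean subtraction (there is no `offsets` key). A quick check:

```shell
# net-scale-factor ~ 1/255: scales 0-255 pixels into [0,1] before inference
awk 'BEGIN { printf "1/255 = %.10f\n", 1/255 }'
```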
@@ -16,8 +16,8 @@ network-type=0
 cluster-mode=2
 maintain-aspect-ratio=0
 #workspace-size=2000
-parse-bbox-func-name=NvDsInferParseYoloE
-#parse-bbox-func-name=NvDsInferParseYoloECuda
+parse-bbox-func-name=NvDsInferParseYolo
+#parse-bbox-func-name=NvDsInferParseYoloCuda
 custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
 engine-create-func-name=NvDsInferYoloCudaEngineGet
config_infer_primary_goldyolo.txt (new file, 28 lines)

@@ -0,0 +1,28 @@
+[property]
+gpu-id=0
+net-scale-factor=0.0039215697906911373
+model-color-format=0
+onnx-file=Gold_s_pre_dist.onnx
+model-engine-file=model_b1_gpu0_fp32.engine
+#int8-calib-file=calib.table
+labelfile-path=labels.txt
+batch-size=1
+network-mode=0
+num-detected-classes=80
+interval=0
+gie-unique-id=1
+process-mode=1
+network-type=0
+cluster-mode=2
+maintain-aspect-ratio=1
+symmetric-padding=1
+#workspace-size=2000
+parse-bbox-func-name=NvDsInferParseYolo
+#parse-bbox-func-name=NvDsInferParseYoloCuda
+custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
+engine-create-func-name=NvDsInferYoloCudaEngineGet
+
+[class-attrs-all]
+nms-iou-threshold=0.45
+pre-cluster-threshold=0.25
+topk=300
@@ -17,8 +17,8 @@ network-type=0
 cluster-mode=2
 maintain-aspect-ratio=0
 #workspace-size=2000
-parse-bbox-func-name=NvDsInferParseYoloE
-#parse-bbox-func-name=NvDsInferParseYoloECuda
+parse-bbox-func-name=NvDsInferParseYolo
+#parse-bbox-func-name=NvDsInferParseYoloCuda
 custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
 engine-create-func-name=NvDsInferYoloCudaEngineGet
@@ -16,8 +16,8 @@ network-type=0
 cluster-mode=2
 maintain-aspect-ratio=0
 #workspace-size=2000
-parse-bbox-func-name=NvDsInferParseYoloE
-#parse-bbox-func-name=NvDsInferParseYoloECuda
+parse-bbox-func-name=NvDsInferParseYolo
+#parse-bbox-func-name=NvDsInferParseYoloCuda
 custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
 engine-create-func-name=NvDsInferYoloCudaEngineGet
@@ -13,7 +13,7 @@ interval=0
 gie-unique-id=1
 process-mode=1
 network-type=0
-cluster-mode=2
+cluster-mode=4
 maintain-aspect-ratio=0
 #workspace-size=2000
 parse-bbox-func-name=NvDsInferParseYolo
@@ -22,6 +22,5 @@ custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
 engine-create-func-name=NvDsInferYoloCudaEngineGet
 
 [class-attrs-all]
-nms-iou-threshold=0.45
 pre-cluster-threshold=0.25
 topk=300
config_infer_primary_rtmdet.txt (new file, 29 lines)

@@ -0,0 +1,29 @@
+[property]
+gpu-id=0
+net-scale-factor=0.0173520735727919486
+offsets=103.53;116.28;123.675
+model-color-format=1
+onnx-file=rtmdet_s_syncbn_fast_8xb32-300e_coco_20221230_182329-0a8c901a.onnx
+model-engine-file=model_b1_gpu0_fp32.engine
+#int8-calib-file=calib.table
+labelfile-path=labels.txt
+batch-size=1
+network-mode=0
+num-detected-classes=80
+interval=0
+gie-unique-id=1
+process-mode=1
+network-type=0
+cluster-mode=2
+maintain-aspect-ratio=1
+symmetric-padding=1
+#workspace-size=2000
+parse-bbox-func-name=NvDsInferParseYolo
+#parse-bbox-func-name=NvDsInferParseYoloCuda
+custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
+engine-create-func-name=NvDsInferYoloCudaEngineGet
+
+[class-attrs-all]
+nms-iou-threshold=0.45
+pre-cluster-threshold=0.25
+topk=300
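In the new RTMDet config above, the preprocessing constants look like caffe-style ImageNet normalization: `offsets` are the per-channel BGR means (hence `model-color-format=1`), and `net-scale-factor` is 1/std with one shared std of 57.63. That 57.63 is the average of the usual per-channel stds 57.375 / 57.12 / 58.395 is an assumption, motivated by gst-nvinfer accepting only a single scale factor. Checking the arithmetic:

```shell
# offsets = BGR channel means; net-scale-factor = 1/std with one shared std
awk 'BEGIN {
  std = (57.375 + 57.12 + 58.395) / 3   # assumed per-channel stds, averaged
  printf "mean std = %.2f, 1/std = %.10f\n", std, 1/std
}'
```

The printed `1/std` matches the config's `net-scale-factor=0.0173520735727919486` to the shown precision.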
config_infer_primary_yoloV9.txt (new file, 28 lines)

@@ -0,0 +1,28 @@
+[property]
+gpu-id=0
+net-scale-factor=0.0039215697906911373
+model-color-format=0
+onnx-file=yolov9-c.onnx
+model-engine-file=model_b1_gpu0_fp32.engine
+#int8-calib-file=calib.table
+labelfile-path=labels.txt
+batch-size=1
+network-mode=0
+num-detected-classes=80
+interval=0
+gie-unique-id=1
+process-mode=1
+network-type=0
+cluster-mode=2
+maintain-aspect-ratio=1
+symmetric-padding=1
+#workspace-size=2000
+parse-bbox-func-name=NvDsInferParseYolo
+#parse-bbox-func-name=NvDsInferParseYoloCuda
+custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
+engine-create-func-name=NvDsInferYoloCudaEngineGet
+
+[class-attrs-all]
+nms-iou-threshold=0.45
+pre-cluster-threshold=0.25
+topk=300
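In the new YOLOv9 config above, `maintain-aspect-ratio=1` together with `symmetric-padding=1` letterboxes the frame: the image is scaled by the smaller of the width and height ratios, and the remainder is padded equally on both sides. A sketch of the arithmetic (the 1920x1080 source size is only an example):

```shell
# Letterbox arithmetic for a 1920x1080 frame into a 640x640 network input
awk 'BEGIN {
  nw = 640; nh = 640; iw = 1920; ih = 1080
  r = (nw / iw < nh / ih) ? nw / iw : nh / ih   # keep the aspect ratio
  sw = int(iw * r + 0.5); sh = int(ih * r + 0.5)
  printf "scale=%.4f scaled=%dx%d pad_top_bottom=%d\n", r, sw, sh, (nh - sh) / 2
}'
# -> scale=0.3333 scaled=640x360 pad_top_bottom=140
```

With `symmetric-padding=0` the same 280 leftover rows would instead all land on one side (bottom), which is why several configs in this compare pair the two keys deliberately.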
@@ -17,8 +17,8 @@ cluster-mode=2
 maintain-aspect-ratio=1
 symmetric-padding=0
 #workspace-size=2000
-parse-bbox-func-name=NvDsInferParseYoloE
-#parse-bbox-func-name=NvDsInferParseYoloECuda
+parse-bbox-func-name=NvDsInferParseYolo
+#parse-bbox-func-name=NvDsInferParseYoloCuda
 custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
 engine-create-func-name=NvDsInferYoloCudaEngineGet
@@ -17,8 +17,8 @@ cluster-mode=2
 maintain-aspect-ratio=1
 symmetric-padding=0
 #workspace-size=2000
-parse-bbox-func-name=NvDsInferParseYoloE
-#parse-bbox-func-name=NvDsInferParseYoloECuda
+parse-bbox-func-name=NvDsInferParseYolo
+#parse-bbox-func-name=NvDsInferParseYoloCuda
 custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
 engine-create-func-name=NvDsInferYoloCudaEngineGet
@@ -1,7 +1,7 @@
 [property]
 gpu-id=0
 net-scale-factor=1
-model-color-format=0
+model-color-format=1
 onnx-file=yolox_s.onnx
 model-engine-file=model_b1_gpu0_fp32.engine
 #int8-calib-file=calib.table
docs/CODETR.md (new file, 187 lines)

@@ -0,0 +1,187 @@
+# CO-DETR (MMDetection) usage
+
+* [Convert model](#convert-model)
+* [Compile the lib](#compile-the-lib)
+* [Edit the config_infer_primary_codetr file](#edit-the-config_infer_primary_codetr-file)
+* [Edit the deepstream_app_config file](#edit-the-deepstream_app_config-file)
+* [Testing the model](#testing-the-model)
+
+##
+
+### Convert model
+
+#### 1. Download the CO-DETR (MMDetection) repo and install the requirements
+
+```
+git clone https://github.com/open-mmlab/mmdetection.git
+cd mmdetection
+pip3 install openmim
+mim install mmengine
+mim install mmdeploy
+mim install "mmcv>=2.0.0rc4,<2.2.0"
+pip3 install -v -e .
+pip3 install onnx onnxslim onnxruntime
+```
+
+**NOTE**: It is recommended to use Python virtualenv.
+
+#### 2. Copy the converter
+
+Copy the `export_codetr.py` file from the `DeepStream-Yolo/utils` directory to the `mmdetection` folder.
+
+#### 3. Download the model
+
+Download the `pth` file from [CO-DETR (MMDetection)](https://github.com/open-mmlab/mmdetection/tree/main/projects/CO-DETR) releases (example for Co-DINO R50 DETR)
+
+```
+wget https://download.openmmlab.com/mmdetection/v3.0/codetr/co_dino_5scale_r50_1x_coco-7481f903.pth
+```
+
+**NOTE**: You can use your custom model.
+
+#### 4. Convert model
+
+Generate the ONNX model file (example for Co-DINO R50 DETR)
+
+```
+python3 export_codetr.py -w co_dino_5scale_r50_1x_coco-7481f903.pth -c projects/CO-DETR/configs/codino/co_dino_5scale_r50_8xb2_1x_coco.py --dynamic
+```
+
+**NOTE**: To change the inference size (default: 640)
+
+```
+-s SIZE
+--size SIZE
+-s HEIGHT WIDTH
+--size HEIGHT WIDTH
+```
+
+Example for 1280
+
+```
+-s 1280
+```
+
+or
+
+```
+-s 1280 1280
+```
+
+**NOTE**: To simplify the ONNX model (DeepStream >= 6.0)
+
+```
+--simplify
+```
+
+**NOTE**: To use dynamic batch-size (DeepStream >= 6.1)
+
+```
+--dynamic
+```
+
+**NOTE**: To use static batch-size (example for batch-size = 4)
+
+```
+--batch 4
+```
+
+**NOTE**: If you are using DeepStream 5.1, remove the `--dynamic` arg and use opset 12 or lower. The default opset is 11.
+
+```
+--opset 12
+```
+
+#### 5. Copy generated files
+
+Copy the generated ONNX model file and labels.txt file (if generated) to the `DeepStream-Yolo` folder.
+
+##
+
+### Compile the lib
+
+1. Open the `DeepStream-Yolo` folder and compile the lib
+
+2. Set the `CUDA_VER` according to your DeepStream version
+
+```
+export CUDA_VER=XY.Z
+```
+
+* x86 platform
+
+```
+DeepStream 7.1 = 12.6
+DeepStream 7.0 / 6.4 = 12.2
+DeepStream 6.3 = 12.1
+DeepStream 6.2 = 11.8
+DeepStream 6.1.1 = 11.7
+DeepStream 6.1 = 11.6
+DeepStream 6.0.1 / 6.0 = 11.4
+DeepStream 5.1 = 11.1
+```
+
+* Jetson platform
+
+```
+DeepStream 7.1 = 12.6
+DeepStream 7.0 / 6.4 = 12.2
+DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
+DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
+```
+
+3. Make the lib
+
+```
+make -C nvdsinfer_custom_impl_Yolo clean && make -C nvdsinfer_custom_impl_Yolo
+```
+
+##
+
+### Edit the config_infer_primary_codetr file
+
+Edit the `config_infer_primary_codetr.txt` file according to your model (example for Co-DINO R50 DETR with 80 classes)
+
+```
+[property]
+...
+onnx-file=co_dino_5scale_r50_1x_coco-7481f903.pth.onnx
+...
+num-detected-classes=80
+...
+parse-bbox-func-name=NvDsInferParseYolo
+...
+```
+
+**NOTE**: **CO-DETR (MMDetection)** resizes the input with left/top padding. To get better accuracy, use
+
+```
+[property]
+...
+maintain-aspect-ratio=1
+symmetric-padding=0
+...
+```
+
+##
+
+### Edit the deepstream_app_config file
+
+```
+...
+[primary-gie]
+...
+config-file=config_infer_primary_codetr.txt
+```
+
+##
+
+### Testing the model
+
+```
+deepstream-app -c deepstream_app_config.txt
+```
+
+**NOTE**: The TensorRT engine file may take a very long time to generate (sometimes more than 10 minutes).
+
+**NOTE**: For more information about custom models configuration (`batch-size`, `network-mode`, etc), please check the [`docs/customModels.md`](customModels.md) file.
@@ -16,7 +16,7 @@
 git clone https://github.com/tinyvision/DAMO-YOLO.git
 cd DAMO-YOLO
 pip3 install -r requirements.txt
-pip3 install onnx onnxsim onnxruntime
+pip3 install onnx onnxslim onnxruntime
 ```
 
 **NOTE**: It is recommended to use Python virtualenv.
@@ -107,6 +107,7 @@ export CUDA_VER=XY.Z
 * x86 platform
 
 ```
+DeepStream 7.1 = 12.6
 DeepStream 7.0 / 6.4 = 12.2
 DeepStream 6.3 = 12.1
 DeepStream 6.2 = 11.8
@@ -119,6 +120,7 @@ export CUDA_VER=XY.Z
 * Jetson platform
 
 ```
+DeepStream 7.1 = 12.6
 DeepStream 7.0 / 6.4 = 12.2
 DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
 DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
@@ -139,11 +141,11 @@ Edit the `config_infer_primary_damoyolo.txt` file according to your model (examp
 ```
 [property]
 ...
-onnx-file=damoyolo_tinynasL25_S.onnx
+onnx-file=damoyolo_tinynasL25_S_477.pth.onnx
 ...
 num-detected-classes=80
 ...
-parse-bbox-func-name=NvDsInferParseYoloE
+parse-bbox-func-name=NvDsInferParseYolo
 ...
 ```
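The `onnx-file` rename in the DAMO-YOLO hunk above (`damoyolo_tinynasL25_S.onnx` to `damoyolo_tinynasL25_S_477.pth.onnx`) is consistent with the updated exporters naming their output after the full weights filename; that reading is inferred from the diff, not stated in it:

```shell
# Assumed naming rule: the ONNX filename is the weights filename plus .onnx
w="damoyolo_tinynasL25_S_477.pth"
echo "${w}.onnx"
# -> damoyolo_tinynasL25_S_477.pth.onnx
```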
179
docs/GoldYOLO.md
Normal file
179
docs/GoldYOLO.md
Normal file
@@ -0,0 +1,179 @@
|
|||||||
|
# Gold-YOLO usage
|
||||||
|
|
||||||
|
* [Convert model](#convert-model)
|
||||||
|
* [Compile the lib](#compile-the-lib)
|
||||||
|
* [Edit the config_infer_primary_goldyolo file](#edit-the-config_infer_primary_goldyolo-file)
|
||||||
|
* [Edit the deepstream_app_config file](#edit-the-deepstream_app_config-file)
|
||||||
|
* [Testing the model](#testing-the-model)
|
||||||
|
|
||||||
|
##
|
||||||
|
|
||||||
|
### Convert model
|
||||||
|
|
||||||
|
#### 1. Download the Gold-YOLO repo and install the requirements
|
||||||
|
|
||||||
|
```
|
||||||
|
git clone https://github.com/huawei-noah/Efficient-Computing.git
|
||||||
|
cd Efficient-Computing/Detection/Gold-YOLO
|
||||||
|
pip3 install -r requirements.txt
|
||||||
|
pip3 install onnx onnxslim onnxruntime
|
||||||
|
```
|
||||||
|
|
||||||
|
**NOTE**: It is recommended to use Python virtualenv.
|
||||||
|
|
||||||
|
#### 2. Copy conversor
|
||||||
|
|
||||||
|
Copy the `export_goldyolo.py` file from `DeepStream-Yolo/utils` directory to the `Gold-YOLO` folder.
|
||||||
|
|
||||||
|
#### 3. Download the model
|
||||||
|
|
||||||
|
Download the `pt` file from [Gold-YOLO](https://github.com/huawei-noah/Efficient-Computing/tree/master/Detection/Gold-YOLO) releases
|
||||||
|
|
||||||
|
**NOTE**: You can use your custom model.
|
||||||
|
|
||||||
|
#### 4. Convert model
|
||||||
|
|
||||||
|
Generate the ONNX model file (example for Gold-YOLO-S)
|
||||||
|
|
||||||
|
```
|
||||||
|
python3 export_goldyolo.py -w Gold_s_pre_dist.pt --dynamic
|
||||||
|
```
|
||||||
|
|
||||||
|
**NOTE**: To change the inference size (defaut: 640)
|
||||||
|
|
||||||
|
```
|
||||||
|
-s SIZE
|
||||||
|
--size SIZE
|
||||||
|
-s HEIGHT WIDTH
|
||||||
|
--size HEIGHT WIDTH
|
||||||
|
```
|
||||||
|
|
||||||
|
Example for 1280
|
||||||
|
|
||||||
|
```
|
||||||
|
-s 1280
|
||||||
|
```
|
||||||
|
|
||||||
|
or
|
||||||
|
|
||||||
|
```
|
||||||
|
-s 1280 1280
|
||||||
|
```
|
||||||
|
|
||||||
|
**NOTE**: To simplify the ONNX model (DeepStream >= 6.0)
|
||||||
|
|
||||||
|
```
|
||||||
|
--simplify
|
```

**NOTE**: To use dynamic batch-size (DeepStream >= 6.1)

```
--dynamic
```

**NOTE**: To use static batch-size (example for batch-size = 4)

```
--batch 4
```

**NOTE**: If you are using DeepStream 5.1, remove the `--dynamic` arg and use opset 12 or lower. The default opset is 13.

```
--opset 12
```

#### 5. Copy generated files

Copy the generated ONNX model file and the labels.txt file (if generated) to the `DeepStream-Yolo` folder.

##

### Compile the lib

1. Open the `DeepStream-Yolo` folder and compile the lib

2. Set the `CUDA_VER` according to your DeepStream version

```
export CUDA_VER=XY.Z
```

* x86 platform

```
DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 = 12.1
DeepStream 6.2 = 11.8
DeepStream 6.1.1 = 11.7
DeepStream 6.1 = 11.6
DeepStream 6.0.1 / 6.0 = 11.4
DeepStream 5.1 = 11.1
```

* Jetson platform

```
DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
```

3. Make the lib

```
make -C nvdsinfer_custom_impl_Yolo clean && make -C nvdsinfer_custom_impl_Yolo
```

##

### Edit the config_infer_primary_goldyolo file

Edit the `config_infer_primary_goldyolo.txt` file according to your model (example for Gold-YOLO-S with 80 classes)

```
[property]
...
onnx-file=Gold_s_pre_dist.pt.onnx
...
num-detected-classes=80
...
parse-bbox-func-name=NvDsInferParseYolo
...
```

**NOTE**: **Gold-YOLO** resizes the input with center padding. To get better accuracy, use

```
[property]
...
maintain-aspect-ratio=1
symmetric-padding=1
...
```
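The center-padding note above can be made concrete. With `maintain-aspect-ratio=1` and `symmetric-padding=1`, the frame is scaled to fit the network input while keeping its aspect ratio, then padded equally on opposite sides. A minimal sketch of that arithmetic (the 1920x1080 frame and 640x640 input sizes are only an illustration, not values from this repo):

```python
def letterbox_geometry(src_w, src_h, dst_w, dst_h):
    """Scaled size and symmetric padding for maintain-aspect-ratio=1
    with symmetric-padding=1 (illustrative sketch, not DeepStream source)."""
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    pad_x = (dst_w - new_w) // 2  # left/right border
    pad_y = (dst_h - new_h) // 2  # top/bottom border
    return new_w, new_h, pad_x, pad_y

# A 1920x1080 frame into a 640x640 network input:
print(letterbox_geometry(1920, 1080, 640, 640))  # (640, 360, 0, 140)
```

With `symmetric-padding=0` the border would instead go only to the bottom/right, which is why these keys must match how the model resizes its input.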
##

### Edit the deepstream_app_config file

```
...
[primary-gie]
...
config-file=config_infer_primary_goldyolo.txt
```

##

### Testing the model

```
deepstream-app -c deepstream_app_config.txt
```

**NOTE**: The TensorRT engine file may take a very long time to generate (sometimes more than 10 minutes).

**NOTE**: For more information about custom models configuration (`batch-size`, `network-mode`, etc.), please check the [`docs/customModels.md`](customModels.md) file.
@@ -17,6 +17,7 @@ export CUDA_VER=XY.Z
 * x86 platform
 
 ```
+DeepStream 7.1 = 12.6
 DeepStream 7.0 / 6.4 = 12.2
 DeepStream 6.3 = 12.1
 DeepStream 6.2 = 11.8
@@ -29,6 +30,7 @@ export CUDA_VER=XY.Z
 * Jetson platform
 
 ```
+DeepStream 7.1 = 12.6
 DeepStream 7.0 / 6.4 = 12.2
 DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
 DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
@@ -1,6 +1,6 @@
 # PP-YOLOE / PP-YOLOE+ usage
 
-**NOTE**: You can use the release/2.6 branch of the PPYOLOE repo to convert all model versions.
+**NOTE**: You can use the develop branch of the PPYOLOE repo to convert all model versions.
 
 * [Convert model](#convert-model)
 * [Compile the lib](#compile-the-lib)
@@ -14,7 +14,7 @@
 
 #### 1. Download the PaddleDetection repo and install the requirements
 
-https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.7/docs/tutorials/INSTALL.md
+https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.8/docs/tutorials/INSTALL.md
 
 **NOTE**: It is recommended to use Python virtualenv.
 
@@ -24,7 +24,7 @@ Copy the `export_ppyoloe.py` file from `DeepStream-Yolo/utils` directory to the
 
 #### 3. Download the model
 
-Download the `pdparams` file from [PP-YOLOE](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.6/configs/ppyoloe) releases (example for PP-YOLOE+_s)
+Download the `pdparams` file from [PP-YOLOE](https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.8/configs/ppyoloe) releases (example for PP-YOLOE+_s)
 
 ```
 wget https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_s_80e_coco.pdparams
@@ -37,7 +37,7 @@ wget https://paddledet.bj.bcebos.com/models/ppyoloe_plus_crn_s_80e_coco.pdparams
 Generate the ONNX model file (example for PP-YOLOE+_s)
 
 ```
-pip3 install onnx onnxsim onnxruntime paddle2onnx
+pip3 install onnx onnxslim onnxruntime paddle2onnx
 python3 export_ppyoloe.py -w ppyoloe_plus_crn_s_80e_coco.pdparams -c configs/ppyoloe/ppyoloe_plus_crn_s_80e_coco.yml --dynamic
 ```
 
@@ -84,6 +84,7 @@ export CUDA_VER=XY.Z
 * x86 platform
 
 ```
+DeepStream 7.1 = 12.6
 DeepStream 7.0 / 6.4 = 12.2
 DeepStream 6.3 = 12.1
 DeepStream 6.2 = 11.8
@@ -96,6 +97,7 @@ export CUDA_VER=XY.Z
 * Jetson platform
 
 ```
+DeepStream 7.1 = 12.6
 DeepStream 7.0 / 6.4 = 12.2
 DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
 DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
@@ -116,11 +118,11 @@ Edit the `config_infer_primary_ppyoloe_plus.txt` file according to your model (e
 ```
 [property]
 ...
-onnx-file=ppyoloe_plus_crn_s_80e_coco.onnx
+onnx-file=ppyoloe_plus_crn_s_80e_coco.pdparams.onnx
 ...
 num-detected-classes=80
 ...
-parse-bbox-func-name=NvDsInferParseYoloE
+parse-bbox-func-name=NvDsInferParseYolo
 ...
 ```
 
@@ -14,13 +14,13 @@
 
 #### 1. Download the PaddleDetection repo and install the requirements
 
-https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.7/docs/tutorials/INSTALL.md
+https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.8/docs/tutorials/INSTALL.md
 
 ```
 git clone https://github.com/lyuwenyu/RT-DETR.git
 cd RT-DETR/rtdetr_paddle
 pip3 install -r requirements.txt
-pip3 install onnx onnxsim onnxruntime paddle2onnx
+pip3 install onnx onnxslim onnxruntime paddle2onnx
 ```
 
 **NOTE**: It is recommended to use Python virtualenv.
@@ -90,6 +90,7 @@ export CUDA_VER=XY.Z
 * x86 platform
 
 ```
+DeepStream 7.1 = 12.6
 DeepStream 7.0 / 6.4 = 12.2
 DeepStream 6.3 = 12.1
 DeepStream 6.2 = 11.8
@@ -102,6 +103,7 @@ export CUDA_VER=XY.Z
 * Jetson platform
 
 ```
+DeepStream 7.1 = 12.6
 DeepStream 7.0 / 6.4 = 12.2
 DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
 DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
@@ -122,7 +124,7 @@ Edit the `config_infer_primary_rtdetr.txt` file according to your model (example
 ```
 [property]
 ...
-onnx-file=rtdetr_r50vd_6x_coco.onnx
+onnx-file=rtdetr_r50vd_6x_coco.pdparams.onnx
 ...
 num-detected-classes=80
 ...
@@ -139,6 +141,15 @@ maintain-aspect-ratio=0
 ...
 ```
 
+**NOTE**: **RT-DETR** does not require NMS. To get better accuracy, use
+
+```
+[property]
+...
+cluster-mode=4
+...
+```
+
 ##
 
 ### Edit the deepstream_app_config file
@@ -18,7 +18,7 @@
 git clone https://github.com/lyuwenyu/RT-DETR.git
 cd RT-DETR/rtdetr_pytorch
 pip3 install -r requirements.txt
-pip3 install onnx onnxsim onnxruntime
+pip3 install onnx onnxslim onnxruntime
 ```
 
 **NOTE**: It is recommended to use Python virtualenv.
@@ -109,6 +109,7 @@ export CUDA_VER=XY.Z
 * x86 platform
 
 ```
+DeepStream 7.1 = 12.6
 DeepStream 7.0 / 6.4 = 12.2
 DeepStream 6.3 = 12.1
 DeepStream 6.2 = 11.8
@@ -121,6 +122,7 @@ export CUDA_VER=XY.Z
 * Jetson platform
 
 ```
+DeepStream 7.1 = 12.6
 DeepStream 7.0 / 6.4 = 12.2
 DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
 DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
@@ -141,7 +143,7 @@ Edit the `config_infer_primary_rtdetr.txt` file according to your model (example
 ```
 [property]
 ...
-onnx-file=rtdetr_r50vd_6x_coco_from_paddle.onnx
+onnx-file=rtdetr_r50vd_6x_coco_from_paddle.pth.onnx
 ...
 num-detected-classes=80
 ...
@@ -158,6 +160,15 @@ maintain-aspect-ratio=0
 ...
 ```
 
+**NOTE**: **RT-DETR** does not require NMS. To get better accuracy, use
+
+```
+[property]
+...
+cluster-mode=4
+...
+```
+
 ##
 
 ### Edit the deepstream_app_config file
@@ -17,9 +17,8 @@
 ```
 git clone https://github.com/ultralytics/ultralytics.git
 cd ultralytics
-pip3 install -r requirements.txt
-python3 setup.py install
-pip3 install onnx onnxsim onnxruntime
+pip3 install -e .
+pip3 install onnx onnxslim onnxruntime
 ```
 
 **NOTE**: It is recommended to use Python virtualenv.
@@ -30,17 +29,17 @@ Copy the `export_rtdetr_ultralytics.py` file from `DeepStream-Yolo/utils` direct
 
 #### 3. Download the model
 
-Download the `pt` file from [Ultralytics](https://github.com/ultralytics/assets/releases/) releases (example for RT-DETR-l)
+Download the `pt` file from [Ultralytics](https://github.com/ultralytics/assets/releases/) releases (example for RT-DETR-L)
 
 ```
-wget https://github.com/ultralytics/assets/releases/download/v0.0.0/rtdetr-l.pt
+wget https://github.com/ultralytics/assets/releases/download/v8.2.0/rtdetr-l.pt
 ```
 
 **NOTE**: You can use your custom model.
 
 #### 4. Convert model
 
-Generate the ONNX model file (example for RT-DETR-l)
+Generate the ONNX model file (example for RT-DETR-L)
 
 ```
 python3 export_rtdetr_ultralytics.py -w rtdetr-l.pt --dynamic
@@ -110,6 +109,7 @@ export CUDA_VER=XY.Z
 * x86 platform
 
 ```
+DeepStream 7.1 = 12.6
 DeepStream 7.0 / 6.4 = 12.2
 DeepStream 6.3 = 12.1
 DeepStream 6.2 = 11.8
@@ -122,6 +122,7 @@ export CUDA_VER=XY.Z
 * Jetson platform
 
 ```
+DeepStream 7.1 = 12.6
 DeepStream 7.0 / 6.4 = 12.2
 DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
 DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
@@ -137,12 +138,12 @@ make -C nvdsinfer_custom_impl_Yolo clean && make -C nvdsinfer_custom_impl_Yolo
 
 ### Edit the config_infer_primary_rtdetr file
 
-Edit the `config_infer_primary_rtdetr.txt` file according to your model (example for RT-DETR-l with 80 classes)
+Edit the `config_infer_primary_rtdetr.txt` file according to your model (example for RT-DETR-L with 80 classes)
 
 ```
 [property]
 ...
-onnx-file=rtdetr-l.onnx
+onnx-file=rtdetr-l.pt.onnx
 ...
 num-detected-classes=80
 ...
@@ -159,6 +160,15 @@ maintain-aspect-ratio=0
 ...
 ```
 
+**NOTE**: **RT-DETR Ultralytics** does not require NMS. To get better accuracy, use
+
+```
+[property]
+...
+cluster-mode=4
+...
+```
+
 ##
 
 ### Edit the deepstream_app_config file
209 docs/RTMDet.md (new file)

@@ -0,0 +1,209 @@

# RTMDet (MMYOLO) usage

* [Convert model](#convert-model)
* [Compile the lib](#compile-the-lib)
* [Edit the config_infer_primary_rtmdet file](#edit-the-config_infer_primary_rtmdet-file)
* [Edit the deepstream_app_config file](#edit-the-deepstream_app_config-file)
* [Testing the model](#testing-the-model)

##

### Convert model

#### 1. Download the RTMDet (MMYOLO) repo and install the requirements

```
git clone https://github.com/open-mmlab/mmyolo.git
cd mmyolo
pip3 install openmim
mim install "mmengine>=0.6.0"
mim install "mmcv>=2.0.0rc4,<2.1.0"
mim install "mmdet>=3.0.0,<4.0.0"
pip3 install -r requirements/albu.txt
mim install -v -e .
pip3 install onnx onnxslim onnxruntime
```

**NOTE**: It is recommended to use Python virtualenv.

#### 2. Copy the converter

Copy the `export_rtmdet.py` file from the `DeepStream-Yolo/utils` directory to the `mmyolo` folder.

#### 3. Download the model

Download the `pth` file from [RTMDet (MMYOLO)](https://github.com/open-mmlab/mmyolo/tree/main/configs/rtmdet) releases (example for RTMDet-s*)

```
wget https://download.openmmlab.com/mmrazor/v1/rtmdet_distillation/kd_s_rtmdet_m_neck_300e_coco/kd_s_rtmdet_m_neck_300e_coco_20230220_140647-446ff003.pth
```

**NOTE**: You can use your custom model.

#### 4. Convert model

Generate the ONNX model file (example for RTMDet-s*)

```
python3 export_rtmdet.py -w kd_s_rtmdet_m_neck_300e_coco_20230220_140647-446ff003.pth -c configs/rtmdet/distillation/kd_s_rtmdet_m_neck_300e_coco.py --dynamic
```

**NOTE**: To change the inference size (default: 640)

```
-s SIZE
--size SIZE
-s HEIGHT WIDTH
--size HEIGHT WIDTH
```

Example for 1280

```
-s 1280
```

or

```
-s 1280 1280
```

**NOTE**: To simplify the ONNX model (DeepStream >= 6.0)

```
--simplify
```

**NOTE**: To use dynamic batch-size (DeepStream >= 6.1)

```
--dynamic
```

**NOTE**: To use static batch-size (example for batch-size = 4)

```
--batch 4
```

**NOTE**: If you are using DeepStream 5.1, remove the `--dynamic` arg and use opset 12 or lower. The default opset is 17.

```
--opset 12
```

#### 5. Copy generated files

Copy the generated ONNX model file and the labels.txt file (if generated) to the `DeepStream-Yolo` folder.

##

### Compile the lib

1. Open the `DeepStream-Yolo` folder and compile the lib

2. Set the `CUDA_VER` according to your DeepStream version

```
export CUDA_VER=XY.Z
```

* x86 platform

```
DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 = 12.1
DeepStream 6.2 = 11.8
DeepStream 6.1.1 = 11.7
DeepStream 6.1 = 11.6
DeepStream 6.0.1 / 6.0 = 11.4
DeepStream 5.1 = 11.1
```

* Jetson platform

```
DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
```

3. Make the lib

```
make -C nvdsinfer_custom_impl_Yolo clean && make -C nvdsinfer_custom_impl_Yolo
```

##

### Edit the config_infer_primary_rtmdet file

Edit the `config_infer_primary_rtmdet.txt` file according to your model (example for RTMDet-s* with 80 classes)

```
[property]
...
onnx-file=kd_s_rtmdet_m_neck_300e_coco_20230220_140647-446ff003.pth.onnx
...
num-detected-classes=80
...
parse-bbox-func-name=NvDsInferParseYolo
...
```

**NOTE**: **RTMDet (MMYOLO)** resizes the input with center padding. To get better accuracy, use

```
[property]
...
maintain-aspect-ratio=1
symmetric-padding=1
...
```

**NOTE**: **RTMDet (MMYOLO)** uses the BGR color format for the image input. It is important to set `model-color-format` according to the trained values.

```
[property]
...
model-color-format=1
...
```

**NOTE**: **RTMDet (MMYOLO)** applies normalization during image preprocessing. It is important to set `net-scale-factor` and `offsets` according to the trained values.

Default: `mean = 0.485, 0.456, 0.406` and `std = 0.229, 0.224, 0.225`

```
[property]
...
net-scale-factor=0.0173520735727919486
offsets=103.53;116.28;123.675
...
```
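The `net-scale-factor` and `offsets` values above follow directly from the stated mean/std: nvinfer preprocesses as `y = net-scale-factor * (x - offsets)`, so `offsets` is `255 * mean` (listed in BGR order here, matching `model-color-format=1`) and, since only a single scalar scale is accepted, `net-scale-factor` is `1 / (255 * mean_of_std)`. A quick check of the arithmetic:

```python
mean = [0.485, 0.456, 0.406]  # RGB, from the default values above
std = [0.229, 0.224, 0.225]

# offsets = 255 * mean, reversed to BGR order to match model-color-format=1
offsets = [round(255 * m, 3) for m in reversed(mean)]
print(offsets)  # [103.53, 116.28, 123.675]

# nvinfer takes one scalar scale, so the per-channel stds are averaged
net_scale_factor = 1 / (255 * sum(std) / len(std))
print(net_scale_factor)  # ~0.01735207357...
```

The same recipe applies when a custom model is trained with a different mean/std.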
##

### Edit the deepstream_app_config file

```
...
[primary-gie]
...
config-file=config_infer_primary_rtmdet.txt
```

##

### Testing the model

```
deepstream-app -c deepstream_app_config.txt
```

**NOTE**: The TensorRT engine file may take a very long time to generate (sometimes more than 10 minutes).

**NOTE**: For more information about custom models configuration (`batch-size`, `network-mode`, etc.), please check the [`docs/customModels.md`](customModels.md) file.
@@ -19,7 +19,7 @@ git clone https://github.com/Deci-AI/super-gradients.git
 cd super-gradients
 pip3 install -r requirements.txt
 python3 setup.py install
-pip3 install onnx onnxsim onnxruntime
+pip3 install onnx onnxslim onnxruntime
 ```
 
 **NOTE**: It is recommended to use Python virtualenv.
@@ -140,6 +140,7 @@ export CUDA_VER=XY.Z
 * x86 platform
 
 ```
+DeepStream 7.1 = 12.6
 DeepStream 7.0 / 6.4 = 12.2
 DeepStream 6.3 = 12.1
 DeepStream 6.2 = 11.8
@@ -152,6 +153,7 @@ export CUDA_VER=XY.Z
 * Jetson platform
 
 ```
+DeepStream 7.1 = 12.6
 DeepStream 7.0 / 6.4 = 12.2
 DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
 DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
@@ -172,11 +174,11 @@ Edit the `config_infer_primary_yolonas.txt` file according to your model (exampl
 ```
 [property]
 ...
-onnx-file=yolo_nas_s_coco.onnx
+onnx-file=yolo_nas_s_coco.pth.onnx
 ...
 num-detected-classes=80
 ...
-parse-bbox-func-name=NvDsInferParseYoloE
+parse-bbox-func-name=NvDsInferParseYolo
 ...
 ```
 
@@ -20,7 +20,7 @@
 git clone https://github.com/WongKinYiu/yolor.git
 cd yolor
 pip3 install -r requirements.txt
-pip3 install onnx onnxsim onnxruntime
+pip3 install onnx onnxslim onnxruntime
 ```
 
 **NOTE**: It is recommended to use Python virtualenv.
@@ -125,6 +125,7 @@ export CUDA_VER=XY.Z
 * x86 platform
 
 ```
+DeepStream 7.1 = 12.6
 DeepStream 7.0 / 6.4 = 12.2
 DeepStream 6.3 = 12.1
 DeepStream 6.2 = 11.8
@@ -137,6 +138,7 @@ export CUDA_VER=XY.Z
 * Jetson platform
 
 ```
+DeepStream 7.1 = 12.6
 DeepStream 7.0 / 6.4 = 12.2
 DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
 DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
@@ -157,7 +159,7 @@ Edit the `config_infer_primary_yolor.txt` file according to your model (example
 ```
 [property]
 ...
-onnx-file=yolor_csp.onnx
+onnx-file=yolor_csp.pt.onnx
 ...
 num-detected-classes=80
 ...
@@ -19,7 +19,7 @@ git clone https://github.com/Megvii-BaseDetection/YOLOX.git
 cd YOLOX
 pip3 install -r requirements.txt
 python3 setup.py develop
-pip3 install onnx onnxsim onnxruntime
+pip3 install onnx onnxslim onnxruntime
 ```
 
 **NOTE**: It is recommended to use Python virtualenv.
@@ -89,6 +89,7 @@ export CUDA_VER=XY.Z
 * x86 platform
 
 ```
+DeepStream 7.1 = 12.6
 DeepStream 7.0 / 6.4 = 12.2
 DeepStream 6.3 = 12.1
 DeepStream 6.2 = 11.8
@@ -101,6 +102,7 @@ export CUDA_VER=XY.Z
 * Jetson platform
 
 ```
+DeepStream 7.1 = 12.6
 DeepStream 7.0 / 6.4 = 12.2
 DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
 DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
@@ -121,7 +123,7 @@ Edit the `config_infer_primary_yolox.txt` file according to your model (example
 ```
 [property]
 ...
-onnx-file=yolox_s.onnx
+onnx-file=yolox_s.pth.onnx
 ...
 num-detected-classes=80
 ...
@@ -141,6 +143,24 @@ symmetric-padding=0
 ...
 ```
 
+**NOTE**: **YOLOX** uses the BGR color format for the image input. It is important to set `model-color-format` according to the trained values.
+
+```
+[property]
+...
+model-color-format=1
+...
+```
+
+**NOTE**: **YOLOX legacy** uses the RGB color format for the image input. It is important to set `model-color-format` according to the trained values.
+
+```
+[property]
+...
+model-color-format=0
+...
+```
+
 **NOTE**: **YOLOX** uses no normalization in the image preprocessing. It is important to set `net-scale-factor` according to the trained values.
 
 ```
@@ -20,7 +20,7 @@
 git clone https://github.com/ultralytics/yolov5.git
 cd yolov5
 pip3 install -r requirements.txt
-pip3 install onnx onnxsim onnxruntime
+pip3 install onnx onnxslim onnxruntime
 ```
 
 **NOTE**: It is recommended to use Python virtualenv.
@@ -117,6 +117,7 @@ export CUDA_VER=XY.Z
 * x86 platform
 
 ```
+DeepStream 7.1 = 12.6
 DeepStream 7.0 / 6.4 = 12.2
 DeepStream 6.3 = 12.1
 DeepStream 6.2 = 11.8
@@ -129,6 +130,7 @@ export CUDA_VER=XY.Z
 * Jetson platform
 
 ```
+DeepStream 7.1 = 12.6
 DeepStream 7.0 / 6.4 = 12.2
 DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
 DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
@@ -149,7 +151,7 @@ Edit the `config_infer_primary_yoloV5.txt` file according to your model (example
 ```
 [property]
 ...
-onnx-file=yolov5s.onnx
+onnx-file=yolov5s.pt.onnx
 ...
 num-detected-classes=80
 ...
@@ -20,7 +20,7 @@
 git clone https://github.com/meituan/YOLOv6.git
 cd YOLOv6
 pip3 install -r requirements.txt
-pip3 install onnx onnxsim onnxruntime
+pip3 install onnx onnxslim onnxruntime
 ```
 
 **NOTE**: It is recommended to use Python virtualenv.
@@ -117,6 +117,7 @@ export CUDA_VER=XY.Z
 * x86 platform
 
 ```
+DeepStream 7.1 = 12.6
 DeepStream 7.0 / 6.4 = 12.2
 DeepStream 6.3 = 12.1
 DeepStream 6.2 = 11.8
@@ -129,6 +130,7 @@ export CUDA_VER=XY.Z
 * Jetson platform
 
 ```
+DeepStream 7.1 = 12.6
 DeepStream 7.0 / 6.4 = 12.2
 DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
 DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
@@ -149,7 +151,7 @@ Edit the `config_infer_primary_yoloV6.txt` file according to your model (example
 ```
 [property]
 ...
-onnx-file=yolov6s.onnx
+onnx-file=yolov6s.pt.onnx
 ...
 num-detected-classes=80
 ...
@@ -18,7 +18,7 @@
 git clone https://github.com/WongKinYiu/yolov7.git
 cd yolov7
 pip3 install -r requirements.txt
-pip3 install onnx onnxsim onnxruntime
+pip3 install onnx onnxslim onnxruntime
 ```
 
 **NOTE**: It is recommended to use Python virtualenv.
@@ -119,6 +119,7 @@ export CUDA_VER=XY.Z
 * x86 platform
 
 ```
+DeepStream 7.1 = 12.6
 DeepStream 7.0 / 6.4 = 12.2
 DeepStream 6.3 = 12.1
 DeepStream 6.2 = 11.8
@@ -131,6 +132,7 @@ export CUDA_VER=XY.Z
 * Jetson platform
 
 ```
+DeepStream 7.1 = 12.6
 DeepStream 7.0 / 6.4 = 12.2
 DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
 DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
@@ -151,7 +153,7 @@ Edit the `config_infer_primary_yoloV7.txt` file according to your model (example
 ```
 [property]
 ...
-onnx-file=yolov7.onnx
+onnx-file=yolov7.pt.onnx
 ...
 num-detected-classes=80
 ...
@@ -17,9 +17,8 @@
 ```
 git clone https://github.com/ultralytics/ultralytics.git
 cd ultralytics
-pip3 install -r requirements.txt
-python3 setup.py install
-pip3 install onnx onnxsim onnxruntime
+pip3 install -e .
+pip3 install onnx onnxslim onnxruntime
 ```
 
 **NOTE**: It is recommended to use Python virtualenv.
@@ -33,7 +32,7 @@ Copy the `export_yoloV8.py` file from `DeepStream-Yolo/utils` directory to the `
 Download the `pt` file from [YOLOv8](https://github.com/ultralytics/assets/releases/) releases (example for YOLOv8s)
 
 ```
-wget https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8s.pt
+wget https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8s.pt
 ```
 
 **NOTE**: You can use your custom model.
@@ -85,7 +84,7 @@ or
|
|||||||
--batch 4
|
--batch 4
|
||||||
```
|
```
|
||||||
|
|
||||||
**NOTE**: If you are using the DeepStream 5.1, remove the `--dynamic` arg and use opset 12 or lower. The default opset is 16.
|
**NOTE**: If you are using the DeepStream 5.1, remove the `--dynamic` arg and use opset 12 or lower. The default opset is 17.
|
||||||
|
|
||||||
```
|
```
|
||||||
--opset 12
|
--opset 12
|
||||||
@@ -110,6 +109,7 @@ export CUDA_VER=XY.Z
|
|||||||
* x86 platform
|
* x86 platform
|
||||||
|
|
||||||
```
|
```
|
||||||
|
DeepStream 7.1 = 12.6
|
||||||
DeepStream 7.0 / 6.4 = 12.2
|
DeepStream 7.0 / 6.4 = 12.2
|
||||||
DeepStream 6.3 = 12.1
|
DeepStream 6.3 = 12.1
|
||||||
DeepStream 6.2 = 11.8
|
DeepStream 6.2 = 11.8
|
||||||
@@ -122,6 +122,7 @@ export CUDA_VER=XY.Z
|
|||||||
* Jetson platform
|
* Jetson platform
|
||||||
|
|
||||||
```
|
```
|
||||||
|
DeepStream 7.1 = 12.6
|
||||||
DeepStream 7.0 / 6.4 = 12.2
|
DeepStream 7.0 / 6.4 = 12.2
|
||||||
DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
|
DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
|
||||||
DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
|
DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
|
||||||
@@ -142,7 +143,7 @@ Edit the `config_infer_primary_yoloV8.txt` file according to your model (example
|
|||||||
```
|
```
|
||||||
[property]
|
[property]
|
||||||
...
|
...
|
||||||
onnx-file=yolov8s.onnx
|
onnx-file=yolov8s.pt.onnx
|
||||||
...
|
...
|
||||||
num-detected-classes=80
|
num-detected-classes=80
|
||||||
...
|
...
|
||||||
|
|||||||
**docs/YOLOv9.md** (new file, +185 lines)

# YOLOv9 usage

**NOTE**: The yaml file is not required.

* [Convert model](#convert-model)
* [Compile the lib](#compile-the-lib)
* [Edit the config_infer_primary_yoloV9 file](#edit-the-config_infer_primary_yolov9-file)
* [Edit the deepstream_app_config file](#edit-the-deepstream_app_config-file)
* [Testing the model](#testing-the-model)

##

### Convert model

#### 1. Download the YOLOv9 repo and install the requirements

```
git clone https://github.com/WongKinYiu/yolov9.git
cd yolov9
pip3 install -r requirements.txt
pip3 install onnx onnxslim onnxruntime
```

**NOTE**: It is recommended to use Python virtualenv.

#### 2. Copy the converter

Copy the `export_yoloV9.py` file from the `DeepStream-Yolo/utils` directory to the `yolov9` folder.

#### 3. Download the model

Download the `pt` file from [YOLOv9](https://github.com/WongKinYiu/yolov9/releases/) releases (example for YOLOv9-S)

```
wget https://github.com/WongKinYiu/yolov9/releases/download/v0.1/yolov9-s-converted.pt
```

**NOTE**: You can use your custom model.

#### 4. Convert model

Generate the ONNX model file (example for YOLOv9-S)

```
python3 export_yoloV9.py -w yolov9-s-converted.pt --dynamic
```

**NOTE**: To change the inference size (default: 640)

```
-s SIZE
--size SIZE
-s HEIGHT WIDTH
--size HEIGHT WIDTH
```

Example for 1280

```
-s 1280
```

or

```
-s 1280 1280
```

**NOTE**: To simplify the ONNX model (DeepStream >= 6.0)

```
--simplify
```

**NOTE**: To use dynamic batch-size (DeepStream >= 6.1)

```
--dynamic
```

**NOTE**: To use static batch-size (example for batch-size = 4)

```
--batch 4
```

**NOTE**: If you are using DeepStream 5.1, remove the `--dynamic` arg and use opset 12 or lower. The default opset is 17.

```
--opset 12
```

#### 5. Copy generated files

Copy the generated ONNX model file and the labels.txt file (if generated) to the `DeepStream-Yolo` folder.

##

### Compile the lib

1. Open the `DeepStream-Yolo` folder and compile the lib

2. Set the `CUDA_VER` according to your DeepStream version

```
export CUDA_VER=XY.Z
```

* x86 platform

```
DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 = 12.1
DeepStream 6.2 = 11.8
DeepStream 6.1.1 = 11.7
DeepStream 6.1 = 11.6
DeepStream 6.0.1 / 6.0 = 11.4
DeepStream 5.1 = 11.1
```

* Jetson platform

```
DeepStream 7.1 = 12.6
DeepStream 7.0 / 6.4 = 12.2
DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
```

3. Make the lib

```
make -C nvdsinfer_custom_impl_Yolo clean && make -C nvdsinfer_custom_impl_Yolo
```

##

### Edit the config_infer_primary_yoloV9 file

Edit the `config_infer_primary_yoloV9.txt` file according to your model (example for YOLOv9-S with 80 classes)

```
[property]
...
onnx-file=yolov9-s-converted.pt.onnx
...
num-detected-classes=80
...
parse-bbox-func-name=NvDsInferParseYolo
...
```

**NOTE**: **YOLOv9** resizes the input with center padding. To get better accuracy, use

```
[property]
...
maintain-aspect-ratio=1
symmetric-padding=1
...
```
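The effect of `maintain-aspect-ratio=1` with `symmetric-padding=1` can be sketched as a small geometry calculation (an illustrative helper, not repo code): with symmetric padding the resized frame is centered in the network input rather than anchored at the top-left corner.

```python
def letterbox_geometry(img_w, img_h, net_w, net_h, symmetric=True):
    """Where a W x H frame lands inside the network input when the
    aspect ratio is maintained. With symmetric padding the image is
    centered; otherwise it sits at the top-left corner."""
    scale = min(net_w / img_w, net_h / img_h)
    new_w, new_h = round(img_w * scale), round(img_h * scale)
    pad_x = (net_w - new_w) // 2 if symmetric else 0
    pad_y = (net_h - new_h) // 2 if symmetric else 0
    return new_w, new_h, pad_x, pad_y

# A 1920x1080 frame in a 640x640 input: scaled to 640x360 and,
# with symmetric padding, centered with 140 px of border top and bottom.
print(letterbox_geometry(1920, 1080, 640, 640))  # (640, 360, 0, 140)
```

Matching the preprocessing used at export time is what the accuracy note above is about: the bounding-box parser assumes the same placement of the image inside the padded input.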
##

### Edit the deepstream_app_config file

```
...
[primary-gie]
...
config-file=config_infer_primary_yoloV9.txt
```

##

### Testing the model

```
deepstream-app -c deepstream_app_config.txt
```

**NOTE**: The TensorRT engine file may take a very long time to generate (sometimes more than 10 minutes).

**NOTE**: For more information about custom models configuration (`batch-size`, `network-mode`, etc.), please check the [`docs/customModels.md`](customModels.md) file.
````diff
@@ -34,6 +34,7 @@ export CUDA_VER=XY.Z
 * x86 platform
 
 ```
+DeepStream 7.1 = 12.6
 DeepStream 7.0 / 6.4 = 12.2
 DeepStream 6.3 = 12.1
 DeepStream 6.2 = 11.8
@@ -46,6 +47,7 @@ export CUDA_VER=XY.Z
 * Jetson platform
 
 ```
+DeepStream 7.1 = 12.6
 DeepStream 7.0 / 6.4 = 12.2
 DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
 DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
````
````diff
@@ -29,6 +29,157 @@ sudo apt-get install linux-headers-$(uname -r)
 sudo reboot
 ```
 
+<details><summary>DeepStream 7.1</summary>
+
+### 1. Dependencies
+
+```
+sudo apt-get install dkms
+sudo apt-get install libssl3 libssl-dev libgles2-mesa-dev libgstreamer1.0-0 gstreamer1.0-tools gstreamer1.0-plugins-good gstreamer1.0-plugins-bad gstreamer1.0-plugins-ugly gstreamer1.0-libav libgstreamer-plugins-base1.0-dev libgstrtspserver-1.0-0 libjansson4 libyaml-cpp-dev libjsoncpp-dev protobuf-compiler
+```
+
+### 2. CUDA Keyring
+
+```
+wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.0-1_all.deb
+sudo dpkg -i cuda-keyring_1.0-1_all.deb
+sudo apt-get update
+```
+
+### 3. GCC 12
+
+```
+sudo apt-get install gcc-12 g++-12
+sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-12 12
+sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-12 12
+sudo update-initramfs -u
+```
+
+### 4. NVIDIA Driver
+
+<details><summary>TITAN, GeForce RTX / GTX series and RTX / Quadro series</summary><blockquote>
+
+- Download
+
+```
+wget https://us.download.nvidia.com/XFree86/Linux-x86_64/560.35.03/NVIDIA-Linux-x86_64-560.35.03.run
+```
+
+<blockquote><details><summary>Laptop</summary>
+
+* Run
+
+```
+sudo sh NVIDIA-Linux-x86_64-560.35.03.run --no-cc-version-check --silent --disable-nouveau --dkms --install-libglvnd
+```
+
+**NOTE**: This step will disable the nouveau drivers.
+
+* Reboot
+
+```
+sudo reboot
+```
+
+* Install
+
+```
+sudo sh NVIDIA-Linux-x86_64-560.35.03.run --no-cc-version-check --silent --disable-nouveau --dkms --install-libglvnd
+```
+
+**NOTE**: If you are using a laptop with NVIDIA Optimus, run
+
+```
+sudo apt-get install nvidia-prime
+sudo prime-select nvidia
+```
+
+</details></blockquote>
+
+<blockquote><details><summary>Desktop</summary>
+
+* Run
+
+```
+sudo sh NVIDIA-Linux-x86_64-560.35.03.run --no-cc-version-check --silent --disable-nouveau --dkms --install-libglvnd --run-nvidia-xconfig
+```
+
+**NOTE**: This step will disable the nouveau drivers.
+
+* Reboot
+
+```
+sudo reboot
+```
+
+* Install
+
+```
+sudo sh NVIDIA-Linux-x86_64-560.35.03.run --no-cc-version-check --silent --disable-nouveau --dkms --install-libglvnd --run-nvidia-xconfig
+```
+
+</details></blockquote>
+
+</blockquote></details>
+
+<details><summary>Data center / Tesla series</summary><blockquote>
+
+- Download
+
+```
+wget https://us.download.nvidia.com/tesla/535.183.06/NVIDIA-Linux-x86_64-535.183.06.run
+```
+
+* Run
+
+```
+sudo sh NVIDIA-Linux-x86_64-535.183.06.run --no-cc-version-check --silent --disable-nouveau --dkms --install-libglvnd --run-nvidia-xconfig
+```
+
+</blockquote></details>
+
+### 5. CUDA
+
+```
+wget https://developer.download.nvidia.com/compute/cuda/12.6.2/local_installers/cuda_12.6.2_560.35.03_linux.run
+sudo sh cuda_12.6.2_560.35.03_linux.run --silent --toolkit
+```
+
+* Export environment variables
+
+```
+echo $'export PATH=/usr/local/cuda-12.6/bin${PATH:+:${PATH}}\nexport LD_LIBRARY_PATH=/usr/local/cuda-12.6/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}' >> ~/.bashrc && source ~/.bashrc
+```
+
+### 6. TensorRT
+
+```
+sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/3bf863cc.pub
+sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/ /"
+sudo apt-get update
+sudo apt-get install libnvinfer-dev=10.3.0.26-1+cuda12.5 libnvinfer-dispatch-dev=10.3.0.26-1+cuda12.5 libnvinfer-dispatch10=10.3.0.26-1+cuda12.5 libnvinfer-headers-dev=10.3.0.26-1+cuda12.5 libnvinfer-headers-plugin-dev=10.3.0.26-1+cuda12.5 libnvinfer-lean-dev=10.3.0.26-1+cuda12.5 libnvinfer-lean10=10.3.0.26-1+cuda12.5 libnvinfer-plugin-dev=10.3.0.26-1+cuda12.5 libnvinfer-plugin10=10.3.0.26-1+cuda12.5 libnvinfer-vc-plugin-dev=10.3.0.26-1+cuda12.5 libnvinfer-vc-plugin10=10.3.0.26-1+cuda12.5 libnvinfer10=10.3.0.26-1+cuda12.5 libnvonnxparsers-dev=10.3.0.26-1+cuda12.5 libnvonnxparsers10=10.3.0.26-1+cuda12.5 tensorrt-dev=10.3.0.26-1+cuda12.5 libnvinfer-samples=10.3.0.26-1+cuda12.5 libnvinfer-bin=10.3.0.26-1+cuda12.5 libcudnn9-cuda-12=9.3.0.75-1 libcudnn9-dev-cuda-12=9.3.0.75-1
+sudo apt-mark hold libnvinfer* libnvparsers* libnvonnxparsers* libcudnn9* python3-libnvinfer* uff-converter-tf* onnx-graphsurgeon* graphsurgeon-tf* tensorrt*
+```
+
+### 7. DeepStream SDK
+
+DeepStream 7.1 for Servers and Workstations
+
+```
+wget --content-disposition 'https://api.ngc.nvidia.com/v2/resources/org/nvidia/deepstream/7.1/files?redirect=true&path=deepstream-7.1_7.1.0-1_amd64.deb' -O deepstream-7.1_7.1.0-1_amd64.deb
+sudo apt-get install ./deepstream-7.1_7.1.0-1_amd64.deb
+rm ${HOME}/.cache/gstreamer-1.0/registry.x86_64.bin
+sudo ln -snf /usr/local/cuda-12.6 /usr/local/cuda
+```
+
+### 8. Reboot
+
+```
+sudo reboot
+```
+
+</details>
+
 <details><summary>DeepStream 7.0</summary>
 
 ### 1. Dependencies
````
````diff
@@ -59,6 +59,7 @@ export CUDA_VER=XY.Z
 * x86 platform
 
 ```
+DeepStream 7.1 = 12.6
 DeepStream 7.0 / 6.4 = 12.2
 DeepStream 6.3 = 12.1
 DeepStream 6.2 = 11.8
@@ -71,6 +72,7 @@ export CUDA_VER=XY.Z
 * Jetson platform
 
 ```
+DeepStream 7.1 = 12.6
 DeepStream 7.0 / 6.4 = 12.2
 DeepStream 6.3 / 6.2 / 6.1.1 / 6.1 = 11.4
 DeepStream 6.0.1 / 6.0 / 5.1 = 10.2
````
````diff
@@ -1,5 +1,5 @@
 ################################################################################
-# Copyright (c) 2018-2023, NVIDIA CORPORATION. All rights reserved.
+# Copyright (c) 2018-2024, NVIDIA CORPORATION. All rights reserved.
 #
 # Permission is hereby granted, free of charge, to any person obtaining a
 # copy of this software and associated documentation files (the "Software"),
@@ -51,15 +51,20 @@ ifeq ($(OPENCV), 1)
 endif
 
 ifeq ($(GRAPH), 1)
-COMMON+= -GRAPH
+COMMON+= -DGRAPH
 endif
 
 CUFLAGS:= -I/opt/nvidia/deepstream/deepstream/sources/includes -I/usr/local/cuda-$(CUDA_VER)/include
 
-LIBS+= -lnvinfer_plugin -lnvinfer -lnvparsers -lnvonnxparser -L/usr/local/cuda-$(CUDA_VER)/lib64 -lcudart -lcublas -lstdc++fs
+ifeq ($(shell ldconfig -p | grep -q libnvparsers && echo 1 || echo 0), 1)
+LIBS+= -lnvparsers
+endif
+
+LIBS+= -lnvinfer_plugin -lnvinfer -lnvonnxparser -L/usr/local/cuda-$(CUDA_VER)/lib64 -lcudart -lcublas -lstdc++fs
 LFLAGS:= -shared -Wl,--start-group $(LIBS) -Wl,--end-group
 
-INCS:= $(wildcard *.h)
+INCS:= $(wildcard layers/*.h)
+INCS+= $(wildcard *.h)
 
 SRCFILES:= $(filter-out calibrator.cpp, $(wildcard *.cpp))
````
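The new Makefile conditional links `-lnvparsers` only when the dynamic linker cache still lists the library (the legacy Caffe/UFF parsers were removed in recent TensorRT releases, so an unconditional `-lnvparsers` breaks the build there). A rough Python mirror of that decision, with a made-up helper name, purely for illustration:

```python
def extra_link_flags(ldconfig_output: str) -> list:
    """Mirror the Makefile logic: add -lnvparsers only if the
    `ldconfig -p` cache mentions libnvparsers; always link the rest."""
    libs = []
    if "libnvparsers" in ldconfig_output:
        libs.append("-lnvparsers")
    libs += ["-lnvinfer_plugin", "-lnvinfer", "-lnvonnxparser"]
    return libs

old_cache = "libnvparsers.so.8 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvparsers.so.8"
print(extra_link_flags(old_cache)[0])  # -lnvparsers
print(extra_link_flags("")[0])         # -lnvinfer_plugin
```

Probing the linker cache instead of pinning a TensorRT version keeps one Makefile working across all the DeepStream releases listed above.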
````diff
@@ -8,9 +8,10 @@
 #include <fstream>
 #include <iterator>
 
-Int8EntropyCalibrator2::Int8EntropyCalibrator2(const int& batchSize, const int& channels, const int& height, const int& width,
-    const float& scaleFactor, const float* offsets, const std::string& imgPath, const std::string& calibTablePath) :
-    batchSize(batchSize), inputC(channels), inputH(height), inputW(width), scaleFactor(scaleFactor), offsets(offsets),
+Int8EntropyCalibrator2::Int8EntropyCalibrator2(const int& batchSize, const int& channels, const int& height,
+    const int& width, const float& scaleFactor, const float* offsets, const int& inputFormat,
+    const std::string& imgPath, const std::string& calibTablePath) : batchSize(batchSize), inputC(channels),
+    inputH(height), inputW(width), scaleFactor(scaleFactor), offsets(offsets), inputFormat(inputFormat),
     calibTablePath(calibTablePath), imageIndex(0)
 {
   inputCount = batchSize * channels * height * width;
@@ -54,7 +55,7 @@ Int8EntropyCalibrator2::getBatch(void** bindings, const char** names, int nbBind
     return false;
   }
 
-  std::vector<float> inputData = prepareImage(img, inputC, inputH, inputW, scaleFactor, offsets);
+  std::vector<float> inputData = prepareImage(img, inputC, inputH, inputW, scaleFactor, offsets, inputFormat);
 
   size_t len = inputData.size();
   memcpy(ptr, inputData.data(), len * sizeof(float));
@@ -93,32 +94,46 @@ Int8EntropyCalibrator2::writeCalibrationCache(const void* cache, std::size_t len
 }
 
 std::vector<float>
-prepareImage(cv::Mat& img, int input_c, int input_h, int input_w, float scaleFactor, const float* offsets)
+prepareImage(cv::Mat& img, int inputC, int inputH, int inputW, float scaleFactor, const float* offsets, int inputFormat)
 {
   cv::Mat out;
 
-  cv::cvtColor(img, out, cv::COLOR_BGR2RGB);
+  if (inputFormat == 0) {
+    cv::cvtColor(img, out, cv::COLOR_BGR2RGB);
+  }
+  else if (inputFormat == 2) {
+    cv::cvtColor(img, out, cv::COLOR_BGR2GRAY);
+  }
+  else {
+    out = img;
+  }
 
-  int image_w = img.cols;
-  int image_h = img.rows;
+  int imageW = img.cols;
+  int imageH = img.rows;
 
-  if (image_w != input_w || image_h != input_h) {
-    float resizeFactor = std::max(input_w / (float) image_w, input_h / (float) img.rows);
+  if (imageW != inputW || imageH != inputH) {
+    float resizeFactor = std::max(inputW / (float) imageW, inputH / (float) imageH);
     cv::resize(out, out, cv::Size(0, 0), resizeFactor, resizeFactor, cv::INTER_CUBIC);
-    cv::Rect crop(cv::Point(0.5 * (out.cols - input_w), 0.5 * (out.rows - input_h)), cv::Size(input_w, input_h));
+    cv::Rect crop(cv::Point(0.5 * (out.cols - inputW), 0.5 * (out.rows - inputH)), cv::Size(inputW, inputH));
     out = out(crop);
   }
 
   out.convertTo(out, CV_32F, scaleFactor);
-  cv::subtract(out, cv::Scalar(offsets[2] / 255, offsets[1] / 255, offsets[0] / 255), out, cv::noArray(), -1);
 
-  std::vector<cv::Mat> input_channels(input_c);
-  cv::split(out, input_channels);
-  std::vector<float> result(input_h * input_w * input_c);
+  if (inputFormat == 2) {
+    cv::subtract(out, cv::Scalar(offsets[0] / 255), out);
+  }
+  else {
+    cv::subtract(out, cv::Scalar(offsets[0] / 255, offsets[1] / 255, offsets[2] / 255), out);
+  }
+
+  std::vector<cv::Mat> inputChannels(inputC);
+  cv::split(out, inputChannels);
+  std::vector<float> result(inputH * inputW * inputC);
   auto data = result.data();
-  int channelLength = input_h * input_w;
-  for (int i = 0; i < input_c; ++i) {
-    memcpy(data, input_channels[i].data, channelLength * sizeof(float));
+  int channelLength = inputH * inputW;
+  for (int i = 0; i < inputC; ++i) {
+    memcpy(data, inputChannels[i].data, channelLength * sizeof(float));
     data += channelLength;
   }
````
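As a sanity check on the reworked `prepareImage`, here is a rough Python sketch (illustrative names, not repo code) of its two stages: a cover-resize followed by a center crop, then the per-channel `convertTo` + `subtract` normalization:

```python
def calib_resize_crop(img_w, img_h, net_w, net_h):
    """Geometry used by prepareImage: scale so the image covers the
    network input (max ratio), then center-crop to net_w x net_h."""
    r = max(net_w / img_w, net_h / img_h)
    rw, rh = round(img_w * r), round(img_h * r)
    # cv::Point truncates, so use int() for the crop origin
    x0 = int(0.5 * (rw - net_w))
    y0 = int(0.5 * (rh - net_h))
    return (rw, rh), (x0, y0, net_w, net_h)

def normalize(pixel, scale_factor, offsets):
    """convertTo + subtract: v * scaleFactor - offset / 255 per channel."""
    return [v * scale_factor - o / 255 for v, o in zip(pixel, offsets)]

# 1920x1080 -> 640x640: resized to 1138x640, then 249 px cropped
# from each side so the 640x640 center remains.
print(calib_resize_crop(1920, 1080, 640, 640))  # ((1138, 640), (249, 0, 640, 640))
print(normalize([2, 4, 6], 0.5, [0.0, 0.0, 0.0]))  # [1.0, 2.0, 3.0]
```

Note this calibrator path crops rather than pads, so calibration images are framed slightly differently from the letterboxed inference input; the `inputFormat` branch (0 = RGB, 2 = grayscale) only changes the color conversion and the number of offset channels.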
````diff
@@ -12,18 +12,19 @@
 #include "NvInfer.h"
 #include "opencv2/opencv.hpp"
 
 #define CUDA_CHECK(status) { \
   if (status != 0) { \
-    std::cout << "CUDA failure: " << cudaGetErrorString(status) << " in file " << __FILE__ << " at line " << __LINE__ << \
-        std::endl; \
+    std::cout << "CUDA failure: " << cudaGetErrorString(status) << " in file " << __FILE__ << " at line " << \
+        __LINE__ << std::endl; \
     abort(); \
   } \
 }
 
 class Int8EntropyCalibrator2 : public nvinfer1::IInt8EntropyCalibrator2 {
   public:
     Int8EntropyCalibrator2(const int& batchSize, const int& channels, const int& height, const int& width,
-        const float& scaleFactor, const float* offsets, const std::string& imgPath, const std::string& calibTablePath);
+        const float& scaleFactor, const float* offsets, const int& inputFormat, const std::string& imgPath,
+        const std::string& calibTablePath);
 
     virtual ~Int8EntropyCalibrator2();
 
@@ -43,6 +44,7 @@ class Int8EntropyCalibrator2 : public nvinfer1::IInt8EntropyCalibrator2 {
     int letterBox;
     float scaleFactor;
    const float* offsets;
+    int inputFormat;
     std::string calibTablePath;
     size_t imageIndex;
     size_t inputCount;
@@ -53,7 +55,7 @@ class Int8EntropyCalibrator2 : public nvinfer1::IInt8EntropyCalibrator2 {
     std::vector<char> calibrationCache;
 };
 
-std::vector<float> prepareImage(cv::Mat& img, int input_c, int input_h, int input_w, float scaleFactor,
-    const float* offsets);
+std::vector<float> prepareImage(cv::Mat& img, int inputC, int inputH, int inputW, float scaleFactor,
+    const float* offsets, int inputFormat);
 
 #endif //CALIBRATOR_H
````
````diff
@@ -14,8 +14,9 @@ activationLayer(int layerIdx, std::string activation, nvinfer1::ITensor* input,
 {
   nvinfer1::ITensor* output;
 
-  if (activation == "linear")
+  if (activation == "linear") {
     output = input;
+  }
   else if (activation == "relu") {
     nvinfer1::IActivationLayer* relu = network->addActivation(*input, nvinfer1::ActivationType::kRELU);
     assert(relu != nullptr);
````
````diff
@@ -21,6 +21,11 @@ batchnormLayer(int layerIdx, std::map<std::string, std::string>& block, std::vec
   int filters = std::stoi(block.at("filters"));
   std::string activation = block.at("activation");
 
+  float eps = 1.0e-5;
+  if (block.find("eps") != block.end()) {
+    eps = std::stof(block.at("eps"));
+  }
+
   std::vector<float> bnBiases;
   std::vector<float> bnWeights;
   std::vector<float> bnRunningMean;
@@ -39,7 +44,7 @@ batchnormLayer(int layerIdx, std::map<std::string, std::string>& block, std::vec
     ++weightPtr;
   }
   for (int i = 0; i < filters; ++i) {
-    bnRunningVar.push_back(sqrt(weights[weightPtr] + 1.0e-5));
+    bnRunningVar.push_back(sqrt(weights[weightPtr] + eps));
     ++weightPtr;
   }
 
@@ -47,18 +52,25 @@ batchnormLayer(int layerIdx, std::map<std::string, std::string>& block, std::vec
   nvinfer1::Weights shift {nvinfer1::DataType::kFLOAT, nullptr, size};
   nvinfer1::Weights scale {nvinfer1::DataType::kFLOAT, nullptr, size};
   nvinfer1::Weights power {nvinfer1::DataType::kFLOAT, nullptr, size};
 
   float* shiftWt = new float[size];
-  for (int i = 0; i < size; ++i)
+  for (int i = 0; i < size; ++i) {
     shiftWt[i] = bnBiases.at(i) - ((bnRunningMean.at(i) * bnWeights.at(i)) / bnRunningVar.at(i));
+  }
   shift.values = shiftWt;
 
   float* scaleWt = new float[size];
-  for (int i = 0; i < size; ++i)
+  for (int i = 0; i < size; ++i) {
     scaleWt[i] = bnWeights.at(i) / bnRunningVar[i];
+  }
   scale.values = scaleWt;
 
   float* powerWt = new float[size];
-  for (int i = 0; i < size; ++i)
+  for (int i = 0; i < size; ++i) {
     powerWt[i] = 1.0;
+  }
   power.values = powerWt;
 
   trtWeights.push_back(shift);
   trtWeights.push_back(scale);
   trtWeights.push_back(power);
````
|||||||
@@ -15,7 +15,7 @@ convolutionalLayer(int layerIdx, std::map<std::string, std::string>& block, std:
|
|||||||
{
|
{
|
||||||
nvinfer1::ITensor* output;
|
nvinfer1::ITensor* output;
|
||||||
|
|
||||||
assert(block.at("type") == "convolutional" || block.at("type") == "c2f");
|
assert(block.at("type") == "conv" || block.at("type") == "convolutional");
|
||||||
assert(block.find("filters") != block.end());
|
assert(block.find("filters") != block.end());
|
||||||
assert(block.find("pad") != block.end());
|
assert(block.find("pad") != block.end());
|
||||||
assert(block.find("size") != block.end());
|
assert(block.find("size") != block.end());
|
||||||
@@ -28,27 +28,35 @@ convolutionalLayer(int layerIdx, std::map<std::string, std::string>& block, std:
|
|||||||
std::string activation = block.at("activation");
|
std::string activation = block.at("activation");
|
||||||
int bias = filters;
|
int bias = filters;
|
||||||
|
|
||||||
bool batchNormalize = false;
|
int batchNormalize = 0;
|
||||||
|
float eps = 1.0e-5;
|
||||||
if (block.find("batch_normalize") != block.end()) {
|
if (block.find("batch_normalize") != block.end()) {
|
||||||
bias = 0;
|
bias = 0;
|
||||||
batchNormalize = (block.at("batch_normalize") == "1");
|
batchNormalize = (block.at("batch_normalize") == "1");
|
||||||
|
if (block.find("eps") != block.end()) {
|
||||||
|
eps = std::stof(block.at("eps"));
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
if (block.find("bias") != block.end()) {
|
if (block.find("bias") != block.end()) {
|
||||||
bias = std::stoi(block.at("bias"));
|
bias = std::stoi(block.at("bias"));
|
||||||
if (bias == 1)
|
if (bias == 1) {
|
||||||
bias = filters;
|
bias = filters;
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
int groups = 1;
|
int groups = 1;
|
||||||
if (block.find("groups") != block.end())
|
if (block.find("groups") != block.end()) {
|
||||||
groups = std::stoi(block.at("groups"));
|
groups = std::stoi(block.at("groups"));
|
||||||
|
}
|
||||||
|
|
||||||
int pad;
|
int pad;
|
||||||
if (padding)
|
if (padding) {
|
||||||
pad = (kernelSize - 1) / 2;
|
pad = (kernelSize - 1) / 2;
|
||||||
else
|
}
|
||||||
|
else {
|
||||||
pad = 0;
|
pad = 0;
|
||||||
|
}
|
||||||
|
|
||||||
int size = filters * inputChannels * kernelSize * kernelSize / groups;
|
int size = filters * inputChannels * kernelSize * kernelSize / groups;
|
||||||
std::vector<float> bnBiases;
|
std::vector<float> bnBiases;
|
||||||
@@ -58,7 +66,7 @@ convolutionalLayer(int layerIdx, std::map<std::string, std::string>& block, std:
|
|||||||
nvinfer1::Weights convWt {nvinfer1::DataType::kFLOAT, nullptr, size};
|
nvinfer1::Weights convWt {nvinfer1::DataType::kFLOAT, nullptr, size};
|
||||||
nvinfer1::Weights convBias {nvinfer1::DataType::kFLOAT, nullptr, bias};
|
nvinfer1::Weights convBias {nvinfer1::DataType::kFLOAT, nullptr, bias};
|
||||||
|
|
||||||
if (batchNormalize == false) {
|
if (batchNormalize == 0) {
|
||||||
float* val;
|
float* val;
|
||||||
if (bias != 0) {
|
if (bias != 0) {
|
||||||
val = new float[filters];
|
val = new float[filters];
|
||||||
@@ -91,7 +99,7 @@ convolutionalLayer(int layerIdx, std::map<std::string, std::string>& block, std:
|
|||||||
++weightPtr;
|
++weightPtr;
|
||||||
}
|
}
|
||||||
for (int i = 0; i < filters; ++i) {
|
for (int i = 0; i < filters; ++i) {
|
||||||
bnRunningVar.push_back(sqrt(weights[weightPtr] + 1.0e-5));
|
bnRunningVar.push_back(sqrt(weights[weightPtr] + eps));
|
||||||
++weightPtr;
|
++weightPtr;
|
||||||
}
|
}
|
||||||
float* val;
|
float* val;
|
||||||
@@ -110,40 +118,49 @@ convolutionalLayer(int layerIdx, std::map<std::string, std::string>& block, std:
|
|||||||
}
|
}
|
||||||
convWt.values = val;
|
convWt.values = val;
|
||||||
trtWeights.push_back(convWt);
|
trtWeights.push_back(convWt);
|
||||||
if (bias != 0)
|
if (bias != 0) {
|
||||||
trtWeights.push_back(convBias);
|
trtWeights.push_back(convBias);
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
nvinfer1::IConvolutionLayer* conv = network->addConvolutionNd(*input, filters, nvinfer1::Dims{2, {kernelSize, kernelSize}},
|
nvinfer1::IConvolutionLayer* conv = network->addConvolutionNd(*input, filters,
|
||||||
convWt, convBias);
|
nvinfer1::Dims{2, {kernelSize, kernelSize}}, convWt, convBias);
|
||||||
assert(conv != nullptr);
|
assert(conv != nullptr);
|
||||||
std::string convLayerName = "conv_" + layerName + std::to_string(layerIdx);
|
std::string convLayerName = "conv_" + layerName + std::to_string(layerIdx);
|
||||||
conv->setName(convLayerName.c_str());
|
conv->setName(convLayerName.c_str());
|
||||||
conv->setStrideNd(nvinfer1::Dims{2, {stride, stride}});
|
conv->setStrideNd(nvinfer1::Dims{2, {stride, stride}});
|
||||||
conv->setPaddingNd(nvinfer1::Dims{2, {pad, pad}});
|
conv->setPaddingNd(nvinfer1::Dims{2, {pad, pad}});
|
||||||
|
|
||||||
if (block.find("groups") != block.end())
|
if (block.find("groups") != block.end()) {
|
||||||
conv->setNbGroups(groups);
|
conv->setNbGroups(groups);
|
||||||
|
}
|
||||||
|
|
||||||
output = conv->getOutput(0);
|
output = conv->getOutput(0);
|
||||||
|
|
||||||
if (batchNormalize == true) {
|
if (batchNormalize == 1) {
|
||||||
size = filters;
|
size = filters;
|
||||||
nvinfer1::Weights shift {nvinfer1::DataType::kFLOAT, nullptr, size};
|
nvinfer1::Weights shift {nvinfer1::DataType::kFLOAT, nullptr, size};
|
||||||
nvinfer1::Weights scale {nvinfer1::DataType::kFLOAT, nullptr, size};
|
nvinfer1::Weights scale {nvinfer1::DataType::kFLOAT, nullptr, size};
|
||||||
nvinfer1::Weights power {nvinfer1::DataType::kFLOAT, nullptr, size};
|
nvinfer1::Weights power {nvinfer1::DataType::kFLOAT, nullptr, size};
|
||||||
|
|
||||||
float* shiftWt = new float[size];
|
float* shiftWt = new float[size];
|
||||||
for (int i = 0; i < size; ++i)
|
for (int i = 0; i < size; ++i) {
|
||||||
shiftWt[i] = bnBiases.at(i) - ((bnRunningMean.at(i) * bnWeights.at(i)) / bnRunningVar.at(i));
|
shiftWt[i] = bnBiases.at(i) - ((bnRunningMean.at(i) * bnWeights.at(i)) / bnRunningVar.at(i));
|
||||||
|
}
|
||||||
shift.values = shiftWt;
|
shift.values = shiftWt;
|
||||||
|
|
||||||
float* scaleWt = new float[size];
|
float* scaleWt = new float[size];
|
||||||
for (int i = 0; i < size; ++i)
|
for (int i = 0; i < size; ++i) {
|
||||||
scaleWt[i] = bnWeights.at(i) / bnRunningVar[i];
|
scaleWt[i] = bnWeights.at(i) / bnRunningVar[i];
|
||||||
|
}
|
||||||
scale.values = scaleWt;
|
scale.values = scaleWt;
|
||||||
|
|
||||||
float* powerWt = new float[size];
|
float* powerWt = new float[size];
|
||||||
for (int i = 0; i < size; ++i)
|
for (int i = 0; i < size; ++i) {
|
||||||
powerWt[i] = 1.0;
|
powerWt[i] = 1.0;
|
||||||
|
}
|
||||||
power.values = powerWt;
|
power.values = powerWt;
|
||||||
|
|
||||||
trtWeights.push_back(shift);
|
trtWeights.push_back(shift);
|
||||||
trtWeights.push_back(scale);
|
trtWeights.push_back(scale);
|
||||||
trtWeights.push_back(power);
|
trtWeights.push_back(power);
|
||||||
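The batch-norm branch above folds the four Darknet BN tensors into the per-channel shift/scale/power terms that TensorRT's `IScaleLayer` applies as `(x * scale + shift) ^ power`; note that `bnRunningVar` already stores `sqrt(var + eps)` at load time. A standalone sketch of that folding arithmetic, checked against a direct batch-norm evaluation (plain C++, no TensorRT; names are illustrative):

```cpp
#include <cassert>
#include <cmath>

// Direct batch-norm: y = gamma * (x - mean) / sqrt(var + eps) + beta
double batchNorm(double x, double gamma, double beta, double mean, double var, double eps) {
  return gamma * (x - mean) / std::sqrt(var + eps) + beta;
}

// Folded form used by the layer code above: bnRunningVar holds sqrt(var + eps),
// shift = beta - mean * gamma / sqrt(var + eps), scale = gamma / sqrt(var + eps),
// power = 1, so y = (x * scale + shift) ^ power.
double foldedBatchNorm(double x, double gamma, double beta, double mean, double var, double eps) {
  double runningVar = std::sqrt(var + eps);
  double shift = beta - (mean * gamma) / runningVar;
  double scale = gamma / runningVar;
  double power = 1.0;
  return std::pow(x * scale + shift, power);
}
```

Because the fold is exact algebra, the two forms agree to rounding error for any input, which is why the conversion loses no accuracy.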
**convolutional_layer.h**

```diff
@@ -13,8 +13,8 @@
 
 #include "activation_layer.h"
 
-nvinfer1::ITensor* convolutionalLayer(int layerIdx, std::map<std::string, std::string>& block, std::vector<float>& weights,
-    std::vector<nvinfer1::Weights>& trtWeights, int& weightPtr, int& inputChannels, nvinfer1::ITensor* input,
-    nvinfer1::INetworkDefinition* network, std::string layerName = "");
+nvinfer1::ITensor* convolutionalLayer(int layerIdx, std::map<std::string, std::string>& block,
+    std::vector<float>& weights, std::vector<nvinfer1::Weights>& trtWeights, int& weightPtr, int& inputChannels,
+    nvinfer1::ITensor* input, nvinfer1::INetworkDefinition* network, std::string layerName = "");
 
 #endif
```
**deconvolutional_layer.cpp**

```diff
@@ -6,6 +6,7 @@
 #include "deconvolutional_layer.h"
 
 #include <cassert>
+#include <math.h>
 
 nvinfer1::ITensor*
 deconvolutionalLayer(int layerIdx, std::map<std::string, std::string>& block, std::vector<float>& weights,
@@ -14,7 +15,7 @@ deconvolutionalLayer(int layerIdx, std::map<std::string, std::string>& block, st
 {
   nvinfer1::ITensor* output;
 
-  assert(block.at("type") == "deconvolutional");
+  assert(block.at("type") == "deconv" || block.at("type") == "deconvolutional");
   assert(block.find("filters") != block.end());
   assert(block.find("pad") != block.end());
   assert(block.find("size") != block.end());
@@ -24,20 +25,38 @@ deconvolutionalLayer(int layerIdx, std::map<std::string, std::string>& block, st
   int padding = std::stoi(block.at("pad"));
   int kernelSize = std::stoi(block.at("size"));
   int stride = std::stoi(block.at("stride"));
+  std::string activation = block.at("activation");
   int bias = filters;
 
-  int groups = 1;
-  if (block.find("groups") != block.end())
-    groups = std::stoi(block.at("groups"));
+  int batchNormalize = 0;
+  float eps = 1.0e-5;
+  if (block.find("batch_normalize") != block.end()) {
+    bias = 0;
+    batchNormalize = (block.at("batch_normalize") == "1");
+    if (block.find("eps") != block.end()) {
+      eps = std::stof(block.at("eps"));
+    }
+  }
 
-  if (block.find("bias") != block.end())
+  if (block.find("bias") != block.end()) {
     bias = std::stoi(block.at("bias"));
+    if (bias == 1) {
+      bias = filters;
+    }
+  }
+
+  int groups = 1;
+  if (block.find("groups") != block.end()) {
+    groups = std::stoi(block.at("groups"));
+  }
 
   int pad;
-  if (padding)
+  if (padding) {
     pad = (kernelSize - 1) / 2;
-  else
+  }
+  else {
     pad = 0;
+  }
 
   int size = filters * inputChannels * kernelSize * kernelSize / groups;
   std::vector<float> bnBiases;
@@ -47,23 +66,62 @@ deconvolutionalLayer(int layerIdx, std::map<std::string, std::string>& block, st
   nvinfer1::Weights convWt {nvinfer1::DataType::kFLOAT, nullptr, size};
   nvinfer1::Weights convBias {nvinfer1::DataType::kFLOAT, nullptr, bias};
 
-  float* val;
-  if (bias != 0) {
-    val = new float[filters];
-    for (int i = 0; i < filters; ++i) {
-      val[i] = weights[weightPtr];
-      ++weightPtr;
-    }
-    convBias.values = val;
-    trtWeights.push_back(convBias);
-  }
-  val = new float[size];
-  for (int i = 0; i < size; ++i) {
-    val[i] = weights[weightPtr];
-    ++weightPtr;
-  }
-  convWt.values = val;
-  trtWeights.push_back(convWt);
+  if (batchNormalize == 0) {
+    float* val;
+    if (bias != 0) {
+      val = new float[filters];
+      for (int i = 0; i < filters; ++i) {
+        val[i] = weights[weightPtr];
+        ++weightPtr;
+      }
+      convBias.values = val;
+      trtWeights.push_back(convBias);
+    }
+    val = new float[size];
+    for (int i = 0; i < size; ++i) {
+      val[i] = weights[weightPtr];
+      ++weightPtr;
+    }
+    convWt.values = val;
+    trtWeights.push_back(convWt);
+  }
+  else {
+    for (int i = 0; i < filters; ++i) {
+      bnBiases.push_back(weights[weightPtr]);
+      ++weightPtr;
+    }
+    for (int i = 0; i < filters; ++i) {
+      bnWeights.push_back(weights[weightPtr]);
+      ++weightPtr;
+    }
+    for (int i = 0; i < filters; ++i) {
+      bnRunningMean.push_back(weights[weightPtr]);
+      ++weightPtr;
+    }
+    for (int i = 0; i < filters; ++i) {
+      bnRunningVar.push_back(sqrt(weights[weightPtr] + eps));
+      ++weightPtr;
+    }
+    float* val;
+    if (bias != 0) {
+      val = new float[filters];
+      for (int i = 0; i < filters; ++i) {
+        val[i] = weights[weightPtr];
+        ++weightPtr;
+      }
+      convBias.values = val;
+    }
+    val = new float[size];
+    for (int i = 0; i < size; ++i) {
+      val[i] = weights[weightPtr];
+      ++weightPtr;
+    }
+    convWt.values = val;
+    trtWeights.push_back(convWt);
+    if (bias != 0) {
+      trtWeights.push_back(convBias);
+    }
+  }
 
   nvinfer1::IDeconvolutionLayer* conv = network->addDeconvolutionNd(*input, filters,
       nvinfer1::Dims{2, {kernelSize, kernelSize}}, convWt, convBias);
@@ -73,10 +131,49 @@ deconvolutionalLayer(int layerIdx, std::map<std::string, std::string>& block, st
   conv->setStrideNd(nvinfer1::Dims{2, {stride, stride}});
   conv->setPaddingNd(nvinfer1::Dims{2, {pad, pad}});
 
-  if (block.find("groups") != block.end())
+  if (block.find("groups") != block.end()) {
     conv->setNbGroups(groups);
+  }
 
   output = conv->getOutput(0);
 
+  if (batchNormalize == 1) {
+    size = filters;
+    nvinfer1::Weights shift {nvinfer1::DataType::kFLOAT, nullptr, size};
+    nvinfer1::Weights scale {nvinfer1::DataType::kFLOAT, nullptr, size};
+    nvinfer1::Weights power {nvinfer1::DataType::kFLOAT, nullptr, size};
+
+    float* shiftWt = new float[size];
+    for (int i = 0; i < size; ++i) {
+      shiftWt[i] = bnBiases.at(i) - ((bnRunningMean.at(i) * bnWeights.at(i)) / bnRunningVar.at(i));
+    }
+    shift.values = shiftWt;
+
+    float* scaleWt = new float[size];
+    for (int i = 0; i < size; ++i) {
+      scaleWt[i] = bnWeights.at(i) / bnRunningVar[i];
+    }
+    scale.values = scaleWt;
+
+    float* powerWt = new float[size];
+    for (int i = 0; i < size; ++i) {
+      powerWt[i] = 1.0;
+    }
+    power.values = powerWt;
+
+    trtWeights.push_back(shift);
+    trtWeights.push_back(scale);
+    trtWeights.push_back(power);
+
+    nvinfer1::IScaleLayer* batchnorm = network->addScale(*output, nvinfer1::ScaleMode::kCHANNEL, shift, scale, power);
+    assert(batchnorm != nullptr);
+    std::string batchnormLayerName = "batchnorm_" + layerName + std::to_string(layerIdx);
+    batchnorm->setName(batchnormLayerName.c_str());
+    output = batchnorm->getOutput(0);
+  }
+
+  output = activationLayer(layerIdx, activation, output, network, layerName);
+  assert(output != nullptr);
+
   return output;
 }
```
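The loader above consumes the flat Darknet weight file through `weightPtr` in a fixed order: with batch-norm it reads the BN biases (beta), BN weights (gamma), running mean, and running variance (one float per filter each) before the kernel weights; without batch-norm it reads the convolution biases and then the kernels. A minimal sketch of how many floats that pointer walk advances (illustrative, no TensorRT):

```cpp
#include <cassert>

// Mirrors the weightPtr advance in the deconv/conv loader above: count how
// many floats one Darknet block consumes from the flat weight array.
int weightsConsumed(int filters, int kernelSize, int inputChannels, int groups,
    bool batchNormalize, bool hasBias) {
  int size = filters * inputChannels * kernelSize * kernelSize / groups;
  int consumed = 0;
  if (batchNormalize) {
    consumed += 4 * filters;  // bnBiases, bnWeights, bnRunningMean, bnRunningVar
  }
  if (hasBias) {
    consumed += filters;      // convolution biases
  }
  consumed += size;           // kernel weights
  return consumed;
}
```

Getting this count wrong for any layer desynchronizes `weightPtr` for every layer after it, which is why the read order must match Darknet's serialization exactly.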
**deconvolutional_layer.h**

```diff
@@ -8,12 +8,13 @@
 
 #include <map>
 #include <vector>
-#include <string>
 
 #include "NvInfer.h"
 
-nvinfer1::ITensor* deconvolutionalLayer(int layerIdx, std::map<std::string, std::string>& block, std::vector<float>& weights,
-    std::vector<nvinfer1::Weights>& trtWeights, int& weightPtr, int& inputChannels, nvinfer1::ITensor* input,
-    nvinfer1::INetworkDefinition* network, std::string layerName = "");
+#include "activation_layer.h"
+
+nvinfer1::ITensor* deconvolutionalLayer(int layerIdx, std::map<std::string, std::string>& block,
+    std::vector<float>& weights, std::vector<nvinfer1::Weights>& trtWeights, int& weightPtr, int& inputChannels,
+    nvinfer1::ITensor* input, nvinfer1::INetworkDefinition* network, std::string layerName = "");
 
 #endif
```
**reorg_layer.cpp**

```diff
@@ -10,7 +10,7 @@
 
 nvinfer1::ITensor*
 reorgLayer(int layerIdx, std::map<std::string, std::string>& block, nvinfer1::ITensor* input,
-    nvinfer1::INetworkDefinition* network, uint batchSize)
+    nvinfer1::INetworkDefinition* network)
 {
   nvinfer1::ITensor* output;
 
@@ -35,17 +35,17 @@ reorgLayer(int layerIdx, std::map<std::string, std::string>& block, nvinfer1::IT
   nvinfer1::Dims sizeAll = {4, {inputDims.d[0], inputDims.d[1], inputDims.d[2] / stride, inputDims.d[3] / stride}};
   nvinfer1::Dims strideAll = {4, {1, 1, stride, stride}};
 
-  nvinfer1::ITensor* slice1 = sliceLayer(layerIdx, name1, input, start1, sizeAll, strideAll, network, batchSize);
-  assert(output != nullptr);
+  nvinfer1::ITensor* slice1 = sliceLayer(layerIdx, name1, input, start1, sizeAll, strideAll, network);
+  assert(slice1 != nullptr);
 
-  nvinfer1::ITensor* slice2 = sliceLayer(layerIdx, name2, input, start2, sizeAll, strideAll, network, batchSize);
-  assert(output != nullptr);
+  nvinfer1::ITensor* slice2 = sliceLayer(layerIdx, name2, input, start2, sizeAll, strideAll, network);
+  assert(slice2 != nullptr);
 
-  nvinfer1::ITensor* slice3 = sliceLayer(layerIdx, name3, input, start3, sizeAll, strideAll, network, batchSize);
-  assert(output != nullptr);
+  nvinfer1::ITensor* slice3 = sliceLayer(layerIdx, name3, input, start3, sizeAll, strideAll, network);
+  assert(slice3 != nullptr);
 
-  nvinfer1::ITensor* slice4 = sliceLayer(layerIdx, name4, input, start4, sizeAll, strideAll, network, batchSize);
-  assert(output != nullptr);
+  nvinfer1::ITensor* slice4 = sliceLayer(layerIdx, name4, input, start4, sizeAll, strideAll, network);
+  assert(slice4 != nullptr);
 
   std::vector<nvinfer1::ITensor*> concatInputs;
   concatInputs.push_back(slice1);
```
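The reorg layer above builds a space-to-depth rearrangement out of four stride-2 slices (one per spatial phase) concatenated on the channel axis. A plain C++ reference of what the four slices jointly produce on a single-channel map (the phase ordering here is illustrative; the actual `start1`..`start4` offsets are defined earlier in the file, outside this diff):

```cpp
#include <vector>

// Space-to-depth with stride 2 on a single-channel h x w map. Each of the
// four output phases corresponds to one strided slice in the reorg layer;
// the phases are concatenated, mirroring the channel-axis concat.
std::vector<float> reorgStride2(const std::vector<float>& in, int h, int w) {
  std::vector<float> out;
  for (int rowPhase = 0; rowPhase < 2; ++rowPhase) {
    for (int colPhase = 0; colPhase < 2; ++colPhase) {
      for (int y = rowPhase; y < h; y += 2) {
        for (int x = colPhase; x < w; x += 2) {
          out.push_back(in[y * w + x]);  // one strided sub-grid per phase
        }
      }
    }
  }
  return out;
}
```

Each phase keeps every second element in both dimensions, so an H x W input becomes 4 channels of (H/2) x (W/2), which matches `sizeAll` and `strideAll` in the diff.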
**reorg_layer.h**

```diff
@@ -14,6 +14,6 @@
 #include "slice_layer.h"
 
 nvinfer1::ITensor* reorgLayer(int layerIdx, std::map<std::string, std::string>& block, nvinfer1::ITensor* input,
-    nvinfer1::INetworkDefinition* network, uint batchSize);
+    nvinfer1::INetworkDefinition* network);
 
 #endif
```
**route_layer.cpp**

```diff
@@ -7,7 +7,7 @@
 
 nvinfer1::ITensor*
 routeLayer(int layerIdx, std::string& layers, std::map<std::string, std::string>& block,
-    std::vector<nvinfer1::ITensor*> tensorOutputs, nvinfer1::INetworkDefinition* network, uint batchSize)
+    std::vector<nvinfer1::ITensor*> tensorOutputs, nvinfer1::INetworkDefinition* network)
 {
   nvinfer1::ITensor* output;
 
@@ -49,7 +49,6 @@ routeLayer(int layerIdx, std::string& layers, std::map<std::string, std::string>
   int axis = 1;
   if (block.find("axis") != block.end()) {
     axis += std::stoi(block.at("axis"));
-    std::cout << axis << std::endl;
   }
   if (axis < 0) {
     axis += concatInputs[0]->getDimensions().nbDims;
@@ -75,7 +74,7 @@ routeLayer(int layerIdx, std::string& layers, std::map<std::string, std::string>
     nvinfer1::Dims size = {4, {prevTensorDims.d[0], channelSlice, prevTensorDims.d[2], prevTensorDims.d[3]}};
     nvinfer1::Dims stride = {4, {1, 1, 1, 1}};
 
-    output = sliceLayer(layerIdx, name, output, start, size, stride, network, batchSize);
+    output = sliceLayer(layerIdx, name, output, start, size, stride, network);
     assert(output != nullptr);
   }
 
```
**route_layer.h**

```diff
@@ -11,6 +11,6 @@
 #include "slice_layer.h"
 
 nvinfer1::ITensor* routeLayer(int layerIdx, std::string& layers, std::map<std::string, std::string>& block,
-    std::vector<nvinfer1::ITensor*> tensorOutputs, nvinfer1::INetworkDefinition* network, uint batchSize);
+    std::vector<nvinfer1::ITensor*> tensorOutputs, nvinfer1::INetworkDefinition* network);
 
 #endif
```
**shortcut_layer.cpp**

```diff
@@ -10,7 +10,7 @@
 nvinfer1::ITensor*
 shortcutLayer(int layerIdx, std::string activation, std::string inputVol, std::string shortcutVol,
     std::map<std::string, std::string>& block, nvinfer1::ITensor* input, nvinfer1::ITensor* shortcutInput,
-    nvinfer1::INetworkDefinition* network, uint batchSize)
+    nvinfer1::INetworkDefinition* network)
 {
   nvinfer1::ITensor* output;
 
@@ -20,15 +20,17 @@ shortcutLayer(int layerIdx, std::string activation, std::string inputVol, std::s
     std::string name = "slice";
     nvinfer1::Dims start = {4, {0, 0, 0, 0}};
     nvinfer1::Dims size = input->getDimensions();
-    nvinfer1::Dims stride = nvinfer1::Dims{4, {1, 1, 1, 1}};
+    nvinfer1::Dims stride = {4, {1, 1, 1, 1}};
 
-    output = sliceLayer(layerIdx, name, shortcutInput, start, size, stride, network, batchSize);
+    output = sliceLayer(layerIdx, name, shortcutInput, start, size, stride, network);
     assert(output != nullptr);
   }
-  else
+  else {
     output = shortcutInput;
+  }
 
-  nvinfer1::IElementWiseLayer* shortcut = network->addElementWise(*input, *output, nvinfer1::ElementWiseOperation::kSUM);
+  nvinfer1::IElementWiseLayer* shortcut = network->addElementWise(*input, *output,
+      nvinfer1::ElementWiseOperation::kSUM);
   assert(shortcut != nullptr);
   std::string shortcutLayerName = "shortcut_" + std::to_string(layerIdx);
   shortcut->setName(shortcutLayerName.c_str());
```
**shortcut_layer.h**

```diff
@@ -15,6 +15,6 @@
 
 nvinfer1::ITensor* shortcutLayer(int layerIdx, std::string activation, std::string inputVol, std::string shortcutVol,
     std::map<std::string, std::string>& block, nvinfer1::ITensor* input, nvinfer1::ITensor* shortcut,
-    nvinfer1::INetworkDefinition* network, uint batchSize);
+    nvinfer1::INetworkDefinition* network);
 
 #endif
```
**slice_layer.cpp**

```diff
@@ -9,58 +9,72 @@
 
 nvinfer1::ITensor*
 sliceLayer(int layerIdx, std::string& name, nvinfer1::ITensor* input, nvinfer1::Dims start, nvinfer1::Dims size,
-    nvinfer1::Dims stride, nvinfer1::INetworkDefinition* network, uint batchSize)
+    nvinfer1::Dims stride, nvinfer1::INetworkDefinition* network)
 {
   nvinfer1::ITensor* output;
 
-  int tensorBatch = input->getDimensions().d[0];
+  nvinfer1::ISliceLayer* slice;
 
-  nvinfer1::ISliceLayer* slice = network->addSlice(*input, start, size, stride);
+  nvinfer1::Dims inputDims = input->getDimensions();
 
-  if (tensorBatch == -1) {
+  if (inputDims.d[0] == -1) {
+    slice = network->addSlice(*input, start, nvinfer1::Dims{}, stride);
+    assert(slice != nullptr);
+
     int nbDims = size.nbDims;
 
-    nvinfer1::Weights constant1Wt {nvinfer1::DataType::kINT32, nullptr, nbDims};
-
-    int* val1 = new int[nbDims];
-    val1[0] = 1;
-    for (int i = 1; i < nbDims; ++i) {
-      val1[i] = size.d[i];
-    }
-    constant1Wt.values = val1;
-
-    nvinfer1::IConstantLayer* constant1 = network->addConstant(nvinfer1::Dims{1, {nbDims}}, constant1Wt);
-    assert(constant1 != nullptr);
-    std::string constant1LayerName = "constant1_" + name + "_" + std::to_string(layerIdx);
-    constant1->setName(constant1LayerName.c_str());
-    nvinfer1::ITensor* constant1Tensor = constant1->getOutput(0);
-
-    nvinfer1::Weights constant2Wt {nvinfer1::DataType::kINT32, nullptr, nbDims};
-
-    int* val2 = new int[nbDims];
-    val2[0] = batchSize;
-    for (int i = 1; i < nbDims; ++i) {
-      val2[i] = 1;
-    }
-    constant2Wt.values = val2;
-
-    nvinfer1::IConstantLayer* constant2 = network->addConstant(nvinfer1::Dims{1, {nbDims}}, constant2Wt);
-    assert(constant2 != nullptr);
-    std::string constant2LayerName = "constant2_" + name + "_" + std::to_string(layerIdx);
-    constant2->setName(constant2LayerName.c_str());
-    nvinfer1::ITensor* constant2Tensor = constant2->getOutput(0);
-
-    nvinfer1::IElementWiseLayer* newSize = network->addElementWise(*constant1Tensor, *constant2Tensor,
-        nvinfer1::ElementWiseOperation::kPROD);
-    assert(newSize != nullptr);
-    std::string newSizeLayerName = "new_size_" + name + "_" + std::to_string(layerIdx);
-    newSize->setName(newSizeLayerName.c_str());
-    nvinfer1::ITensor* newSizeTensor = newSize->getOutput(0);
-
-    slice->setInput(2, *newSizeTensor);
+    nvinfer1::IShapeLayer* shape = network->addShape(*input);
+    assert(shape != nullptr);
+    std::string shapeLayerName = "shape_" + name + "_" + std::to_string(layerIdx);
+    shape->setName(shapeLayerName.c_str());
+    nvinfer1::ITensor* shapeTensor = shape->getOutput(0);
+    assert(shapeTensor != nullptr);
+
+#if NV_TENSORRT_MAJOR >= 10
+    nvinfer1::ICastLayer* castShape = network->addCast(*shapeTensor, nvinfer1::DataType::kINT32);
+    assert(castShape != nullptr);
+    std::string castShapeLayerName = "cast_shape_" + name + "_" + std::to_string(layerIdx);
+    castShape->setName(castShapeLayerName.c_str());
+    nvinfer1::ITensor* castShapeTensor = castShape->getOutput(0);
+    assert(castShapeTensor != nullptr);
+    shapeTensor = castShapeTensor;
+#endif
+
+    nvinfer1::Weights constantWt {nvinfer1::DataType::kINT32, nullptr, nbDims};
+
+    int* val = new int[nbDims];
+    for (int i = 0; i < nbDims; ++i) {
+      if (inputDims.d[i] == size.d[i]) {
+        val[i] = 0;
+      }
+      else {
+        val[i] = inputDims.d[i] - size.d[i];
+      }
+    }
+    constantWt.values = val;
+
+    nvinfer1::IConstantLayer* constant = network->addConstant(nvinfer1::Dims{1, {nbDims}}, constantWt);
+    assert(constant != nullptr);
+    std::string constantLayerName = "constant_" + name + "_" + std::to_string(layerIdx);
+    constant->setName(constantLayerName.c_str());
+    nvinfer1::ITensor* constantTensor = constant->getOutput(0);
+    assert(constantTensor != nullptr);
+
+    nvinfer1::IElementWiseLayer* divide = network->addElementWise(*shapeTensor, *constantTensor,
+        nvinfer1::ElementWiseOperation::kSUB);
+    assert(divide != nullptr);
+    std::string divideLayerName = "divide_" + name + "_" + std::to_string(layerIdx);
+    divide->setName(divideLayerName.c_str());
+    nvinfer1::ITensor* divideTensor = divide->getOutput(0);
+    assert(divideTensor != nullptr);
+
+    slice->setInput(2, *divideTensor);
+  }
+  else {
+    slice = network->addSlice(*input, start, size, stride);
+    assert(slice != nullptr);
   }
 
-  assert(slice != nullptr);
   std::string sliceLayerName = name + "_" + std::to_string(layerIdx);
   slice->setName(sliceLayerName.c_str());
   output = slice->getOutput(0);
```
**slice_layer.h**

```diff
@@ -11,6 +11,6 @@
 #include "NvInfer.h"
 
 nvinfer1::ITensor* sliceLayer(int layerIdx, std::string& name, nvinfer1::ITensor* input, nvinfer1::Dims start,
-    nvinfer1::Dims size, nvinfer1::Dims stride, nvinfer1::INetworkDefinition* network, uint batchSize);
+    nvinfer1::Dims size, nvinfer1::Dims stride, nvinfer1::INetworkDefinition* network);
 
 #endif
```
**upsample_layer.cpp**

```diff
@@ -24,7 +24,13 @@ upsampleLayer(int layerIdx, std::map<std::string, std::string>& block, nvinfer1:
   assert(resize != nullptr);
   std::string resizeLayerName = "upsample_" + std::to_string(layerIdx);
   resize->setName(resizeLayerName.c_str());
 
+#if NV_TENSORRT_MAJOR > 8 || (NV_TENSORRT_MAJOR == 8 && NV_TENSORRT_MINOR > 4)
+  resize->setResizeMode(nvinfer1::InterpolationMode::kNEAREST);
+#else
   resize->setResizeMode(nvinfer1::ResizeMode::kNEAREST);
+#endif
 
   resize->setScales(scale, 4);
   output = resize->getOutput(0);
 
```
**nvdsinfer_yolo_engine.cpp**

```diff
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2018-2023, NVIDIA CORPORATION. All rights reserved.
+ * Copyright (c) 2018-2024, NVIDIA CORPORATION. All rights reserved.
  *
  * Permission is hereby granted, free of charge, to any person obtaining a
  * copy of this software and associated documentation files (the "Software"),
@@ -35,14 +35,14 @@
 static bool
 getYoloNetworkInfo(NetworkInfo& networkInfo, const NvDsInferContextInitParams* initParams)
 {
-  std::string onnxWtsFilePath = initParams->onnxFilePath;
-  std::string darknetWtsFilePath = initParams->modelFilePath;
-  std::string darknetCfgFilePath = initParams->customNetworkConfigFilePath;
+  std::string onnxFilePath = initParams->onnxFilePath;
+  std::string wtsFilePath = initParams->modelFilePath;
+  std::string cfgFilePath = initParams->customNetworkConfigFilePath;
 
-  std::string yoloType = onnxWtsFilePath != "" ? "onnx" : "darknet";
+  std::string yoloType = onnxFilePath != "" ? "onnx" : "darknet";
   std::string modelName = yoloType == "onnx" ?
-      onnxWtsFilePath.substr(0, onnxWtsFilePath.find(".onnx")).substr(onnxWtsFilePath.rfind("/") + 1) :
-      darknetWtsFilePath.substr(0, darknetWtsFilePath.find(".weights")).substr(darknetWtsFilePath.rfind("/") + 1);
+      onnxFilePath.substr(0, onnxFilePath.find(".onnx")).substr(onnxFilePath.rfind("/") + 1) :
+      cfgFilePath.substr(0, cfgFilePath.find(".cfg")).substr(cfgFilePath.rfind("/") + 1);
 
   std::transform(modelName.begin(), modelName.end(), modelName.begin(), [] (uint8_t c) {
     return std::tolower(c);
@@ -51,9 +51,9 @@ getYoloNetworkInfo(NetworkInfo& networkInfo, const NvDsInferContextInitParams* i
   networkInfo.inputBlobName = "input";
   networkInfo.networkType = yoloType;
   networkInfo.modelName = modelName;
-  networkInfo.onnxWtsFilePath = onnxWtsFilePath;
-  networkInfo.darknetWtsFilePath = darknetWtsFilePath;
-  networkInfo.darknetCfgFilePath = darknetCfgFilePath;
+  networkInfo.onnxFilePath = onnxFilePath;
+  networkInfo.wtsFilePath = wtsFilePath;
+  networkInfo.cfgFilePath = cfgFilePath;
   networkInfo.batchSize = initParams->maxBatchSize;
   networkInfo.implicitBatch = initParams->forceImplicitBatchDimension;
   networkInfo.int8CalibPath = initParams->int8CalibrationFilePath;
@@ -63,26 +63,30 @@ getYoloNetworkInfo(NetworkInfo& networkInfo, const NvDsInferContextInitParams* i
   networkInfo.scaleFactor = initParams->networkScaleFactor;
   networkInfo.offsets = initParams->offsets;
   networkInfo.workspaceSize = initParams->workspaceSize;
+  networkInfo.inputFormat = initParams->networkInputFormat;
 
-  if (initParams->networkMode == NvDsInferNetworkMode_FP32)
+  if (initParams->networkMode == NvDsInferNetworkMode_FP32) {
     networkInfo.networkMode = "FP32";
-  else if (initParams->networkMode == NvDsInferNetworkMode_INT8)
+  }
+  else if (initParams->networkMode == NvDsInferNetworkMode_INT8) {
     networkInfo.networkMode = "INT8";
-  else if (initParams->networkMode == NvDsInferNetworkMode_FP16)
+  }
+  else if (initParams->networkMode == NvDsInferNetworkMode_FP16) {
     networkInfo.networkMode = "FP16";
+  }
 
   if (yoloType == "onnx") {
-    if (!fileExists(networkInfo.onnxWtsFilePath)) {
-      std::cerr << "ONNX model file does not exist\n" << std::endl;
+    if (!fileExists(networkInfo.onnxFilePath)) {
+      std::cerr << "ONNX file does not exist\n" << std::endl;
       return false;
     }
   }
   else {
-    if (!fileExists(networkInfo.darknetWtsFilePath)) {
+    if (!fileExists(networkInfo.wtsFilePath)) {
       std::cerr << "Darknet weights file does not exist\n" << std::endl;
       return false;
     }
-    else if (!fileExists(networkInfo.darknetCfgFilePath)) {
+    else if (!fileExists(networkInfo.cfgFilePath)) {
      std::cerr << "Darknet cfg file does not exist\n" << std::endl;
       return false;
     }
@@ -106,7 +110,8 @@ NvDsInferCreateModelParser(const NvDsInferContextInitParams* initParams)
 #if NV_TENSORRT_MAJOR >= 8
 extern "C" bool
 NvDsInferYoloCudaEngineGet(nvinfer1::IBuilder* const builder, nvinfer1::IBuilderConfig* const builderConfig,
-    const NvDsInferContextInitParams* const initParams, nvinfer1::DataType dataType, nvinfer1::ICudaEngine*& cudaEngine);
+    const NvDsInferContextInitParams* const initParams, nvinfer1::DataType dataType,
```
|
||||||
|
nvinfer1::ICudaEngine*& cudaEngine);
|
||||||
|
|
||||||
extern "C" bool
|
extern "C" bool
|
||||||
NvDsInferYoloCudaEngineGet(nvinfer1::IBuilder* const builder, nvinfer1::IBuilderConfig* const builderConfig,
|
NvDsInferYoloCudaEngineGet(nvinfer1::IBuilder* const builder, nvinfer1::IBuilderConfig* const builderConfig,
|
||||||
|
|||||||
@@ -1,38 +0,0 @@
-/*
- * Copyright (c) 2018-2023, NVIDIA CORPORATION. All rights reserved.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
- * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
- * DEALINGS IN THE SOFTWARE.
- *
- * Edited by Marcos Luciano
- * https://www.github.com/marcoslucianops
- */
-
-#include "nvdsinfer_custom_impl.h"
-
-bool
-NvDsInferInitializeInputLayers(std::vector<NvDsInferLayerInfo> const& inputLayersInfo,
-    NvDsInferNetworkInfo const& networkInfo, unsigned int maxBatchSize)
-{
-  float* scaleFactor = (float*) inputLayersInfo[0].buffer;
-  for (unsigned int i = 0; i < maxBatchSize; i++) {
-    scaleFactor[i * 2 + 0] = 1.0;
-    scaleFactor[i * 2 + 1] = 1.0;
-  }
-  return true;
-}
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2018-2023, NVIDIA CORPORATION. All rights reserved.
+ * Copyright (c) 2018-2024, NVIDIA CORPORATION. All rights reserved.
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
@@ -31,10 +31,6 @@ extern "C" bool
 NvDsInferParseYolo(std::vector<NvDsInferLayerInfo> const& outputLayersInfo, NvDsInferNetworkInfo const& networkInfo,
     NvDsInferParseDetectionParams const& detectionParams, std::vector<NvDsInferParseObjectInfo>& objectList);

-extern "C" bool
-NvDsInferParseYoloE(std::vector<NvDsInferLayerInfo> const& outputLayersInfo, NvDsInferNetworkInfo const& networkInfo,
-    NvDsInferParseDetectionParams const& detectionParams, std::vector<NvDsInferParseObjectInfo>& objectList);
-
 static NvDsInferParseObjectInfo
 convertBBox(const float& bx1, const float& by1, const float& bx2, const float& by2, const uint& netW, const uint& netH)
 {
@@ -65,7 +61,7 @@ addBBoxProposal(const float bx1, const float by1, const float bx2, const float b
   NvDsInferParseObjectInfo bbi = convertBBox(bx1, by1, bx2, by2, netW, netH);

   if (bbi.width < 1 || bbi.height < 1) {
     return;
   }

   bbi.detectionConfidence = maxProb;
@@ -74,53 +70,23 @@ addBBoxProposal(const float bx1, const float by1, const float bx2, const float b
 }

 static std::vector<NvDsInferParseObjectInfo>
-decodeTensorYolo(const float* boxes, const float* scores, const float* classes, const uint& outputSize, const uint& netW,
-    const uint& netH, const std::vector<float>& preclusterThreshold)
+decodeTensorYolo(const float* output, const uint& outputSize, const uint& netW, const uint& netH,
+    const std::vector<float>& preclusterThreshold)
 {
   std::vector<NvDsInferParseObjectInfo> binfo;

   for (uint b = 0; b < outputSize; ++b) {
-    float maxProb = scores[b];
-    int maxIndex = (int) classes[b];
+    float maxProb = output[b * 6 + 4];
+    int maxIndex = (int) output[b * 6 + 5];

     if (maxProb < preclusterThreshold[maxIndex]) {
       continue;
     }

-    float bxc = boxes[b * 4 + 0];
-    float byc = boxes[b * 4 + 1];
-    float bw = boxes[b * 4 + 2];
-    float bh = boxes[b * 4 + 3];
-
-    float bx1 = bxc - bw / 2;
-    float by1 = byc - bh / 2;
-    float bx2 = bx1 + bw;
-    float by2 = by1 + bh;
-
-    addBBoxProposal(bx1, by1, bx2, by2, netW, netH, maxIndex, maxProb, binfo);
-  }
-
-  return binfo;
-}
-
-static std::vector<NvDsInferParseObjectInfo>
-decodeTensorYoloE(const float* boxes, const float* scores, const float* classes, const uint& outputSize, const uint& netW,
-    const uint& netH, const std::vector<float>& preclusterThreshold)
-{
-  std::vector<NvDsInferParseObjectInfo> binfo;
-
-  for (uint b = 0; b < outputSize; ++b) {
-    float maxProb = scores[b];
-    int maxIndex = (int) classes[b];
-
-    if (maxProb < preclusterThreshold[maxIndex]) {
-      continue;
-    }
-
-    float bx1 = boxes[b * 4 + 0];
-    float by1 = boxes[b * 4 + 1];
-    float bx2 = boxes[b * 4 + 2];
-    float by2 = boxes[b * 4 + 3];
+    float bx1 = output[b * 6 + 0];
+    float by1 = output[b * 6 + 1];
+    float bx2 = output[b * 6 + 2];
+    float by2 = output[b * 6 + 3];

     addBBoxProposal(bx1, by1, bx2, by2, netW, netH, maxIndex, maxProb, binfo);
   }
@@ -129,8 +95,9 @@ decodeTensorYoloE(const float* boxes, const float* scores, const float* classes,
 }

 static bool
-NvDsInferParseCustomYolo(std::vector<NvDsInferLayerInfo> const& outputLayersInfo, NvDsInferNetworkInfo const& networkInfo,
-    NvDsInferParseDetectionParams const& detectionParams, std::vector<NvDsInferParseObjectInfo>& objectList)
+NvDsInferParseCustomYolo(std::vector<NvDsInferLayerInfo> const& outputLayersInfo,
+    NvDsInferNetworkInfo const& networkInfo, NvDsInferParseDetectionParams const& detectionParams,
+    std::vector<NvDsInferParseObjectInfo>& objectList)
 {
   if (outputLayersInfo.empty()) {
     std::cerr << "ERROR: Could not find output layer in bbox parsing" << std::endl;
@@ -139,43 +106,11 @@ NvDsInferParseCustomYolo(std::vector<NvDsInferLayerInfo> const& outputLayersInfo

   std::vector<NvDsInferParseObjectInfo> objects;

-  const NvDsInferLayerInfo& boxes = outputLayersInfo[0];
-  const NvDsInferLayerInfo& scores = outputLayersInfo[1];
-  const NvDsInferLayerInfo& classes = outputLayersInfo[2];
-
-  const uint outputSize = boxes.inferDims.d[0];
-
-  std::vector<NvDsInferParseObjectInfo> outObjs = decodeTensorYolo((const float*) (boxes.buffer),
-      (const float*) (scores.buffer), (const float*) (classes.buffer), outputSize, networkInfo.width, networkInfo.height,
-      detectionParams.perClassPreclusterThreshold);
-
-  objects.insert(objects.end(), outObjs.begin(), outObjs.end());
-
-  objectList = objects;
-
-  return true;
-}
-
-static bool
-NvDsInferParseCustomYoloE(std::vector<NvDsInferLayerInfo> const& outputLayersInfo, NvDsInferNetworkInfo const& networkInfo,
-    NvDsInferParseDetectionParams const& detectionParams, std::vector<NvDsInferParseObjectInfo>& objectList)
-{
-  if (outputLayersInfo.empty()) {
-    std::cerr << "ERROR: Could not find output layer in bbox parsing" << std::endl;
-    return false;
-  }
-
-  std::vector<NvDsInferParseObjectInfo> objects;
-
-  const NvDsInferLayerInfo& boxes = outputLayersInfo[0];
-  const NvDsInferLayerInfo& scores = outputLayersInfo[1];
-  const NvDsInferLayerInfo& classes = outputLayersInfo[2];
-
-  const uint outputSize = boxes.inferDims.d[0];
-
-  std::vector<NvDsInferParseObjectInfo> outObjs = decodeTensorYoloE((const float*) (boxes.buffer),
-      (const float*) (scores.buffer), (const float*) (classes.buffer), outputSize, networkInfo.width, networkInfo.height,
-      detectionParams.perClassPreclusterThreshold);
+  const NvDsInferLayerInfo& output = outputLayersInfo[0];
+  const uint outputSize = output.inferDims.d[0];
+
+  std::vector<NvDsInferParseObjectInfo> outObjs = decodeTensorYolo((const float*) (output.buffer), outputSize,
+      networkInfo.width, networkInfo.height, detectionParams.perClassPreclusterThreshold);

   objects.insert(objects.end(), outObjs.begin(), outObjs.end());

@@ -191,12 +126,4 @@ NvDsInferParseYolo(std::vector<NvDsInferLayerInfo> const& outputLayersInfo, NvDs
   return NvDsInferParseCustomYolo(outputLayersInfo, networkInfo, detectionParams, objectList);
 }

-extern "C" bool
-NvDsInferParseYoloE(std::vector<NvDsInferLayerInfo> const& outputLayersInfo, NvDsInferNetworkInfo const& networkInfo,
-    NvDsInferParseDetectionParams const& detectionParams, std::vector<NvDsInferParseObjectInfo>& objectList)
-{
-  return NvDsInferParseCustomYoloE(outputLayersInfo, networkInfo, detectionParams, objectList);
-}
-
 CHECK_CUSTOM_PARSE_FUNC_PROTOTYPE(NvDsInferParseYolo);
-CHECK_CUSTOM_PARSE_FUNC_PROTOTYPE(NvDsInferParseYoloE);
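The rewritten parser above switches from three separate output tensors (`boxes`, `scores`, `classes`) to a single flattened tensor with six values per detection, read as `x1, y1, x2, y2, score, class` at stride 6. A minimal standalone sketch of that decode, for readers comparing the two layouts (the `Detection` struct and `decodeFlat` name are illustrative, not part of the repository):

```cpp
#include <cassert>
#include <vector>

// Hypothetical record mirroring the fields the parser fills per detection.
struct Detection {
  float left, top, width, height, confidence;
  int classId;
};

// Decode a flattened [N, 6] tensor laid out as x1, y1, x2, y2, score, class.
// Rows below the confidence threshold are dropped before clustering.
std::vector<Detection> decodeFlat(const float* output, unsigned n, float threshold) {
  std::vector<Detection> out;
  for (unsigned b = 0; b < n; ++b) {
    float score = output[b * 6 + 4];
    if (score < threshold)
      continue;
    float x1 = output[b * 6 + 0], y1 = output[b * 6 + 1];
    float x2 = output[b * 6 + 2], y2 = output[b * 6 + 3];
    // Corners arrive already in x1/y1/x2/y2 form, so no center-to-corner
    // conversion is needed (the old boxes tensor carried xc, yc, w, h).
    out.push_back({x1, y1, x2 - x1, y2 - y1, score, (int) output[b * 6 + 5]});
  }
  return out;
}
```

Note the repository's real parser additionally clamps boxes to the network dimensions via `convertBBox`/`addBBoxProposal` and thresholds per class rather than with a single value.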
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2018-2023, NVIDIA CORPORATION. All rights reserved.
+ * Copyright (c) 2018-2024, NVIDIA CORPORATION. All rights reserved.
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
@@ -23,7 +23,6 @@
 * https://www.github.com/marcoslucianops
 */

-#include <algorithm>
 #include <thrust/host_vector.h>
 #include <thrust/device_vector.h>

@@ -33,12 +32,8 @@ extern "C" bool
 NvDsInferParseYoloCuda(std::vector<NvDsInferLayerInfo> const& outputLayersInfo, NvDsInferNetworkInfo const& networkInfo,
     NvDsInferParseDetectionParams const& detectionParams, std::vector<NvDsInferParseObjectInfo>& objectList);

-extern "C" bool
-NvDsInferParseYoloECuda(std::vector<NvDsInferLayerInfo> const& outputLayersInfo, NvDsInferNetworkInfo const& networkInfo,
-    NvDsInferParseDetectionParams const& detectionParams, std::vector<NvDsInferParseObjectInfo>& objectList);
-
-__global__ void decodeTensorYoloCuda(NvDsInferParseObjectInfo *binfo, float* boxes, float* scores, float* classes,
-    int outputSize, int netW, int netH, float minPreclusterThreshold)
+__global__ void decodeTensorYoloCuda(NvDsInferParseObjectInfo *binfo, const float* output, const uint outputSize,
+    const uint netW, const uint netH, const float* preclusterThreshold)
 {
   int x_id = blockIdx.x * blockDim.x + threadIdx.x;

@@ -46,68 +41,28 @@ __global__ void decodeTensorYoloCuda(NvDsInferParseObjectInfo *binfo, float* box
     return;
   }

-  float maxProb = scores[x_id];
-  int maxIndex = (int) classes[x_id];
+  float maxProb = output[x_id * 6 + 4];
+  int maxIndex = (int) output[x_id * 6 + 5];

-  if (maxProb < minPreclusterThreshold) {
+  if (maxProb < preclusterThreshold[maxIndex]) {
     binfo[x_id].detectionConfidence = 0.0;
     return;
   }

-  float bxc = boxes[x_id * 4 + 0];
-  float byc = boxes[x_id * 4 + 1];
-  float bw = boxes[x_id * 4 + 2];
-  float bh = boxes[x_id * 4 + 3];
-
-  float x0 = bxc - bw / 2;
-  float y0 = byc - bh / 2;
-  float x1 = x0 + bw;
-  float y1 = y0 + bh;
-
-  x0 = fminf(float(netW), fmaxf(float(0.0), x0));
-  y0 = fminf(float(netH), fmaxf(float(0.0), y0));
-  x1 = fminf(float(netW), fmaxf(float(0.0), x1));
-  y1 = fminf(float(netH), fmaxf(float(0.0), y1));
-
-  binfo[x_id].left = x0;
-  binfo[x_id].top = y0;
-  binfo[x_id].width = fminf(float(netW), fmaxf(float(0.0), x1 - x0));
-  binfo[x_id].height = fminf(float(netH), fmaxf(float(0.0), y1 - y0));
-  binfo[x_id].detectionConfidence = maxProb;
-  binfo[x_id].classId = maxIndex;
-}
-
-__global__ void decodeTensorYoloECuda(NvDsInferParseObjectInfo *binfo, float* boxes, float* scores, float* classes,
-    int outputSize, int netW, int netH, float minPreclusterThreshold)
-{
-  int x_id = blockIdx.x * blockDim.x + threadIdx.x;
-
-  if (x_id >= outputSize) {
-    return;
-  }
-
-  float maxProb = scores[x_id];
-  int maxIndex = (int) classes[x_id];
-
-  if (maxProb < minPreclusterThreshold) {
-    binfo[x_id].detectionConfidence = 0.0;
-    return;
-  }
-
-  float x0 = boxes[x_id * 4 + 0];
-  float y0 = boxes[x_id * 4 + 1];
-  float x1 = boxes[x_id * 4 + 2];
-  float y1 = boxes[x_id * 4 + 3];
-
-  x0 = fminf(float(netW), fmaxf(float(0.0), x0));
-  y0 = fminf(float(netH), fmaxf(float(0.0), y0));
-  x1 = fminf(float(netW), fmaxf(float(0.0), x1));
-  y1 = fminf(float(netH), fmaxf(float(0.0), y1));
-
-  binfo[x_id].left = x0;
-  binfo[x_id].top = y0;
-  binfo[x_id].width = fminf(float(netW), fmaxf(float(0.0), x1 - x0));
-  binfo[x_id].height = fminf(float(netH), fmaxf(float(0.0), y1 - y0));
+  float bx1 = output[x_id * 6 + 0];
+  float by1 = output[x_id * 6 + 1];
+  float bx2 = output[x_id * 6 + 2];
+  float by2 = output[x_id * 6 + 3];
+
+  bx1 = fminf(float(netW), fmaxf(float(0.0), bx1));
+  by1 = fminf(float(netH), fmaxf(float(0.0), by1));
+  bx2 = fminf(float(netW), fmaxf(float(0.0), bx2));
+  by2 = fminf(float(netH), fmaxf(float(0.0), by2));
+
+  binfo[x_id].left = bx1;
+  binfo[x_id].top = by1;
+  binfo[x_id].width = fminf(float(netW), fmaxf(float(0.0), bx2 - bx1));
+  binfo[x_id].height = fminf(float(netH), fmaxf(float(0.0), by2 - by1));
   binfo[x_id].detectionConfidence = maxProb;
   binfo[x_id].classId = maxIndex;
 }
@@ -121,56 +76,19 @@ static bool NvDsInferParseCustomYoloCuda(std::vector<NvDsInferLayerInfo> const&
     return false;
   }

-  const NvDsInferLayerInfo& boxes = outputLayersInfo[0];
-  const NvDsInferLayerInfo& scores = outputLayersInfo[1];
-  const NvDsInferLayerInfo& classes = outputLayersInfo[2];
-
-  const int outputSize = boxes.inferDims.d[0];
+  const NvDsInferLayerInfo& output = outputLayersInfo[0];
+  const uint outputSize = output.inferDims.d[0];

+  thrust::device_vector<float> perClassPreclusterThreshold = detectionParams.perClassPreclusterThreshold;
   thrust::device_vector<NvDsInferParseObjectInfo> objects(outputSize);

-  float minPreclusterThreshold = *(std::min_element(detectionParams.perClassPreclusterThreshold.begin(),
-      detectionParams.perClassPreclusterThreshold.end()));
-
   int threads_per_block = 1024;
-  int number_of_blocks = ((outputSize - 1) / threads_per_block) + 1;
+  int number_of_blocks = ((outputSize) / threads_per_block) + 1;

   decodeTensorYoloCuda<<<number_of_blocks, threads_per_block>>>(
-      thrust::raw_pointer_cast(objects.data()), (float*) (boxes.buffer), (float*) (scores.buffer),
-      (float*) (classes.buffer), outputSize, networkInfo.width, networkInfo.height, minPreclusterThreshold);
+      thrust::raw_pointer_cast(objects.data()), (float*) (output.buffer), outputSize, networkInfo.width,
+      networkInfo.height, thrust::raw_pointer_cast(perClassPreclusterThreshold.data()));

-  objectList.resize(outputSize);
-  thrust::copy(objects.begin(), objects.end(), objectList.begin());
-
-  return true;
-}
-
-static bool NvDsInferParseCustomYoloECuda(std::vector<NvDsInferLayerInfo> const& outputLayersInfo,
-    NvDsInferNetworkInfo const& networkInfo, NvDsInferParseDetectionParams const& detectionParams,
-    std::vector<NvDsInferParseObjectInfo>& objectList)
-{
-  if (outputLayersInfo.empty()) {
-    std::cerr << "ERROR: Could not find output layer in bbox parsing" << std::endl;
-    return false;
-  }
-
-  const NvDsInferLayerInfo& boxes = outputLayersInfo[0];
-  const NvDsInferLayerInfo& scores = outputLayersInfo[1];
-  const NvDsInferLayerInfo& classes = outputLayersInfo[2];
-
-  const int outputSize = boxes.inferDims.d[0];
-
-  thrust::device_vector<NvDsInferParseObjectInfo> objects(outputSize);
-
-  float minPreclusterThreshold = *(std::min_element(detectionParams.perClassPreclusterThreshold.begin(),
-      detectionParams.perClassPreclusterThreshold.end()));
-
-  int threads_per_block = 1024;
-  int number_of_blocks = ((outputSize - 1) / threads_per_block) + 1;
-
-  decodeTensorYoloECuda<<<number_of_blocks, threads_per_block>>>(
-      thrust::raw_pointer_cast(objects.data()), (float*) (boxes.buffer), (float*) (scores.buffer),
-      (float*) (classes.buffer), outputSize, networkInfo.width, networkInfo.height, minPreclusterThreshold);
-
   objectList.resize(outputSize);
   thrust::copy(objects.begin(), objects.end(), objectList.begin());
@@ -185,12 +103,4 @@ NvDsInferParseYoloCuda(std::vector<NvDsInferLayerInfo> const& outputLayersInfo,
   return NvDsInferParseCustomYoloCuda(outputLayersInfo, networkInfo, detectionParams, objectList);
 }

-extern "C" bool
-NvDsInferParseYoloECuda(std::vector<NvDsInferLayerInfo> const& outputLayersInfo, NvDsInferNetworkInfo const& networkInfo,
-    NvDsInferParseDetectionParams const& detectionParams, std::vector<NvDsInferParseObjectInfo>& objectList)
-{
-  return NvDsInferParseCustomYoloECuda(outputLayersInfo, networkInfo, detectionParams, objectList);
-}
-
 CHECK_CUSTOM_PARSE_FUNC_PROTOTYPE(NvDsInferParseYoloCuda);
-CHECK_CUSTOM_PARSE_FUNC_PROTOTYPE(NvDsInferParseYoloECuda);
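The CUDA host code above sizes a one-dimensional grid with integer division; the patched form `(outputSize / threads_per_block) + 1` allocates one spare block when `outputSize` is an exact multiple, which is harmless because the kernel's `x_id >= outputSize` check discards excess threads. The conventional exact ceiling-divide, shown here as a standalone sketch (`blocksFor` is an illustrative name, not a function in the repository):

```cpp
#include <cassert>

// Ceiling-divide helper for sizing a 1-D CUDA grid: the smallest block count
// such that blocks * threadsPerBlock >= n. The kernel's own bounds check
// then discards any threads past the last element.
int blocksFor(int n, int threadsPerBlock) {
  return (n + threadsPerBlock - 1) / threadsPerBlock;
}
```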
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2018-2023, NVIDIA CORPORATION. All rights reserved.
+ * Copyright (c) 2018-2024, NVIDIA CORPORATION. All rights reserved.
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
@@ -69,7 +69,7 @@ fileExists(const std::string fileName, bool verbose)
 }

 std::vector<float>
-loadWeights(const std::string weightsFilePath, const std::string& modelName)
+loadWeights(const std::string weightsFilePath)
 {
   assert(fileExists(weightsFilePath));
   std::cout << "\nLoading pre-trained weights" << std::endl;
@@ -81,13 +81,14 @@ loadWeights(const std::string weightsFilePath, const std::string& modelName)
   assert(file.good());
   std::string line;

-  if (modelName.find("yolov2") != std::string::npos && modelName.find("yolov2-tiny") == std::string::npos) {
-    // Remove 4 int32 bytes of data from the stream belonging to the header
-    file.ignore(4 * 4);
+  if (weightsFilePath.find("yolov2") != std::string::npos &&
+      weightsFilePath.find("yolov2-tiny") == std::string::npos) {
+    // Remove 4 int32 bytes of data from the stream belonging to the header
+    file.ignore(4 * 4);
   }
   else {
     // Remove 5 int32 bytes of data from the stream belonging to the header
     file.ignore(4 * 5);
   }

   char floatWeight[4];
@@ -105,7 +106,7 @@ loadWeights(const std::string weightsFilePath, const std::string& modelName)
     assert(0);
   }

-  std::cout << "Loading weights of " << modelName << " complete" << std::endl;
+  std::cout << "Loading " << weightsFilePath << " complete" << std::endl;
   std::cout << "Total weights read: " << weights.size() << std::endl;

   return weights;
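This change makes `loadWeights` detect the darknet header size from the weights file path itself rather than a separately passed model name. The rule it applies, isolated as a standalone sketch (`darknetHeaderBytes` is an illustrative name; the real code calls `file.ignore(...)` on the stream instead of returning a byte count):

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Number of leading bytes of a darknet .weights file that belong to the
// header, mirroring the patched loadWeights: plain yolov2 files carry
// 4 int32 header values, while yolov2-tiny and every other model carry 5.
size_t darknetHeaderBytes(const std::string& weightsFilePath) {
  bool isYolov2 = weightsFilePath.find("yolov2") != std::string::npos &&
                  weightsFilePath.find("yolov2-tiny") == std::string::npos;
  return isYolov2 ? 4 * sizeof(int32_t) : 5 * sizeof(int32_t);
}
```

The double test is needed because "yolov2-tiny" contains "yolov2" as a substring, so a single `find` would misclassify tiny models.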
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2018-2023, NVIDIA CORPORATION. All rights reserved.
+ * Copyright (c) 2018-2024, NVIDIA CORPORATION. All rights reserved.
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
@@ -41,13 +41,14 @@ float clamp(const float val, const float minVal, const float maxVal);

 bool fileExists(const std::string fileName, bool verbose = true);

-std::vector<float> loadWeights(const std::string weightsFilePath, const std::string& modelName);
+std::vector<float> loadWeights(const std::string weightsFilePath);

 std::string dimsToString(const nvinfer1::Dims d);

 int getNumChannels(nvinfer1::ITensor* t);

 void printLayerInfo(
-    std::string layerIndex, std::string layerName, std::string layerInput, std::string layerOutput, std::string weightPtr);
+    std::string layerIndex, std::string layerName, std::string layerInput, std::string layerOutput,
+    std::string weightPtr);

 #endif
@@ -1,5 +1,5 @@
|
|||||||
/*
|
/*
|
||||||
* Copyright (c) 2018-2023, NVIDIA CORPORATION. All rights reserved.
|
* Copyright (c) 2018-2024, NVIDIA CORPORATION. All rights reserved.
|
||||||
*
|
*
|
||||||
* Permission is hereby granted, free of charge, to any person obtaining a
|
* Permission is hereby granted, free of charge, to any person obtaining a
|
||||||
* copy of this software and associated documentation files (the "Software"),
|
* copy of this software and associated documentation files (the "Software"),
|
||||||
@@ -34,13 +34,14 @@
|
|||||||
|
|
||||||
Yolo::Yolo(const NetworkInfo& networkInfo) : m_InputBlobName(networkInfo.inputBlobName),
|
Yolo::Yolo(const NetworkInfo& networkInfo) : m_InputBlobName(networkInfo.inputBlobName),
|
||||||
m_NetworkType(networkInfo.networkType), m_ModelName(networkInfo.modelName),
|
m_NetworkType(networkInfo.networkType), m_ModelName(networkInfo.modelName),
|
||||||
m_OnnxWtsFilePath(networkInfo.onnxWtsFilePath), m_DarknetWtsFilePath(networkInfo.darknetWtsFilePath),
|
m_OnnxFilePath(networkInfo.onnxFilePath), m_WtsFilePath(networkInfo.wtsFilePath),
|
||||||
m_DarknetCfgFilePath(networkInfo.darknetCfgFilePath), m_BatchSize(networkInfo.batchSize),
|
m_CfgFilePath(networkInfo.cfgFilePath), m_BatchSize(networkInfo.batchSize),
|
||||||
m_ImplicitBatch(networkInfo.implicitBatch), m_Int8CalibPath(networkInfo.int8CalibPath),
|
m_ImplicitBatch(networkInfo.implicitBatch), m_Int8CalibPath(networkInfo.int8CalibPath),
|
||||||
m_DeviceType(networkInfo.deviceType), m_NumDetectedClasses(networkInfo.numDetectedClasses),
|
m_DeviceType(networkInfo.deviceType), m_NumDetectedClasses(networkInfo.numDetectedClasses),
|
||||||
m_ClusterMode(networkInfo.clusterMode), m_NetworkMode(networkInfo.networkMode), m_ScaleFactor(networkInfo.scaleFactor),
|
m_ClusterMode(networkInfo.clusterMode), m_NetworkMode(networkInfo.networkMode),
|
||||||
m_Offsets(networkInfo.offsets), m_WorkspaceSize(networkInfo.workspaceSize), m_InputC(0), m_InputH(0), m_InputW(0),
|
m_ScaleFactor(networkInfo.scaleFactor), m_Offsets(networkInfo.offsets), m_WorkspaceSize(networkInfo.workspaceSize),
|
||||||
m_InputSize(0), m_NumClasses(0), m_LetterBox(0), m_NewCoords(0), m_YoloCount(0)
|
m_InputFormat(networkInfo.inputFormat), m_InputC(0), m_InputH(0), m_InputW(0), m_InputSize(0), m_NumClasses(0),
|
||||||
|
m_LetterBox(0), m_NewCoords(0), m_YoloCount(0)
|
||||||
{
|
{
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -76,14 +77,14 @@ Yolo::createEngine(nvinfer1::IBuilder* builder)
|
|||||||
|
|
||||||
if (m_NetworkType == "onnx") {
|
if (m_NetworkType == "onnx") {
|
||||||
|
|
||||||
#if NV_TENSORRT_MAJOR >= 8 && NV_TENSORRT_MINOR > 0
|
#if NV_TENSORRT_MAJOR > 8 || (NV_TENSORRT_MAJOR == 8 && NV_TENSORRT_MINOR > 0)
|
||||||
parser = nvonnxparser::createParser(*network, *builder->getLogger());
|
parser = nvonnxparser::createParser(*network, *builder->getLogger());
|
||||||
#else
|
#else
|
||||||
parser = nvonnxparser::createParser(*network, logger);
|
parser = nvonnxparser::createParser(*network, logger);
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
if (!parser->parseFromFile(m_OnnxWtsFilePath.c_str(), static_cast<INT>(nvinfer1::ILogger::Severity::kWARNING))) {
|
if (!parser->parseFromFile(m_OnnxFilePath.c_str(), static_cast<INT>(nvinfer1::ILogger::Severity::kWARNING))) {
|
||||||
std::cerr << "\nCould not parse the ONNX model\n" << std::endl;
|
std::cerr << "\nCould not parse the ONNX file\n" << std::endl;
|
||||||
|
|
||||||
#if NV_TENSORRT_MAJOR >= 8
|
#if NV_TENSORRT_MAJOR >= 8
|
||||||
delete parser;
|
delete parser;
|
||||||
@@ -101,7 +102,7 @@ Yolo::createEngine(nvinfer1::IBuilder* builder)
     m_InputW = network->getInput(0)->getDimensions().d[3];
   }
   else {
-    m_ConfigBlocks = parseConfigFile(m_DarknetCfgFilePath);
+    m_ConfigBlocks = parseConfigFile(m_CfgFilePath);
     parseConfigBlocks();
     if (parseModel(*network) != NVDSINFER_SUCCESS) {
 
@@ -138,15 +139,16 @@ Yolo::createEngine(nvinfer1::IBuilder* builder)
   if (m_NetworkType == "darknet") {
     if (m_NumClasses != m_NumDetectedClasses) {
       std::cout << "NOTE: Number of classes mismatch, make sure to set num-detected-classes=" << m_NumClasses
-          << " in config_infer file\n" << std::endl;
+          << " on the config_infer file\n" << std::endl;
     }
     if (m_LetterBox == 1) {
-      std::cout << "NOTE: letter_box is set in cfg file, make sure to set maintain-aspect-ratio=1 in config_infer file"
-          << " to get better accuracy\n" << std::endl;
+      std::cout << "NOTE: letter_box is set in cfg file, make sure to set maintain-aspect-ratio=1 on the " <<
+          "config_infer file to get better accuracy\n" << std::endl;
     }
   }
-  if (m_ClusterMode != 2) {
-    std::cout << "NOTE: Wrong cluster-mode is set, make sure to set cluster-mode=2 in config_infer file\n" << std::endl;
+  if (m_ClusterMode != 2 && m_ClusterMode != 4) {
+    std::cout << "NOTE: Wrong cluster-mode is set, make sure to set cluster-mode=4 (RT-DETR or custom NMS) or " <<
+        "cluster-mode=2 on the config_infer file\n" << std::endl;
   }
 
   if (m_NetworkMode == "FP16") {
@@ -156,9 +158,11 @@ Yolo::createEngine(nvinfer1::IBuilder* builder)
   else if (m_NetworkMode == "INT8") {
     assert(builder->platformHasFastInt8());
     config->setFlag(nvinfer1::BuilderFlag::kINT8);
-    if (m_Int8CalibPath != "" && !fileExists(m_Int8CalibPath)) {
+    if (m_Int8CalibPath != "") {
 
 #ifdef OPENCV
+      fileExists(m_Int8CalibPath);
 
       std::string calib_image_list;
       int calib_batch_size;
       if (getenv("INT8_CALIB_IMG_PATH")) {
@@ -176,25 +180,10 @@ Yolo::createEngine(nvinfer1::IBuilder* builder)
        assert(0);
      }
      nvinfer1::IInt8EntropyCalibrator2* calibrator = new Int8EntropyCalibrator2(calib_batch_size, m_InputC, m_InputH,
-         m_InputW, m_ScaleFactor, m_Offsets, calib_image_list, m_Int8CalibPath);
+         m_InputW, m_ScaleFactor, m_Offsets, m_InputFormat, calib_image_list, m_Int8CalibPath);
      config->setInt8Calibrator(calibrator);
 #else
-     std::cerr << "OpenCV is required to run INT8 calibrator\n" << std::endl;
-
-#if NV_TENSORRT_MAJOR >= 8
-     if (m_NetworkType == "onnx") {
-       delete parser;
-     }
-     delete network;
-#else
-     if (m_NetworkType == "onnx") {
-       parser->destroy();
-     }
-     config->destroy();
-     network->destroy();
-#endif
-
-     return nullptr;
+     assert(0 && "OpenCV is required to run INT8 calibrator\n");
 #endif
 
   }
@@ -204,7 +193,17 @@ Yolo::createEngine(nvinfer1::IBuilder* builder)
   config->setProfilingVerbosity(nvinfer1::ProfilingVerbosity::kDETAILED);
 #endif
 
-  nvinfer1::ICudaEngine* engine = builder->buildEngineWithConfig(*network, *config);
+#if NV_TENSORRT_MAJOR > 8 || (NV_TENSORRT_MAJOR == 8 && NV_TENSORRT_MINOR > 0)
+  nvinfer1::IRuntime* runtime = nvinfer1::createInferRuntime(*builder->getLogger());
+#else
+  nvinfer1::IRuntime* runtime = nvinfer1::createInferRuntime(logger);
+#endif
+
+  assert(runtime);
+
+  nvinfer1::IHostMemory* serializedEngine = builder->buildSerializedNetwork(*network, *config);
+
+  nvinfer1::ICudaEngine* engine = runtime->deserializeCudaEngine(serializedEngine->data(), serializedEngine->size());
   if (engine) {
     std::cout << "Building complete\n" << std::endl;
   }
@@ -212,6 +211,12 @@ Yolo::createEngine(nvinfer1::IBuilder* builder)
     std::cerr << "Building engine failed\n" << std::endl;
   }
 
+#if NV_TENSORRT_MAJOR >= 8
+  delete serializedEngine;
+#else
+  serializedEngine->destroy();
+#endif
+
 #ifdef GRAPH
   nvinfer1::IExecutionContext *context = engine->createExecutionContext();
   nvinfer1::IEngineInspector *inpector = engine->createEngineInspector();
@@ -252,7 +257,7 @@ NvDsInferStatus
 Yolo::parseModel(nvinfer1::INetworkDefinition& network) {
   destroyNetworkUtils();
 
-  std::vector<float> weights = loadWeights(m_DarknetWtsFilePath, m_ModelName);
+  std::vector<float> weights = loadWeights(m_WtsFilePath);
   std::cout << "Building YOLO network\n" << std::endl;
   NvDsInferStatus status = buildYoloNetwork(weights, network);
 
@@ -292,14 +297,15 @@ Yolo::buildYoloNetwork(std::vector<float>& weights, nvinfer1::INetworkDefinition
     else if (m_ConfigBlocks.at(i).at("type") == "conv" || m_ConfigBlocks.at(i).at("type") == "convolutional") {
       int channels = getNumChannels(previous);
       std::string inputVol = dimsToString(previous->getDimensions());
-      previous = convolutionalLayer(i, m_ConfigBlocks.at(i), weights, m_TrtWeights, weightPtr, channels, previous, &network);
+      previous = convolutionalLayer(i, m_ConfigBlocks.at(i), weights, m_TrtWeights, weightPtr, channels, previous,
+          &network);
       assert(previous != nullptr);
       std::string outputVol = dimsToString(previous->getDimensions());
       tensorOutputs.push_back(previous);
       std::string layerName = "conv_" + m_ConfigBlocks.at(i).at("activation");
       printLayerInfo(layerIndex, layerName, inputVol, outputVol, std::to_string(weightPtr));
     }
-    else if (m_ConfigBlocks.at(i).at("type") == "deconvolutional") {
+    else if (m_ConfigBlocks.at(i).at("type") == "deconv" || m_ConfigBlocks.at(i).at("type") == "deconvolutional") {
       int channels = getNumChannels(previous);
       std::string inputVol = dimsToString(previous->getDimensions());
       previous = deconvolutionalLayer(i, m_ConfigBlocks.at(i), weights, m_TrtWeights, weightPtr, channels, previous,
@@ -328,11 +334,13 @@ Yolo::buildYoloNetwork(std::vector<float>& weights, nvinfer1::INetworkDefinition
       std::string layerName = "implicit";
       printLayerInfo(layerIndex, layerName, "-", outputVol, std::to_string(weightPtr));
     }
-    else if (m_ConfigBlocks.at(i).at("type") == "shift_channels" || m_ConfigBlocks.at(i).at("type") == "control_channels") {
+    else if (m_ConfigBlocks.at(i).at("type") == "shift_channels" ||
+        m_ConfigBlocks.at(i).at("type") == "control_channels") {
       assert(m_ConfigBlocks.at(i).find("from") != m_ConfigBlocks.at(i).end());
       int from = stoi(m_ConfigBlocks.at(i).at("from"));
-      if (from > 0)
+      if (from > 0) {
         from = from - i + 1;
+      }
       assert((i - 2 >= 0) && (i - 2 < tensorOutputs.size()));
       assert((i + from - 1 >= 0) && (i + from - 1 < tensorOutputs.size()));
       assert(i + from - 1 < i - 2);
@@ -348,41 +356,46 @@ Yolo::buildYoloNetwork(std::vector<float>& weights, nvinfer1::INetworkDefinition
     else if (m_ConfigBlocks.at(i).at("type") == "shortcut") {
       assert(m_ConfigBlocks.at(i).find("from") != m_ConfigBlocks.at(i).end());
       int from = stoi(m_ConfigBlocks.at(i).at("from"));
-      if (from > 0)
+      if (from > 0) {
         from = from - i + 1;
+      }
       assert((i - 2 >= 0) && (i - 2 < tensorOutputs.size()));
       assert((i + from - 1 >= 0) && (i + from - 1 < tensorOutputs.size()));
       assert(i + from - 1 < i - 2);
 
       std::string activation = "linear";
-      if (m_ConfigBlocks.at(i).find("activation") != m_ConfigBlocks.at(i).end())
+      if (m_ConfigBlocks.at(i).find("activation") != m_ConfigBlocks.at(i).end()) {
         activation = m_ConfigBlocks.at(i).at("activation");
+      }
 
       std::string inputVol = dimsToString(previous->getDimensions());
       std::string shortcutVol = dimsToString(tensorOutputs[i + from - 1]->getDimensions());
       previous = shortcutLayer(i, activation, inputVol, shortcutVol, m_ConfigBlocks.at(i), previous,
-          tensorOutputs[i + from - 1], &network, m_BatchSize);
+          tensorOutputs[i + from - 1], &network);
       assert(previous != nullptr);
       std::string outputVol = dimsToString(previous->getDimensions());
       tensorOutputs.push_back(previous);
       std::string layerName = "shortcut_" + activation + ": " + std::to_string(i + from - 1);
       printLayerInfo(layerIndex, layerName, inputVol, outputVol, "-");
 
-      if (inputVol != shortcutVol)
+      if (inputVol != shortcutVol) {
         std::cout << inputVol << " +" << shortcutVol << std::endl;
+      }
     }
     else if (m_ConfigBlocks.at(i).at("type") == "sam") {
       assert(m_ConfigBlocks.at(i).find("from") != m_ConfigBlocks.at(i).end());
       int from = stoi(m_ConfigBlocks.at(i).at("from"));
-      if (from > 0)
+      if (from > 0) {
         from = from - i + 1;
+      }
       assert((i - 2 >= 0) && (i - 2 < tensorOutputs.size()));
       assert((i + from - 1 >= 0) && (i + from - 1 < tensorOutputs.size()));
       assert(i + from - 1 < i - 2);
 
       std::string activation = "linear";
-      if (m_ConfigBlocks.at(i).find("activation") != m_ConfigBlocks.at(i).end())
+      if (m_ConfigBlocks.at(i).find("activation") != m_ConfigBlocks.at(i).end()) {
         activation = m_ConfigBlocks.at(i).at("activation");
+      }
 
       std::string inputVol = dimsToString(previous->getDimensions());
       previous = samLayer(i, activation, m_ConfigBlocks.at(i), previous, tensorOutputs[i + from - 1], &network);
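The `shortcut`, `sam`, and `shift_channels`/`control_channels` branches above all normalize darknet's `from` field the same way before the braces were added around it. The helper below is a hypothetical CPU-side illustration of that arithmetic (not code from the repository): darknet accepts `from` either as a negative relative offset or as a non-negative absolute layer index, and the remap makes `tensorOutputs[i + from - 1]` address the same layer in both cases.

```cpp
#include <cassert>

// Hypothetical mirror of the `from` normalization in the shortcut/sam
// branches: a non-negative (absolute) index is converted to the negative
// (relative) form so tensorOutputs[i + from - 1] resolves identically.
int normalizeFrom(int from, int i) {
  if (from > 0) {
    from = from - i + 1;
  }
  return from;
}
```

For block index `i = 10`, an absolute `from = 4` becomes `-5`, and `10 + (-5) - 1 = 4` indexes the intended layer; negative values pass through unchanged.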
@@ -394,7 +407,7 @@ Yolo::buildYoloNetwork(std::vector<float>& weights, nvinfer1::INetworkDefinition
     }
     else if (m_ConfigBlocks.at(i).at("type") == "route") {
       std::string layers;
-      previous = routeLayer(i, layers, m_ConfigBlocks.at(i), tensorOutputs, &network, m_BatchSize);
+      previous = routeLayer(i, layers, m_ConfigBlocks.at(i), tensorOutputs, &network);
       assert(previous != nullptr);
       std::string outputVol = dimsToString(previous->getDimensions());
       tensorOutputs.push_back(previous);
@@ -422,7 +435,7 @@ Yolo::buildYoloNetwork(std::vector<float>& weights, nvinfer1::INetworkDefinition
     }
     else if (m_ConfigBlocks.at(i).at("type") == "reorg" || m_ConfigBlocks.at(i).at("type") == "reorg3d") {
       std::string inputVol = dimsToString(previous->getDimensions());
-      previous = reorgLayer(i, m_ConfigBlocks.at(i), previous, &network, m_BatchSize);
+      previous = reorgLayer(i, m_ConfigBlocks.at(i), previous, &network);
       assert(previous != nullptr);
       std::string outputVol = dimsToString(previous->getDimensions());
       tensorOutputs.push_back(previous);
@@ -441,7 +454,7 @@ Yolo::buildYoloNetwork(std::vector<float>& weights, nvinfer1::INetworkDefinition
       tensorOutputs.push_back(previous);
       yoloTensorInputs[yoloCountInputs] = previous;
       ++yoloCountInputs;
-      std::string layerName = m_ConfigBlocks.at(i).at("type") == "yolo" ? "yolo" : "region";
+      std::string layerName = m_ConfigBlocks.at(i).at("type");
       printLayerInfo(layerIndex, layerName, inputVol, "-", "-");
     }
     else if (m_ConfigBlocks.at(i).at("type") == "dropout") {
@@ -465,27 +478,19 @@ Yolo::buildYoloNetwork(std::vector<float>& weights, nvinfer1::INetworkDefinition
       outputSize += curYoloTensor.numBBoxes * curYoloTensor.gridSizeY * curYoloTensor.gridSizeX;
     }
 
-    nvinfer1::IPluginV2DynamicExt* yoloPlugin = new YoloLayer(m_InputW, m_InputH, m_NumClasses, m_NewCoords, m_YoloTensors,
-        outputSize);
+    nvinfer1::IPluginV2DynamicExt* yoloPlugin = new YoloLayer(m_InputW, m_InputH, m_NumClasses, m_NewCoords,
+        m_YoloTensors, outputSize);
     assert(yoloPlugin != nullptr);
     nvinfer1::IPluginV2Layer* yolo = network.addPluginV2(yoloTensorInputs, m_YoloCount, *yoloPlugin);
     assert(yolo != nullptr);
-    std::string yoloLayerName = "yolo";
+    std::string yoloLayerName = m_WtsFilePath;
     yolo->setName(yoloLayerName.c_str());
 
     std::string outputlayerName;
-    nvinfer1::ITensor* detection_boxes = yolo->getOutput(0);
-    outputlayerName = "boxes";
-    detection_boxes->setName(outputlayerName.c_str());
-    nvinfer1::ITensor* detection_scores = yolo->getOutput(1);
-    outputlayerName = "scores";
-    detection_scores->setName(outputlayerName.c_str());
-    nvinfer1::ITensor* detection_classes = yolo->getOutput(2);
-    outputlayerName = "classes";
-    detection_classes->setName(outputlayerName.c_str());
-    network.markOutput(*detection_boxes);
-    network.markOutput(*detection_scores);
-    network.markOutput(*detection_classes);
+    nvinfer1::ITensor* detection_output = yolo->getOutput(0);
+    outputlayerName = "output";
+    detection_output->setName(outputlayerName.c_str());
+    network.markOutput(*detection_output);
   }
   else {
     std::cerr << "\nError in yolo cfg file" << std::endl;
@@ -493,8 +498,9 @@ Yolo::buildYoloNetwork(std::vector<float>& weights, nvinfer1::INetworkDefinition
   }
 
   std::cout << "\nOutput YOLO blob names: " << std::endl;
-  for (auto& tensor : m_YoloTensors)
+  for (auto& tensor : m_YoloTensors) {
     std::cout << tensor.blobName << std::endl;
+  }
 
   int nbLayers = network.getNbLayers();
   std::cout << "\nTotal number of YOLO layers: " << nbLayers << "\n" << std::endl;
@@ -513,8 +519,9 @@ Yolo::parseConfigFile(const std::string cfgFilePath)
   std::map<std::string, std::string> block;
 
   while (getline(file, line)) {
-    if (line.size() == 0 || line.front() == ' ' || line.front() == '#')
+    if (line.size() == 0 || line.front() == ' ' || line.front() == '#') {
       continue;
+    }
 
     line = trim(line);
     if (line.front() == '[') {
@@ -543,20 +550,21 @@ Yolo::parseConfigBlocks()
 {
   for (auto block : m_ConfigBlocks) {
     if (block.at("type") == "net") {
+      assert((block.find("channels") != block.end()) && "Missing 'channels' param in network cfg");
       assert((block.find("height") != block.end()) && "Missing 'height' param in network cfg");
       assert((block.find("width") != block.end()) && "Missing 'width' param in network cfg");
-      assert((block.find("channels") != block.end()) && "Missing 'channels' param in network cfg");
 
+      m_InputC = std::stoul(block.at("channels"));
       m_InputH = std::stoul(block.at("height"));
       m_InputW = std::stoul(block.at("width"));
-      m_InputC = std::stoul(block.at("channels"));
       m_InputSize = m_InputC * m_InputH * m_InputW;
 
-      if (block.find("letter_box") != block.end())
+      if (block.find("letter_box") != block.end()) {
        m_LetterBox = std::stoul(block.at("letter_box"));
+      }
     }
-    else if ((block.at("type") == "region") || (block.at("type") == "yolo"))
-    {
+    else if ((block.at("type") == "region") || (block.at("type") == "yolo")) {
       assert((block.find("num") != block.end()) &&
           std::string("Missing 'num' param in " + block.at("type") + " layer").c_str());
       assert((block.find("classes") != block.end()) &&
@@ -568,8 +576,9 @@ Yolo::parseConfigBlocks()
 
       m_NumClasses = std::stoul(block.at("classes"));
 
-      if (block.find("new_coords") != block.end())
+      if (block.find("new_coords") != block.end()) {
        m_NewCoords = std::stoul(block.at("new_coords"));
+      }
 
       TensorInfo outputTensor;
 
@@ -605,12 +614,15 @@ Yolo::parseConfigBlocks()
         }
       }
 
-      if (block.find("scale_x_y") != block.end())
+      if (block.find("scale_x_y") != block.end()) {
         outputTensor.scaleXY = std::stof(block.at("scale_x_y"));
-      else
+      }
+      else {
         outputTensor.scaleXY = 1.0;
+      }
 
-      outputTensor.numBBoxes = outputTensor.mask.size() > 0 ? outputTensor.mask.size() : std::stoul(trim(block.at("num")));
+      outputTensor.numBBoxes = outputTensor.mask.size() > 0 ? outputTensor.mask.size() :
+          std::stoul(trim(block.at("num")));
 
       m_YoloTensors.push_back(outputTensor);
     }
@@ -620,8 +632,10 @@ Yolo::parseConfigBlocks()
 void
 Yolo::destroyNetworkUtils()
 {
-  for (uint i = 0; i < m_TrtWeights.size(); ++i)
-    if (m_TrtWeights[i].count > 0)
+  for (uint i = 0; i < m_TrtWeights.size(); ++i) {
+    if (m_TrtWeights[i].count > 0) {
       free(const_cast<void*>(m_TrtWeights[i].values));
+    }
+  }
   m_TrtWeights.clear();
 }

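The preprocessor guard rewritten in `createEngine` (from `NV_TENSORRT_MAJOR >= 8 && NV_TENSORRT_MINOR > 0` to `NV_TENSORRT_MAJOR > 8 || (NV_TENSORRT_MAJOR == 8 && NV_TENSORRT_MINOR > 0)`) matters once the major version moves past 8: the old conjunction is false for any x.0 release, including 9.0 and 10.0. The sketch below merely restates the two conditions as plain functions to make the difference visible; it is an illustration, not repository code.

```cpp
#include <cassert>

// The two guard conditions, restated as constexpr predicates.
// Both are meant to select the "TensorRT newer than 8.0" code path.
constexpr bool oldGuard(int major, int minor) {
  return major >= 8 && minor > 0;                  // wrongly false for 9.0, 10.0, ...
}

constexpr bool newGuard(int major, int minor) {
  return major > 8 || (major == 8 && minor > 0);   // true for anything after 8.0
}
```

Both predicates agree on 8.x releases; they diverge exactly on major versions above 8 with a zero minor.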
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2018-2023, NVIDIA CORPORATION. All rights reserved.
+ * Copyright (c) 2018-2024, NVIDIA CORPORATION. All rights reserved.
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
@@ -61,9 +61,9 @@ struct NetworkInfo
   std::string inputBlobName;
   std::string networkType;
   std::string modelName;
-  std::string onnxWtsFilePath;
-  std::string darknetWtsFilePath;
-  std::string darknetCfgFilePath;
+  std::string onnxFilePath;
+  std::string wtsFilePath;
+  std::string cfgFilePath;
   uint batchSize;
   int implicitBatch;
   std::string int8CalibPath;
@@ -74,6 +74,7 @@ struct NetworkInfo
   float scaleFactor;
   const float* offsets;
   uint workspaceSize;
+  int inputFormat;
 };
 
 struct TensorInfo
@@ -96,8 +97,7 @@ class Yolo : public IModelParser {
     bool hasFullDimsSupported() const override { return false; }
 
     const char* getModelName() const override {
-      return m_NetworkType == "onnx" ? m_OnnxWtsFilePath.substr(0, m_OnnxWtsFilePath.find(".onnx")).c_str() :
-          m_DarknetCfgFilePath.substr(0, m_DarknetCfgFilePath.find(".cfg")).c_str();
+      return m_ModelName.c_str();
     }
 
     NvDsInferStatus parseModel(nvinfer1::INetworkDefinition& network) override;
@@ -112,9 +112,9 @@ class Yolo : public IModelParser {
     const std::string m_InputBlobName;
     const std::string m_NetworkType;
     const std::string m_ModelName;
-    const std::string m_OnnxWtsFilePath;
-    const std::string m_DarknetWtsFilePath;
-    const std::string m_DarknetCfgFilePath;
+    const std::string m_OnnxFilePath;
+    const std::string m_WtsFilePath;
+    const std::string m_CfgFilePath;
     const uint m_BatchSize;
     const int m_ImplicitBatch;
     const std::string m_Int8CalibPath;
@@ -125,6 +125,7 @@ class Yolo : public IModelParser {
     const float m_ScaleFactor;
     const float* m_Offsets;
     const uint m_WorkspaceSize;
+    const int m_InputFormat;
 
     uint m_InputC;
     uint m_InputH;

@@ -7,8 +7,8 @@
 
 inline __device__ float sigmoidGPU(const float& x) { return 1.0f / (1.0f + __expf(-x)); }
 
-__global__ void gpuYoloLayer(const float* input, float* boxes, float* scores, float* classes, const uint netWidth,
-    const uint netHeight, const uint gridSizeX, const uint gridSizeY, const uint numOutputClasses, const uint numBBoxes,
+__global__ void gpuYoloLayer(const float* input, float* output, const uint netWidth, const uint netHeight,
+    const uint gridSizeX, const uint gridSizeY, const uint numOutputClasses, const uint numBBoxes,
     const uint64_t lastInputSize, const float scaleXY, const float* anchors, const int* mask)
 {
   uint x_id = blockIdx.x * blockDim.x + threadIdx.x;
@@ -50,22 +50,22 @@ __global__ void gpuYoloLayer(const float* input, float* boxes, float* scores, fl
 
       int count = numGridCells * z_id + bbindex + lastInputSize;
 
-      boxes[count * 4 + 0] = xc;
-      boxes[count * 4 + 1] = yc;
-      boxes[count * 4 + 2] = w;
-      boxes[count * 4 + 3] = h;
-      scores[count] = maxProb * objectness;
-      classes[count] = (float) maxIndex;
+      output[count * 6 + 0] = xc - w * 0.5;
+      output[count * 6 + 1] = yc - h * 0.5;
+      output[count * 6 + 2] = xc + w * 0.5;
+      output[count * 6 + 3] = yc + h * 0.5;
+      output[count * 6 + 4] = maxProb * objectness;
+      output[count * 6 + 5] = (float) maxIndex;
 }
 
-cudaError_t cudaYoloLayer(const void* input, void* boxes, void* scores, void* classes, const uint& batchSize,
-    const uint64_t& inputSize, const uint64_t& outputSize, const uint64_t& lastInputSize, const uint& netWidth,
-    const uint& netHeight, const uint& gridSizeX, const uint& gridSizeY, const uint& numOutputClasses, const uint& numBBoxes,
+cudaError_t cudaYoloLayer(const void* input, void* output, const uint& batchSize, const uint64_t& inputSize,
+    const uint64_t& outputSize, const uint64_t& lastInputSize, const uint& netWidth, const uint& netHeight,
+    const uint& gridSizeX, const uint& gridSizeY, const uint& numOutputClasses, const uint& numBBoxes,
     const float& scaleXY, const void* anchors, const void* mask, cudaStream_t stream);
 
-cudaError_t cudaYoloLayer(const void* input, void* boxes, void* scores, void* classes, const uint& batchSize,
-    const uint64_t& inputSize, const uint64_t& outputSize, const uint64_t& lastInputSize, const uint& netWidth,
-    const uint& netHeight, const uint& gridSizeX, const uint& gridSizeY, const uint& numOutputClasses, const uint& numBBoxes,
+cudaError_t cudaYoloLayer(const void* input, void* output, const uint& batchSize, const uint64_t& inputSize,
+    const uint64_t& outputSize, const uint64_t& lastInputSize, const uint& netWidth, const uint& netHeight,
+    const uint& gridSizeX, const uint& gridSizeY, const uint& numOutputClasses, const uint& numBBoxes,
     const float& scaleXY, const void* anchors, const void* mask, cudaStream_t stream)
 {
   dim3 threads_per_block(16, 16, 4);
@@ -75,9 +75,7 @@ cudaError_t cudaYoloLayer(const void* input, void* boxes, void* scores, void* cl
   for (unsigned int batch = 0; batch < batchSize; ++batch) {
     gpuYoloLayer<<<number_of_blocks, threads_per_block, 0, stream>>>(
         reinterpret_cast<const float*> (input) + (batch * inputSize),
-        reinterpret_cast<float*> (boxes) + (batch * 4 * outputSize),
-        reinterpret_cast<float*> (scores) + (batch * 1 * outputSize),
-        reinterpret_cast<float*> (classes) + (batch * 1 * outputSize),
+        reinterpret_cast<float*> (output) + (batch * 6 * outputSize),
         netWidth, netHeight, gridSizeX, gridSizeY, numOutputClasses, numBBoxes, lastInputSize, scaleXY,
         reinterpret_cast<const float*> (anchors), reinterpret_cast<const int*> (mask));
   }
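Both launchers above use a fixed `threads_per_block` of (16, 16, 4), and `number_of_blocks` is presumably derived by ceiling division over the (gridSizeX, gridSizeY, numBBoxes) extent, as is usual for CUDA grid sizing. A small Python sketch of that sizing (function names are illustrative, not from the source):

```python
def ceil_div(a, b):
    # integer ceiling division, the standard way to size a CUDA block grid
    return (a + b - 1) // b

def launch_dims(grid_w, grid_h, num_anchors, tpb=(16, 16, 4)):
    # mirrors dim3 threads_per_block(16, 16, 4) and the derived block grid
    return (ceil_div(grid_w, tpb[0]),
            ceil_div(grid_h, tpb[1]),
            ceil_div(num_anchors, tpb[2]))

print(launch_dims(80, 80, 3))  # -> (5, 5, 1)
```

Each thread then handles one (cell x, cell y, anchor) triple; threads past the grid extent return early.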
@@ -5,8 +5,8 @@
 
 #include <stdint.h>
 
-__global__ void gpuYoloLayer_nc(const float* input, float* boxes, float* scores, float* classes, const uint netWidth,
-    const uint netHeight, const uint gridSizeX, const uint gridSizeY, const uint numOutputClasses, const uint numBBoxes,
+__global__ void gpuYoloLayer_nc(const float* input, float* output, const uint netWidth, const uint netHeight,
+    const uint gridSizeX, const uint gridSizeY, const uint numOutputClasses, const uint numBBoxes,
     const uint64_t lastInputSize, const float scaleXY, const float* anchors, const int* mask)
 {
   uint x_id = blockIdx.x * blockDim.x + threadIdx.x;
@@ -29,9 +29,11 @@ __global__ void gpuYoloLayer_nc(const float* input, float* boxes, float* scores,
   float yc = (input[bbindex + numGridCells * (z_id * (5 + numOutputClasses) + 1)] * alpha + beta + y_id) * netHeight /
       gridSizeY;
 
-  float w = __powf(input[bbindex + numGridCells * (z_id * (5 + numOutputClasses) + 2)] * 2, 2) * anchors[mask[z_id] * 2];
+  float w = __powf(input[bbindex + numGridCells * (z_id * (5 + numOutputClasses) + 2)] * 2, 2) *
+      anchors[mask[z_id] * 2];
 
-  float h = __powf(input[bbindex + numGridCells * (z_id * (5 + numOutputClasses) + 3)] * 2, 2) * anchors[mask[z_id] * 2 + 1];
+  float h = __powf(input[bbindex + numGridCells * (z_id * (5 + numOutputClasses) + 3)] * 2, 2) *
+      anchors[mask[z_id] * 2 + 1];
 
   const float objectness = input[bbindex + numGridCells * (z_id * (5 + numOutputClasses) + 4)];
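In this "new coords" kernel, xy uses a precomputed `alpha`/`beta` affine derived from `scaleXY`, and wh uses (2t)² times the anchor selected by `mask`, with no sigmoid applied. A rough Python rendering of the math (the `alpha = scale_xy`, `beta = -0.5 * (scale_xy - 1)` formula is an assumption — the kernel receives them already computed — and the parameter names are illustrative):

```python
def decode_xywh_new_coords(tx, ty, tw, th, x_id, y_id, anchor_w, anchor_h,
                           net_w, net_h, grid_w, grid_h, scale_xy=2.0):
    # assumed relation between scaleXY and the kernel's alpha/beta
    alpha = scale_xy
    beta = -0.5 * (scale_xy - 1.0)
    # xy: scaled activation plus the cell offset, mapped to network pixels
    xc = (tx * alpha + beta + x_id) * net_w / grid_w
    yc = (ty * alpha + beta + y_id) * net_h / grid_h
    # wh: (2t)^2 scaled by the anchor for this mask entry
    w = (tw * 2.0) ** 2 * anchor_w
    h = (th * 2.0) ** 2 * anchor_h
    return xc, yc, w, h
```

This mirrors the scaled-activation decode used by YOLOv4-style `scale_x_y` / "new coords" heads, where the exporter bakes the sigmoid into the model.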
@@ -48,22 +50,22 @@ __global__ void gpuYoloLayer_nc(const float* input, float* boxes, float* scores,
 
   int count = numGridCells * z_id + bbindex + lastInputSize;
 
-  boxes[count * 4 + 0] = xc;
-  boxes[count * 4 + 1] = yc;
-  boxes[count * 4 + 2] = w;
-  boxes[count * 4 + 3] = h;
-  scores[count] = maxProb * objectness;
-  classes[count] = (float) maxIndex;
+  output[count * 6 + 0] = xc - w * 0.5;
+  output[count * 6 + 1] = yc - h * 0.5;
+  output[count * 6 + 2] = xc + w * 0.5;
+  output[count * 6 + 3] = yc + h * 0.5;
+  output[count * 6 + 4] = maxProb * objectness;
+  output[count * 6 + 5] = (float) maxIndex;
 }
 
-cudaError_t cudaYoloLayer_nc(const void* input, void* boxes, void* scores, void* classes, const uint& batchSize,
-    const uint64_t& inputSize, const uint64_t& outputSize, const uint64_t& lastInputSize, const uint& netWidth,
-    const uint& netHeight, const uint& gridSizeX, const uint& gridSizeY, const uint& numOutputClasses, const uint& numBBoxes,
+cudaError_t cudaYoloLayer_nc(const void* input, void* output, const uint& batchSize, const uint64_t& inputSize,
+    const uint64_t& outputSize, const uint64_t& lastInputSize, const uint& netWidth, const uint& netHeight,
+    const uint& gridSizeX, const uint& gridSizeY, const uint& numOutputClasses, const uint& numBBoxes,
     const float& scaleXY, const void* anchors, const void* mask, cudaStream_t stream);
 
-cudaError_t cudaYoloLayer_nc(const void* input, void* boxes, void* scores, void* classes, const uint& batchSize,
-    const uint64_t& inputSize, const uint64_t& outputSize, const uint64_t& lastInputSize, const uint& netWidth,
-    const uint& netHeight, const uint& gridSizeX, const uint& gridSizeY, const uint& numOutputClasses, const uint& numBBoxes,
+cudaError_t cudaYoloLayer_nc(const void* input, void* output, const uint& batchSize, const uint64_t& inputSize,
+    const uint64_t& outputSize, const uint64_t& lastInputSize, const uint& netWidth, const uint& netHeight,
+    const uint& gridSizeX, const uint& gridSizeY, const uint& numOutputClasses, const uint& numBBoxes,
     const float& scaleXY, const void* anchors, const void* mask, cudaStream_t stream)
 {
   dim3 threads_per_block(16, 16, 4);
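The kernels now write each detection as six consecutive floats and convert the center-format box to corners on the fly, instead of filling three separate boxes/scores/classes tensors. A minimal Python sketch of that packing (the function name and plain-list buffer are illustrative, not from the plugin):

```python
def pack_detection(output, count, xc, yc, w, h, score, cls):
    # each detection occupies 6 consecutive floats:
    # [x1, y1, x2, y2, score, class], corners derived from center/size
    output[count * 6 + 0] = xc - w * 0.5
    output[count * 6 + 1] = yc - h * 0.5
    output[count * 6 + 2] = xc + w * 0.5
    output[count * 6 + 3] = yc + h * 0.5
    output[count * 6 + 4] = score
    output[count * 6 + 5] = float(cls)

buf = [0.0] * 12
pack_detection(buf, 1, 100.0, 50.0, 20.0, 10.0, 0.9, 3)
print(buf[6:])  # -> [90.0, 45.0, 110.0, 55.0, 0.9, 3.0]
```

Because `count` already includes `lastInputSize`, detections from successive YOLO heads land in disjoint 6-float slots of the same fused buffer.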
@@ -73,9 +75,7 @@ cudaError_t cudaYoloLayer_nc(const void* input, void* boxes, void* scores, void*
   for (unsigned int batch = 0; batch < batchSize; ++batch) {
     gpuYoloLayer_nc<<<number_of_blocks, threads_per_block, 0, stream>>>(
         reinterpret_cast<const float*> (input) + (batch * inputSize),
-        reinterpret_cast<float*> (boxes) + (batch * 4 * outputSize),
-        reinterpret_cast<float*> (scores) + (batch * 1 * outputSize),
-        reinterpret_cast<float*> (classes) + (batch * 1 * outputSize),
+        reinterpret_cast<float*> (output) + (batch * 6 * outputSize),
         netWidth, netHeight, gridSizeX, gridSizeY, numOutputClasses, numBBoxes, lastInputSize, scaleXY,
         reinterpret_cast<const float*> (anchors), reinterpret_cast<const int*> (mask));
   }
@@ -27,9 +27,9 @@ __device__ void softmaxGPU(const float* input, const int bbindex, const int numG
   }
 }
 
-__global__ void gpuRegionLayer(const float* input, float* softmax, float* boxes, float* scores, float* classes,
-    const uint netWidth, const uint netHeight, const uint gridSizeX, const uint gridSizeY, const uint numOutputClasses,
-    const uint numBBoxes, const uint64_t lastInputSize, const float* anchors)
+__global__ void gpuRegionLayer(const float* input, float* softmax, float* output, const uint netWidth,
+    const uint netHeight, const uint gridSizeX, const uint gridSizeY, const uint numOutputClasses, const uint numBBoxes,
+    const uint64_t lastInputSize, const float* anchors)
 {
   uint x_id = blockIdx.x * blockDim.x + threadIdx.x;
   uint y_id = blockIdx.y * blockDim.y + threadIdx.y;
@@ -42,15 +42,17 @@ __global__ void gpuRegionLayer(const float* input, float* softmax, float* boxes,
   const int numGridCells = gridSizeX * gridSizeY;
   const int bbindex = y_id * gridSizeX + x_id;
 
-  float xc = (sigmoidGPU(input[bbindex + numGridCells * (z_id * (5 + numOutputClasses) + 0)]) + x_id) * netWidth / gridSizeX;
+  float xc = (sigmoidGPU(input[bbindex + numGridCells * (z_id * (5 + numOutputClasses) + 0)]) + x_id) * netWidth /
+      gridSizeX;
 
-  float yc = (sigmoidGPU(input[bbindex + numGridCells * (z_id * (5 + numOutputClasses) + 1)]) + y_id) * netHeight / gridSizeY;
+  float yc = (sigmoidGPU(input[bbindex + numGridCells * (z_id * (5 + numOutputClasses) + 1)]) + y_id) * netHeight /
+      gridSizeY;
 
   float w = __expf(input[bbindex + numGridCells * (z_id * (5 + numOutputClasses) + 2)]) * anchors[z_id * 2] * netWidth /
       gridSizeX;
 
-  float h = __expf(input[bbindex + numGridCells * (z_id * (5 + numOutputClasses) + 3)]) * anchors[z_id * 2 + 1] * netHeight /
-      gridSizeY;
+  float h = __expf(input[bbindex + numGridCells * (z_id * (5 + numOutputClasses) + 3)]) * anchors[z_id * 2 + 1] *
+      netHeight / gridSizeY;
 
   const float objectness = sigmoidGPU(input[bbindex + numGridCells * (z_id * (5 + numOutputClasses) + 4)]);
@@ -69,22 +71,22 @@ __global__ void gpuRegionLayer(const float* input, float* softmax, float* boxes,
 
   int count = numGridCells * z_id + bbindex + lastInputSize;
 
-  boxes[count * 4 + 0] = xc;
-  boxes[count * 4 + 1] = yc;
-  boxes[count * 4 + 2] = w;
-  boxes[count * 4 + 3] = h;
-  scores[count] = maxProb * objectness;
-  classes[count] = (float) maxIndex;
+  output[count * 6 + 0] = xc - w * 0.5;
+  output[count * 6 + 1] = yc - h * 0.5;
+  output[count * 6 + 2] = xc + w * 0.5;
+  output[count * 6 + 3] = yc + h * 0.5;
+  output[count * 6 + 4] = maxProb * objectness;
+  output[count * 6 + 5] = (float) maxIndex;
 }
 
-cudaError_t cudaRegionLayer(const void* input, void* softmax, void* boxes, void* scores, void* classes,
-    const uint& batchSize, const uint64_t& inputSize, const uint64_t& outputSize, const uint64_t& lastInputSize,
-    const uint& netWidth, const uint& netHeight, const uint& gridSizeX, const uint& gridSizeY, const uint& numOutputClasses,
+cudaError_t cudaRegionLayer(const void* input, void* softmax, void* output, const uint& batchSize,
+    const uint64_t& inputSize, const uint64_t& outputSize, const uint64_t& lastInputSize, const uint& netWidth,
+    const uint& netHeight, const uint& gridSizeX, const uint& gridSizeY, const uint& numOutputClasses,
     const uint& numBBoxes, const void* anchors, cudaStream_t stream);
 
-cudaError_t cudaRegionLayer(const void* input, void* softmax, void* boxes, void* scores, void* classes,
-    const uint& batchSize, const uint64_t& inputSize, const uint64_t& outputSize, const uint64_t& lastInputSize,
-    const uint& netWidth, const uint& netHeight, const uint& gridSizeX, const uint& gridSizeY, const uint& numOutputClasses,
+cudaError_t cudaRegionLayer(const void* input, void* softmax, void* output, const uint& batchSize,
+    const uint64_t& inputSize, const uint64_t& outputSize, const uint64_t& lastInputSize, const uint& netWidth,
+    const uint& netHeight, const uint& gridSizeX, const uint& gridSizeY, const uint& numOutputClasses,
    const uint& numBBoxes, const void* anchors, cudaStream_t stream)
 {
   dim3 threads_per_block(16, 16, 4);
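The region kernel decodes in the classic YOLOv2 way: sigmoid cell offsets for xy, exponential anchor scaling for wh, with everything mapped from grid units to network pixels. A minimal Python sketch of that math (parameter names are illustrative):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def decode_region(tx, ty, tw, th, x_id, y_id, anchor_w, anchor_h,
                  net_w, net_h, grid_w, grid_h):
    # sigmoid keeps the center inside its cell; the cell index shifts it
    xc = (sigmoid(tx) + x_id) * net_w / grid_w
    yc = (sigmoid(ty) + y_id) * net_h / grid_h
    # exp makes the size positive; anchors are in grid units here
    w = math.exp(tw) * anchor_w * net_w / grid_w
    h = math.exp(th) * anchor_h * net_h / grid_h
    return xc, yc, w, h
```

With zero logits and a 1x1 anchor on a 13x13 grid of a 416-pixel network, this yields a box centered half a cell in, sized one cell (32 pixels).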
@@ -95,9 +97,7 @@ cudaError_t cudaRegionLayer(const void* input, void* softmax, void* boxes, void*
     gpuRegionLayer<<<number_of_blocks, threads_per_block, 0, stream>>>(
         reinterpret_cast<const float*> (input) + (batch * inputSize),
         reinterpret_cast<float*> (softmax) + (batch * inputSize),
-        reinterpret_cast<float*> (boxes) + (batch * 4 * outputSize),
-        reinterpret_cast<float*> (scores) + (batch * 1 * outputSize),
-        reinterpret_cast<float*> (classes) + (batch * 1 * outputSize),
+        reinterpret_cast<float*> (output) + (batch * 6 * outputSize),
         netWidth, netHeight, gridSizeX, gridSizeY, numOutputClasses, numBBoxes, lastInputSize,
         reinterpret_cast<const float*> (anchors));
   }
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2018-2023, NVIDIA CORPORATION. All rights reserved.
+ * Copyright (c) 2018-2024, NVIDIA CORPORATION. All rights reserved.
  *
  * Permission is hereby granted, free of charge, to any person obtaining a
  * copy of this software and associated documentation files (the "Software"),
@@ -38,19 +38,19 @@ namespace {
   }
 }
 
-cudaError_t cudaYoloLayer_nc(const void* input, void* boxes, void* scores, void* classes, const uint& batchSize,
-    const uint64_t& inputSize, const uint64_t& outputSize, const uint64_t& lastInputSize, const uint& netWidth,
-    const uint& netHeight, const uint& gridSizeX, const uint& gridSizeY, const uint& numOutputClasses, const uint& numBBoxes,
+cudaError_t cudaYoloLayer_nc(const void* input, void* output, const uint& batchSize, const uint64_t& inputSize,
+    const uint64_t& outputSize, const uint64_t& lastInputSize, const uint& netWidth, const uint& netHeight,
+    const uint& gridSizeX, const uint& gridSizeY, const uint& numOutputClasses, const uint& numBBoxes,
     const float& scaleXY, const void* anchors, const void* mask, cudaStream_t stream);
 
-cudaError_t cudaYoloLayer(const void* input, void* boxes, void* scores, void* classes, const uint& batchSize,
-    const uint64_t& inputSize, const uint64_t& outputSize, const uint64_t& lastInputSize, const uint& netWidth,
-    const uint& netHeight, const uint& gridSizeX, const uint& gridSizeY, const uint& numOutputClasses, const uint& numBBoxes,
+cudaError_t cudaYoloLayer(const void* input, void* output, const uint& batchSize, const uint64_t& inputSize,
+    const uint64_t& outputSize, const uint64_t& lastInputSize, const uint& netWidth, const uint& netHeight,
+    const uint& gridSizeX, const uint& gridSizeY, const uint& numOutputClasses, const uint& numBBoxes,
     const float& scaleXY, const void* anchors, const void* mask, cudaStream_t stream);
 
-cudaError_t cudaRegionLayer(const void* input, void* softmax, void* boxes, void* scores, void* classes,
-    const uint& batchSize, const uint64_t& inputSize, const uint64_t& outputSize, const uint64_t& lastInputSize,
-    const uint& netWidth, const uint& netHeight, const uint& gridSizeX, const uint& gridSizeY, const uint& numOutputClasses,
+cudaError_t cudaRegionLayer(const void* input, void* softmax, void* output, const uint& batchSize,
+    const uint64_t& inputSize, const uint64_t& outputSize, const uint64_t& lastInputSize, const uint& netWidth,
+    const uint& netHeight, const uint& gridSizeX, const uint& gridSizeY, const uint& numOutputClasses,
     const uint& numBBoxes, const void* anchors, cudaStream_t stream);
 
 YoloLayer::YoloLayer(const void* data, size_t length) {
@@ -98,6 +98,8 @@ YoloLayer::YoloLayer(const uint& netWidth, const uint& netHeight, const uint& nu
 {
   assert(m_NetWidth > 0);
   assert(m_NetHeight > 0);
+  assert(m_NumClasses > 0);
+  assert(m_OutputSize > 0);
 };
 
 nvinfer1::IPluginV2DynamicExt*
@@ -155,13 +157,15 @@ YoloLayer::serialize(void* buffer) const noexcept
 
     uint anchorsSize = curYoloTensor.anchors.size();
     write(d, anchorsSize);
-    for (uint j = 0; j < anchorsSize; ++j)
+    for (uint j = 0; j < anchorsSize; ++j) {
       write(d, curYoloTensor.anchors[j]);
+    }
 
     uint maskSize = curYoloTensor.mask.size();
     write(d, maskSize);
-    for (uint j = 0; j < maskSize; ++j)
+    for (uint j = 0; j < maskSize; ++j) {
       write(d, curYoloTensor.mask[j]);
+    }
   }
 }
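serialize() writes each vector as its element count followed by the elements, and the deserializing constructor reads them back in the same order. A hedged Python sketch of that length-prefixed layout using `struct` (helper names are illustrative, not the plugin's API):

```python
import struct

def write_vector(buf, values, fmt):
    # length-prefixed encoding: write(d, size) then one write per element
    buf += struct.pack('I', len(values))
    for v in values:
        buf += struct.pack(fmt, v)
    return buf

def read_vector(buf, offset, fmt):
    # inverse of write_vector: read the count, then that many elements
    (n,) = struct.unpack_from('I', buf, offset)
    offset += 4
    size = struct.calcsize(fmt)
    out = [struct.unpack_from(fmt, buf, offset + i * size)[0] for i in range(n)]
    return out, offset + n * size

data = write_vector(bytearray(), [10.0, 13.0], 'f')   # anchors (float)
data = write_vector(data, [0, 1, 2], 'i')             # mask (int)
anchors, off = read_vector(data, 0, 'f')
mask, _ = read_vector(data, off, 'i')
print(anchors, mask)  # -> [10.0, 13.0] [0, 1, 2]
```

Because the format is purely positional, writer and reader must agree on field order; the added braces in the diff are stylistic and do not change the layout.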
@@ -169,17 +173,14 @@ nvinfer1::DimsExprs
 YoloLayer::getOutputDimensions(INT index, const nvinfer1::DimsExprs* inputs, INT nbInputDims,
     nvinfer1::IExprBuilder& exprBuilder) noexcept
 {
-  assert(index < 3);
-  if (index == 0) {
-    return nvinfer1::DimsExprs{3, {inputs->d[0], exprBuilder.constant(static_cast<int>(m_OutputSize)),
-        exprBuilder.constant(4)}};
-  }
+  assert(index < 1);
   return nvinfer1::DimsExprs{3, {inputs->d[0], exprBuilder.constant(static_cast<int>(m_OutputSize)),
-      exprBuilder.constant(1)}};
+      exprBuilder.constant(6)}};
 }
 
 bool
-YoloLayer::supportsFormatCombination(INT pos, const nvinfer1::PluginTensorDesc* inOut, INT nbInputs, INT nbOutputs) noexcept
+YoloLayer::supportsFormatCombination(INT pos, const nvinfer1::PluginTensorDesc* inOut, INT nbInputs, INT nbOutputs)
+    noexcept
 {
   return inOut[pos].format == nvinfer1::TensorFormat::kLINEAR && inOut[pos].type == nvinfer1::DataType::kFLOAT;
 }
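With a single fused output of shape [batch, m_OutputSize, 6] instead of three tensors of widths 4/1/1, a downstream parser can recover the old boxes/scores/classes view by slicing. A minimal Python sketch of that split on a flat per-image buffer (names are illustrative; the actual DeepStream parser is C++):

```python
def split_output(flat, num_dets):
    # one [num_dets, 6] row per detection: [x1, y1, x2, y2, score, class]
    dets = [flat[i * 6:(i + 1) * 6] for i in range(num_dets)]
    boxes = [d[:4] for d in dets]
    scores = [d[4] for d in dets]
    classes = [int(d[5]) for d in dets]
    return boxes, scores, classes
```

This is why both `assert(index < 3)` checks become `assert(index < 1)`: the plugin now reports exactly one output binding.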
@@ -187,7 +188,7 @@ YoloLayer::supportsFormatCombination(INT pos, const nvinfer1::PluginTensorDesc*
 nvinfer1::DataType
 YoloLayer::getOutputDataType(INT index, const nvinfer1::DataType* inputTypes, INT nbInputs) const noexcept
 {
-  assert(index < 3);
+  assert(index < 1);
   return nvinfer1::DataType::kFLOAT;
 }
@@ -206,10 +207,6 @@ YoloLayer::enqueue(const nvinfer1::PluginTensorDesc* inputDesc, const nvinfer1::
 {
   INT batchSize = inputDesc[0].dims.d[0];
 
-  void* boxes = outputs[0];
-  void* scores = outputs[1];
-  void* classes = outputs[2];
-
   uint64_t lastInputSize = 0;
 
   uint yoloTensorsSize = m_YoloTensors.size();
@@ -223,27 +220,29 @@ YoloLayer::enqueue(const nvinfer1::PluginTensorDesc* inputDesc, const nvinfer1::
     const std::vector<float> anchors = curYoloTensor.anchors;
     const std::vector<int> mask = curYoloTensor.mask;
 
-    void* v_anchors;
-    void* v_mask;
+    void* d_anchors;
+    void* d_mask;
     if (anchors.size() > 0) {
-      CUDA_CHECK(cudaMalloc(&v_anchors, sizeof(float) * anchors.size()));
-      CUDA_CHECK(cudaMemcpyAsync(v_anchors, anchors.data(), sizeof(float) * anchors.size(), cudaMemcpyHostToDevice, stream));
+      CUDA_CHECK(cudaMalloc(&d_anchors, sizeof(float) * anchors.size()));
+      CUDA_CHECK(cudaMemcpyAsync(d_anchors, anchors.data(), sizeof(float) * anchors.size(), cudaMemcpyHostToDevice,
+          stream));
     }
     if (mask.size() > 0) {
-      CUDA_CHECK(cudaMalloc(&v_mask, sizeof(int) * mask.size()));
-      CUDA_CHECK(cudaMemcpyAsync(v_mask, mask.data(), sizeof(int) * mask.size(), cudaMemcpyHostToDevice, stream));
+      CUDA_CHECK(cudaMalloc(&d_mask, sizeof(int) * mask.size()));
+      CUDA_CHECK(cudaMemcpyAsync(d_mask, mask.data(), sizeof(int) * mask.size(), cudaMemcpyHostToDevice, stream));
     }
 
     const uint64_t inputSize = (numBBoxes * (4 + 1 + m_NumClasses)) * gridSizeY * gridSizeX;
 
     if (mask.size() > 0) {
       if (m_NewCoords) {
-        CUDA_CHECK(cudaYoloLayer_nc(inputs[i], boxes, scores, classes, batchSize, inputSize, m_OutputSize, lastInputSize,
-            m_NetWidth, m_NetHeight, gridSizeX, gridSizeY, m_NumClasses, numBBoxes, scaleXY, v_anchors, v_mask, stream));
+        CUDA_CHECK(cudaYoloLayer_nc(inputs[i], outputs[0], batchSize, inputSize, m_OutputSize, lastInputSize,
+            m_NetWidth, m_NetHeight, gridSizeX, gridSizeY, m_NumClasses, numBBoxes, scaleXY, d_anchors, d_mask,
+            stream));
       }
       else {
-        CUDA_CHECK(cudaYoloLayer(inputs[i], boxes, scores, classes, batchSize, inputSize, m_OutputSize, lastInputSize,
-            m_NetWidth, m_NetHeight, gridSizeX, gridSizeY, m_NumClasses, numBBoxes, scaleXY, v_anchors, v_mask, stream));
+        CUDA_CHECK(cudaYoloLayer(inputs[i], outputs[0], batchSize, inputSize, m_OutputSize, lastInputSize, m_NetWidth,
+            m_NetHeight, gridSizeX, gridSizeY, m_NumClasses, numBBoxes, scaleXY, d_anchors, d_mask, stream));
       }
     }
     else {
@@ -251,17 +250,17 @@ YoloLayer::enqueue(const nvinfer1::PluginTensorDesc* inputDesc, const nvinfer1::
       CUDA_CHECK(cudaMalloc(&softmax, sizeof(float) * inputSize * batchSize));
       CUDA_CHECK(cudaMemsetAsync((float*)softmax, 0, sizeof(float) * inputSize * batchSize, stream));
 
-      CUDA_CHECK(cudaRegionLayer(inputs[i], softmax, boxes, scores, classes, batchSize, inputSize, m_OutputSize,
-          lastInputSize, m_NetWidth, m_NetHeight, gridSizeX, gridSizeY, m_NumClasses, numBBoxes, v_anchors, stream));
+      CUDA_CHECK(cudaRegionLayer(inputs[i], softmax, outputs[0], batchSize, inputSize, m_OutputSize, lastInputSize,
+          m_NetWidth, m_NetHeight, gridSizeX, gridSizeY, m_NumClasses, numBBoxes, d_anchors, stream));
 
       CUDA_CHECK(cudaFree(softmax));
     }
 
     if (anchors.size() > 0) {
-      CUDA_CHECK(cudaFree(v_anchors));
+      CUDA_CHECK(cudaFree(d_anchors));
     }
     if (mask.size() > 0) {
-      CUDA_CHECK(cudaFree(v_mask));
+      CUDA_CHECK(cudaFree(d_mask));
     }
 
     lastInputSize += numBBoxes * gridSizeY * gridSizeX;
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2018-2023, NVIDIA CORPORATION. All rights reserved.
+ * Copyright (c) 2018-2024, NVIDIA CORPORATION. All rights reserved.
  *
  * Permission is hereby granted, free of charge, to any person obtaining a
  * copy of this software and associated documentation files (the "Software"),
@@ -30,12 +30,12 @@
 
 #include "yolo.h"
 
 #define CUDA_CHECK(status) { \
   if (status != 0) { \
-    std::cout << "CUDA failure: " << cudaGetErrorString(status) << " in file " << __FILE__ << " at line " << __LINE__ << \
-        std::endl; \
+    std::cout << "CUDA failure: " << cudaGetErrorString(status) << " in file " << __FILE__ << " at line " << \
+        __LINE__ << std::endl; \
     abort(); \
   } \
 }
 
 namespace {
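The CUDA_CHECK macro above reports the failing file and line, then aborts, so any CUDA API error stops the plugin immediately instead of propagating garbage. The same check-and-abort pattern, sketched in Python purely for illustration (`cuda_check` and its arguments are hypothetical):

```python
import sys

def cuda_check(status, file, line):
    # mirrors CUDA_CHECK: nonzero status reports the call site and aborts
    if status != 0:
        print(f'CUDA failure: error {status} in file {file} at line {line}',
              file=sys.stderr)
        sys.exit(1)
```

Wrapping every cudaMalloc/cudaMemcpyAsync/kernel-launcher call this way keeps the error as close as possible to its source.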
@@ -62,7 +62,7 @@ class YoloLayer : public nvinfer1::IPluginV2DynamicExt {
 
     void serialize(void* buffer) const noexcept override;
 
-    int getNbOutputs() const noexcept override { return 3; }
+    int getNbOutputs() const noexcept override { return 1; }
 
     nvinfer1::DimsExprs getOutputDimensions(INT index, const nvinfer1::DimsExprs* inputs, INT nbInputDims,
         nvinfer1::IExprBuilder& exprBuilder) noexcept override;
@@ -70,8 +70,8 @@ class YoloLayer : public nvinfer1::IPluginV2DynamicExt {
     size_t getWorkspaceSize(const nvinfer1::PluginTensorDesc* inputs, INT nbInputs,
         const nvinfer1::PluginTensorDesc* outputs, INT nbOutputs) const noexcept override { return 0; }
 
-    bool supportsFormatCombination(INT pos, const nvinfer1::PluginTensorDesc* inOut, INT nbInputs, INT nbOutputs) noexcept
-        override;
+    bool supportsFormatCombination(INT pos, const nvinfer1::PluginTensorDesc* inOut, INT nbInputs, INT nbOutputs)
+        noexcept override;
 
     const char* getPluginType() const noexcept override { return YOLOLAYER_PLUGIN_NAME; }
@@ -84,8 +84,8 @@ class YoloLayer : public nvinfer1::IPluginV2DynamicExt {
     nvinfer1::DataType getOutputDataType(INT index, const nvinfer1::DataType* inputTypes, INT nbInputs) const noexcept
         override;
 
-    void attachToContext(cudnnContext* cudnnContext, cublasContext* cublasContext, nvinfer1::IGpuAllocator* gpuAllocator)
-        noexcept override {}
+    void attachToContext(cudnnContext* cudnnContext, cublasContext* cublasContext,
+        nvinfer1::IGpuAllocator* gpuAllocator) noexcept override {}
 
     void configurePlugin(const nvinfer1::DynamicPluginTensorDesc* in, INT nbInput,
         const nvinfer1::DynamicPluginTensorDesc* out, INT nbOutput) noexcept override;
@@ -126,8 +126,8 @@ class YoloLayerPluginCreator : public nvinfer1::IPluginCreator {
       return nullptr;
     }
 
-    nvinfer1::IPluginV2DynamicExt* deserializePlugin(const char* name, const void* serialData, size_t serialLength) noexcept
-        override {
+    nvinfer1::IPluginV2DynamicExt* deserializePlugin(const char* name, const void* serialData, size_t serialLength)
+        noexcept override {
      std::cout << "Deserialize yoloLayer plugin: " << name << std::endl;
      return new YoloLayer(serialData, serialLength);
    }
utils/export_codetr.py (new file, 149 lines)
@@ -0,0 +1,149 @@
+import os
+import types
+import onnx
+import torch
+import torch.nn as nn
+from copy import deepcopy
+
+from projects import *
+from mmengine.registry import MODELS
+from mmdeploy.utils import load_config
+from mmdet.utils import register_all_modules
+from mmengine.model import revert_sync_batchnorm
+from mmengine.runner.checkpoint import load_checkpoint
+
+
+class DeepStreamOutput(nn.Module):
+    def __init__(self):
+        super().__init__()
+
+    def forward(self, x):
+        boxes = []
+        scores = []
+        labels = []
+        for det in x:
+            boxes.append(det.bboxes)
+            scores.append(det.scores.unsqueeze(-1))
+            labels.append(det.labels.unsqueeze(-1))
+        boxes = torch.stack(boxes, dim=0)
+        scores = torch.stack(scores, dim=0)
+        labels = torch.stack(labels, dim=0)
+        return torch.cat([boxes, scores, labels.to(boxes.dtype)], dim=-1)
+
+
+def forward_deepstream(self, batch_inputs, batch_data_samples):
+    b, _, h, w = batch_inputs.shape
+    batch_data_samples = [{'batch_input_shape': (h, w), 'img_shape': (h, w)} for _ in range(b)]
+    img_feats = self.extract_feat(batch_inputs)
+    return self.predict_query_head(img_feats, batch_data_samples, rescale=False)
+
+
+def query_head_predict_deepstream(self, feats, batch_data_samples, rescale=False):
+    with torch.no_grad():
+        outs = self.forward(feats, batch_data_samples)
+        predictions = self.predict_by_feat(
+            *outs, batch_img_metas=batch_data_samples, rescale=rescale)
+    return predictions
+
+
+def codetr_export(weights, config, device):
+    register_all_modules()
+    model_cfg = load_config(config)[0]
+    model = deepcopy(model_cfg.model)
+    model.pop('pretrained', None)
+    for key in model['train_cfg']:
+        if 'rpn_proposal' in key:
+            key['rpn_proposal'] = {}
+    model['test_cfg'] = [{}, {'rpn': {}, 'rcnn': {}}, {}]
+    preprocess_cfg = deepcopy(model_cfg.get('preprocess_cfg', {}))
+    preprocess_cfg.update(deepcopy(model_cfg.get('data_preprocessor', {})))
+    model.setdefault('data_preprocessor', preprocess_cfg)
+    model = MODELS.build(model)
+    load_checkpoint(model, weights, map_location=device)
+    model = revert_sync_batchnorm(model)
+    if hasattr(model, 'backbone') and hasattr(model.backbone, 'switch_to_deploy'):
+        model.backbone.switch_to_deploy()
+    if hasattr(model, 'switch_to_deploy') and callable(model.switch_to_deploy):
+        model.switch_to_deploy()
+    model = model.to(device)
+    model.eval()
+    del model.data_preprocessor
+    model._forward = types.MethodType(forward_deepstream, model)
+    model.query_head.predict = types.MethodType(query_head_predict_deepstream, model.query_head)
+    return model
+
+
+def suppress_warnings():
+    import warnings
+    warnings.filterwarnings('ignore', category=torch.jit.TracerWarning)
+    warnings.filterwarnings('ignore', category=UserWarning)
+    warnings.filterwarnings('ignore', category=DeprecationWarning)
+    warnings.filterwarnings('ignore', category=FutureWarning)
+    warnings.filterwarnings('ignore', category=ResourceWarning)
+
+
+def main(args):
+    suppress_warnings()
+
+    print(f'\nStarting: {args.weights}')
+
+    print('Opening CO-DETR model')
+
+    device = torch.device('cpu')
+    model = codetr_export(args.weights, args.config, device)
+
+    model = nn.Sequential(model, DeepStreamOutput())
+
+    img_size = args.size * 2 if len(args.size) == 1 else args.size
+
+    onnx_input_im = torch.zeros(args.batch, 3, *img_size).to(device)
+    onnx_output_file = f'{args.weights}.onnx'
+
+    dynamic_axes = {
+        'input': {
+            0: 'batch'
+        },
+        'output': {
+            0: 'batch'
+        }
+    }
+
+    print('Exporting the model to ONNX')
+    torch.onnx.export(
+        model, onnx_input_im, onnx_output_file, verbose=False, opset_version=args.opset, do_constant_folding=True,
+        input_names=['input'], output_names=['output'], dynamic_axes=dynamic_axes if args.dynamic else None
+    )
+
+    if args.simplify:
+        print('Simplifying the ONNX model')
+        import onnxslim
+        model_onnx = onnx.load(onnx_output_file)
+        model_onnx = onnxslim.slim(model_onnx)
+        onnx.save(model_onnx, onnx_output_file)
+
+    print(f'Done: {onnx_output_file}\n')
+
+
+def parse_args():
+    import argparse
+    parser = argparse.ArgumentParser(description='DeepStream CO-DETR conversion')
+    parser.add_argument('-w', '--weights', required=True, type=str, help='Input weights (.pth) file path (required)')
+    parser.add_argument('-c', '--config', required=True, help='Input config (.py) file path (required)')
+    parser.add_argument('-s', '--size', nargs='+', type=int, default=[640], help='Inference size [H,W] (default [640])')
+    parser.add_argument('--opset', type=int, default=11, help='ONNX opset version')
|
||||||
|
parser.add_argument('--simplify', action='store_true', help='ONNX simplify model')
|
||||||
|
parser.add_argument('--dynamic', action='store_true', help='Dynamic batch-size')
|
||||||
|
parser.add_argument('--batch', type=int, default=1, help='Static batch-size')
|
||||||
|
args = parser.parse_args()
|
||||||
|
if not os.path.isfile(args.weights):
|
||||||
|
raise SystemExit('Invalid weights file')
|
||||||
|
if not os.path.isfile(args.config):
|
||||||
|
raise SystemExit('Invalid config file')
|
||||||
|
if args.dynamic and args.batch > 1:
|
||||||
|
raise SystemExit('Cannot set dynamic batch-size and static batch-size at same time')
|
||||||
|
return args
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == '__main__':
|
||||||
|
args = parse_args()
|
||||||
|
main(args)
|
||||||
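Every exporter in this changeset now converges on the same single-tensor output contract: per-class confidences are reduced with a max over the class axis, and boxes, scores, and labels are concatenated into one `[batch, num_dets, 6]` tensor. A minimal sketch of that reduction (NumPy stands in for torch here; the values are illustrative):

```python
import numpy as np

def deepstream_output(boxes, class_scores):
    """Collapse per-class scores into (score, label) and pack one tensor.

    boxes: [batch, num_dets, 4], class_scores: [batch, num_dets, num_classes]
    returns: [batch, num_dets, 6] -> x1, y1, x2, y2, score, label
    """
    scores = class_scores.max(axis=-1, keepdims=True)                    # best class confidence
    labels = class_scores.argmax(axis=-1)[..., None].astype(boxes.dtype) # winning class index as float
    return np.concatenate([boxes, scores, labels], axis=-1)

boxes = np.zeros((1, 3, 4), dtype=np.float32)
class_scores = np.array([[[0.1, 0.9], [0.7, 0.2], [0.4, 0.6]]], dtype=np.float32)
out = deepstream_output(boxes, class_scores)   # shape (1, 3, 6)
```

Packing everything into one `output` tensor is what lets the diffs below replace the three dynamic axes (`boxes`, `scores`, `classes`) with a single `output` axis.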
@@ -1,14 +1,12 @@
 import os
-import sys
-import argparse
-import warnings
 import onnx
 import torch
 import torch.nn as nn
-from damo.base_models.core.ops import RepConv, SiLU
 from damo.config.base import parse_config
-from damo.detectors.detector import build_local_model
 from damo.utils.model_utils import replace_module
+from damo.base_models.core.ops import RepConv, SiLU
+from damo.detectors.detector import build_local_model


 class DeepStreamOutput(nn.Module):
@@ -17,15 +15,17 @@ class DeepStreamOutput(nn.Module):

     def forward(self, x):
         boxes = x[1]
-        scores, classes = torch.max(x[0], 2, keepdim=True)
-        classes = classes.float()
-        return boxes, scores, classes
+        scores, labels = torch.max(x[0], dim=-1, keepdim=True)
+        return torch.cat([boxes, scores, labels.to(boxes.dtype)], dim=-1)


 def suppress_warnings():
+    import warnings
     warnings.filterwarnings('ignore', category=torch.jit.TracerWarning)
     warnings.filterwarnings('ignore', category=UserWarning)
     warnings.filterwarnings('ignore', category=DeprecationWarning)
+    warnings.filterwarnings('ignore', category=FutureWarning)
+    warnings.filterwarnings('ignore', category=ResourceWarning)


 def damoyolo_export(weights, config_file, device):
@@ -48,7 +48,7 @@ def damoyolo_export(weights, config_file, device):
 def main(args):
     suppress_warnings()

-    print('\nStarting: %s' % args.weights)
+    print(f'\nStarting: {args.weights}')

     print('Opening DAMO-YOLO model')

@@ -57,49 +57,44 @@ def main(args):

     if len(cfg.dataset['class_names']) > 0:
         print('Creating labels.txt file')
-        f = open('labels.txt', 'w')
-        for name in cfg.dataset['class_names']:
-            f.write(name + '\n')
-        f.close()
+        with open('labels.txt', 'w', encoding='utf-8') as f:
+            for name in cfg.dataset['class_names']:
+                f.write(f'{name}\n')

     model = nn.Sequential(model, DeepStreamOutput())

     img_size = args.size * 2 if len(args.size) == 1 else args.size

     onnx_input_im = torch.zeros(args.batch, 3, *img_size).to(device)
-    onnx_output_file = cfg.miscs['exp_name'] + '.onnx'
+    onnx_output_file = f'{args.weights}.onnx'

     dynamic_axes = {
         'input': {
             0: 'batch'
         },
-        'boxes': {
-            0: 'batch'
-        },
-        'scores': {
-            0: 'batch'
-        },
-        'classes': {
+        'output': {
             0: 'batch'
         }
     }

     print('Exporting the model to ONNX')
-    torch.onnx.export(model, onnx_input_im, onnx_output_file, verbose=False, opset_version=args.opset,
-                      do_constant_folding=True, input_names=['input'], output_names=['boxes', 'scores', 'classes'],
-                      dynamic_axes=dynamic_axes if args.dynamic else None)
+    torch.onnx.export(
+        model, onnx_input_im, onnx_output_file, verbose=False, opset_version=args.opset, do_constant_folding=True,
+        input_names=['input'], output_names=['output'], dynamic_axes=dynamic_axes if args.dynamic else None
+    )

     if args.simplify:
         print('Simplifying the ONNX model')
-        import onnxsim
+        import onnxslim
         model_onnx = onnx.load(onnx_output_file)
-        model_onnx, _ = onnxsim.simplify(model_onnx)
+        model_onnx = onnxslim.slim(model_onnx)
         onnx.save(model_onnx, onnx_output_file)

-    print('Done: %s\n' % onnx_output_file)
+    print(f'Done: {onnx_output_file}\n')


 def parse_args():
+    import argparse
     parser = argparse.ArgumentParser(description='DeepStream DAMO-YOLO conversion')
     parser.add_argument('-w', '--weights', required=True, help='Input weights (.pth) file path (required)')
     parser.add_argument('-c', '--config', required=True, help='Input config (.py) file path (required)')
@@ -120,4 +115,4 @@ def parse_args():

 if __name__ == '__main__':
     args = parse_args()
-    sys.exit(main(args))
+    main(args)
utils/export_goldyolo.py (new file)
@@ -0,0 +1,121 @@
+import os
+import onnx
+import torch
+import torch.nn as nn
+
+import yolov6.utils.general as _m
+from yolov6.layers.common import SiLU
+from gold_yolo.switch_tool import switch_to_deploy
+from yolov6.utils.checkpoint import load_checkpoint
+
+
+def _dist2bbox(distance, anchor_points, box_format='xyxy'):
+    lt, rb = torch.split(distance, 2, -1)
+    x1y1 = anchor_points - lt
+    x2y2 = anchor_points + rb
+    bbox = torch.cat([x1y1, x2y2], -1)
+    return bbox
+
+
+_m.dist2bbox.__code__ = _dist2bbox.__code__
+
+
+class DeepStreamOutput(nn.Module):
+    def __init__(self):
+        super().__init__()
+
+    def forward(self, x):
+        boxes = x[:, :, :4]
+        objectness = x[:, :, 4:5]
+        scores, labels = torch.max(x[:, :, 5:], dim=-1, keepdim=True)
+        scores *= objectness
+        return torch.cat([boxes, scores, labels.to(boxes.dtype)], dim=-1)
+
+
+def gold_yolo_export(weights, device, inplace=True, fuse=True):
+    model = load_checkpoint(weights, map_location=device, inplace=inplace, fuse=fuse)
+    model = switch_to_deploy(model)
+    for layer in model.modules():
+        t = type(layer)
+        if t.__name__ == 'RepVGGBlock':
+            layer.switch_to_deploy()
+    model.eval()
+    for k, m in model.named_modules():
+        if m.__class__.__name__ == 'Conv':
+            if isinstance(m.act, nn.SiLU):
+                m.act = SiLU()
+        elif m.__class__.__name__ == 'Detect':
+            m.inplace = False
+    return model
+
+
+def suppress_warnings():
+    import warnings
+    warnings.filterwarnings('ignore', category=torch.jit.TracerWarning)
+    warnings.filterwarnings('ignore', category=UserWarning)
+    warnings.filterwarnings('ignore', category=DeprecationWarning)
+    warnings.filterwarnings('ignore', category=FutureWarning)
+    warnings.filterwarnings('ignore', category=ResourceWarning)
+
+
+def main(args):
+    suppress_warnings()
+
+    print(f'\nStarting: {args.weights}')
+
+    print('Opening Gold-YOLO model')
+
+    device = torch.device('cpu')
+    model = gold_yolo_export(args.weights, device)
+
+    model = nn.Sequential(model, DeepStreamOutput())
+
+    img_size = args.size * 2 if len(args.size) == 1 else args.size
+
+    onnx_input_im = torch.zeros(args.batch, 3, *img_size).to(device)
+    onnx_output_file = f'{args.weights}.onnx'
+
+    dynamic_axes = {
+        'input': {
+            0: 'batch'
+        },
+        'output': {
+            0: 'batch'
+        }
+    }
+
+    print('Exporting the model to ONNX')
+    torch.onnx.export(
+        model, onnx_input_im, onnx_output_file, verbose=False, opset_version=args.opset, do_constant_folding=True,
+        input_names=['input'], output_names=['output'], dynamic_axes=dynamic_axes if args.dynamic else None
+    )
+
+    if args.simplify:
+        print('Simplifying the ONNX model')
+        import onnxslim
+        model_onnx = onnx.load(onnx_output_file)
+        model_onnx = onnxslim.slim(model_onnx)
+        onnx.save(model_onnx, onnx_output_file)
+
+    print(f'Done: {onnx_output_file}\n')
+
+
+def parse_args():
+    import argparse
+    parser = argparse.ArgumentParser(description='DeepStream Gold-YOLO conversion')
+    parser.add_argument('-w', '--weights', required=True, help='Input weights (.pt) file path (required)')
+    parser.add_argument('-s', '--size', nargs='+', type=int, default=[640], help='Inference size [H,W] (default [640])')
+    parser.add_argument('--opset', type=int, default=13, help='ONNX opset version')
+    parser.add_argument('--simplify', action='store_true', help='ONNX simplify model')
+    parser.add_argument('--dynamic', action='store_true', help='Dynamic batch-size')
+    parser.add_argument('--batch', type=int, default=1, help='Static batch-size')
+    args = parser.parse_args()
+    if not os.path.isfile(args.weights):
+        raise SystemExit('Invalid weights file')
+    if args.dynamic and args.batch > 1:
+        raise SystemExit('Cannot set dynamic batch-size and static batch-size at same time')
+    return args
+
+
+if __name__ == '__main__':
+    args = parse_args()
+    main(args)
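The patched `_dist2bbox` above turns per-anchor edge distances (left, top, right, bottom) into corner coordinates by subtracting and adding around the anchor point. A framework-free sketch with illustrative numbers (NumPy in place of torch):

```python
import numpy as np

def dist2bbox(distance, anchor_points):
    # distance: [..., 4] = (left, top, right, bottom) offsets from the anchor
    lt, rb = distance[..., :2], distance[..., 2:]
    x1y1 = anchor_points - lt   # top-left corner
    x2y2 = anchor_points + rb   # bottom-right corner
    return np.concatenate([x1y1, x2y2], axis=-1)

anchor = np.array([10.0, 10.0])
dist = np.array([2.0, 3.0, 4.0, 5.0])
box = dist2bbox(dist, anchor)   # -> [8., 7., 14., 15.]
```

Patching `__code__` in the export script swaps this decoded-corner variant in for the library's own `dist2bbox` without editing the Gold-YOLO sources.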
@@ -1,14 +1,14 @@
 import os
-import sys
 import onnx
 import paddle
 import paddle.nn as nn
-from ppdet.core.workspace import load_config, merge_config
-from ppdet.utils.check import check_version, check_config
-from ppdet.utils.cli import ArgsParser
 from ppdet.engine import Trainer
+from ppdet.utils.cli import ArgsParser
 from ppdet.slim import build_slim_model
 from ppdet.data.source.category import get_categories
+from ppdet.utils.check import check_version, check_config
+from ppdet.core.workspace import load_config, merge_config


 class DeepStreamOutput(nn.Layer):
@@ -18,9 +18,20 @@ class DeepStreamOutput(nn.Layer):
     def forward(self, x):
         boxes = x['bbox']
         x['bbox_num'] = x['bbox_num'].transpose([0, 2, 1])
-        scores = paddle.max(x['bbox_num'], 2, keepdim=True)
-        classes = paddle.cast(paddle.argmax(x['bbox_num'], 2, keepdim=True), dtype='float32')
-        return boxes, scores, classes
+        scores = paddle.max(x['bbox_num'], axis=-1, keepdim=True)
+        labels = paddle.argmax(x['bbox_num'], axis=-1, keepdim=True)
+        return paddle.concat((boxes, scores, paddle.cast(labels, dtype=boxes.dtype)), axis=-1)
+
+
+class DeepStreamInput(nn.Layer):
+    def __init__(self):
+        super().__init__()
+
+    def forward(self, x):
+        y = {}
+        y['image'] = x['image']
+        y['scale_factor'] = paddle.to_tensor([1.0, 1.0], dtype=x['image'].dtype)
+        return y


 def ppyoloe_export(FLAGS):
@@ -43,10 +54,17 @@ def ppyoloe_export(FLAGS):
     return trainer.cfg, static_model


+def suppress_warnings():
+    import warnings
+    warnings.filterwarnings('ignore')
+
+
 def main(FLAGS):
-    print('\nStarting: %s' % FLAGS.weights)
+    suppress_warnings()
+
+    print(f'\nStarting: {FLAGS.weights}')

-    print('\nOpening PPYOLOE model\n')
+    print('Opening PPYOLOE model')

     paddle.set_device('cpu')
     cfg, model = ppyoloe_export(FLAGS)
@@ -54,32 +72,30 @@ def main(FLAGS):
     anno_file = cfg['TestDataset'].get_anno()
     if os.path.isfile(anno_file):
         _, catid2name = get_categories(cfg['metric'], anno_file, 'detection_arch')
-        print('\nCreating labels.txt file')
-        f = open('labels.txt', 'w')
-        for name in catid2name.values():
-            f.write(str(name) + '\n')
-        f.close()
+        print('Creating labels.txt file')
+        with open('labels.txt', 'w', encoding='utf-8') as f:
+            for name in catid2name.values():
+                f.write(f'{name}\n')

-    model = nn.Sequential(model, DeepStreamOutput())
+    model = nn.Sequential(DeepStreamInput(), model, DeepStreamOutput())

     img_size = [cfg.eval_height, cfg.eval_width]

     onnx_input_im = {}
-    onnx_input_im['image'] = paddle.static.InputSpec(shape=[FLAGS.batch, 3, *img_size], dtype='float32', name='image')
-    onnx_input_im['scale_factor'] = paddle.static.InputSpec(shape=[FLAGS.batch, 2], dtype='float32', name='scale_factor')
-    onnx_output_file = cfg.filename + '.onnx'
+    onnx_input_im['image'] = paddle.static.InputSpec(shape=[FLAGS.batch, 3, *img_size], dtype='float32')
+    onnx_output_file = f'{FLAGS.weights}.onnx'

-    print('\nExporting the model to ONNX\n')
-    paddle.onnx.export(model, cfg.filename, input_spec=[onnx_input_im], opset_version=FLAGS.opset)
+    print('Exporting the model to ONNX')
+    paddle.onnx.export(model, FLAGS.weights, input_spec=[onnx_input_im], opset_version=FLAGS.opset)

     if FLAGS.simplify:
-        print('\nSimplifying the ONNX model')
-        import onnxsim
+        print('Simplifying the ONNX model')
+        import onnxslim
         model_onnx = onnx.load(onnx_output_file)
-        model_onnx, _ = onnxsim.simplify(model_onnx)
+        model_onnx = onnxslim.slim(model_onnx)
         onnx.save(model_onnx, onnx_output_file)

-    print('\nDone: %s\n' % onnx_output_file)
+    print(f'Done: {onnx_output_file}\n')


 def parse_args():
@@ -92,9 +108,9 @@ def parse_args():
     parser.add_argument('--batch', type=int, default=1, help='Static batch-size')
     args = parser.parse_args()
     if not os.path.isfile(args.weights):
-        raise SystemExit('\nInvalid weights file')
+        raise SystemExit('Invalid weights file')
     if args.dynamic and args.batch > 1:
-        raise SystemExit('\nCannot set dynamic batch-size and static batch-size at same time')
+        raise SystemExit('Cannot set dynamic batch-size and static batch-size at same time')
     elif args.dynamic:
         args.batch = None
     return args
@@ -102,4 +118,4 @@ def parse_args():

 if __name__ == '__main__':
     FLAGS = parse_args()
-    sys.exit(main(FLAGS))
+    main(FLAGS)
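The `DeepStreamInput` wrapper added above pins `scale_factor` to `[1.0, 1.0]` inside the graph, so the exported model takes only the image tensor and PP-YOLOE's internal rescaling becomes a no-op. A framework-free sketch of the same idea (a plain Python dict in place of Paddle tensors; names are illustrative):

```python
def deepstream_input(x):
    # Mirror of the DeepStreamInput wrapper: forward the image unchanged and
    # pin scale_factor to 1.0 so the exported graph needs no second input.
    return {'image': x['image'], 'scale_factor': [1.0, 1.0]}

# Whatever scale_factor the caller supplies, the model sees 1.0:
y = deepstream_input({'image': 'input-tensor', 'scale_factor': [2.0, 2.0]})
```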
@@ -1,34 +1,32 @@
 import os
-import sys
-import warnings
 import onnx
 import paddle
 import paddle.nn as nn
 import paddle.nn.functional as F
-from ppdet.core.workspace import load_config, merge_config
-from ppdet.utils.check import check_version, check_config
-from ppdet.utils.cli import ArgsParser
 from ppdet.engine import Trainer
+from ppdet.utils.cli import ArgsParser
+from ppdet.utils.check import check_version, check_config
+from ppdet.core.workspace import load_config, merge_config


 class DeepStreamOutput(nn.Layer):
     def __init__(self, img_size, use_focal_loss):
+        super().__init__()
         self.img_size = img_size
         self.use_focal_loss = use_focal_loss
-        super().__init__()

     def forward(self, x):
         boxes = x['bbox']
-        out_shape = paddle.to_tensor([[*self.img_size]]).flip(1).tile([1, 2]).unsqueeze(1)
-        boxes *= out_shape
+        convert_matrix = paddle.to_tensor(
+            [[1, 0, 1, 0], [0, 1, 0, 1], [-0.5, 0, 0.5, 0], [0, -0.5, 0, 0.5]], dtype=boxes.dtype
+        )
+        boxes @= convert_matrix
+        boxes *= paddle.to_tensor([[*self.img_size]]).flip(1).tile([1, 2]).unsqueeze(1)
         bbox_num = F.sigmoid(x['bbox_num']) if self.use_focal_loss else F.softmax(x['bbox_num'])[:, :, :-1]
-        scores = paddle.max(bbox_num, 2, keepdim=True)
-        classes = paddle.cast(paddle.argmax(bbox_num, 2, keepdim=True), dtype='float32')
-        return boxes, scores, classes
-
-
-def suppress_warnings():
-    warnings.filterwarnings('ignore')
+        scores = paddle.max(bbox_num, axis=-1, keepdim=True)
+        labels = paddle.argmax(bbox_num, axis=-1, keepdim=True)
+        return paddle.concat((boxes, scores, paddle.cast(labels, dtype=boxes.dtype)), axis=-1)


 def rtdetr_paddle_export(FLAGS):
@@ -50,12 +48,17 @@ def rtdetr_paddle_export(FLAGS):
     return trainer.cfg, static_model


+def suppress_warnings():
+    import warnings
+    warnings.filterwarnings('ignore')
+
+
 def main(FLAGS):
     suppress_warnings()

-    print('\nStarting: %s' % FLAGS.weights)
+    print(f'\nStarting: {FLAGS.weights}')

-    print('\nOpening RT-DETR Paddle model\n')
+    print('Opening RT-DETR Paddle model')

     paddle.set_device('cpu')
     cfg, model = rtdetr_paddle_export(FLAGS)
@@ -65,20 +68,20 @@ def main(FLAGS):
     model = nn.Sequential(model, DeepStreamOutput(img_size, cfg.use_focal_loss))

     onnx_input_im = {}
-    onnx_input_im['image'] = paddle.static.InputSpec(shape=[FLAGS.batch, 3, *img_size], dtype='float32', name='image')
-    onnx_output_file = cfg.filename + '.onnx'
+    onnx_input_im['image'] = paddle.static.InputSpec(shape=[FLAGS.batch, 3, *img_size], dtype='float32')
+    onnx_output_file = f'{FLAGS.weights}.onnx'

-    print('\nExporting the model to ONNX\n')
-    paddle.onnx.export(model, cfg.filename, input_spec=[onnx_input_im], opset_version=FLAGS.opset)
+    print('Exporting the model to ONNX\n')
+    paddle.onnx.export(model, FLAGS.weights, input_spec=[onnx_input_im], opset_version=FLAGS.opset)

     if FLAGS.simplify:
-        print('\nSimplifying the ONNX model')
-        import onnxsim
+        print('Simplifying the ONNX model')
+        import onnxslim
         model_onnx = onnx.load(onnx_output_file)
-        model_onnx, _ = onnxsim.simplify(model_onnx)
+        model_onnx = onnxslim.slim(model_onnx)
         onnx.save(model_onnx, onnx_output_file)

-    print('\nDone: %s\n' % onnx_output_file)
+    print(f'Done: {onnx_output_file}\n')


 def parse_args():
@@ -91,9 +94,9 @@ def parse_args():
     parser.add_argument('--batch', type=int, default=1, help='Static batch-size')
     args = parser.parse_args()
     if not os.path.isfile(args.weights):
-        raise SystemExit('\nInvalid weights file')
+        raise SystemExit('Invalid weights file')
     if args.dynamic and args.batch > 1:
-        raise SystemExit('\nCannot set dynamic batch-size and static batch-size at same time')
+        raise SystemExit('Cannot set dynamic batch-size and static batch-size at same time')
     elif args.dynamic:
         args.batch = None
     return args
@@ -101,4 +104,4 @@ def parse_args():

 if __name__ == '__main__':
     FLAGS = parse_args()
-    sys.exit(main(FLAGS))
+    main(FLAGS)
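The `convert_matrix` introduced above maps `(cx, cy, w, h)` boxes to `(x1, y1, x2, y2)` with a single matrix multiply: each output column is a linear combination of center and size (`x1 = cx - w/2`, `x2 = cx + w/2`, and likewise for y). A small check with made-up numbers (NumPy in place of paddle/torch):

```python
import numpy as np

# Right-multiplying (cx, cy, w, h) by this matrix yields (x1, y1, x2, y2).
convert_matrix = np.array([
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [-0.5, 0, 0.5, 0],
    [0, -0.5, 0, 0.5],
], dtype=np.float32)

cxcywh = np.array([[50.0, 40.0, 20.0, 10.0]], dtype=np.float32)  # center (50, 40), 20x10 box
xyxy = cxcywh @ convert_matrix                                   # -> [[40., 35., 60., 45.]]
```

Expressing the conversion as one matmul keeps the exported ONNX graph free of the slice-assignments (`boxes[:, :, [0, 2]] *= ...`) that the old in-place version needed.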
@@ -1,31 +1,28 @@
|
|||||||
import os
|
import os
|
||||||
import sys
|
|
||||||
import argparse
|
|
||||||
import warnings
|
|
||||||
import onnx
|
import onnx
|
||||||
import torch
|
import torch
|
||||||
import torch.nn as nn
|
import torch.nn as nn
|
||||||
|
import torch.nn.functional as F
|
||||||
|
|
||||||
from src.core import YAMLConfig
|
from src.core import YAMLConfig
|
||||||
|
|
||||||
|
|
||||||
class DeepStreamOutput(nn.Module):
|
class DeepStreamOutput(nn.Module):
|
||||||
def __init__(self, img_size):
|
def __init__(self, img_size, use_focal_loss):
|
||||||
self.img_size = img_size
|
|
||||||
super().__init__()
|
super().__init__()
|
||||||
|
self.img_size = img_size
|
||||||
|
self.use_focal_loss = use_focal_loss
|
||||||
|
|
||||||
def forward(self, x):
|
def forward(self, x):
|
||||||
boxes = x['pred_boxes']
|
boxes = x['pred_boxes']
|
||||||
boxes[:, :, [0, 2]] *= self.img_size[1]
|
convert_matrix = torch.tensor(
|
||||||
boxes[:, :, [1, 3]] *= self.img_size[0]
|
[[1, 0, 1, 0], [0, 1, 0, 1], [-0.5, 0, 0.5, 0], [0, -0.5, 0, 0.5]], dtype=boxes.dtype, device=boxes.device
|
||||||
scores, classes = torch.max(x['pred_logits'], 2, keepdim=True)
|
)
|
||||||
classes = classes.float()
|
boxes @= convert_matrix
|
||||||
return boxes, scores, classes
|
boxes *= torch.as_tensor([[*self.img_size]]).flip(1).tile([1, 2]).unsqueeze(1)
|
||||||
|
scores = F.sigmoid(x['pred_logits']) if self.use_focal_loss else F.softmax(x['pred_logits'])[:, :, :-1]
|
||||||
|
scores, labels = torch.max(scores, dim=-1, keepdim=True)
|
||||||
def suppress_warnings():
|
return torch.cat([boxes, scores, labels.to(boxes.dtype)], dim=-1)
|
||||||
warnings.filterwarnings('ignore', category=torch.jit.TracerWarning)
|
|
||||||
warnings.filterwarnings('ignore', category=UserWarning)
|
|
||||||
warnings.filterwarnings('ignore', category=DeprecationWarning)
|
|
||||||
|
|
||||||
|
|
||||||
def rtdetr_pytorch_export(weights, cfg_file, device):
|
def rtdetr_pytorch_export(weights, cfg_file, device):
|
||||||
@@ -36,57 +33,62 @@ def rtdetr_pytorch_export(weights, cfg_file, device):
|
|||||||
else:
|
else:
|
||||||
state = checkpoint['model']
|
state = checkpoint['model']
|
||||||
cfg.model.load_state_dict(state)
|
cfg.model.load_state_dict(state)
|
||||||
return cfg.model.deploy()
|
return cfg.model.deploy(), cfg.postprocessor.use_focal_loss
|
||||||
|
|
||||||
|
|
||||||
|
def suppress_warnings():
|
||||||
|
import warnings
|
||||||
|
warnings.filterwarnings('ignore', category=torch.jit.TracerWarning)
|
||||||
|
warnings.filterwarnings('ignore', category=UserWarning)
|
||||||
|
warnings.filterwarnings('ignore', category=DeprecationWarning)
|
||||||
|
warnings.filterwarnings('ignore', category=FutureWarning)
|
||||||
|
warnings.filterwarnings('ignore', category=ResourceWarning)
|
||||||
|
|
||||||
|
|
||||||
def main(args):
|
def main(args):
|
||||||
suppress_warnings()
|
suppress_warnings()
|
||||||
|
|
||||||
print('\nStarting: %s' % args.weights)
|
print(f'\nStarting: {args.weights}')
|
||||||
|
|
||||||
print('Opening RT-DETR PyTorch model\n')
|
print('Opening RT-DETR PyTorch model')
|
||||||
|
|
||||||
device = torch.device('cpu')
|
device = torch.device('cpu')
|
||||||
model = rtdetr_pytorch_export(args.weights, args.config, device)
|
model, use_focal_loss = rtdetr_pytorch_export(args.weights, args.config, device)
|
||||||
|
|
||||||
img_size = args.size * 2 if len(args.size) == 1 else args.size
|
img_size = args.size * 2 if len(args.size) == 1 else args.size
|
||||||
|
|
||||||
model = nn.Sequential(model, DeepStreamOutput(img_size))
|
model = nn.Sequential(model, DeepStreamOutput(img_size, use_focal_loss))
|
||||||
|
|
||||||
onnx_input_im = torch.zeros(args.batch, 3, *img_size).to(device)
|
onnx_input_im = torch.zeros(args.batch, 3, *img_size).to(device)
|
||||||
onnx_output_file = os.path.basename(args.weights).split('.pt')[0] + '.onnx'
|
onnx_output_file = f'{args.weights}.onnx'
|
||||||
|
|
||||||
dynamic_axes = {
|
dynamic_axes = {
|
||||||
'input': {
|
'input': {
|
||||||
0: 'batch'
|
0: 'batch'
|
||||||
},
|
},
|
||||||
'boxes': {
|
'output': {
|
||||||
0: 'batch'
|
|
||||||
},
|
|
||||||
'scores': {
|
|
||||||
0: 'batch'
|
|
||||||
},
|
|
||||||
-        'classes': {
             0: 'batch'
         }
     }
 
-    print('\nExporting the model to ONNX')
-    torch.onnx.export(model, onnx_input_im, onnx_output_file, verbose=False, opset_version=args.opset,
-                      do_constant_folding=True, input_names=['input'], output_names=['boxes', 'scores', 'classes'],
-                      dynamic_axes=dynamic_axes if args.dynamic else None)
+    print('Exporting the model to ONNX')
+    torch.onnx.export(
+        model, onnx_input_im, onnx_output_file, verbose=False, opset_version=args.opset, do_constant_folding=True,
+        input_names=['input'], output_names=['output'], dynamic_axes=dynamic_axes if args.dynamic else None
+    )
 
     if args.simplify:
         print('Simplifying the ONNX model')
-        import onnxsim
+        import onnxslim
         model_onnx = onnx.load(onnx_output_file)
-        model_onnx, _ = onnxsim.simplify(model_onnx)
+        model_onnx = onnxslim.slim(model_onnx)
         onnx.save(model_onnx, onnx_output_file)
 
-    print('Done: %s\n' % onnx_output_file)
+    print(f'Done: {onnx_output_file}\n')
 
 
 def parse_args():
+    import argparse
     parser = argparse.ArgumentParser(description='DeepStream RT-DETR PyTorch conversion')
     parser.add_argument('-w', '--weights', required=True, help='Input weights (.pth) file path (required)')
     parser.add_argument('-c', '--config', required=True, help='Input YAML (.yml) file path (required)')
@@ -107,4 +109,4 @@ def parse_args():
 
 if __name__ == '__main__':
     args = parse_args()
-    sys.exit(main(args))
+    main(args)
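Across these diffs, the three separate `boxes`/`scores`/`classes` ONNX outputs are collapsed into a single `output` tensor whose last dimension is `[x1, y1, x2, y2, score, class]`. A minimal sketch of that packing step, using NumPy in place of the `torch.max` + `torch.cat` calls in the diff (the function name `pack_output` and the sample values are illustrative, not from the repo):

```python
import numpy as np

def pack_output(boxes, class_scores):
    # boxes: (batch, num_dets, 4); class_scores: (batch, num_dets, num_classes)
    scores = class_scores.max(axis=-1, keepdims=True)     # best score per detection
    labels = class_scores.argmax(axis=-1)[..., None]      # index of the winning class
    # Single concatenated tensor: [x1, y1, x2, y2, score, class]
    return np.concatenate([boxes, scores, labels.astype(boxes.dtype)], axis=-1)

boxes = np.array([[[0.0, 0.0, 10.0, 10.0]]])
class_scores = np.array([[[0.1, 0.7, 0.2]]])
out = pack_output(boxes, class_scores)
# out has shape (1, 1, 6); the last two entries are the score and the class index
```

One flat tensor keeps the downstream DeepStream parser to a single output binding instead of three.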
@@ -1,34 +1,25 @@
 import os
-import sys
-import argparse
-import warnings
-import onnx
 import torch
 import torch.nn as nn
 from copy import deepcopy
 
 from ultralytics import RTDETR
-from ultralytics.utils.torch_utils import select_device
-from ultralytics.nn.modules import C2f, Detect, RTDETRDecoder
 
 
 class DeepStreamOutput(nn.Module):
     def __init__(self, img_size):
-        self.img_size = img_size
         super().__init__()
+        self.img_size = img_size
 
     def forward(self, x):
         boxes = x[:, :, :4]
-        boxes[:, :, [0, 2]] *= self.img_size[1]
-        boxes[:, :, [1, 3]] *= self.img_size[0]
-        scores, classes = torch.max(x[:, :, 4:], 2, keepdim=True)
-        classes = classes.float()
-        return boxes, scores, classes
+        convert_matrix = torch.tensor(
+            [[1, 0, 1, 0], [0, 1, 0, 1], [-0.5, 0, 0.5, 0], [0, -0.5, 0, 0.5]], dtype=boxes.dtype, device=boxes.device
+        )
+        boxes @= convert_matrix
+        boxes *= torch.as_tensor([[*self.img_size]]).flip(1).tile([1, 2]).unsqueeze(1)
+        scores, labels = torch.max(x[:, :, 4:], dim=-1, keepdim=True)
+        return torch.cat([boxes, scores, labels.to(boxes.dtype)], dim=-1)
 
 
-def suppress_warnings():
-    warnings.filterwarnings('ignore', category=torch.jit.TracerWarning)
-    warnings.filterwarnings('ignore', category=UserWarning)
-    warnings.filterwarnings('ignore', category=DeprecationWarning)
-
-
 def rtdetr_ultralytics_export(weights, device):
@@ -40,74 +31,74 @@ def rtdetr_ultralytics_export(weights, device):
     model.float()
     model = model.fuse()
     for k, m in model.named_modules():
-        if isinstance(m, (Detect, RTDETRDecoder)):
+        if m.__class__.__name__ in ('Detect', 'RTDETRDecoder'):
             m.dynamic = False
             m.export = True
             m.format = 'onnx'
-        elif isinstance(m, C2f):
+        elif m.__class__.__name__ == 'C2f':
             m.forward = m.forward_split
     return model
 
 
+def suppress_warnings():
+    import warnings
+    warnings.filterwarnings('ignore', category=torch.jit.TracerWarning)
+    warnings.filterwarnings('ignore', category=UserWarning)
+    warnings.filterwarnings('ignore', category=DeprecationWarning)
+    warnings.filterwarnings('ignore', category=FutureWarning)
+    warnings.filterwarnings('ignore', category=ResourceWarning)
+
+
 def main(args):
     suppress_warnings()
 
-    print('\nStarting: %s' % args.weights)
+    print(f'\nStarting: {args.weights}')
 
-    print('Opening RT-DETR Ultralytics model\n')
+    print('Opening RT-DETR Ultralytics model')
 
-    device = select_device('cpu')
+    device = torch.device('cpu')
     model = rtdetr_ultralytics_export(args.weights, device)
 
     if len(model.names.keys()) > 0:
-        print('\nCreating labels.txt file')
-        f = open('labels.txt', 'w')
-        for name in model.names.values():
-            f.write(name + '\n')
-        f.close()
+        print('Creating labels.txt file')
+        with open('labels.txt', 'w', encoding='utf-8') as f:
+            for name in model.names.values():
+                f.write(f'{name}\n')
 
     img_size = args.size * 2 if len(args.size) == 1 else args.size
 
     model = nn.Sequential(model, DeepStreamOutput(img_size))
 
     onnx_input_im = torch.zeros(args.batch, 3, *img_size).to(device)
-    onnx_output_file = os.path.basename(args.weights).split('.pt')[0] + '.onnx'
+    onnx_output_file = f'{args.weights}.onnx'
 
     dynamic_axes = {
         'input': {
             0: 'batch'
         },
-        'boxes': {
-            0: 'batch'
-        },
-        'scores': {
-            0: 'batch'
-        },
-        'classes': {
+        'output': {
             0: 'batch'
         }
     }
 
-    print('\nExporting the model to ONNX')
-    torch.onnx.export(model, onnx_input_im, onnx_output_file, verbose=False, opset_version=args.opset,
-                      do_constant_folding=True, input_names=['input'], output_names=['boxes', 'scores', 'classes'],
-                      dynamic_axes=dynamic_axes if args.dynamic else None)
+    print('Exporting the model to ONNX')
+    torch.onnx.export(
+        model, onnx_input_im, onnx_output_file, verbose=False, opset_version=args.opset, do_constant_folding=True,
+        input_names=['input'], output_names=['output'], dynamic_axes=dynamic_axes if args.dynamic else None
+    )
 
     if args.simplify:
-        print('Simplifying the ONNX model')
-        import onnxsim
-        model_onnx = onnx.load(onnx_output_file)
-        model_onnx, _ = onnxsim.simplify(model_onnx)
-        onnx.save(model_onnx, onnx_output_file)
+        print('Simplifying is not available for this model')
 
-    print('Done: %s\n' % onnx_output_file)
+    print(f'Done: {onnx_output_file}\n')
 
 
 def parse_args():
+    import argparse
     parser = argparse.ArgumentParser(description='DeepStream RT-DETR Ultralytics conversion')
     parser.add_argument('-w', '--weights', required=True, help='Input weights (.pt) file path (required)')
     parser.add_argument('-s', '--size', nargs='+', type=int, default=[640], help='Inference size [H,W] (default [640])')
-    parser.add_argument('--opset', type=int, default=16, help='ONNX opset version')
+    parser.add_argument('--opset', type=int, default=17, help='ONNX opset version')
     parser.add_argument('--simplify', action='store_true', help='ONNX simplify model')
     parser.add_argument('--dynamic', action='store_true', help='Dynamic batch-size')
     parser.add_argument('--batch', type=int, default=1, help='Static batch-size')
@@ -121,4 +112,4 @@ def parse_args():
 
 if __name__ == '__main__':
     args = parse_args()
-    sys.exit(main(args))
+    main(args)
utils/export_rtmdet.py (new file, 150 lines)
@@ -0,0 +1,150 @@
+import os
+import types
+import onnx
+import torch
+import torch.nn as nn
+
+from mmdet.apis import init_detector
+from projects.easydeploy.model import DeployModel, MMYOLOBackend
+from projects.easydeploy.bbox_code import rtmdet_bbox_decoder as bbox_decoder
+
+
+class DeepStreamOutput(nn.Module):
+    def __init__(self):
+        super().__init__()
+
+    def forward(self, x):
+        boxes = x[0]
+        scores, labels = torch.max(x[1], dim=-1, keepdim=True)
+        return torch.cat([boxes, scores, labels.to(boxes.dtype)], dim=-1)
+
+
+def pred_by_feat_deepstream(self, cls_scores, bbox_preds, objectnesses=None, **kwargs):
+    assert len(cls_scores) == len(bbox_preds)
+    dtype = cls_scores[0].dtype
+    device = cls_scores[0].device
+
+    num_imgs = cls_scores[0].shape[0]
+    featmap_sizes = [cls_score.shape[2:] for cls_score in cls_scores]
+
+    mlvl_priors = self.prior_generate(featmap_sizes, dtype=dtype, device=device)
+
+    flatten_priors = torch.cat(mlvl_priors)
+
+    mlvl_strides = [
+        flatten_priors.new_full(
+            (featmap_size[0] * featmap_size[1] * self.num_base_priors,), stride
+        ) for featmap_size, stride in zip(
+            featmap_sizes, self.featmap_strides
+        )
+    ]
+    flatten_stride = torch.cat(mlvl_strides)
+
+    flatten_cls_scores = [
+        cls_score.permute(0, 2, 3, 1).reshape(num_imgs, -1, self.num_classes) for cls_score in cls_scores
+    ]
+    cls_scores = torch.cat(flatten_cls_scores, dim=1).sigmoid()
+
+    flatten_bbox_preds = [bbox_pred.permute(0, 2, 3, 1).reshape(num_imgs, -1, 4) for bbox_pred in bbox_preds]
+    flatten_bbox_preds = torch.cat(flatten_bbox_preds, dim=1)
+
+    if objectnesses is not None:
+        flatten_objectness = [objectness.permute(0, 2, 3, 1).reshape(num_imgs, -1) for objectness in objectnesses]
+        flatten_objectness = torch.cat(flatten_objectness, dim=1).sigmoid()
+        cls_scores = cls_scores * (flatten_objectness.unsqueeze(-1))
+
+    scores = cls_scores
+
+    bboxes = bbox_decoder(flatten_priors[None], flatten_bbox_preds, flatten_stride)
+
+    return bboxes, scores
+
+
+def rtmdet_export(weights, config, device):
+    model = init_detector(config, weights, device=device)
+    model.eval()
+    deploy_model = DeployModel(baseModel=model, backend=MMYOLOBackend.ONNXRUNTIME, postprocess_cfg=None)
+    deploy_model.eval()
+    deploy_model.with_postprocess = True
+    deploy_model.prior_generate = model.bbox_head.prior_generator.grid_priors
+    deploy_model.num_base_priors = model.bbox_head.num_base_priors
+    deploy_model.featmap_strides = model.bbox_head.featmap_strides
+    deploy_model.num_classes = model.bbox_head.num_classes
+    deploy_model.pred_by_feat = types.MethodType(pred_by_feat_deepstream, deploy_model)
+    return deploy_model
+
+
+def suppress_warnings():
+    import warnings
+    warnings.filterwarnings('ignore', category=torch.jit.TracerWarning)
+    warnings.filterwarnings('ignore', category=UserWarning)
+    warnings.filterwarnings('ignore', category=DeprecationWarning)
+    warnings.filterwarnings('ignore', category=FutureWarning)
+    warnings.filterwarnings('ignore', category=ResourceWarning)
+
+
+def main(args):
+    suppress_warnings()
+
+    print(f'\nStarting: {args.weights}')
+
+    print('Opening RTMDet model')
+
+    device = torch.device('cpu')
+    model = rtmdet_export(args.weights, args.config, device)
+
+    model = nn.Sequential(model, DeepStreamOutput())
+
+    img_size = args.size * 2 if len(args.size) == 1 else args.size
+
+    onnx_input_im = torch.zeros(args.batch, 3, *img_size).to(device)
+    onnx_output_file = f'{args.weights}.onnx'
+
+    dynamic_axes = {
+        'input': {
+            0: 'batch'
+        },
+        'output': {
+            0: 'batch'
+        }
+    }
+
+    print('Exporting the model to ONNX')
+    torch.onnx.export(
+        model, onnx_input_im, onnx_output_file, verbose=False, opset_version=args.opset, do_constant_folding=True,
+        input_names=['input'], output_names=['output'], dynamic_axes=dynamic_axes if args.dynamic else None
+    )
+
+    if args.simplify:
+        print('Simplifying the ONNX model')
+        import onnxslim
+        model_onnx = onnx.load(onnx_output_file)
+        model_onnx = onnxslim.slim(model_onnx)
+        onnx.save(model_onnx, onnx_output_file)
+
+    print(f'Done: {onnx_output_file}\n')
+
+
+def parse_args():
+    import argparse
+    parser = argparse.ArgumentParser(description='DeepStream RTMDet conversion')
+    parser.add_argument('-w', '--weights', required=True, type=str, help='Input weights (.pt) file path (required)')
+    parser.add_argument('-c', '--config', required=True, help='Input config (.py) file path (required)')
+    parser.add_argument('-s', '--size', nargs='+', type=int, default=[640], help='Inference size [H,W] (default [640])')
+    parser.add_argument('--opset', type=int, default=17, help='ONNX opset version')
+    parser.add_argument('--simplify', action='store_true', help='ONNX simplify model')
+    parser.add_argument('--dynamic', action='store_true', help='Dynamic batch-size')
+    parser.add_argument('--batch', type=int, default=1, help='Static batch-size')
+    args = parser.parse_args()
+    if not os.path.isfile(args.weights):
+        raise SystemExit('Invalid weights file')
+    if not os.path.isfile(args.config):
+        raise SystemExit('Invalid config file')
+    if args.dynamic and args.batch > 1:
+        raise SystemExit('Cannot set dynamic batch-size and static batch-size at same time')
+    return args
+
+
+if __name__ == '__main__':
+    args = parse_args()
+    main(args)
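The `pred_by_feat_deepstream` hook above flattens each per-level `(N, C, H, W)` score map into one `(N, num_priors, C)` tensor before decoding. A NumPy stand-in for that permute/reshape/concat step (the function name `flatten_levels` and the toy shapes are assumptions for illustration):

```python
import numpy as np

def flatten_levels(level_maps, num_classes):
    # (N, C, H, W) -> (N, H*W, C) per level, then concatenate along the prior axis
    flat = [m.transpose(0, 2, 3, 1).reshape(m.shape[0], -1, num_classes) for m in level_maps]
    return np.concatenate(flat, axis=1)

# Two pyramid levels with 3 classes: 4x4 and 2x2 grids
levels = [np.zeros((1, 3, 4, 4)), np.zeros((1, 3, 2, 2))]
out = flatten_levels(levels, num_classes=3)
# out.shape == (1, 16 + 4, 3)
```

Concatenating all levels first lets a single decode and a single export output cover every stride.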
@@ -1,13 +1,9 @@
 import os
-import sys
-import argparse
-import warnings
 import onnx
 import torch
 import torch.nn as nn
 
 from models.experimental import attempt_load
-from utils.torch_utils import select_device
-from models.yolo import Detect
 
 
 class DeepStreamOutput(nn.Module):
@@ -17,46 +13,51 @@ class DeepStreamOutput(nn.Module):
     def forward(self, x):
         x = x[0]
         boxes = x[:, :, :4]
+        convert_matrix = torch.tensor(
+            [[1, 0, 1, 0], [0, 1, 0, 1], [-0.5, 0, 0.5, 0], [0, -0.5, 0, 0.5]], dtype=boxes.dtype, device=boxes.device
+        )
+        boxes @= convert_matrix
         objectness = x[:, :, 4:5]
-        scores, classes = torch.max(x[:, :, 5:], 2, keepdim=True)
+        scores, labels = torch.max(x[:, :, 5:], dim=-1, keepdim=True)
         scores *= objectness
-        classes = classes.float()
-        return boxes, scores, classes
+        return torch.cat([boxes, scores, labels.to(boxes.dtype)], dim=-1)
 
 
-def suppress_warnings():
-    warnings.filterwarnings('ignore', category=torch.jit.TracerWarning)
-    warnings.filterwarnings('ignore', category=UserWarning)
-    warnings.filterwarnings('ignore', category=DeprecationWarning)
-
-
-def yolov5_export(weights, device):
-    model = attempt_load(weights, device=device, inplace=True, fuse=True)
+def yolov5_export(weights, device, inplace=True, fuse=True):
+    model = attempt_load(weights, device=device, inplace=inplace, fuse=fuse)
     model.eval()
     for k, m in model.named_modules():
-        if isinstance(m, Detect):
+        if m.__class__.__name__ == 'Detect':
             m.inplace = False
             m.dynamic = False
             m.export = True
     return model
 
 
+def suppress_warnings():
+    import warnings
+    warnings.filterwarnings('ignore', category=torch.jit.TracerWarning)
+    warnings.filterwarnings('ignore', category=UserWarning)
+    warnings.filterwarnings('ignore', category=DeprecationWarning)
+    warnings.filterwarnings('ignore', category=FutureWarning)
+    warnings.filterwarnings('ignore', category=ResourceWarning)
+
+
 def main(args):
     suppress_warnings()
 
-    print('\nStarting: %s' % args.weights)
+    print(f'\nStarting: {args.weights}')
 
-    print('Opening YOLOv5 model\n')
+    print('Opening YOLOv5 model')
 
-    device = select_device('cpu')
+    device = torch.device('cpu')
     model = yolov5_export(args.weights, device)
 
     if len(model.names.keys()) > 0:
-        print('\nCreating labels.txt file')
-        f = open('labels.txt', 'w')
-        for name in model.names.values():
-            f.write(name + '\n')
-        f.close()
+        print('Creating labels.txt file')
+        with open('labels.txt', 'w', encoding='utf-8') as f:
+            for name in model.names.values():
+                f.write(f'{name}\n')
 
     model = nn.Sequential(model, DeepStreamOutput())
 
@@ -66,41 +67,37 @@ def main(args):
     img_size = [1280] * 2
 
     onnx_input_im = torch.zeros(args.batch, 3, *img_size).to(device)
-    onnx_output_file = os.path.basename(args.weights).split('.pt')[0] + '.onnx'
+    onnx_output_file = f'{args.weights}.onnx'
 
     dynamic_axes = {
         'input': {
             0: 'batch'
         },
-        'boxes': {
-            0: 'batch'
-        },
-        'scores': {
-            0: 'batch'
-        },
-        'classes': {
+        'output': {
             0: 'batch'
         }
     }
 
-    print('\nExporting the model to ONNX')
-    torch.onnx.export(model, onnx_input_im, onnx_output_file, verbose=False, opset_version=args.opset,
-                      do_constant_folding=True, input_names=['input'], output_names=['boxes', 'scores', 'classes'],
-                      dynamic_axes=dynamic_axes if args.dynamic else None)
+    print('Exporting the model to ONNX')
+    torch.onnx.export(
+        model, onnx_input_im, onnx_output_file, verbose=False, opset_version=args.opset, do_constant_folding=True,
+        input_names=['input'], output_names=['output'], dynamic_axes=dynamic_axes if args.dynamic else None
+    )
 
     if args.simplify:
         print('Simplifying the ONNX model')
-        import onnxsim
+        import onnxslim
         model_onnx = onnx.load(onnx_output_file)
-        model_onnx, _ = onnxsim.simplify(model_onnx)
+        model_onnx = onnxslim.slim(model_onnx)
         onnx.save(model_onnx, onnx_output_file)
 
-    print('Done: %s\n' % onnx_output_file)
+    print(f'Done: {onnx_output_file}\n')
 
 
 def parse_args():
+    import argparse
     parser = argparse.ArgumentParser(description='DeepStream YOLOv5 conversion')
-    parser.add_argument('-w', '--weights', required=True, help='Input weights (.pt) file path (required)')
+    parser.add_argument('-w', '--weights', required=True, type=str, help='Input weights (.pt) file path (required)')
     parser.add_argument('-s', '--size', nargs='+', type=int, default=[640], help='Inference size [H,W] (default [640])')
     parser.add_argument('--p6', action='store_true', help='P6 model')
     parser.add_argument('--opset', type=int, default=17, help='ONNX opset version')
@@ -117,4 +114,4 @@ def parse_args():
 
 if __name__ == '__main__':
     args = parse_args()
-    sys.exit(main(args))
+    main(args)
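The `boxes @= convert_matrix` line added to the YOLO heads replaces the old per-column scaling with one matrix multiply that turns center-format boxes `[cx, cy, w, h]` into corner format `[x1, y1, x2, y2]`. A NumPy check of that matrix (the sample box values are illustrative):

```python
import numpy as np

# Same matrix as in the diff: row i says how input component i contributes
# to each output column [x1, y1, x2, y2].
convert_matrix = np.array([
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [-0.5, 0, 0.5, 0],
    [0, -0.5, 0, 0.5],
])

boxes = np.array([[10.0, 20.0, 4.0, 6.0]])  # cx=10, cy=20, w=4, h=6
xyxy = boxes @ convert_matrix
# xyxy -> [[8., 17., 12., 23.]]  i.e. [cx - w/2, cy - h/2, cx + w/2, cy + h/2]
```

Expressing the conversion as a single matmul keeps it as one ONNX node and avoids the in-place fancy-indexing writes that tracing handles poorly.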
@@ -1,13 +1,11 @@
 import os
-import sys
-import argparse
-import warnings
 import onnx
 import torch
 import torch.nn as nn
-from yolov6.utils.checkpoint import load_checkpoint
-from yolov6.layers.common import RepVGGBlock, SiLU
 from yolov6.models.effidehead import Detect
+from yolov6.layers.common import RepVGGBlock, SiLU
+from yolov6.utils.checkpoint import load_checkpoint
 
 try:
     from yolov6.layers.common import ConvModule
@@ -21,17 +19,14 @@ class DeepStreamOutput(nn.Module):
 
     def forward(self, x):
         boxes = x[:, :, :4]
+        convert_matrix = torch.tensor(
+            [[1, 0, 1, 0], [0, 1, 0, 1], [-0.5, 0, 0.5, 0], [0, -0.5, 0, 0.5]], dtype=boxes.dtype, device=boxes.device
+        )
+        boxes @= convert_matrix
         objectness = x[:, :, 4:5]
-        scores, classes = torch.max(x[:, :, 5:], 2, keepdim=True)
+        scores, labels = torch.max(x[:, :, 5:], dim=-1, keepdim=True)
         scores *= objectness
-        classes = classes.float()
-        return boxes, scores, classes
+        return torch.cat([boxes, scores, labels.to(boxes.dtype)], dim=-1)
 
 
-def suppress_warnings():
-    warnings.filterwarnings('ignore', category=torch.jit.TracerWarning)
-    warnings.filterwarnings('ignore', category=UserWarning)
-    warnings.filterwarnings('ignore', category=DeprecationWarning)
-
-
 def yolov6_export(weights, device):
@@ -51,12 +46,21 @@ def yolov6_export(weights, device):
     return model
 
 
+def suppress_warnings():
+    import warnings
+    warnings.filterwarnings('ignore', category=torch.jit.TracerWarning)
+    warnings.filterwarnings('ignore', category=UserWarning)
+    warnings.filterwarnings('ignore', category=DeprecationWarning)
+    warnings.filterwarnings('ignore', category=FutureWarning)
+    warnings.filterwarnings('ignore', category=ResourceWarning)
+
+
 def main(args):
     suppress_warnings()
 
-    print('\nStarting: %s' % args.weights)
+    print(f'\nStarting: {args.weights}')
 
-    print('Opening YOLOv6 model\n')
+    print('Opening YOLOv6 model')
 
     device = torch.device('cpu')
     model = yolov6_export(args.weights, device)
@@ -69,39 +73,35 @@ def main(args):
     img_size = [1280] * 2
 
     onnx_input_im = torch.zeros(args.batch, 3, *img_size).to(device)
-    onnx_output_file = os.path.basename(args.weights).split('.pt')[0] + '.onnx'
+    onnx_output_file = f'{args.weights}.onnx'
 
     dynamic_axes = {
         'input': {
             0: 'batch'
         },
-        'boxes': {
-            0: 'batch'
-        },
-        'scores': {
-            0: 'batch'
-        },
-        'classes': {
+        'output': {
             0: 'batch'
         }
     }
 
-    print('\nExporting the model to ONNX')
-    torch.onnx.export(model, onnx_input_im, onnx_output_file, verbose=False, opset_version=args.opset,
-                      do_constant_folding=True, input_names=['input'], output_names=['boxes', 'scores', 'classes'],
-                      dynamic_axes=dynamic_axes if args.dynamic else None)
+    print('Exporting the model to ONNX')
+    torch.onnx.export(
+        model, onnx_input_im, onnx_output_file, verbose=False, opset_version=args.opset, do_constant_folding=True,
+        input_names=['input'], output_names=['output'], dynamic_axes=dynamic_axes if args.dynamic else None
+    )
 
     if args.simplify:
         print('Simplifying the ONNX model')
-        import onnxsim
+        import onnxslim
         model_onnx = onnx.load(onnx_output_file)
-        model_onnx, _ = onnxsim.simplify(model_onnx)
+        model_onnx = onnxslim.slim(model_onnx)
        onnx.save(model_onnx, onnx_output_file)
 
-    print('Done: %s\n' % onnx_output_file)
+    print(f'Done: {onnx_output_file}\n')
 
 
 def parse_args():
+    import argparse
     parser = argparse.ArgumentParser(description='DeepStream YOLOv6 conversion')
     parser.add_argument('-w', '--weights', required=True, help='Input weights (.pt) file path (required)')
     parser.add_argument('-s', '--size', nargs='+', type=int, default=[640], help='Inference size [H,W] (default [640])')
@@ -120,4 +120,4 @@ def parse_args():
 
 if __name__ == '__main__':
     args = parse_args()
-    sys.exit(main(args))
+    main(args)
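The anchor-based heads (YOLOv5/v6/v7) keep the `scores *= objectness` fusion: the final confidence is the objectness of the cell times the best class probability. A NumPy stand-in for those three lines of the new `forward` (the sample row is illustrative: 4 box values, 1 objectness, 3 class probabilities):

```python
import numpy as np

x = np.array([[[5.0, 5.0, 2.0, 2.0, 0.5, 0.2, 0.9, 0.1]]])

objectness = x[:, :, 4:5]                                  # P(object)
scores = x[:, :, 5:].max(axis=-1, keepdims=True)           # best P(class | object)
labels = x[:, :, 5:].argmax(axis=-1)[..., None].astype(x.dtype)
scores = scores * objectness                               # P(object) * P(class | object)
# scores -> 0.9 * 0.5 = 0.45 for class index 1
```

Anchor-free heads (e.g. the RT-DETR scripts above) have no objectness column, which is why their `forward` takes the max directly over `x[:, :, 4:]`.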
@@ -1,10 +1,8 @@
|
|||||||
import os
|
import os
|
||||||
import sys
|
|
||||||
import argparse
|
|
||||||
import warnings
|
|
||||||
import onnx
|
import onnx
|
||||||
import torch
|
import torch
|
||||||
import torch.nn as nn
|
import torch.nn as nn
|
||||||
|
|
||||||
import models
|
import models
|
||||||
from models.experimental import attempt_load
|
from models.experimental import attempt_load
|
||||||
from utils.torch_utils import select_device
|
from utils.torch_utils import select_device
|
||||||
@@ -17,17 +15,14 @@ class DeepStreamOutput(nn.Module):
|
|||||||
|
|
||||||
def forward(self, x):
|
def forward(self, x):
|
||||||
boxes = x[:, :, :4]
|
boxes = x[:, :, :4]
|
||||||
|
convert_matrix = torch.tensor(
|
||||||
|
[[1, 0, 1, 0], [0, 1, 0, 1], [-0.5, 0, 0.5, 0], [0, -0.5, 0, 0.5]], dtype=boxes.dtype, device=boxes.device
|
||||||
|
)
|
||||||
|
boxes @= convert_matrix
|
||||||
objectness = x[:, :, 4:5]
|
objectness = x[:, :, 4:5]
|
||||||
scores, classes = torch.max(x[:, :, 5:], 2, keepdim=True)
|
scores, labels = torch.max(x[:, :, 5:], dim=-1, keepdim=True)
|
||||||
scores *= objectness
|
scores *= objectness
|
||||||
classes = classes.float()
|
return torch.cat([boxes, scores, labels.to(boxes.dtype)], dim=-1)
|
||||||
return boxes, scores, classes
|
|
||||||
|
|
||||||
|
|
||||||
def suppress_warnings():
|
|
||||||
warnings.filterwarnings('ignore', category=torch.jit.TracerWarning)
|
|
||||||
warnings.filterwarnings('ignore', category=UserWarning)
|
|
||||||
warnings.filterwarnings('ignore', category=DeprecationWarning)
|
|
||||||
|
|
||||||
|
|
||||||
def yolov7_export(weights, device):
|
def yolov7_export(weights, device):
|
||||||
@@ -45,22 +40,30 @@ def yolov7_export(weights, device):
|
|||||||
return model
|
return model
|
||||||
|
|
||||||
|
|
||||||
|
def suppress_warnings():
|
||||||
|
import warnings
|
||||||
|
warnings.filterwarnings('ignore', category=torch.jit.TracerWarning)
|
||||||
|
warnings.filterwarnings('ignore', category=UserWarning)
|
||||||
|
warnings.filterwarnings('ignore', category=DeprecationWarning)
|
||||||
|
warnings.filterwarnings('ignore', category=FutureWarning)
|
||||||
|
warnings.filterwarnings('ignore', category=ResourceWarning)
|
||||||
|
|
||||||
|
|
||||||
def main(args):
|
def main(args):
|
||||||
suppress_warnings()
|
suppress_warnings()
|
||||||
|
|
||||||
print('\nStarting: %s' % args.weights)
|
print(f'\nStarting: {args.weights}')
|
||||||
|
|
||||||
print('Opening YOLOv7 model\n')
|
print('Opening YOLOv7 model')
|
||||||
|
|
||||||
device = select_device('cpu')
|
device = select_device('cpu')
|
||||||
model = yolov7_export(args.weights, device)
|
model = yolov7_export(args.weights, device)
|
||||||
|
|
||||||
if len(model.names) > 0:
|
if hasattr(model, 'names') and len(model.names) > 0:
|
||||||
print('\nCreating labels.txt file')
|
print('Creating labels.txt file')
|
||||||
f = open('labels.txt', 'w')
|
with open('labels.txt', 'w', encoding='utf-8') as f:
|
||||||
for name in model.names:
|
for name in model.names:
|
||||||
f.write(name + '\n')
|
f.write(f'{name}\n')
|
||||||
f.close()
|
|
||||||
|
|
||||||
model = nn.Sequential(model, DeepStreamOutput())
|
model = nn.Sequential(model, DeepStreamOutput())
|
||||||
|
|
||||||
@@ -70,39 +73,35 @@ def main(args):
         img_size = [1280] * 2

     onnx_input_im = torch.zeros(args.batch, 3, *img_size).to(device)
-    onnx_output_file = os.path.basename(args.weights).split('.pt')[0] + '.onnx'
+    onnx_output_file = f'{args.weights}.onnx'

     dynamic_axes = {
         'input': {
             0: 'batch'
         },
-        'boxes': {
-            0: 'batch'
-        },
-        'scores': {
-            0: 'batch'
-        },
-        'classes': {
+        'output': {
             0: 'batch'
         }
     }

-    print('\nExporting the model to ONNX')
-    torch.onnx.export(model, onnx_input_im, onnx_output_file, verbose=False, opset_version=args.opset,
-                      do_constant_folding=True, input_names=['input'], output_names=['boxes', 'scores', 'classes'],
-                      dynamic_axes=dynamic_axes if args.dynamic else None)
+    print('Exporting the model to ONNX')
+    torch.onnx.export(
+        model, onnx_input_im, onnx_output_file, verbose=False, opset_version=args.opset, do_constant_folding=True,
+        input_names=['input'], output_names=['output'], dynamic_axes=dynamic_axes if args.dynamic else None
+    )

     if args.simplify:
         print('Simplifying the ONNX model')
-        import onnxsim
+        import onnxslim
         model_onnx = onnx.load(onnx_output_file)
-        model_onnx, _ = onnxsim.simplify(model_onnx)
+        model_onnx = onnxslim.slim(model_onnx)
         onnx.save(model_onnx, onnx_output_file)

-    print('Done: %s\n' % onnx_output_file)
+    print(f'Done: {onnx_output_file}\n')


 def parse_args():
+    import argparse
     parser = argparse.ArgumentParser(description='DeepStream YOLOv7 conversion')
     parser.add_argument('-w', '--weights', required=True, help='Input weights (.pt) file path (required)')
     parser.add_argument('-s', '--size', nargs='+', type=int, default=[640], help='Inference size [H,W] (default [640])')
@@ -121,4 +120,4 @@ def parse_args():

 if __name__ == '__main__':
     args = parse_args()
-    sys.exit(main(args))
+    main(args)
@@ -1,13 +1,11 @@
 import os
-import sys
-import argparse
-import warnings
 import onnx
 import torch
 import torch.nn as nn

+from utils.torch_utils import select_device
 from models.experimental import attempt_load
 from models.yolo import Detect, V6Detect, IV6Detect
-from utils.torch_utils import select_device


 class DeepStreamOutput(nn.Module):
@@ -17,15 +15,12 @@ class DeepStreamOutput(nn.Module):
     def forward(self, x):
         x = x.transpose(1, 2)
         boxes = x[:, :, :4]
-        scores, classes = torch.max(x[:, :, 4:], 2, keepdim=True)
-        classes = classes.float()
-        return boxes, scores, classes
-
-
-def suppress_warnings():
-    warnings.filterwarnings('ignore', category=torch.jit.TracerWarning)
-    warnings.filterwarnings('ignore', category=UserWarning)
-    warnings.filterwarnings('ignore', category=DeprecationWarning)
+        convert_matrix = torch.tensor(
+            [[1, 0, 1, 0], [0, 1, 0, 1], [-0.5, 0, 0.5, 0], [0, -0.5, 0, 0.5]], dtype=boxes.dtype, device=boxes.device
+        )
+        boxes @= convert_matrix
+        scores, labels = torch.max(x[:, :, 4:], dim=-1, keepdim=True)
+        return torch.cat([boxes, scores, labels.to(boxes.dtype)], dim=-1)


 def yolov7_u6_export(weights, device):
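The `convert_matrix` introduced in the new `forward` turns center-format boxes (cx, cy, w, h) into corner format (x1, y1, x2, y2) with a single right-multiplication. A minimal sketch of that arithmetic in plain Python, with hypothetical box values and no torch dependency:

```python
# Right-multiplying a (cx, cy, w, h) row by this 4x4 matrix yields
# (x1, y1, x2, y2) = (cx - w/2, cy - h/2, cx + w/2, cy + h/2).
CONVERT_MATRIX = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [-0.5, 0, 0.5, 0],
    [0, -0.5, 0, 0.5],
]

def xywh_to_xyxy(box):
    # Equivalent of `box @ CONVERT_MATRIX`, written out as dot products.
    return [sum(box[i] * CONVERT_MATRIX[i][j] for i in range(4)) for j in range(4)]

print(xywh_to_xyxy([50, 40, 20, 10]))  # [40.0, 35.0, 60.0, 45.0]
```

In the exported graph this becomes one MatMul node over all predictions at once, instead of per-coordinate slicing and arithmetic.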
@@ -39,61 +34,65 @@ def yolov7_u6_export(weights, device):
     return model


+def suppress_warnings():
+    import warnings
+    warnings.filterwarnings('ignore', category=torch.jit.TracerWarning)
+    warnings.filterwarnings('ignore', category=UserWarning)
+    warnings.filterwarnings('ignore', category=DeprecationWarning)
+    warnings.filterwarnings('ignore', category=FutureWarning)
+    warnings.filterwarnings('ignore', category=ResourceWarning)
+
+
 def main(args):
     suppress_warnings()

-    print('\nStarting: %s' % args.weights)
+    print(f'\nStarting: {args.weights}')

-    print('Opening YOLOv7_u6 model\n')
+    print('Opening YOLOv7_u6 model')

     device = select_device('cpu')
     model = yolov7_u6_export(args.weights, device)

     if len(model.names.keys()) > 0:
-        print('\nCreating labels.txt file')
-        f = open('labels.txt', 'w')
-        for name in model.names.values():
-            f.write(name + '\n')
-        f.close()
+        print('Creating labels.txt file')
+        with open('labels.txt', 'w', encoding='utf-8') as f:
+            for name in model.names.values():
+                f.write(f'{name}\n')

     model = nn.Sequential(model, DeepStreamOutput())

     img_size = args.size * 2 if len(args.size) == 1 else args.size

     onnx_input_im = torch.zeros(args.batch, 3, *img_size).to(device)
-    onnx_output_file = os.path.basename(args.weights).split('.pt')[0] + '.onnx'
+    onnx_output_file = f'{args.weights}.onnx'

     dynamic_axes = {
         'input': {
             0: 'batch'
         },
-        'boxes': {
-            0: 'batch'
-        },
-        'scores': {
-            0: 'batch'
-        },
-        'classes': {
+        'output': {
             0: 'batch'
         }
     }

-    print('\nExporting the model to ONNX')
-    torch.onnx.export(model, onnx_input_im, onnx_output_file, verbose=False, opset_version=args.opset,
-                      do_constant_folding=True, input_names=['input'], output_names=['boxes', 'scores', 'classes'],
-                      dynamic_axes=dynamic_axes if args.dynamic else None)
+    print('Exporting the model to ONNX')
+    torch.onnx.export(
+        model, onnx_input_im, onnx_output_file, verbose=False, opset_version=args.opset, do_constant_folding=True,
+        input_names=['input'], output_names=['output'], dynamic_axes=dynamic_axes if args.dynamic else None
+    )

     if args.simplify:
         print('Simplifying the ONNX model')
-        import onnxsim
+        import onnxslim
         model_onnx = onnx.load(onnx_output_file)
-        model_onnx, _ = onnxsim.simplify(model_onnx)
+        model_onnx = onnxslim.slim(model_onnx)
         onnx.save(model_onnx, onnx_output_file)

-    print('Done: %s\n' % onnx_output_file)
+    print(f'Done: {onnx_output_file}\n')


 def parse_args():
+    import argparse
     parser = argparse.ArgumentParser(description='DeepStream YOLOv7-u6 conversion')
     parser.add_argument('-w', '--weights', required=True, help='Input weights (.pt) file path (required)')
     parser.add_argument('-s', '--size', nargs='+', type=int, default=[640], help='Inference size [H,W] (default [640])')
@@ -111,4 +110,4 @@ def parse_args():

 if __name__ == '__main__':
     args = parse_args()
-    sys.exit(main(args))
+    main(args)
@@ -1,14 +1,26 @@
 import os
 import sys
-import argparse
-import warnings
 import onnx
 import torch
 import torch.nn as nn
 from copy import deepcopy
-from ultralytics import YOLO
-from ultralytics.utils.torch_utils import select_device
-from ultralytics.nn.modules import C2f, Detect, RTDETRDecoder
+import ultralytics.utils
+import ultralytics.models.yolo
+import ultralytics.utils.tal as _m
+
+sys.modules['ultralytics.yolo'] = ultralytics.models.yolo
+sys.modules['ultralytics.yolo.utils'] = ultralytics.utils
+
+
+def _dist2bbox(distance, anchor_points, xywh=False, dim=-1):
+    lt, rb = distance.chunk(2, dim)
+    x1y1 = anchor_points - lt
+    x2y2 = anchor_points + rb
+    return torch.cat((x1y1, x2y2), dim)
+
+
+_m.dist2bbox.__code__ = _dist2bbox.__code__


 class DeepStreamOutput(nn.Module):
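Two patching tricks in this header may be unfamiliar: registering an alias in `sys.modules` so that stale import paths (here, `ultralytics.yolo`) still resolve against current module objects, and reassigning a function's `__code__` so every existing reference to that function executes the replacement body. A hedged, self-contained illustration with toy names (none of these identifiers come from the repo):

```python
import sys
import types

def library_fn(x):
    return x + 1  # stand-in for the original library function

def _patched_fn(x):
    return x * 10  # stand-in for the replacement body

# Swapping __code__ changes what library_fn executes everywhere, even
# for callers that already hold a reference to it; no name rebinding.
library_fn.__code__ = _patched_fn.__code__
print(library_fn(3))  # 30

# A module object registered under a legacy name keeps old-style
# imports (and pickled references to that path) working.
alias = types.ModuleType('legacy_pkg')
alias.answer = 42
sys.modules['legacy_pkg'] = alias
import legacy_pkg
print(legacy_pkg.answer)  # 42
```

The `__code__` swap only works when both functions have compatible closures (here, none), which is why the replacement `_dist2bbox` mirrors the original signature.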
@@ -18,94 +30,103 @@ class DeepStreamOutput(nn.Module):
     def forward(self, x):
         x = x.transpose(1, 2)
         boxes = x[:, :, :4]
-        scores, classes = torch.max(x[:, :, 4:], 2, keepdim=True)
-        classes = classes.float()
-        return boxes, scores, classes
+        scores, labels = torch.max(x[:, :, 4:], dim=-1, keepdim=True)
+        return torch.cat([boxes, scores, labels.to(boxes.dtype)], dim=-1)


-def suppress_warnings():
-    warnings.filterwarnings('ignore', category=torch.jit.TracerWarning)
-    warnings.filterwarnings('ignore', category=UserWarning)
-    warnings.filterwarnings('ignore', category=DeprecationWarning)
-
-
-def yolov8_export(weights, device):
-    model = YOLO(weights)
-    model = deepcopy(model.model).to(device)
+def yolov8_export(weights, device, inplace=True, fuse=True):
+    ckpt = torch.load(weights, map_location='cpu')
+    ckpt = (ckpt.get('ema') or ckpt['model']).to(device).float()
+    if not hasattr(ckpt, 'stride'):
+        ckpt.stride = torch.tensor([32.])
+    if hasattr(ckpt, 'names') and isinstance(ckpt.names, (list, tuple)):
+        ckpt.names = dict(enumerate(ckpt.names))
+    model = ckpt.fuse().eval() if fuse and hasattr(ckpt, 'fuse') else ckpt.eval()
+    for m in model.modules():
+        t = type(m)
+        if hasattr(m, 'inplace'):
+            m.inplace = inplace
+        elif t.__name__ == 'Upsample' and not hasattr(m, 'recompute_scale_factor'):
+            m.recompute_scale_factor = None
+    model = deepcopy(model).to(device)
     for p in model.parameters():
         p.requires_grad = False
     model.eval()
     model.float()
     model = model.fuse()
     for k, m in model.named_modules():
-        if isinstance(m, (Detect, RTDETRDecoder)):
+        if m.__class__.__name__ in ('Detect', 'RTDETRDecoder'):
             m.dynamic = False
             m.export = True
             m.format = 'onnx'
-        elif isinstance(m, C2f):
+        elif m.__class__.__name__ == 'C2f':
             m.forward = m.forward_split
     return model


+def suppress_warnings():
+    import warnings
+    warnings.filterwarnings('ignore', category=torch.jit.TracerWarning)
+    warnings.filterwarnings('ignore', category=UserWarning)
+    warnings.filterwarnings('ignore', category=DeprecationWarning)
+    warnings.filterwarnings('ignore', category=FutureWarning)
+    warnings.filterwarnings('ignore', category=ResourceWarning)
+
+
 def main(args):
     suppress_warnings()

-    print('\nStarting: %s' % args.weights)
+    print(f'\nStarting: {args.weights}')

-    print('Opening YOLOv8 model\n')
+    print('Opening YOLOv8 model')

-    device = select_device('cpu')
+    device = torch.device('cpu')
     model = yolov8_export(args.weights, device)

     if len(model.names.keys()) > 0:
-        print('\nCreating labels.txt file')
-        f = open('labels.txt', 'w')
-        for name in model.names.values():
-            f.write(name + '\n')
-        f.close()
+        print('Creating labels.txt file')
+        with open('labels.txt', 'w', encoding='utf-8') as f:
+            for name in model.names.values():
+                f.write(f'{name}\n')

     model = nn.Sequential(model, DeepStreamOutput())

     img_size = args.size * 2 if len(args.size) == 1 else args.size

     onnx_input_im = torch.zeros(args.batch, 3, *img_size).to(device)
-    onnx_output_file = os.path.basename(args.weights).split('.pt')[0] + '.onnx'
+    onnx_output_file = f'{args.weights}.onnx'

     dynamic_axes = {
         'input': {
             0: 'batch'
         },
-        'boxes': {
-            0: 'batch'
-        },
-        'scores': {
-            0: 'batch'
-        },
-        'classes': {
+        'output': {
             0: 'batch'
         }
     }

-    print('\nExporting the model to ONNX')
-    torch.onnx.export(model, onnx_input_im, onnx_output_file, verbose=False, opset_version=args.opset,
-                      do_constant_folding=True, input_names=['input'], output_names=['boxes', 'scores', 'classes'],
-                      dynamic_axes=dynamic_axes if args.dynamic else None)
+    print('Exporting the model to ONNX')
+    torch.onnx.export(
+        model, onnx_input_im, onnx_output_file, verbose=False, opset_version=args.opset, do_constant_folding=True,
+        input_names=['input'], output_names=['output'], dynamic_axes=dynamic_axes if args.dynamic else None
+    )

     if args.simplify:
         print('Simplifying the ONNX model')
-        import onnxsim
+        import onnxslim
         model_onnx = onnx.load(onnx_output_file)
-        model_onnx, _ = onnxsim.simplify(model_onnx)
+        model_onnx = onnxslim.slim(model_onnx)
         onnx.save(model_onnx, onnx_output_file)

-    print('Done: %s\n' % onnx_output_file)
+    print(f'Done: {onnx_output_file}\n')


 def parse_args():
+    import argparse
     parser = argparse.ArgumentParser(description='DeepStream YOLOv8 conversion')
     parser.add_argument('-w', '--weights', required=True, help='Input weights (.pt) file path (required)')
     parser.add_argument('-s', '--size', nargs='+', type=int, default=[640], help='Inference size [H,W] (default [640])')
-    parser.add_argument('--opset', type=int, default=16, help='ONNX opset version')
+    parser.add_argument('--opset', type=int, default=17, help='ONNX opset version')
     parser.add_argument('--simplify', action='store_true', help='ONNX simplify model')
     parser.add_argument('--dynamic', action='store_true', help='Dynamic batch-size')
     parser.add_argument('--batch', type=int, default=1, help='Static batch-size')
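The rewritten `yolov8_export` loads the raw checkpoint directly instead of going through the `YOLO` wrapper: it prefers the EMA weights when present and normalizes a list of class names into an index-keyed dict. The two selection idioms, sketched with a plain dict standing in for the torch checkpoint:

```python
def pick_model(ckpt):
    # Prefer EMA weights; fall back to the raw 'model' entry when
    # 'ema' is absent or stored as None (`or` covers both cases).
    return ckpt.get('ema') or ckpt['model']

def normalize_names(names):
    # Checkpoints may store class names as a list; downstream code
    # expects an index -> name mapping.
    return dict(enumerate(names)) if isinstance(names, (list, tuple)) else names

print(pick_model({'ema': None, 'model': 'raw-model'}))  # raw-model
print(normalize_names(['person', 'car']))  # {0: 'person', 1: 'car'}
```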
@@ -119,4 +140,4 @@ def parse_args():

 if __name__ == '__main__':
     args = parse_args()
-    sys.exit(main(args))
+    main(args)
145 utils/export_yoloV9.py (new file)
@@ -0,0 +1,145 @@
+import os
+import onnx
+import torch
+import torch.nn as nn
+
+import utils.tal.anchor_generator as _m
+
+
+def _dist2bbox(distance, anchor_points, xywh=False, dim=-1):
+    lt, rb = torch.split(distance, 2, dim)
+    x1y1 = anchor_points - lt
+    x2y2 = anchor_points + rb
+    return torch.cat((x1y1, x2y2), dim)
+
+
+_m.dist2bbox.__code__ = _dist2bbox.__code__
+
+
+class DeepStreamOutputDual(nn.Module):
+    def __init__(self):
+        super().__init__()
+
+    def forward(self, x):
+        x = x[1].transpose(1, 2)
+        boxes = x[:, :, :4]
+        scores, labels = torch.max(x[:, :, 4:], dim=-1, keepdim=True)
+        return torch.cat([boxes, scores, labels.to(boxes.dtype)], dim=-1)
+
+
+class DeepStreamOutput(nn.Module):
+    def __init__(self):
+        super().__init__()
+
+    def forward(self, x):
+        x = x.transpose(1, 2)
+        boxes = x[:, :, :4]
+        scores, labels = torch.max(x[:, :, 4:], dim=-1, keepdim=True)
+        return torch.cat([boxes, scores, labels.to(boxes.dtype)], dim=-1)
+
+
+def yolov9_export(weights, device, inplace=True, fuse=True):
+    ckpt = torch.load(weights, map_location='cpu')
+    ckpt = (ckpt.get('ema') or ckpt['model']).to(device).float()
+    if not hasattr(ckpt, 'stride'):
+        ckpt.stride = torch.tensor([32.])
+    if hasattr(ckpt, 'names') and isinstance(ckpt.names, (list, tuple)):
+        ckpt.names = dict(enumerate(ckpt.names))
+    model = ckpt.fuse().eval() if fuse and hasattr(ckpt, 'fuse') else ckpt.eval()
+    for m in model.modules():
+        t = type(m)
+        if t.__name__ in ('Hardswish', 'LeakyReLU', 'ReLU', 'ReLU6', 'SiLU', 'Detect', 'Model'):
+            m.inplace = inplace
+        elif t.__name__ == 'Upsample' and not hasattr(m, 'recompute_scale_factor'):
+            m.recompute_scale_factor = None
+    model.eval()
+    head = 'Detect'
+    for k, m in model.named_modules():
+        if m.__class__.__name__ in ('Detect', 'DDetect', 'DualDetect', 'DualDDetect'):
+            m.inplace = False
+            m.dynamic = False
+            m.export = True
+            head = m.__class__.__name__
+    return model, head
+
+
+def suppress_warnings():
+    import warnings
+    warnings.filterwarnings('ignore', category=torch.jit.TracerWarning)
+    warnings.filterwarnings('ignore', category=UserWarning)
+    warnings.filterwarnings('ignore', category=DeprecationWarning)
+    warnings.filterwarnings('ignore', category=FutureWarning)
+    warnings.filterwarnings('ignore', category=ResourceWarning)
+
+
+def main(args):
+    suppress_warnings()
+
+    print(f'\nStarting: {args.weights}')
+
+    print('Opening YOLOv9 model')
+
+    device = torch.device('cpu')
+    model, head = yolov9_export(args.weights, device)
+
+    if len(model.names.keys()) > 0:
+        print('Creating labels.txt file')
+        with open('labels.txt', 'w', encoding='utf-8') as f:
+            for name in model.names.values():
+                f.write(f'{name}\n')
+
+    if head in ('Detect', 'DDetect'):
+        model = nn.Sequential(model, DeepStreamOutput())
+    else:
+        model = nn.Sequential(model, DeepStreamOutputDual())
+
+    img_size = args.size * 2 if len(args.size) == 1 else args.size
+
+    onnx_input_im = torch.zeros(args.batch, 3, *img_size).to(device)
+    onnx_output_file = f'{args.weights}.onnx'
+
+    dynamic_axes = {
+        'input': {
+            0: 'batch'
+        },
+        'output': {
+            0: 'batch'
+        }
+    }
+
+    print('Exporting the model to ONNX')
+    torch.onnx.export(
+        model, onnx_input_im, onnx_output_file, verbose=False, opset_version=args.opset, do_constant_folding=True,
+        input_names=['input'], output_names=['output'], dynamic_axes=dynamic_axes if args.dynamic else None
+    )
+
+    if args.simplify:
+        print('Simplifying the ONNX model')
+        import onnxslim
+        model_onnx = onnx.load(onnx_output_file)
+        model_onnx = onnxslim.slim(model_onnx)
+        onnx.save(model_onnx, onnx_output_file)
+
+    print(f'Done: {onnx_output_file}\n')
+
+
+def parse_args():
+    import argparse
+    parser = argparse.ArgumentParser(description='DeepStream YOLOv9 conversion')
+    parser.add_argument('-w', '--weights', required=True, help='Input weights (.pt) file path (required)')
+    parser.add_argument('-s', '--size', nargs='+', type=int, default=[640], help='Inference size [H,W] (default [640])')
+    parser.add_argument('--opset', type=int, default=17, help='ONNX opset version')
+    parser.add_argument('--simplify', action='store_true', help='ONNX simplify model')
+    parser.add_argument('--dynamic', action='store_true', help='Dynamic batch-size')
+    parser.add_argument('--batch', type=int, default=1, help='Static batch-size')
+    args = parser.parse_args()
+    if not os.path.isfile(args.weights):
+        raise SystemExit('Invalid weights file')
+    if args.dynamic and args.batch > 1:
+        raise SystemExit('Cannot set dynamic batch-size and static batch-size at same time')
+    return args
+
+
+if __name__ == '__main__':
+    args = parse_args()
+    main(args)
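All of the reworked `DeepStreamOutput` heads now emit a single tensor per detection: four box coordinates, the best class score, and that class index cast to the box dtype. Per detection row this reduces to a max over class scores plus a concat; a plain-Python sketch with a hypothetical raw row (no torch):

```python
def to_deepstream_row(raw):
    # raw: [x1, y1, x2, y2, class0_score, class1_score, ...]
    boxes, class_scores = raw[:4], raw[4:]
    score = max(class_scores)
    label = class_scores.index(score)  # argmax over the class columns
    # The label is kept as a float so it can share one tensor with the
    # float boxes and score, matching the single 'output' binding.
    return boxes + [score, float(label)]

print(to_deepstream_row([40, 35, 60, 45, 0.1, 0.8, 0.3]))
# [40, 35, 60, 45, 0.8, 1.0]
```

Packing everything into one `output` tensor is what lets the TensorRT side parse detections from a single binding instead of three (`boxes`, `scores`, `classes`).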
@@ -1,10 +1,8 @@
 import os
-import sys
-import argparse
-import warnings
 import onnx
 import torch
 import torch.nn as nn

 from super_gradients.training import models

@@ -14,15 +12,17 @@ class DeepStreamOutput(nn.Module):

     def forward(self, x):
         boxes = x[0]
-        scores, classes = torch.max(x[1], 2, keepdim=True)
-        classes = classes.float()
-        return boxes, scores, classes
+        scores, labels = torch.max(x[1], dim=-1, keepdim=True)
+        return torch.cat([boxes, scores, labels.to(boxes.dtype)], dim=-1)


 def suppress_warnings():
+    import warnings
     warnings.filterwarnings('ignore', category=torch.jit.TracerWarning)
     warnings.filterwarnings('ignore', category=UserWarning)
     warnings.filterwarnings('ignore', category=DeprecationWarning)
+    warnings.filterwarnings('ignore', category=FutureWarning)
+    warnings.filterwarnings('ignore', category=ResourceWarning)


 def yolonas_export(model_name, weights, num_classes, size):
@@ -36,9 +36,9 @@ def yolonas_export(model_name, weights, num_classes, size):

 def main(args):
     suppress_warnings()

-    print('\nStarting: %s' % args.weights)
+    print(f'\nStarting: {args.weights}')

-    print('Opening YOLO-NAS model\n')
+    print('Opening YOLO-NAS model')

     device = torch.device('cpu')
     model = yolonas_export(args.model, args.weights, args.classes, args.size)
@@ -48,39 +48,35 @@ def main(args):
     img_size = args.size * 2 if len(args.size) == 1 else args.size

     onnx_input_im = torch.zeros(args.batch, 3, *img_size).to(device)
-    onnx_output_file = os.path.basename(args.weights).split('.pt')[0] + '.onnx'
+    onnx_output_file = f'{args.weights}.onnx'

     dynamic_axes = {
         'input': {
             0: 'batch'
         },
-        'boxes': {
-            0: 'batch'
-        },
-        'scores': {
-            0: 'batch'
-        },
-        'classes': {
+        'output': {
             0: 'batch'
         }
     }

-    print('\nExporting the model to ONNX')
-    torch.onnx.export(model, onnx_input_im, onnx_output_file, verbose=False, opset_version=args.opset,
-                      do_constant_folding=True, input_names=['input'], output_names=['boxes', 'scores', 'classes'],
-                      dynamic_axes=dynamic_axes if args.dynamic else None)
+    print('Exporting the model to ONNX')
+    torch.onnx.export(
+        model, onnx_input_im, onnx_output_file, verbose=False, opset_version=args.opset, do_constant_folding=True,
+        input_names=['input'], output_names=['output'], dynamic_axes=dynamic_axes if args.dynamic else None
+    )

     if args.simplify:
         print('Simplifying the ONNX model')
-        import onnxsim
+        import onnxslim
         model_onnx = onnx.load(onnx_output_file)
-        model_onnx, _ = onnxsim.simplify(model_onnx)
+        model_onnx = onnxslim.slim(model_onnx)
         onnx.save(model_onnx, onnx_output_file)

-    print('Done: %s\n' % onnx_output_file)
+    print(f'Done: {onnx_output_file}\n')


 def parse_args():
+    import argparse
     parser = argparse.ArgumentParser(description='DeepStream YOLO-NAS conversion')
     parser.add_argument('-m', '--model', required=True, help='Model name (required)')
     parser.add_argument('-w', '--weights', required=True, help='Input weights (.pth) file path (required)')
@@ -102,4 +98,4 @@ def parse_args():

 if __name__ == '__main__':
     args = parse_args()
-    sys.exit(main(args))
+    main(args)
@@ -1,7 +1,4 @@
 import os
-import sys
-import argparse
-import warnings
 import onnx
 import torch
 import torch.nn as nn
@@ -14,17 +11,14 @@ class DeepStreamOutput(nn.Module):
     def forward(self, x):
         x = x[0]
         boxes = x[:, :, :4]
+        convert_matrix = torch.tensor(
+            [[1, 0, 1, 0], [0, 1, 0, 1], [-0.5, 0, 0.5, 0], [0, -0.5, 0, 0.5]], dtype=boxes.dtype, device=boxes.device
+        )
+        boxes @= convert_matrix
         objectness = x[:, :, 4:5]
-        scores, classes = torch.max(x[:, :, 5:], 2, keepdim=True)
+        scores, labels = torch.max(x[:, :, 5:], dim=-1, keepdim=True)
         scores *= objectness
-        classes = classes.float()
-        return boxes, scores, classes
-
-
-def suppress_warnings():
-    warnings.filterwarnings('ignore', category=torch.jit.TracerWarning)
-    warnings.filterwarnings('ignore', category=UserWarning)
-    warnings.filterwarnings('ignore', category=DeprecationWarning)
+        return torch.cat([boxes, scores, labels.to(boxes.dtype)], dim=-1)


 def yolor_export(weights, cfg, size, device):
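The YOLOR head differs from the anchor-free heads above: each prediction carries an objectness term at index 4, and the final confidence is objectness multiplied by the best class score. Sketched per row in plain Python with hypothetical values:

```python
def yolor_confidence(row):
    # row: [cx, cy, w, h, objectness, class0_score, class1_score, ...]
    objectness = row[4]
    class_scores = row[5:]
    best = max(class_scores)
    label = class_scores.index(best)
    # Final confidence weights the class score by how likely the cell
    # is to contain any object at all.
    return objectness * best, label

print(yolor_confidence([50, 40, 20, 10, 0.5, 0.2, 0.9]))  # (0.45, 1)
```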
@@ -57,22 +51,30 @@ def yolor_export(weights, cfg, size, device):
     return model


+def suppress_warnings():
+    import warnings
+    warnings.filterwarnings('ignore', category=torch.jit.TracerWarning)
+    warnings.filterwarnings('ignore', category=UserWarning)
+    warnings.filterwarnings('ignore', category=DeprecationWarning)
+    warnings.filterwarnings('ignore', category=FutureWarning)
+    warnings.filterwarnings('ignore', category=ResourceWarning)
+
+
 def main(args):
     suppress_warnings()

-    print('\nStarting: %s' % args.weights)
+    print(f'\nStarting: {args.weights}')

-    print('Opening YOLOR model\n')
+    print('Opening YOLOR model')

     device = torch.device('cpu')
     model = yolor_export(args.weights, args.cfg, args.size, device)

     if hasattr(model, 'names') and len(model.names) > 0:
-        print('\nCreating labels.txt file')
-        f = open('labels.txt', 'w')
-        for name in model.names:
-            f.write(name + '\n')
-        f.close()
+        print('Creating labels.txt file')
+        with open('labels.txt', 'w', encoding='utf-8') as f:
+            for name in model.names:
+                f.write(f'{name}\n')

     model = nn.Sequential(model, DeepStreamOutput())

```diff
@@ -82,41 +84,37 @@ def main(args):
         img_size = [1280] * 2

     onnx_input_im = torch.zeros(args.batch, 3, *img_size).to(device)
-    onnx_output_file = os.path.basename(args.weights).split('.pt')[0] + '.onnx'
+    onnx_output_file = f'{args.weights}.onnx'

     dynamic_axes = {
         'input': {
             0: 'batch'
         },
-        'boxes': {
-            0: 'batch'
-        },
-        'scores': {
-            0: 'batch'
-        },
-        'classes': {
+        'output': {
             0: 'batch'
         }
     }

-    print('\nExporting the model to ONNX')
-    torch.onnx.export(model, onnx_input_im, onnx_output_file, verbose=False, opset_version=args.opset,
-                      do_constant_folding=True, input_names=['input'], output_names=['boxes', 'scores', 'classes'],
-                      dynamic_axes=dynamic_axes if args.dynamic else None)
+    print('Exporting the model to ONNX')
+    torch.onnx.export(
+        model, onnx_input_im, onnx_output_file, verbose=False, opset_version=args.opset, do_constant_folding=True,
+        input_names=['input'], output_names=['output'], dynamic_axes=dynamic_axes if args.dynamic else None
+    )

     if args.simplify:
         print('Simplifying the ONNX model')
-        import onnxsim
+        import onnxslim
         model_onnx = onnx.load(onnx_output_file)
-        model_onnx, _ = onnxsim.simplify(model_onnx)
+        model_onnx = onnxslim.slim(model_onnx)
         onnx.save(model_onnx, onnx_output_file)

-    print('Done: %s\n' % onnx_output_file)
+    print(f'Done: {onnx_output_file}\n')


 def parse_args():
+    import argparse
     parser = argparse.ArgumentParser(description='DeepStream YOLOR conversion')
-    parser.add_argument('-w', '--weights', required=True, help='Input weights (.pt) file path (required)')
+    parser.add_argument('-w', '--weights', required=True, type=str, help='Input weights (.pt) file path (required)')
     parser.add_argument('-c', '--cfg', default='', help='Input cfg (.cfg) file path')
     parser.add_argument('-s', '--size', nargs='+', type=int, default=[640], help='Inference size [H,W] (default [640])')
     parser.add_argument('--p6', action='store_true', help='P6 model')
```
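One side effect of the `onnx_output_file` change in this hunk: the old expression stripped both the directory and the `.pt` suffix, so the ONNX file landed in the current working directory, while the new f-string appends `.onnx` to the full weights path, placing the output next to the weights file. A quick sketch with a hypothetical path:

```python
import os

weights = 'weights/yolor_p6.pt'  # hypothetical input path

# Old behavior: basename, drop '.pt', write into the current directory.
old_name = os.path.basename(weights).split('.pt')[0] + '.onnx'
# New behavior: append '.onnx' to the full path, write next to the weights.
new_name = f'{weights}.onnx'

print(old_name)  # yolor_p6.onnx
print(new_name)  # weights/yolor_p6.pt.onnx
```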
```diff
@@ -134,4 +132,4 @@ def parse_args():

 if __name__ == '__main__':
     args = parse_args()
-    sys.exit(main(args))
+    main(args)
```
```diff
@@ -1,10 +1,8 @@
 import os
-import sys
-import argparse
-import warnings
 import onnx
 import torch
 import torch.nn as nn

 from yolox.exp import get_exp
 from yolox.utils import replace_module
 from yolox.models.network_blocks import SiLU
```
```diff
@@ -16,17 +14,14 @@ class DeepStreamOutput(nn.Module):

     def forward(self, x):
         boxes = x[:, :, :4]
+        convert_matrix = torch.tensor(
+            [[1, 0, 1, 0], [0, 1, 0, 1], [-0.5, 0, 0.5, 0], [0, -0.5, 0, 0.5]], dtype=boxes.dtype, device=boxes.device
+        )
+        boxes @= convert_matrix
         objectness = x[:, :, 4:5]
-        scores, classes = torch.max(x[:, :, 5:], 2, keepdim=True)
+        scores, labels = torch.max(x[:, :, 5:], dim=-1, keepdim=True)
         scores *= objectness
-        classes = classes.float()
-        return boxes, scores, classes
+        return torch.cat([boxes, scores, labels.to(boxes.dtype)], dim=-1)


-def suppress_warnings():
-    warnings.filterwarnings('ignore', category=torch.jit.TracerWarning)
-    warnings.filterwarnings('ignore', category=UserWarning)
-    warnings.filterwarnings('ignore', category=DeprecationWarning)
-
-
 def yolox_export(weights, exp_file):
```
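The `convert_matrix` added to `forward` turns YOLOX's `(cx, cy, w, h)` box encoding into `(x1, y1, x2, y2)` corners with a single matmul. A dependency-free sketch of the same arithmetic for one box:

```python
# Same 4x4 matrix as the convert_matrix in the diff above.
CONVERT = [
    [1.0, 0.0, 1.0, 0.0],    # cx contributes to x1 and x2
    [0.0, 1.0, 0.0, 1.0],    # cy contributes to y1 and y2
    [-0.5, 0.0, 0.5, 0.0],   # w shifts x1 by -w/2 and x2 by +w/2
    [0.0, -0.5, 0.0, 0.5],   # h shifts y1 by -h/2 and y2 by +h/2
]

def xywh_to_xyxy(box):
    # Equivalent of `boxes @= convert_matrix` for a single 4-vector.
    return [sum(box[i] * CONVERT[i][j] for i in range(4)) for j in range(4)]

print(xywh_to_xyxy([10.0, 20.0, 4.0, 6.0]))  # [8.0, 17.0, 12.0, 23.0]
```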
```diff
@@ -42,10 +37,19 @@ def yolox_export(weights, exp_file):
     return model, exp


+def suppress_warnings():
+    import warnings
+    warnings.filterwarnings('ignore', category=torch.jit.TracerWarning)
+    warnings.filterwarnings('ignore', category=UserWarning)
+    warnings.filterwarnings('ignore', category=DeprecationWarning)
+    warnings.filterwarnings('ignore', category=FutureWarning)
+    warnings.filterwarnings('ignore', category=ResourceWarning)
+
+
 def main(args):
     suppress_warnings()

-    print('\nStarting: %s' % args.weights)
+    print(f'\nStarting: {args.weights}')

     print('Opening YOLOX model')
```
```diff
@@ -57,39 +61,35 @@ def main(args):
     img_size = [exp.input_size[1], exp.input_size[0]]

     onnx_input_im = torch.zeros(args.batch, 3, *img_size).to(device)
-    onnx_output_file = os.path.basename(args.weights).split('.pt')[0] + '.onnx'
+    onnx_output_file = f'{args.weights}.onnx'

     dynamic_axes = {
         'input': {
             0: 'batch'
         },
-        'boxes': {
-            0: 'batch'
-        },
-        'scores': {
-            0: 'batch'
-        },
-        'classes': {
+        'output': {
             0: 'batch'
         }
     }

     print('Exporting the model to ONNX')
-    torch.onnx.export(model, onnx_input_im, onnx_output_file, verbose=False, opset_version=args.opset,
-                      do_constant_folding=True, input_names=['input'], output_names=['boxes', 'scores', 'classes'],
-                      dynamic_axes=dynamic_axes if args.dynamic else None)
+    torch.onnx.export(
+        model, onnx_input_im, onnx_output_file, verbose=False, opset_version=args.opset, do_constant_folding=True,
+        input_names=['input'], output_names=['output'], dynamic_axes=dynamic_axes if args.dynamic else None
+    )

     if args.simplify:
         print('Simplifying the ONNX model')
-        import onnxsim
+        import onnxslim
         model_onnx = onnx.load(onnx_output_file)
-        model_onnx, _ = onnxsim.simplify(model_onnx)
+        model_onnx = onnxslim.slim(model_onnx)
         onnx.save(model_onnx, onnx_output_file)

-    print('Done: %s\n' % onnx_output_file)
+    print(f'Done: {onnx_output_file}\n')


 def parse_args():
+    import argparse
     parser = argparse.ArgumentParser(description='DeepStream YOLOX conversion')
     parser.add_argument('-w', '--weights', required=True, help='Input weights (.pth) file path (required)')
     parser.add_argument('-c', '--exp', required=True, help='Input exp (.py) file path (required)')
```
```diff
@@ -100,8 +100,6 @@ def parse_args():
     args = parser.parse_args()
     if not os.path.isfile(args.weights):
         raise SystemExit('Invalid weights file')
-    if not os.path.isfile(args.exp):
-        raise SystemExit('Invalid exp file')
     if args.dynamic and args.batch > 1:
         raise SystemExit('Cannot set dynamic batch-size and static batch-size at same time')
     return args
```
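The `--dynamic`/`--batch` guard kept at the end of this hunk rejects combining a dynamic batch axis with a static batch size greater than 1. A reduced argparse stand-in (only the two flags the guard touches; the full script has more arguments) shows the behavior:

```python
import argparse

# Reduced stand-in for parse_args(): only the two flags the guard touches.
parser = argparse.ArgumentParser()
parser.add_argument('--batch', type=int, default=1)
parser.add_argument('--dynamic', action='store_true')

args = parser.parse_args(['--dynamic', '--batch', '4'])
try:
    if args.dynamic and args.batch > 1:
        raise SystemExit('Cannot set dynamic batch-size and static batch-size at same time')
except SystemExit as err:
    print(err)  # Cannot set dynamic batch-size and static batch-size at same time
```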
```diff
@@ -109,4 +107,4 @@ def parse_args():

 if __name__ == '__main__':
     args = parse_args()
-    sys.exit(main(args))
+    main(args)
```