First, following tutorial 1 (https://blog.csdn.net/SA2672873269/article/details/139780749), I trained a YOLOv8 object detection model and deployed it for inference on the X3 board. The final accuracy met the target, but inference latency was 140-180 ms, which misses the requirement (within 60 ms). I then converted the same model to a final .bin model following tutorial 2 (https://blog.csdn.net/SA2672873269/article/details/142663629) and ran it with the same post-processing and inference code from tutorial 1, but the inference outputs were all empty (the test set was unchanged). My questions:
1) How do I debug and fix the mismatched (empty) outputs on the X5 platform? (This is the priority.)
2) Is there still room to optimize inference latency on X3? (A model trained directly with the official ultralytics 8.3 release has high latency after deployment to the board; one trained with Horizon's modified 8.0.175 has very low latency but poor accuracy.)
head.py was modified as follows:
    # def forward_horizon(self, x):
    #     # X3-compatible implementation
    #     bbox = []
    #     cls = []
    #     for i in range(self.nl):
    #         bbox.append(self.cv2[i](x[i]))
    #         cls.append(self.cv3[i](x[i]))
    #     return (bbox, cls)

    def forward_horizon(self, x):
        # X5-compatible implementation
        bboxes = [self.cv2[i](x[i]).permute(0, 2, 3, 1).contiguous() for i in range(self.nl)]
        clses = [self.cv3[i](x[i]).permute(0, 2, 3, 1).contiguous() for i in range(self.nl)]
        return (bboxes, clses)
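One thing worth noting: after quantized compilation, the three bbox outputs of the .bin come out as per-channel quantized int32 tensors (quanti type SCALE with quantizeAxis 3 in the model_info dump below), so post-processing written against X3's float outputs will read garbage unless it dequantizes first. A minimal numpy sketch of that step, assuming `raw` is one int32 output in its valid NHWC shape and `scales` is that output's "scale data" list (the helper name is my own):

```python
import numpy as np

def dequantize_s32(raw: np.ndarray, scales: np.ndarray) -> np.ndarray:
    """Dequantize one S32 output of the .bin model.

    raw    : int32 tensor in valid shape (1, H, W, C)
    scales : C-element per-channel scale list printed by
             hrt_model_exec for that output (quantizeAxis=3)
    """
    # Per-channel scale along the last (channel) axis.
    return raw.astype(np.float32) * np.asarray(scales, np.float32).reshape(1, 1, 1, -1)
```

The F32 outputs (the three 1-channel class tensors below) need no dequantization.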
Model info:
hrt_model_exec model_info --model_file yolo8s_0708_bayes_nv12_modified.bin
I0000 00:00:00.000000 15937 vlog_is_on.cc:197] RAW: Set VLOG level for "*" to 3
core[0] open!
core[1] open!
[HBRT] set log level as 0. version = 3.15.55.0
[DNN] Runtime version = 1.24.5_(3.15.55 HBRT)
[A][DNN][packed_model.cpp:247][Model](2025-07-08,14:59:42.398.58) [HorizonRT] The model builder version = 1.24.3
Load model to DDR cost 589.837ms.
This model file has 1 model:
[yolo8s_0708_bayes_nv12]
---------------------------------------------------------------------
[model name]: yolo8s_0708_bayes_nv12
input[0]:
name: images
input source: HB_DNN_INPUT_FROM_PYRAMID
valid shape: (1,3,640,640,)
aligned shape: (1,3,640,640,)
aligned byte size: 614400
tensor type: HB_DNN_IMG_TYPE_NV12
tensor layout: HB_DNN_LAYOUT_NCHW
quanti type: NONE
stride: (0,0,0,0,)
output[0]:
name: output0
valid shape: (1,80,80,64,)
aligned shape: (1,80,80,64,)
aligned byte size: 1638400
tensor type: HB_DNN_TENSOR_TYPE_S32
tensor layout: HB_DNN_LAYOUT_NHWC
quanti type: SCALE
stride: (1638400,20480,256,4,)
scale data: 0.000761131,0.000722571,0.000573587,0.000642382,0.000706796,0.000599878,0.000532397,0.000292052,0.000277373,0.000162567,0.000160596,0.000166511,0.000168045,0.00016202,0.000152818,0.000179657,0.000854903,0.000634933,0.000598564,0.000528454,0.000447389,0.000427452,0.000303226,0.000305855,0.00021274,0.000158624,0.000153913,0.00014493,0.000138905,0.000123021,0.000109656,0.000114367,0.00060908,0.000703729,0.000587171,0.000631866,0.000708987,0.000510926,0.000469299,0.000395245,0.00026598,0.00016684,0.000147779,0.000168702,0.000177137,0.000174837,0.000173851,0.000180424,0.00103675,0.00067437,0.000673056,0.000567891,0.000521443,0.000512241,0.000375307,0.000242975,0.000167935,0.000148984,0.000155228,0.000151722,0.000135619,0.000118858,0.000119844,0.000120282,
quantizeAxis: 3
output[1]:
name: 326
valid shape: (1,40,40,64,)
aligned shape: (1,40,40,64,)
aligned byte size: 409600
tensor type: HB_DNN_TENSOR_TYPE_S32
tensor layout: HB_DNN_LAYOUT_NHWC
quanti type: SCALE
stride: (409600,10240,256,4,)
scale data: 0.00100865,0.000932273,0.000794042,0.00072461,0.000837594,0.000672852,0.000638137,0.000505586,0.000413432,0.000344632,0.000314177,0.000278041,0.000281197,0.000291454,0.000273938,0.000314334,0.00131793,0.00106672,0.00109638,0.000930379,0.00070883,0.000615729,0.000539986,0.000372404,0.000354731,0.000297766,0.000206558,0.000226756,0.000250899,0.000250899,0.000235909,0.000246008,0.00117654,0.00101496,0.000869154,0.0010907,0.000863473,0.000838225,0.000545983,0.000547561,0.000355993,0.000392287,0.000357571,0.000312125,0.00025374,0.000289402,0.000302815,0.000360727,0.00127754,0.00100739,0.00110143,0.00100739,0.000768163,0.000586695,0.000497696,0.000419113,0.000287193,0.000234962,0.000246481,0.000227545,0.00019567,0.00022723,0.000248217,0.000281512,
quantizeAxis: 3
output[2]:
name: 334
valid shape: (1,20,20,64,)
aligned shape: (1,20,20,64,)
aligned byte size: 102400
tensor type: HB_DNN_TENSOR_TYPE_S32
tensor layout: HB_DNN_LAYOUT_NHWC
quanti type: SCALE
stride: (102400,5120,256,4,)
scale data: 0.00119648,0.00124915,0.00126144,0.00112538,0.00115522,0.00129655,0.00131499,0.00109202,0.00109729,0.000969123,0.00119824,0.000865539,0.000752299,0.000564444,0.000238331,0.000147804,0.00142735,0.00126495,0.00130972,0.00146949,0.00142208,0.0011324,0.00105164,0.00110431,0.00108149,0.0011482,0.000833937,0.000879585,0.000688218,0.000382733,0.00010523,0.00010501,0.00117629,0.00129655,0.00121667,0.00129392,0.00117805,0.0012632,0.00130358,0.00111835,0.0011245,0.00111309,0.000958589,0.000796191,0.000455593,0.000517041,0.000255229,0.000114776,0.00140979,0.00156692,0.00164769,0.00138785,0.00116927,0.00105076,0.000942788,0.000862467,0.000817259,0.000826476,0.000725965,0.000618869,0.000467883,0.000282222,0.000112801,0.000110058,
quantizeAxis: 3
output[3]:
name: 342
valid shape: (1,80,80,1,)
aligned shape: (1,80,80,1,)
aligned byte size: 25600
tensor type: HB_DNN_TENSOR_TYPE_F32
tensor layout: HB_DNN_LAYOUT_NCHW
quanti type: NONE
stride: (25600,320,4,4,)
output[4]:
name: 350
valid shape: (1,40,40,1,)
aligned shape: (1,40,40,1,)
aligned byte size: 6400
tensor type: HB_DNN_TENSOR_TYPE_F32
tensor layout: HB_DNN_LAYOUT_NCHW
quanti type: NONE
stride: (6400,160,4,4,)
output[5]:
name: 358
valid shape: (1,20,20,1,)
aligned shape: (1,20,20,1,)
aligned byte size: 1600
tensor type: HB_DNN_TENSOR_TYPE_F32
tensor layout: HB_DNN_LAYOUT_NCHW
quanti type: NONE
stride: (1600,80,4,4,)
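Going by the dump above, each bbox output carries 64 channels (4 box sides x 16 DFL bins) and each class output carries a single F32 logit, so the X5 post-processing has to do the DFL decode itself. A hedged sketch for one pyramid level, assuming the int32 bbox tensor has already been dequantized to float and that the anchor convention matches ultralytics (cell centers at +0.5; the function name and `conf_thres` default are my own):

```python
import numpy as np

def decode_level(bbox, cls, stride, conf_thres=0.25):
    """Decode one pyramid level.

    bbox   : dequantized (1, H, W, 64) DFL logits (4 sides x 16 bins)
    cls    : raw (1, H, W, 1) class logits (single-class model)
    stride : 8 / 16 / 32 for the 80x80 / 40x40 / 20x20 levels
    """
    _, H, W, _ = bbox.shape
    # Softmax over the 16 DFL bins, then take the expectation
    # to get the four distances (left, top, right, bottom).
    d = bbox.reshape(H * W, 4, 16)
    d = np.exp(d - d.max(axis=-1, keepdims=True))
    d = d / d.sum(axis=-1, keepdims=True)
    dist = (d * np.arange(16)).sum(axis=-1)          # (HW, 4)
    # Grid cell centers, then distances -> corner coordinates in pixels.
    gy, gx = np.mgrid[0:H, 0:W]
    cx = gx.reshape(-1) + 0.5
    cy = gy.reshape(-1) + 0.5
    x1 = (cx - dist[:, 0]) * stride
    y1 = (cy - dist[:, 1]) * stride
    x2 = (cx + dist[:, 2]) * stride
    y2 = (cy + dist[:, 3]) * stride
    score = 1.0 / (1.0 + np.exp(-cls.reshape(-1)))   # sigmoid
    keep = score > conf_thres
    return np.stack([x1, y1, x2, y2], axis=1)[keep], score[keep]
```

Run this per level with strides 8/16/32 for the 80/40/20 grids, concatenate the kept boxes and scores, then apply NMS as in the tutorial-1 code.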
I noticed the post-processing and inference code differ somewhat between the two tutorials, but even after partial modifications it still doesn't work. So the first question is how to understand these differences and unify them in the tutorial-1 code. Any help from the experts would be appreciated.