Incorrect inference results after replacing the default YOLOv3 in AI Express with a self-trained model

I replaced the default YOLOv3 in AI Express with a self-trained model, but inference no longer produces correct results. My steps are below; could the developers on this forum please take a look?

1. I trained my own YOLOv3 and converted it with the toolchain into an X3-compatible model. During conversion I set input_type_rt: 'nv12'.
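For reference, the relevant fragment of the conversion config looks roughly like this (a sketch following the Horizon toolchain yaml conventions; only input_type_rt comes from this post, the other keys are illustrative and must match your own training pipeline):

```yaml
input_parameters:
  input_type_rt: 'nv12'     # runtime input format on the X3 board (from the post)
  input_type_train: 'rgb'   # illustrative: the format the model was trained with
  norm_type: 'data_scale'   # illustrative: normalization applied during conversion
```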

2. I copied the toolchain example, including the converted .bin file, the scripts, and a test image, to the X3 board via scp. Running dev_board_01_infer.sh on a test image produces the correct result.

3. I built the unmodified AI Express 2.9 and deployed it to the same X3 board. All the examples, including the YOLOv3 one, run normally.

4. I used the same .bin file to replace the corresponding file in AI Express 2.9 (yolov3_nv12_hybrid_horizonrt.bin), and modified the default_yolo3_config variable in yolov3_post_process_method.cc to match my model's class count and class names. After rebuilding AI Express and deploying it to the board, I ran the YOLOv3 example in callback mode against the same test image. Inference reports success, but the result is wrong. The log shows no errors, so I cannot tell where the problem might be.

The image file is disk_X3SDB-Linux-20210104_p1_wb_2G.img; the AI Express version is 2.9.0.

I run sh run.sh d and enter 17, 3, 2, 1, 2 in sequence.

The log is as follows:

(convert.cpp:45): vio message, frame_id = 1

(convert.cpp:47): vio message, pym_level_0, width=1920, height=1080, stride=1920

(convert.cpp:47): vio message, pym_level_1, width=0, height=0, stride=0

(convert.cpp:47): vio message, pym_level_2, width=0, height=0, stride=0

(convert.cpp:47): vio message, pym_level_3, width=0, height=0, stride=0

(convert.cpp:47): vio message, pym_level_4, width=960, height=540, stride=960

(convert.cpp:47): vio message, pym_level_5, width=800, height=480, stride=800

(convert.cpp:47): vio message, pym_level_6, width=640, height=360, stride=640

(convert.cpp:47): vio message, pym_level_7, width=0, height=0, stride=0

(convert.cpp:47): vio message, pym_level_8, width=480, height=270, stride=480

(convert.cpp:47): vio message, pym_level_9, width=0, height=0, stride=0

(convert.cpp:47): vio message, pym_level_10, width=0, height=0, stride=0

(convert.cpp:47): vio message, pym_level_11, width=0, height=0, stride=0

(convert.cpp:47): vio message, pym_level_12, width=240, height=134, stride=240

(convert.cpp:47): vio message, pym_level_13, width=0, height=0, stride=0

(convert.cpp:47): vio message, pym_level_14, width=0, height=0, stride=0

(convert.cpp:47): vio message, pym_level_15, width=0, height=0, stride=0

(convert.cpp:47): vio message, pym_level_16, width=120, height=66, stride=128

(convert.cpp:47): vio message, pym_level_17, width=0, height=0, stride=0

(convert.cpp:47): vio message, pym_level_18, width=0, height=0, stride=0

(convert.cpp:47): vio message, pym_level_19, width=0, height=0, stride=0

(convert.cpp:47): vio message, pym_level_20, width=60, height=32, stride=64

(convert.cpp:47): vio message, pym_level_21, width=0, height=0, stride=0

(convert.cpp:47): vio message, pym_level_22, width=0, height=0, stride=0

(convert.cpp:47): vio message, pym_level_23, width=0, height=0, stride=0

(convert.cpp:57): Input Frame ID = 1, Timestamp = 1

(convert.cpp:77): input name:image

(runtime_monitor.cpp:40): PushFrame frame_id = 1

(smartplugin.cpp:2760): feed one task to xtream workflow

(DnnPredictMethod.cpp:317): DnnPredictMethod DoProcess

(yolov3_predict_method.cc:60): src image height: 1080, src image width: 1920

(yolov3_predict_method.cc:70): Yolov3PredictMethod PrepareInputData

(DnnPredictMethod.cpp:137): input_tensor.data_shape.d[0]: 1, input_tensor.data_shape.d[1]: 3, input_tensor.data_shape.d[2]: 416, input_tensor.data_shape.d[3]: 416, input_tensor.data_shape.layout: 2

(DnnPredictMethod.cpp:146): input_height: 416, input_width: 416, input_channel: 3

(websocketplugin.cpp:355): WebsocketPLugin Feedvideo

(convert.cpp:80): websocketplugin x3 mediacodec: GetYUV

(DnnPredictMethod.cpp:230): node.aligned_shape.d[j]:1

(DnnPredictMethod.cpp:230): node.aligned_shape.d[j]:13

(DnnPredictMethod.cpp:230): node.aligned_shape.d[j]:13

(DnnPredictMethod.cpp:230): node.aligned_shape.d[j]:87

(DnnPredictMethod.cpp:238): output_size: 58816

(DnnPredictMethod.cpp:230): node.aligned_shape.d[j]:1

(DnnPredictMethod.cpp:230): node.aligned_shape.d[j]:26

(DnnPredictMethod.cpp:230): node.aligned_shape.d[j]:26

(DnnPredictMethod.cpp:230): node.aligned_shape.d[j]:87

(DnnPredictMethod.cpp:238): output_size: 235248

(DnnPredictMethod.cpp:230): node.aligned_shape.d[j]:1

(DnnPredictMethod.cpp:230): node.aligned_shape.d[j]:52

(DnnPredictMethod.cpp:230): node.aligned_shape.d[j]:52

(DnnPredictMethod.cpp:230): node.aligned_shape.d[j]:87

(DnnPredictMethod.cpp:238): output_size: 940992

(DnnPredictMethod.cpp:450): DnnPredictMethod DoProcess Success

(scheduler.cpp:457): ScheduleImp2: Yolov3PredictMethod

(yolov3_post_process_method.cc:63): Yolov3PostProcessMethod ParseDnnResult

(yolov3_post_process_method.cc:82): yolov3 model output layer: 3

(yolov3_post_process_method.cc:194): ( x1: 0 y1: 838.817 x2: 1919 y2: 1078 score: 0.696941 ), id: 16, category_name: Panthera pardus

(scheduler.cpp:457): ScheduleImp2: Yolov3PostProcessMethod

(DnnPostProcessMethod.cpp:70): DnnPostProcessMethod DoProcess

(DnnPredictMethod.cpp:317): DnnPredictMethod DoProcess

(mobilenetv2_predict_method.cc:39): Mobilenetv2PredictMethod PrepareInputData

(DnnPredictMethod.cpp:137): input_tensor.data_shape.d[0]: 1, input_tensor.data_shape.d[1]: 3, input_tensor.data_shape.d[2]: 224, input_tensor.data_shape.d[3]: 224, input_tensor.data_shape.layout: 2

(DnnPredictMethod.cpp:146): input_height: 224, input_width: 224, input_channel: 3

(DnnPredictMethod.cpp:230): node.aligned_shape.d[j]:1

(DnnPredictMethod.cpp:230): node.aligned_shape.d[j]:1000

(DnnPredictMethod.cpp:230): node.aligned_shape.d[j]:1

(DnnPredictMethod.cpp:230): node.aligned_shape.d[j]:1

(DnnPredictMethod.cpp:238): output_size: 4000

(DnnPredictMethod.cpp:450): DnnPredictMethod DoProcess Success

(scheduler.cpp:457): ScheduleImp2: Mobilenetv2PredictMethod

(DnnPostProcessMethod.cpp:70): DnnPostProcessMethod DoProcess

(mobilenetv2_post_process_method.cc:51): Mobilenetv2PostProcessMethod ParseDnnResult

(mobilenetv2_post_process_method.cc:71): id: 399, category_name: abaya

(scheduler.cpp:457): ScheduleImp2: Mobilenetv2PostProcessMethod

(smartplugin.cpp:2907): smart plugin got one smart result

(smartplugin.cpp:2913): image, type is ImageFrame

(smartplugin.cpp:2913): detect_box, type is BaseDataVector

(smartplugin.cpp:2913): classify, type is BaseDataVector

(mergehandbody.cpp:48): image, type is ImageFrame

(mergehandbody.cpp:48): detect_box, type is BaseDataVector

(mergehandbody.cpp:48): classify, type is BaseDataVector

(mergehandbody.cpp:325): UpdateHandTrackID recv null pointer

(mergehandbody.cpp:376): FilterGesture recv null pointer

explicit ExampleCustomSmartMessage

(smartplugin.cpp:2958): smart result image name = configs/vio_hg/Heixiong0.jpg

(smartplugin.cpp:2959): smart result frame_id = 0

(runtime_monitor.cpp:55): Pop frame 0

(viomessage.cpp:51): begin remove one vio slot

(viomessage.cpp:89): free feedback context success

(viomessage.cpp:45): call ~ImageVioMessage

(scheduler.cpp:423): FrameworkDataState_Ready

(ExampleSmartPlugin.cpp:40): output name: image

(ExampleSmartPlugin.cpp:40): output name: detect_box

(ExampleSmartPlugin.cpp:45): box type: detect_box, box size: 1

(ExampleSmartPlugin.cpp:40): output name: classify

^Crecv signal 2, stop

(vioproduce.cpp:206): consumed_vio_buffers_=2

(vioproduce.cpp:379): wait task to finish

(viopipeline.cpp:177): Enter vio pipeline module stop

(viopipeline.cpp:183): viopipeline stop, cam_en: 0 pipe_id: 0

(vpsmodule.cpp:525): Enter vps module stop, pipe_id: 0

(vpsmodule.cpp:470): Enter VpsDestoryPymThread, pipe_id: 0 start_flag: 1

(vpsmodule.cpp:1113): No Pym info in RingQueue! pipe_id: 0

(vpsmodule.cpp:1418): Quit get pym data thread, pipe_id: 0

(viopipeline.cpp:221): vps module get info failed, pipe_id: 0

(vioproduce.cpp:634): iot_vio_pyramid_info failed

(vioproduce.cpp:1561): fill vio image failed, ret: 0

(vpsmodule.cpp:487): Quit VpsDestoryPymThread, pipe_id: 0 start_flag: 0

(vpsmodule.cpp:538): HB_VPS_StopGrp, pipe_id: 0

(executor.cpp:123): Finish a job

(executor.cpp:107): task_queue_ is empty

(yolov3_post_process_method.cc:63): Yolov3PostProcessMethod ParseDnnResult

(yolov3_post_process_method.cc:82): yolov3 model output layer: 3

(yolov3_post_process_method.cc:194): ( x1: 0 y1: 838.817 x2: 1919 y2: 1078 score: 0.696941 ), id: 16, category_name: Panthera pardus

(scheduler.cpp:457): ScheduleImp2: Yolov3PostProcessMethod

(DnnPredictMethod.cpp:317): DnnPredictMethod DoProcess

(mobilenetv2_predict_method.cc:39): Mobilenetv2PredictMethod PrepareInputData

(DnnPredictMethod.cpp:137): input_tensor.data_shape.d[0]: 1, input_tensor.data_shape.d[1]: 3, input_tensor.data_shape.d[2]: 224, input_tensor.data_shape.d[3]: 224, input_tensor.data_shape.layout: 2

(DnnPredictMethod.cpp:146): input_height: 224, input_width: 224, input_channel: 3

(DnnPredictMethod.cpp:230): node.aligned_shape.d[j]:1

(DnnPredictMethod.cpp:230): node.aligned_shape.d[j]:1000

(DnnPredictMethod.cpp:230): node.aligned_shape.d[j]:1

(DnnPredictMethod.cpp:230): node.aligned_shape.d[j]:1

(DnnPredictMethod.cpp:238): output_size: 4000

(DnnPredictMethod.cpp:450): DnnPredictMethod DoProcess Success

(scheduler.cpp:457): ScheduleImp2: Mobilenetv2PredictMethod

(DnnPostProcessMethod.cpp:70): DnnPostProcessMethod DoProcess

(mobilenetv2_post_process_method.cc:51): Mobilenetv2PostProcessMethod ParseDnnResult

(mobilenetv2_post_process_method.cc:71): id: 399, category_name: abaya

(scheduler.cpp:457): ScheduleImp2: Mobilenetv2PostProcessMethod

(smartplugin.cpp:2907): smart plugin got one smart result

(smartplugin.cpp:2913): image, type is ImageFrame

(smartplugin.cpp:2913): detect_box, type is BaseDataVector

(smartplugin.cpp:2913): classify, type is BaseDataVector

(mergehandbody.cpp:48): image, type is ImageFrame

(mergehandbody.cpp:48): detect_box, type is BaseDataVector

(mergehandbody.cpp:48): classify, type is BaseDataVector

(mergehandbody.cpp:325): UpdateHandTrackID recv null pointer

(mergehandbody.cpp:376): FilterGesture recv null pointer

explicit ExampleCustomSmartMessage

(smartplugin.cpp:2958): smart result image name = configs/vio_hg/Heixiong0.jpg

(smartplugin.cpp:2959): smart result frame_id = 1

For reference, here is the log of the toolchain example producing the correct result on the same board. For the same image, the correct result is id=21.

root@x3sdbx3-samsung2G-3200:/userdata/samples/yolov3# sh dev_board_01_infer.sh

./release/bin/example --model_file=./yolov3_hybrid_horizonrt.bin --model_name=yolov3 --input_type=image --input_config_string={"image_list_file":"image_list.txt","width":416,"height":416,"data_type":1} --output_type=image --output_config_string={"image_output_dir":"image_out"} --core_num=1 --post_process_config_string={"score_threshold":0.45} --enable_post_process=true

[1970-01-05 07:49:13 INFO 547578486800 hr_api.cpp:467] HorizonRT version = 1.2.2

[1970-01-05 07:49:13 INFO 547578486800 hr_api.cpp:472] hbrt version = 3.10.4

[HBRT] set log level as 0. version = 3.10.4

bpu engine_type is group.

[BPU_PLAT]Makesure BPU Core Opened!!!)

[BPU_PLAT]BPU Platform Version(1.2.1)!

[BPU_PLAT]Makesure BPU Core(1) Opened!!!

[HorizonRT] The model builder version = 1.1.51

[HorizonRT] Hybrid model hbm_hbrt_version = 3.10.4

I0105 07:49:15.614523 20083 simple_example.cc:74] Model info:Input num:1, input[0]: name:data, data type:BPU_TYPE_IMG_YUV_NV12, shape:(1,3,416,416,), layout:BPU_LAYOUT_NCHW, aligned shape:(1,4,416,416,), layout:BPU_LAYOUT_NCHW, Output num:3, output[0]: name:transpose_layer82-conv2nhwc, op:1, data type:BPU_TYPE_TENSOR_F32, shape:(1,13,13,87,), layout:BPU_LAYOUT_NHWC, aligned shape:(1,13,13,87,), layout:BPU_LAYOUT_NHWC, output[1]: name:transpose_layer94-conv2nhwc, op:1, data type:BPU_TYPE_TENSOR_F32, shape:(1,26,26,87,), layout:BPU_LAYOUT_NHWC, aligned shape:(1,26,26,87,), layout:BPU_LAYOUT_NHWC, output[2]: name:transpose_layer106-conv2nhwc, op:1, data type:BPU_TYPE_TENSOR_F32, shape:(1,52,52,87,), layout:BPU_LAYOUT_NHWC, aligned shape:(1,52,52,87,), layout:BPU_LAYOUT_NHWC

I0105 07:49:16.126553 20083 simple_example.cc:156] Image:Heixiong0.jpg, infer result:[{"bbox":[0.000000,51.704739,1551.651978,1028.003540],"score":0.982805,"id":21}]

I0105 07:49:16.126795 20083 simple_example.cc:172] Whole process statistics:count:1, duration:174.362ms, min:174.362ms, max:174.362ms, average:174.362ms, fps:5.73519/s

, Infer stage statistics:count:1, duration:169.986ms, min:169.986ms, max:169.986ms, average:169.986ms, fps:5.88284/s

, Post process stage statistics:count:1, duration:4.375ms, min:4.375ms, max:4.375ms, average:4.375ms, fps:228.571/s

The toolchain runs without errors, which shows the model itself is fine. Have you verified the post-processing module?

In the post-processing module I only changed the default_yolo3_config variable in yolov3_post_process_method.cc: since my class count is 24 rather than the original model's 80, I updated its class_num and class_names fields. Nothing else was changed. I also compared it against the post-processing function in the toolchain and found no obvious differences.

Are you using the yolov3+mobilenet method?

Yes. In the example, YOLOv3 and MobileNet run in series, but I don't care about the MobileNet result for now, so I left that stage in place and only modified the YOLOv3 part.

Hi, could you share your modified source code?

Hi, I only changed the last two fields of this variable: class_num and class_names.

Yolo3Config default_yolo3_config = {
    {32, 16, 8},
    {{{3.625, 2.8125}, {4.875, 6.1875}, {11.65625, 10.1875}},
     {{1.875, 3.8125}, {3.875, 2.8125}, {3.6875, 7.4375}},
     {{1.25, 1.625}, {2.0, 3.75}, {4.125, 2.875}}},
    24,
    {"Ailuropoda melanoleuca", "Ailurus fulgens", "Budorcas taxicolor",
     "Capricornis sumatraensis", "Cuon alpinus", "Dog",
     "Felis lynx", "Goose", "Hydropotes inermis",
     "Leopard", "Lophophorus lhuysii", "Macaca arctoides",
     "Macaca mulatta", "Manis pentadactyla", "Naemorhedus goral",
     "Neofelis nebulosa", "Panthera pardus", "Pucrasia macrolopha",
     "Rhinopithecu", "Rabbit", "Tragopan temminckii",
     "Ursus thibetanus", "Viverra zibetha", "Viverricula indica"}  // changed class num and names
    /* original values:
    80,
    {"person", "bicycle", "car",
     "motorcycle", "airplane", "bus",
     "train", "truck", "boat",
     "traffic light", "fire hydrant", "stop sign",
     "parking meter", "bench", "bird",
     "cat", "dog", "horse",
     "sheep", "cow", "elephant",
     "bear", "zebra", "giraffe",
     "backpack", "umbrella", "handbag",
     "tie", "suitcase", "frisbee",
     "skis", "snowboard", "sports ball",
     "kite", "baseball bat", "baseball glove",
     "skateboard", "surfboard", "tennis racket",
     "bottle", "wine glass", "cup",
     "fork", "knife", "spoon",
     "bowl", "banana", "apple",
     "sandwich", "orange", "broccoli",
     "carrot", "hot dog", "pizza",
     "donut", "cake", "chair",
     "couch", "potted plant", "bed",
     "dining table", "toilet", "tv",
     "laptop", "mouse", "remote",
     "keyboard", "cell phone", "microwave",
     "oven", "toaster", "sink",
     "refrigerator", "book", "clock",
     "vase", "scissors", "teddy bear",
     "hair drier", "toothbrush"} */
};

Hi, could you send the code and model to my email: guosheng.xu@horizon.ai

Could it be that the input image size is not specified?

I'm not quite sure which files would need to change; could you be more specific? I did not touch the pyramid-handling code from the original example. Is there anything else that needs special configuration because the class count changed? From reading the pyramid documentation, it only performs scaling and padding and does not involve the inference class count.

The default YOLOv3 example model in AI Express takes 416*416 nv12 input. I trained my own YOLOv3 at 416*416 as well, and specified nv12 when converting to the X3 .bin file. Apart from the class count changing from 80 to 24, everything else was kept identical to the AI Express example, precisely to simplify the replacement. So I don't understand what other settings would be needed.

With the help of technical support, the root cause was found:

When training the model, I preprocessed images with a direct resize, whereas AI Express pads first and then resizes.

The suggested fix has been verified to work.

Thanks to the technical support on the forum.
