Using the DSP on J5

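Hello, please describe the problem you ran into in detail.

1. Hardware source:

2. Current system image version:

3. Current OpenExplorer (天工开物) version:

4. Problem localization:

5. Demo/case under development:

6. Solution needed: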

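With OE 1.1.37, the Softmax custom-operator sample under ddk/samples/vdsp_rpc_sample runs fine on the board. The sample uses the mobilenet-v1 model.

I don't have the DSP toolchain at the moment, so I wanted to swap in a different model and test whether the DSP-side Softmax operator works.

After replacing the model with a ResNet model, the run returns error code -6000002. I looked it up and it means the model is invalid, but this same model runs fine for me on the CPU.

I've run out of ideas on how to proceed. If anyone has time, could you please take a look?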

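Could you first use the hrt_model_exec tool from the 1.1.37 OE package to run perf on your ResNet model on the board and see whether it produces a result?

hrt_model_exec location: ddk/package/board/hrt_tools/bin/hrt_model_exec

Reference command: ./hrt_model_exec perf --model_file resnet.bin

If it prints FPS, latency, and the other statistics normally, the model itself is fine.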
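The check suggested above can be scripted so that it is easy to repeat for several models. The following is a minimal sketch, not an official wrapper: the helper names `build_perf_command` and `run_perf` are made up for illustration, and the only flags used are the ones that appear in this thread (`--model_file`, `--core_id`, `--frame_count`).

```python
import shutil
import subprocess
from typing import List, Optional


def build_perf_command(model_file: str,
                       tool: str = "./hrt_model_exec",
                       core_id: Optional[int] = None,
                       frame_count: Optional[int] = None) -> List[str]:
    """Assemble an `hrt_model_exec perf` command as an argument list."""
    cmd = [tool, "perf", "--model_file", model_file]
    # Optional flags are only added when the caller sets them.
    if core_id is not None:
        cmd += ["--core_id", str(core_id)]
    if frame_count is not None:
        cmd += ["--frame_count", str(frame_count)]
    return cmd


def run_perf(cmd: List[str]) -> Optional[str]:
    """Run the command and return its combined output, or None if the tool is not found."""
    if shutil.which(cmd[0]) is None:
        return None
    result = subprocess.run(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    command = build_perf_command("resnet.bin")
    print(" ".join(command))
    output = run_perf(command)
    print(output if output is not None else "hrt_model_exec not found in this directory")
```

Run it from the directory that contains hrt_model_exec on the board, or pass the full path through the `tool` argument.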

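Hi, thanks for the reply.

I already have the model running on the board, and I have verified that the results are correct.

The only problem is that Softmax runs on the CPU, which makes the FPS very low, so I want to use the DSP.

I'll try this tool when I get back to the office shortly.

This is the result from running on x86, since I don't have the J5 at hand. The frame rate on the board will be much higher than this: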

./hrt_model_exec perf --core_id 0 --frame_count 1 --model_file /home/shiyucun/workspace/zero_one/src/perception/lidars_fpn_resnet18/model_inference_cc/data/fpn_resnet.bin

I0000 00:00:00.000000 179 vlog_is_on.cc:197] RAW: Set VLOG level for "*" to 3

core[0] open!

core[1] open!

[HBRT] set log level as 0. version = 3.15.8.0

[DNN] Runtime version = 1.14.4_(3.15.8 HBRT)

[A][DNN][initializer.cpp:37](1686279622307) Direct mode

[A][DNN][packed_model.cpp:208](1686279622389) [HorizonRT] The model builder version = 1.13.5

Load model to DDR cost 127.18ms.

I0609 11:00:22.434839 179 function_util.cpp:319] get model handle success

I0609 11:00:22.434887 179 function_util.cpp:607] get model input count success

I0609 11:00:22.435356 179 function_util.cpp:638] prepare input tensor success!

I0609 11:00:22.435359 179 function_util.cpp:644] get model output count success

WARNING: hb_bpu_core_estimate_loading is ignored in pseudo_firmware

WARNING: cnn_core_fc_avl_cap is ignored in pseudo_firmware

WARNING: bpu_mem_cache_flush is ignored in pseudo_firmware

Frame count: 1, Thread Average: 70910.718750 ms, thread max latency: 70910.718750 ms, thread min latency: 70910.718750 ms, FPS: 0.014102

Running condition:

Thread number is: 1

Frame count is: 1

Program run time: 70911.344000 ms

Perf result:

Frame totally latency is: 70910.718750 ms

Average latency is: 70910.718750 ms

Frame rate is: 0.014102 FPS
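To compare runs (for example x86 simulation against the board), the summary lines of the perf output can be pulled out with a few regular expressions. This is a rough sketch that assumes the output keeps the exact wording shown in the log above; `parse_perf_result` is a hypothetical helper, not part of the OE package.

```python
import re
from typing import Dict

# Patterns follow the wording of the "Perf result" section printed above.
_PATTERNS = {
    "total_latency_ms": r"Frame totally latency is:\s*([0-9.]+)\s*ms",
    "average_latency_ms": r"Average latency is:\s*([0-9.]+)\s*ms",
    "fps": r"Frame rate is:\s*([0-9.]+)\s*FPS",
}


def parse_perf_result(text: str) -> Dict[str, float]:
    """Extract latency and FPS figures from hrt_model_exec perf output."""
    result: Dict[str, float] = {}
    for key, pattern in _PATTERNS.items():
        match = re.search(pattern, text)
        if match:
            result[key] = float(match.group(1))
    return result


sample = """
Perf result:
Frame totally latency is: 70910.718750 ms
Average latency is: 70910.718750 ms
Frame rate is: 0.014102 FPS
"""

stats = parse_perf_result(sample)
print(stats)
```

With one thread, the reported frame rate is simply 1000 divided by the average latency in milliseconds, which is a quick sanity check on the parsed numbers.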

Deploying Softmax to the BPU:

run_on_bpu: /Softmax;/Softmax_1;/Softmax_2;/Softmax_3;

run_on_bpu: Softmax;Softmax_1;Softmax_2;Softmax_3;

run_on_bpu: Softmax;

run_on_bpu: {/Softmax, /Softmax_1, /Softmax_2, /Softmax_3}

run_on_bpu: {Softmax, Softmax_1, Softmax_2, Softmax_3}

run_on_bpu: {Softmax}

The conversion config file used the parameters above, but after conversion Softmax still runs on the CPU:

root@425524437de2:/mnt/sda/shiyucun/oe/oe1.1.37/zeron/model/model_outputs# hb_mapper makertbin --config /mnt/sda/shiyucun/oe/oe1.1.37/zeron/model/config.yaml --model-type onnx

2023-06-09 14:27:45,249 INFO log will be stored in /mnt/sda/shiyucun/oe/oe1.1.37/zeron/model/model_outputs/hb_mapper_makertbin.log

2023-06-09 14:27:45,249 INFO Start hb_mapper…

2023-06-09 14:27:45,249 INFO hbdk version 3.41.5

2023-06-09 14:27:45,249 INFO horizon_nn version 0.15.5

2023-06-09 14:27:45,249 INFO hb_mapper version 1.13.5

2023-06-09 14:27:45,249 INFO Start Model Convert…

2023-06-09 14:27:45,252 INFO Using onnx model file: /mnt/sda/shiyucun/pcp_j5/larry_pcp/src/super_fast_object_detection/src/sfa/model2onnx/fpn_resnet0420.onnx

2023-06-09 14:27:45,282 INFO Model has 1 inputs according to model file

2023-06-09 14:27:45,282 INFO The calibration dir name suffix is the same as the value float32 of the cal_data_type parameter and will be read with the value of cal_data_type.

2023-06-09 14:27:45,282 INFO custom_op does not exist, skipped

2023-06-09 14:27:45,282 WARNING Input node data’s input_source not set, it will be set to ddr by default

2023-06-09 14:27:45,284 INFO *******************************************

2023-06-09 14:27:45,284 INFO First calibration picture name: 1589168947_0.pcd.bgr

2023-06-09 14:27:45,284 INFO First calibration picture md5:

6f5602903a2881239cdec090376ae782 /mnt/sda/shiyucun/oe/oe1.1.37/zeron/model/calibration_data_bgr_f32/1589168947_0.pcd.bgr

2023-06-09 14:27:45,290 INFO *******************************************

2023-06-09 14:27:45,513 INFO [Fri Jun 9 14:27:45 2023] Start to Horizon NN Model Convert.

2023-06-09 14:27:45,513 INFO Parsing the input parameter:{'data': {'input_shape': [1, 3, 608, 608], 'input_batch': 1, 'expected_input_type': 'BGR_128', 'original_input_type': 'BGR', 'original_input_layout': 'NCHW'}}

2023-06-09 14:27:45,513 INFO Parsing the calibration parameter

2023-06-09 14:27:45,513 INFO There are 1 nodes designated to run on the bpu: ['{ Softmax : None}'].

2023-06-09 14:27:45,513 INFO Parsing the hbdk parameter:{'hbdk_pass_through_params': '--O3 --core-num 1 --fast ', 'input-source': {'data': 'ddr', '_default_value': 'ddr'}}

2023-06-09 14:27:45,513 INFO HorizonNN version: 0.15.5

2023-06-09 14:27:45,513 INFO HBDK version: 3.41.5

2023-06-09 14:27:45,514 INFO [Fri Jun 9 14:27:45 2023] Start to parse the onnx model.

2023-06-09 14:27:45,538 INFO Input ONNX model infomation:

ONNX IR version: 6

Opset version: [11]

Producer: pytorch2.0.1

Domain: none

Input name: data, [1, 3, 608, 608]

Output name: output, [1, 3, 152, 152]

Output name: 342, [1, 2, 152, 152]

Output name: 368, [1, 2, 152, 152]

Output name: 394, [1, 1, 152, 152]

Output name: 420, [1, 3, 152, 152]

2023-06-09 14:27:45,717 INFO [Fri Jun 9 14:27:45 2023] End to parse the onnx model.

2023-06-09 14:27:45,718 INFO Model input names parsed from model: ['data']

2023-06-09 14:27:45,718 INFO Create a preprocessing operator for input_name data with means=None, std=None, original_input_layout=NCHW, color convert from 'BGR' to 'BGR'.

2023-06-09 14:27:45,994 INFO Saving the original float model: fpn_resnet_original_float_model.onnx.

2023-06-09 14:27:45,994 INFO [Fri Jun 9 14:27:45 2023] Start to optimize the model.

2023-06-09 14:27:46,294 INFO [Fri Jun 9 14:27:46 2023] End to optimize the model.

2023-06-09 14:27:46,489 INFO Saving the optimized model: fpn_resnet_optimized_float_model.onnx.

2023-06-09 14:27:46,489 INFO [Fri Jun 9 14:27:46 2023] Start to calibrate the model.

2023-06-09 14:27:46,490 INFO There are 102 samples in the calibration data set.

2023-06-09 14:27:46,696 INFO Run calibration model with kl method.

kl calibration in progress: 0%| | 0/13 [00:00<?, ?it/s]2023-06-09 14:27:49.684791867 [E:onnxruntime:, sequential_executor.cc:183 Execute] Non-zero status code returned while running Reshape node. Name:‘/Unsqueeze_14’ Status Message: /home/jenkins/agent/workspace/model_convert/onnxruntime/onnxruntime/core/providers/cpu/tensor/reshape_helper.h:43 onnxruntime::ReshapeHelper::ReshapeHelper(const onnxruntime::TensorShape&, std::vector&) gsl::narrow_cast<int64_t>(input_shape.Size()) == size was false. The input tensor cannot be reshaped to the requested shape. Input shape:{8,3,152,152}, requested shape:{1,3,152,152,1}

kl calibration in progress: 0%| | 0/13 [00:02<?, ?it/s]

2023-06-09 14:27:49,685 INFO Above info is caused by batch mode infer and can be ignored

2023-06-09 14:27:49,685 INFO Reset batch_size=1 and execute calibration again…

kl calibration in progress: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 102/102 [01:51<00:00, 1.09s/it]

2023-06-09 14:29:40,980 INFO [Fri Jun 9 14:29:40 2023] End to calibrate the model.

2023-06-09 14:29:40,981 INFO [Fri Jun 9 14:29:40 2023] Start to quantize the model.

2023-06-09 14:29:50,457 INFO [Fri Jun 9 14:29:50 2023] End to quantize the model.

2023-06-09 14:29:51,016 INFO Saving the quantized model: fpn_resnet_quantized_model.onnx.

2023-06-09 14:29:51,840 INFO [Fri Jun 9 14:29:51 2023] Start to compile the model with march bayes.

2023-06-09 14:29:52,208 INFO Compile submodel: torch_jit_subgraph_0

2023-06-09 14:29:52,768 INFO hbdk-cc parameters:['--O3', '--core-num', '1', '--fast', '--input-layout', 'NHWC', '--output-layout', 'NHWC', '--input-source', 'ddr']

2023-06-09 14:29:52,781 WARNING Can not find the scale for node HZ_PREPROCESS_FOR_data_NCHW2NHWC_LayoutConvert_Input0

2023-06-09 14:41:54,504 INFO Compile submodel: torch_jit_subgraph_1

2023-06-09 14:41:54,542 INFO hbdk-cc parameters:['--O3', '--core-num', '1', '--fast', '--input-layout', 'NHWC', '--output-layout', 'NHWC', '--input-source', 'ddr,ddr']

2023-06-09 14:41:54,717 INFO Compile submodel: torch_jit_subgraph_2

2023-06-09 14:41:54,754 INFO hbdk-cc parameters:['--O3', '--core-num', '1', '--fast', '--input-layout', 'NHWC', '--output-layout', 'NHWC', '--input-source', 'ddr,ddr']

2023-06-09 14:41:54,872 INFO Compile submodel: torch_jit_subgraph_3

2023-06-09 14:41:54,910 INFO hbdk-cc parameters:['--O3', '--core-num', '1', '--fast', '--input-layout', 'NHWC', '--output-layout', 'NHWC', '--input-source', 'ddr,ddr']

2023-06-09 14:41:55,028 INFO Compile submodel: torch_jit_subgraph_4

2023-06-09 14:41:55,067 INFO hbdk-cc parameters:['--O3', '--core-num', '1', '--fast', '--input-layout', 'NHWC', '--output-layout', 'NHWC', '--input-source', 'ddr,ddr']

2023-06-09 14:41:55,175 INFO Compile submodel: torch_jit_subgraph_5

2023-06-09 14:41:55,211 INFO hbdk-cc parameters:['--O3', '--core-num', '1', '--fast', '--input-layout', 'NHWC', '--output-layout', 'NHWC', '--input-source', 'ddr,ddr']

2023-06-09 14:41:55,730 INFO [Fri Jun 9 14:41:55 2023] End to compile the model with march bayes.

2023-06-09 14:41:55,731 INFO The converted model node information:

==============================================================================================================================================

Node ON Subgraph Type Cosine Similarity Threshold

-----------------------------------------------------------------------------------------------------------------------------------------------

HZ_PREPROCESS_FOR_data BPU id(0) HzSQuantizedPreprocess 0.999902 127.000000

/conv1/Conv BPU id(0) HzSQuantizedConv 0.995028 253.856461

/maxpool/MaxPool BPU id(0) HzQuantizedMaxPool 0.986078 1.761778

/layer1/layer1.0/conv1/Conv BPU id(0) HzSQuantizedConv 0.944165 1.761778

/layer1/layer1.0/conv2/Conv BPU id(0) HzSQuantizedConv 0.964870 1.422739

/layer1/layer1.1/conv1/Conv BPU id(0) HzSQuantizedConv 0.912611 2.758681

/layer1/layer1.1/conv2/Conv BPU id(0) HzSQuantizedConv 0.958991 1.346770

/layer2/layer2.0/conv1/Conv BPU id(0) HzSQuantizedConv 0.935197 3.533995

/layer2/layer2.0/conv2/Conv BPU id(0) HzSQuantizedConv 0.904247 1.255401

/layer2/layer2.0/downsample/downsample.0/Conv BPU id(0) HzSQuantizedConv 0.938371 3.533995

/layer2/layer2.1/conv1/Conv BPU id(0) HzSQuantizedConv 0.888894 2.211774

/layer2/layer2.1/conv2/Conv BPU id(0) HzSQuantizedConv 0.935048 1.245804

/layer3/layer3.0/conv1/Conv BPU id(0) HzSQuantizedConv 0.907809 2.861637

/layer3/layer3.0/conv2/Conv BPU id(0) HzSQuantizedConv 0.909788 1.452334

/layer3/layer3.0/downsample/downsample.0/Conv BPU id(0) HzSQuantizedConv 0.924617 2.861637

/layer3/layer3.1/conv1/Conv BPU id(0) HzSQuantizedConv 0.883150 1.643583

/layer3/layer3.1/conv2/Conv BPU id(0) HzSQuantizedConv 0.910612 0.954917

/layer4/layer4.0/conv1/Conv BPU id(0) HzSQuantizedConv 0.821044 2.078356

/layer4/layer4.0/conv2/Conv BPU id(0) HzSQuantizedConv 0.861210 1.876143

/layer4/layer4.0/downsample/downsample.0/Conv BPU id(0) HzSQuantizedConv 0.782653 2.078356

/layer4/layer4.1/conv1/Conv BPU id(0) HzSQuantizedConv 0.667947 2.044354

/layer4/layer4.1/conv2/Conv BPU id(0) HzSQuantizedConv 0.774661 2.900229

/Resize BPU id(0) HzQuantizedRoiResize 0.802121 16.436541

/layer3/layer3.1/relu_1/Relu_output_0_Requantize BPU id(0) HzRequantize

/Concat BPU id(0) Concat 0.801647 16.436541

/conv_up_level1/Conv BPU id(0) HzSQuantizedConv 0.810718 16.436541

/Resize_1 BPU id(0) HzQuantizedRoiResize 0.816626 33.241734

/Resize_1_output_0_Requantize BPU id(0) HzRequantize

/layer2/layer2.1/relu_1/Relu_output_0_Requantize BPU id(0) HzRequantize

/Concat_1 BPU id(0) Concat 0.816599 33.241734

/conv_up_level2/Conv BPU id(0) HzSQuantizedConv 0.836094 30.123194

/Resize_2 BPU id(0) HzQuantizedRoiResize 0.837536 44.009842

/Resize_2_output_0_Requantize BPU id(0) HzRequantize

/layer1/layer1.1/relu_1/Relu_output_0_Requantize BPU id(0) HzRequantize

/Concat_2 BPU id(0) Concat 0.838752 44.009842

/conv_up_level3/Conv BPU id(0) HzSQuantizedConv 0.870276 44.154037

/fpn0_hm_cen/fpn0_hm_cen.0/Conv BPU id(0) HzSQuantizedConv 0.924563 33.241734

/fpn0_hm_cen/fpn0_hm_cen.2/Conv BPU id(0) HzSQuantizedConv 0.943938 313.639984

/Resize_3 BPU id(0) HzQuantizedResizeUpsample 0.943938 312.455658

/fpn1_hm_cen/fpn1_hm_cen.0/Conv BPU id(0) HzSQuantizedConv 0.874081 44.009842

/fpn1_hm_cen/fpn1_hm_cen.2/Conv BPU id(0) HzSQuantizedConv 0.861351 500.141083

/fpn2_hm_cen/fpn2_hm_cen.0/Conv BPU id(0) HzSQuantizedConv 0.892319 38.183331

/fpn2_hm_cen/fpn2_hm_cen.2/Conv BPU id(0) HzSQuantizedConv 0.917691 220.486969

/Unsqueeze BPU id(0) Reshape

/Unsqueeze_1 BPU id(0) Reshape

/Unsqueeze_2 BPU id(0) Reshape

/Concat_4 BPU id(0) Concat 0.892625 312.455658

/Softmax CPU – Softmax 0.683802 312.455658

/Mul BPU id(1) HzSElementwiseMul 0.637066 312.455658

/ReduceSum CPU – ReduceSum 0.932151 91.498909

/ReduceSum_reshape CPU – Reshape

/fpn0_cen_offset/fpn0_cen_offset.0/Conv BPU id(0) HzSQuantizedConv 0.779345 33.241734

/fpn0_cen_offset/fpn0_cen_offset.2/Conv BPU id(0) HzSQuantizedConv 0.867806 267.370972

/Resize_4 BPU id(0) HzQuantizedResizeUpsample 0.867808 299.037598

/fpn1_cen_offset/fpn1_cen_offset.0/Conv BPU id(0) HzSQuantizedConv 0.782416 44.009842

/fpn1_cen_offset/fpn1_cen_offset.2/Conv BPU id(0) HzSQuantizedConv 0.665010 233.543671

/fpn2_cen_offset/fpn2_cen_offset.0/Conv BPU id(0) HzSQuantizedConv 0.765405 38.183331

/fpn2_cen_offset/fpn2_cen_offset.2/Conv BPU id(0) HzSQuantizedConv 0.604727 102.658096

/Unsqueeze_3 BPU id(0) Reshape

/Unsqueeze_4 BPU id(0) Reshape

/Unsqueeze_5 BPU id(0) Reshape

/Concat_6 BPU id(0) Concat 0.863411 299.037598

/Softmax_1 CPU – Softmax 0.865288 299.037598

/Mul_1 BPU id(2) HzSElementwiseMul 0.548331 299.037598

/ReduceSum_1 CPU – ReduceSum 0.621538 9.692646

/ReduceSum_1_reshape CPU – Reshape

/fpn0_direction/fpn0_direction.0/Conv BPU id(0) HzSQuantizedConv 0.755548 33.241734

/fpn0_direction/fpn0_direction.2/Conv BPU id(0) HzSQuantizedConv 0.717000 425.373810

/Resize_5 BPU id(0) HzQuantizedResizeUpsample 0.716995 73.949379

/fpn1_direction/fpn1_direction.0/Conv BPU id(0) HzSQuantizedConv 0.705615 44.009842

/fpn1_direction/fpn1_direction.2/Conv BPU id(0) HzSQuantizedConv 0.450235 41.534039

/fpn2_direction/fpn2_direction.0/Conv BPU id(0) HzSQuantizedConv 0.606136 38.183331

/fpn2_direction/fpn2_direction.2/Conv BPU id(0) HzSQuantizedConv 0.606858 138.430069

/Unsqueeze_6 BPU id(0) Reshape

/Unsqueeze_7 BPU id(0) Reshape

/Unsqueeze_8 BPU id(0) Reshape

/Concat_8 BPU id(0) Concat 0.606557 73.949379

/Softmax_2 CPU – Softmax 0.850093 73.949379

/Mul_2 BPU id(3) HzSElementwiseMul 0.620950 73.949379

/ReduceSum_2 CPU – ReduceSum 0.513161 1.124602

/ReduceSum_2_reshape CPU – Reshape

/fpn0_z_coor/fpn0_z_coor.0/Conv BPU id(0) HzSQuantizedConv 0.844084 33.241734

/fpn0_z_coor/fpn0_z_coor.2/Conv BPU id(0) HzSQuantizedConv 0.762663 382.411621

/Resize_6 BPU id(0) HzQuantizedResizeUpsample 0.762663 79.672722

/Resize_6_NHWC2NCHW_LayoutConvert_Output0_reshape BPU id(0) Reshape

/fpn1_z_coor/fpn1_z_coor.0/Conv BPU id(0) HzSQuantizedConv 0.869190 44.009842

/fpn1_z_coor/fpn1_z_coor.2/Conv BPU id(0) HzSQuantizedConv 0.836509 84.870514

…2/Conv_NHWC2NCHW_LayoutConvert_Output0_reshape BPU id(0) Reshape

/fpn2_z_coor/fpn2_z_coor.0/Conv BPU id(0) HzSQuantizedConv 0.915549 38.183331

/fpn2_z_coor/fpn2_z_coor.2/Conv BPU id(0) HzSQuantizedConv 0.958527 53.304226

…2/Conv_NHWC2NCHW_LayoutConvert_Output0_reshape BPU id(0) Reshape

/Concat_10 BPU id(0) Concat 0.760884 79.672722

/Softmax_3 CPU – Softmax 0.953106 79.672722

/Mul_3 BPU id(4) HzSElementwiseMul 0.849502 79.672722

/ReduceSum_3 CPU – ReduceSum 0.933492 2.750437

/ReduceSum_3_reshape CPU – Reshape

/fpn0_dim/fpn0_dim.0/Conv BPU id(0) HzSQuantizedConv 0.905381 33.241734

/fpn0_dim/fpn0_dim.2/Conv BPU id(0) HzSQuantizedConv 0.959605 139.523239

/Resize_7 BPU id(0) HzQuantizedResizeUpsample 0.959608 61.583912

/fpn1_dim/fpn1_dim.0/Conv BPU id(0) HzSQuantizedConv 0.856017 44.009842

/fpn1_dim/fpn1_dim.2/Conv BPU id(0) HzSQuantizedConv 0.675916 107.564522

/fpn2_dim/fpn2_dim.0/Conv BPU id(0) HzSQuantizedConv 0.823923 38.183331

/fpn2_dim/fpn2_dim.2/Conv BPU id(0) HzSQuantizedConv 0.463424 109.336662

/Unsqueeze_12 BPU id(0) Reshape

/Unsqueeze_13 BPU id(0) Reshape

/Unsqueeze_14 BPU id(0) Reshape

/Concat_12 BPU id(0) Concat 0.734860 61.583912

/Softmax_4 CPU – Softmax 0.974800 61.583912

/Mul_4 BPU id(5) HzSElementwiseMul 0.934541 61.583912

/ReduceSum_4 CPU – ReduceSum 0.956766 27.699253

/ReduceSum_4_reshape CPU – Reshape

2023-06-09 14:41:55,732 INFO The quantify model output:

===============================================================================
Node          Cosine Similarity  L1 Distance  L2 Distance  Chebyshev Distance
-------------------------------------------------------------------------------
/ReduceSum    0.932151           6.658581     0.038718     72.361755
/ReduceSum_1  0.621538           1.135461     0.006949     20.349892
/ReduceSum_2  0.513161           0.414121     0.002889     2.062102
/ReduceSum_3  0.933492           0.284830     0.002482     2.326995
/ReduceSum_4  0.956766           0.805929     0.006046     19.764456

2023-06-09 14:41:55,732 INFO [Fri Jun 9 14:41:55 2023] End to Horizon NN Model Convert.

2023-06-09 14:41:55,846 WARNING node: { Softmax : None} does not exist, please double check your input

2023-06-09 14:41:55,846 INFO start convert to *.bin file…

2023-06-09 14:41:55,883 INFO ONNX model output num : 5

2023-06-09 14:41:55,888 INFO ############# model deps info #############

2023-06-09 14:41:55,888 INFO hb_mapper version : 1.13.5

2023-06-09 14:41:55,888 INFO hbdk version : 3.41.5

2023-06-09 14:41:55,888 INFO hbdk runtime version: 3.15.8.0

2023-06-09 14:41:55,888 INFO horizon_nn version : 0.15.5

2023-06-09 14:41:55,888 INFO ############# model_parameters info #############

2023-06-09 14:41:55,888 INFO onnx_model : /mnt/sda/shiyucun/pcp_j5/larry_pcp/src/super_fast_object_detection/src/sfa/model2onnx/fpn_resnet0420.onnx

2023-06-09 14:41:55,888 INFO BPU march : bayes

2023-06-09 14:41:55,888 INFO layer_out_dump : False

2023-06-09 14:41:55,888 INFO log_level : DEBUG

2023-06-09 14:41:55,888 INFO working dir : /mnt/sda/shiyucun/oe/oe1.1.37/zeron/model/model_outputs

2023-06-09 14:41:55,888 INFO output_model_file_prefix: fpn_resnet

2023-06-09 14:41:55,888 INFO ############# input_parameters info #############

2023-06-09 14:41:55,888 INFO ------------------------------------------

2023-06-09 14:41:55,888 INFO ---------input info : data ---------

2023-06-09 14:41:55,888 INFO input_name : data

2023-06-09 14:41:55,888 INFO input_type_rt : bgr

2023-06-09 14:41:55,888 INFO input_space&range : regular

2023-06-09 14:41:55,888 INFO input_layout_rt : NCHW

2023-06-09 14:41:55,888 INFO input_type_train : bgr

2023-06-09 14:41:55,889 INFO input_layout_train : NCHW

2023-06-09 14:41:55,889 INFO norm_type : no_preprocess

2023-06-09 14:41:55,889 INFO input_shape : 1x3x608x608

2023-06-09 14:41:55,889 INFO input_batch : 1

2023-06-09 14:41:55,889 INFO cal_data_dir : /mnt/sda/shiyucun/oe/oe1.1.37/zeron/model/calibration_data_bgr_f32

2023-06-09 14:41:55,889 INFO cal_data_type : float32

2023-06-09 14:41:55,889 INFO ---------input info : data end -------

2023-06-09 14:41:55,889 INFO ------------------------------------------

2023-06-09 14:41:55,889 INFO ############# calibration_parameters info #############

2023-06-09 14:41:55,889 INFO preprocess_on : False

2023-06-09 14:41:55,889 INFO calibration_type: : kl

2023-06-09 14:41:55,889 INFO max_percentile : 1.0

2023-06-09 14:41:55,889 INFO run_on_bpu : { Softmax : None};

2023-06-09 14:41:55,889 INFO ############# compiler_parameters info #############

2023-06-09 14:41:55,889 INFO hbdk_pass_through_params: --O3 --core-num 1 --fast

2023-06-09 14:41:55,889 INFO input-source : {'data': 'ddr', '_default_value': 'ddr'}

2023-06-09 14:41:55,931 INFO Convert to runtime bin file sucessfully!

2023-06-09 14:41:55,932 INFO End Model Convert
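The log above contains the warning "node: { Softmax : None} does not exist", while the node table lists the Softmax nodes with a leading slash (/Softmax, /Softmax_1 and so on). One way to cross-check a run_on_bpu value against the names the converter actually reports is to parse the node table from the saved log. The sketch below is only an illustration: `parse_node_devices` and `missing_nodes` are hypothetical helpers, and the row layout (name, device, subgraph, type) is assumed from the table as pasted here. A matching name does not by itself guarantee that the operator will be placed on the BPU.

```python
from typing import Dict, Iterable, List


def parse_node_devices(log_lines: Iterable[str]) -> Dict[str, str]:
    """Map node name -> device ("BPU" or "CPU") from node-table rows."""
    devices: Dict[str, str] = {}
    for line in log_lines:
        tokens = line.split()
        # A table row has at least: name, device, subgraph, type.
        if len(tokens) >= 4 and tokens[1] in ("BPU", "CPU"):
            devices[tokens[0]] = tokens[1]
    return devices


def missing_nodes(requested: Iterable[str], devices: Dict[str, str]) -> List[str]:
    """Return the requested names that do not appear in the node table."""
    return [name for name in requested if name not in devices]


# A few rows copied from the table above.
rows = [
    "/Concat_4 BPU id(0) Concat 0.892625 312.455658",
    "/Softmax CPU – Softmax 0.683802 312.455658",
    "/Mul BPU id(1) HzSElementwiseMul 0.637066 312.455658",
    "/ReduceSum CPU – ReduceSum 0.932151 91.498909",
    "/ReduceSum_reshape CPU – Reshape",
    "/Softmax_1 CPU – Softmax 0.865288 299.037598",
]

devices = parse_node_devices(rows)
cpu_nodes = [name for name, device in devices.items() if device == "CPU"]
print("Nodes on CPU:", cpu_nodes)
print("Names not found:", missing_nodes(["Softmax", "/Softmax"], devices))
```

For a full check, feed it the lines of hb_mapper_makertbin.log instead of the sample rows. Note that the table above contains five Softmax nodes (/Softmax through /Softmax_4).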

root@425524437de2:/mnt/sda/shiyucun/oe/oe1.1.37/zeron/model/model_outputs# hb_mapper makertbin --config /mnt/sda/shiyucun/oe/oe1.1.37/zeron/model/config.yaml --model-type onnx

2023-06-09 14:46:21,017 INFO log will be stored in /mnt/sda/shiyucun/oe/oe1.1.37/zeron/model/model_outputs/hb_mapper_makertbin.log

2023-06-09 14:46:21,018 INFO Start hb_mapper…

2023-06-09 14:46:21,018 INFO hbdk version 3.41.5

2023-06-09 14:46:21,018 INFO horizon_nn version 0.15.5

2023-06-09 14:46:21,018 INFO hb_mapper version 1.13.5

2023-06-09 14:46:21,018 INFO Start Model Convert…

2023-06-09 14:46:21,021 INFO Using onnx model file: /mnt/sda/shiyucun/pcp_j5/larry_pcp/src/super_fast_object_detection/src/sfa/model2onnx/fpn_resnet0420.onnx

2023-06-09 14:46:21,048 INFO Model has 1 inputs according to model file

2023-06-09 14:46:21,049 INFO The calibration dir name suffix is the same as the value float32 of the cal_data_type parameter and will be read with the value of cal_data_type.

2023-06-09 14:46:21,049 INFO custom_op does not exist, skipped

2023-06-09 14:46:21,049 WARNING Input node data’s input_source not set, it will be set to ddr by default

2023-06-09 14:46:21,051 INFO *******************************************

2023-06-09 14:46:21,051 INFO First calibration picture name: 1589168947_0.pcd.bgr

2023-06-09 14:46:21,051 INFO First calibration picture md5:

6f5602903a2881239cdec090376ae782 /mnt/sda/shiyucun/oe/oe1.1.37/zeron/model/calibration_data_bgr_f32/1589168947_0.pcd.bgr

2023-06-09 14:46:21,056 INFO *******************************************

2023-06-09 14:46:21,281 INFO [Fri Jun 9 14:46:21 2023] Start to Horizon NN Model Convert.

2023-06-09 14:46:21,281 INFO Parsing the input parameter:{'data': {'input_shape': [1, 3, 608, 608], 'input_batch': 1, 'expected_input_type': 'BGR_128', 'original_input_type': 'BGR', 'original_input_layout': 'NCHW'}}

2023-06-09 14:46:21,281 INFO Parsing the calibration parameter

2023-06-09 14:46:21,281 INFO There are 1 nodes designated to run on the bpu: ['Softmax'].

2023-06-09 14:46:21,281 INFO Parsing the hbdk parameter:{'hbdk_pass_through_params': '--O3 --core-num 1 --fast ', 'input-source': {'data': 'ddr', '_default_value': 'ddr'}}

2023-06-09 14:46:21,281 INFO HorizonNN version: 0.15.5

2023-06-09 14:46:21,281 INFO HBDK version: 3.41.5

2023-06-09 14:46:21,281 INFO [Fri Jun 9 14:46:21 2023] Start to parse the onnx model.

2023-06-09 14:46:21,307 INFO Input ONNX model infomation:

ONNX IR version: 6

Opset version: [11]

Producer: pytorch2.0.1

Domain: none

Input name: data, [1, 3, 608, 608]

Output name: output, [1, 3, 152, 152]

Output name: 342, [1, 2, 152, 152]

Output name: 368, [1, 2, 152, 152]

Output name: 394, [1, 1, 152, 152]

Output name: 420, [1, 3, 152, 152]

2023-06-09 14:46:21,488 INFO [Fri Jun 9 14:46:21 2023] End to parse the onnx model.

2023-06-09 14:46:21,488 INFO Model input names parsed from model: ['data']

2023-06-09 14:46:21,488 INFO Create a preprocessing operator for input_name data with means=None, std=None, original_input_layout=NCHW, color convert from 'BGR' to 'BGR'.

2023-06-09 14:46:21,810 INFO Saving the original float model: fpn_resnet_original_float_model.onnx.

2023-06-09 14:46:21,810 INFO [Fri Jun 9 14:46:21 2023] Start to optimize the model.

2023-06-09 14:46:22,125 INFO [Fri Jun 9 14:46:22 2023] End to optimize the model.

2023-06-09 14:46:22,311 INFO Saving the optimized model: fpn_resnet_optimized_float_model.onnx.

2023-06-09 14:46:22,311 INFO [Fri Jun 9 14:46:22 2023] Start to calibrate the model.

2023-06-09 14:46:22,312 INFO There are 102 samples in the calibration data set.

2023-06-09 14:46:22,510 INFO Run calibration model with kl method.

kl calibration in progress: 0%| | 0/13 [00:00<?, ?it/s]2023-06-09 14:46:25.515300646 [E:onnxruntime:, sequential_executor.cc:183 Execute] Non-zero status code returned while running Reshape node. Name:‘/Unsqueeze_14’ Status Message: /home/jenkins/agent/workspace/model_convert/onnxruntime/onnxruntime/core/providers/cpu/tensor/reshape_helper.h:43 onnxruntime::ReshapeHelper::ReshapeHelper(const onnxruntime::TensorShape&, std::vector&) gsl::narrow_cast<int64_t>(input_shape.Size()) == size was false. The input tensor cannot be reshaped to the requested shape. Input shape:{8,3,152,152}, requested shape:{1,3,152,152,1}

kl calibration in progress: 0%| | 0/13 [00:02<?, ?it/s]

2023-06-09 14:46:25,515 INFO Above info is caused by batch mode infer and can be ignored

2023-06-09 14:46:25,515 INFO Reset batch_size=1 and execute calibration again…

kl calibration in progress: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 102/102 [01:51<00:00, 1.09s/it]

2023-06-09 14:48:16,860 INFO [Fri Jun 9 14:48:16 2023] End to calibrate the model.

2023-06-09 14:48:16,860 INFO [Fri Jun 9 14:48:16 2023] Start to quantize the model.

2023-06-09 14:48:26,257 INFO [Fri Jun 9 14:48:26 2023] End to quantize the model.

2023-06-09 14:48:26,817 INFO Saving the quantized model: fpn_resnet_quantized_model.onnx.

2023-06-09 14:48:27,615 INFO [Fri Jun 9 14:48:27 2023] Start to compile the model with march bayes.

2023-06-09 14:48:27,985 INFO Compile submodel: torch_jit_subgraph_0

2023-06-09 14:48:28,224 WARNING Can not find the scale for node HZ_PREPROCESS_FOR_data_NCHW2NHWC_LayoutConvert_Input0

2023-06-09 14:48:28,550 INFO hbdk-cc parameters:['--O3', '--core-num', '1', '--fast', '--input-layout', 'NHWC', '--output-layout', 'NHWC', '--input-source', 'ddr']

2023-06-09 15:00:36,069 INFO Compile submodel: torch_jit_subgraph_1

2023-06-09 15:00:36,109 INFO hbdk-cc parameters:['--O3', '--core-num', '1', '--fast', '--input-layout', 'NHWC', '--output-layout', 'NHWC', '--input-source', 'ddr,ddr']

2023-06-09 15:00:36,286 INFO Compile submodel: torch_jit_subgraph_2

2023-06-09 15:00:36,325 INFO hbdk-cc parameters:['--O3', '--core-num', '1', '--fast', '--input-layout', 'NHWC', '--output-layout', 'NHWC', '--input-source', 'ddr,ddr']

2023-06-09 15:00:36,530 INFO Compile submodel: torch_jit_subgraph_3

2023-06-09 15:00:36,573 INFO hbdk-cc parameters:['--O3', '--core-num', '1', '--fast', '--input-layout', 'NHWC', '--output-layout', 'NHWC', '--input-source', 'ddr,ddr']

2023-06-09 15:00:36,697 INFO Compile submodel: torch_jit_subgraph_4

2023-06-09 15:00:36,736 INFO hbdk-cc parameters:['--O3', '--core-num', '1', '--fast', '--input-layout', 'NHWC', '--output-layout', 'NHWC', '--input-source', 'ddr,ddr']

2023-06-09 15:00:36,842 INFO Compile submodel: torch_jit_subgraph_5

2023-06-09 15:00:36,881 INFO hbdk-cc parameters:['--O3', '--core-num', '1', '--fast', '--input-layout', 'NHWC', '--output-layout', 'NHWC', '--input-source', 'ddr,ddr']

2023-06-09 15:00:37,414 INFO [Fri Jun 9 15:00:37 2023] End to compile the model with march bayes.

2023-06-09 15:00:37,415 INFO The converted model node information:

==============================================================================================================================================

Node ON Subgraph Type Cosine Similarity Threshold

-----------------------------------------------------------------------------------------------------------------------------------------------

HZ_PREPROCESS_FOR_data BPU id(0) HzSQuantizedPreprocess 0.999902 127.000000

/conv1/Conv BPU id(0) HzSQuantizedConv 0.995028 253.856461

/maxpool/MaxPool BPU id(0) HzQuantizedMaxPool 0.986078 1.761778

/layer1/layer1.0/conv1/Conv BPU id(0) HzSQuantizedConv 0.944165 1.761778

/layer1/layer1.0/conv2/Conv BPU id(0) HzSQuantizedConv 0.964870 1.422739

/layer1/layer1.1/conv1/Conv BPU id(0) HzSQuantizedConv 0.912611 2.758681

/layer1/layer1.1/conv2/Conv BPU id(0) HzSQuantizedConv 0.958991 1.346770

/layer2/layer2.0/conv1/Conv BPU id(0) HzSQuantizedConv 0.935197 3.533995

/layer2/layer2.0/conv2/Conv BPU id(0) HzSQuantizedConv 0.904247 1.255401

/layer2/layer2.0/downsample/downsample.0/Conv BPU id(0) HzSQuantizedConv 0.938371 3.533995

/layer2/layer2.1/conv1/Conv BPU id(0) HzSQuantizedConv 0.888894 2.211774

/layer2/layer2.1/conv2/Conv BPU id(0) HzSQuantizedConv 0.935048 1.245804

/layer3/layer3.0/conv1/Conv BPU id(0) HzSQuantizedConv 0.907809 2.861637

/layer3/layer3.0/conv2/Conv BPU id(0) HzSQuantizedConv 0.909788 1.452334

/layer3/layer3.0/downsample/downsample.0/Conv BPU id(0) HzSQuantizedConv 0.924617 2.861637

/layer3/layer3.1/conv1/Conv BPU id(0) HzSQuantizedConv 0.883150 1.643583

/layer3/layer3.1/conv2/Conv BPU id(0) HzSQuantizedConv 0.910612 0.954917

/layer4/layer4.0/conv1/Conv BPU id(0) HzSQuantizedConv 0.821044 2.078356

/layer4/layer4.0/conv2/Conv BPU id(0) HzSQuantizedConv 0.861210 1.876143

/layer4/layer4.0/downsample/downsample.0/Conv BPU id(0) HzSQuantizedConv 0.782653 2.078356

/layer4/layer4.1/conv1/Conv BPU id(0) HzSQuantizedConv 0.667947 2.044354

/layer4/layer4.1/conv2/Conv BPU id(0) HzSQuantizedConv 0.774661 2.900229

/Resize BPU id(0) HzQuantizedRoiResize 0.802121 16.436541

/layer3/layer3.1/relu_1/Relu_output_0_Requantize BPU id(0) HzRequantize

/Concat BPU id(0) Concat 0.801647 16.436541

/conv_up_level1/Conv BPU id(0) HzSQuantizedConv 0.810718 16.436541

/Resize_1 BPU id(0) HzQuantizedRoiResize 0.816626 33.241734

/Resize_1_output_0_Requantize BPU id(0) HzRequantize

/layer2/layer2.1/relu_1/Relu_output_0_Requantize BPU id(0) HzRequantize

/Concat_1 BPU id(0) Concat 0.816599 33.241734

/conv_up_level2/Conv BPU id(0) HzSQuantizedConv 0.836094 30.123194

/Resize_2 BPU id(0) HzQuantizedRoiResize 0.837536 44.009842

/Resize_2_output_0_Requantize BPU id(0) HzRequantize

/layer1/layer1.1/relu_1/Relu_output_0_Requantize BPU id(0) HzRequantize

/Concat_2 BPU id(0) Concat 0.838752 44.009842

/conv_up_level3/Conv BPU id(0) HzSQuantizedConv 0.870276 44.154037

/fpn0_hm_cen/fpn0_hm_cen.0/Conv BPU id(0) HzSQuantizedConv 0.924563 33.241734

/fpn0_hm_cen/fpn0_hm_cen.2/Conv BPU id(0) HzSQuantizedConv 0.943938 313.639984

/Resize_3 BPU id(0) HzQuantizedResizeUpsample 0.943938 312.455658

/fpn1_hm_cen/fpn1_hm_cen.0/Conv BPU id(0) HzSQuantizedConv 0.874081 44.009842

/fpn1_hm_cen/fpn1_hm_cen.2/Conv BPU id(0) HzSQuantizedConv 0.861351 500.141083

/fpn2_hm_cen/fpn2_hm_cen.0/Conv BPU id(0) HzSQuantizedConv 0.892319 38.183331

/fpn2_hm_cen/fpn2_hm_cen.2/Conv BPU id(0) HzSQuantizedConv 0.917691 220.486969

/Unsqueeze BPU id(0) Reshape

/Unsqueeze_1 BPU id(0) Reshape

/Unsqueeze_2 BPU id(0) Reshape

/Concat_4 BPU id(0) Concat 0.892625 312.455658

/Softmax CPU – Softmax 0.683802 312.455658

/Mul BPU id(1) HzSElementwiseMul 0.637066 312.455658

/ReduceSum CPU – ReduceSum 0.932151 91.498909

/ReduceSum_reshape CPU – Reshape

/fpn0_cen_offset/fpn0_cen_offset.0/Conv BPU id(0) HzSQuantizedConv 0.779345 33.241734

/fpn0_cen_offset/fpn0_cen_offset.2/Conv BPU id(0) HzSQuantizedConv 0.867806 267.370972

/Resize_4 BPU id(0) HzQuantizedResizeUpsample 0.867808 299.037598

/fpn1_cen_offset/fpn1_cen_offset.0/Conv BPU id(0) HzSQuantizedConv 0.782416 44.009842

/fpn1_cen_offset/fpn1_cen_offset.2/Conv BPU id(0) HzSQuantizedConv 0.665010 233.543671

/fpn2_cen_offset/fpn2_cen_offset.0/Conv BPU id(0) HzSQuantizedConv 0.765405 38.183331

/fpn2_cen_offset/fpn2_cen_offset.2/Conv BPU id(0) HzSQuantizedConv 0.604727 102.658096

/Unsqueeze_3 BPU id(0) Reshape

/Unsqueeze_4 BPU id(0) Reshape

/Unsqueeze_5 BPU id(0) Reshape

/Concat_6 BPU id(0) Concat 0.863411 299.037598

/Softmax_1 CPU – Softmax 0.865288 299.037598

/Mul_1 BPU id(2) HzSElementwiseMul 0.548331 299.037598

/ReduceSum_1 CPU – ReduceSum 0.621538 9.692646

/ReduceSum_1_reshape CPU – Reshape

/fpn0_direction/fpn0_direction.0/Conv BPU id(0) HzSQuantizedConv 0.755548 33.241734

/fpn0_direction/fpn0_direction.2/Conv BPU id(0) HzSQuantizedConv 0.717000 425.373810

/Resize_5 BPU id(0) HzQuantizedResizeUpsample 0.716995 73.949379

/fpn1_direction/fpn1_direction.0/Conv BPU id(0) HzSQuantizedConv 0.705615 44.009842

/fpn1_direction/fpn1_direction.2/Conv BPU id(0) HzSQuantizedConv 0.450235 41.534039

/fpn2_direction/fpn2_direction.0/Conv BPU id(0) HzSQuantizedConv 0.606136 38.183331

/fpn2_direction/fpn2_direction.2/Conv BPU id(0) HzSQuantizedConv 0.606858 138.430069

/Unsqueeze_6 BPU id(0) Reshape

/Unsqueeze_7 BPU id(0) Reshape

/Unsqueeze_8 BPU id(0) Reshape

/Concat_8 BPU id(0) Concat 0.606557 73.949379

/Softmax_2 CPU – Softmax 0.850093 73.949379

/Mul_2 BPU id(3) HzSElementwiseMul 0.620950 73.949379

/ReduceSum_2 CPU – ReduceSum 0.513161 1.124602

/ReduceSum_2_reshape CPU – Reshape

/fpn0_z_coor/fpn0_z_coor.0/Conv BPU id(0) HzSQuantizedConv 0.844084 33.241734

/fpn0_z_coor/fpn0_z_coor.2/Conv BPU id(0) HzSQuantizedConv 0.762663 382.411621

/Resize_6 BPU id(0) HzQuantizedResizeUpsample 0.762663 79.672722

/Resize_6_NHWC2NCHW_LayoutConvert_Output0_reshape BPU id(0) Reshape

/fpn1_z_coor/fpn1_z_coor.0/Conv BPU id(0) HzSQuantizedConv 0.869190 44.009842

/fpn1_z_coor/fpn1_z_coor.2/Conv BPU id(0) HzSQuantizedConv 0.836509 84.870514

…2/Conv_NHWC2NCHW_LayoutConvert_Output0_reshape BPU id(0) Reshape

/fpn2_z_coor/fpn2_z_coor.0/Conv BPU id(0) HzSQuantizedConv 0.915549 38.183331

/fpn2_z_coor/fpn2_z_coor.2/Conv BPU id(0) HzSQuantizedConv 0.958527 53.304226

…2/Conv_NHWC2NCHW_LayoutConvert_Output0_reshape BPU id(0) Reshape

/Concat_10 BPU id(0) Concat 0.760884 79.672722

/Softmax_3 CPU – Softmax 0.953106 79.672722

/Mul_3 BPU id(4) HzSElementwiseMul 0.849502 79.672722

/ReduceSum_3 CPU – ReduceSum 0.933492 2.750437

/ReduceSum_3_reshape CPU – Reshape

/fpn0_dim/fpn0_dim.0/Conv BPU id(0) HzSQuantizedConv 0.905381 33.241734

/fpn0_dim/fpn0_dim.2/Conv BPU id(0) HzSQuantizedConv 0.959605 139.523239

/Resize_7 BPU id(0) HzQuantizedResizeUpsample 0.959608 61.583912

/fpn1_dim/fpn1_dim.0/Conv BPU id(0) HzSQuantizedConv 0.856017 44.009842

/fpn1_dim/fpn1_dim.2/Conv BPU id(0) HzSQuantizedConv 0.675916 107.564522

/fpn2_dim/fpn2_dim.0/Conv BPU id(0) HzSQuantizedConv 0.823923 38.183331

/fpn2_dim/fpn2_dim.2/Conv BPU id(0) HzSQuantizedConv 0.463424 109.336662

/Unsqueeze_12 BPU id(0) Reshape

/Unsqueeze_13 BPU id(0) Reshape

/Unsqueeze_14 BPU id(0) Reshape

/Concat_12 BPU id(0) Concat 0.734860 61.583912

/Softmax_4 CPU – Softmax 0.974800 61.583912

/Mul_4 BPU id(5) HzSElementwiseMul 0.934541 61.583912

/ReduceSum_4 CPU – ReduceSum 0.956766 27.699253

/ReduceSum_4_reshape CPU – Reshape

2023-06-09 15:00:37,416 INFO The quantify model output:

===============================================================================

Node Cosine Similarity L1 Distance L2 Distance Chebyshev Distance

-------------------------------------------------------------------------------

/ReduceSum 0.932151 6.658581 0.038718 72.361755

/ReduceSum_1 0.621538 1.135461 0.006949 20.349892

/ReduceSum_2 0.513161 0.414121 0.002889 2.062102

/ReduceSum_3 0.933492 0.284830 0.002482 2.326995

/ReduceSum_4 0.956766 0.805929 0.006046 19.764456

2023-06-09 15:00:37,416 INFO [Fri Jun 9 15:00:37 2023] End to Horizon NN Model Convert.

2023-06-09 15:00:37,530 WARNING node: Softmax does not exist, please double check your input

2023-06-09 15:00:37,531 INFO start convert to *.bin file…

2023-06-09 15:00:37,566 INFO ONNX model output num : 5

2023-06-09 15:00:37,572 INFO ############# model deps info #############

2023-06-09 15:00:37,572 INFO hb_mapper version : 1.13.5

2023-06-09 15:00:37,572 INFO hbdk version : 3.41.5

2023-06-09 15:00:37,572 INFO hbdk runtime version: 3.15.8.0

2023-06-09 15:00:37,572 INFO horizon_nn version : 0.15.5

2023-06-09 15:00:37,572 INFO ############# model_parameters info #############

2023-06-09 15:00:37,572 INFO onnx_model : /mnt/sda/shiyucun/pcp_j5/larry_pcp/src/super_fast_object_detection/src/sfa/model2onnx/fpn_resnet0420.onnx

2023-06-09 15:00:37,572 INFO BPU march : bayes

2023-06-09 15:00:37,572 INFO layer_out_dump : False

2023-06-09 15:00:37,572 INFO log_level : DEBUG

2023-06-09 15:00:37,572 INFO working dir : /mnt/sda/shiyucun/oe/oe1.1.37/zeron/model/model_outputs

2023-06-09 15:00:37,572 INFO output_model_file_prefix: fpn_resnet

2023-06-09 15:00:37,573 INFO ############# input_parameters info #############

2023-06-09 15:00:37,573 INFO ------------------------------------------

2023-06-09 15:00:37,573 INFO ---------input info : data ---------

2023-06-09 15:00:37,573 INFO input_name : data

2023-06-09 15:00:37,573 INFO input_type_rt : bgr

2023-06-09 15:00:37,573 INFO input_space&range : regular

2023-06-09 15:00:37,573 INFO input_layout_rt : NCHW

2023-06-09 15:00:37,573 INFO input_type_train : bgr

2023-06-09 15:00:37,573 INFO input_layout_train : NCHW

2023-06-09 15:00:37,573 INFO norm_type : no_preprocess

2023-06-09 15:00:37,573 INFO input_shape : 1x3x608x608

2023-06-09 15:00:37,573 INFO input_batch : 1

2023-06-09 15:00:37,573 INFO cal_data_dir : /mnt/sda/shiyucun/oe/oe1.1.37/zeron/model/calibration_data_bgr_f32

2023-06-09 15:00:37,573 INFO cal_data_type : float32

2023-06-09 15:00:37,573 INFO ---------input info : data end -------

2023-06-09 15:00:37,573 INFO ------------------------------------------

2023-06-09 15:00:37,573 INFO ############# calibration_parameters info #############

2023-06-09 15:00:37,573 INFO preprocess_on : False

2023-06-09 15:00:37,573 INFO calibration_type: : kl

2023-06-09 15:00:37,573 INFO max_percentile : 1.0

2023-06-09 15:00:37,573 INFO run_on_bpu : Softmax;

2023-06-09 15:00:37,573 INFO ############# compiler_parameters info #############

2023-06-09 15:00:37,573 INFO hbdk_pass_through_params: --O3 --core-num 1 --fast

2023-06-09 15:00:37,573 INFO input-source : {'data': 'ddr', '_default_value': 'ddr'}

2023-06-09 15:00:37,621 INFO Convert to runtime bin file sucessfully!

2023-06-09 15:00:37,621 INFO End Model Convert

root@425524437de2:/mnt/sda/shiyucun/oe/oe1.1.37/zeron/model/model_outputs#

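If you only want to speed up the softmax computation, you don't actually need the DSP. Softmax is supported on the BPU: before compiling the model, configure run_on_bpu in the yaml file. See section 4.1.1.6 (PTQ) of the toolchain manual for details. Note that running softmax on the BPU may cause some loss of accuracy.

Yes, we suggest trying softmax on the BPU first.

Hello, the parameter is configured like this: run_on_bpu must be followed by the specific node name of the softmax operator, not the operator type. If you open the floating-point ONNX model (the one before conversion) and select the operator, the viewer shows its node name, for example Softmax_103, Softmax_127, and so on. A reference yaml entry looks like this: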

run_on_bpu: 'Softmax_103; Softmax_127'

Please give it another try.

run_on_bpu: '/Softmax;/Softmax_1;/Softmax_2;/Softmax_3;/Softmax_4'

Writing it this way works. I'll test it on the board, thanks~
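The earlier conversion log shows `run_on_bpu : Softmax;` followed by `WARNING node: Softmax does not exist`, which is consistent with the operator type having been given instead of the node names. As a rough illustration of that kind of name check, here is a minimal C++ sketch. It is not hb_mapper's actual implementation: `SplitNodeList` and `UnknownNodes` are hypothetical helpers, and trimming spaces around each entry is an assumption based on the `'Softmax_103; Softmax_127'` example.

```cpp
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits a run_on_bpu value such as
// "/Softmax;/Softmax_1" on ';' and trims spaces around each entry.
std::vector<std::string> SplitNodeList(const std::string& value) {
  std::vector<std::string> names;
  std::istringstream in(value);
  std::string item;
  while (std::getline(in, item, ';')) {
    const auto first = item.find_first_not_of(" \t");
    if (first == std::string::npos) continue;  // skip empty entries
    const auto last = item.find_last_not_of(" \t");
    names.push_back(item.substr(first, last - first + 1));
  }
  return names;
}

// Hypothetical helper: returns the entries that do not match any node
// name in the model. Matching is exact, so "Softmax" != "/Softmax".
std::vector<std::string> UnknownNodes(const std::string& run_on_bpu,
                                      const std::set<std::string>& model_nodes) {
  std::vector<std::string> unknown;
  for (const std::string& name : SplitNodeList(run_on_bpu)) {
    if (model_nodes.count(name) == 0) unknown.push_back(name);
  }
  return unknown;
}
```

With the node names from this model, `UnknownNodes("Softmax;", nodes)` reports `Softmax` as unknown, while the list with the leading `/` on every name passes the check.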

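Also, could you still help look into the error we get when running the model with the DSP? We will most likely remove the Softmax layers later. This is only a test; what we really want is to get the DSP path working end to end.

Current status:

1. The DSP-side image is the prebuilt image file under ddk/samples/vdsp_rpc_sample/nn/script/image in the OE 1.1.37 directory.

2. The J5 CPU-side program registers the Softmax operator during initialization: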

ret = hbDNNRegisterLayerCreator("Softmax", hobot::dnn::DSPSoftmax_layer_creator);
if (0 != ret) {
  ROS_ERROR("hbDNNRegisterLayerCreator failed, ret = %d", ret);
  return ret;
}

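As soon as the program runs, it returns error code -6000002.

Understood. First, -6000002 does indicate that model initialization failed. Second, before running softmax on the DSP, make sure softmax is a CPU operator (that is, run_on_bpu is not configured for it). Also, our J5 toolchain has been updated to version 1.1.52. You can update the OE package and the docker environment and try again. The model has to be recompiled under 1.1.52, and you should follow the run instructions in the 1.1.52 manual.

Horizon Journey® 5 OpenExplorer Algorithm Toolchain Release Notes (horizon.ai)

OK~, run_on_bpu was indeed not configured. I'll upgrade the OE package first and try again~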
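One way to double-check that the Softmax nodes really ended up on the CPU is to scan the per-node table in the conversion log. The following is a minimal C++ sketch under the assumption that each row has the whitespace-separated layout seen in the log above (name, BPU or CPU, subgraph, operator type, then optional metrics). `NodeRow`, `ParseNodeRow`, and `CpuNodesOfType` are hypothetical names, not part of the toolchain.

```cpp
#include <sstream>
#include <string>
#include <vector>

// One row of the per-node table printed in the conversion log.
struct NodeRow {
  std::string name;     // e.g. "/Softmax"
  std::string device;   // "BPU" or "CPU"
  std::string op_type;  // e.g. "Softmax"
};

// Assumed row layout (taken from the log in this thread):
//   <name> <BPU|CPU> <subgraph> <type> [cosine similarity] [threshold]
// Returns false for blank lines and for lines that are not node rows.
bool ParseNodeRow(const std::string& line, NodeRow* row) {
  std::istringstream in(line);
  std::string subgraph;  // "id(0)" for BPU rows, a dash marker for CPU rows
  if (!(in >> row->name >> row->device >> subgraph >> row->op_type)) {
    return false;
  }
  return row->device == "BPU" || row->device == "CPU";
}

// Collects the names of all nodes of the given type placed on the CPU.
std::vector<std::string> CpuNodesOfType(const std::vector<std::string>& lines,
                                        const std::string& op_type) {
  std::vector<std::string> names;
  for (const std::string& line : lines) {
    NodeRow row;
    if (ParseNodeRow(line, &row) && row.device == "CPU" &&
        row.op_type == op_type) {
      names.push_back(row.name);
    }
  }
  return names;
}
```

Calling `CpuNodesOfType(lines, "Softmax")` on the table from this thread would list the five Softmax nodes. An empty result would suggest they were moved to the BPU, in which case the DSP path would not be exercised.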

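Sure. Going forward, it is worth looking at these two sections of the manual.

The 1.1.52 manual takes precedence.

OK, I'll give it a try, thanks~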