转化完 Yolo11n私有模型的bin模型 rdkx5上板之后 打开usb摄像头 没有识别出我的类别(用训练集里的图片 测试也没有识别出)


这是上板的bin模型的图
{
“summary”: {
“BPU OPs per frame (effective)”: 6376473600,
“BPU OPs per run (effective)”: 6376473600,
“BPU PE number”: 1,
“BPU core number”: 1,
“BPU march”: “B25E”,
“DDR bytes per frame”: 20956480,
“DDR bytes per run”: 20956480,
“DDR bytes per second”: 2982739901,
“DDR megabytes per frame”: 19.986,
“DDR megabytes per run”: 19.986,
“DDR megabytes per second”: 2844.6,
“FPS”: 142.33,
“HBDK version”: “3.49.14”,
“compiling options”: “-f hbir -m /tmp/tmp_88mpk26/main_graph_subgraph_0.hbir -o /tmp/tmp_88mpk26/main_graph_subgraph_0.hbm --march bayes-e --progressbar --O3 --core-num 1 --fast --input-layout NHWC --output-layout NHWC --input-source ddr”,
“frame per run”: 1,
“frame per second”: 142.33,
“input features”: [
[
“input name”,
“input size”
],
[
“images”,
“1x3x640x640”
]
],
“interval computing unit utilization”: [
0.88,
0.947,
0.719,
0.912,
0.944,
0.937,
0.968,
0.975,
0.969,
0.958,
0.989,
0.714,
0.694,
0.732,
0.951,
0.95,
0.949,
0.95,
0.951,
0.946,
0.944,
0.766,
0.886,
0.989,
0.989,
0.992,
0.885,
0.966,
0.756,
0.58,
0.945,
0.963,
0.825,
0.925,
0.933,
0.119
],
“interval computing units utilization”: [
0.88,
0.947,
0.719,
0.912,
0.944,
0.937,
0.968,
0.975,
0.969,
0.958,
0.989,
0.714,
0.694,
0.732,
0.951,
0.95,
0.949,
0.95,
0.951,
0.946,
0.944,
0.766,
0.886,
0.989,
0.989,
0.992,
0.885,
0.966,
0.756,
0.58,
0.945,
0.963,
0.825,
0.925,
0.933,
0.119
],
“interval loading bandwidth (megabytes/s)”: [
3005,
2930,
2593,
1824,
962,
741,
1320,
2221,
1589,
250,
0,
8,
788,
1297,
968,
907,
974,
1007,
897,
938,
1095,
3436,
3972,
1101,
0,
0,
1756,
2932,
3332,
2859,
1491,
1302,
1544,
2036,
2210,
1205
],
“interval number”: 36,
“interval storing bandwidth (megabytes/s)”: [
1562,
1757,
1171,
195,
0,
954,
1953,
1487,
781,
449,
156,
1751,
3753,
3363,
1758,
842,
854,
835,
836,
854,
781,
1266,
976,
45,
0,
0,
0,
680,
1757,
2053,
2495,
3601,
3607,
2089,
1037,
733
],
“interval time (ms)”: 0.2,
“latency (ms)”: 7.03,
“latency (ms) by segments”: [
7.026
],
“latency (us)”: 7025.9,
“loaded bytes per frame”: 11325504,
“loaded bytes per run”: 11325504,
“model json CRC”: “a121834c”,
“model json file”: “/tmp/tmp_88mpk26/main_graph_subgraph_0.hbir”,
“model name”: “main_graph_subgraph_0”,
“model param CRC”: “00000000”,
“multicore sync time (ms)”: 0.0,
“run per second”: 142.33,
“runtime version”: “3.15.54.0”,
“stored bytes per frame”: 9630976,
“stored bytes per run”: 9630976,
“worst FPS”: 142.33
}
}
这是我的json文件

你好, 在算法开发的过程中,遇到各种数值不可控的问题都是正常的,算法开发本身就是需要厚积薄发的领域。算法工具链提供了完整的流程说明,debug工具及流程说明,供您参考。 PTQ流程详解:6.1. PTQ转换原理及流程 — Horizon Open Explorer

精度调优:8.2. PTQ模型精度调优 — Horizon Open Explorer

性能调优:8.1. 模型性能调优 — Horizon Open Explorer

精度debug工具详解:6.2.12. 精度debug工具 — Horizon Open Explorer

Runtime程序编写详解:9. 嵌入式应用开发(runtime)手册 — Horizon Open Explorer

如果将工具链手册所述的所有流程走完仍然不及预期,则说明模型及其权重本身无法量化。特别的,过拟合的模型本身容易出现异常值导致量化表示能力不足。

新算法开发建议

  1. 基本上新算法都需要做pipeline检查,来摸明白前后处理,一般不会是精度问题。

  2. 编写使用ONNXRuntime来推理原始浮点onnx的程序,来确定前后处理的baseline。

  3. 将输入类型设置为NCHW和featuremap,包括train和rt的两个type,前处理类型修改为no_preprocess,这样编译出来的quantized模型和bin模型所需要的数据,也就是所需要的前处理,和浮点onnx完全一致。建议在全featuremap的基础上进行准备校准数据,和bin模型编译。由于featuremap在板子上的python接口无法推理,只能用C/C++推理,调试阶段建议使用开发机器的HB_ONNXRuntime推理quantized onnx来调试。quantized onnx在全featuremap的编译基础上,前处理与浮点onnx完全一致。

  4. 如果在全featuremap的基础上,精度不达预期,可以查阅手册使用全int16编译,来确定精度上限。