一、Preparation
1. Install Docker
Install Docker for your operating system; the official Docker installation guide is a good reference.
2. Set up the D-Robotics algorithm toolchain
- Download the RDK OE delivery package and its matching Docker image
- Extract the OE delivery package

```shell
tar -xvf horizon_x5_open_explorer_v1.2.8-py310_20240926.tar.gz
```
- Set the Docker mount paths and import the image

```shell
export version=v1.2.8
export ai_toolchain_package_path=/path/to/horizon_x5_open_explorer_v1.2.8-py310_20240926
export dataset_path=/path/to/dataset   # directory holding your data, mounted into the container below
docker load < docker_openexplorer_ubuntu_20_x5_gpu_v1.2.8.tar.gz
```

- Start and verify the image (note: the image tag you run must match the image you loaded; the CPU and GPU images have different names)

```shell
sudo docker run -it --rm -v "$ai_toolchain_package_path":/open_explorer -v "$dataset_path":/data openexplorer/ai_toolchain_ubuntu_20_x5_cpu:v1.2.8-py310
hb_mapper  # any usage output here means the environment is ready
```
3. Convert the model to ONNX format and prepare the calibration dataset
files.zip (1.2 MB)
- ONNX model: best.onnx
- cal_data1 (shape: 1x39, for the `input` input)
- cal_data2 (shape: 1x10x39, for the `obs_hist.1` input)
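For reference, the calibration files expected by `cal_data_dir` (with `cal_data_type: 'float32'`) are raw float32 dumps with no header. A minimal sketch of producing them from numpy arrays — the file names and the random data here are purely illustrative; in practice you would dump real observation samples captured from your environment:

```python
import os
import numpy as np

def dump_calibration_bins(samples, out_dir):
    """Save each float32 sample as a raw .bin file readable by hb_mapper's
    calibration loader (cal_data_type: 'float32')."""
    os.makedirs(out_dir, exist_ok=True)
    for i, arr in enumerate(samples):
        arr.astype(np.float32).tofile(os.path.join(out_dir, f"sample_{i}.bin"))

# Illustrative only: 173 random samples per input, matching the shapes
# 1x39 (input) and 1x10x39 (obs_hist.1) used in this post.
rng = np.random.default_rng(0)
dump_calibration_bins([rng.standard_normal((1, 39)) for _ in range(173)], "./cal_data1_bin")
dump_calibration_bins([rng.standard_normal((1, 10, 39)) for _ in range(173)], "./cal_data2_bin")
```

The resulting directories are what `cal_data_dir: './cal_data1_bin;./cal_data2_bin'` points at in the yaml below.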
二、PTQ Model Quantization
1. Write the model conversion configuration file
- Reference link (source of the model-quantization yaml template)
- yaml parameter reference
```yaml
# Copyright (c) 2020 D-Robotics. All Rights Reserved.
# Model conversion parameters
model_parameters:
  # Required.
  # Floating-point ONNX model file, e.g. onnx_model: './horizon_ultra_onnx.onnx'
  onnx_model: ''
  march: "bayes-e"
  layer_out_dump: False
  working_dir: 'model_output'
  output_model_file_prefix: 'horizon_x5'
# Model input parameters
input_parameters:
  input_name: ""
  input_shape: ''
  input_type_rt: 'nv12'
  input_layout_rt: ''
  # Required.
  # Data type used when training the original float model; one of rgb/bgr/gray/featuremap/yuv444, e.g. input_type_train: 'bgr'
  input_type_train: ''
  # Required.
  # Data layout used when training the original float model; NHWC or NCHW, e.g. input_layout_train: 'NHWC'
  input_layout_train: ''
  #input_batch: 1
  # Required.
  # Preprocessing applied in the training framework; one of no_preprocess/data_mean/data_scale/data_mean_and_scale
  # no_preprocess: no preprocessing; neither mean_value nor scale_value needs to be set
  # data_mean: subtract the per-channel mean; set mean_value and comment out scale_value
  # data_scale: multiply image pixels by a scale factor; set scale_value and comment out mean_value
  # data_mean_and_scale: subtract the channel mean, then scale; set both mean_value and scale_value below
  norm_type: ''
  # Required.
  # Mean subtracted from the image; per-channel values must be space-separated
  # e.g. mean_value: 128.0, or mean_value: 111.0 109.0 118.0
  mean_value:
  # Required.
  # Preprocessing scale factor; per-channel values must be space-separated; scale = 1/std
  # e.g. scale_value: 0.0078125, or scale_value: 0.0078125 0.001215 0.003680
  scale_value:
# Quantization parameters
calibration_parameters:
  # Required.
  # Directory of calibration reference images (Jpeg, Bmp, etc.). Typically pick ~100 images from the test set covering typical scenes; avoid corner cases such as over-exposed, saturated, blurred, all-black, or all-white images
  # Configure it to match the folder path in the 02_preprocess.sh script, e.g. cal_data_dir: './calibration_data_yuv_f32'
  cal_data_dir: ''
  cal_data_type: 'float32'
  calibration_type: 'default'
  # max_percentile: 0.99996
# Compiler parameters
compiler_parameters:
  compile_mode: 'latency'
  debug: False
  # core_num: 2
  optimize_level: 'O3'
```
Enter the target directory and create the configuration file:

```shell
cd /path/to/xxx
touch xxx.yaml
nano xxx.yaml
```

Paste the template above into the yaml file and edit the required parameters as needed; the input-related parameters can be read off a visualization of the model structure (for example in Netron). A concrete configuration is shown below:
```yaml
calibration_parameters:
  cal_data_dir: './cal_data1_bin;./cal_data2_bin'
  cal_data_type: 'float32;float32'
  calibration_type: 'default'
  # max_percentile: 0.99999
  optimization: set_all_nodes_int16
  per_channel: True
compiler_parameters:
  compile_mode: latency
  optimize_level: O3
input_parameters:
  input_layout_rt: NCHW;NCHW
  input_layout_train: NCHW;NCHW
  input_name: input;obs_hist.1
  input_shape: 1x39;1x10x39
  input_type_rt: featuremap;featuremap
  input_type_train: featuremap;featuremap
model_parameters:
  march: bayes-e
  onnx_model: best.onnx
  output_model_file_prefix: sup
  working_dir: ./model_output_sup
```
Note that for multi-input models the parameters must be written in the following form, with the semicolon-separated fields for each input strictly aligned column by column (same input order in every field):

```yaml
input_name: 'input;obs_hist.1'
input_shape: '1x39;1x10x39'
input_type_rt: 'featuremap;featuremap'
input_layout_rt: 'NCHW;NCHW'
```
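The column alignment can be sanity-checked mechanically before running hb_mapper. A small sketch — `check_multi_input_fields` is a hypothetical helper, not part of the toolchain — that verifies every semicolon-separated field lists exactly one entry per input:

```python
def check_multi_input_fields(fields):
    """Verify each multi-input yaml field has the same number of
    semicolon-separated entries as input_name."""
    counts = {k: len(v.split(";")) for k, v in fields.items()}
    n = counts["input_name"]
    bad = [k for k, c in counts.items() if c != n]
    if bad:
        raise ValueError(f"fields {bad} do not list {n} entries like input_name")
    return n

n = check_multi_input_fields({
    "input_name": "input;obs_hist.1",
    "input_shape": "1x39;1x10x39",
    "input_type_rt": "featuremap;featuremap",
    "input_layout_rt": "NCHW;NCHW",
})
# n is the number of model inputs (2 here)
```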
Next, run the conversion command:

```shell
cd /path/to/          # the directory containing xxx.yaml
hb_mapper makertbin --model-type onnx --config xxx.yaml
```

If the terminal reports no errors, the conversion succeeded. During conversion the tool prints the FPS, the cosine similarity of each operator, the overall model cosine similarity, and more; the same information is saved as hb_mapper_makertbin.log in the current directory. A concrete log follows for reference.
2025-10-11 17:45:17,691 file: model_builder.py func: model_builder line No: 35 Start to Horizon NN Model Convert.
2025-10-11 17:45:17,696 file: model_debugger.py func: model_debugger line No: 67 Loading horizon_nn debug methods:set()
2025-10-11 17:45:17,696 file: quantization_config.py func: quantization_config line No: 305 The activation calibration parameters:
calibration_type: ['max', 'kl']
per_channel: [True, False]
max_percentile: [0.99995, 1.0]
asymmetric: [True, False]
The modelwise search parameters:
similarity: 0.995
metric: cosine-similarity
All nodes in the model are set to datatype: int16
2025-10-11 17:45:17,696 file: model_builder.py func: model_builder line No: 197 The specified model compilation architecture: bayes-e.
2025-10-11 17:45:17,696 file: model_builder.py func: model_builder line No: 207 The specified model compilation optimization parameters: [].
2025-10-11 17:45:17,696 file: model_builder.py func: model_builder line No: 35 Start to prepare the onnx model.
2025-10-11 17:45:17,702 file: prepare.py func: prepare line No: 106 Input ONNX Model Information:
ONNX IR version: 6
Opset version: ['ai.onnx v11', 'horizon v1']
Producer: pytorch v2.4.1
Domain: None
Version: None
Graph input:
input: shape=[1, 39], dtype=FLOAT32
obs_hist.1: shape=[1, 10, 39], dtype=FLOAT32
Graph output:
output: shape=[1, 10], dtype=FLOAT32
2025-10-11 17:45:17,733 file: model_builder.py func: model_builder line No: 38 End to prepare the onnx model.
2025-10-11 17:45:17,739 file: model_builder.py func: model_builder line No: 265 Saving model to: sup_original_float_model.onnx.
2025-10-11 17:45:17,739 file: model_builder.py func: model_builder line No: 35 Start to optimize the onnx model.
2025-10-11 17:45:17,786 file: constant_folding.py func: constant_folding line No: 66 Summary info for constant_folding:
2025-10-11 17:45:17,786 file: constant_folding.py func: constant_folding line No: 67 After constant_folding, the number of nodes has changed from 96 to 71.
2025-10-11 17:45:17,786 file: constant_folding.py func: constant_folding line No: 71 After constant_folding, the number of parameters has changed from 409592 to 409592.
2025-10-11 17:45:17,786 file: constant_folding.py func: constant_folding line No: 76 Detailed info for constant_folding:
2025-10-11 17:45:17,786 file: constant_folding.py func: constant_folding line No: 88 After folding node (op_name: /actor/Flatten, op_type: Flatten), the number of increased parameters is 0.
After folding node (op_name: /actor/Flatten_1, op_type: Flatten), the number of increased parameters is 0.
After folding node (op_name: /actor/Flatten_2, op_type: Flatten), the number of increased parameters is 0.
2025-10-11 17:45:17,825 file: model_builder.py func: model_builder line No: 38 End to optimize the onnx model.
2025-10-11 17:45:17,832 file: model_builder.py func: model_builder line No: 265 Saving model to: sup_optimized_float_model.onnx.
2025-10-11 17:45:17,832 file: model_builder.py func: model_builder line No: 35 Start to calibrate the model.
2025-10-11 17:45:18,038 file: tool_utils.py func: tool_utils line No: 321 The input0 of Node(name:/actor/MatMul_8, type:MatMul) does not support data type: int16
2025-10-11 17:45:18,043 file: calibration_data_set.py func: calibration_data_set line No: 111 input name: input, number_of_samples: 173
2025-10-11 17:45:18,043 file: tool_utils.py func: tool_utils line No: 321 The input0 of Node(name:/actor/MatMul_5, type:MatMul) does not support data type: int16
2025-10-11 17:45:18,044 file: calibration_data_set.py func: calibration_data_set line No: 111 input name: obs_hist.1, number_of_samples: 173
2025-10-11 17:45:18,044 file: tool_utils.py func: tool_utils line No: 321 The input0 of Node(name:/actor/MatMul_2, type:MatMul) does not support data type: int16
2025-10-11 17:45:18,045 file: calibration_data_set.py func: calibration_data_set line No: 123 There are 173 samples in the data set.
2025-10-11 17:45:18,045 file: infer_thresholds.py func: infer_thresholds line No: 84 Run calibration model with modelwise search method.
2025-10-11 17:45:18,062 file: tool_utils.py func: tool_utils line No: 321 The input1 of Node(name:/actor/MatMul_2, type:MatMul) does not support data type: int16
2025-10-11 17:45:18,088 file: tool_utils.py func: tool_utils line No: 321 The input1 of Node(name:/actor/MatMul_5, type:MatMul) does not support data type: int16
2025-10-11 17:45:18,136 file: tool_utils.py func: tool_utils line No: 321 The input1 of Node(name:/actor/MatMul_8, type:MatMul) does not support data type: int16
2025-10-11 17:45:18,147 file: base.py func: base line No: 138 Calibration using batch 8
2025-10-11 17:45:18,152 file: tool_utils.py func: tool_utils line No: 321 The output of Node(name:/actor/Unsqueeze) is int16, then requantized to int8
2025-10-11 17:45:18,212 file: tool_utils.py func: tool_utils line No: 321 The output of Node(name:/actor/MatMul_transpose_0_reshape) is int16, then requantized to int8
2025-10-11 17:45:18,214 file: ort.py func: ort line No: 207 Reset batch_size=1 and execute forward again...
2025-10-11 17:45:18,214 file: tool_utils.py func: tool_utils line No: 321 The output of Node(name:/actor/Unsqueeze_2) is int16, then requantized to int8
2025-10-11 17:45:18,283 file: tool_utils.py func: tool_utils line No: 321 The output of Node(name:/actor/MatMul_3_transpose_0_reshape) is int16, then requantized to int8
2025-10-11 17:45:18,284 file: tool_utils.py func: tool_utils line No: 321 The output of Node(name:/actor/Unsqueeze_4) is int16, then requantized to int8
2025-10-11 17:45:18,284 file: tool_utils.py func: tool_utils line No: 321 The output of Node(name:/actor/MatMul_6_transpose_0_reshape) is int16, then requantized to int8
2025-10-11 17:59:00,088 file: modelwise_search.py func: modelwise_search line No: 75 Select max-percentile:percentile=0.99995 method.
2025-10-11 17:59:01,501 file: model_builder.py func: model_builder line No: 38 End to calibrate the model.
2025-10-11 17:59:01,725 file: model_builder.py func: model_builder line No: 265 Saving model to: sup_calibrated_model.onnx.
2025-10-11 17:59:01,725 file: model_builder.py func: model_builder line No: 35 Start to quantize the model.
2025-10-11 17:59:03,636 file: constant_folding.py func: constant_folding line No: 66 Summary info for constant_folding:
2025-10-11 17:59:03,636 file: constant_folding.py func: constant_folding line No: 67 After constant_folding, the number of nodes has changed from 88 to 88.
2025-10-11 17:59:03,636 file: constant_folding.py func: constant_folding line No: 71 After constant_folding, the number of parameters has changed from 468136 to 468136.
2025-10-11 17:59:03,636 file: constant_folding.py func: constant_folding line No: 76 Detailed info for constant_folding:
2025-10-11 17:59:03,636 file: constant_folding.py func: constant_folding line No: 88
2025-10-11 17:59:03,948 file: model_builder.py func: model_builder line No: 38 End to quantize the model.
2025-10-11 17:59:04,262 file: model_builder.py func: model_builder line No: 265 Saving model to: sup_quantized_model.onnx.
2025-10-11 17:59:04,262 file: model_builder.py func: model_builder line No: 35 Start to compile the model with march bayes-e.
2025-10-11 17:59:05,868 file: hybrid_build.py func: hybrid_build line No: 111 Compile submodel: main_graph_subgraph_0
2025-10-11 17:59:05,888 file: hbdk_cc.py func: hbdk_cc line No: 126 hbdk-cc parameters:['--O3', '--debug', '--core-num', '1', '--fast', '--input-layout', 'NHWC', '--output-layout', 'NHWC', '--input-source', 'ddr,ddr']
2025-10-11 17:59:05,888 file: hbdk_cc.py func: hbdk_cc line No: 127 hbdk-cc command used:hbdk-cc -f hbir -m /tmp/tmpwc6dher3/main_graph_subgraph_0.hbir -o /tmp/tmpwc6dher3/main_graph_subgraph_0.hbm --march bayes-e --progressbar --O3 --debug --core-num 1 --fast --input-layout NHWC --output-layout NHWC --input-source ddr,ddr
2025-10-11 17:59:27,179 file: tool_utils.py func: tool_utils line No: 326 consumed time 21.2712
2025-10-11 17:59:27,238 file: tool_utils.py func: tool_utils line No: 326 FPS=738.86, latency = 1353.4 us, DDR = 8020704 bytes (see main_graph_subgraph_0.html)
2025-10-11 17:59:27,305 file: model_builder.py func: model_builder line No: 38 End to compile the model with march bayes-e.
2025-10-11 17:59:29,190 file: print_info_dict.py func: print_info_dict line No: 72 The main quantized node information:
===================================================================================================================================
Node ON Subgraph Type Cosine Similarity Threshold DataType
-----------------------------------------------------------------------------------------------------------------------------------
/Slice BPU id(0) Slice 1.000000 1.41351 int16
/Unsqueeze BPU id(0) Reshape 1.000000 3.30783 int16
/Unsqueeze_output_0_calibrated_Requantize BPU id(0) HzRequantize -- -- int16
/Concat BPU id(0) Concat 1.000000 1.41351 int16
/Slice_1 BPU id(0) Slice 1.000000 1.41351 int16
/Reshape BPU id(0) Reshape 1.000000 1.41351 int16
/mlp_encoder/0/Gemm BPU id(0) HzSQuantizedConv 0.999983 1.41351 int16
/mlp_encoder//Elu BPU id(0) HzLut2Layer 0.999984 2.22292 int16
/mlp_encoder/2/Gemm BPU id(0) HzSQuantizedConv 0.999997 2.09289 int16
/mlp_encoder/Elu BPU id(0) HzLut2Layer 0.999996 10.168 int16
/mlp_encoder/4/ReduceMean BPU id(0) HzSQuantizedReduceMean 1.000000 8.73395 int16
/mlp_encoder/4/Sub BPU id(0) HzSElementwiseSub 0.999996 8.73395 int16
/mlp_encoder/4/Pow BPU id(0) HzLut2Layer 0.999994 8.19725 int16
/mlp_encoder/4/ReduceMean_1 BPU id(0) HzSQuantizedReduceMean 1.000000 67.195 int16
/mlp_encoder/4/Div_reciprocal BPU id(0) HzLut2Layer 1.000000 4.9403 int16
/mlp_encoder/4/Div_mul BPU id(0) HzSElementwiseMul 0.999997 8.19725 int16
/mlp_encoder/5/Gemm BPU id(0) HzSQuantizedConv 0.999995 5.55314 int16
/mlp_encoder/Elu_1 BPU id(0) HzLut2Layer 0.999992 13.3564 int16
/mlp_encoder/7/ReduceMean BPU id(0) HzSQuantizedReduceMean 1.000000 9.16719 int16
/mlp_encoder/7/Sub BPU id(0) HzSElementwiseSub 0.999992 9.16719 int16
/mlp_encoder/7/Pow BPU id(0) HzLut2Layer 0.999992 8.24024 int16
/mlp_encoder/7/ReduceMean_1 BPU id(0) HzSQuantizedReduceMean 1.000000 67.9016 int16
/mlp_encoder/7/Div_reciprocal BPU id(0) HzLut2Layer 1.000000 4.58363 int16
/mlp_encoder/7/Div_mul BPU id(0) HzSElementwiseMul 0.999992 8.24024 int16
/mlp_encoder/8/Gemm BPU id(0) HzSQuantizedConv 0.999987 4.93075 int16
/actor/Concat BPU id(0) Concat 0.999987 3.30783 int16
/actor/gate/0/Gemm_pre_reshape BPU id(0) Reshape 0.999987 3.30783 int16
/actor/gate/0/Gemm BPU id(0) HzSQuantizedConv 0.999962 3.30783 int16
/actor/gate/1/Elu BPU id(0) HzLut2Layer 0.999962 1.55307 int16
/actor/gate/2/Gemm BPU id(0) HzSQuantizedConv 0.999979 1.52875 int16
/actor/gate/3/Elu BPU id(0) HzLut2Layer 0.999977 2.0178 int16
/actor/gate/4/Gemm BPU id(0) HzSQuantizedConv 0.999989 1.95905 int16
/actor/Softmax_reducemax_FROM_QUANTIZED_SOFTMAX BPU id(0) HzQuantizedReduceMax 1.000000 2.79715 int16
/actor/Softmax_sub_FROM_QUANTIZED_SOFTMAX BPU id(0) HzSElementwiseSub 0.999997 2.79715 int16
/actor/Softmax_exp_FROM_QUANTIZED_SOFTMAX BPU id(0) HzLut2Layer 0.999998 11.0903 int16
/actor/Softmax_reducesum_FROM_QUANTIZED_SOFTMAX BPU id(0) HzSQuantizedReduceSum 1.000000 1.0 int16
/actor/Softmax_reciprocal_FROM_QUANTIZED_SOFTMAX BPU id(0) HzLut2Layer 1.000000 3.53229 int16
/actor/Softmax_mul_FROM_QUANTIZED_SOFTMAX BPU id(0) HzSElementwiseMul 0.999998 1.0 int16
/actor/MatMul_reshape_input BPU id(0) Reshape 0.999998 0.920062 int16
/actor/MatMul BPU id(0) HzSQuantizedConv 0.999996 0.920062 int16
variable_227_Requantize BPU id(0) HzRequantize -- -- int16
/actor/Unsqueeze BPU id(0) Reshape 0.999987 3.30783 int16
/actor/MatMul_1 BPU id(0) HzSQuantizedConv 0.999991 0.920062 int16
variable_228_Requantize BPU id(0) HzRequantize -- -- int16
/actor/Unsqueeze_output_0_calibrated_Requantize BPU id(0) HzRequantize -- -- int16
/actor/MatMul_2 BPU id(0) HzSQuantizedMatmul 0.999709 3.30783 int8
/actor/Mul BPU id(0) HzSElementwiseMul 0.999709 0.791948 int16
/actor/Add BPU id(0) HzSElementwiseAdd 0.999719 0.791948 int16
/actor/Elu BPU id(0) HzLut2Layer 0.999726 0.791326 int16
/actor/Elu/actor/Elu_output_0_Reshape_0 BPU id(0) Reshape 0.999726 3.28868 int16
/actor/MatMul_3 BPU id(0) HzSQuantizedConv 0.999994 0.920062 int16
variable_229_Requantize BPU id(0) HzRequantize -- -- int16
/mlp_encoder/8/Gemm_output_0_calibrated_Requantize BPU id(0) HzRequantize -- -- int16
/actor/Elu_output_0_calibrated_Requantize BPU id(0) HzRequantize -- -- int16
/actor/Concat_1 BPU id(0) Concat 0.999852 3.30783 int8
/actor/Unsqueeze_2 BPU id(0) Reshape 0.999852 3.28868 int8
/actor/MatMul_4 BPU id(0) HzSQuantizedConv 0.999994 0.920062 int16
variable_225_Requantize BPU id(0) HzRequantize -- -- int16
/actor/MatMul_5 BPU id(0) HzSQuantizedMatmul 0.999979 3.28868 int8
/actor/Mul_2 BPU id(0) HzSElementwiseMul 0.999979 2.44794 int16
/actor/Add_1 BPU id(0) HzSElementwiseAdd 0.999980 2.44794 int16
/actor/Elu_1 BPU id(0) HzLut2Layer 0.999982 2.46283 int16
/actor/Elu_1/actor/Elu_1_output_0_Reshape_0 BPU id(0) Reshape 0.999982 3.28868 int16
/actor/MatMul_6 BPU id(0) HzSQuantizedConv 0.999994 0.920062 int16
variable_226_Requantize BPU id(0) HzRequantize -- -- int16
/actor/Elu_1_output_0_calibrated_Requantize BPU id(0) HzRequantize -- -- int16
/actor/Concat_2 BPU id(0) Concat 0.999909 3.30783 int8
/actor/Unsqueeze_4 BPU id(0) Reshape 0.999909 3.28868 int8
/actor/MatMul_7 BPU id(0) HzSQuantizedConv 0.999996 0.920062 int16
/actor/MatMul_8 BPU id(0) HzSQuantizedMatmul 0.999995 3.28868 int8
/actor/Mul_4 BPU id(0) HzSElementwiseMul 0.999995 4.41123 int16
/actor/Add_2 BPU id(0) HzSElementwiseAdd 0.999995 4.41123 int16
/actor/Squeeze_2 BPU id(0) Reshape 0.999995 4.43782 int16
2025-10-11 17:59:29,190 file: print_info_dict.py func: print_info_dict line No: 72 The quantized model output:
=============================================================================
Output Cosine Similarity L1 Distance L2 Distance Chebyshev Distance
-----------------------------------------------------------------------------
output 0.999995 0.003761 0.001519 0.009816
2025-10-11 17:59:29,195 file: model_builder.py func: model_builder line No: 38 End to Horizon NN Model Convert.
2025-10-11 17:59:29,203 file: hb_mapper_makertbin.py func: hb_mapper_makertbin line No: 601 start convert to *.bin file....
2025-10-11 17:59:29,224 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4326 ONNX model output num : 1
2025-10-11 17:59:29,228 file: layout_util.py func: layout_util line No: 15 set_featuremap_layout start
2025-10-11 17:59:29,228 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4060 model_deps_info: {'hb_mapper_version': '1.24.3', 'hbdk_version': '3.49.15', 'hbdk_runtime_version': ' 3.15.55.0', 'horizon_nn_version': '1.1.0', 'onnx_model': '/open_explorer/models/duck/best.onnx', 'march': 'bayes-e', 'layer_out_dump': False, 'log_level': 'DEBUG', 'working_dir': '/open_explorer/models/duck/model_output_sup', 'model_prefix': 'sup', 'input_names': ['obs_hist.1', 'input'], 'input_type_rt': ['featuremap', 'featuremap'], 'input_space_and_range': ['regular', 'regular'], 'input_type_train': ['featuremap', 'featuremap'], 'input_layout_rt': ['NCHW', 'NCHW'], 'input_layout_train': ['NCHW', 'NCHW'], 'norm_type': ['no_preprocess', 'no_preprocess'], 'scale_value': ['', ''], 'mean_value': ['', ''], 'input_shape': ['1x10x39', '1x39'], 'input_batch': [], 'cal_dir': ['/open_explorer/models/duck/cal_data2_bin', '/open_explorer/models/duck/cal_data1_bin'], 'cal_data_type': ['float32', 'float32'], 'preprocess_on': False, 'calibration_type': 'default', 'per_channel': 'True', 'optimization': ['set_all_nodes_int16'], 'hbdk_params': {'hbdk_pass_through_params': '--O3 --debug --core-num 1 --fast ', 'input-source': {'input': 'ddr', 'obs_hist.1': 'ddr', '_default_value': 'ddr'}}, 'debug': True, 'compile_mode': 'latency'}
2025-10-11 17:59:29,228 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4183 ############# model deps info #############
2025-10-11 17:59:29,228 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4184 hb_mapper version : 1.24.3
2025-10-11 17:59:29,228 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4187 hbdk version : 3.49.15
2025-10-11 17:59:29,228 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4189 hbdk runtime version: 3.15.55.0
2025-10-11 17:59:29,228 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4192 horizon_nn version : 1.1.0
2025-10-11 17:59:29,228 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4196 ############# model_parameters info #############
2025-10-11 17:59:29,229 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4202 onnx_model : /open_explorer/models/duck/best.onnx
2025-10-11 17:59:29,229 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4203 BPU march : bayes-e
2025-10-11 17:59:29,229 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4204 layer_out_dump : False
2025-10-11 17:59:29,229 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4205 log_level : DEBUG
2025-10-11 17:59:29,229 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4206 working dir : /open_explorer/models/duck/model_output_sup
2025-10-11 17:59:29,229 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4207 output_model_file_prefix: sup
2025-10-11 17:59:29,229 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4228 ############# input_parameters info #############
2025-10-11 17:59:29,229 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4246 ------------------------------------------
2025-10-11 17:59:29,230 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4248 ---------input info : obs_hist.1 ---------
2025-10-11 17:59:29,230 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4249 input_name : obs_hist.1
2025-10-11 17:59:29,230 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4250 input_type_rt : featuremap
2025-10-11 17:59:29,230 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4252 input_space&range : regular
2025-10-11 17:59:29,230 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4254 input_layout_rt : NCHW
2025-10-11 17:59:29,230 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4255 input_type_train : featuremap
2025-10-11 17:59:29,230 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4256 input_layout_train : NCHW
2025-10-11 17:59:29,230 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4257 norm_type : no_preprocess
2025-10-11 17:59:29,230 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4258 input_shape : 1x10x39
2025-10-11 17:59:29,230 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4268 cal_data_dir : /open_explorer/models/duck/cal_data2_bin
2025-10-11 17:59:29,230 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4270 cal_data_type : float32
2025-10-11 17:59:29,231 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4271 ---------input info : obs_hist.1 end -------
2025-10-11 17:59:29,231 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4248 ---------input info : input ---------
2025-10-11 17:59:29,231 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4249 input_name : input
2025-10-11 17:59:29,231 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4250 input_type_rt : featuremap
2025-10-11 17:59:29,231 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4252 input_space&range : regular
2025-10-11 17:59:29,231 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4254 input_layout_rt : NCHW
2025-10-11 17:59:29,231 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4255 input_type_train : featuremap
2025-10-11 17:59:29,231 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4256 input_layout_train : NCHW
2025-10-11 17:59:29,231 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4257 norm_type : no_preprocess
2025-10-11 17:59:29,231 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4258 input_shape : 1x39
2025-10-11 17:59:29,231 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4268 cal_data_dir : /open_explorer/models/duck/cal_data1_bin
2025-10-11 17:59:29,232 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4270 cal_data_type : float32
2025-10-11 17:59:29,232 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4271 ---------input info : input end -------
2025-10-11 17:59:29,232 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4272 ------------------------------------------
2025-10-11 17:59:29,232 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4274 ############# calibration_parameters info #############
2025-10-11 17:59:29,232 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4275 preprocess_on : False
2025-10-11 17:59:29,232 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4276 calibration_type: : default
2025-10-11 17:59:29,232 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4278 optimization : set_all_nodes_int16;
2025-10-11 17:59:29,232 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4284 per_channel : True
2025-10-11 17:59:29,232 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4303 ############# compiler_parameters info #############
2025-10-11 17:59:29,232 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4305 debug : True
2025-10-11 17:59:29,232 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4307 compile_mode : latency
2025-10-11 17:59:29,233 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4310 hbdk_pass_through_params: --O3 --debug --core-num 1 --fast
2025-10-11 17:59:29,233 file: onnx2horizonrt.py func: onnx2horizonrt line No: 4310 input-source : {'input': 'ddr', 'obs_hist.1': 'ddr', '_default_value': 'ddr'}
2025-10-11 17:59:29,236 file: hb_mapper_makertbin.py func: hb_mapper_makertbin line No: 783 Convert to runtime bin file successfully!
2025-10-11 17:59:29,236 file: hb_mapper_makertbin.py func: hb_mapper_makertbin line No: 784 End Model Convert
After the conversion command succeeds, the outputs are saved under the working_dir configured in the yaml (here ./model_output_sup). See the reference link for how to interpret the conversion artifacts.
At this point, model quantization is complete.
三、PTQ Weight-Split Tuning
1. Find the yaml combination with the best model accuracy
Tune the quantization strategy (calibration_type such as max, max_percentile, per_channel, and so on); set_all_nodes_int16 usually needs to be enabled.
| calibration_type | max_percentile | per_channel | advanced_parameters.set_all_nodes_int16 | Cosine Similarity |
|---|---|---|---|---|
| default | - | - | - | 0.997538 |
| max | 0.99996 | True | True | 0.999967 |
| mix | - | True | True | 0.999988 |
| default | - | True | - | 0.999995 |
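The cosine similarity reported here compares the float and quantized model outputs; the closer to 1, the lower the quantization error. A minimal numpy sketch of the metric, plus a fake-quantization experiment illustrating why int16 (set_all_nodes_int16) typically scores higher than int8 — `fake_quantize` is an illustrative stand-in, not the toolchain's actual quantizer:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between flattened float and quantized outputs."""
    a, b = a.ravel(), b.ravel()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def fake_quantize(x, bits):
    """Symmetric per-tensor fake quantization to the given bit width."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(x).max() / qmax
    return np.round(x / scale) * scale

rng = np.random.default_rng(0)
x = rng.standard_normal(1000).astype(np.float32)
sim8 = cosine_similarity(x, fake_quantize(x, 8))
sim16 = cosine_similarity(x, fake_quantize(x, 16))
# The finer int16 grid loses less signal, so sim16 > sim8.
```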
2. Debug the weight quantization accuracy
Following the accuracy-debug chapter of the manual, run a weight-precision debug pass and record the full node names of the Conv operators whose similarity falls below 0.9999.
Then paste the script below into a file and adapt the main section as needed. Note that the inputs are the xxx_optimized_float_model.onnx produced by the PTQ quantization step, plus the node names obtained from the accuracy-debug step:
```python
import numpy as np
from copy import deepcopy

# horizon_nn 1.1.0
from horizon_nn.common import constant_folding
from horizon_nn.ir import load_model, save_model
# newer horizon_nn versions
# from hmct.common import constant_folding
# from hmct.ir import load_model, save_model
# recent patches may also have renamed these to:
# from hmct.common import ConstantFolding
# from hmct.ir import load_model, save_model


def split_conv_nodes(model, conv_names):
    for conv_name in conv_names:
        conv_node = model.graph.node_mappings[conv_name]
        before_node = conv_node.inputs[0].src_op
        conv_weight_value = deepcopy(conv_node.inputs[1].value)
        # Per-output-channel max of |W| defines an int8-aligned grid step.
        conv_weight_max = abs(conv_weight_value).max(axis=(1, 2, 3))
        moded = (conv_weight_max / 127)[:, np.newaxis, np.newaxis, np.newaxis] + 1e-10
        # High part: W snapped onto the int8 grid; low part: the residual.
        conv_weight_high = np.floor(np.clip(conv_weight_value / moded + 1e-5, -127, 127)) * moded
        conv_weight_low = conv_weight_value - conv_weight_high
        conv_bias_value = conv_node.inputs[2].value if len(conv_node.inputs) == 3 else np.zeros(conv_weight_value.shape[0], np.float32)
        conv1_weight_var = model.graph.create_variable(
            is_param=True,
            value=conv_weight_high,
        )
        conv1_bias_var = conv_node.inputs[2] if len(conv_node.inputs) == 3 else model.graph.create_variable(
            is_param=True,
            value=np.zeros_like(conv_bias_value, np.float32),
        )
        conv1_node = model.graph.create_node(
            op_type="Conv",
            name=conv_node.name + "_split0",
            attributes=conv_node.attributes,
            inputs=[conv_node.inputs[0], conv1_weight_var, conv1_bias_var],
            num_outputs=1)
        if before_node is not None:
            conv1_node.insert_after(before_node)
        else:
            conv1_node.prepend_on()
        conv2_weight_var = model.graph.create_variable(
            is_param=True,
            value=conv_weight_low,
        )
        conv2_bias_var = model.graph.create_variable(
            is_param=True,
            value=np.zeros_like(conv_bias_value, np.float32),
        )
        conv2_node = model.graph.create_node(
            op_type="Conv",
            name=conv_node.name + "_split1",
            attributes=conv_node.attributes,
            inputs=[conv_node.inputs[0], conv2_weight_var, conv2_bias_var],
            num_outputs=1)
        if before_node is not None:
            conv2_node.insert_after(before_node)
        else:
            conv2_node.prepend_on()
        # Sum the two halves and rewire all users of the original Conv.
        add1_node = model.graph.create_node(
            op_type="Add",
            inputs=[conv1_node.outputs[0], conv2_node.outputs[0]],
            name=conv_node.name + "_split_add0",
            num_outputs=1).insert_after(conv1_node)
        conv_node.replace_all_uses_with(add1_node)
        if not conv_node.is_used:
            conv_node.destroy()
    model.infer_shapes()
    model.check_validity()
    return model


if __name__ == "__main__":
    model = constant_folding(load_model("./xxx_optimized_float_model.onnx"))
    model = split_conv_nodes(model, conv_names=[
        "/mlp_encoder/0/Gemm",
        "/mlp_encoder/2/Gemm",
        "/mlp_encoder/5/Gemm",
        "/mlp_encoder/8/Gemm",
        "/actor/MatMul_1",
        "/actor/MatMul_3",
        "/actor/MatMul_4",
        "/actor/MatMul_6",
        "/actor/MatMul_7",
    ])
    save_model(model, "xxx_split.onnx")
```
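The heart of split_conv_nodes is the per-channel high/low weight decomposition: the high part lands exactly on a per-output-channel int8 grid, the low part carries the small residual, and their sum reproduces the original weight, so quantizing each half separately loses less precision. A standalone numpy sketch of just that arithmetic (the shapes are illustrative, and nothing here depends on the toolchain):

```python
import numpy as np

def split_weight(w):
    """Decompose w (OIHW layout) into high + low, where high lies exactly on
    a per-output-channel int8 grid and low is the small residual."""
    w_max = np.abs(w).max(axis=(1, 2, 3))
    step = (w_max / 127)[:, None, None, None] + 1e-10   # per-channel grid step
    high = np.floor(np.clip(w / step + 1e-5, -127, 127)) * step
    low = w - high                                      # |low| stays below one grid step
    return high, low

rng = np.random.default_rng(42)
w = rng.standard_normal((4, 3, 1, 1)).astype(np.float32)
high, low = split_weight(w)
assert np.allclose(high + low, w)                       # exact reconstruction
assert np.abs(low).max() <= np.abs(w).max() / 127 + 1e-3  # residual is one grid step at most
```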
Then repeat the workflow of section 二 (PTQ Model Quantization) on the resulting xxx_split.onnx file. Be sure to update the model_parameters section of the quantization yaml, for example:
```yaml
model_parameters:
  march: bayes-e
  onnx_model: xxx_split.onnx
  output_model_file_prefix: xxx_split
  working_dir: ./model_output_xxx_split
```
After quantization you obtain a .bin file tuned via weight splitting; the final output directory is shown in the figure below.
sup_split.zip (1.1 MB)
