How do I fix a SIGSEGV (out-of-bounds crash) that keeps occurring?

yuzhe123 · May 26, 2026, 13:33

Summary: full picture of the SIGSEGV when running test3.bin under dnn_node_example

1. Environment

Item	Value
Board	RDK X5 (board_id=302)
OS version	RDK OS 3.4.1
TROS	Humble, `/opt/tros/humble/`
hobot-dnn on the board	1.24.5
BPU platform	1.3.6, HBRT 3.15.55.0
Camera	MIPI IMX219, channel=1 (cam1)
Camera driver	OK, vin mipi0~3 probed

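2. Model information

Item	Value
Model	Custom-trained YOLOv8s, 10 classes
File	`/root/models/test3.bin` (12.9 MB)
Compile march	`bayes-e`
Input	1×3×640×640, NV12, NHWC
Output	6 tensors, NCHW layout, F32 unquantized
Normalization	`data_scale`, scale 0.003921568627451
Calibration	`/v2/v2/calib_images`, float32, max
Compile mode	latency, O3

Full compilation YAML (provided by the user):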

yaml

march: "bayes-e"
layer_out_dump: False

input_parameters:
  input_name: "images"
  input_shape: "1x3x640x640"
  input_type_rt: "nv12"
  input_layout_rt: "NHWC"
  input_type_train: "rgb"
  input_layout_train: "NCHW"
  norm_type: "data_scale"
  scale_value: "0.003921568627451"

calibration_parameters:
  cal_data_dir: "/v2/v2/calib_images"
  cal_data_type: "float32"
  calibration_type: "max"

compiler_parameters:
  compile_mode: "latency"
  debug: false
  optimize_level: "O3"

Note: the YAML only sets `input_layout_rt: "NHWC"` (the input layout); there is no `output_layout_rt` field.
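
As a quick sanity check on the normalization in the YAML above: the `scale_value` 0.003921568627451 is 1/255 rounded, so `data_scale` maps 8-bit pixel values into the range [0, 1]. A minimal sketch follows; the helper name `normalize` is illustrative only and is not part of the toolchain.

```python
# scale_value copied from the YAML; it should equal 1/255 up to rounding.
SCALE_VALUE = 0.003921568627451

def normalize(pixel: int, scale: float = SCALE_VALUE) -> float:
    """Apply data_scale-style normalization: multiply the raw pixel by the scale."""
    return pixel * scale

# True if the configured scale agrees with 1/255 to 12 decimal places.
print(abs(SCALE_VALUE - 1 / 255) < 1e-12)
# Darkest and brightest 8-bit pixel after normalization.
print(normalize(0), round(normalize(255), 6))
```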

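3. Verification so far

Python-side inference works: dnn.load('/root/models/test3.bin') → forward(NV12 input) → 6 output tensors with the correct shapes, no errors
hrt_model_exec inference works: hrt_model_exec infer --model_file /root/models/test3.bin --input_file /tmp/test_input_nv12.bin → succeeds in 14.6 ms, exit 0
The official yolov8s_640x640_nv12.bin model runs correctly in the same environment, with the same launch command and the same yolo8 parser
The yolov2 and yolov5 models run correctly in the same environment, which rules out a ROS2 environment problem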
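
For anyone reproducing the hrt_model_exec check, the NV12 input has to be exactly width × height × 3/2 bytes: a full-resolution Y plane followed by an interleaved UV plane at half resolution. The sketch below builds a flat gray frame of that size; the function name and the gray fill value are assumptions made for illustration.

```python
def make_nv12_gray(width: int, height: int, luma: int = 128) -> bytes:
    """Build a flat-gray NV12 frame: Y plane first, then interleaved UV."""
    if width % 2 or height % 2:
        raise ValueError("NV12 needs even width and height")
    y_plane = bytes([luma]) * (width * height)
    # 128 is the neutral chroma value, so the frame carries no color.
    uv_plane = bytes([128]) * (width * height // 2)
    return y_plane + uv_plane

frame = make_nv12_gray(640, 640)
print(len(frame))  # 640 * 640 * 3 // 2 bytes
```

Writing `frame` to a file should give an input of the right size for a 640×640 NV12 model, although a real image is of course a better functional test.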

4. Reproduction

Launch command:

bash

source /opt/tros/humble/setup.bash
ros2 launch dnn_node_example dnn_node_example.launch.py \
  dnn_example_config_file:=config/yolov8workconfig.json \
  dnn_example_image_width:=640 dnn_example_image_height:=640 \
  mipi_channel:=1

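Result: dnn_node_example exits with SIGSEGV (exit code -11) on the first inference frame; the terminal shows no specific error message.
Variants of the config file yolov8workconfig.json that were tried (all failed):

model_output_count:6, strides:[8,16,32] → SIGSEGV
model_output_count:3, strides:[8,16,32] → log "strides size 3 is not equal to model_output_count 3" (a self-contradictory error)
model_output_count:6, strides:[8,8,16,16,32,32] → the same contradictory error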
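
On the exit code reported above: on POSIX systems a negative process return code means the process was killed by the signal with that number, and signal 11 is SIGSEGV. A small sketch of that mapping, with a hypothetical helper name:

```python
import signal

def describe_returncode(returncode: int) -> str:
    """Translate a process return code into a readable status."""
    if returncode < 0:
        # Negative values mean "terminated by signal -returncode".
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"

print(describe_returncode(-11))
print(describe_returncode(0))
```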

5. Comparison with the official model

Property	Official yolov8 (works)	test3.bin (crashes)
Output layout	NHWC	NCHW
reg data type	S32 + SCALE quantized	F32 unquantized
cls data type	F32	F32
Model file	`/opt/hobot/model/x5/basic/yolov8_640x640_nv12.bin`	`/root/models/test3.bin`

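6. Root-cause hypothesis (to be confirmed)

I suspect the yolo8 parser requires NHWC output layout. test3.bin outputs NCHW (the compilation YAML does not set output_layout_rt, so it defaults to NCHW); when the parser reads the buffers as NHWC, the memory layout is wrong and an out-of-bounds access causes the SIGSEGV. hrt_model_exec only runs the forward pass and does no parsing, so it does not crash; the Python path likewise bypasses the parser and does not crash.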
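
To make the hypothesis concrete, the sketch below compares flat offsets under the two layouts. The parser behavior modeled here (grid size taken from input size / stride, channel pitch taken from the last dimension) is purely an assumption for illustration and is not taken from the TROS source.

```python
def nchw_offset(c, h, w, H, W):
    """Flat element offset of (c, h, w) in a batch-1 NCHW buffer."""
    return (c * H + h) * W + w

def nhwc_offset(h, w, c, W, C):
    """Flat element offset of (h, w, c) in a batch-1 NHWC buffer."""
    return (h * W + w) * C + c

# cls output of the stride-8 head as an NCHW model would report it.
dims = (1, 10, 80, 80)
buffer_len = dims[1] * dims[2] * dims[3]  # elements actually allocated

# The same logical element (channel 1 of cell 0,0) sits at different offsets.
print(nchw_offset(1, 0, 0, H=80, W=80), nhwc_offset(0, 0, 1, W=80, C=10))

# Hypothetical parser: 80x80 grid from the stride, channel pitch from dims[3].
grid = 640 // 8
last_read = nhwc_offset(grid - 1, grid - 1, dims[1] - 1, W=grid, C=dims[3])
print(buffer_len, last_read, last_read >= buffer_len)
```

Under these assumptions the last read lands far past the end of the buffer, which is the kind of access that can end in SIGSEGV. Even a parser that stays in bounds would read the wrong values from an NCHW buffer.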

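7. Questions

Does the yolo8 parser on RDK X5 strictly require output tensors in NHWC layout?
If so, is adding output_layout_rt: "NHWC" to the compilation YAML enough, or are other parameters needed?
Does the reg output need to be quantized like the official model (S32 + SCALE), or is F32 also compatible?
If the parser does not require NHWC, are there other compilation parameters or known compatibility constraints to watch for?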

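地瓜小助手_bot · May 26, 2026, 13:34

Conclusion

The root cause of your SIGSEGV crash is confirmed: an output tensor layout mismatch plus missing quantization information.

Core issues

Output layout mismatch: your model outputs NCHW, but the TROS yolov8 parser strictly requires NHWC
The bbox branch lacks quantization information: quanti type: NONE prevents the parser from dequantizing correctly
Wrong output order: the parser expects cls first, then bbox; your model emits bbox first, then cls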
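
To illustrate why the scale matters for the bbox branch, here is a minimal sketch of SCALE-style dequantization followed by the standard YOLOv8 DFL decode for one box side. The per-channel scale layout, the sample values, and the function names are assumptions; the actual parser implementation may differ.

```python
import math

def dequantize(raw, scales):
    """SCALE-style dequantization: float value = integer value * scale."""
    return [r * s for r, s in zip(raw, scales)]

def dfl_distance(logits):
    """DFL decode for one box side: softmax over bins, then expected bin index."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return sum(i * e / total for i, e in enumerate(exps))

REG_MAX = 16
# Hypothetical S32 values and scales for the 16 bins of one box side.
raw = [0] * REG_MAX
raw[2] = 100
scales = [0.01] * REG_MAX

with_scale = dfl_distance(dequantize(raw, scales))
without_scale = dfl_distance(raw)  # what decoding looks like if the scale is ignored
print(round(with_scale, 3), round(without_scale, 3))
```

The two results differ substantially, so a parser that expects quantized data with a scale and receives something else will decode wrong box distances even when it does not crash.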

Solution

Step 1: Fix the ONNX export script

Modify the bpu_detect_forward function in export_yolo26_detect_bpu.py:

def bpu_detect_forward(self, x):
    result = []
    for i in range(self.nl):
        # bbox branch: keep float32; do not cast to int32 by hand!
        bbox = self.cv2[i](x[i])  # (B, 4*reg_max, H, W)
        # cls branch: keep float32
        cls = self.cv3[i](x[i])   # (B, nc, H, W)

        # Key step: NCHW -> NHWC
        bbox = bbox.permute(0, 2, 3, 1).contiguous()  # (B, H, W, 4*reg_max)
        cls = cls.permute(0, 2, 3, 1).contiguous()    # (B, H, W, nc)

        # Key step: output order is cls first, then bbox (matches the official parser)
        result.append(cls)
        result.append(bbox)
    return result
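
What `permute(0, 2, 3, 1).contiguous()` does to the memory order can be sketched in plain Python on a tiny tensor. This is an illustration of the reordering only, not a replacement for the export code; the function name is made up for the example.

```python
def nchw_to_nhwc(flat, C, H, W):
    """Reorder a flat batch-1 NCHW buffer into NHWC order."""
    out = [None] * (C * H * W)
    for c in range(C):
        for h in range(H):
            for w in range(W):
                out[(h * W + w) * C + c] = flat[(c * H + h) * W + w]
    return out

# Tiny tensor with C=2, H=1, W=3: channel 0 holds 0..2, channel 1 holds 10..12.
nchw = [0, 1, 2, 10, 11, 12]
print(nchw_to_nhwc(nchw, C=2, H=1, W=3))
```

After the reordering, the channel values of each grid cell sit next to each other, which is what a parser that walks the buffer cell by cell expects.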

Re-export the ONNX model:

python export_yolo26_detect_bpu.py --pt your_model.pt --imgsz 640 640 --opset 11

Step 2: Fix the compilation YAML

model_parameters:
  onnx_model: './best.onnx'
  march: "bayes-e"
  layer_out_dump: False
  working_dir: 'model_output'
  output_model_file_prefix: 'yolov8_cow'

input_parameters:
  input_name: "images"
  input_type_rt: 'nv12'
  input_type_train: 'rgb'
  input_layout_train: 'NCHW'
  norm_type: 'data_scale'
  scale_value: 0.003921568627451

calibration_parameters:
  cal_data_dir: './cal_bin'
  cal_data_type: 'float32'
  calibration_type: 'default'

compiler_parameters:
  compile_mode: 'latency'
  debug: False
  optimize_level: 'O3'

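Notes:

Do not add input_layout_rt (the BPU handles it automatically)
Do not add cal_data_count (the parameter does not exist)
Do not add optimization_parameters (not supported in 1.24.3)

Step 3: Recompile and verify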

hb_mapper makertbin --model-type onnx --config yolov8.yaml
hrt_model_exec model_info --model_file=./model_output/yolov8_cow.bin

Expected output:

output[0]: shape=(1,80,80,5), layout=NHWC, type=F32, quanti=SCALE  # cls_P3
output[1]: shape=(1,80,80,64), layout=NHWC, type=S32, quanti=SCALE  # bbox_P3
output[2]: shape=(1,40,40,5), layout=NHWC, type=F32, quanti=SCALE  # cls_P4
output[3]: shape=(1,40,40,64), layout=NHWC, type=S32, quanti=SCALE  # bbox_P4
output[4]: shape=(1,20,20,5), layout=NHWC, type=F32, quanti=SCALE  # cls_P5
output[5]: shape=(1,20,20,64), layout=NHWC, type=S32, quanti=SCALE  # bbox_P5
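
The shapes in this listing follow from simple arithmetic: each stride gives a grid of input size / stride cells, the cls head has class_num channels, and the bbox head has 4 × reg_max channels. A sketch that derives them (the function name is hypothetical) can be used to cross-check a model_info dump:

```python
def expected_output_shapes(input_size, strides, class_num, reg_max):
    """List NHWC shapes in cls-then-bbox order, one pair per stride."""
    shapes = []
    for stride in strides:
        grid = input_size // stride
        shapes.append((1, grid, grid, class_num))    # cls head
        shapes.append((1, grid, grid, 4 * reg_max))  # bbox head (DFL bins)
    return shapes

shapes = expected_output_shapes(640, [8, 16, 32], class_num=5, reg_max=16)
for index, shape in enumerate(shapes):
    print(index, shape)
```

For the 10-class model described in the original post, passing `class_num=10` would give 10 cls channels instead of 5.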

Step 4: Check the configuration file

Make sure yolov8workconfig.json is correct:

{
  "model_file": "/home/sunrise/yolov8/yolov8_cow.bin",
  "task_num": 4,
  "dnn_parser": "yolov8",
  "model_output_count": 6,
  "reg_max": 16,
  "class_num": 5,
  "cls_names_list": "my_config/cow.list",
  "strides": [8, 16, 32],
  "score_threshold": 0.25,
  "nms_threshold": 0.7,
  "nms_top_k": 300,
  "output_order": [0, 1, 2, 3, 4, 5]
}
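
A config like this can be sanity-checked before launching. The sketch below assumes one cls and one bbox tensor per stride, and assumes that output_order is meant to be a permutation of the output indices; both rules are inferred from this thread rather than from the parser source, and the function name is hypothetical.

```python
import json

def check_config(cfg):
    """Return a list of human-readable problems found in a parser config dict."""
    problems = []
    # Assumption: each stride contributes one cls and one bbox output.
    if cfg["model_output_count"] != 2 * len(cfg["strides"]):
        problems.append("model_output_count should be 2 * len(strides)")
    # Assumption: output_order, when present, lists every output index once.
    order = cfg.get("output_order")
    if order is not None and sorted(order) != list(range(cfg["model_output_count"])):
        problems.append("output_order should be a permutation of the output indices")
    return problems

cfg = json.loads("""{
  "model_output_count": 6, "reg_max": 16, "class_num": 5,
  "strides": [8, 16, 32], "output_order": [0, 1, 2, 3, 4, 5]
}""")
print(check_config(cfg))
```

Passing this check is necessary but not sufficient: the original post's first variant (6 outputs, 3 strides) also satisfies it and still crashed.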

Key differences

Property	Official model (works)	Your model (crashes)	After the fix
Output layout	NHWC	NCHW/NONE	NHWC
bbox type	S32 + SCALE	S32 + NONE	S32 + SCALE
cls type	F32	F32	F32
Output order	cls first, then bbox	bbox first, then cls	cls first, then bbox

Quick verification

After the fix, run:

ros2 launch dnn_node_example dnn_node_example_feedback.launch.py \
  dnn_example_config_file:=my_config/yolov8workconfig.json \
  dnn_example_image:=my_config/target.jpg

If problems remain, please post the new hrt_model_exec model_info output so we can continue the analysis!