(Unofficial) RDK Model Conversion Visualization Tool - It's Here!!!

Author: SkyXZ

CSDN: SkyXZ~ - CSDN Blog

Cnblogs: SkyXZ - 博客园

Back when I was still using the RDK X3, 吴诺 (@wunuo) published a tutorial for the new-generation quantization and conversion toolchain. That tool is extremely convenient and can quantize models for the X3 very quickly; its only drawback is that it does not support the X5. So I set out to build an X5-compatible visual quantization and conversion toolchain modeled on his X3 tool. My initial vision is for it to gradually support every model in the D-Robotics ModelZoo and offer a one-stop workflow from model training, through conversion, to deployment, making life easier for people who have just received an RDK board. After some effort, the first version of my toolchain is done!

The toolchain has now been upgraded to V2.0. This release polishes parts of the UI and adds page-state persistence: when you switch pages, the system records the current logs and background processes and restores them when you switch back. Online training and export are now supported for most common classification models. Most importantly, the visual toolchain can now run inference on the development machine directly with the intermediate compilation artifact "quantized.onnx", so you can quickly verify how the quantized model performs!

I hope everyone will keep the suggestions coming to help this project improve!!! (qaq: JavaScript is really hard.) A board-management feature will be added later!!!

Usage:

Default startup address: 127.0.0.1:5000
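
Once the service is running (via any of the methods below), a quick HTTP request is enough to confirm that the front end is up. This is only a sanity-check sketch and assumes the UI answers plain GET requests on the default port:

    # should print 200 (or some other HTTP status code) once the web UI is listening
    curl -s -o /dev/null -w "%{http_code}\n" http://127.0.0.1:5000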

For Docker installation, see: Tutorial on installing Docker and the NVIDIA Container Toolkit on Linux - SkyXZ - 博客园

  1. Using Docker (recommended):

    step 1: Pull the Docker image (from the Aliyun registry)

    docker pull crpi-0uog49363mcubexr.cn-hangzhou.personal.cr.aliyuncs.com/skyxz/rdk_toolchain:v2.0

    step 2: Create the folder mapping

    mkdir ~/dataset
    export dataset_path=~/dataset

    Run-Method-1: Create a temporary container (adjust --shm-size to your machine)

    docker run -it --rm --gpus all --shm-size=32g --ipc=host -e PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128 -e CUDA_LAUNCH_BLOCKING=1 -p 5000:5000 -p 8080:8080 -v "$dataset_path":/data crpi-0uog49363mcubexr.cn-hangzhou.personal.cr.aliyuncs.com/skyxz/rdk_toolchain:v2.0

    Run-Method-2: Create a persistent container (adjust --shm-size to your machine)

    docker run -it --gpus all --shm-size={your RAM size, e.g. 32g} --ipc=host -e PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128 -e CUDA_LAUNCH_BLOCKING=1 -p 5000:5000 -p 8080:8080 -v "$dataset_path":/data crpi-0uog49363mcubexr.cn-hangzhou.personal.cr.aliyuncs.com/skyxz/rdk_toolchain:v2.0
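
    A persistent container keeps its filesystem and state after you exit, so it helps to give it a name you can return to. The commands below are a minimal sketch and assume you add something like --name rdk_toolchain_v2 to the run command above (the name itself is arbitrary):

    # find the container you created earlier (running or stopped)
    docker ps -a
    # restart a stopped container and re-attach to its terminal
    docker start -ai rdk_toolchain_v2
    # or open an extra shell inside it while it is running
    docker exec -it rdk_toolchain_v2 /bin/bash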

  2. Build the Docker image manually:

    step 1: Download the source code from Baidu Netdisk (the code repository only contains the front-end and back-end implementation)

    Baidu Netdisk (extraction code required)

    step 2: Extract the archive and cd into the project directory

    step 3: Build the Docker image

    docker build -t rdk_toolchain .
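
    If the build stalls while downloading the nvcr.io base image or apt/pip packages (as in a report further down this thread), Docker's predefined proxy build args can route the build through a local proxy. The proxy address below is only an example and must match your own setup:

    docker build \
        --build-arg HTTP_PROXY=http://127.0.0.1:7890 \
        --build-arg HTTPS_PROXY=http://127.0.0.1:7890 \
        -t rdk_toolchain .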

    step 4: Create the folder mapping

    mkdir ~/dataset
    export dataset_path=~/dataset

    Run-Method-1: Create a temporary container (adjust --shm-size to your machine)

    docker run -it --rm --gpus all --shm-size=32g --ipc=host -e PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128 -e CUDA_LAUNCH_BLOCKING=1 -p 5000:5000 -p 8080:8080 -v "$dataset_path":/data rdk_toolchain

    Run-Method-2: Create a persistent container (adjust --shm-size to your machine)

    docker run -it --gpus all --shm-size={your RAM size, e.g. 32g} --ipc=host -e PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128 -e CUDA_LAUNCH_BLOCKING=1 -p 5000:5000 -p 8080:8080 -v "$dataset_path":/data rdk_toolchain

  3. Run directly from source:

    step 1: Download the source code from Baidu Netdisk (the code repository only contains the front-end and back-end implementation)

    Baidu Netdisk (extraction code required)

    step 2: Install the dependencies

    pip3 install -r requirements_docker.txt
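
    To keep these dependencies isolated from your system Python, you may want to install them into a virtual environment first. This is only a sketch; note that, judging from the build log quoted further down this thread, some bundled wheels (e.g. hbdk) target CPython 3.10 on x86_64 Linux, so the interpreter you use should match:

    python3 -m venv .venv        # the directory name is arbitrary
    source .venv/bin/activate
    pip3 install -r requirements_docker.txt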

    step 3: Run the startup script

    bash start_services.sh

Notes:

  1. When you stop an operation (for example, stopping training), the stop button may sometimes appear stuck and unresponsive. It is not frozen! The backend is still trying to kill the process; click stop again after a few seconds and it will exit.
  2. Red log output is not necessarily an error! Use the training status indicator to judge whether the process ended because of an error.
  3. Except for the exported ONNX, which is placed next to the original .pt model, all other run results are saved under /app/logs (a copy-out sketch follows this list)
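
When running inside Docker, /app/logs lives in the container's filesystem, so unless you mounted it as a volume you can copy results to the host with docker cp. A minimal sketch; <container> stands for your container's name or ID:

    # copy all run results from the container to the host
    docker cp <container>:/app/logs ./rdk_toolchain_logs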

Version history:

V1.0:

  1. Quantization and conversion are supported for all models
  2. Training and export are implemented for the full YOLO series in the ModelZoo
  3. ResNet-series, FCOS, and other models coming soon (TODO V2.0)
  4. PC-side inference check of converted models coming soon (TODO V2.0)

V2.0:

  1. Quantization and conversion are now supported for common classification models
  2. Added page-state persistence, so nothing is lost when you switch pages
  3. Fixed several bugs, improved a few features, and cleaned up some of the uglier UI
  4. Added input-size settings for training and export of some models

Feature showcase of the D-Robotics RDK one-stop model development tool:

  • Tool overview
  • Model training
  • Model export
  • Model quantization check
  • Model conversion
  • Dequantization node removal
  • Model input/output inspection and visualization

New features at a glance:

  • Page-state persistence
  • Online model testing
  • UI improvements
  • Classification model training and export support


Hi, when using the tool I found that files and folders are not displayed in the web page.


Can this be used with Docker on Windows?


After running it successfully, when I try to import a .pt file the web page shows an incomplete file list or nothing at all. How can I fix this?


Error message: vector::_M_range_check: __n (which is 2) >= this->size() (which is 2)
What causes this error?


Why is the GPU not detected?


Can this platform be used with the X3?

Awesome!!

Yes, it works with the X3.

Yes, it can.

When I use the model testing feature, it returns "Method Not Allowed":

The method is not allowed for the requested URL.

What causes this?

Model training does not work for me; the export script and training script both seem to be missing, and I don't know why.

Did you manage to solve it? I have the same problem.

Hello, after pulling the image I noticed it takes up a lot of storage; after deleting the image, only about 10 GB was freed. How can I clean up the remaining files?

You need to start it with the folder mounted; ask GPT for the details of launching Docker with a mounted folder.

Does this support instance segmentation?

Bro, is there a mirror hosted outside China? I have been pulling for a whole day without success; the registry seems to have problems:

PS C:\Users\29774> docker pull crpi-0uog49363mcubexr.cn-hangzhou.personal.cr.aliyuncs.com/skyxz/rdk_toolchain:v2.0
v2.0: Pulling from skyxz/rdk_toolchain
263fc748118f: Pulling fs layer
507fc9045cba: Pulling fs layer
23b7d8e07c16: Pulling fs layer
5b2c1aeca6ff: Pulling fs layer
16c36d0187d0: Pulling fs layer
43373113734d: Pulling fs layer
922ac8fcb889: Pulling fs layer
e7a56570655c: Pulling fs layer
56730522db62: Pulling fs layer
624c2ffd0169: Pulling fs layer
a07de276cc70: Pulling fs layer
f92aac8f11dc: Pulling fs layer
a07de276cc70: Already exists
8fad2fc95c7a: Already exists
824912e95b95: Already exists
352c12e660b0: Already exists
1e9c91331225: Already exists
cef5c9f48a3a: Already exists
a20017c297a0: Already exists
68075f2beca1: Pulling fs layer
3dfcd104a692: Already exists
897ad69c8f5b: Already exists
64982a6f0f6c: Pulling fs layer
unknown: failed to copy: httpReadSeeker: failed open: unexpected status from GET request to http://aliregistry.oss-cn-hangzhou.aliyuncs.com/docker/registry/v2/blobs/sha256/43/43373113734db918f617859130a3ef483ce247f737bb04ec361403f39777fd69/data?Expires=1759815266&OSSAccessKeyId=LTAI4FsQYu7kG56rtBsQAHfw&Signature=zcpOmhoSlgRqIrYFU5SWuPjiGt0%3D&x-oss-traffic-limit=41943040: 500 reading HTTP response body: unexpected EOF

Building it myself also seems to fail:

manjusaka008@manjusaka:/mnt/f/BaiduNetdiskDownload/rdk_toolchain/V2.0/RDK_ToolChain$ HTTP_PROXY=http://172.22.240.1:7897 HTTPS_PROXY=http://172.22.240.1:7897 docker build -t rdk_toolchain .
[+] Building 248.3s (6/21) docker:default
=> [internal] load build definition from Dockerfile 0.1s
=> => transferring dockerfile: 3.08kB 0.0s
=> [internal] load metadata for nvcr.io/nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04 2.2s
=> [internal] load .dockerignore 0.1s
=> => transferring context: 88B 0.1s
=> [internal] load build context 25.8s
=> => transferring context: 151.97kB 25.8s
=> [ 1/17] FROM nvcr.io/nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04@sha256:8f9dd0d09d3ad3900357a1cf7f887888b5b74056636cd 245.8s
=> => resolve nvcr.io/nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04@sha256:8f9dd0d09d3ad3900357a1cf7f887888b5b74056636cd6ef0 0.0s
=> ERROR [ 2/17] RUN sed -i 's/archive.ubuntu.com/mirrors.aliyun.com/g' /etc/apt/sources.list && sed -i 's/security.ub 0.0s

[ 2/17] RUN sed -i 's/archive.ubuntu.com/mirrors.aliyun.com/g' /etc/apt/sources.list && sed -i 's/security.ubuntu.com/mirrors.aliyun.com/g' /etc/apt/sources.list:


ERROR: failed to build: failed to solve: Canceled: context canceled
manjusaka008@manjusaka:/mnt/f/BaiduNetdiskDownload/rdk_toolchain/V2.0/RDK_ToolChain$ HTTP_PROXY=http://172.22.240.1:7897 HTTPS_PROXY=http://172.22.240.1:7897 docker build -t rdk_toolchain .
[+] Building 2550.6s (16/21) docker:default
=> [internal] load build definition from Dockerfile 0.1s
=> => transferring dockerfile: 3.08kB 0.0s
=> [internal] load metadata for nvcr.io/nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04 0.8s
=> [internal] load .dockerignore 0.1s
=> => transferring context: 88B 0.0s
=> [ 1/17] FROM nvcr.io/nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04@sha256:8f9dd0d09d3ad3900357a1cf7f887888b5b74056636c 1899.3s
=> => resolve nvcr.io/nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04@sha256:8f9dd0d09d3ad3900357a1cf7f887888b5b74056636cd6ef0 0.0s
=> => sha256:23b7d8e07c16707ff4ec3ca558a8099c454953c840156c318a60a6b4273846a0 2.46GB / 2.46GB 1752.1s
=> => sha256:263fc748118f7937f811e3e9c9355318db07dd2dd1dccc370dadaa7d0b5ed692 1.38GB / 1.38GB 585.9s
=> => sha256:68075f2beca1cfd3f243ec110000716dff39d895f4d5e0d3faba7ace430f9633 1.43GB / 1.43GB 616.6s
=> => extracting sha256:263fc748118f7937f811e3e9c9355318db07dd2dd1dccc370dadaa7d0b5ed692 144.1s
=> => extracting sha256:16c36d0187d03bd0de84d870ded86c45fabd78f4bfdb2ed90177e5fc4dd33d11 0.1s
=> => extracting sha256:e7a56570655c990ecc804c77873efc83f9a6c31064e3e8a5dc02430213f2d74c 0.0s
=> => extracting sha256:507fc9045cbad45c1c4ca554a6453fe0a1c9ae74667db0612fec7475256d5c23 0.5s
=> => extracting sha256:23b7d8e07c16707ff4ec3ca558a8099c454953c840156c318a60a6b4273846a0 150.7s
=> => extracting sha256:922ac8fcb88926d95550e82f83c14a4f3f3eaab635e7acf43ee0c59dea0c14d7 0.0s
=> => extracting sha256:68075f2beca1cfd3f243ec110000716dff39d895f4d5e0d3faba7ace430f9633 62.8s
=> [internal] load build context 37.6s
=> => transferring context: 148.92kB 37.5s
=> [ 2/17] RUN sed -i 's/archive.ubuntu.com/mirrors.aliyun.com/g' /etc/apt/sources.list && sed -i 's/security.ubuntu.c 3.6s
=> [ 3/17] RUN apt-get update && apt-get install -y --no-install-recommends software-properties-common && add-ap 641.6s
=> [ 4/17] RUN pip config set global.index-url Simple Index 0.8s
=> [ 5/17] WORKDIR /app 0.0s
=> [ 6/17] RUN pip install --no-cache-dir matplotlib>=2.1.0 python-dateutil>=2.7 pillow>=8 packaging>=20. 10.2s
=> [ 7/17] RUN pip install --no-cache-dir numpy==1.26.1 cython 4.8s
=> [ 8/17] COPY deps/cocoapi /app/deps/cocoapi 0.1s
=> [ 9/17] RUN cd /app/deps/cocoapi/PythonAPI && python3 setup.py build_ext install && cd ../.. 4.2s
=> [10/17] COPY deps/wheels/*.whl ./deps/wheels/ 0.5s
=> [11/17] COPY requirements_docker.txt . 0.0s
=> ERROR [12/17] RUN pip install --no-cache-dir --ignore-installed -r requirements_docker.txt 0.7s

[12/17] RUN pip install --no-cache-dir --ignore-installed -r requirements_docker.txt:
0.485 Looking in indexes: Simple Index
0.700 ERROR: hbdk-3.49.15-cp310-cp310-linux_x86_64.whl is not a supported wheel on this platform.


Dockerfile:81

79 | # 修改安装命令,分开执行
80 | # RUN pip install --no-cache-dir --no-deps --ignore-installed -r requirements_docker.txt
81 | >>> RUN pip install --no-cache-dir --ignore-installed -r requirements_docker.txt
82 | # 复制项目文件
83 | COPY . .

ERROR: failed to build: failed to solve: process "/bin/sh -c pip install --no-cache-dir --ignore-installed -r requirements_docker.txt" did not complete successfully: exit code: 1

You're doing a great service to the community! Looking forward to your updates~~

Can this be used with the S100?