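Hello! Please describe the problem you encountered in detail. This helps us locate the issue quickly.
1. Chip model: J5
2. OpenExplorer (天工开物) development kit version: J5_OE_1.1.40
3. Problem category: model conversion
4. Detailed description: QAT training fails with the following error: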
2023-10-20 17:21:50,080 ERROR [ddp_trainer.py:419] Node[0] Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/hat/engine/ddp_trainer.py", line 415, in _with_exception
    fn(*args)
  File "/clever/volumes/anjiaju-owner/dev/tmp/zone_hat/tools/train.py", line 189, in train_entrance
    trainer.fit()
  File "/usr/local/lib/python3.8/dist-packages/hat/engine/loop_base.py", line 510, in fit
    self.batch_processor(
  File "/usr/local/lib/python3.8/dist-packages/hat/engine/processors/processor.py", line 544, in __call__
    model_outs = model(*_as_list(batch_i))
  File "/usr/local/lib/python3.8/dist-packages/hat/utils/module_patch.py", line 46, in _wrap
    return fn(self, *args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/hat/utils/module_patch.py", line 46, in _wrap
    return fn(self, *args, **kwargs)
  File "/root/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/root/.local/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 1040, in forward
    output = self._run_ddp_forward(*inputs, **kwargs)
  File "/root/.local/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 1000, in _run_ddp_forward
    return module_to_run(*inputs[0], **kwargs[0])
  File "/usr/local/lib/python3.8/dist-packages/hat/utils/module_patch.py", line 46, in _wrap
    return fn(self, *args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/hat/utils/module_patch.py", line 46, in _wrap
    return fn(self, *args, **kwargs)
  File "/root/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/clever/volumes/anjiaju-owner/dev/tmp/zone_hat/hat_plugin/models/structure/bev/bev_structure.py", line 166, in forward
    bev_feat = self.bev_encoder(bev_feat, data)
  File "/usr/local/lib/python3.8/dist-packages/hat/utils/module_patch.py", line 46, in _wrap
    return fn(self, *args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/hat/utils/module_patch.py", line 46, in _wrap
    return fn(self, *args, **kwargs)
  File "/root/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/hat/models/task_modules/view_fusion/encoder.py", line 33, in forward
    feat = self.backbone(feat)
  File "/usr/local/lib/python3.8/dist-packages/hat/utils/module_patch.py", line 46, in _wrap
    return fn(self, *args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/hat/utils/module_patch.py", line 46, in _wrap
    return fn(self, *args, **kwargs)
  File "/root/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/hat/models/backbones/efficientnet.py", line 358, in forward
    x = self.quant(x)
  File "/usr/local/lib/python3.8/dist-packages/hat/utils/module_patch.py", line 46, in _wrap
    return fn(self, *args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/hat/utils/module_patch.py", line 46, in _wrap
    return fn(self, *args, **kwargs)
  File "/root/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/horizon_plugin_pytorch/nn/qat/stubs.py", line 80, in forward
    assert torch.all(
AssertionError: input scale must be the same as op's
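
For readers unfamiliar with this assertion, here is a minimal plain-Python sketch of the kind of check the message describes. It is an assumption based only on the traceback line `assert torch.all(` and the error text: `FixedScaleStub` and `scales_match` are hypothetical names, not part of horizon_plugin_pytorch, and plain floats stand in for tensors.

```python
from typing import Sequence


class FixedScaleStub:
    """Hypothetical stand-in for a quant stub that owns a fixed scale.

    Not horizon_plugin_pytorch code. It only mimics the error message
    from the traceback, using Python floats instead of tensors.
    """

    def __init__(self, scale: Sequence[float]):
        self.scale = list(scale)

    def __call__(self, values, input_scale: Sequence[float]):
        # Assumed shape of the failing check: every element of the
        # incoming scale must equal the scale stored on the op.
        assert len(input_scale) == len(self.scale) and all(
            a == b for a, b in zip(input_scale, self.scale)
        ), "input scale must be the same as op's"
        return values


def scales_match(stub: FixedScaleStub, input_scale: Sequence[float]) -> bool:
    """Return True if the stub accepts data carrying this scale."""
    try:
        stub([0.0], input_scale)
        return True
    except AssertionError:
        return False


stub = FixedScaleStub([1 / 128])
same = scales_match(stub, [1 / 128])   # input already quantized with the op's scale
other = scales_match(stub, [1 / 64])   # input quantized upstream with a different scale
```

If this reading is right, the input reaching `self.quant` in `efficientnet.py` already carries a scale (it was quantized upstream in the BEV encoder path), and that scale differs from the one configured on the backbone's quant stub.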