Error encountered when running compile_perf.py to compile a QAT model

Hello, please describe the problem you ran into in detail; this will help us locate the issue quickly.

1. Chip model: J5

2. OpenExplorer (天工开物) development package version: J5_OE_1.1.60

3. Problem area: model conversion

4. Problem description: Hello, while running compile_perf.py to compile my QAT model I hit the error below and could not immediately work out how to approach it. Could you help me take a look at what is causing the problem and how to resolve it?

/usr/local/lib/python3.8/dist-packages/horizon_plugin_pytorch/qtensor.py:1000: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if scale is not None and scale.numel() > 1:

/usr/local/lib/python3.8/dist-packages/horizon_plugin_pytorch/nn/quantized/conv2d.py:290: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  per_channel_axis=-1 if self.out_scale.numel() == 1 else 1,

/clever/volumes/anjiaju-owner/dev/history/zone_hat/hat/models/necks/fast_scnn.py:75: TracerWarning: torch.Tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  input_size = torch.Tensor((x.shape[2], x.shape[3])).cpu().numpy()

/clever/volumes/anjiaju-owner/dev/history/zone_hat/hat/models/necks/fast_scnn.py:75: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  input_size = torch.Tensor((x.shape[2], x.shape[3])).cpu().numpy()

/clever/volumes/anjiaju-owner/dev/history/zone_hat/hat/models/necks/fast_scnn.py:75: TracerWarning: Converting a tensor to a NumPy array might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  input_size = torch.Tensor((x.shape[2], x.shape[3])).cpu().numpy()

/usr/local/lib/python3.8/dist-packages/horizon_plugin_pytorch/nn/quantized/functional_modules.py:163: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  per_channel_axis=-1 if self.scale.numel() == 1 else 1,

/usr/local/lib/python3.8/dist-packages/horizon_plugin_pytorch/utils/script_quantized_fn.py:239: UserWarning: operator() profile_node %323 : int = prim::profile_ivalue(%_storage_type)
 does not have profile information (Triggered internally at ../torch/csrc/jit/codegen/cuda/graph_fuser.cpp:105.)
  return compiled_fn(*args, **kwargs)

/usr/local/lib/python3.8/dist-packages/horizon_plugin_pytorch/qtensor.py:296: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert input.q_scale().numel() == 1, (

/root/.local/lib/python3.8/site-packages/torch/jit/_trace.py:976: TracerWarning: Encountering a list at the output of the tracer might cause the trace to be incorrect, this is only valid if the container structure does not change based on the module's inputs. Consider using a constant container instead (e.g. for `list`, use a `tuple` instead. for `dict`, use a `NamedTuple` instead). If you absolutely need this and know the side effects, pass strict=False to trace() to allow this behavior.
  module._c._create_method_from_trace(

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/horizon_plugin_pytorch/qtensor.py", line 1204, in __torch_function__
    return wrapped_func(func, types, args, kwargs)
  File "/usr/local/lib/python3.8/dist-packages/horizon_plugin_pytorch/qtensor.py", line 179, in wrapped_func
    return f(*args, **kwargs)
TypeError: _qtensor_requires_grad_() takes 1 positional argument but 2 were given

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/hbdk/torch_script/tools.py", line 72, in _trace_module
    traced_module = trace(module, example_inputs)
  File "/usr/local/lib/python3.8/dist-packages/hbdk/torch_script/tools.py", line 63, in trace
    return torch.jit.trace(obj, placeholders)
  File "/root/.local/lib/python3.8/site-packages/torch/jit/_trace.py", line 759, in trace
    return trace_module(
  File "/root/.local/lib/python3.8/site-packages/torch/jit/_trace.py", line 1001, in trace_module
    _check_trace(
  File "/root/.local/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/root/.local/lib/python3.8/site-packages/torch/jit/_trace.py", line 329, in _check_trace
    copied_dict[name] = _clone_inputs(data)
  File "/root/.local/lib/python3.8/site-packages/torch/jit/_trace.py", line 160, in _clone_inputs
    return function._nested_map(
  File "/root/.local/lib/python3.8/site-packages/torch/autograd/function.py", line 486, in _map
    return type(obj)(mapped)
  File "/root/.local/lib/python3.8/site-packages/torch/autograd/function.py", line 482, in <genexpr>
    mapped = (_map(x) for x in obj)
  File "/root/.local/lib/python3.8/site-packages/torch/autograd/function.py", line 488, in _map
    return {x : _map(obj) for x in obj}
  File "/root/.local/lib/python3.8/site-packages/torch/autograd/function.py", line 488, in <dictcomp>
    return {x : _map(obj) for x in obj}
  File "/root/.local/lib/python3.8/site-packages/torch/autograd/function.py", line 478, in _map
    return fn(obj)
  File "/root/.local/lib/python3.8/site-packages/torch/jit/_trace.py", line 150, in clone_input
    a.detach()
  File "/usr/local/lib/python3.8/dist-packages/horizon_plugin_pytorch/qtensor.py", line 1206, in __torch_function__
    raise type(e)(
  File "/usr/local/lib/python3.8/dist-packages/horizon_plugin_pytorch/qtensor.py", line 1204, in __torch_function__
    return wrapped_func(func, types, args, kwargs)
  File "/usr/local/lib/python3.8/dist-packages/horizon_plugin_pytorch/qtensor.py", line 179, in wrapped_func
    return f(*args, **kwargs)
TypeError: _qtensor_requires_grad_() takes 1 positional argument but 2 were given
when calling function <method 'requires_grad_' of 'torch._C._TensorBase' objects>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "tools/compile_perf.py", line 229, in <module>
    compile_then_perf(
  File "tools/compile_perf.py", line 133, in compile_then_perf
    result = perf_model(
  File "/usr/local/lib/python3.8/dist-packages/hbdk/torch_script/tools.py", line 360, in perf_model
    traced_module = _trace_module(module, example_inputs)
  File "/usr/local/lib/python3.8/dist-packages/hbdk/torch_script/tools.py", line 75, in _trace_module
    raise RuntimeError(
RuntimeError: torch.jit.trace fail. Please make sure the model is traceable.

Hi, judging from the error message, _qtensor_requires_grad_ on the model's QTensor is being passed the wrong number of arguments. Please try, following the code paths shown in the error message, adding a print for the failing function inside qtensor.py.
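For reference, here is a minimal sketch of the kind of temporary debug print that could be added at the dispatch point the traceback points to (qtensor.py, wrapped_func around line 179). The function body below is paraphrased from the traceback rather than copied from the plugin source, and `f` stands for the handler that the plugin has already resolved at that point; the only addition is the print.

# Hedged sketch: a temporary debug print inside
# horizon_plugin_pytorch/qtensor.py, in wrapped_func (around line 179).
# The body paraphrases what the traceback shows; `f` is the handler the
# plugin resolves for the intercepted torch function.
def wrapped_func(func, types, args, kwargs):
    # Log every torch function dispatched onto a QTensor, together with the
    # resolved handler and the argument types, so the requires_grad_ call
    # that arrives with an unexpected second positional argument shows up
    # clearly in the output.
    print(
        "QTensor dispatch:", getattr(func, "__name__", func),
        "-> handler:", getattr(f, "__name__", f),
        "| arg types:", [type(a).__name__ for a in args],
        "| kwargs:", kwargs,
    )
    return f(*args, **kwargs)

With this print in place, rerunning compile_perf.py should show which call site forwards the extra positional argument to _qtensor_requires_grad_, which should make the cause of the trace failure easier to pin down.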