resume_optimizer fails to resume training

sdaheng · March 10, 2023, 05:13

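Hello, please describe the problem you encountered in detail; this helps us locate the issue quickly.

1. Chip model: J5

2. OpenExplorer (天工开物) development kit version: J5_OE_1.1.40

3. Problem area: floating-point model training

4. Detailed problem description: please provide the command you ran and the error message; if possible, attach the model so technical support can reproduce the issue.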

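Training was interrupted once. The documentation says that adding resume_optimizer=True to float_trainer in the config file resumes training.

However, the run reports that the checkpoint cannot be found.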
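For reference, the change described above can be sketched as a config fragment. The keys and values are copied from the config dump in the log further down this post, and every other `float_trainer` key is omitted. `expected_ckpt` is a hypothetical name used only to show the file one might expect a resume to look for; whether the trainer actually reads that file is an assumption, not something confirmed here.

```python
import os

# Default checkpoint directory, as stated in this post.
ckpt_dir = "./tmp_models/ganet_mixvargenet_culane"

# Partial sketch of float_trainer: only keys visible in the logged config.
float_trainer = dict(
    type="distributed_data_parallel_trainer",
    num_epochs=240,
    stop_by="epoch",
    sync_bn=True,
    resume_optimizer=True,  # the flag added to resume the interrupted run
)

# Hypothetical: the "last" checkpoint one might expect a resume to use.
expected_ckpt = os.path.join(ckpt_dir, "float-checkpoint-last.pth.tar")
print(expected_ckpt)
```

Printing the joined path makes it easy to compare against the path named in the "checkpoint not found" error.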

Model: GANet

Command: python3 ./train.py --stage float --config ../configs/lane_pred/ganet/ganet.py

ckpt_dir is unmodified; it is the default ./tmp_models/ganet_mixvargenet_culane

The directory contains many pth.tar files:

```
-rw-r--r-- 1 root root 15M Mar 9 13:43 float-checkpoint-best-d56ba571.pth.tar
-rw-r--r-- 1 root root 15M Mar 9 13:43 float-checkpoint-best.pth.tar
-rw-r--r-- 1 root root 15M Mar 3 15:12 float-checkpoint-epoch-0000-8fdb667a.pth.tar
-rw-r--r-- 1 root root 15M Mar 3 17:34 float-checkpoint-epoch-0000-dfa32c76.pth.tar
-rw-r--r-- 1 root root 15M Mar 3 19:15 float-checkpoint-epoch-0001-f0f36e7d.pth.tar
-rw-r--r-- 1 root root 15M Mar 3 20:51 float-checkpoint-epoch-0002-ded5236d.pth.tar
-rw-r--r-- 1 root root 15M Mar 3 22:31 float-checkpoint-epoch-0003-10365abe.pth.tar
-rw-r--r-- 1 root root 15M Mar 4 00:10 float-checkpoint-epoch-0004-7c65f5ec.pth.tar
-rw-r--r-- 1 root root 15M Mar 4 01:52 float-checkpoint-epoch-0005-d227b05b.pth.tar
-rw-r--r-- 1 root root 15M Mar 4 03:38 float-checkpoint-epoch-0006-2d2cc0e3.pth.tar
-rw-r--r-- 1 root root 15M Mar 4 05:25 float-checkpoint-epoch-0007-d16c64df.pth.tar
-rw-r--r-- 1 root root 15M Mar 4 07:09 float-checkpoint-epoch-0008-ae58fe99.pth.tar
-rw-r--r-- 1 root root 15M Mar 4 08:56 float-checkpoint-epoch-0009-6d37358e.pth.tar
-rw-r--r-- 1 root root 15M Mar 4 10:45 float-checkpoint-epoch-0010-e996e357.pth.tar
-rw-r--r-- 1 root root 15M Mar 4 12:30 float-checkpoint-epoch-0011-62a67660.pth.tar
-rw-r--r-- 1 root root 15M Mar 4 14:15 float-checkpoint-epoch-0012-2321b459.pth.tar
-rw-r--r-- 1 root root 15M Mar 4 16:01 float-checkpoint-epoch-0013-7ecd79b4.pth.tar
-rw-r--r-- 1 root root 15M Mar 4 17:49 float-checkpoint-epoch-0014-5b1cc5b0.pth.tar
-rw-r--r-- 1 root root 15M Mar 4 19:36 float-checkpoint-epoch-0015-b2300e22.pth.tar
-rw-r--r-- 1 root root 15M Mar 4 21:21 float-checkpoint-epoch-0016-b5739d52.pth.tar
-rw-r--r-- 1 root root 15M Mar 4 23:07 float-checkpoint-epoch-0017-3a2f4db8.pth.tar
-rw-r--r-- 1 root root 15M Mar 5 00:51 float-checkpoint-epoch-0018-a5839ed0.pth.tar
-rw-r--r-- 1 root root 15M Mar 5 02:39 float-checkpoint-epoch-0019-b573c03b.pth.tar
-rw-r--r-- 1 root root 15M Mar 5 04:26 float-checkpoint-epoch-0020-785099a3.pth.tar
-rw-r--r-- 1 root root 15M Mar 5 06:12 float-checkpoint-epoch-0021-7c2b52f9.pth.tar
-rw-r--r-- 1 root root 15M Mar 5 07:58 float-checkpoint-epoch-0022-362f345e.pth.tar
-rw-r--r-- 1 root root 15M Mar 5 09:48 float-checkpoint-epoch-0023-4d15b924.pth.tar
-rw-r--r-- 1 root root 15M Mar 5 12:16 float-checkpoint-epoch-0024-8d554bc0.pth.tar
-rw-r--r-- 1 root root 15M Mar 5 14:03 float-checkpoint-epoch-0025-cbc85e5f.pth.tar
-rw-r--r-- 1 root root 15M Mar 5 15:50 float-checkpoint-epoch-0026-1db73ac3.pth.tar
-rw-r--r-- 1 root root 15M Mar 5 17:36 float-checkpoint-epoch-0027-0b53d6db.pth.tar
-rw-r--r-- 1 root root 15M Mar 5 19:23 float-checkpoint-epoch-0028-89001863.pth.tar
-rw-r--r-- 1 root root 15M Mar 5 21:06 float-checkpoint-epoch-0029-bf576ec4.pth.tar
-rw-r--r-- 1 root root 15M Mar 5 22:55 float-checkpoint-epoch-0030-7743ffcc.pth.tar
-rw-r--r-- 1 root root 15M Mar 6 00:39 float-checkpoint-epoch-0031-ae8e5a32.pth.tar
-rw-r--r-- 1 root root 15M Mar 6 02:26 float-checkpoint-epoch-0032-f4b9d28f.pth.tar
-rw-r--r-- 1 root root 15M Mar 6 04:12 float-checkpoint-epoch-0033-ee861fc7.pth.tar
-rw-r--r-- 1 root root 15M Mar 6 05:59 float-checkpoint-epoch-0034-472d9b79.pth.tar
-rw-r--r-- 1 root root 15M Mar 6 07:46 float-checkpoint-epoch-0035-37c2c37c.pth.tar
-rw-r--r-- 1 root root 15M Mar 6 09:32 float-checkpoint-epoch-0036-bb15540e.pth.tar
-rw-r--r-- 1 root root 15M Mar 6 11:19 float-checkpoint-epoch-0037-b4874156.pth.tar
-rw-r--r-- 1 root root 15M Mar 6 13:04 float-checkpoint-epoch-0038-d53c5e4b.pth.tar
-rw-r--r-- 1 root root 15M Mar 6 14:51 float-checkpoint-epoch-0039-51b5525e.pth.tar
-rw-r--r-- 1 root root 15M Mar 6 16:37 float-checkpoint-epoch-0040-137f8511.pth.tar
-rw-r--r-- 1 root root 15M Mar 6 18:23 float-checkpoint-epoch-0041-b3b5b414.pth.tar
-rw-r--r-- 1 root root 15M Mar 6 20:10 float-checkpoint-epoch-0042-10470986.pth.tar
-rw-r--r-- 1 root root 15M Mar 6 21:57 float-checkpoint-epoch-0043-ac25c8a8.pth.tar
-rw-r--r-- 1 root root 15M Mar 6 23:42 float-checkpoint-epoch-0044-ac380098.pth.tar
-rw-r--r-- 1 root root 15M Mar 7 01:32 float-checkpoint-epoch-0045-61237ffc.pth.tar
-rw-r--r-- 1 root root 15M Mar 7 03:19 float-checkpoint-epoch-0046-8547d17a.pth.tar
-rw-r--r-- 1 root root 15M Mar 7 05:09 float-checkpoint-epoch-0047-6b4b4357.pth.tar
-rw-r--r-- 1 root root 15M Mar 7 06:58 float-checkpoint-epoch-0048-e7ca3063.pth.tar
-rw-r--r-- 1 root root 15M Mar 7 08:48 float-checkpoint-epoch-0049-94ff44a1.pth.tar
-rw-r--r-- 1 root root 15M Mar 7 10:38 float-checkpoint-epoch-0050-dbe5e83a.pth.tar
-rw-r--r-- 1 root root 15M Mar 7 12:30 float-checkpoint-epoch-0051-853121f1.pth.tar
-rw-r--r-- 1 root root 15M Mar 7 14:23 float-checkpoint-epoch-0052-916765b4.pth.tar
-rw-r--r-- 1 root root 15M Mar 7 16:12 float-checkpoint-epoch-0053-ee7e2cb7.pth.tar
-rw-r--r-- 1 root root 15M Mar 7 18:05 float-checkpoint-epoch-0054-2f3f88eb.pth.tar
-rw-r--r-- 1 root root 15M Mar 7 19:50 float-checkpoint-epoch-0055-dbc81b96.pth.tar
-rw-r--r-- 1 root root 15M Mar 7 21:39 float-checkpoint-epoch-0056-249a2cc8.pth.tar
-rw-r--r-- 1 root root 15M Mar 7 23:26 float-checkpoint-epoch-0057-589c0209.pth.tar
-rw-r--r-- 1 root root 15M Mar 8 01:14 float-checkpoint-epoch-0058-8521c528.pth.tar
-rw-r--r-- 1 root root 15M Mar 8 03:03 float-checkpoint-epoch-0059-5769a818.pth.tar
-rw-r--r-- 1 root root 15M Mar 8 04:51 float-checkpoint-epoch-0060-a63c1662.pth.tar
-rw-r--r-- 1 root root 15M Mar 8 06:41 float-checkpoint-epoch-0061-fc72bac4.pth.tar
-rw-r--r-- 1 root root 15M Mar 8 08:31 float-checkpoint-epoch-0062-e557302a.pth.tar
-rw-r--r-- 1 root root 15M Mar 8 10:25 float-checkpoint-epoch-0063-6f2b9564.pth.tar
-rw-r--r-- 1 root root 15M Mar 8 12:17 float-checkpoint-epoch-0064-bc8f9efd.pth.tar
-rw-r--r-- 1 root root 15M Mar 8 14:03 float-checkpoint-epoch-0065-b558f00c.pth.tar
-rw-r--r-- 1 root root 15M Mar 8 15:52 float-checkpoint-epoch-0066-94a6cf88.pth.tar
-rw-r--r-- 1 root root 15M Mar 8 17:42 float-checkpoint-epoch-0067-99d63cc9.pth.tar
-rw-r--r-- 1 root root 15M Mar 8 19:31 float-checkpoint-epoch-0068-6198973d.pth.tar
-rw-r--r-- 1 root root 15M Mar 8 21:17 float-checkpoint-epoch-0069-ec1855b7.pth.tar
-rw-r--r-- 1 root root 15M Mar 8 23:05 float-checkpoint-epoch-0070-6b165b99.pth.tar
-rw-r--r-- 1 root root 15M Mar 9 00:55 float-checkpoint-epoch-0071-726c771a.pth.tar
-rw-r--r-- 1 root root 15M Mar 9 02:44 float-checkpoint-epoch-0072-48644ce1.pth.tar
-rw-r--r-- 1 root root 15M Mar 9 04:36 float-checkpoint-epoch-0073-c7e1cd02.pth.tar
-rw-r--r-- 1 root root 15M Mar 9 06:25 float-checkpoint-epoch-0074-6c0d7b2f.pth.tar
-rw-r--r-- 1 root root 15M Mar 9 08:11 float-checkpoint-epoch-0075-21f0d119.pth.tar
-rw-r--r-- 1 root root 15M Mar 9 10:01 float-checkpoint-epoch-0076-175283d4.pth.tar
-rw-r--r-- 1 root root 15M Mar 9 11:50 float-checkpoint-epoch-0077-412aca21.pth.tar
-rw-r--r-- 1 root root 15M Mar 9 13:43 float-checkpoint-epoch-0078-d56ba571.pth.tar
-rw-r--r-- 1 root root 15M Mar 9 15:56 float-checkpoint-epoch-0079-5b3225ac.pth.tar
-rw-r--r-- 1 root root 15M Mar 9 15:56 float-checkpoint-last-5b3225ac.pth.tar
-rw-r--r-- 1 root root 15M Mar 9 15:56 float-checkpoint-last.pth.tar
```
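Since the error says the checkpoint cannot be found even though the files are clearly there, one quick check is to confirm where the relative `ckpt_dir` resolves from the current working directory and which checkpoint is the most recent. The following is a minimal diagnostic sketch using only the Python standard library. The helper names `latest_epoch_checkpoint` and `summarize_ckpt_dir` are hypothetical, and the filename pattern is inferred from the listing above rather than taken from the toolchain's documentation.

```python
import re
from pathlib import Path

# Pattern inferred from the listing: float-checkpoint-epoch-NNNN-<8 hex chars>.pth.tar
EPOCH_PATTERN = re.compile(r"^float-checkpoint-epoch-(\d{4})-[0-9a-f]{8}\.pth\.tar$")


def latest_epoch_checkpoint(filenames):
    """Return (epoch, filename) for the highest-numbered epoch checkpoint, or None."""
    best = None
    for name in filenames:
        match = EPOCH_PATTERN.match(name)
        if match is None:
            continue
        epoch = int(match.group(1))
        if best is None or epoch > best[0]:
            best = (epoch, name)
    return best


def summarize_ckpt_dir(ckpt_dir):
    """Report where ckpt_dir resolves to and which checkpoint files it holds."""
    directory = Path(ckpt_dir)
    exists = directory.is_dir()
    names = sorted(p.name for p in directory.glob("*.pth.tar")) if exists else []
    return {
        "resolved_dir": str(directory.resolve()),
        "dir_exists": exists,
        "has_last": "float-checkpoint-last.pth.tar" in names,
        "latest_epoch": latest_epoch_checkpoint(names),
    }


if __name__ == "__main__":
    # Run from the same directory as train.py so the relative path matches.
    print(summarize_ckpt_dir("./tmp_models/ganet_mixvargenet_culane"))
```

If `resolved_dir` differs from the `ckpt_dir` printed in the training log, the mismatch in working directory would be one candidate explanation for the error; this is a guess to rule out, not a confirmed cause.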

The log is as follows:

```

023-03-10 12:13:14,849 INFO [logger.py:147] Node[0] {‘ConfigVersion’: <enum ‘ConfigVersion’>,

‘March’: <class ‘horizon_plugin_pytorch.march.March’>,

‘MixVarGENetConfig’: <class ‘hat.models.backbones.mixvargenet.MixVarGENetConfig’>,

‘VERSION’: <ConfigVersion.v2: 2>,

‘attn_ratio’: 4,

‘base_lr’: 0.01,

‘batch_size_per_gpu’: 16,

‘bn_kwargs’: {},

‘ckpt_callback’: {‘mode’: ‘max’,

‘monitor_metric_key’: ‘CulaneF1Score’,

‘name_prefix’: ‘float-’,

‘save_dir’: ‘/open_explorer/ddk/samples/ai_toolchain/horizon_model_train_sample/scripts/tools/tmp_models/ganet_mixvargenet_culane/’,

‘strict_match’: True,

‘type’: ‘Checkpoint’},

‘ckpt_dir’: ‘/open_explorer/ddk/samples/ai_toolchain/horizon_model_train_sample/scripts/tools/tmp_models/ganet_mixvargenet_culane/’,

‘collate_2d’: <function collate_2d at 0x7fd9d4540e50>,

‘compile_cfg’: {‘hbm’: ‘/open_explorer/ddk/samples/ai_toolchain/horizon_model_train_sample/scripts/tools/tmp_models/ganet_mixvargenet_culane/compile/model.hbm’,

‘input_source’: [‘pyramid’],

‘layer_details’: True,

‘march’: ‘bayes’,

‘name’: ‘ganet_mixvargenet_culane’,

‘out_dir’: ‘/open_explorer/ddk/samples/ai_toolchain/horizon_model_train_sample/scripts/tools/tmp_models/ganet_mixvargenet_culane/compile’},

‘compile_dir’: ‘/open_explorer/ddk/samples/ai_toolchain/horizon_model_train_sample/scripts/tools/tmp_models/ganet_mixvargenet_culane/compile’,

‘copy’: <module ‘copy’ from ‘/usr/lib/python3.8/copy.py’>,

‘cudnn_benchmark’: True,

‘data_num_workers’: 4,

‘deploy_inputs’: {‘img’: torch.Size([1, 3, 320, 800])},

‘deploy_model’: {‘backbone’: {‘bias’: True,

‘bn_kwargs’: {},

‘disable_quanti_input’: False,

‘include_top’: False,

‘input_channels’: 3,

‘input_sequence_length’: 1,

‘net_config’: [[MixVarGENetConfig(in_channels=32, out_channels=32, head_op=‘mixvarge_f2’, stack_ops=, stride=1, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=32, out_channels=32, head_op=‘mixvarge_f4’, stack_ops=[‘mixvarge_f4’, ‘mixvarge_f4’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=32, out_channels=64, head_op=‘mixvarge_f4’, stack_ops=[‘mixvarge_f4’, ‘mixvarge_f4’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=64, out_channels=96, head_op=‘mixvarge_f2_gb16’, stack_ops=[‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=96, out_channels=160, head_op=‘mixvarge_f2_gb16’, stack_ops=[‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)]],

‘num_classes’: 1000,

‘output_list’: [2, 3, 4],

‘type’: ‘MixVarGENet’},

‘head’: {‘in_channel’: 32, ‘type’: ‘GaNetHead’},

‘neck’: {‘attn_in_channels’: [160],

‘attn_out_channels’: [32],

‘attn_ratios’: [4],

‘fpn_module’: {‘in_channels’: [64, 96, 32],

‘in_strides’: [8, 16, 32],

‘out_channels’: [32, 32, 32],

‘out_strides’: [8, 16, 32],

‘type’: ‘FPN’},

‘pos_shape’: [1, 10, 25],

‘type’: ‘GaNetNeck’},

‘type’: ‘GaNet’},

‘device_ids’: [0],

‘float_predictor’: {‘batch_processor’: {‘batch_transforms’: [{‘rgb_input’: True,

‘type’: ‘BgrToYuv444’},

{‘interface’: ‘Normalize’,

‘mean’: 128.0,

‘std’: 128.0,

‘type’: ‘TorchVisionAdapter’}],

‘loss_collector’: None,

‘need_grad_update’: False,

‘type’: ‘BasicBatchProcessor’},

‘callbacks’: [{‘log_freq’: 1000, ‘type’: ‘StatsMonitor’},

{‘epoch_log_freq’: 1,

‘log_prefix’: ‘ganet_mixvargenet_culane’,

‘metric_update_func’: <function update_metric at 0x7fd9c85a3040>,

‘step_log_freq’: 20,

‘type’: ‘MetricUpdater’}],

‘data_loader’: [{‘batch_size’: 16,

‘collate_fn’: <function collate_2d at 0x7fd9d4540e50>,

‘dataset’: {‘data_path’: ‘/open_explorer/data/ganet/test_lmdb’,

‘to_rgb’: True,

‘transforms’: [{‘size’: [0,

270,

1640,

320],

‘type’: ‘FixedCrop’},

{‘img_scale’: [320,

800],

‘keep_ratio’: False,

‘multiscale_mode’: ‘value’,

‘type’: ‘Resize’},

{‘to_yuv’: False,

‘type’: ‘ToTensor’}],

‘type’: ‘CuLaneDataset’},

‘num_workers’: 4,

‘pin_memory’: True,

‘sampler’: {‘type’: <class ‘torch.utils.data.distributed.DistributedSampler’>},

‘shuffle’: False,

‘type’: <class ‘torch.utils.data.dataloader.DataLoader’>}],

‘device’: None,

‘log_interval’: 50,

‘metrics’: [{‘type’: ‘CulaneF1Score’}],

‘model’: {‘backbone’: {‘bias’: True,

‘bn_kwargs’: {},

‘disable_quanti_input’: False,

‘include_top’: False,

‘input_channels’: 3,

‘input_sequence_length’: 1,

‘net_config’: [[MixVarGENetConfig(in_channels=32, out_channels=32, head_op=‘mixvarge_f2’, stack_ops=, stride=1, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=32, out_channels=32, head_op=‘mixvarge_f4’, stack_ops=[‘mixvarge_f4’, ‘mixvarge_f4’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=32, out_channels=64, head_op=‘mixvarge_f4’, stack_ops=[‘mixvarge_f4’, ‘mixvarge_f4’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=64, out_channels=96, head_op=‘mixvarge_f2_gb16’, stack_ops=[‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=96, out_channels=160, head_op=‘mixvarge_f2_gb16’, stack_ops=[‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)]],

‘num_classes’: 1000,

‘output_list’: [2, 3, 4],

‘type’: ‘MixVarGENet’},

‘head’: {‘in_channel’: 32, ‘type’: ‘GaNetHead’},

‘neck’: {‘attn_in_channels’: [160],

‘attn_out_channels’: [32],

‘attn_ratios’: [4],

‘fpn_module’: {‘in_channels’: [64,

96,

32],

‘in_strides’: [8,

16,

32],

‘out_channels’: [32,

32,

32],

‘out_strides’: [8,

16,

32],

‘type’: ‘FPN’},

‘pos_shape’: [1, 10, 25],

‘type’: ‘GaNetNeck’},

‘post_process’: {‘cluster_thr’: 5,

‘downscale’: 8,

‘kpt_thr’: 0.4,

‘root_thr’: 1,

‘type’: ‘GaNetDecoder’},

‘type’: ‘GaNet’},

‘model_convert_pipeline’: {‘converters’: [{‘checkpoint_path’: ‘/open_explorer/ddk/samples/ai_toolchain/horizon_model_train_sample/scripts/tools/tmp_models/ganet_mixvargenet_culane/float-checkpoint-best.pth.tar’,

‘type’: ‘LoadCheckpoint’}],

‘type’: ‘ModelConvertPipeline’},

‘type’: ‘Predictor’},

‘float_trainer’: {‘batch_processor’: {‘batch_transforms’: [{‘rgb_input’: True,

‘type’: ‘BgrToYuv444’},

{‘interface’: ‘Normalize’,

‘mean’: 128.0,

‘std’: 128.0,

‘type’: ‘TorchVisionAdapter’}],

‘loss_collector’: <function loss_collector at 0x7fd9c858eee0>,

‘need_grad_update’: True,

‘type’: ‘BasicBatchProcessor’},

‘callbacks’: [{‘log_freq’: 1000, ‘type’: ‘StatsMonitor’},

{‘epoch_log_freq’: 1,

‘log_prefix’: ‘loss_ganet_mixvargenet_culane’,

‘metric_update_func’: <function update_loss at 0x7fd9c858ef70>,

‘step_log_freq’: 20,

‘type’: ‘MetricUpdater’},

{‘step_log_interval’: 10,

‘type’: ‘CosLrUpdater’,

‘warmup_by’: ‘epoch’,

‘warmup_len’: 1},

{‘batch_processor’: {‘batch_transforms’: [{‘rgb_input’: True,

‘type’: ‘BgrToYuv444’},

{‘interface’: ‘Normalize’,

‘mean’: 128.0,

‘std’: 128.0,

‘type’: ‘TorchVisionAdapter’}],

‘loss_collector’: None,

‘need_grad_update’: False,

‘type’: ‘BasicBatchProcessor’},

‘callbacks’: [{‘epoch_log_freq’: 1,

‘log_prefix’: ‘ganet_mixvargenet_culane’,

‘metric_update_func’: <function update_metric at 0x7fd9c85a3040>,

‘step_log_freq’: 20,

‘type’: ‘MetricUpdater’}],

‘data_loader’: {‘batch_size’: 16,

‘collate_fn’: <function collate_2d at 0x7fd9d4540e50>,

‘dataset’: {‘data_path’: ‘/open_explorer/data/ganet/test_lmdb’,

‘to_rgb’: True,

‘transforms’: [{‘size’: [0,

270,

1640,

320],

‘type’: ‘FixedCrop’},

{‘img_scale’: [320,

800],

‘keep_ratio’: False,

‘multiscale_mode’: ‘value’,

‘type’: ‘Resize’},

{‘to_yuv’: False,

‘type’: ‘ToTensor’}],

‘type’: ‘CuLaneDataset’},

‘num_workers’: 4,

‘pin_memory’: True,

‘sampler’: {‘type’: <class ‘torch.utils.data.distributed.DistributedSampler’>},

‘shuffle’: False,

‘type’: <class ‘torch.utils.data.dataloader.DataLoader’>},

‘type’: ‘Validation’,

‘val_model’: {‘backbone’: {‘bias’: True,

‘bn_kwargs’: {},

‘disable_quanti_input’: False,

‘include_top’: False,

‘input_channels’: 3,

‘input_sequence_length’: 1,

‘net_config’: [[MixVarGENetConfig(in_channels=32, out_channels=32, head_op=‘mixvarge_f2’, stack_ops=, stride=1, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=32, out_channels=32, head_op=‘mixvarge_f4’, stack_ops=[‘mixvarge_f4’, ‘mixvarge_f4’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=32, out_channels=64, head_op=‘mixvarge_f4’, stack_ops=[‘mixvarge_f4’, ‘mixvarge_f4’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=64, out_channels=96, head_op=‘mixvarge_f2_gb16’, stack_ops=[‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=96, out_channels=160, head_op=‘mixvarge_f2_gb16’, stack_ops=[‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)]],

‘num_classes’: 1000,

‘output_list’: [2,

3,

4],

‘type’: ‘MixVarGENet’},

‘head’: {‘in_channel’: 32,

‘type’: ‘GaNetHead’},

‘neck’: {‘attn_in_channels’: [160],

‘attn_out_channels’: [32],

‘attn_ratios’: [4],

‘fpn_module’: {‘in_channels’: [64,

96,

32],

‘in_strides’: [8,

16,

32],

‘out_channels’: [32,

32,

32],

‘out_strides’: [8,

16,

32],

‘type’: ‘FPN’},

‘pos_shape’: [1,

10,

25],

‘type’: ‘GaNetNeck’},

‘post_process’: {‘cluster_thr’: 5,

‘downscale’: 8,

‘kpt_thr’: 0.4,

‘root_thr’: 1,

‘type’: ‘GaNetDecoder’},

‘type’: ‘GaNet’},

‘val_on_train_end’: True},

{‘mode’: ‘max’,

‘monitor_metric_key’: ‘CulaneF1Score’,

‘name_prefix’: ‘float-’,

‘save_dir’: ‘/open_explorer/ddk/samples/ai_toolchain/horizon_model_train_sample/scripts/tools/tmp_models/ganet_mixvargenet_culane/’,

‘strict_match’: True,

‘type’: ‘Checkpoint’}],

‘data_loader’: {‘batch_size’: 16,

‘collate_fn’: <function collate_2d at 0x7fd9d4540e50>,

‘dataset’: {‘data_path’: ‘/open_explorer/data/ganet/train_lmdb’,

‘to_rgb’: True,

‘transforms’: [{‘size’: [0,

270,

1640,

320],

‘type’: ‘FixedCrop’},

{‘px’: 0.5,

‘py’: 0.0,

‘type’: ‘RandomFlip’},

{‘img_scale’: [320,

800],

‘keep_ratio’: False,

‘multiscale_mode’: ‘value’,

‘type’: ‘Resize’},

{‘p’: 0.7,

‘transforms’: [{‘b_shift_limit’: [-10,

10],

‘g_shift_limit’: [-10,

10],

‘p’: 1.0,

‘r_shift_limit’: [-10,

10],

‘type’: ‘RGBShift’},

{‘hue_range’: [-10,

10],

‘p’: 1.0,

‘sat_range’: [-15,

15],

‘type’: ‘HueSaturationValue’,

‘val_range’: [-10,

10]}],

‘type’: ‘RandomSelectOne’},

{‘max_quality’: 85,

‘min_quality’: 95,

‘p’: 0.2,

‘type’: ‘JPEGCompress’},

{‘p’: 0.2,

‘transforms’: [{‘ksize’: 3,

‘p’: 1.0,

‘type’: ‘MeanBlur’},

{‘ksize’: 3,

‘p’: 1.0,

‘type’: ‘MedianBlur’}],

‘type’: ‘RandomSelectOne’},

{‘brightness_limit’: [-0.2,

0.2],

‘contrast_limit’: [-0.0,

0.0],

‘p’: 0.5,

‘type’: ‘RandomBrightnessContrast’},

{‘border_mode’: 0,

‘interpolation’: 1,

‘p’: 0.6,

‘rotate_limit’: [-10,

10],

‘scale_limit’: [0.8,

1.2],

‘shift_limit’: [-0.1,

0.1],

‘type’: ‘ShiftScaleRotate’},

{‘height’: 320,

‘p’: 0.6,

‘ratio’: [1.7,

2.7],

‘scale’: [0.8,

1.2],

‘type’: ‘RandomResizedCrop’,

‘width’: 800},

{‘to_yuv’: False,

‘type’: ‘ToTensor’}],

‘type’: ‘CuLaneDataset’},

‘num_workers’: 4,

‘pin_memory’: True,

‘sampler’: {‘type’: <class ‘torch.utils.data.distributed.DistributedSampler’>},

‘shuffle’: True,

‘type’: <class ‘torch.utils.data.dataloader.DataLoader’>},

‘device’: None,

‘model’: {‘backbone’: {‘bias’: True,

‘bn_kwargs’: {},

‘disable_quanti_input’: False,

‘include_top’: False,

‘input_channels’: 3,

‘input_sequence_length’: 1,

‘net_config’: [[MixVarGENetConfig(in_channels=32, out_channels=32, head_op=‘mixvarge_f2’, stack_ops=, stride=1, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=32, out_channels=32, head_op=‘mixvarge_f4’, stack_ops=[‘mixvarge_f4’, ‘mixvarge_f4’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=32, out_channels=64, head_op=‘mixvarge_f4’, stack_ops=[‘mixvarge_f4’, ‘mixvarge_f4’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=64, out_channels=96, head_op=‘mixvarge_f2_gb16’, stack_ops=[‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=96, out_channels=160, head_op=‘mixvarge_f2_gb16’, stack_ops=[‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)]],

‘num_classes’: 1000,

‘output_list’: [2, 3, 4],

‘type’: ‘MixVarGENet’},

‘head’: {‘in_channel’: 32, ‘type’: ‘GaNetHead’},

‘losses’: {‘loss_int_offset_reg’: {‘loss_weight’: 1.0,

‘type’: ‘L1Loss’},

‘loss_kpts_cls’: {‘loss_weight’: 1.0,

‘type’: ‘LaneFastFocalLoss’},

‘loss_pts_offset_reg’: {‘loss_weight’: 0.5,

‘type’: ‘L1Loss’},

‘type’: ‘GaNetLoss’},

‘neck’: {‘attn_in_channels’: [160],

‘attn_out_channels’: [32],

‘attn_ratios’: [4],

‘fpn_module’: {‘in_channels’: [64,

96,

32],

‘in_strides’: [8, 16, 32],

‘out_channels’: [32,

32,

32],

‘out_strides’: [8, 16, 32],

‘type’: ‘FPN’},

‘pos_shape’: [1, 10, 25],

‘type’: ‘GaNetNeck’},

‘post_process’: {‘cluster_thr’: 5,

‘downscale’: 8,

‘kpt_thr’: 0.4,

‘root_thr’: 1,

‘type’: ‘GaNetDecoder’},

‘targets’: {‘hm_down_scale’: 8,

‘radius’: 2,

‘type’: ‘GaNetTarget’},

‘type’: ‘GaNet’},

‘num_epochs’: 240,

‘optimizer’: {‘lr’: 0.01,

‘params’: {‘weight’: {‘weight_decay’: 4e-05}},

‘type’: <class ‘torch.optim.adam.Adam’>},

‘resume_optimizer’: True,

‘stop_by’: ‘epoch’,

‘sync_bn’: True,

‘train_metrics’: [{‘type’: ‘LossShow’}],

‘type’: ‘distributed_data_parallel_trainer’,

‘val_metrics’: [{‘type’: ‘CulaneF1Score’}]},

‘hid_dim’: 32,

‘int_infer_predictor’: {‘batch_processor’: {‘batch_transforms’: [{‘rgb_input’: True,

‘type’: ‘BgrToYuv444’},

{‘interface’: ‘Normalize’,

‘mean’: 128.0,

‘std’: 128.0,

‘type’: ‘TorchVisionAdapter’}],

‘loss_collector’: None,

‘need_grad_update’: False,

‘type’: ‘BasicBatchProcessor’},

‘callbacks’: [{‘log_freq’: 1000,

‘type’: ‘StatsMonitor’},

{‘epoch_log_freq’: 1,

‘log_prefix’: ‘ganet_mixvargenet_culane’,

‘metric_update_func’: <function update_metric at 0x7fd9c85a3040>,

‘step_log_freq’: 20,

‘type’: ‘MetricUpdater’}],

‘data_loader’: [{‘batch_size’: 16,

‘collate_fn’: <function collate_2d at 0x7fd9d4540e50>,

‘dataset’: {‘data_path’: ‘/open_explorer/data/ganet/test_lmdb’,

‘to_rgb’: True,

‘transforms’: [{‘size’: [0,

270,

1640,

320],

‘type’: ‘FixedCrop’},

{‘img_scale’: [320,

800],

‘keep_ratio’: False,

‘multiscale_mode’: ‘value’,

‘type’: ‘Resize’},

{‘to_yuv’: False,

‘type’: ‘ToTensor’}],

‘type’: ‘CuLaneDataset’},

‘num_workers’: 4,

‘pin_memory’: True,

‘sampler’: {‘type’: <class ‘torch.utils.data.distributed.DistributedSampler’>},

‘shuffle’: False,

‘type’: <class ‘torch.utils.data.dataloader.DataLoader’>}],

‘device’: None,

‘log_interval’: 50,

‘metrics’: [{‘type’: ‘CulaneF1Score’}],

‘model’: {‘backbone’: {‘bias’: True,

‘bn_kwargs’: {},

‘disable_quanti_input’: False,

‘include_top’: False,

‘input_channels’: 3,

‘input_sequence_length’: 1,

‘net_config’: [[MixVarGENetConfig(in_channels=32, out_channels=32, head_op=‘mixvarge_f2’, stack_ops=, stride=1, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=32, out_channels=32, head_op=‘mixvarge_f4’, stack_ops=[‘mixvarge_f4’, ‘mixvarge_f4’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=32, out_channels=64, head_op=‘mixvarge_f4’, stack_ops=[‘mixvarge_f4’, ‘mixvarge_f4’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=64, out_channels=96, head_op=‘mixvarge_f2_gb16’, stack_ops=[‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=96, out_channels=160, head_op=‘mixvarge_f2_gb16’, stack_ops=[‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)]],

‘num_classes’: 1000,

‘output_list’: [2, 3, 4],

‘type’: ‘MixVarGENet’},

‘head’: {‘in_channel’: 32,

‘type’: ‘GaNetHead’},

‘neck’: {‘attn_in_channels’: [160],

‘attn_out_channels’: [32],

‘attn_ratios’: [4],

‘fpn_module’: {‘in_channels’: [64,

96,

32],

‘in_strides’: [8,

16,

32],

‘out_channels’: [32,

32,

32],

‘out_strides’: [8,

16,

32],

‘type’: ‘FPN’},

‘pos_shape’: [1, 10, 25],

‘type’: ‘GaNetNeck’},

‘post_process’: {‘cluster_thr’: 5,

‘downscale’: 8,

‘kpt_thr’: 0.4,

‘root_thr’: 1,

‘type’: ‘GaNetDecoder’},

‘type’: ‘GaNet’},

‘model_convert_pipeline’: {‘converters’: [{‘type’: ‘Float2QAT’},

{‘checkpoint_path’: ‘/open_explorer/ddk/samples/ai_toolchain/horizon_model_train_sample/scripts/tools/tmp_models/ganet_mixvargenet_culane/qat-checkpoint-best.pth.tar’,

‘type’: ‘LoadCheckpoint’},

{‘type’: ‘QAT2Quantize’}],

‘qat_mode’: ‘fuse_bn’,

‘type’: ‘ModelConvertPipeline’},

‘type’: ‘Predictor’},

‘int_infer_trainer’: {‘batch_processor’: None,

‘callbacks’: [{‘mode’: ‘max’,

‘monitor_metric_key’: ‘CulaneF1Score’,

‘name_prefix’: ‘float-’,

‘save_dir’: ‘/open_explorer/ddk/samples/ai_toolchain/horizon_model_train_sample/scripts/tools/tmp_models/ganet_mixvargenet_culane/’,

‘strict_match’: True,

‘type’: ‘Checkpoint’},

{‘save_dir’: ‘/open_explorer/ddk/samples/ai_toolchain/horizon_model_train_sample/scripts/tools/tmp_models/ganet_mixvargenet_culane/’,

‘trace_inputs’: {‘img’: torch.Size([1, 3, 320, 800])},

‘type’: ‘SaveTraced’}],

‘data_loader’: None,

‘device’: None,

‘model’: {‘backbone’: {‘bias’: True,

‘bn_kwargs’: {},

‘disable_quanti_input’: False,

‘include_top’: False,

‘input_channels’: 3,

‘input_sequence_length’: 1,

‘net_config’: [[MixVarGENetConfig(in_channels=32, out_channels=32, head_op=‘mixvarge_f2’, stack_ops=, stride=1, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=32, out_channels=32, head_op=‘mixvarge_f4’, stack_ops=[‘mixvarge_f4’, ‘mixvarge_f4’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=32, out_channels=64, head_op=‘mixvarge_f4’, stack_ops=[‘mixvarge_f4’, ‘mixvarge_f4’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=64, out_channels=96, head_op=‘mixvarge_f2_gb16’, stack_ops=[‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=96, out_channels=160, head_op=‘mixvarge_f2_gb16’, stack_ops=[‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)]],

‘num_classes’: 1000,

‘output_list’: [2, 3, 4],

‘type’: ‘MixVarGENet’},

‘head’: {‘in_channel’: 32,

‘type’: ‘GaNetHead’},

‘neck’: {‘attn_in_channels’: [160],

‘attn_out_channels’: [32],

‘attn_ratios’: [4],

‘fpn_module’: {‘in_channels’: [64,

96,

32],

‘in_strides’: [8,

16,

32],

‘out_channels’: [32,

32,

32],

‘out_strides’: [8,

16,

32],

‘type’: ‘FPN’},

‘pos_shape’: [1, 10, 25],

‘type’: ‘GaNetNeck’},

‘type’: ‘GaNet’},

‘model_convert_pipeline’: {‘converters’: [{‘type’: ‘Float2QAT’},

{‘checkpoint_path’: ‘/open_explorer/ddk/samples/ai_toolchain/horizon_model_train_sample/scripts/tools/tmp_models/ganet_mixvargenet_culane/qat-checkpoint-best.pth.tar’,

‘type’: ‘LoadCheckpoint’},

{‘type’: ‘QAT2Quantize’}],

‘qat_mode’: ‘fuse_bn’,

‘type’: ‘ModelConvertPipeline’},

‘num_epochs’: 0,

‘optimizer’: None,

‘type’: ‘Trainer’},

‘log_rank_zero_only’: True,

‘loss_collector’: <function loss_collector at 0x7fd9c858eee0>,

‘loss_show_update’: {‘epoch_log_freq’: 1,

‘log_prefix’: ‘loss_ganet_mixvargenet_culane’,

‘metric_update_func’: <function update_loss at 0x7fd9c858ef70>,

‘step_log_freq’: 20,

‘type’: ‘MetricUpdater’},

‘march’: ‘bayes’,

‘metric_updater’: {‘epoch_log_freq’: 1,

‘log_prefix’: ‘ganet_mixvargenet_culane’,

‘metric_update_func’: <function update_metric at 0x7fd9c85a3040>,

‘step_log_freq’: 20,

‘type’: ‘MetricUpdater’},

‘model’: {‘backbone’: {‘bias’: True,

‘bn_kwargs’: {},

‘disable_quanti_input’: False,

‘include_top’: False,

‘input_channels’: 3,

‘input_sequence_length’: 1,

‘net_config’: [[MixVarGENetConfig(in_channels=32, out_channels=32, head_op=‘mixvarge_f2’, stack_ops=, stride=1, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=32, out_channels=32, head_op=‘mixvarge_f4’, stack_ops=[‘mixvarge_f4’, ‘mixvarge_f4’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=32, out_channels=64, head_op=‘mixvarge_f4’, stack_ops=[‘mixvarge_f4’, ‘mixvarge_f4’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=64, out_channels=96, head_op=‘mixvarge_f2_gb16’, stack_ops=[‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=96, out_channels=160, head_op=‘mixvarge_f2_gb16’, stack_ops=[‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)]],

‘num_classes’: 1000,

‘output_list’: [2, 3, 4],

‘type’: ‘MixVarGENet’},

‘head’: {‘in_channel’: 32, ‘type’: ‘GaNetHead’},

‘losses’: {‘loss_int_offset_reg’: {‘loss_weight’: 1.0,

‘type’: ‘L1Loss’},

‘loss_kpts_cls’: {‘loss_weight’: 1.0,

‘type’: ‘LaneFastFocalLoss’},

‘loss_pts_offset_reg’: {‘loss_weight’: 0.5,

‘type’: ‘L1Loss’},

‘type’: ‘GaNetLoss’},

‘neck’: {‘attn_in_channels’: [160],

‘attn_out_channels’: [32],

‘attn_ratios’: [4],

‘fpn_module’: {‘in_channels’: [64, 96, 32],

‘in_strides’: [8, 16, 32],

‘out_channels’: [32, 32, 32],

‘out_strides’: [8, 16, 32],

‘type’: ‘FPN’},

‘pos_shape’: [1, 10, 25],

‘type’: ‘GaNetNeck’},

‘post_process’: {‘cluster_thr’: 5,

‘downscale’: 8,

‘kpt_thr’: 0.4,

‘root_thr’: 1,

‘type’: ‘GaNetDecoder’},

‘targets’: {‘hm_down_scale’: 8, ‘radius’: 2, ‘type’: ‘GaNetTarget’},

‘type’: ‘GaNet’},

‘num_epochs’: 240,

‘os’: <module ‘os’ from ‘/usr/lib/python3.8/os.py’>,

‘qat_predictor’: {‘batch_processor’: {‘batch_transforms’: [{‘rgb_input’: True,

‘type’: ‘BgrToYuv444’},

{‘interface’: ‘Normalize’,

‘mean’: 128.0,

‘std’: 128.0,

‘type’: ‘TorchVisionAdapter’}],

‘loss_collector’: None,

‘need_grad_update’: False,

‘type’: ‘BasicBatchProcessor’},

‘callbacks’: [{‘log_freq’: 1000, ‘type’: ‘StatsMonitor’},

{‘epoch_log_freq’: 1,

‘log_prefix’: ‘ganet_mixvargenet_culane’,

‘metric_update_func’: <function update_metric at 0x7fd9c85a3040>,

‘step_log_freq’: 20,

‘type’: ‘MetricUpdater’}],

‘data_loader’: [{‘batch_size’: 16,

‘collate_fn’: <function collate_2d at 0x7fd9d4540e50>,

‘dataset’: {‘data_path’: ‘/open_explorer/data/ganet/test_lmdb’,

‘to_rgb’: True,

‘transforms’: [{‘size’: [0,

270,

1640,

320],

‘type’: ‘FixedCrop’},

{‘img_scale’: [320,

800],

‘keep_ratio’: False,

‘multiscale_mode’: ‘value’,

‘type’: ‘Resize’},

{‘to_yuv’: False,

‘type’: ‘ToTensor’}],

‘type’: ‘CuLaneDataset’},

‘num_workers’: 4,

‘pin_memory’: True,

‘sampler’: {‘type’: <class ‘torch.utils.data.distributed.DistributedSampler’>},

‘shuffle’: False,

‘type’: <class ‘torch.utils.data.dataloader.DataLoader’>}],

‘device’: None,

‘log_interval’: 50,

‘metrics’: [{‘type’: ‘CulaneF1Score’}],

‘model’: {‘backbone’: {‘bias’: True,

‘bn_kwargs’: {},

‘disable_quanti_input’: False,

‘include_top’: False,

‘input_channels’: 3,

‘input_sequence_length’: 1,

‘net_config’: [[MixVarGENetConfig(in_channels=32, out_channels=32, head_op=‘mixvarge_f2’, stack_ops=, stride=1, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=32, out_channels=32, head_op=‘mixvarge_f4’, stack_ops=[‘mixvarge_f4’, ‘mixvarge_f4’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=32, out_channels=64, head_op=‘mixvarge_f4’, stack_ops=[‘mixvarge_f4’, ‘mixvarge_f4’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=64, out_channels=96, head_op=‘mixvarge_f2_gb16’, stack_ops=[‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=96, out_channels=160, head_op=‘mixvarge_f2_gb16’, stack_ops=[‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)]],

‘num_classes’: 1000,

‘output_list’: [2, 3, 4],

‘type’: ‘MixVarGENet’},

‘head’: {‘in_channel’: 32, ‘type’: ‘GaNetHead’},

‘neck’: {‘attn_in_channels’: [160],

‘attn_out_channels’: [32],

‘attn_ratios’: [4],

‘fpn_module’: {‘in_channels’: [64,

96,

32],

‘in_strides’: [8, 16, 32],

‘out_channels’: [32,

32,

32],

‘out_strides’: [8, 16, 32],

‘type’: ‘FPN’},

‘pos_shape’: [1, 10, 25],

‘type’: ‘GaNetNeck’},

‘post_process’: {‘cluster_thr’: 5,

‘downscale’: 8,

‘kpt_thr’: 0.4,

‘root_thr’: 1,

‘type’: ‘GaNetDecoder’},

‘type’: ‘GaNet’},

‘model_convert_pipeline’: {‘converters’: [{‘type’: ‘Float2QAT’},

{‘checkpoint_path’: ‘/open_explorer/ddk/samples/ai_toolchain/horizon_model_train_sample/scripts/tools/tmp_models/ganet_mixvargenet_culane/qat-checkpoint-best.pth.tar’,

‘type’: ‘LoadCheckpoint’}],

‘qat_mode’: ‘fuse_bn’,

‘type’: ‘ModelConvertPipeline’},

‘type’: ‘Predictor’},

‘qat_trainer’: {‘batch_processor’: {‘batch_transforms’: [{‘rgb_input’: True,

‘type’: ‘BgrToYuv444’},

{‘interface’: ‘Normalize’,

‘mean’: 128.0,

‘std’: 128.0,

‘type’: ‘TorchVisionAdapter’}],

‘loss_collector’: <function loss_collector at 0x7fd9c858eee0>,

‘need_grad_update’: True,

‘type’: ‘BasicBatchProcessor’},

‘callbacks’: [{‘log_freq’: 1000, ‘type’: ‘StatsMonitor’},

{‘epoch_log_freq’: 1,

‘log_prefix’: ‘loss_ganet_mixvargenet_culane’,

‘metric_update_func’: <function update_loss at 0x7fd9c858ef70>,

‘step_log_freq’: 20,

‘type’: ‘MetricUpdater’},

{‘step_log_interval’: 10,

‘type’: ‘CosLrUpdater’},

{‘batch_processor’: {‘batch_transforms’: [{‘rgb_input’: True,

‘type’: ‘BgrToYuv444’},

{‘interface’: ‘Normalize’,

‘mean’: 128.0,

‘std’: 128.0,

‘type’: ‘TorchVisionAdapter’}],

‘loss_collector’: None,

‘need_grad_update’: False,

‘type’: ‘BasicBatchProcessor’},

‘callbacks’: [{‘epoch_log_freq’: 1,

‘log_prefix’: ‘ganet_mixvargenet_culane’,

‘metric_update_func’: <function update_metric at 0x7fd9c85a3040>,

‘step_log_freq’: 20,

‘type’: ‘MetricUpdater’}],

‘data_loader’: {‘batch_size’: 16,

‘collate_fn’: <function collate_2d at 0x7fd9d4540e50>,

‘dataset’: {‘data_path’: ‘/open_explorer/data/ganet/test_lmdb’,

‘to_rgb’: True,

‘transforms’: [{‘size’: [0,

270,

1640,

320],

‘type’: ‘FixedCrop’},

{‘img_scale’: [320,

800],

‘keep_ratio’: False,

‘multiscale_mode’: ‘value’,

‘type’: ‘Resize’},

{‘to_yuv’: False,

‘type’: ‘ToTensor’}],

‘type’: ‘CuLaneDataset’},

‘num_workers’: 4,

‘pin_memory’: True,

‘sampler’: {‘type’: <class ‘torch.utils.data.distributed.DistributedSampler’>},

‘shuffle’: False,

‘type’: <class ‘torch.utils.data.dataloader.DataLoader’>},

‘model_convert_pipeline’: {‘converters’: [{‘type’: ‘Float2QAT’}],

‘qat_mode’: ‘fuse_bn’,

‘type’: ‘ModelConvertPipeline’},

‘type’: ‘Validation’,

‘val_model’: {‘backbone’: {‘bias’: True,

‘bn_kwargs’: {},

‘disable_quanti_input’: False,

‘include_top’: False,

‘input_channels’: 3,

‘input_sequence_length’: 1,

‘net_config’: [[MixVarGENetConfig(in_channels=32, out_channels=32, head_op=‘mixvarge_f2’, stack_ops=, stride=1, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=32, out_channels=32, head_op=‘mixvarge_f4’, stack_ops=[‘mixvarge_f4’, ‘mixvarge_f4’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=32, out_channels=64, head_op=‘mixvarge_f4’, stack_ops=[‘mixvarge_f4’, ‘mixvarge_f4’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=64, out_channels=96, head_op=‘mixvarge_f2_gb16’, stack_ops=[‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=96, out_channels=160, head_op=‘mixvarge_f2_gb16’, stack_ops=[‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)]],

‘num_classes’: 1000,

‘output_list’: [2,

3,

4],

‘type’: ‘MixVarGENet’},

‘head’: {‘in_channel’: 32,

‘type’: ‘GaNetHead’},

‘neck’: {‘attn_in_channels’: [160],

‘attn_out_channels’: [32],

‘attn_ratios’: [4],

‘fpn_module’: {‘in_channels’: [64,

96,

32],

‘in_strides’: [8,

16,

32],

‘out_channels’: [32,

32,

32],

‘out_strides’: [8,

16,

32],

‘type’: ‘FPN’},

‘pos_shape’: [1, 10, 25],

‘type’: ‘GaNetNeck’},

‘post_process’: {‘cluster_thr’: 5,

‘downscale’: 8,

‘kpt_thr’: 0.4,

‘root_thr’: 1,

‘type’: ‘GaNetDecoder’},

‘type’: ‘GaNet’},

‘val_on_train_end’: True},

{‘mode’: ‘max’,

‘monitor_metric_key’: ‘CulaneF1Score’,

‘name_prefix’: ‘float-’,

‘save_dir’: ‘/open_explorer/ddk/samples/ai_toolchain/horizon_model_train_sample/scripts/tools/tmp_models/ganet_mixvargenet_culane/’,

‘strict_match’: True,

‘type’: ‘Checkpoint’}],

‘data_loader’: {‘batch_size’: 16,

‘collate_fn’: <function collate_2d at 0x7fd9d4540e50>,

‘dataset’: {‘data_path’: ‘/open_explorer/data/ganet/train_lmdb’,

‘to_rgb’: True,

‘transforms’: [{‘size’: [0,

270,

1640,

320],

‘type’: ‘FixedCrop’},

{‘px’: 0.5,

‘py’: 0.0,

‘type’: ‘RandomFlip’},

{‘img_scale’: [320,

800],

‘keep_ratio’: False,

‘multiscale_mode’: ‘value’,

‘type’: ‘Resize’},

{‘p’: 0.7,

‘transforms’: [{‘b_shift_limit’: [-10,

10],

‘g_shift_limit’: [-10,

10],

‘p’: 1.0,

‘r_shift_limit’: [-10,

10],

‘type’: ‘RGBShift’},

{‘hue_range’: [-10,

10],

‘p’: 1.0,

‘sat_range’: [-15,

15],

‘type’: ‘HueSaturationValue’,

‘val_range’: [-10,

10]}],

‘type’: ‘RandomSelectOne’},

{‘max_quality’: 85,

‘min_quality’: 95,

‘p’: 0.2,

‘type’: ‘JPEGCompress’},

{‘p’: 0.2,

‘transforms’: [{‘ksize’: 3,

‘p’: 1.0,

‘type’: ‘MeanBlur’},

{‘ksize’: 3,

‘p’: 1.0,

‘type’: ‘MedianBlur’}],

‘type’: ‘RandomSelectOne’},

{‘brightness_limit’: [-0.2,

0.2],

‘contrast_limit’: [-0.0,

0.0],

‘p’: 0.5,

‘type’: ‘RandomBrightnessContrast’},

{‘border_mode’: 0,

‘interpolation’: 1,

‘p’: 0.6,

‘rotate_limit’: [-10,

10],

‘scale_limit’: [0.8,

1.2],

‘shift_limit’: [-0.1,

0.1],

‘type’: ‘ShiftScaleRotate’},

{‘height’: 320,

‘p’: 0.6,

‘ratio’: [1.7,

2.7],

‘scale’: [0.8,

1.2],

‘type’: ‘RandomResizedCrop’,

‘width’: 800},

{‘to_yuv’: False,

‘type’: ‘ToTensor’}],

‘type’: ‘CuLaneDataset’},

‘num_workers’: 4,

‘pin_memory’: True,

‘sampler’: {‘type’: <class ‘torch.utils.data.distributed.DistributedSampler’>},

‘shuffle’: True,

‘type’: <class ‘torch.utils.data.dataloader.DataLoader’>},

‘device’: None,

‘model’: {‘backbone’: {‘bias’: True,

‘bn_kwargs’: {},

‘disable_quanti_input’: False,

‘include_top’: False,

‘input_channels’: 3,

‘input_sequence_length’: 1,

‘net_config’: [[MixVarGENetConfig(in_channels=32, out_channels=32, head_op=‘mixvarge_f2’, stack_ops=, stride=1, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=32, out_channels=32, head_op=‘mixvarge_f4’, stack_ops=[‘mixvarge_f4’, ‘mixvarge_f4’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=32, out_channels=64, head_op=‘mixvarge_f4’, stack_ops=[‘mixvarge_f4’, ‘mixvarge_f4’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=64, out_channels=96, head_op=‘mixvarge_f2_gb16’, stack_ops=[‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=96, out_channels=160, head_op=‘mixvarge_f2_gb16’, stack_ops=[‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)]],

‘num_classes’: 1000,

‘output_list’: [2, 3, 4],

‘type’: ‘MixVarGENet’},

‘head’: {‘in_channel’: 32, ‘type’: ‘GaNetHead’},

‘losses’: {‘loss_int_offset_reg’: {‘loss_weight’: 1.0,

‘type’: ‘L1Loss’},

‘loss_kpts_cls’: {‘loss_weight’: 1.0,

‘type’: ‘LaneFastFocalLoss’},

‘loss_pts_offset_reg’: {‘loss_weight’: 0.5,

‘type’: ‘L1Loss’},

‘type’: ‘GaNetLoss’},

‘neck’: {‘attn_in_channels’: [160],

‘attn_out_channels’: [32],

‘attn_ratios’: [4],

‘fpn_module’: {‘in_channels’: [64, 96, 32],

‘in_strides’: [8, 16, 32],

‘out_channels’: [32, 32, 32],

‘out_strides’: [8, 16, 32],

‘type’: ‘FPN’},

‘pos_shape’: [1, 10, 25],

‘type’: ‘GaNetNeck’},

‘post_process’: {‘cluster_thr’: 5,

‘downscale’: 8,

‘kpt_thr’: 0.4,

‘root_thr’: 1,

‘type’: ‘GaNetDecoder’},

‘targets’: {‘hm_down_scale’: 8,

‘radius’: 2,

‘type’: ‘GaNetTarget’},

‘type’: ‘GaNet’},

‘model_convert_pipeline’: {‘converters’: [{‘checkpoint_path’: ‘/open_explorer/ddk/samples/ai_toolchain/horizon_model_train_sample/scripts/tools/tmp_models/ganet_mixvargenet_culane/float-checkpoint-best.pth.tar’,

‘type’: ‘LoadCheckpoint’},

{‘type’: ‘Float2QAT’}],

‘qat_mode’: ‘fuse_bn’,

‘type’: ‘ModelConvertPipeline’},

‘num_epochs’: 24,

‘optimizer’: {‘lr’: 0.0001,

‘momentum’: 0.9,

‘params’: {‘weight’: {‘weight_decay’: 0.0004}},

‘type’: <class ‘torch.optim.sgd.SGD’>},

‘stop_by’: ‘epoch’,

‘sync_bn’: True,

‘train_metrics’: [{‘type’: ‘LossShow’}],

‘type’: ‘distributed_data_parallel_trainer’,

‘val_metrics’: [{‘type’: ‘CulaneF1Score’}]},

‘qat_val_callback’: {‘batch_processor’: {‘batch_transforms’: [{‘rgb_input’: True,

‘type’: ‘BgrToYuv444’},

{‘interface’: ‘Normalize’,

‘mean’: 128.0,

‘std’: 128.0,

‘type’: ‘TorchVisionAdapter’}],

‘loss_collector’: None,

‘need_grad_update’: False,

‘type’: ‘BasicBatchProcessor’},

‘callbacks’: [{‘epoch_log_freq’: 1,

‘log_prefix’: ‘ganet_mixvargenet_culane’,

‘metric_update_func’: <function update_metric at 0x7fd9c85a3040>,

‘step_log_freq’: 20,

‘type’: ‘MetricUpdater’}],

‘data_loader’: {‘batch_size’: 16,

‘collate_fn’: <function collate_2d at 0x7fd9d4540e50>,

‘dataset’: {‘data_path’: ‘/open_explorer/data/ganet/test_lmdb’,

‘to_rgb’: True,

‘transforms’: [{‘size’: [0,

270,

1640,

320],

‘type’: ‘FixedCrop’},

{‘img_scale’: [320,

800],

‘keep_ratio’: False,

‘multiscale_mode’: ‘value’,

‘type’: ‘Resize’},

{‘to_yuv’: False,

‘type’: ‘ToTensor’}],

‘type’: ‘CuLaneDataset’},

‘num_workers’: 4,

‘pin_memory’: True,

‘sampler’: {‘type’: <class ‘torch.utils.data.distributed.DistributedSampler’>},

‘shuffle’: False,

‘type’: <class ‘torch.utils.data.dataloader.DataLoader’>},

‘model_convert_pipeline’: {‘converters’: [{‘type’: ‘Float2QAT’}],

‘qat_mode’: ‘fuse_bn’,

‘type’: ‘ModelConvertPipeline’},

‘type’: ‘Validation’,

‘val_model’: {‘backbone’: {‘bias’: True,

‘bn_kwargs’: {},

‘disable_quanti_input’: False,

‘include_top’: False,

‘input_channels’: 3,

‘input_sequence_length’: 1,

‘net_config’: [[MixVarGENetConfig(in_channels=32, out_channels=32, head_op=‘mixvarge_f2’, stack_ops=, stride=1, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=32, out_channels=32, head_op=‘mixvarge_f4’, stack_ops=[‘mixvarge_f4’, ‘mixvarge_f4’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=32, out_channels=64, head_op=‘mixvarge_f4’, stack_ops=[‘mixvarge_f4’, ‘mixvarge_f4’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=64, out_channels=96, head_op=‘mixvarge_f2_gb16’, stack_ops=[‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=96, out_channels=160, head_op=‘mixvarge_f2_gb16’, stack_ops=[‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)]],

‘num_classes’: 1000,

‘output_list’: [2, 3, 4],

‘type’: ‘MixVarGENet’},

‘head’: {‘in_channel’: 32,

‘type’: ‘GaNetHead’},

‘neck’: {‘attn_in_channels’: [160],

‘attn_out_channels’: [32],

‘attn_ratios’: [4],

‘fpn_module’: {‘in_channels’: [64,

96,

32],

‘in_strides’: [8,

16,

32],

‘out_channels’: [32,

32,

32],

‘out_strides’: [8,

16,

32],

‘type’: ‘FPN’},

‘pos_shape’: [1, 10, 25],

‘type’: ‘GaNetNeck’},

‘post_process’: {‘cluster_thr’: 5,

‘downscale’: 8,

‘kpt_thr’: 0.4,

‘root_thr’: 1,

‘type’: ‘GaNetDecoder’},

‘type’: ‘GaNet’},

‘val_on_train_end’: True},

‘radius’: 2,

‘seed’: None,

‘stat_callback’: {‘log_freq’: 1000, ‘type’: ‘StatsMonitor’},

‘task_name’: ‘ganet_mixvargenet_culane’,

‘test_model’: {‘backbone’: {‘bias’: True,

‘bn_kwargs’: {},

‘disable_quanti_input’: False,

‘include_top’: False,

‘input_channels’: 3,

‘input_sequence_length’: 1,

‘net_config’: [[MixVarGENetConfig(in_channels=32, out_channels=32, head_op=‘mixvarge_f2’, stack_ops=, stride=1, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=32, out_channels=32, head_op=‘mixvarge_f4’, stack_ops=[‘mixvarge_f4’, ‘mixvarge_f4’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=32, out_channels=64, head_op=‘mixvarge_f4’, stack_ops=[‘mixvarge_f4’, ‘mixvarge_f4’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=64, out_channels=96, head_op=‘mixvarge_f2_gb16’, stack_ops=[‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=96, out_channels=160, head_op=‘mixvarge_f2_gb16’, stack_ops=[‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)]],

‘num_classes’: 1000,

‘output_list’: [2, 3, 4],

‘type’: ‘MixVarGENet’},

‘head’: {‘in_channel’: 32, ‘type’: ‘GaNetHead’},

‘neck’: {‘attn_in_channels’: [160],

‘attn_out_channels’: [32],

‘attn_ratios’: [4],

‘fpn_module’: {‘in_channels’: [64, 96, 32],

‘in_strides’: [8, 16, 32],

‘out_channels’: [32, 32, 32],

‘out_strides’: [8, 16, 32],

‘type’: ‘FPN’},

‘pos_shape’: [1, 10, 25],

‘type’: ‘GaNetNeck’},

‘post_process’: {‘cluster_thr’: 5,

‘downscale’: 8,

‘kpt_thr’: 0.4,

‘root_thr’: 1,

‘type’: ‘GaNetDecoder’},

‘type’: ‘GaNet’},

‘torch’: <module ‘torch’ from ‘/root/.local/lib/python3.8/site-packages/torch/__init__.py’>,

‘trace_callback’: {‘save_dir’: ‘/open_explorer/ddk/samples/ai_toolchain/horizon_model_train_sample/scripts/tools/tmp_models/ganet_mixvargenet_culane/’,

‘trace_inputs’: {‘img’: torch.Size([1, 3, 320, 800])},

‘type’: ‘SaveTraced’},

‘train_batch_processor’: {‘batch_transforms’: [{‘rgb_input’: True,

‘type’: ‘BgrToYuv444’},

{‘interface’: ‘Normalize’,

‘mean’: 128.0,

‘std’: 128.0,

‘type’: ‘TorchVisionAdapter’}],

‘loss_collector’: <function loss_collector at 0x7fd9c858eee0>,

‘need_grad_update’: True,

‘type’: ‘BasicBatchProcessor’},

‘train_data_loader’: {‘batch_size’: 16,

‘collate_fn’: <function collate_2d at 0x7fd9d4540e50>,

‘dataset’: {‘data_path’: ‘/open_explorer/data/ganet/train_lmdb’,

‘to_rgb’: True,

‘transforms’: [{‘size’: [0, 270, 1640, 320],

‘type’: ‘FixedCrop’},

{‘px’: 0.5,

‘py’: 0.0,

‘type’: ‘RandomFlip’},

{‘img_scale’: [320, 800],

‘keep_ratio’: False,

‘multiscale_mode’: ‘value’,

‘type’: ‘Resize’},

{‘p’: 0.7,

‘transforms’: [{‘b_shift_limit’: [-10,

10],

‘g_shift_limit’: [-10,

10],

‘p’: 1.0,

‘r_shift_limit’: [-10,

10],

‘type’: ‘RGBShift’},

{‘hue_range’: [-10,

10],

‘p’: 1.0,

‘sat_range’: [-15,

15],

‘type’: ‘HueSaturationValue’,

‘val_range’: [-10,

10]}],

‘type’: ‘RandomSelectOne’},

{‘max_quality’: 85,

‘min_quality’: 95,

‘p’: 0.2,

‘type’: ‘JPEGCompress’},

{‘p’: 0.2,

‘transforms’: [{‘ksize’: 3,

‘p’: 1.0,

‘type’: ‘MeanBlur’},

{‘ksize’: 3,

‘p’: 1.0,

‘type’: ‘MedianBlur’}],

‘type’: ‘RandomSelectOne’},

{‘brightness_limit’: [-0.2,

0.2],

‘contrast_limit’: [-0.0,

0.0],

‘p’: 0.5,

‘type’: ‘RandomBrightnessContrast’},

{‘border_mode’: 0,

‘interpolation’: 1,

‘p’: 0.6,

‘rotate_limit’: [-10, 10],

‘scale_limit’: [0.8, 1.2],

‘shift_limit’: [-0.1, 0.1],

‘type’: ‘ShiftScaleRotate’},

{‘height’: 320,

‘p’: 0.6,

‘ratio’: [1.7, 2.7],

‘scale’: [0.8, 1.2],

‘type’: ‘RandomResizedCrop’,

‘width’: 800},

{‘to_yuv’: False,

‘type’: ‘ToTensor’}],

‘type’: ‘CuLaneDataset’},

‘num_workers’: 4,

‘pin_memory’: True,

‘sampler’: {‘type’: <class ‘torch.utils.data.distributed.DistributedSampler’>},

‘shuffle’: True,

‘type’: <class ‘torch.utils.data.dataloader.DataLoader’>},

‘train_data_path’: ‘/open_explorer/data/ganet/train_lmdb’,

‘training_step’: ‘float’,

‘update_loss’: <function update_loss at 0x7fd9c858ef70>,

‘update_metric’: <function update_metric at 0x7fd9c85a3040>,

‘val_batch_processor’: {‘batch_transforms’: [{‘rgb_input’: True,

‘type’: ‘BgrToYuv444’},

{‘interface’: ‘Normalize’,

‘mean’: 128.0,

‘std’: 128.0,

‘type’: ‘TorchVisionAdapter’}],

‘loss_collector’: None,

‘need_grad_update’: False,

‘type’: ‘BasicBatchProcessor’},

‘val_callback’: {‘batch_processor’: {‘batch_transforms’: [{‘rgb_input’: True,

‘type’: ‘BgrToYuv444’},

{‘interface’: ‘Normalize’,

‘mean’: 128.0,

‘std’: 128.0,

‘type’: ‘TorchVisionAdapter’}],

‘loss_collector’: None,

‘need_grad_update’: False,

‘type’: ‘BasicBatchProcessor’},

‘callbacks’: [{‘epoch_log_freq’: 1,

‘log_prefix’: ‘ganet_mixvargenet_culane’,

‘metric_update_func’: <function update_metric at 0x7fd9c85a3040>,

‘step_log_freq’: 20,

‘type’: ‘MetricUpdater’}],

‘data_loader’: {‘batch_size’: 16,

‘collate_fn’: <function collate_2d at 0x7fd9d4540e50>,

‘dataset’: {‘data_path’: ‘/open_explorer/data/ganet/test_lmdb’,

‘to_rgb’: True,

‘transforms’: [{‘size’: [0,

270,

1640,

320],

‘type’: ‘FixedCrop’},

{‘img_scale’: [320,

800],

‘keep_ratio’: False,

‘multiscale_mode’: ‘value’,

‘type’: ‘Resize’},

{‘to_yuv’: False,

‘type’: ‘ToTensor’}],

‘type’: ‘CuLaneDataset’},

‘num_workers’: 4,

‘pin_memory’: True,

‘sampler’: {‘type’: <class ‘torch.utils.data.distributed.DistributedSampler’>},

‘shuffle’: False,

‘type’: <class ‘torch.utils.data.dataloader.DataLoader’>},

‘type’: ‘Validation’,

‘val_model’: {‘backbone’: {‘bias’: True,

‘bn_kwargs’: {},

‘disable_quanti_input’: False,

‘include_top’: False,

‘input_channels’: 3,

‘input_sequence_length’: 1,

‘net_config’: [[MixVarGENetConfig(in_channels=32, out_channels=32, head_op=‘mixvarge_f2’, stack_ops=, stride=1, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=32, out_channels=32, head_op=‘mixvarge_f4’, stack_ops=[‘mixvarge_f4’, ‘mixvarge_f4’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=32, out_channels=64, head_op=‘mixvarge_f4’, stack_ops=[‘mixvarge_f4’, ‘mixvarge_f4’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=64, out_channels=96, head_op=‘mixvarge_f2_gb16’, stack_ops=[‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)],

[MixVarGENetConfig(in_channels=96, out_channels=160, head_op=‘mixvarge_f2_gb16’, stack_ops=[‘mixvarge_f2_gb16’, ‘mixvarge_f2_gb16’], stride=2, stack_factor=1, fusion_strides=, extra_downsample_num=0)]],

‘num_classes’: 1000,

‘output_list’: [2, 3, 4],

‘type’: ‘MixVarGENet’},

‘head’: {‘in_channel’: 32, ‘type’: ‘GaNetHead’},

‘neck’: {‘attn_in_channels’: [160],

‘attn_out_channels’: [32],

‘attn_ratios’: [4],

‘fpn_module’: {‘in_channels’: [64,

96,

32],

‘in_strides’: [8,

16,

32],

‘out_channels’: [32,

32,

32],

‘out_strides’: [8,

16,

32],

‘type’: ‘FPN’},

‘pos_shape’: [1, 10, 25],

‘type’: ‘GaNetNeck’},

‘post_process’: {‘cluster_thr’: 5,

‘downscale’: 8,

‘kpt_thr’: 0.4,

‘root_thr’: 1,

‘type’: ‘GaNetDecoder’},

‘type’: ‘GaNet’},

‘val_on_train_end’: True},

‘val_data_loader’: {‘batch_size’: 16,

‘collate_fn’: <function collate_2d at 0x7fd9d4540e50>,

‘dataset’: {‘data_path’: ‘/open_explorer/data/ganet/test_lmdb’,

‘to_rgb’: True,

‘transforms’: [{‘size’: [0, 270, 1640, 320],

‘type’: ‘FixedCrop’},

{‘img_scale’: [320, 800],

‘keep_ratio’: False,

‘multiscale_mode’: ‘value’,

‘type’: ‘Resize’},

{‘to_yuv’: False,

‘type’: ‘ToTensor’}],

‘type’: ‘CuLaneDataset’},

‘num_workers’: 4,

‘pin_memory’: True,

‘sampler’: {‘type’: <class ‘torch.utils.data.distributed.DistributedSampler’>},

‘shuffle’: False,

‘type’: <class ‘torch.utils.data.dataloader.DataLoader’>},

‘val_data_path’: ‘/open_explorer/data/ganet/test_lmdb’}

2023-03-10 12:13:14,851 INFO [logger.py:147] Node[0] ==================================================BEGIN FLOAT STAGE==================================================
2023-03-10 12:13:14,900 INFO [thread_init.py:38] Node[0] init torch_num_thread is `12`,opencv_num_thread is `12`,openblas_num_thread is `12`,mkl_num_thread is `12`,omp_num_thread is `12`,
/usr/local/lib/python3.8/dist-packages/hat/models/embeddings.py:157: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  dim_t = self.temperature ** (2 * (dim_t // 2) / self.num_pos_feats)
2023-03-10 12:13:14,960 ERROR [ddp_trainer.py:363] Node[0] Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/hat/engine/ddp_trainer.py", line 359, in _with_exception
    fn(*args)
  File "/open_explorer/ddk/samples/ai_toolchain/horizon_model_train_sample/scripts/tools/train.py", line 185, in train_entrance
    trainer = build_from_registry(trainer)
  File "/usr/local/lib/python3.8/dist-packages/hat/registry.py", line 236, in build_from_registry
    return _impl(x)
  File "/usr/local/lib/python3.8/dist-packages/hat/registry.py", line 223, in _impl
    obj = build_from_cfg(OBJECT_REGISTRY, x)
  File "/usr/local/lib/python3.8/dist-packages/hat/registry.py", line 98, in build_from_cfg
    instance = obj_cls(**cfg)
  File "/usr/local/lib/python3.8/dist-packages/hat/engine/ddp_trainer.py", line 172, in __init__
    super(DistributedDataParallelTrainer, self).__init__(
  File "/usr/local/lib/python3.8/dist-packages/hat/engine/trainer.py", line 86, in __init__
    super(Trainer, self).__init__(
  File "/usr/local/lib/python3.8/dist-packages/hat/engine/loop_base.py", line 254, in __init__
    assert (
AssertionError: No checkpoint found, double check

ERROR:__main__:launch trainer failed! process 0 terminated with exit code 1
Traceback (most recent call last):
  File "./train.py", line 277, in <module>
    train(
  File "./train.py", line 272, in train
    raise e
  File "./train.py", line 255, in train
    launch(
  File "/usr/local/lib/python3.8/dist-packages/hat/engine/ddp_trainer.py", line 328, in launch
    mp.spawn(
  File "/root/.local/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 230, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/root/.local/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 188, in start_processes
    while not context.join():
  File "/root/.local/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 139, in join
    raise ProcessExitedException(
torch.multiprocessing.spawn.ProcessExitedException: process 0 terminated with exit code 1

```

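颜值即正义 · April 24, 2023 11:13

Thank you for using the Horizon chip algorithm toolchain. We are currently collecting satisfaction feedback and would welcome your response to our questionnaire. Details: https://developer.horizon.ai/forumDetail/146177053698464782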

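颜值即正义 · March 10, 2023 10:18

Hello. To load parameters when resuming training, you also need to add a model_convert_pipeline that loads the checkpoint; resume_optimizer=True on its own is not enough. Try adding the following to float_trainer in the config:

    resume_optimizer=True,
    model_convert_pipeline=dict(
        type="ModelConvertPipeline",
        converters=[
            dict(
                type="LoadCheckpoint",
                checkpoint_path=os.path.join(
                    ckpt_dir, "float-checkpoint-best.pth.tar"
                ),
            ),
        ],
    ),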
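
The reply loads float-checkpoint-best.pth.tar, which may not be the most recent epoch if training was interrupted. As a rough sketch only, the helper below picks the highest-numbered float-checkpoint-epoch-* file from ckpt_dir and builds the same two config entries from it. The names `latest_epoch_checkpoint` and `resume_settings` are hypothetical and not part of HAT, the file-name pattern is inferred from the directory listing in the question, and it is an assumption (not confirmed in this thread) that the per-epoch checkpoints contain everything resume_optimizer needs.

```python
import os
import re
from typing import Optional

# File-name pattern inferred from the directory listing in the question,
# e.g. float-checkpoint-epoch-0015-b2300e22.pth.tar (assumption).
_EPOCH_CKPT = re.compile(r"^float-checkpoint-epoch-(\d+)-[0-9a-f]+\.pth\.tar$")


def latest_epoch_checkpoint(ckpt_dir: str) -> Optional[str]:
    """Hypothetical helper: path of the highest-epoch float checkpoint, or None."""
    candidates = []
    for name in os.listdir(ckpt_dir):
        match = _EPOCH_CKPT.match(name)
        if match:
            path = os.path.join(ckpt_dir, name)
            # The same epoch can appear twice (e.g. after a restart);
            # the modification time breaks the tie in favour of the newer file.
            candidates.append((int(match.group(1)), os.path.getmtime(path), path))
    if not candidates:
        return None
    return max(candidates)[2]


def resume_settings(ckpt_dir: str) -> dict:
    """Hypothetical helper: the two float_trainer entries suggested in the reply."""
    ckpt = latest_epoch_checkpoint(ckpt_dir)
    if ckpt is None:
        # Fall back to the file name used in the reply.
        ckpt = os.path.join(ckpt_dir, "float-checkpoint-best.pth.tar")
    if not os.path.isfile(ckpt):
        # Fail early with a clearer message than the trainer's assertion.
        raise FileNotFoundError(f"no float checkpoint found in {ckpt_dir!r}")
    return dict(
        resume_optimizer=True,
        model_convert_pipeline=dict(
            type="ModelConvertPipeline",
            converters=[dict(type="LoadCheckpoint", checkpoint_path=ckpt)],
        ),
    )
```

In the config file this could be merged into the existing trainer dict, for example `float_trainer = dict(..., **resume_settings(ckpt_dir))`, keeping the other float_trainer entries as they are.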