Problem using the ERNIE reading comprehension model on AI Studio

andwhatandwhat, posted 2020-03, views: 7161, replies: 5

On AI Studio, I downloaded the ERNIE 1.0 Chinese Base model (max_len=512) and the data.

Running sh script/zh_task/ernie_base/run_drcd.sh fails with the following error:

export TASK_DATA_PATH=/task_data
+ export MODEL_PATH=/model
+ sh script/zh_task/ernie_base/run_drcd.sh
+ export FLAGS_eager_delete_tensor_gb=0
+ export FLAGS_sync_nccl_allreduce=1
+ export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
+ export PYTHONPATH=./ernie:
+ hostname -i
+ python ./ernie/finetune_launch.py --nproc_per_node 8 --selected_gpus 0,1,2,3,4,5,6,7 --node_ips 172.21.15.60 --node_id 0 ./ernie/run_mrc.py --use_cuda true --batch_size 16 --in_tokens false --use_fast_executor true --checkpoints ./checkpoints --vocab_path /model/vocab.txt --ernie_config_path /model/ernie_config.json --do_train true --do_val true --do_test true --verbose true --save_steps 1000 --validation_steps 100 --warmup_proportion 0.0 --weight_decay 0.01 --epoch 2 --max_seq_len 512 --do_lower_case true --doc_stride 128 --train_set /task_data/drcd/train.json --dev_set /task_data/drcd/dev.json --test_set /task_data/drcd/test.json --learning_rate 5e-5 --num_iteration_per_drop_scope 1 --init_pretraining_params /model/params --skip_steps 10
2020-03-10 22:43:27,051-INFO: init model: /model/params
[INFO] 2020-03-10 22:43:27,051 [finetune_launch.py:  185]:	init model: /model/params
syntax error at (eval 1) line 1, near "."
2020-03-10 22:43:27,084-INFO: -----------  Configuration Arguments -----------
[INFO] 2020-03-10 22:43:27,084 [     args.py:   68]:	-----------  Configuration Arguments -----------
2020-03-10 22:43:27,085-INFO: current_node_ip: None
[INFO] 2020-03-10 22:43:27,085 [     args.py:   70]:	current_node_ip: None
2020-03-10 22:43:27,085-INFO: log_prefix: 
[INFO] 2020-03-10 22:43:27,085 [     args.py:   70]:	log_prefix: 
2020-03-10 22:43:27,085-INFO: node_id: 0
[INFO] 2020-03-10 22:43:27,085 [     args.py:   70]:	node_id: 0
2020-03-10 22:43:27,085-INFO: node_ips: 172.21.15.60
[INFO] 2020-03-10 22:43:27,085 [     args.py:   70]:	node_ips: 172.21.15.60
2020-03-10 22:43:27,086-INFO: nproc_per_node: 8
[INFO] 2020-03-10 22:43:27,086 [     args.py:   70]:	nproc_per_node: 8
2020-03-10 22:43:27,086-INFO: print_config: True
[INFO] 2020-03-10 22:43:27,086 [     args.py:   70]:	print_config: True
2020-03-10 22:43:27,086-INFO: selected_gpus: 0,1,2,3,4,5,6,7
[INFO] 2020-03-10 22:43:27,086 [     args.py:   70]:	selected_gpus: 0,1,2,3,4,5,6,7
2020-03-10 22:43:27,086-INFO: split_log_path: log
[INFO] 2020-03-10 22:43:27,086 [     args.py:   70]:	split_log_path: log
2020-03-10 22:43:27,086-INFO: training_script: ./ernie/run_mrc.py
[INFO] 2020-03-10 22:43:27,086 [     args.py:   70]:	training_script: ./ernie/run_mrc.py
2020-03-10 22:43:27,087-INFO: training_script_args: ['--use_cuda', 'true', '--batch_size', '16', '--in_tokens', 'false', '--use_fast_executor', 'true', '--checkpoints', './checkpoints', '--vocab_path', '/model/vocab.txt', '--ernie_config_path', '/model/ernie_config.json', '--do_train', 'true', '--do_val', 'true', '--do_test', 'true', '--verbose', 'true', '--save_steps', '1000', '--validation_steps', '100', '--warmup_proportion', '0.0', '--weight_decay', '0.01', '--epoch', '2', '--max_seq_len', '512', '--do_lower_case', 'true', '--doc_stride', '128', '--train_set', '/task_data/drcd/train.json', '--dev_set', '/task_data/drcd/dev.json', '--test_set', '/task_data/drcd/test.json', '--learning_rate', '5e-5', '--num_iteration_per_drop_scope', '1', '--init_pretraining_params', '/model/params', '--skip_steps', '10']
[INFO] 2020-03-10 22:43:27,087 [     args.py:   70]:	training_script_args: ['--use_cuda', 'true', '--batch_size', '16', '--in_tokens', 'false', '--use_fast_executor', 'true', '--checkpoints', './checkpoints', '--vocab_path', '/model/vocab.txt', '--ernie_config_path', '/model/ernie_config.json', '--do_train', 'true', '--do_val', 'true', '--do_test', 'true', '--verbose', 'true', '--save_steps', '1000', '--validation_steps', '100', '--warmup_proportion', '0.0', '--weight_decay', '0.01', '--epoch', '2', '--max_seq_len', '512', '--do_lower_case', 'true', '--doc_stride', '128', '--train_set', '/task_data/drcd/train.json', '--dev_set', '/task_data/drcd/dev.json', '--test_set', '/task_data/drcd/test.json', '--learning_rate', '5e-5', '--num_iteration_per_drop_scope', '1', '--init_pretraining_params', '/model/params', '--skip_steps', '10']
2020-03-10 22:43:27,087-INFO: ------------------------------------------------
[INFO] 2020-03-10 22:43:27,087 [     args.py:   71]:	------------------------------------------------
2020-03-10 22:43:27,087-INFO: 172.21.15.60
[INFO] 2020-03-10 22:43:27,087 [finetune_launch.py:   74]:	172.21.15.60
2020-03-10 22:43:27,088-INFO: all_trainer_endpoints: 172.21.15.60:6170,172.21.15.60:6171,172.21.15.60:6172,172.21.15.60:6173,172.21.15.60:6174,172.21.15.60:6175,172.21.15.60:6176,172.21.15.60:6177, node_id: 0, current_ip: 172.21.15.60, num_nodes: 1, node_ips: [u'172.21.15.60'], gpus_per_proc: 1, selected_gpus_per_proc: [[u'0'], [u'1'], [u'2'], [u'3'], [u'4'], [u'5'], [u'6'], [u'7']], nranks: 8
[INFO] 2020-03-10 22:43:27,088 [finetune_launch.py:  112]:	all_trainer_endpoints: 172.21.15.60:6170,172.21.15.60:6171,172.21.15.60:6172,172.21.15.60:6173,172.21.15.60:6174,172.21.15.60:6175,172.21.15.60:6176,172.21.15.60:6177, node_id: 0, current_ip: 172.21.15.60, num_nodes: 1, node_ips: [u'172.21.15.60'], gpus_per_proc: 1, selected_gpus_per_proc: [[u'0'], [u'1'], [u'2'], [u'3'], [u'4'], [u'5'], [u'6'], [u'7']], nranks: 8
2020-03-10 22:43:27,100-INFO: subprocess launched, check log at log/job.log.0
[INFO] 2020-03-10 22:43:27,100 [finetune_launch.py:  150]:	subprocess launched, check log at log/job.log.0
2020-03-10 22:43:27,110-INFO: subprocess launched, check log at log/job.log.1
[INFO] 2020-03-10 22:43:27,110 [finetune_launch.py:  150]:	subprocess launched, check log at log/job.log.1
2020-03-10 22:43:27,124-INFO: subprocess launched, check log at log/job.log.2
[INFO] 2020-03-10 22:43:27,124 [finetune_launch.py:  150]:	subprocess launched, check log at log/job.log.2
2020-03-10 22:43:27,139-INFO: subprocess launched, check log at log/job.log.3
[INFO] 2020-03-10 22:43:27,139 [finetune_launch.py:  150]:	subprocess launched, check log at log/job.log.3
2020-03-10 22:43:27,150-INFO: subprocess launched, check log at log/job.log.4
[INFO] 2020-03-10 22:43:27,150 [finetune_launch.py:  150]:	subprocess launched, check log at log/job.log.4
2020-03-10 22:43:27,160-INFO: subprocess launched, check log at log/job.log.5
[INFO] 2020-03-10 22:43:27,160 [finetune_launch.py:  150]:	subprocess launched, check log at log/job.log.5
2020-03-10 22:43:27,170-INFO: subprocess launched, check log at log/job.log.6
[INFO] 2020-03-10 22:43:27,170 [finetune_launch.py:  150]:	subprocess launched, check log at log/job.log.6
2020-03-10 22:43:27,182-INFO: subprocess launched, check log at log/job.log.7
[INFO] 2020-03-10 22:43:27,182 [finetune_launch.py:  150]:	subprocess launched, check log at log/job.log.7
Traceback (most recent call last):
  File "./ernie/finetune_launch.py", line 188, in <module>
    main(lanch_args)
  File "./ernie/finetune_launch.py", line 176, in main
    start_procs(args)
  File "./ernie/finetune_launch.py", line 164, in start_procs
    cmd=cmds[i])
subprocess.CalledProcessError: Command '['/opt/conda/envs/python27-paddle120-env/bin/python', u'-u', './ernie/run_mrc.py', '--use_cuda', 'true', '--batch_size', '16', '--in_tokens', 'false', '--use_fast_executor', 'true', '--checkpoints', './checkpoints', '--vocab_path', '/model/vocab.txt', '--ernie_config_path', '/model/ernie_config.json', '--do_train', 'true', '--do_val', 'true', '--do_test', 'true', '--verbose', 'true', '--save_steps', '1000', '--validation_steps', '100', '--warmup_proportion', '0.0', '--weight_decay', '0.01', '--epoch', '2', '--max_seq_len', '512', '--do_lower_case', 'true', '--doc_stride', '128', '--train_set', '/task_data/drcd/train.json', '--dev_set', '/task_data/drcd/dev.json', '--test_set', '/task_data/drcd/test.json', '--learning_rate', '5e-5', '--num_iteration_per_drop_scope', '1', '--init_pretraining_params', '/model/params', '--skip_steps', '10', u'--is_distributed', u'true']' returned non-zero exit status 1
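
The log above shows the script setting CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 and launching 8 processes (--nproc_per_node 8). The post does not say how many GPUs the machine has, so one thing worth ruling out is a mismatch between the GPUs requested and the GPUs present. This is only a possible cause, not a confirmed diagnosis. The sketch below assumes `nvidia-smi` is installed and that `nvidia-smi -L` prints one `GPU N: ...` line per device; the helper names are hypothetical and not part of ERNIE.

```python
from __future__ import print_function

import os
import subprocess


def requested_gpu_ids(value):
    """Parse a CUDA_VISIBLE_DEVICES-style string such as '0,1,2' into a list of IDs."""
    return [item.strip() for item in value.split(",") if item.strip()]


def count_gpus(nvidia_smi_output):
    """Count the 'GPU N: ...' lines in the output of `nvidia-smi -L`."""
    return sum(1 for line in nvidia_smi_output.splitlines()
               if line.startswith("GPU "))


def visible_gpu_count():
    """Return the GPU count reported by nvidia-smi, or None if it cannot be run."""
    try:
        out = subprocess.check_output(["nvidia-smi", "-L"],
                                      universal_newlines=True)
    except (OSError, subprocess.CalledProcessError):
        return None
    return count_gpus(out)


if __name__ == "__main__":
    requested = requested_gpu_ids(os.environ.get("CUDA_VISIBLE_DEVICES", ""))
    available = visible_gpu_count()
    print("requested GPU ids:", requested)
    print("GPUs reported by nvidia-smi:", available)
    if available is not None and len(requested) > available:
        print("More GPUs are requested than the machine reports.")
```

If the machine does report fewer GPUs than the script requests, the values of CUDA_VISIBLE_DEVICES in run_drcd.sh and the --nproc_per_node and --selected_gpus arguments seen in the log would be the places to look.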

5 replies in total; last reply by 蜆煌粔拳茀铣, 2023-10

#12 Daniel_more replied 2021-07

Same question here. I get a similar error when training on my own data. Did the original poster ever solve it?

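#11 Daniel_more replied 2021-07

In reply to #2 阿布军师:

It looks like the error message you provided is incomplete. File "./ernie/finetune_launch.py", line 188: is there a problem with this file?

That file is probably fine. This is simply what happens when you follow the official guide.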

#2 阿布军师 replied 2020-03

It looks like the error message you provided is incomplete.

File "./ernie/finetune_launch.py", line 188: is there a problem with this file?
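
The traceback in the original post only reports that a child process exited with status 1. The launcher output itself says where each child writes its output ("subprocess launched, check log at log/job.log.0" through log/job.log.7), so the underlying error is most likely in those files. A minimal sketch for printing the last lines of each one, assuming the logs sit in a `log` directory under the current working directory; the function names are hypothetical, and the code is written to run on both Python 2.7 and Python 3:

```python
from __future__ import print_function

import glob
import io
import os
from collections import deque


def tail(path, n=20):
    """Return the last n lines of a text file as a list of strings."""
    with io.open(path, "r", encoding="utf-8", errors="replace") as f:
        return list(deque(f, maxlen=n))


def show_job_logs(log_dir="log", n=20):
    """Print the last n lines of every job.log.* file and return the paths found."""
    paths = sorted(glob.glob(os.path.join(log_dir, "job.log.*")))
    for path in paths:
        print("=====", path, "=====")
        for line in tail(path, n):
            print(line.rstrip("\n"))
    return paths


if __name__ == "__main__":
    show_job_logs()
```

Posting the tail of log/job.log.0 along with the launcher output would give readers the actual Python exception raised inside run_mrc.py.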
