Why does DeepSpeech training crash?

Single 1080 Ti GPU, CUDA 8, cuDNN 7.

 

----------- Configuration Arguments -----------
augment_conf_path: conf/augmentation.config
batch_size: 8
dev_manifest: data/aishell/manifest.dev
init_model_path: None
is_local: 1
learning_rate: 0.0005
max_duration: 27.0
mean_std_path: data/aishell/mean_std.npz
min_duration: 0.0
num_conv_layers: 2
num_iter_print: 100
num_passes: 50
num_proc_data: 2
num_rnn_layers: 3
output_model_dir: ./checkpoints/aishell
rnn_layer_size: 2048
share_rnn_weights: 0
shuffle_method: batch_shuffle_clipped
specgram_type: linear
test_off: 0
train_manifest: data/aishell/manifest.train
trainer_count: 1
use_gpu: 1
use_gru: 1
use_sortagrad: 1
vocab_path: data/aishell/vocab.txt
------------------------------------------------
I0821 23:27:54.971693 2754 Util.cpp:166] commandline: --use_gpu=1 --rnn_use_batch=True --log_clipping=True --trainer_count=1
[INFO 2018-08-21 23:27:55,717 layers.py:2716] output for __conv_0__: c = 32, h = 81, w = 54, size = 139968
[INFO 2018-08-21 23:27:55,718 layers.py:3361] output for __batch_norm_0__: c = 32, h = 81, w = 54, size = 139968
[INFO 2018-08-21 23:27:55,718 layers.py:7533] output for __scale_sub_region_0__: c = 32, h = 81, w = 54, size = 139968
[INFO 2018-08-21 23:27:55,719 layers.py:2716] output for __conv_1__: c = 32, h = 41, w = 54, size = 70848
[INFO 2018-08-21 23:27:55,719 layers.py:3361] output for __batch_norm_1__: c = 32, h = 41, w = 54, size = 70848
[INFO 2018-08-21 23:27:55,720 layers.py:7533] output for __scale_sub_region_1__: c = 32, h = 41, w = 54, size = 70848
I0821 23:27:55.757741 2754 GradientMachine.cpp:94] Initing parameters..
I0821 23:28:05.078935 2754 GradientMachine.cpp:101] Init parameters done.
...................................................................................................
Pass: 0, Batch: 100, TrainCost: 48.879078
...................................................................................................
Pass: 0, Batch: 200, TrainCost: 52.050614
...................................................................................................
Pass: 0, Batch: 300, TrainCost: 54.287450
...................................................................................................
Pass: 0, Batch: 400, TrainCost: 57.531792
...................................................................................................
Pass: 0, Batch: 500, TrainCost: 58.881645
...................................................................................................
Pass: 0, Batch: 600, TrainCost: 59.436953
...................................................................................................
Pass: 0, Batch: 700, TrainCost: 60.328327
...................................................................................................
Pass: 0, Batch: 800, TrainCost: 60.804533
...................................................................................................
Pass: 0, Batch: 900, TrainCost: 61.153322
...................................................................................................
Pass: 0, Batch: 1000, TrainCost: 61.622115
...................................................................................................
Pass: 0, Batch: 1100, TrainCost: 62.076225
...................................................................................................
Pass: 0, Batch: 1200, TrainCost: 62.130373
...................................................................................................
Pass: 0, Batch: 1300, TrainCost: 62.816035
...................................................................................................
Pass: 0, Batch: 1400, TrainCost: 61.784479
...................................................................................................
Pass: 0, Batch: 1500, TrainCost: 62.960338
...................................................................................................
Pass: 0, Batch: 1600, TrainCost: 62.670135
...................................................................................................
Pass: 0, Batch: 1700, TrainCost: 62.893296
...................................................................................................
Pass: 0, Batch: 1800, TrainCost: 61.601202
...................................................................................................
Pass: 0, Batch: 1900, TrainCost: 58.109036
...................................................................................................
Pass: 0, Batch: 2000, TrainCost: 52.317494
...................................................................................................
Pass: 0, Batch: 2100, TrainCost: 50.390086
...................................................................................................
Pass: 0, Batch: 2200, TrainCost: 44.981347
...................................................................................................
Pass: 0, Batch: 2300, TrainCost: 43.375281
...................................................................................................
Pass: 0, Batch: 2400, TrainCost: 41.419544
...................................................................................................
Pass: 0, Batch: 2500, TrainCost: 40.002826
...................................................................................................
Pass: 0, Batch: 2600, TrainCost: 37.328278
...................................................................................................
Pass: 0, Batch: 2700, TrainCost: 36.365238
...................................................................................................
Pass: 0, Batch: 2800, TrainCost: 33.527406
...................................................................................................
Pass: 0, Batch: 2900, TrainCost: 32.365128
...................................................................................................
Pass: 0, Batch: 3000, TrainCost: 32.271214
...................................................................................................
Pass: 0, Batch: 3100, TrainCost: 30.543637
...................................................................................................
Pass: 0, Batch: 3200, TrainCost: 31.396257
...................................................................................................
Pass: 0, Batch: 3300, TrainCost: 28.940030
...................................................................................................
Pass: 0, Batch: 3400, TrainCost: 28.019823
...................................................................................................
Pass: 0, Batch: 3500, TrainCost: 29.050141
...................................................................................................
Pass: 0, Batch: 3600, TrainCost: 28.669284
...................................................................................................
Pass: 0, Batch: 3700, TrainCost: 27.615314
...................................................................................................
Pass: 0, Batch: 3800, TrainCost: 27.171779
...................................................................................................
Pass: 0, Batch: 3900, TrainCost: 26.071601
...................................................................................................
Pass: 0, Batch: 4000, TrainCost: 26.146597
...................................................................................................
Pass: 0, Batch: 4100, TrainCost: 25.027937
...................................................................................................
Pass: 0, Batch: 4200, TrainCost: 24.738099
...................................................................................................
Pass: 0, Batch: 4300, TrainCost: 24.274302
...................................................................................................
Pass: 0, Batch: 4400, TrainCost: 23.047921
...................................................................................................
Pass: 0, Batch: 4500, TrainCost: 24.859896
...................................................................................................
Pass: 0, Batch: 4600, TrainCost: 24.143398
...................................................................................................
Pass: 0, Batch: 4700, TrainCost: 23.236050
...................................................................................................
Pass: 0, Batch: 4800, TrainCost: 22.991473
...................................................................................................
Pass: 0, Batch: 4900, TrainCost: 22.974143
...................................................................................................
Pass: 0, Batch: 5000, TrainCost: 23.039866
...................................................................................................
Pass: 0, Batch: 5100, TrainCost: 23.963588
...................................................................................................
Pass: 0, Batch: 5200, TrainCost: 23.761914
...................................................................................................
Pass: 0, Batch: 5300, TrainCost: 21.814489
...................................................................................................
Pass: 0, Batch: 5400, TrainCost: 21.643347
...................................................................................................
Pass: 0, Batch: 5500, TrainCost: 22.959213
...................................................................................................
Pass: 0, Batch: 5600, TrainCost: 21.787954
...................................................................................................
Pass: 0, Batch: 5700, TrainCost: 21.848434
...................................................................................................
Pass: 0, Batch: 5800, TrainCost: 21.710103
...................................................................................................
Pass: 0, Batch: 5900, TrainCost: 22.876925
...................................................................................................
Pass: 0, Batch: 6000, TrainCost: 23.023977
...................................................................................................
Pass: 0, Batch: 6100, TrainCost: 22.839039
...................................................................................................
Pass: 0, Batch: 6200, TrainCost: 22.518005
...................................................................................................
Pass: 0, Batch: 6300, TrainCost: 23.792581
...................................................................................................
Pass: 0, Batch: 6400, TrainCost: 21.843962
...................................................................................................
Pass: 0, Batch: 6500, TrainCost: 20.767369
...................................................................................................
Pass: 0, Batch: 6600, TrainCost: 21.218098
...................................................................................................
Pass: 0, Batch: 6700, TrainCost: 20.349717
...................................................................................................
Pass: 0, Batch: 6800, TrainCost: 20.137947
...................................................................................................
Pass: 0, Batch: 6900, TrainCost: 20.579745
...................................................................................................
Pass: 0, Batch: 7000, TrainCost: 20.510667
...................................................................................................
Pass: 0, Batch: 7100, TrainCost: 20.100289
...................................................................................................
Pass: 0, Batch: 7200, TrainCost: 20.292129
...................................................................................................
Pass: 0, Batch: 7300, TrainCost: 20.378230
...................................................................................................
Pass: 0, Batch: 7400, TrainCost: 19.452542
...................................................................................................
Pass: 0, Batch: 7500, TrainCost: 20.897582
...................................................................................................
Pass: 0, Batch: 7600, TrainCost: 19.770902
...................................................................................................
Pass: 0, Batch: 7700, TrainCost: 19.958120
...................................................................................................
Pass: 0, Batch: 7800, TrainCost: 21.735318
...................................................................................................
Pass: 0, Batch: 7900, TrainCost: 21.810184
...................................................................................................
Pass: 0, Batch: 8000, TrainCost: 21.521692
...................................................................................................
Pass: 0, Batch: 8100, TrainCost: 21.686873
...................................................................................................
Pass: 0, Batch: 8200, TrainCost: 21.478682
...................................................................................................
Pass: 0, Batch: 8300, TrainCost: 22.821815
...................................................................................................
Pass: 0, Batch: 8400, TrainCost: 26.826910
...................................................................................................
Pass: 0, Batch: 8500, TrainCost: 22.171535
...................................................................................................
Pass: 0, Batch: 8600, TrainCost: 20.660853
...................................................................................................
Pass: 0, Batch: 8700, TrainCost: 21.872783
...................................................................................................
Pass: 0, Batch: 8800, TrainCost: 22.539768
...................................................................................................
Pass: 0, Batch: 8900, TrainCost: 21.445652
...................................................................................................
Pass: 0, Batch: 9000, TrainCost: 21.501793
...................................................................................................
Pass: 0, Batch: 9100, TrainCost: 22.619789
...................................................................................................
Pass: 0, Batch: 9200, TrainCost: 21.212744
...................................................................................................
Pass: 0, Batch: 9300, TrainCost: 22.251114
...................................................................................................
Pass: 0, Batch: 9400, TrainCost: 21.609730
...................................................................................................
Pass: 0, Batch: 9500, TrainCost: 20.934048
...................................................................................................
Pass: 0, Batch: 9600, TrainCost: 21.034606
...................................................................................................
Pass: 0, Batch: 9700, TrainCost: 19.989085
...................................................................................................
Pass: 0, Batch: 9800, TrainCost: 21.536486
...................................................................................................
Pass: 0, Batch: 9900, TrainCost: 19.891194
...................................................................................................
Pass: 0, Batch: 10000, TrainCost: 19.882399
...................................................................................................
Pass: 0, Batch: 10100, TrainCost: 20.286163
...................................................................................................
Pass: 0, Batch: 10200, TrainCost: 19.629309
...................................................................................................
Pass: 0, Batch: 10300, TrainCost: 20.012704
...................................................................................................
Pass: 0, Batch: 10400, TrainCost: 21.024145
...................................................................................................
Pass: 0, Batch: 10500, TrainCost: 20.872763
...................................................................................................
Pass: 0, Batch: 10600, TrainCost: 24.681857
...................................................................................................
Pass: 0, Batch: 10700, TrainCost: 30.042671
...................................................................................................
Pass: 0, Batch: 10800, TrainCost: 24.070756
...................................................................................................
Pass: 0, Batch: 10900, TrainCost: 22.919402
...................................................................................................
Pass: 0, Batch: 11000, TrainCost: 20.995246
...................................................................................................
Pass: 0, Batch: 11100, TrainCost: 21.119481
...................................................................................................
Pass: 0, Batch: 11200, TrainCost: 21.865830
...................................................................................................
Pass: 0, Batch: 11300, TrainCost: 21.092561
...................................................................................................
Pass: 0, Batch: 11400, TrainCost: 20.358502
...................................................................................................
Pass: 0, Batch: 11500, TrainCost: 20.588887
...................................................................................................
Pass: 0, Batch: 11600, TrainCost: 20.577302
...................................................................................................
Pass: 0, Batch: 11700, TrainCost: 19.600526
...................................................................................................
Pass: 0, Batch: 11800, TrainCost: 20.403101
...................................................................................................
Pass: 0, Batch: 11900, TrainCost: 20.533533
...................................................................................................
Pass: 0, Batch: 12000, TrainCost: 20.582964
...................................................................................................
Pass: 0, Batch: 12100, TrainCost: 20.450969
...................................................................................................
Pass: 0, Batch: 12200, TrainCost: 20.513164
...................................................................................................
Pass: 0, Batch: 12300, TrainCost: 19.690470
...................................................................................................
Pass: 0, Batch: 12400, TrainCost: 20.446221
...................................................................................................
Pass: 0, Batch: 12500, TrainCost: 21.356843
...................................................................................................
Pass: 0, Batch: 12600, TrainCost: 21.286460
...................................................................................................
Pass: 0, Batch: 12700, TrainCost: 21.445777
...................................................................................................
Pass: 0, Batch: 12800, TrainCost: 20.311089
...................................................................................................
Pass: 0, Batch: 12900, TrainCost: 21.699734
...................................................................................................
Pass: 0, Batch: 13000, TrainCost: 23.843516
...................................................................................................
Pass: 0, Batch: 13100, TrainCost: 24.096860
...................................................................................................
Pass: 0, Batch: 13200, TrainCost: 23.125042
...................................................................................................
Pass: 0, Batch: 13300, TrainCost: 23.618811
...................................................................................................
Pass: 0, Batch: 13400, TrainCost: 23.282482
...................................................................................................
Pass: 0, Batch: 13500, TrainCost: 21.530445
...................................................................................................
Pass: 0, Batch: 13600, TrainCost: 21.139920
...................................................................................................
Pass: 0, Batch: 13700, TrainCost: 21.672733
...................................................................................................
Pass: 0, Batch: 13800, TrainCost: 22.190933
...................................................................................................
Pass: 0, Batch: 13900, TrainCost: 21.937914
...................................................................................................
Pass: 0, Batch: 14000, TrainCost: 21.792679
...................................................................................................
Pass: 0, Batch: 14100, TrainCost: 21.540702
...................................................................................................
Pass: 0, Batch: 14200, TrainCost: 22.184913
...................................................................................................
Pass: 0, Batch: 14300, TrainCost: 22.972764
...................................................................................................
Pass: 0, Batch: 14400, TrainCost: 22.984130
...................................................................................................
Pass: 0, Batch: 14500, TrainCost: 22.457224
...................................................................................................
Pass: 0, Batch: 14600, TrainCost: 22.847064
...................................................................................................
Pass: 0, Batch: 14700, TrainCost: 22.189152
...................................................................................................
Pass: 0, Batch: 14800, TrainCost: 22.868371
...................................................................................................
Pass: 0, Batch: 14900, TrainCost: 22.652411
...................................................................................................
Pass: 0, Batch: 15000, TrainCost: 24.813406
.............
------- Time: 30543 sec, Pass: 0, ValidationCost: 15.2627369494
...................................................................................................
Pass: 1, Batch: 100, TrainCost: 13.050176
...................................................................................................
Pass: 1, Batch: 200, TrainCost: 13.952498
...................................................................................................
Pass: 1, Batch: 300, TrainCost: 13.800532
...................................................................................................
Pass: 1, Batch: 400, TrainCost: 13.934765
...................................................................................................
Pass: 1, Batch: 500, TrainCost: 14.678977
...................................................................................................
Pass: 1, Batch: 600, TrainCost: 13.538844
...................................................................................................
Pass: 1, Batch: 700, TrainCost: 12.893786
...................................................................................................
Pass: 1, Batch: 800, TrainCost: 12.738690
...................................................................................................
Pass: 1, Batch: 900, TrainCost: 12.565846
...................................................................................................
Pass: 1, Batch: 1000, TrainCost: 13.747320
...................................................................................................
Pass: 1, Batch: 1100, TrainCost: 13.261116
...................................................................................................
Pass: 1, Batch: 1200, TrainCost: 13.436616
...................................................................................................
Pass: 1, Batch: 1300, TrainCost: 13.858294
...................................................................................................
Pass: 1, Batch: 1400, TrainCost: 14.172470
...................................................................................................
Pass: 1, Batch: 1500, TrainCost: 12.979685
...................................................................................................
Pass: 1, Batch: 1600, TrainCost: 13.973018
...................................................................................................
Pass: 1, Batch: 1700, TrainCost: 13.635892
...................................................................................................
Pass: 1, Batch: 1800, TrainCost: 14.531257
...................................................................................................
Pass: 1, Batch: 1900, TrainCost: 13.037960
...................................................................................................
Pass: 1, Batch: 2000, TrainCost: 12.505253
...................................................................................................
Pass: 1, Batch: 2100, TrainCost: 11.832279
...................................................................................................
Pass: 1, Batch: 2200, TrainCost: 13.006873
...................................................................................................
Pass: 1, Batch: 2300, TrainCost: 13.152135
...................................................................................................
Pass: 1, Batch: 2400, TrainCost: 15.219668
...................................................................................................
Pass: 1, Batch: 2500, TrainCost: 12.358140
...................................................................................................
Pass: 1, Batch: 2600, TrainCost: 13.586748
...................................................................................................
Pass: 1, Batch: 2700, TrainCost: 15.858224
...................................................................................................
Pass: 1, Batch: 2800, TrainCost: 13.916366
...................................................................................................
Pass: 1, Batch: 2900, TrainCost: 14.274427
...................................................................................................
Pass: 1, Batch: 3000, TrainCost: 14.265047
...................................................................................................
Pass: 1, Batch: 3100, TrainCost: 14.870578
...................................................................................................
Pass: 1, Batch: 3200, TrainCost: 13.956934
...................................................................................................
Pass: 1, Batch: 3300, TrainCost: 15.495274
...................................................................................................
Pass: 1, Batch: 3400, TrainCost: 14.658154
...................................................................................................
Pass: 1, Batch: 3500, TrainCost: 12.305827
...................................................................................................
Pass: 1, Batch: 3600, TrainCost: 13.750895
...................................................................................................
Pass: 1, Batch: 3700, TrainCost: 13.162479
...................................................................................................
Pass: 1, Batch: 3800, TrainCost: 13.140204
...................................................................................................
Pass: 1, Batch: 3900, TrainCost: 14.472836
...................................................................................................
Pass: 1, Batch: 4000, TrainCost: 13.212756
...................................................................................................
Pass: 1, Batch: 4100, TrainCost: 13.700532
...................................................................................................
Pass: 1, Batch: 4200, TrainCost: 12.579000
...................................................................................................
Pass: 1, Batch: 4300, TrainCost: 12.884265
...................................................................................................
Pass: 1, Batch: 4400, TrainCost: 13.244943
...................................................................................................
Pass: 1, Batch: 4500, TrainCost: 12.842920
...................................................................................................
Pass: 1, Batch: 4600, TrainCost: 12.524387
..*** Aborted at 1534905055 (unix time) try "date -d @1534905055" if you are using GNU date ***
PC: @ 0x0 (unknown)
*** SIGFPE (@0x7f8ceb511ef9) received by PID 2754 (TID 0x7f8d21e4d700) from PID 18446744073362546425; stack trace: ***
@ 0x7f8d21a3c390 (unknown)
@ 0x7f8ceb511ef9 paddle::GpuVectorT<>::getAbsMax()
@ 0x7f8ceb818136 paddle::OptimizerWithGradientClipping::update()
@ 0x7f8ceb7fec95 paddle::SgdThreadUpdater::updateImpl()
@ 0x7f8ceb6bceb1 ParameterUpdater::update()
@ 0x7f8ceb24fea6 _wrap_ParameterUpdater_update
@ 0x4c30ce PyEval_EvalFrameEx
@ 0x4b9ab6 PyEval_EvalCodeEx
@ 0x4c1e6f PyEval_EvalFrameEx
@ 0x4b9ab6 PyEval_EvalCodeEx
@ 0x4c16e7 PyEval_EvalFrameEx
@ 0x4b9ab6 PyEval_EvalCodeEx
@ 0x4c16e7 PyEval_EvalFrameEx
@ 0x4b9ab6 PyEval_EvalCodeEx
@ 0x4c1e6f PyEval_EvalFrameEx
@ 0x4b9ab6 PyEval_EvalCodeEx
@ 0x4c1e6f PyEval_EvalFrameEx
@ 0x4b9ab6 PyEval_EvalCodeEx
@ 0x4eb30f (unknown)
@ 0x4e5422 PyRun_FileExFlags
@ 0x4e3cd6 PyRun_SimpleFileExFlags
@ 0x493ae2 Py_Main
@ 0x7f8d21681830 __libc_start_main
@ 0x4933e9 _start
@ 0x0 (unknown)
Floating point exception (core dumped)
Failed in training!

All comments (9)
夜雨飘零1
#2 Replied 2018-08

The error is a floating-point exception. Try lowering the learning rate and see whether that helps.
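
Before changing the learning rate, it may help to check whether the reported cost was jumping around before the crash. The following is a minimal sketch, assuming the `Pass: N, Batch: N, TrainCost: X` line format shown in the log above. The function name `find_cost_spikes` and the default ratio of 1.2 are illustrative choices, not part of DeepSpeech or Paddle.

```python
import math
import re

# Matches lines such as "Pass: 0, Batch: 8400, TrainCost: 26.826910".
LINE_RE = re.compile(r"Pass: (\d+), Batch: (\d+), TrainCost: (\S+)")


def find_cost_spikes(log_text, ratio=1.2):
    """Return (pass_id, batch_id, cost) for every reported cost that is
    non-finite, or more than `ratio` times the previous reported cost
    within the same pass."""
    spikes = []
    prev_pass, prev_cost = None, None
    for match in LINE_RE.finditer(log_text):
        pass_id = int(match.group(1))
        batch_id = int(match.group(2))
        cost = float(match.group(3))  # float() also accepts "nan" and "inf"
        if not math.isfinite(cost):
            spikes.append((pass_id, batch_id, cost))
        elif prev_pass == pass_id and cost > ratio * prev_cost:
            spikes.append((pass_id, batch_id, cost))
        prev_pass, prev_cost = pass_id, cost
    return spikes
```

Calling `find_cost_spikes(open("train.log").read())` on a saved copy of the log (the file name is hypothetical) lists the batches worth a closer look. A spike is only a hint that training is unstable; it does not by itself confirm the cause of the SIGFPE.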

AIStudio783003
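#3 Replied 2018-08
Quote: "The error is a floating-point exception. Try lowering the learning rate and see whether that helps."

Thank you. One more question: I interrupted training when it reached pass 4. Now I want to resume from that point. How should I set the --init_model_path parameter? I have tried various combinations, but the program always starts from scratch.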

夜雨飘零1
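#4 Replied 2018-08
Quote: "Thank you. One more question: I interrupted training when it reached pass 4. Now I want to resume from that point. How should I set the --init_model_path parameter? I have tried various combinations, but the program always starts from scratch."

If you are on the v2 version, you can do this:

    with open(parameters_path, 'r') as f:
        parameters = paddle.parameters.Parameters.from_tar(f)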
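
To decide which file to use as `parameters_path` in the snippet above, one option is to pick the highest-numbered pass checkpoint in the output directory. This is a minimal sketch: the `params.pass-N.tar.gz` naming is an assumption about how checkpoints are saved, and `latest_checkpoint` is a hypothetical helper, so adjust the pattern to match the files actually present in your `output_model_dir`.

```python
import os
import re

# Assumed checkpoint naming, e.g. "params.pass-3.tar.gz". Change this
# pattern if the files in your output directory are named differently.
CHECKPOINT_RE = re.compile(r"^params\.pass-(\d+)\.tar\.gz$")


def latest_checkpoint(model_dir):
    """Return the path of the highest-numbered pass checkpoint in
    model_dir, or None if no file name matches CHECKPOINT_RE."""
    best_pass, best_path = -1, None
    for name in os.listdir(model_dir):
        match = CHECKPOINT_RE.match(name)
        if match and int(match.group(1)) > best_pass:
            best_pass = int(match.group(1))
            best_path = os.path.join(model_dir, name)
    return best_path
```

For example, `latest_checkpoint("./checkpoints/aishell")` would return the newest matching file under the `output_model_dir` from the configuration above, or `None` if nothing matches. If the checkpoint is gzip-compressed, it presumably has to be opened with `gzip.open` rather than `open` before it is handed to `from_tar`.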
AIStudio783003
#5 Replied 2018-08
Quote: "If you are on the v2 version, you can do this [code]"

Hello, could you be more specific? Do I need to modify train.py? If so, how?

夜雨飘零1
#6 Replied 2018-08
Quote: "Hello, could you be more specific? Do I need to modify train.py? If so, how?"

You can refer to this: https://github.com/yeyupiaoling/LearnPaddle/blob/c4500904615149115535b66a67d3e5d06f8435c4/note3/code/train.py#L28-L30

AIStudio783003
#7 Replied 2018-08
Quote: "You can refer to this: https://github.com/yeyupiaoling/LearnPaddle/blob/c4500904615149115535b66a67d3e5d06f8435c4/note3/code/train.py#L28-L30"

Got it working. Thank you.

周俊316
#8 Replied 2018-08
Quote: "You can refer to this: https://github.com/yeyupiaoling/LearnPaddle/blob/c4500904615149115535b66a67d3e5d06f8435c4/note3/code/train.py#L28-L30"

hot20031103
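#9 Replied 2019-08
Quote: "Thank you. One more question: I interrupted training when it reached pass 4. Now I want to resume from that point. How should I set the --init_model_path parameter? I have tried various combinations, but the program always starts from scratch."

May I ask whether the crash was resolved by lowering the learning rate?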

永允儿
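#10 Replied 2023-07
Quote: "Thank you. One more question: I interrupted training when it reached pass 4. Now I want to resume from that point. How should I set the --init_model_path parameter? I have tried various combinations, but the program always starts from scratch."

May I ask how you changed it at the time? Could you show me the train.py you modified?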
