error
PaddleCV Q&A: image classification, object recognition

I am running the MNIST example with PaddlePaddle on a CentOS 7 server, and the following error appears during training:

*** Aborted at 1545839946 (unix time) try "date -d @1545839946" if you are using GNU date ***
PC: @ 0x0 (unknown)
*** SIGSEGV (@0x50) received by PID 13607 (TID 0x7f99982ca740) from PID 80; stack trace: ***
@ 0x7f9997adb5d0 (unknown)
@ 0x7f99980bfb16 _dl_relocate_object
@ 0x7f99980c85ac dl_open_worker
@ 0x7f99980c3714 _dl_catch_error
@ 0x7f99980c7acb _dl_open
@ 0x7f9997132992 do_dlopen
@ 0x7f99980c3714 _dl_catch_error
@ 0x7f9997132a52 __GI___libc_dlopen_mode
@ 0x7f9997109e45 init
@ 0x7f9997ad8e40 __GI___pthread_once
@ 0x7f9997109f5c __GI___backtrace
@ 0x7f998aa16463 (unknown)
@ 0x7f998aa169a0 (unknown)
@ 0x7f998a9fde1b (unknown)
@ 0x7f9997d2f95c (unknown)
@ 0x7f9997d31d7a PyNumber_Multiply
@ 0x7f9997dc9430 PyEval_EvalFrameEx
@ 0x7f9997d4e2b8 (unknown)
@ 0x7f9997dc7c41 PyEval_EvalFrameEx
@ 0x7f9997d4e2b8 (unknown)
@ 0x7f9997dc7c41 PyEval_EvalFrameEx
@ 0x7f9997d4e2b8 (unknown)
@ 0x7f9997d49d2f (unknown)
@ 0x7f9997dc7c41 PyEval_EvalFrameEx
@ 0x7f9997dcf03d PyEval_EvalCodeEx
@ 0x7f9997dcc53c PyEval_EvalFrameEx
@ 0x7f9997dcf03d PyEval_EvalCodeEx
@ 0x7f9997dcc53c PyEval_EvalFrameEx
@ 0x7f9997dcf03d PyEval_EvalCodeEx
@ 0x7f9997dcf142 PyEval_EvalCode
@ 0x7f9997de857f (unknown)
@ 0x7f9997de973e PyRun_FileExFlags

gzip: stdout: Broken pipe
Segmentation fault

--------------------------------------------------------------------------

Running import paddlepaddle at the Python prompt does not report any illegal-instruction error either.

---------------------------------------------------------

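My GDB version is 7.9.1.

I also checked whether my machine supports the AVX instruction set. The result is as follows.

Using this command: cat /proc/cpuinfo | grep flags | uniq | grep avx --color

The output shows both avx and avx2.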
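
As a cross-check on the grep result, the same test can be done in Python. This is a minimal sketch and not part of the original post: the helper names cpu_flags and avx_support are made up for illustration, and it only reads what the kernel reports in /proc/cpuinfo. Unlike a plain grep for avx, it compares whole flag names, so a flag such as avx512f is not counted as avx.

```python
import os


def cpu_flags(cpuinfo_text):
    """Collect the set of CPU feature flags from /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only the "flags" lines list the CPU feature names
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return flags


def avx_support(cpuinfo_text):
    """Report whether avx and avx2 appear as whole flag names."""
    flags = cpu_flags(cpuinfo_text)
    return {name: name in flags for name in ("avx", "avx2")}


if __name__ == "__main__":
    # /proc/cpuinfo exists on Linux only; skip quietly elsewhere
    if os.path.exists("/proc/cpuinfo"):
        with open("/proc/cpuinfo") as f:
            print(avx_support(f.read()))
```

If both values are True, the CPU advertises AVX and AVX2. That says nothing about how the installed PaddlePaddle package was built, which has to be checked separately.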

 

Here is my training code:

# encoding:utf-8
import os
import sys
import paddle.v2 as paddle
from cnn import convolutional_neural_network

class TestMNIST:
    def __init__(self):
        # Run the model on the CPU with one trainer thread
        paddle.init(use_gpu=False, trainer_count=1)

    # ***************** Build the trainer ********************************
    def get_trainer(self):

        # Get the classifier
        out = convolutional_neural_network()

        # Define the label
        label = paddle.layer.data(name="label",
                                  type=paddle.data_type.integer_value(10))

        # Get the loss function
        cost = paddle.layer.classification_cost(input=out, label=label)

        # Create the parameters
        parameters = paddle.parameters.create(layers=cost)

        """
        Define the optimizer
        learning_rate   step size of each update
        momentum        fraction of the previous update carried over
        regularization  L2 regularization, to reduce overfitting
        """
        optimizer = paddle.optimizer.Momentum(learning_rate=0.1 / 128.0,
                                              momentum=0.9,
                                              regularization=paddle.optimizer.L2Regularization(rate=0.0005 * 128))
        '''
        Create the trainer
        cost             the cost (loss) layer
        parameters       training parameters; newly created here, or loaded from an earlier run
        update_equation  the optimizer
        '''
        trainer = paddle.trainer.SGD(cost=cost,
                                     parameters=parameters,
                                     update_equation=optimizer)
        return trainer

    # ***************** Start training ********************************
    def start_trainer(self):
        # Get the trainer
        trainer = self.get_trainer()

        # Test results per pass; kept outside the handler so they are not reset on every event
        lists = []

        # Define the training event handler
        def event_handler(event):
            if isinstance(event, paddle.event.EndIteration):
                if event.batch_id % 100 == 0:
                    print("\nPass %d, Batch %d, Cost %f, %s" % (
                        event.pass_id, event.batch_id, event.cost, event.metrics))
                else:
                    sys.stdout.write('.')
                    sys.stdout.flush()
            if isinstance(event, paddle.event.EndPass):
                # Save the trained parameters
                model_path = '../model'
                if not os.path.exists(model_path):
                    os.makedirs(model_path)
                # Open in binary mode, since the tar archive is binary data
                with open(model_path + "/model.tar", 'wb') as f:
                    trainer.save_parameter_to_tar(f=f)
                # Evaluate on the test set
                result = trainer.test(reader=paddle.batch(paddle.dataset.mnist.test(), batch_size=128))
                print("\nTest with Pass %d, Cost %f, %s\n" % (event.pass_id, result.cost, result.metrics))
                lists.append((event.pass_id, result.cost, result.metrics['classification_error_evaluator']))

        # Get the training data
        reader = paddle.batch(paddle.reader.shuffle(paddle.dataset.mnist.train(), buf_size=20000),
                              batch_size=128)
        '''
        Start training
        reader         training data
        num_passes     number of training passes
        event_handler  callback for training events, e.g. logging or saving the model
        '''
        trainer.train(reader=reader,
                      num_passes=100,
                      event_handler=event_handler)


if __name__ == "__main__":
    testMNIST = TestMNIST()
    # Start training
    testMNIST.start_trainer()

 

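The network connection is fine.

Could anyone tell me whether the problem is in my program or in my PaddlePaddle installation? I have searched almost everywhere online and cannot find a case that matches mine. Thanks!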
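
One way to narrow down whether the script or the installation is at fault is to run each step in its own child process and see which one is killed by a signal. The following is a minimal sketch under stated assumptions, not a confirmed diagnosis: run_step is a hypothetical helper, the two steps reuse only the calls that appear in the script above, and the "negative return code means killed by a signal" convention applies to POSIX systems such as CentOS.

```python
import subprocess
import sys


def run_step(code):
    """Run a Python snippet in a child process.

    Returns (exit_code, signal_number). signal_number is None unless the
    child was killed by a signal, which subprocess reports on POSIX as a
    negative return code.
    """
    rc = subprocess.call([sys.executable, "-c", code])
    if rc < 0:
        return rc, -rc
    return rc, None


if __name__ == "__main__":
    # Steps taken from the training script; extend with later steps as needed
    steps = [
        ("import", "import paddle.v2"),
        ("init", "import paddle.v2 as paddle; "
                 "paddle.init(use_gpu=False, trainer_count=1)"),
    ]
    for name, code in steps:
        rc, sig = run_step(code)
        print("%s: exit=%s signal=%s" % (name, rc, sig))
```

A step that reports signal=11 (SIGSEGV) without any of the training code having run would point toward the installation or its native libraries rather than the script.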

 

All comments (2)
busyboxs
#2 Replied 2018-12

Which version of PaddlePaddle are you using? Everyone uses the fluid API these days, and you are still on v2.
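
The installed version can be read from the package metadata without importing the library. This is a minimal sketch: installed_version is a hypothetical helper, and the distribution name paddlepaddle is assumed from the package name mentioned in the question. It falls back to pkg_resources on older interpreters such as the Python 2.7 typically used with paddle.v2.

```python
def installed_version(dist_name):
    """Return the installed version of a pip distribution, or None if absent."""
    try:
        from importlib.metadata import version, PackageNotFoundError
    except ImportError:
        # Interpreters older than Python 3.8: use setuptools' pkg_resources
        import pkg_resources
        try:
            return pkg_resources.get_distribution(dist_name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("paddlepaddle: %s" % installed_version("paddlepaddle"))
```

A result of None means no distribution with that name is installed in the interpreter that ran the snippet, so it is worth running it with the same python binary that runs the training script.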

杨光的敌人
#3 Replied 2018-12

So does my server actually support AVX or not?
