Can evaluation be run during GPU training?

lucywsq, posted 2018-05

How do I evaluate a multi-label model during training? Adding the evaluator that works on CPU fails:

cost = multi_binary_label_cross_entropy(input=output, label=label, name='cost')
outputs(cost)

Evaluator(
    inputs = ['output', 'label'],
    type = 'classification_error',
    name = 'classification_error',
    classification_threshold = 0.5
)

I0419 12:09:26.660202 16286 GradientMachine.cpp:92] Init parameters done.
F0419 12:10:10.975368 16286 Matrix.h:1029] Not implemented
*** Check failure stack trace: ***
@ 0xafe2ed google::LogMessage::Fail()
@ 0xb01d9c google::LogMessage::SendToLog()
@ 0xafdde3 google::LogMessage::Flush()
@ 0xb032ae google::LogMessageFatal::~LogMessageFatal()
@ 0xd4c5fa paddle::Matrix::classificationErrorMulti()
@ 0xba8007 paddle::ClassificationErrorEvaluator::calcError()
@ 0xba833a paddle::ClassificationErrorEvaluator::evalImp()
@ 0xb9b54f paddle::Evaluator::eval()
@ 0xb2b2ba paddle::CombinedEvaluator::eval()
@ 0xb276ce paddle::NeuralNetwork::eval()
@ 0xb45e62 paddle::MultiGradientMachine::eval()
@ 0xdaa5af paddle::TrainerInternal::trainOneBatch()
@ 0xda5a26 paddle::Trainer::trainOneDataBatch()
@ 0xda5f2f paddle::Trainer::trainOnePass()
@ 0xda4652 paddle::Trainer::train()
@ 0xafa224 main
@ 0x318ae1ecdd (unknown)
@ 0xaf9e5d (unknown)
label_dict len : 12258272

All comments (5)

AIStudio782998

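#2, replied 2018-05

For multi-label on the GPU build, the label input is a sparse type. The config code does accept classification_error as the evaluator, but the function this evaluator calls, https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/math/Matrix.cpp#L4555 , has no GPU implementation, so it cannot be used for evaluation on the GPU.

Possible workarounds:

1. It looks like you are on a very old Paddle version. You can use parallel_nn to run this evaluator on the CPU. Reference usage: http://www.paddlepaddle.org/docs/0.9.0/documentation/en/ui/cmd_argument/use_case.html#parallel-nn
2. Skip evaluation during training, save the model, and evaluate it offline on the CPU.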

lucywsq

#3, replied 2018-05

For multi-label on the GPU build, is it possible to evaluate during training? How should the evaluation be written?
Why doesn't the following work?
Evaluator(
    inputs = ['output', 'label'],
    type = 'classification_error',
    name = 'classification_error',
    classification_threshold = 0.5
)
I changed it to:
classification_error_evaluator("ErrorRate", input=output, label=label)
and got this error:
TypeError: classification_error_evaluator() got multiple values for keyword argument 'input'
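
The TypeError above is ordinary Python argument binding and can be reproduced without Paddle. The sketch below uses a hypothetical stand-in function. It is not the Paddle helper, and its parameter order (input first, name as a keyword) is an assumption inferred from the error message. The string "ErrorRate" is bound positionally to input, and then input=output tries to bind it a second time.

```python
# Hypothetical stand-in, NOT the real Paddle helper. The parameter order
# (input first, name as a keyword) is an assumption inferred from the
# error message in the post above.
def classification_error_evaluator(input, label, name=None):
    return {"input": input, "label": label, "name": name}


output, label = "output", "label"  # placeholders for the real layers

# "ErrorRate" is bound positionally to `input`; `input=output` then tries
# to bind the same parameter again, which Python rejects.
try:
    classification_error_evaluator("ErrorRate", input=output, label=label)
    message = None
except TypeError as exc:
    message = str(exc)

# Passing the name by keyword avoids the clash.
fixed = classification_error_evaluator(input=output, label=label, name="ErrorRate")
print(message)
print(fixed["name"])
```

If the real helper follows the same parameter order, passing the name by keyword should clear this TypeError. It would not by itself address the missing GPU implementation described in #2.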

AIStudio782998

#4, replied 2018-05

Matrix.h:1029] Not implemented

It looks like some computation is not supported.

lucywsq

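#5, replied 2018-05

How is multi-label evaluated on the CPU build? That is, how is the accuracy computed?
Does a sample count as correct only when every label is right, or is it the Top-1 hit rate of the prediction?
Evaluator(
    inputs = ['output', 'label'],
    type = 'classification_error',
    name = 'classification_error',
    classification_threshold = 0.5
)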

AIStudio782998

#6, replied 2018-05

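Neither. A classification_threshold is specified here, and the error is computed by comparing each output against that threshold:

(number of non-label classes above the threshold + number of label classes not above the threshold) / total number of classes

Non-label classes above the threshold: wrong predictions, where the sample is predicted as a class that is not one of its labels.
Label classes not above the threshold: true classes that the model failed to predict.
Total number of classes: the total class count.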
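
As an illustration of the formula above, here is a minimal pure-Python sketch for a single sample. The function name multi_label_error and the example values are made up for this illustration. Treating a score exactly equal to the threshold as "above" is an assumption, since the description only says "above the threshold".

```python
def multi_label_error(scores, label_ids, threshold=0.5):
    """Per-sample error: (false positives + false negatives) / class count.

    scores    -- one predicted score per class
    label_ids -- indices of the classes that are true labels (sparse form)
    Assumption: a score equal to the threshold counts as predicted.
    """
    predicted = {j for j, s in enumerate(scores) if s >= threshold}
    labels = set(label_ids)
    false_pos = len(predicted - labels)  # non-label classes above the threshold
    false_neg = len(labels - predicted)  # label classes not above the threshold
    return (false_pos + false_neg) / len(scores)


# Four classes, true labels are classes 0 and 1.
scores = [0.9, 0.2, 0.7, 0.1]
error = multi_label_error(scores, [0, 1])
print(error)  # class 2 is a false positive, class 1 is missed: 2 / 4 = 0.5
```

Under this definition a sample is not scored as simply right or wrong. Each class contributes independently, so one missed label out of many classes moves the error only slightly.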
Code for the evaluation method:

Paddle/paddle/math/Matrix.cpp, lines 4555 to 4589 in 2e331c6

void CpuMatrix::classificationErrorMulti(Matrix& output,
                                         Matrix& label,
                                         real threshold) {
  CHECK(dynamic_cast<CpuMatrix*>(&output));
  auto labelPtr = dynamic_cast<CpuSparseMatrix*>(&label);
  CHECK(labelPtr);

  size_t numSamples = getHeight();
  size_t dim = output.getWidth();
  CHECK_EQ(numSamples, output.getHeight());
  CHECK_EQ(numSamples, labelPtr->getHeight());
  CHECK_EQ(dim, labelPtr->getWidth());
