How can a multi-label model be evaluated during training? Adding the CPU evaluation fails.
cost = multi_binary_label_cross_entropy(input=output, label=label, name='cost')
outputs(cost)

Evaluator(
    inputs = ['output', 'label'],
    type = 'classification_error',
    name = 'classification_error',
    classification_threshold = 0.5
)
I0419 12:09:26.660202 16286 GradientMachine.cpp:92] Init parameters done.
F0419 12:10:10.975368 16286 Matrix.h:1029] Not implemented
*** Check failure stack trace: ***
@ 0xafe2ed google::LogMessage::Fail()
@ 0xb01d9c google::LogMessage::SendToLog()
@ 0xafdde3 google::LogMessage::Flush()
@ 0xb032ae google::LogMessageFatal::~LogMessageFatal()
@ 0xd4c5fa paddle::Matrix::classificationErrorMulti()
@ 0xba8007 paddle::ClassificationErrorEvaluator::calcError()
@ 0xba833a paddle::ClassificationErrorEvaluator::evalImp()
@ 0xb9b54f paddle::Evaluator::eval()
@ 0xb2b2ba paddle::CombinedEvaluator::eval()
@ 0xb276ce paddle::NeuralNetwork::eval()
@ 0xb45e62 paddle::MultiGradientMachine::eval()
@ 0xdaa5af paddle::TrainerInternal::trainOneBatch()
@ 0xda5a26 paddle::Trainer::trainOneDataBatch()
@ 0xda5f2f paddle::Trainer::trainOnePass()
@ 0xda4652 paddle::Trainer::train()
@ 0xafa224 main
@ 0x318ae1ecdd (unknown)
@ 0xaf9e5d (unknown)
label_dict len : 12258272
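In the GPU multi-label setup, the label input is a sparse type. The code does support evaluating it with classification_error, but the function this evaluator calls, https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/math/Matrix.cpp#L4555, has no GPU implementation, so it cannot be used for evaluation on GPU.
Possible workarounds:
1. You appear to be using a very old version of Paddle. You can use parallel_nn to run this evaluator on the CPU; for usage, see: http://www.paddlepaddle.org/docs/0.9.0/documentation/en/ui/cmd_argument/use_case.html#parallel-nn
2. Skip evaluation during training, save the model, and evaluate it offline on CPU.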
With the GPU version of multi-label, is it possible to run evaluation during training? How should the evaluator be written?
Why doesn't the following work?
Evaluator(
inputs = ['output', 'label'],
type = 'classification_error',
name = 'classification_error',
classification_threshold = 0.5
)
I changed it to:
classification_error_evaluator("ErrorRate", input=output, label=label)
Which fails with:
TypeError: classification_error_evaluator() got multiple values for keyword argument 'input'
Matrix.h:1029] Not implemented
It looks like some computation is not supported.
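The TypeError quoted above is what Python raises when a positional argument and a keyword argument both bind to the same parameter. The following is a minimal sketch of that mechanism, assuming (as the error message suggests) that the helper's first parameter is named input. evaluator_stub and its signature are hypothetical stand-ins for illustration, not Paddle's real helper.

```python
def evaluator_stub(input, label, name=None, threshold=None):
    """Hypothetical stand-in for an evaluator helper whose first parameter is `input`."""
    return {"input": input, "label": label, "name": name, "threshold": threshold}


# Passing the name positionally binds "ErrorRate" to `input`,
# and the explicit input=... keyword then collides with it.
try:
    evaluator_stub("ErrorRate", input="output", label="label")
    message = None
except TypeError as exc:
    message = str(exc)

# Passing everything by keyword avoids the collision.
ok = evaluator_stub(input="output", label="label", name="ErrorRate", threshold=0.5)

print(message)
print(ok["name"])
```

If that assumption holds, passing the name as a keyword argument avoids the clash. The "Not implemented" failure from Matrix.h is a separate problem and is not affected by this change.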
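How does the CPU version evaluate multi-label? That is, how is the accuracy computed?
Does a sample count as correct only if every label is predicted correctly, or is it the Top-1 hit rate of the prediction?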
Evaluator(
inputs = ['output', 'label'],
type = 'classification_error',
name = 'classification_error',
classification_threshold = 0.5
)
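Neither. A classification_threshold is specified here, and each class score is compared against it when computing the error:
(number of non-label classes above the threshold + number of label classes not above the threshold) / total number of classes
Non-label classes above the threshold: wrong predictions, where the sample was assigned a class that is not one of its labels.
Label classes not above the threshold: true labels of the sample that the model failed to predict.
Total number of classes: the total class count, i.e. the width of the output.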
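The formula above can be sketched in plain Python for a single sample. This is an illustration under stated assumptions, not the Paddle implementation: multi_label_error is a hypothetical name, the label is given as a list of class indices (mirroring a sparse label row), and a score exactly equal to the threshold is treated as predicted positive.

```python
def multi_label_error(scores, label_ids, threshold=0.5):
    """Per-sample error: (false positives + false negatives) / number of classes."""
    num_classes = len(scores)
    labels = set(label_ids)
    # Non-label classes whose score reaches the threshold (wrongly predicted).
    false_pos = sum(
        1 for j, s in enumerate(scores) if s >= threshold and j not in labels
    )
    # Label classes whose score stays below the threshold (missed).
    false_neg = sum(1 for j in labels if scores[j] < threshold)
    return (false_pos + false_neg) / float(num_classes)


scores = [0.9, 0.2, 0.7, 0.1]  # one score per class
label_ids = [0, 1]             # true classes of this sample
err = multi_label_error(scores, label_ids)
print(err)  # 0.5: class 2 is a false positive, class 1 is missed, 2 / 4 classes
```

Under this definition a sample is not scored as all-or-nothing, and it is not a Top-1 hit rate either: every class contributes independently, so the per-sample value is the fraction of classes that were decided wrongly.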
The code for this evaluation method:
Paddle/paddle/math/Matrix.cpp
Lines 4555 to 4589 in 2e331c6
void CpuMatrix::classificationErrorMulti(Matrix& output,
                                         Matrix& label,
                                         real threshold) {
  CHECK(dynamic_cast<CpuMatrix*>(&output));
  auto labelPtr = dynamic_cast<CpuSparseMatrix*>(&label);
  CHECK(labelPtr);
  size_t numSamples = getHeight();
  size_t dim = output.getWidth();
  CHECK_EQ(numSamples, output.getHeight());
  CHECK_EQ(numSamples, labelPtr->getHeight());
  CHECK_EQ(dim, labelPtr->getWidth());