CrossEntropyLoss() raises an error when weight is specified in an NER task

chengcheng, posted 2021-08

I am training an NER model with PaddleNLP, and the loss function is set to: paddle.nn.loss.CrossEntropyLoss(weight=weight)

where weight is:

weight = paddle.to_tensor(
    np.array(['0.000451536', '0.000451536', '0.000645355', '0.000645355', '0.000336313', '0.000336313', '0.003716059',
              '0.003716059', '0.003580706', '0.003580706', '0.001656642', '0.001656642', '0.002692495', '0.002692495',
              '0.010524314', '0.010524314', '0.012340922', '0.012340922', '0.010868073', '0.010868073', '0.034742916',
              '0.034742916', '0.003124228', '0.003124228', '0.013517166', '0.013517166', '0.022825821', '0.022825821',
              '0.077240946', '0.077240946', '0.024164765', '0.024164765', '0.262151091', '0.262151091', '0.015420652',
              '0.015420652']), dtype='float64')

There are 36 labels in total, and weight also has 36 entries.

Running the training script then produced the following error:

Error: /paddle/paddle/fluid/operators/gather.cu.h:62 Assertion `index_value >= 0 && index_value < input_dims[j]` failed. The index is out of bounds, please check whether the dimensions of index and input meet the requirements. It should be less than [36] and greater or equal to 0, but received [0]
Traceback (most recent call last):
  File "/home/www/.conda/envs/nlp_env/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3418, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "", line 1, in 
    runfile('/data/work/nlp_task/cheng/jz_git/pintu/enent_entity/intelligent_info_second/paddlenlp_version/train.py', wdir='/data/work/nlp_task/cheng/jz_git/pintu/enent_entity/intelligent_info_second/paddlenlp_version')
  File "/root/.pycharm_helpers/pydev/_pydev_bundle/pydev_umd.py", line 197, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
  File "/root/.pycharm_helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/data/work/nlp_task/cheng/jz_git/pintu/enent_entity/intelligent_info_second/paddlenlp_version/train.py", line 161, in 
    loss.backward()
  File "", line 2, in backward
  File "/home/www/.conda/envs/nlp_env/lib/python3.7/site-packages/paddle/fluid/wrapped_decorator.py", line 25, in __impl__
    return wrapped_func(*args, **kwargs)
  File "/home/www/.conda/envs/nlp_env/lib/python3.7/site-packages/paddle/fluid/framework.py", line 225, in __impl__
    return func(*args, **kwargs)
  File "/home/www/.conda/envs/nlp_env/lib/python3.7/site-packages/paddle/fluid/dygraph/varbase_patch_methods.py", line 177, in backward
    self._run_backward(framework._dygraph_tracer(), retain_graph)
OSError: (External)  Cudnn error, CUDNN_STATUS_EXECUTION_FAILED  (at /paddle/paddle/fluid/operators/softmax_cudnn_op.cu:511)

All comments (5)

零下一度朦胧

#2 replied 2021-08

Are you running this on AI Studio? If so, try closing the environment and reopening it.

chengcheng

#3 replied 2021-08

Replying to 零下一度朦胧 #2:

I tried it on AI Studio and it still fails:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
 in 
      3     for idx, (input_ids, token_type_ids, length, labels) in enumerate(train_loader):
      4         logits = model(input_ids, token_type_ids)
----> 5         loss = paddle.mean(loss_fn(logits, labels))
      6         loss.backward()
      7         optimizer.step()
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py in __call__(self, *inputs, **kwargs)
    896                 self._built = True
    897 
--> 898             outputs = self.forward(*inputs, **kwargs)
    899 
    900             for forward_post_hook in self._forward_post_hooks.values():
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/nn/layer/loss.py in forward(self, input, label)
    403             axis=self.axis,
    404             use_softmax=self.use_softmax,
--> 405             name=self.name)
    406 
    407         return ret
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/nn/functional/loss.py in cross_entropy(input, label, weight, ignore_index, reduction, soft_label, axis, use_softmax, name)
   1418                         'Expected 0 <= label_value < class_dimension({}), but got {} <= label_value <= {} '.
   1419                         format(input.shape[-1],
-> 1420                                label_min.numpy(), label_max.numpy()))
   1421                 weight_gather = core.ops.gather_nd(weight, label)
   1422                 input_shape = list(label.shape)
ValueError: Expected 0 <= label_value < class_dimension(13), but got [-1] <= label_value <= [12]
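
The frame at line 1421 of the traceback above, `core.ops.gather_nd(weight, label)`, shows the weight being looked up by label value, so a label outside `[0, class_dimension)` has no weight entry to select. The following is a minimal pure-Python sketch of per-token weighted cross-entropy that illustrates this lookup. It is not Paddle's implementation; the function name `weighted_cross_entropy` and the list-based inputs are assumptions made for illustration.

```python
import math


def weighted_cross_entropy(logits, labels, weight):
    """Per-token weighted cross-entropy (illustrative, not Paddle's code).

    logits: list of per-token score lists, each of length num_classes
    labels: list of integer class ids, one per token
    weight: list of per-class weights, length num_classes
    """
    num_classes = len(weight)
    losses = []
    for scores, label in zip(logits, labels):
        if not 0 <= label < num_classes:
            # Same kind of range check as in the traceback. Without it, a
            # Python list would silently treat -1 as "last element".
            raise ValueError(
                f"Expected 0 <= label < {num_classes}, but got {label}")
        # Numerically stable log-softmax over the class scores.
        m = max(scores)
        log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
        log_prob = scores[label] - log_sum
        # The weight is selected by the label index, like a gather.
        losses.append(-weight[label] * log_prob)
    return losses


# Hypothetical two-class example: one token, uniform scores, label 0.
print(weighted_cross_entropy([[0.0, 0.0]], [0], [2.0, 1.0]))
```

With a label of -1 in the batch, as reported in the error message, the range check fails before any weight can be selected.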

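零下一度朦胧

#4 replied 2021-08

Replying to chengcheng #3:

This points to a problem in the code: the label values are out of range, so check them first. If you run the code again after hitting this error, you get the error from your first post; that part is probably a bug.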
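
One way to check the label values, as suggested above, is to scan a batch for values outside `[0, num_classes)` before calling the loss. This is a minimal sketch under stated assumptions: the helper `find_invalid_labels` and its `ignore_index` parameter are hypothetical names for illustration, and it works on a flat Python list, which could be obtained from a Paddle tensor with something like `labels.numpy().flatten().tolist()`. Whether the -1 values in the poster's data are padding is not stated in the thread.

```python
def find_invalid_labels(labels, num_classes, ignore_index=None):
    """Return (position, value) pairs for labels outside [0, num_classes).

    labels: flat list of integer label ids
    num_classes: size of the class dimension of the logits
    ignore_index: optional value to skip (hypothetical, e.g. a padding id)
    """
    bad = []
    for pos, value in enumerate(labels):
        if ignore_index is not None and value == ignore_index:
            continue
        if not 0 <= value < num_classes:
            bad.append((pos, value))
    return bad


# Made-up batch matching the reported range: min -1, max 12, 13 classes.
batch_labels = [0, 12, -1, 3, -1]
print(find_invalid_labels(batch_labels, num_classes=13))  # [(2, -1), (4, -1)]
```

Running such a check on the first few batches from `train_loader` shows which label values fall outside the range the loss expects.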

chengcheng

#5 replied 2021-08

Replying to 零下一度朦胧 #4:

OK, thanks for your reply. I have filed an issue; let's see what the maintainers say.

chengcheng

#6 replied 2021-08

This problem has been resolved. For details, see the issue: https://github.com/PaddlePaddle/PaddleNLP/issues/835
