PaddleOCR key information extraction training: error during SER (semantic entity recognition) training


peak5213 posted 2023-06

While training the key information extraction model for semantic entity recognition (SER) on a custom dataset, training fails with the following exception: Error: ../paddle/phi/kernels/gpu/cross_entropy_kernel.cu:1008 Assertion `false` failed. The value of label expected >= 0 and < 3, or == -100, but got 4. Please check label value

....

File "D:\Program Files\Anaconda\envs\paddle_env\lib\site-packages\paddle\fluid\dygraph\varbase_patch_methods.py", line 297, in backward
core.eager.run_backward([self], grad_tensor, retain_graph)
OSError: (External) CUBLAS error(14).
[Hint: Please search for the error code(14) on website (https://docs.nvidia.com/cuda/cublas/index.html#cublasstatus_t) to get Nvidia's official solution and advice about CUBLAS Error.] (at ..\paddle/phi/kernels/funcs/blas/blas_impl.cu.h:35)

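According to the exception, label values must be >= 0 and < 3 (or equal to -100), but a value of 4 was encountered. My dataset consists of ID card front images, so there are only two label types, question and answer. I have checked the dataset repeatedly and found no problems, and relation extraction (RE) training completes normally with this same dataset.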
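
The condition quoted in the assertion can be reproduced without a GPU. The following is a minimal sketch in plain Python, assuming the encoded label ids are available as a flat list. `find_invalid_label_ids` is a hypothetical helper, not a Paddle API, and the ignore value -100 is taken from the error message.

```python
def find_invalid_label_ids(label_ids, num_classes, ignore_index=-100):
    """Return the sorted distinct ids that violate 0 <= id < num_classes (ignore_index excepted)."""
    return sorted({i for i in label_ids
                   if i != ignore_index and not (0 <= i < num_classes)})


# Hypothetical token labels for one sample; -100 marks ignored positions.
sample_ids = [-100, 0, 1, 2, 3, 4, 4, -100]
print(find_invalid_label_ids(sample_ids, num_classes=3))
print(find_invalid_label_ids(sample_ids, num_classes=5))
```

Running such a check on the ids produced by the label encoding step, rather than on the raw label strings, would show which ids exceed the configured `num_classes`.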

My OCR environment is a GPU environment on Windows: paddlepaddle-gpu 2.4.2, CUDA 11.2, cuDNN 8.2.1. The installation tests passed without problems.

Documentation I followed for training: https://gitee.com/paddlepaddle/PaddleOCR/blob/release/2.6/doc/doc_ch/kie.md

My SER config file: ser_vi_layoutxlm_xfund_zh.yml

Global:
  use_gpu: True
  epoch_num: &epoch_num 100
  log_smooth_window: 10
  print_batch_step: 10
  save_model_dir: ./output/ser_vi_layoutxlm_xfund_zh
  save_epoch_step: 2000
  # evaluation is run every 10 iterations after the 0th iteration
  eval_batch_step: [ 0, 19 ]
  cal_metric_during_train: False
  save_inference_dir:
  use_visualdl: False
  seed: 2022
  infer_img: ppstructure/docs/kie/input/zh_val_42.jpg
  # if you want to predict using the groundtruth ocr info,
  # you can use the following config
  # infer_img: train_data/XFUND/zh_val/val.json
  # infer_mode: False

  save_res_path: ./output/ser/xfund_zh/res
  kie_rec_model_dir:
  kie_det_model_dir:

Architecture:
  model_type: kie
  algorithm: &algorithm "LayoutXLM"
  Transform:
  Backbone:
    name: LayoutXLMForSer
    pretrained: True
    checkpoints:
    # one of base or vi
    mode: vi
    num_classes: &num_classes 3

Loss:
  name: VQASerTokenLayoutLMLoss
  num_classes: *num_classes
  key: "backbone_out"

Optimizer:
  name: AdamW
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Linear
    learning_rate: 0.00005
    epochs: *epoch_num
    warmup_epoch: 2
  regularizer:
    name: L2
    factor: 0.00000

PostProcess:
  name: VQASerTokenLayoutLMPostProcess
  class_path: &class_path train_data/idcard/class_list.txt

Metric:
  name: VQASerTokenMetric
  main_indicator: hmean

Train:
  dataset:
    name: SimpleDataSet
    data_dir: train_data/idcard/train
    label_file_list:
      - train_data/idcard/train.json
    ratio_list: [ 1.0 ]
    transforms:
      - DecodeImage: # load image
          img_mode: RGB
          channel_first: False
      - VQATokenLabelEncode: # Class handling label
          contains_re: False
          algorithm: *algorithm
          class_path: *class_path
          use_textline_bbox_info: &use_textline_bbox_info True
          # one of [None, "tb-yx"]
          order_method: &order_method "tb-yx"
      - VQATokenPad:
          max_seq_len: &max_seq_len 512
          return_attention_mask: True
      - VQASerTokenChunk:
          max_seq_len: *max_seq_len
      - Resize:
          size: [224,224]
      - NormalizeImage:
          scale: 1
          mean: [ 123.675, 116.28, 103.53 ]
          std: [ 58.395, 57.12, 57.375 ]
          order: 'hwc'
      - ToCHWImage:
      - KeepKeys:
          keep_keys: [ 'input_ids', 'bbox', 'attention_mask', 'token_type_ids', 'image', 'labels'] # dataloader will return list in this order
  loader:
    shuffle: True
    drop_last: False
    batch_size_per_card: 1
    num_workers: 0

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: train_data/idcard/val
    label_file_list:
      - train_data/idcard/val.json
    transforms:
      - DecodeImage: # load image
          img_mode: RGB
          channel_first: False
      - VQATokenLabelEncode: # Class handling label
          contains_re: False
          algorithm: *algorithm
          class_path: *class_path
          use_textline_bbox_info: *use_textline_bbox_info
          order_method: *order_method
      - VQATokenPad:
          max_seq_len: *max_seq_len
          return_attention_mask: True
      - VQASerTokenChunk:
          max_seq_len: *max_seq_len
      - Resize:
          size: [224,224]
      - NormalizeImage:
          scale: 1
          mean: [ 123.675, 116.28, 103.53 ]
          std: [ 58.395, 57.12, 57.375 ]
          order: 'hwc'
      - ToCHWImage:
      - KeepKeys:
          keep_keys: [ 'input_ids', 'bbox', 'attention_mask', 'token_type_ids', 'image', 'labels'] # dataloader will return list in this order
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 1
    num_workers: 0

Three samples from the dataset I am using are attached for reference:

[
{"transcription": "姓名", "points": [[82, 606], [201, 606], [201, 652], [82, 652]], "label": "question", "id": 1, "linking": [[1,2]]},
{"transcription": "性别", "points": [[87, 691], [203, 691], [203, 736], [87, 736]], "label": "question", "id": 3, "linking": [[3,4]]},
{"transcription": "民族", "points": [[342, 694], [443, 694], [443, 741], [342, 741]], "label": "question", "id": 5, "linking": [[5,6]]},
{"transcription": "出生", "points": [[99, 776], [197, 776], [197, 818], [99, 818]], "label": "question", "id": 7, "linking": [[7,8]]},
{"transcription": "住址", "points": [[97, 868], [201, 868], [201, 910], [97, 910]], "label": "question", "id": 9, "linking": [[9,10]]},
{"transcription": "公民身份号码", "points": [[100, 1083], [358, 1083], [358, 1130], [100, 1130]], "label": "question", "id": 11, "linking": [[11,12]]},
{"transcription": "张雨欣", "points": [[226, 597], [387, 597], [387, 654], [226, 654]], "label": "answer", "id": 2, "linking": [[1,2]]},
{"transcription": "男", "points": [[226, 693], [279, 693], [279, 741], [226, 741]], "label": "answer", "id": 4, "linking": [[3,4]]},
{"transcription": "汉", "points": [[449, 693], [495, 693], [495, 744], [449, 744]], "label": "answer", "id": 6, "linking": [[5,6]]},
{"transcription": "1990年2月19日", "points": [[234, 776], [595, 776], [595, 826], [234, 826]], "label": "answer", "id": 8, "linking": [[7,8]]},
{"transcription": "北京市东城区XXX号", "points": [[237, 862], [627, 862], [627, 917], [237, 917]], "label": "answer", "id": 10, "linking": [[9,10]]},
{"transcription": "11XXXXXXXXXXXXXX4", "points": [[405, 1075], [976, 1075], [976, 1122], [405, 1122]], "label": "answer", "id": 12, "linking": [[11,12]]}
]

[
{"transcription": "姓名", "points": [[501, 217], [601, 217], [601, 258], [501, 258]], "label": "question", "id": 1, "linking": [[1,2]]},
{"transcription": "性别", "points": [[499, 290], [604, 290], [604, 335], [499, 335]], "label": "question", "id": 3, "linking": [[3,4]]},
{"transcription": "民族", "points": [[727, 298], [816, 298], [816, 339], [727, 339]], "label": "question", "id": 5, "linking": [[5,6]]},
{"transcription": "出生", "points": [[500, 368], [590, 368], [590, 408], [500, 408]], "label": "question", "id": 7, "linking": [[7,8]]},
{"transcription": "住址", "points": [[494, 449], [591, 449], [591, 486], [494, 486]], "label": "question", "id": 9, "linking": [[9,10],[9,11]]},
{"transcription": "公民身份号码", "points": [[491, 646], [727, 646], [727, 692], [491, 692]], "label": "question", "id": 12, "linking": [[12,13]]},
{"transcription": "燕雨萌", "points": [[625, 211], [770, 211], [770, 264], [625, 264]], "label": "answer", "id": 2, "linking": [[1,2]]},
{"transcription": "女", "points": [[622, 300], [664, 300], [664, 333], [622, 333]], "label": "answer", "id": 4, "linking": [[3,4]]},
{"transcription": "汉", "points": [[823, 304], [863, 304], [863, 343], [823, 343]], "label": "answer", "id": 6, "linking": [[5,6]]},
{"transcription": "1990年7月27日", "points": [[625, 372], [948, 372], [948, 419], [625, 419]], "label": "answer", "id": 8, "linking": [[7,8]]},
{"transcription": "北京市大兴区XXX天宝", "points": [[615, 448], [1031, 462], [1030, 499], [617, 488]], "label": "answer", "id": 10, "linking": [[9,10]]},
{"transcription": "园二里X号楼X单元XXX号", "points": [[619, 501], [1021, 510], [1018, 551], [618, 543]], "label": "answer", "id": 11, "linking": [[9,11]]},
{"transcription": "110XXXXXXXXXXX48", "points": [[776, 648], [1309, 661], [1308, 710], [773, 693]], "label": "answer", "id": 13, "linking": [[12,13]]}]

[
{"transcription": "山西省朔州市朔城区神头", "points": [[345, 577], [1081, 583], [1081, 640], [345, 634]], "label": "answer", "id": 10, "linking": [[9,10]]},
{"transcription": "镇北邵庄村X区XXX号", "points": [[356, 673], [961, 673], [961, 730], [356, 730]], "label": "answer", "id": 11, "linking": [[9,11]]},
{"transcription": "公民身份号码", "points": [[143, 936], [539, 936], [539, 989], [143, 989]], "label": "question", "id": 12, "linking": [[12,13]]},
{"transcription": "14XXXXXXXXXXXXX28", "points": [[639, 936], [1596, 940], [1595, 996], [639, 992]], "label": "answer", "id": 13, "linking": [[12,13]]},
{"transcription": "姓名", "points": [[128, 149], [289, 149], [289, 210], [128, 210]], "label": "question", "id": 1, "linking": [[1,2]]},
{"transcription": "性别", "points": [[123, 284], [296, 284], [296, 346], [123, 346]], "label": "question", "id": 3, "linking": [[3,4]]},
{"transcription": "出生", "points": [[133, 420], [294, 420], [294, 486], [133, 486]], "label": "question", "id": 7, "linking": [[7,8]]},
{"transcription": "住址", "points": [[137, 570], [294, 570], [294, 633], [137, 633]], "label": "question", "id": 9, "linking": [[9,10],[9,11]]},
{"transcription": "民族", "points": [[539, 289], [693, 289], [693, 354], [539, 354]], "label": "question", "id": 5, "linking": [[5,6]]},
{"transcription": "殷桃", "points": [[348, 134], [561, 134], [561, 215], [348, 215]], "label": "answer", "id": 2, "linking": [[1,2]]},
{"transcription": "1994年6月23日", "points": [[353, 430], [939, 430], [939, 489], [353, 489]], "label": "answer", "id": 8, "linking": [[7,8]]},
{"transcription": "女", "points": [[348, 293], [419, 293], [419, 357], [348, 357]], "label": "answer", "id": 4, "linking": [[3,4]]},
{"transcription": "汉", "points": [[708, 292], [778, 292], [778, 362], [708, 362]], "label": "answer", "id": 6, "linking": [[5,6]]}]
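
To double-check which label strings actually occur, the annotation lists can be scanned directly. This is a sketch that assumes each sample is a JSON list of records with a `label` field, as in the three samples above. `count_labels` is a hypothetical helper name, and the inline records are two entries copied from the first sample.

```python
import json
from collections import Counter


def count_labels(annotation_json):
    """Count how often each `label` value occurs in one annotation list."""
    records = json.loads(annotation_json)
    return Counter(record["label"] for record in records)


# Two records taken from the first sample shown above.
sample = """[
 {"transcription": "姓名", "points": [[82, 606], [201, 606], [201, 652], [82, 652]],
  "label": "question", "id": 1, "linking": [[1, 2]]},
 {"transcription": "张雨欣", "points": [[226, 597], [387, 597], [387, 654], [226, 654]],
  "label": "answer", "id": 2, "linking": [[1, 2]]}
]"""

print(dict(count_labels(sample)))
```

A label string that appears here but not in the class list file, or the reverse, would be worth investigating before changing `num_classes`.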

I have tried many approaches and none of them solved it. I hope an expert can take a look. Thank you very much!

All comments (3)

beyondyourself

#2 replied 2023-06

The labels are 0-3, so num_classes should be 4

peak5213

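#3 replied 2023-06

beyondyourself #2

The labels are 0-3, so num_classes should be 4

That's not right. The documentation states the value is 2n-1, where n is the number of labels. My dataset has two labels, so it is 3.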
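
One way to read the 2n-1 rule is that every class except a background class is split into a begin (B-) and an inside (I-) tag. The sketch below assumes that BIO scheme, and further assumes the background class `O` counts toward n even when the class list file does not name it. `build_bio_label_map` is a hypothetical helper for illustration, not PaddleOCR's loader; whether PaddleOCR adds `O` implicitly should be checked against its source.

```python
def build_bio_label_map(class_names,
                        background_aliases=("O", "OTHER", "OTHERS", "IGNORE")):
    """Map BIO tags to ids: one background tag plus B-/I- tags per remaining class."""
    labels = ["O"]
    for name in class_names:
        upper = name.strip().upper()
        if upper in background_aliases:
            continue  # background is already covered by "O"
        labels.append("B-" + upper)
        labels.append("I-" + upper)
    return {label: idx for idx, label in enumerate(labels)}


# Assumed class list for this thread: only question and answer.
label_map = build_bio_label_map(["question", "answer"])
print(label_map)
print("num_classes under this assumption:", len(label_map))
```

Under these assumptions, a class list containing only question and answer yields five tag ids (0 to 4), that is n = 3 once the background class is counted and 2n-1 = 5. That would be consistent with the value 4 reported in the assertion, but it is not confirmed in this thread.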

peak5213

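#4 replied 2023-06

It is probably that datasets with only two labels are not supported. I switched to another dataset with more label types and training ran through.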
