Error when training the PaddleOCR text detection module on its own

热牛奶坑 posted 2021-05

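I followed this document to run finetune training.
It then failed with an error. It looks like the dataset was not loaded successfully? But the dataset has definitely been downloaded, and I can open the directory directly.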
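Being able to open the directory does not show that every image listed in the label file is readable. Below is a minimal sketch of a check for that. It assumes each label line has the form `<relative image path>\t<JSON list of boxes>`, and the helper name `check_label_file` is hypothetical, not part of PaddleOCR.

```python
import json
import os


def check_label_file(label_path, data_dir, max_report=5):
    """Return (num_lines, problems) for a detection label file.

    Assumed line format: "<relative image path>\t<JSON list of boxes>".
    """
    problems = []
    num_lines = 0
    with open(label_path, "r", encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.rstrip("\r\n")
            if not line:
                continue  # skip empty lines
            num_lines += 1
            parts = line.split("\t")
            if len(parts) != 2:
                problems.append((line_no, "expected 2 tab-separated fields"))
                continue
            img_rel, label_json = parts
            # The image path is resolved relative to data_dir (assumption).
            if not os.path.isfile(os.path.join(data_dir, img_rel)):
                problems.append((line_no, "image not found: " + img_rel))
            try:
                boxes = json.loads(label_json)
            except ValueError:
                problems.append((line_no, "label is not valid JSON"))
                continue
            if not isinstance(boxes, list):
                problems.append((line_no, "label JSON is not a list"))
    return num_lines, problems[:max_report]


if __name__ == "__main__":
    # Paths taken from the config printed in the log of this post.
    data_dir = "./train_data/icdar2015/text_localization/"
    label_path = os.path.join(data_dir, "train_icdar2015_label.txt")
    if os.path.isfile(label_path):
        total, problems = check_label_file(label_path, data_dir)
        print(total, "labelled lines;", len(problems), "problems shown")
        for line_no, message in problems:
            print("line", line_no, ":", message)
    else:
        print("label file not found:", label_path)
```

If this reports missing images, the paths in the label file probably do not match the layout under `data_dir`.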

# Single-machine, single-GPU training of the det_r50_vd model
%cd ../PaddleOCR/

!pip install imgaug
!pip install pyclipper
!pip install lmdb
!pip install Levenshtein

!python3 tools/train.py -c configs/det/det_r50_vd_db.yml \
-o Global.pretrain_weights=./pretrain_models/ResNet50_vd_ssld_pretrained/
"""
# Single-machine, multi-GPU training; set the GPU IDs to use with the --gpus argument
!python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs/det/det_r50_vd_db.yml \
-o Global.pretrain_weights=./pretrain_models/ResNet50_vd_ssld_pretrained/
"""
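A side note on the `-o` override in this command. The config dump in the log below lists both `pretrain_weights` and `pretrained_model` under `Global`, and the "load pretrained model from" line matches the `pretrained_model` value. This suggests the override may have added a new key rather than changing the one the YAML defines. Here is a minimal sketch of how a dotted-key override could produce that result, assuming the config is a plain nested dict. The helper `apply_override` is hypothetical and is not PaddleOCR's own code.

```python
def apply_override(config, dotted_key, value):
    """Set a nested value from a dotted key such as 'Global.epoch_num'.

    Missing intermediate levels are created, so a misspelled key is
    silently added next to the existing ones instead of raising an error.
    """
    keys = dotted_key.split(".")
    node = config
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value
    return config


# Value as it appears in the YAML config of this post.
config = {
    "Global": {"pretrained_model": "./pretrain_models/ResNet50_vd_ssld_pretrained"}
}

# Override as passed on the command line in this post.
apply_override(
    config,
    "Global.pretrain_weights",
    "./pretrain_models/ResNet50_vd_ssld_pretrained/",
)

print(sorted(config["Global"]))  # both keys are now present
```

Under this assumption the original `pretrained_model` entry is untouched, which would match what the log prints.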

----------------------------------------------------------------------------------------------------

/home/aistudio/PaddleOCR
Looking in indexes: https://mirror.baidu.com/pypi/simple/
Requirement already satisfied: imgaug in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (0.4.0)
Requirement already satisfied: scipy in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from imgaug) (1.6.3)
Requirement already satisfied: Pillow in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from imgaug) (7.1.2)
Requirement already satisfied: numpy>=1.15 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from imgaug) (1.20.3)
Requirement already satisfied: opencv-python in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from imgaug) (4.1.1.26)
Requirement already satisfied: matplotlib in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from imgaug) (2.2.3)
Requirement already satisfied: Shapely in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from imgaug) (1.7.1)
Requirement already satisfied: six in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from imgaug) (1.15.0)
Requirement already satisfied: imageio in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from imgaug) (2.6.1)
Requirement already satisfied: scikit-image>=0.14.2 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from imgaug) (0.18.1)
Requirement already satisfied: python-dateutil>=2.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from matplotlib->imgaug) (2.8.0)
Requirement already satisfied: kiwisolver>=1.0.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from matplotlib->imgaug) (1.1.0)
Requirement already satisfied: cycler>=0.10 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from matplotlib->imgaug) (0.10.0)
Requirement already satisfied: pytz in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from matplotlib->imgaug) (2019.3)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from matplotlib->imgaug) (2.4.2)
Requirement already satisfied: networkx>=2.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from scikit-image>=0.14.2->imgaug) (2.4)
Requirement already satisfied: PyWavelets>=1.1.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from scikit-image>=0.14.2->imgaug) (1.1.1)
Requirement already satisfied: tifffile>=2019.7.26 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from scikit-image>=0.14.2->imgaug) (2021.4.8)
Requirement already satisfied: setuptools in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from kiwisolver>=1.0.1->matplotlib->imgaug) (56.2.0)
Requirement already satisfied: decorator>=4.3.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from networkx>=2.0->scikit-image>=0.14.2->imgaug) (4.4.2)
Looking in indexes: https://mirror.baidu.com/pypi/simple/
Requirement already satisfied: pyclipper in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (1.2.1)
Looking in indexes: https://mirror.baidu.com/pypi/simple/
Requirement already satisfied: lmdb in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (1.2.1)
Looking in indexes: https://mirror.baidu.com/pypi/simple/
Requirement already satisfied: Levenshtein in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (0.12.0)
Requirement already satisfied: setuptools in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from Levenshtein) (56.2.0)
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/utils.py:26: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
def convert_to_list(value, n, name, dtype=np.int):
[2021/05/28 09:49:48] root INFO: Architecture :
[2021/05/28 09:49:48] root INFO: Backbone :
[2021/05/28 09:49:48] root INFO: layers : 50
[2021/05/28 09:49:48] root INFO: name : ResNet
[2021/05/28 09:49:48] root INFO: Head :
[2021/05/28 09:49:48] root INFO: k : 50
[2021/05/28 09:49:48] root INFO: name : DBHead
[2021/05/28 09:49:48] root INFO: Neck :
[2021/05/28 09:49:48] root INFO: name : DBFPN
[2021/05/28 09:49:48] root INFO: out_channels : 256
[2021/05/28 09:49:48] root INFO: Transform : None
[2021/05/28 09:49:48] root INFO: algorithm : DB
[2021/05/28 09:49:48] root INFO: model_type : det
[2021/05/28 09:49:48] root INFO: Eval :
[2021/05/28 09:49:48] root INFO: dataset :
[2021/05/28 09:49:48] root INFO: data_dir : ./train_data/icdar2015/text_localization/
[2021/05/28 09:49:48] root INFO: label_file_list : ['./train_data/icdar2015/text_localization/test_icdar2015_label.txt']
[2021/05/28 09:49:48] root INFO: name : SimpleDataSet
[2021/05/28 09:49:48] root INFO: transforms :
[2021/05/28 09:49:48] root INFO: DecodeImage :
[2021/05/28 09:49:48] root INFO: channel_first : False
[2021/05/28 09:49:48] root INFO: img_mode : BGR
[2021/05/28 09:49:48] root INFO: DetLabelEncode : None
[2021/05/28 09:49:48] root INFO: DetResizeForTest :
[2021/05/28 09:49:48] root INFO: image_shape : [736, 1280]
[2021/05/28 09:49:48] root INFO: NormalizeImage :
[2021/05/28 09:49:48] root INFO: mean : [0.485, 0.456, 0.406]
[2021/05/28 09:49:48] root INFO: order : hwc
[2021/05/28 09:49:48] root INFO: scale : 1./255.
[2021/05/28 09:49:48] root INFO: std : [0.229, 0.224, 0.225]
[2021/05/28 09:49:48] root INFO: ToCHWImage : None
[2021/05/28 09:49:48] root INFO: KeepKeys :
[2021/05/28 09:49:48] root INFO: keep_keys : ['image', 'shape', 'polys', 'ignore_tags']
[2021/05/28 09:49:48] root INFO: loader :
[2021/05/28 09:49:48] root INFO: batch_size_per_card : 1
[2021/05/28 09:49:48] root INFO: drop_last : False
[2021/05/28 09:49:48] root INFO: num_workers : 8
[2021/05/28 09:49:48] root INFO: shuffle : False
[2021/05/28 09:49:48] root INFO: Global :
[2021/05/28 09:49:48] root INFO: cal_metric_during_train : False
[2021/05/28 09:49:48] root INFO: checkpoints : None
[2021/05/28 09:49:48] root INFO: debug : False
[2021/05/28 09:49:48] root INFO: distributed : False
[2021/05/28 09:49:48] root INFO: epoch_num : 1200
[2021/05/28 09:49:48] root INFO: eval_batch_step : [0, 2000]
[2021/05/28 09:49:48] root INFO: infer_img : doc/imgs_en/img_10.jpg
[2021/05/28 09:49:48] root INFO: load_static_weights : True
[2021/05/28 09:49:48] root INFO: log_smooth_window : 20
[2021/05/28 09:49:48] root INFO: pretrain_weights : ./pretrain_models/ResNet50_vd_ssld_pretrained/
[2021/05/28 09:49:48] root INFO: pretrained_model : ./pretrain_models/ResNet50_vd_ssld_pretrained
[2021/05/28 09:49:48] root INFO: print_batch_step : 10
[2021/05/28 09:49:48] root INFO: save_epoch_step : 1200
[2021/05/28 09:49:48] root INFO: save_inference_dir : None
[2021/05/28 09:49:48] root INFO: save_model_dir : ./output/det_r50_vd/
[2021/05/28 09:49:48] root INFO: save_res_path : ./output/det_db/predicts_db.txt
[2021/05/28 09:49:48] root INFO: use_gpu : True
[2021/05/28 09:49:48] root INFO: use_visualdl : False
[2021/05/28 09:49:48] root INFO: Loss :
[2021/05/28 09:49:48] root INFO: alpha : 5
[2021/05/28 09:49:48] root INFO: balance_loss : True
[2021/05/28 09:49:48] root INFO: beta : 10
[2021/05/28 09:49:48] root INFO: main_loss_type : DiceLoss
[2021/05/28 09:49:48] root INFO: name : DBLoss
[2021/05/28 09:49:48] root INFO: ohem_ratio : 3
[2021/05/28 09:49:48] root INFO: Metric :
[2021/05/28 09:49:48] root INFO: main_indicator : hmean
[2021/05/28 09:49:48] root INFO: name : DetMetric
[2021/05/28 09:49:48] root INFO: Optimizer :
[2021/05/28 09:49:48] root INFO: beta1 : 0.9
[2021/05/28 09:49:48] root INFO: beta2 : 0.999
[2021/05/28 09:49:48] root INFO: lr :
[2021/05/28 09:49:48] root INFO: learning_rate : 0.001
[2021/05/28 09:49:48] root INFO: name : Adam
[2021/05/28 09:49:48] root INFO: regularizer :
[2021/05/28 09:49:48] root INFO: factor : 0
[2021/05/28 09:49:48] root INFO: name : L2
[2021/05/28 09:49:48] root INFO: PostProcess :
[2021/05/28 09:49:48] root INFO: box_thresh : 0.7
[2021/05/28 09:49:48] root INFO: max_candidates : 1000
[2021/05/28 09:49:48] root INFO: name : DBPostProcess
[2021/05/28 09:49:48] root INFO: thresh : 0.3
[2021/05/28 09:49:48] root INFO: unclip_ratio : 1.5
[2021/05/28 09:49:48] root INFO: Train :
[2021/05/28 09:49:48] root INFO: dataset :
[2021/05/28 09:49:48] root INFO: data_dir : ./train_data/icdar2015/text_localization/
[2021/05/28 09:49:48] root INFO: label_file_list : ['./train_data/icdar2015/text_localization/train_icdar2015_label.txt']
[2021/05/28 09:49:48] root INFO: name : SimpleDataSet
[2021/05/28 09:49:48] root INFO: ratio_list : [1.0]
[2021/05/28 09:49:48] root INFO: transforms :
[2021/05/28 09:49:48] root INFO: DecodeImage :
[2021/05/28 09:49:48] root INFO: channel_first : False
[2021/05/28 09:49:48] root INFO: img_mode : BGR
[2021/05/28 09:49:48] root INFO: DetLabelEncode : None
[2021/05/28 09:49:48] root INFO: IaaAugment :
[2021/05/28 09:49:48] root INFO: augmenter_args :
[2021/05/28 09:49:48] root INFO: args :
[2021/05/28 09:49:48] root INFO: p : 0.5
[2021/05/28 09:49:48] root INFO: type : Fliplr
[2021/05/28 09:49:48] root INFO: args :
[2021/05/28 09:49:48] root INFO: rotate : [-10, 10]
[2021/05/28 09:49:48] root INFO: type : Affine
[2021/05/28 09:49:48] root INFO: args :
[2021/05/28 09:49:48] root INFO: size : [0.5, 3]
[2021/05/28 09:49:48] root INFO: type : Resize
[2021/05/28 09:49:48] root INFO: EastRandomCropData :
[2021/05/28 09:49:48] root INFO: keep_ratio : True
[2021/05/28 09:49:48] root INFO: max_tries : 50
[2021/05/28 09:49:48] root INFO: size : [640, 640]
[2021/05/28 09:49:48] root INFO: MakeBorderMap :
[2021/05/28 09:49:48] root INFO: shrink_ratio : 0.4
[2021/05/28 09:49:48] root INFO: thresh_max : 0.7
[2021/05/28 09:49:48] root INFO: thresh_min : 0.3
[2021/05/28 09:49:48] root INFO: MakeShrinkMap :
[2021/05/28 09:49:48] root INFO: min_text_size : 8
[2021/05/28 09:49:48] root INFO: shrink_ratio : 0.4
[2021/05/28 09:49:48] root INFO: NormalizeImage :
[2021/05/28 09:49:48] root INFO: mean : [0.485, 0.456, 0.406]
[2021/05/28 09:49:48] root INFO: order : hwc
[2021/05/28 09:49:48] root INFO: scale : 1./255.
[2021/05/28 09:49:48] root INFO: std : [0.229, 0.224, 0.225]
[2021/05/28 09:49:48] root INFO: ToCHWImage : None
[2021/05/28 09:49:48] root INFO: KeepKeys :
[2021/05/28 09:49:48] root INFO: keep_keys : ['image', 'threshold_map', 'threshold_mask', 'shrink_map', 'shrink_mask']
[2021/05/28 09:49:48] root INFO: loader :
[2021/05/28 09:49:48] root INFO: batch_size_per_card : 16
[2021/05/28 09:49:48] root INFO: drop_last : False
[2021/05/28 09:49:48] root INFO: num_workers : 1
[2021/05/28 09:49:48] root INFO: shuffle : True
[2021/05/28 09:49:48] root INFO: train with paddle 2.0.2 and device CUDAPlace(0)
[2021/05/28 09:49:48] root INFO: Initialize indexs of datasets:['./train_data/icdar2015/text_localization/train_icdar2015_label.txt']
[2021/05/28 09:49:48] root INFO: Initialize indexs of datasets:['./train_data/icdar2015/text_localization/test_icdar2015_label.txt']
W0528 09:49:48.468700 10893 device_context.cc:362] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.1, Runtime API Version: 10.1
W0528 09:49:48.473711 10893 device_context.cc:372] device: 0, cuDNN Version: 7.6.
[2021/05/28 09:49:54] root INFO: load pretrained model from ['./pretrain_models/ResNet50_vd_ssld_pretrained']
[2021/05/28 09:49:54] root INFO: train dataloader has 63 iters
[2021/05/28 09:49:54] root INFO: valid dataloader has 500 iters
[2021/05/28 09:49:54] root INFO: During the training process, after the 0th iteration, an evaluation is run every 2000 iterations
[2021/05/28 09:49:54] root INFO: Initialize indexs of datasets:['./train_data/icdar2015/text_localization/train_icdar2015_label.txt']
2021-05-28 09:49:59,340 - ERROR - DataLoader reader thread raised an exception!
Traceback (most recent call last):
File "tools/train.py", line 125, in <module>
main(config, device, logger, vdl_writer)
File "tools/train.py", line 102, in main
eval_class, pre_best_model_dict, logger, vdl_writer)
File "/home/aistudio/PaddleOCR/tools/program.py", line 199, in train
Exception in thread Thread-1:
Traceback (most recent call last):
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dataloader/dataloader_iter.py", line 684, in _get_data
data = self._data_queue.get(timeout=self._timeout)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/multiprocessing/queues.py", line 105, in get
raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/threading.py", line 926, in _bootstrap_inner
self.run()
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dataloader/dataloader_iter.py", line 616, in _thread_loop
batch = self._get_data()
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dataloader/dataloader_iter.py", line 700, in _get_data
"pids: {}".format(len(failed_workers), pids))
RuntimeError: DataLoader 1 workers exit unexpectedly, pids: 10956
for idx, batch in enumerate(train_dataloader):

File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dataloader/dataloader_iter.py", line 779, in __next__
data = self._reader.read_next_var_list()
SystemError: (Fatal) Blocking queue is killed because the data reader raises an exception.
[Hint: Expected killed_ != true, but received killed_:1 == true:1.] (at /paddle/paddle/fluid/operators/reader/blocking_queue.h:158)

"\n# Single-machine, multi-GPU training; set the GPU IDs to use with the --gpus argument\n!python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs/det/det_r50_vd_db.yml -o Global.pretrain_weights=./pretrain_models/ResNet50_vd_ssld_pretrained/\n"
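The traceback above only says that a DataLoader worker exited; it does not show what went wrong inside the worker. If the cause is a Python exception raised while loading a sample, indexing the dataset in the main process may surface it. A worker killed for lack of memory would not show up this way. The following is a minimal sketch: `find_first_failure` and `ToyDataset` are hypothetical names used for illustration, and the sketch assumes only that the dataset supports `len()` and integer indexing.

```python
def find_first_failure(dataset, limit=None):
    """Index samples one by one in the current process.

    Returns (index, exception) for the first sample that raises,
    or None if every checked sample loads.
    """
    total = len(dataset) if limit is None else min(limit, len(dataset))
    for idx in range(total):
        try:
            dataset[idx]
        except Exception as exc:  # report whatever the sample loader raises
            return idx, exc
    return None


class ToyDataset:
    """Stand-in dataset (hypothetical) whose third sample fails to load."""

    def __len__(self):
        return 5

    def __getitem__(self, idx):
        if idx == 2:
            raise FileNotFoundError("sample 2 is missing")
        return {"image": idx}


failure = find_first_failure(ToyDataset())
print(failure)
```

With a real dataset object in place of `ToyDataset`, the returned index points at the first sample worth inspecting by hand.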

Could someone please take a look at what is going on here? Thanks♪(･ω･)ﾉ

All comments (1)

热牛奶坑

#2 replied 2021-05

Global:
  use_gpu: true
  epoch_num: 1200
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/det_r50_vd/
  save_epoch_step: 1200
  # evaluation is run every 2000 iterations
  eval_batch_step: [0,2000]
  # 1. If pretrained_model is saved in static mode, such as classification pretrained model
  #    from static branch, load_static_weights must be set as True.
  # 2. If you want to finetune the pretrained models we provide in the docs,
  #    you should set load_static_weights as False.
  load_static_weights: True
  cal_metric_during_train: False
  pretrained_model: ./pretrain_models/ResNet50_vd_ssld_pretrained
  checkpoints:
  save_inference_dir:
  use_visualdl: False
  infer_img: doc/imgs_en/img_10.jpg
  save_res_path: ./output/det_db/predicts_db.txt

Architecture:
  model_type: det
  algorithm: DB
  Transform:
  Backbone:
    name: ResNet
    layers: 50
  Neck:
    name: DBFPN
    out_channels: 256
  Head:
    name: DBHead
    k: 50

Loss:
  name: DBLoss
  balance_loss: true
  main_loss_type: DiceLoss
  alpha: 5
  beta: 10
  ohem_ratio: 3

Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    learning_rate: 0.001
  regularizer:
    name: 'L2'
    factor: 0

PostProcess:
  name: DBPostProcess
  thresh: 0.3
  box_thresh: 0.7
  max_candidates: 1000
  unclip_ratio: 1.5

Metric:
  name: DetMetric
  main_indicator: hmean

Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/icdar2015/text_localization/
    label_file_list:
      - ./train_data/icdar2015/text_localization/train_icdar2015_label.txt
    ratio_list: [1.0]
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - DetLabelEncode: # Class handling label
      - IaaAugment:
          augmenter_args:
            - { 'type': Fliplr, 'args': { 'p': 0.5 } }
            - { 'type': Affine, 'args': { 'rotate': [-10, 10] } }
            - { 'type': Resize, 'args': { 'size': [0.5, 3] } }
      - EastRandomCropData:
          size: [640, 640]
          max_tries: 50
          keep_ratio: true
      - MakeBorderMap:
          shrink_ratio: 0.4
          thresh_min: 0.3
          thresh_max: 0.7
      - MakeShrinkMap:
          shrink_ratio: 0.4
          min_text_size: 8
      - NormalizeImage:
          scale: 1./255.
          mean: [0.485, 0.456, 0.406]
          std: [0.229, 0.224, 0.225]
          order: 'hwc'
      - ToCHWImage:
      - KeepKeys:
          keep_keys: ['image', 'threshold_map', 'threshold_mask', 'shrink_map', 'shrink_mask'] # the order of the dataloader list
  loader:
    shuffle: True
    drop_last: False
    batch_size_per_card: 16
    num_workers: 8

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/icdar2015/text_localization/
    label_file_list:
      - ./train_data/icdar2015/text_localization/test_icdar2015_label.txt
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - DetLabelEncode: # Class handling label
      - DetResizeForTest:
          image_shape: [736, 1280]
      - NormalizeImage:
          scale: 1./255.
          mean: [0.485, 0.456, 0.406]
          std: [0.229, 0.224, 0.225]
          order: 'hwc'
      - ToCHWImage:
      - KeepKeys:
          keep_keys: ['image', 'shape', 'polys', 'ignore_tags']
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 1 # must be 1
    num_workers: 8
