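I followed this document to run finetune training.
Then I got an error. It looks like the dataset failed to load? But the dataset has definitely been downloaded, and I can open the directory directly.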
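Opening the directory only shows that the files exist; it does not show that every entry in the label file resolves to a readable image. Below is a minimal sketch of such a check, assuming the label file holds one tab-separated entry per line (a relative image path, then a JSON list of annotations) and that image paths are relative to `data_dir`. The function name `check_label_file` is hypothetical and is not part of PaddleOCR.

```python
import json
import os


def check_label_file(data_dir, label_file):
    """Return a list of (line_number, problem) tuples for one label file.

    Assumes each line is "<relative image path><TAB><JSON list of boxes>".
    This helper is illustrative and not part of PaddleOCR.
    """
    problems = []
    with open(label_file, "r", encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.rstrip("\n")
            if not line.strip():
                continue  # skip blank lines
            parts = line.split("\t")
            if len(parts) != 2:
                problems.append((line_no, "expected exactly one tab separator"))
                continue
            rel_path, annotation = parts
            # The image path is resolved relative to data_dir (assumption).
            if not os.path.isfile(os.path.join(data_dir, rel_path)):
                problems.append((line_no, "image not found: " + rel_path))
            try:
                boxes = json.loads(annotation)
            except ValueError:
                problems.append((line_no, "annotation is not valid JSON"))
                continue
            if not isinstance(boxes, list):
                problems.append((line_no, "annotation is not a JSON list"))
    return problems


if __name__ == "__main__":
    # Paths taken from the Train section of the config in this post.
    data_dir = "./train_data/icdar2015/text_localization/"
    label_file = os.path.join(data_dir, "train_icdar2015_label.txt")
    if os.path.isfile(label_file):
        for line_no, problem in check_label_file(data_dir, label_file):
            print(line_no, problem)
    else:
        print("label file not found:", label_file)
```

An empty result would suggest that the paths and annotations are at least well-formed, so the failure is more likely elsewhere in the loading pipeline.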
# Single-machine, single-GPU training of the det_r50_vd model
%cd ../PaddleOCR/
!pip install imgaug
!pip install pyclipper
!pip install lmdb
!pip install Levenshtein
!python3 tools/train.py -c configs/det/det_r50_vd_db.yml \
-o Global.pretrain_weights=./pretrain_models/ResNet50_vd_ssld_pretrained/
"""
# Single-machine, multi-GPU training; use the --gpus parameter to set the GPU IDs
!python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs/det/det_r50_vd_db.yml \
-o Global.pretrain_weights=./pretrain_models/ResNet50_vd_ssld_pretrained/
"""
----------------------------------------------------------------------------------------------------
/home/aistudio/PaddleOCR
Looking in indexes: https://mirror.baidu.com/pypi/simple/
Requirement already satisfied: imgaug in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (0.4.0)
Requirement already satisfied: scipy in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from imgaug) (1.6.3)
Requirement already satisfied: Pillow in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from imgaug) (7.1.2)
Requirement already satisfied: numpy>=1.15 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from imgaug) (1.20.3)
Requirement already satisfied: opencv-python in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from imgaug) (4.1.1.26)
Requirement already satisfied: matplotlib in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from imgaug) (2.2.3)
Requirement already satisfied: Shapely in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from imgaug) (1.7.1)
Requirement already satisfied: six in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from imgaug) (1.15.0)
Requirement already satisfied: imageio in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from imgaug) (2.6.1)
Requirement already satisfied: scikit-image>=0.14.2 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from imgaug) (0.18.1)
Requirement already satisfied: python-dateutil>=2.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from matplotlib->imgaug) (2.8.0)
Requirement already satisfied: kiwisolver>=1.0.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from matplotlib->imgaug) (1.1.0)
Requirement already satisfied: cycler>=0.10 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from matplotlib->imgaug) (0.10.0)
Requirement already satisfied: pytz in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from matplotlib->imgaug) (2019.3)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from matplotlib->imgaug) (2.4.2)
Requirement already satisfied: networkx>=2.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from scikit-image>=0.14.2->imgaug) (2.4)
Requirement already satisfied: PyWavelets>=1.1.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from scikit-image>=0.14.2->imgaug) (1.1.1)
Requirement already satisfied: tifffile>=2019.7.26 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from scikit-image>=0.14.2->imgaug) (2021.4.8)
Requirement already satisfied: setuptools in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from kiwisolver>=1.0.1->matplotlib->imgaug) (56.2.0)
Requirement already satisfied: decorator>=4.3.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from networkx>=2.0->scikit-image>=0.14.2->imgaug) (4.4.2)
Looking in indexes: https://mirror.baidu.com/pypi/simple/
Requirement already satisfied: pyclipper in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (1.2.1)
Looking in indexes: https://mirror.baidu.com/pypi/simple/
Requirement already satisfied: lmdb in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (1.2.1)
Looking in indexes: https://mirror.baidu.com/pypi/simple/
Requirement already satisfied: Levenshtein in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (0.12.0)
Requirement already satisfied: setuptools in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from Levenshtein) (56.2.0)
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/utils.py:26: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
def convert_to_list(value, n, name, dtype=np.int):
[2021/05/28 09:49:48] root INFO: Architecture :
[2021/05/28 09:49:48] root INFO: Backbone :
[2021/05/28 09:49:48] root INFO: layers : 50
[2021/05/28 09:49:48] root INFO: name : ResNet
[2021/05/28 09:49:48] root INFO: Head :
[2021/05/28 09:49:48] root INFO: k : 50
[2021/05/28 09:49:48] root INFO: name : DBHead
[2021/05/28 09:49:48] root INFO: Neck :
[2021/05/28 09:49:48] root INFO: name : DBFPN
[2021/05/28 09:49:48] root INFO: out_channels : 256
[2021/05/28 09:49:48] root INFO: Transform : None
[2021/05/28 09:49:48] root INFO: algorithm : DB
[2021/05/28 09:49:48] root INFO: model_type : det
[2021/05/28 09:49:48] root INFO: Eval :
[2021/05/28 09:49:48] root INFO: dataset :
[2021/05/28 09:49:48] root INFO: data_dir : ./train_data/icdar2015/text_localization/
[2021/05/28 09:49:48] root INFO: label_file_list : ['./train_data/icdar2015/text_localization/test_icdar2015_label.txt']
[2021/05/28 09:49:48] root INFO: name : SimpleDataSet
[2021/05/28 09:49:48] root INFO: transforms :
[2021/05/28 09:49:48] root INFO: DecodeImage :
[2021/05/28 09:49:48] root INFO: channel_first : False
[2021/05/28 09:49:48] root INFO: img_mode : BGR
[2021/05/28 09:49:48] root INFO: DetLabelEncode : None
[2021/05/28 09:49:48] root INFO: DetResizeForTest :
[2021/05/28 09:49:48] root INFO: image_shape : [736, 1280]
[2021/05/28 09:49:48] root INFO: NormalizeImage :
[2021/05/28 09:49:48] root INFO: mean : [0.485, 0.456, 0.406]
[2021/05/28 09:49:48] root INFO: order : hwc
[2021/05/28 09:49:48] root INFO: scale : 1./255.
[2021/05/28 09:49:48] root INFO: std : [0.229, 0.224, 0.225]
[2021/05/28 09:49:48] root INFO: ToCHWImage : None
[2021/05/28 09:49:48] root INFO: KeepKeys :
[2021/05/28 09:49:48] root INFO: keep_keys : ['image', 'shape', 'polys', 'ignore_tags']
[2021/05/28 09:49:48] root INFO: loader :
[2021/05/28 09:49:48] root INFO: batch_size_per_card : 1
[2021/05/28 09:49:48] root INFO: drop_last : False
[2021/05/28 09:49:48] root INFO: num_workers : 8
[2021/05/28 09:49:48] root INFO: shuffle : False
[2021/05/28 09:49:48] root INFO: Global :
[2021/05/28 09:49:48] root INFO: cal_metric_during_train : False
[2021/05/28 09:49:48] root INFO: checkpoints : None
[2021/05/28 09:49:48] root INFO: debug : False
[2021/05/28 09:49:48] root INFO: distributed : False
[2021/05/28 09:49:48] root INFO: epoch_num : 1200
[2021/05/28 09:49:48] root INFO: eval_batch_step : [0, 2000]
[2021/05/28 09:49:48] root INFO: infer_img : doc/imgs_en/img_10.jpg
[2021/05/28 09:49:48] root INFO: load_static_weights : True
[2021/05/28 09:49:48] root INFO: log_smooth_window : 20
[2021/05/28 09:49:48] root INFO: pretrain_weights : ./pretrain_models/ResNet50_vd_ssld_pretrained/
[2021/05/28 09:49:48] root INFO: pretrained_model : ./pretrain_models/ResNet50_vd_ssld_pretrained
[2021/05/28 09:49:48] root INFO: print_batch_step : 10
[2021/05/28 09:49:48] root INFO: save_epoch_step : 1200
[2021/05/28 09:49:48] root INFO: save_inference_dir : None
[2021/05/28 09:49:48] root INFO: save_model_dir : ./output/det_r50_vd/
[2021/05/28 09:49:48] root INFO: save_res_path : ./output/det_db/predicts_db.txt
[2021/05/28 09:49:48] root INFO: use_gpu : True
[2021/05/28 09:49:48] root INFO: use_visualdl : False
[2021/05/28 09:49:48] root INFO: Loss :
[2021/05/28 09:49:48] root INFO: alpha : 5
[2021/05/28 09:49:48] root INFO: balance_loss : True
[2021/05/28 09:49:48] root INFO: beta : 10
[2021/05/28 09:49:48] root INFO: main_loss_type : DiceLoss
[2021/05/28 09:49:48] root INFO: name : DBLoss
[2021/05/28 09:49:48] root INFO: ohem_ratio : 3
[2021/05/28 09:49:48] root INFO: Metric :
[2021/05/28 09:49:48] root INFO: main_indicator : hmean
[2021/05/28 09:49:48] root INFO: name : DetMetric
[2021/05/28 09:49:48] root INFO: Optimizer :
[2021/05/28 09:49:48] root INFO: beta1 : 0.9
[2021/05/28 09:49:48] root INFO: beta2 : 0.999
[2021/05/28 09:49:48] root INFO: lr :
[2021/05/28 09:49:48] root INFO: learning_rate : 0.001
[2021/05/28 09:49:48] root INFO: name : Adam
[2021/05/28 09:49:48] root INFO: regularizer :
[2021/05/28 09:49:48] root INFO: factor : 0
[2021/05/28 09:49:48] root INFO: name : L2
[2021/05/28 09:49:48] root INFO: PostProcess :
[2021/05/28 09:49:48] root INFO: box_thresh : 0.7
[2021/05/28 09:49:48] root INFO: max_candidates : 1000
[2021/05/28 09:49:48] root INFO: name : DBPostProcess
[2021/05/28 09:49:48] root INFO: thresh : 0.3
[2021/05/28 09:49:48] root INFO: unclip_ratio : 1.5
[2021/05/28 09:49:48] root INFO: Train :
[2021/05/28 09:49:48] root INFO: dataset :
[2021/05/28 09:49:48] root INFO: data_dir : ./train_data/icdar2015/text_localization/
[2021/05/28 09:49:48] root INFO: label_file_list : ['./train_data/icdar2015/text_localization/train_icdar2015_label.txt']
[2021/05/28 09:49:48] root INFO: name : SimpleDataSet
[2021/05/28 09:49:48] root INFO: ratio_list : [1.0]
[2021/05/28 09:49:48] root INFO: transforms :
[2021/05/28 09:49:48] root INFO: DecodeImage :
[2021/05/28 09:49:48] root INFO: channel_first : False
[2021/05/28 09:49:48] root INFO: img_mode : BGR
[2021/05/28 09:49:48] root INFO: DetLabelEncode : None
[2021/05/28 09:49:48] root INFO: IaaAugment :
[2021/05/28 09:49:48] root INFO: augmenter_args :
[2021/05/28 09:49:48] root INFO: args :
[2021/05/28 09:49:48] root INFO: p : 0.5
[2021/05/28 09:49:48] root INFO: type : Fliplr
[2021/05/28 09:49:48] root INFO: args :
[2021/05/28 09:49:48] root INFO: rotate : [-10, 10]
[2021/05/28 09:49:48] root INFO: type : Affine
[2021/05/28 09:49:48] root INFO: args :
[2021/05/28 09:49:48] root INFO: size : [0.5, 3]
[2021/05/28 09:49:48] root INFO: type : Resize
[2021/05/28 09:49:48] root INFO: EastRandomCropData :
[2021/05/28 09:49:48] root INFO: keep_ratio : True
[2021/05/28 09:49:48] root INFO: max_tries : 50
[2021/05/28 09:49:48] root INFO: size : [640, 640]
[2021/05/28 09:49:48] root INFO: MakeBorderMap :
[2021/05/28 09:49:48] root INFO: shrink_ratio : 0.4
[2021/05/28 09:49:48] root INFO: thresh_max : 0.7
[2021/05/28 09:49:48] root INFO: thresh_min : 0.3
[2021/05/28 09:49:48] root INFO: MakeShrinkMap :
[2021/05/28 09:49:48] root INFO: min_text_size : 8
[2021/05/28 09:49:48] root INFO: shrink_ratio : 0.4
[2021/05/28 09:49:48] root INFO: NormalizeImage :
[2021/05/28 09:49:48] root INFO: mean : [0.485, 0.456, 0.406]
[2021/05/28 09:49:48] root INFO: order : hwc
[2021/05/28 09:49:48] root INFO: scale : 1./255.
[2021/05/28 09:49:48] root INFO: std : [0.229, 0.224, 0.225]
[2021/05/28 09:49:48] root INFO: ToCHWImage : None
[2021/05/28 09:49:48] root INFO: KeepKeys :
[2021/05/28 09:49:48] root INFO: keep_keys : ['image', 'threshold_map', 'threshold_mask', 'shrink_map', 'shrink_mask']
[2021/05/28 09:49:48] root INFO: loader :
[2021/05/28 09:49:48] root INFO: batch_size_per_card : 16
[2021/05/28 09:49:48] root INFO: drop_last : False
[2021/05/28 09:49:48] root INFO: num_workers : 1
[2021/05/28 09:49:48] root INFO: shuffle : True
[2021/05/28 09:49:48] root INFO: train with paddle 2.0.2 and device CUDAPlace(0)
[2021/05/28 09:49:48] root INFO: Initialize indexs of datasets:['./train_data/icdar2015/text_localization/train_icdar2015_label.txt']
[2021/05/28 09:49:48] root INFO: Initialize indexs of datasets:['./train_data/icdar2015/text_localization/test_icdar2015_label.txt']
W0528 09:49:48.468700 10893 device_context.cc:362] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.1, Runtime API Version: 10.1
W0528 09:49:48.473711 10893 device_context.cc:372] device: 0, cuDNN Version: 7.6.
[2021/05/28 09:49:54] root INFO: load pretrained model from ['./pretrain_models/ResNet50_vd_ssld_pretrained']
[2021/05/28 09:49:54] root INFO: train dataloader has 63 iters
[2021/05/28 09:49:54] root INFO: valid dataloader has 500 iters
[2021/05/28 09:49:54] root INFO: During the training process, after the 0th iteration, an evaluation is run every 2000 iterations
[2021/05/28 09:49:54] root INFO: Initialize indexs of datasets:['./train_data/icdar2015/text_localization/train_icdar2015_label.txt']
2021-05-28 09:49:59,340 - ERROR - DataLoader reader thread raised an exception!
Traceback (most recent call last):
File "tools/train.py", line 125, in <module>
main(config, device, logger, vdl_writer)
File "tools/train.py", line 102, in main
eval_class, pre_best_model_dict, logger, vdl_writer)
File "/home/aistudio/PaddleOCR/tools/program.py", line 199, in train
Exception in thread Thread-1:
Traceback (most recent call last):
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dataloader/dataloader_iter.py", line 684, in _get_data
data = self._data_queue.get(timeout=self._timeout)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/multiprocessing/queues.py", line 105, in get
raise Empty
_queue.Empty
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/threading.py", line 926, in _bootstrap_inner
self.run()
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dataloader/dataloader_iter.py", line 616, in _thread_loop
batch = self._get_data()
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dataloader/dataloader_iter.py", line 700, in _get_data
"pids: {}".format(len(failed_workers), pids))
RuntimeError: DataLoader 1 workers exit unexpectedly, pids: 10956
for idx, batch in enumerate(train_dataloader):
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dataloader/dataloader_iter.py", line 779, in __next__
data = self._reader.read_next_var_list()
SystemError: (Fatal) Blocking queue is killed because the data reader raises an exception.
[Hint: Expected killed_ != true, but received killed_:1 == true:1.] (at /paddle/paddle/fluid/operators/reader/blocking_queue.h:158)
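The traceback only reports that a worker process exited; it does not say why. One possible cause in container-style environments, which this log neither confirms nor rules out, is a small shared-memory mount, since multi-process data loading commonly passes batches through `/dev/shm`. Below is a minimal sketch for inspecting it, assuming a Linux host. The helper name `shm_free_bytes` is hypothetical.

```python
import os
import shutil


def shm_free_bytes(path="/dev/shm"):
    """Return free bytes on the shared-memory mount, or None if it is absent."""
    if not os.path.isdir(path):
        return None
    return shutil.disk_usage(path).free


if __name__ == "__main__":
    free = shm_free_bytes()
    if free is None:
        print("no /dev/shm on this system")
    else:
        print("free shared memory: %.1f MiB" % (free / (1024 * 1024)))
```

If the reported value is small, a cheap experiment is to set `num_workers` under `Train.loader` to 0 so that samples are loaded in the main process, where the underlying exception is more likely to be shown directly.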
"\n# 单机多卡训练,通过 --gpus 参数设置使用的GPU ID\n!python3 -m paddle.distributed.launch --gpus '0,1,2,3' tools/train.py -c configs/det/det_r50_vd_db.yml -o Global.pretrain_weights=./pretrain_models/ResNet50_vd_ssld_pretrained/\n"
Could someone please help me figure out what is going on? Thanks♪(・ω・)ノ
Global:
  use_gpu: true
  epoch_num: 1200
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/det_r50_vd/
  save_epoch_step: 1200
  # evaluation is run every 2000 iterations
  eval_batch_step: [0,2000]
  # 1. If pretrained_model is saved in static mode, such as classification pretrained model
  #    from static branch, load_static_weights must be set as True.
  # 2. If you want to finetune the pretrained models we provide in the docs,
  #    you should set load_static_weights as False.
  load_static_weights: True
  cal_metric_during_train: False
  pretrained_model: ./pretrain_models/ResNet50_vd_ssld_pretrained
  checkpoints:
  save_inference_dir:
  use_visualdl: False
  infer_img: doc/imgs_en/img_10.jpg
  save_res_path: ./output/det_db/predicts_db.txt
Architecture:
  model_type: det
  algorithm: DB
  Transform:
  Backbone:
    name: ResNet
    layers: 50
  Neck:
    name: DBFPN
    out_channels: 256
  Head:
    name: DBHead
    k: 50
Loss:
  name: DBLoss
  balance_loss: true
  main_loss_type: DiceLoss
  alpha: 5
  beta: 10
  ohem_ratio: 3
Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    learning_rate: 0.001
  regularizer:
    name: 'L2'
    factor: 0
PostProcess:
  name: DBPostProcess
  thresh: 0.3
  box_thresh: 0.7
  max_candidates: 1000
  unclip_ratio: 1.5
Metric:
  name: DetMetric
  main_indicator: hmean
Train:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/icdar2015/text_localization/
    label_file_list:
      - ./train_data/icdar2015/text_localization/train_icdar2015_label.txt
    ratio_list: [1.0]
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - DetLabelEncode: # Class handling label
      - IaaAugment:
          augmenter_args:
            - { 'type': Fliplr, 'args': { 'p': 0.5 } }
            - { 'type': Affine, 'args': { 'rotate': [-10, 10] } }
            - { 'type': Resize, 'args': { 'size': [0.5, 3] } }
      - EastRandomCropData:
          size: [640, 640]
          max_tries: 50
          keep_ratio: true
      - MakeBorderMap:
          shrink_ratio: 0.4
          thresh_min: 0.3
          thresh_max: 0.7
      - MakeShrinkMap:
          shrink_ratio: 0.4
          min_text_size: 8
      - NormalizeImage:
          scale: 1./255.
          mean: [0.485, 0.456, 0.406]
          std: [0.229, 0.224, 0.225]
          order: 'hwc'
      - ToCHWImage:
      - KeepKeys:
          keep_keys: ['image', 'threshold_map', 'threshold_mask', 'shrink_map', 'shrink_mask'] # the order of the dataloader list
  loader:
    shuffle: True
    drop_last: False
    batch_size_per_card: 16
    num_workers: 8
Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./train_data/icdar2015/text_localization/
    label_file_list:
      - ./train_data/icdar2015/text_localization/test_icdar2015_label.txt
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - DetLabelEncode: # Class handling label
      - DetResizeForTest:
          image_shape: [736, 1280]
      - NormalizeImage:
          scale: 1./255.
          mean: [0.485, 0.456, 0.406]
          std: [0.229, 0.224, 0.225]
          order: 'hwc'
      - ToCHWImage:
      - KeepKeys:
          keep_keys: ['image', 'shape', 'polys', 'ignore_tags']
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 1 # must be 1
    num_workers: 8
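The logged `Global` section lists both `pretrain_weights` (from the `-o` override on the command line) and `pretrained_model` (from this file), which suggests the override added a new key instead of replacing the existing one. The sketch below illustrates that behavior with a simplified dotted-key merge. `apply_override` is a hypothetical helper written for illustration and is not PaddleOCR's actual implementation.

```python
def apply_override(config, dotted_key, value):
    """Set config[a][b]...[z] = value for a dotted key "a.b...z".

    Simplified illustration: a leaf key that does not exist yet is created,
    so a misspelled key is added silently instead of raising an error.
    """
    keys = dotted_key.split(".")
    node = config
    for key in keys[:-1]:
        node = node[key]  # intermediate sections must already exist
    node[keys[-1]] = value
    return config


if __name__ == "__main__":
    # Values copied from the command line and config in this post.
    cfg = {"Global": {"pretrained_model": "./pretrain_models/ResNet50_vd_ssld_pretrained"}}
    apply_override(cfg, "Global.pretrain_weights",
                   "./pretrain_models/ResNet50_vd_ssld_pretrained/")
    print(sorted(cfg["Global"]))  # both keys are now present
```

In this run both keys point to the same directory, and the log reports the pretrained model being loaded from the `pretrained_model` path, so this mismatch is probably not what kills the DataLoader worker.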
Has this been resolved?