How do I fix an "insufficient shared memory" error?
As the title says: training fails with a message about insufficient shared memory. How do I fix it? The full log:
/home/aistudio/PaddleOCR
[2021/12/23 02:18:34] root INFO: Architecture :
[2021/12/23 02:18:34] root INFO: Backbone :
[2021/12/23 02:18:34] root INFO: layers : 50
[2021/12/23 02:18:34] root INFO: name : ResNet
[2021/12/23 02:18:34] root INFO: Head :
[2021/12/23 02:18:34] root INFO: k : 50
[2021/12/23 02:18:34] root INFO: name : DBHead
[2021/12/23 02:18:34] root INFO: Neck :
[2021/12/23 02:18:34] root INFO: name : DBFPN
[2021/12/23 02:18:34] root INFO: out_channels : 256
[2021/12/23 02:18:34] root INFO: Transform : None
[2021/12/23 02:18:34] root INFO: algorithm : DB
[2021/12/23 02:18:34] root INFO: model_type : det
[2021/12/23 02:18:34] root INFO: Eval :
[2021/12/23 02:18:34] root INFO: dataset :
[2021/12/23 02:18:34] root INFO: data_dir : /home/aistudio/icdar2015/text_localization/
[2021/12/23 02:18:34] root INFO: label_file_list : ['/home/aistudio/icdar2015/text_localization/test_icdar2015_label.txt']
[2021/12/23 02:18:34] root INFO: name : SimpleDataSet
[2021/12/23 02:18:34] root INFO: transforms :
[2021/12/23 02:18:34] root INFO: DecodeImage :
[2021/12/23 02:18:34] root INFO: channel_first : False
[2021/12/23 02:18:34] root INFO: img_mode : BGR
[2021/12/23 02:18:34] root INFO: DetLabelEncode : None
[2021/12/23 02:18:34] root INFO: DetResizeForTest :
[2021/12/23 02:18:34] root INFO: image_shape : [736, 1280]
[2021/12/23 02:18:34] root INFO: NormalizeImage :
[2021/12/23 02:18:34] root INFO: mean : [0.485, 0.456, 0.406]
[2021/12/23 02:18:34] root INFO: order : hwc
[2021/12/23 02:18:34] root INFO: scale : 1./255.
[2021/12/23 02:18:34] root INFO: std : [0.229, 0.224, 0.225]
[2021/12/23 02:18:34] root INFO: ToCHWImage : None
[2021/12/23 02:18:34] root INFO: KeepKeys :
[2021/12/23 02:18:34] root INFO: keep_keys : ['image', 'shape', 'polys', 'ignore_tags']
[2021/12/23 02:18:34] root INFO: loader :
[2021/12/23 02:18:34] root INFO: batch_size_per_card : 1
[2021/12/23 02:18:34] root INFO: drop_last : False
[2021/12/23 02:18:34] root INFO: num_workers : 8
[2021/12/23 02:18:34] root INFO: shuffle : False
[2021/12/23 02:18:34] root INFO: Global :
[2021/12/23 02:18:34] root INFO: cal_metric_during_train : False
[2021/12/23 02:18:34] root INFO: checkpoints : None
[2021/12/23 02:18:34] root INFO: debug : False
[2021/12/23 02:18:34] root INFO: distributed : False
[2021/12/23 02:18:34] root INFO: epoch_num : 1200
[2021/12/23 02:18:34] root INFO: eval_batch_step : [10, 20]
[2021/12/23 02:18:34] root INFO: infer_img : doc/imgs_en/img_10.jpg
[2021/12/23 02:18:34] root INFO: log_smooth_window : 20
[2021/12/23 02:18:34] root INFO: pretrain_weights : ./pretrain_models/ResNet50_vd_ssld_pretrained.pdparams
[2021/12/23 02:18:34] root INFO: pretrained_model : ./pretrain_models/ResNet50_vd_ssld_pretrained
[2021/12/23 02:18:34] root INFO: print_batch_step : 10
[2021/12/23 02:18:34] root INFO: save_epoch_step : 100
[2021/12/23 02:18:34] root INFO: save_inference_dir : None
[2021/12/23 02:18:34] root INFO: save_model_dir : ./output/det_r50_vd/
[2021/12/23 02:18:34] root INFO: save_res_path : ./output/det_db/predicts_db.txt
[2021/12/23 02:18:34] root INFO: use_gpu : True
[2021/12/23 02:18:34] root INFO: use_visualdl : False
[2021/12/23 02:18:34] root INFO: Loss :
[2021/12/23 02:18:34] root INFO: alpha : 5
[2021/12/23 02:18:34] root INFO: balance_loss : True
[2021/12/23 02:18:34] root INFO: beta : 10
[2021/12/23 02:18:34] root INFO: main_loss_type : DiceLoss
[2021/12/23 02:18:34] root INFO: name : DBLoss
[2021/12/23 02:18:34] root INFO: ohem_ratio : 3
[2021/12/23 02:18:34] root INFO: Metric :
[2021/12/23 02:18:34] root INFO: main_indicator : hmean
[2021/12/23 02:18:34] root INFO: name : DetMetric
[2021/12/23 02:18:34] root INFO: Optimizer :
[2021/12/23 02:18:34] root INFO: beta1 : 0.9
[2021/12/23 02:18:34] root INFO: beta2 : 0.999
[2021/12/23 02:18:34] root INFO: lr :
[2021/12/23 02:18:34] root INFO: learning_rate : 0.001
[2021/12/23 02:18:34] root INFO: name : Adam
[2021/12/23 02:18:34] root INFO: regularizer :
[2021/12/23 02:18:34] root INFO: factor : 0
[2021/12/23 02:18:34] root INFO: name : L2
[2021/12/23 02:18:34] root INFO: PostProcess :
[2021/12/23 02:18:34] root INFO: box_thresh : 0.7
[2021/12/23 02:18:34] root INFO: max_candidates : 1000
[2021/12/23 02:18:34] root INFO: name : DBPostProcess
[2021/12/23 02:18:34] root INFO: thresh : 0.3
[2021/12/23 02:18:34] root INFO: unclip_ratio : 1.5
[2021/12/23 02:18:34] root INFO: Train :
[2021/12/23 02:18:34] root INFO: dataset :
[2021/12/23 02:18:34] root INFO: data_dir : /home/aistudio/icdar2015/text_localization
[2021/12/23 02:18:34] root INFO: label_file_list : ['/home/aistudio/icdar2015/text_localization/train_icdar2015_label.txt']
[2021/12/23 02:18:34] root INFO: name : SimpleDataSet
[2021/12/23 02:18:34] root INFO: ratio_list : [1.0]
[2021/12/23 02:18:34] root INFO: transforms :
[2021/12/23 02:18:34] root INFO: DecodeImage :
[2021/12/23 02:18:34] root INFO: channel_first : False
[2021/12/23 02:18:34] root INFO: img_mode : BGR
[2021/12/23 02:18:34] root INFO: DetLabelEncode : None
[2021/12/23 02:18:34] root INFO: IaaAugment :
[2021/12/23 02:18:34] root INFO: augmenter_args :
[2021/12/23 02:18:34] root INFO: args :
[2021/12/23 02:18:34] root INFO: p : 0.5
[2021/12/23 02:18:34] root INFO: type : Fliplr
[2021/12/23 02:18:34] root INFO: args :
[2021/12/23 02:18:34] root INFO: rotate : [-10, 10]
[2021/12/23 02:18:34] root INFO: type : Affine
[2021/12/23 02:18:34] root INFO: args :
[2021/12/23 02:18:34] root INFO: size : [0.5, 3]
[2021/12/23 02:18:34] root INFO: type : Resize
[2021/12/23 02:18:34] root INFO: EastRandomCropData :
[2021/12/23 02:18:34] root INFO: keep_ratio : True
[2021/12/23 02:18:34] root INFO: max_tries : 50
[2021/12/23 02:18:34] root INFO: size : [640, 640]
[2021/12/23 02:18:34] root INFO: MakeBorderMap :
[2021/12/23 02:18:34] root INFO: shrink_ratio : 0.4
[2021/12/23 02:18:34] root INFO: thresh_max : 0.7
[2021/12/23 02:18:34] root INFO: thresh_min : 0.3
[2021/12/23 02:18:34] root INFO: MakeShrinkMap :
[2021/12/23 02:18:34] root INFO: min_text_size : 8
[2021/12/23 02:18:34] root INFO: shrink_ratio : 0.4
[2021/12/23 02:18:34] root INFO: NormalizeImage :
[2021/12/23 02:18:34] root INFO: mean : [0.485, 0.456, 0.406]
[2021/12/23 02:18:34] root INFO: order : hwc
[2021/12/23 02:18:34] root INFO: scale : 1./255.
[2021/12/23 02:18:34] root INFO: std : [0.229, 0.224, 0.225]
[2021/12/23 02:18:34] root INFO: ToCHWImage : None
[2021/12/23 02:18:34] root INFO: KeepKeys :
[2021/12/23 02:18:34] root INFO: keep_keys : ['image', 'threshold_map', 'threshold_mask', 'shrink_map', 'shrink_mask']
[2021/12/23 02:18:34] root INFO: loader :
[2021/12/23 02:18:34] root INFO: batch_size_per_card : 28
[2021/12/23 02:18:34] root INFO: drop_last : False
[2021/12/23 02:18:34] root INFO: num_workers : 4
[2021/12/23 02:18:34] root INFO: shuffle : True
[2021/12/23 02:18:34] root INFO: profiler_options : None
[2021/12/23 02:18:34] root INFO: train with paddle 2.2.1 and device CUDAPlace(0)
[2021/12/23 02:18:34] root INFO: Initialize indexs of datasets:['/home/aistudio/icdar2015/text_localization/train_icdar2015_label.txt']
[2021/12/23 02:18:34] root INFO: Initialize indexs of datasets:['/home/aistudio/icdar2015/text_localization/test_icdar2015_label.txt']
W1223 02:18:34.179519 12460 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.1, Runtime API Version: 10.1
W1223 02:18:34.184252 12460 device_context.cc:465] device: 0, cuDNN Version: 7.6.
[2021/12/23 02:18:39] root WARNING: The shape of model params neck.in2_conv.weight [256, 256, 1, 1] not matched with loaded params out.weight [2048, 1000] !
[2021/12/23 02:18:39] root WARNING: The shape of model params neck.in3_conv.weight [256, 512, 1, 1] not matched with loaded params out.bias [1000] !
[2021/12/23 02:18:39] root INFO: load pretrain successful from ./pretrain_models/ResNet50_vd_ssld_pretrained
[2021/12/23 02:18:39] root INFO: train dataloader has 36 iters
[2021/12/23 02:18:39] root INFO: valid dataloader has 500 iters
[2021/12/23 02:18:39] root INFO: During the training process, after the 10th iteration, an evaluation is run every 20 iterations
[2021/12/23 02:18:39] root INFO: Initialize indexs of datasets:['/home/aistudio/icdar2015/text_localization/train_icdar2015_label.txt']
ERROR: Unexpected BUS error encountered in DataLoader worker. This might be caused by insufficient shared memory (shm), please check whether use_shared_memory is set and storage space in /dev/shm is enough
ERROR: Unexpected BUS error encountered in DataLoader worker. This might be caused by insufficient shared memory (shm), please check whether use_shared_memory is set and storage space in /dev/shm is enough
ERROR: Unexpected BUS error encountered in DataLoader worker. This might be caused by insufficient shared memory (shm), please check whether use_shared_memory is set and storage space in /dev/shm is enough
Traceback (most recent call last):
File "tools/train.py", line 148, in
main(config, device, logger, vdl_writer)
File "tools/train.py", line 125, in main
eval_class, pre_best_model_dict, logger, vdl_writer, scaler)
File "/home/aistudio/PaddleOCR/tools/program.py", line 255, in train
optimizer.step()
File "", line 2, in step
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/base.py", line 276, in __impl__
return func(*args, **kwargs)
File "", line 2, in step
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/wrapped_decorator.py", line 25, in __impl__
return wrapped_func(*args, **kwargs)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/framework.py", line 229, in __impl__
return func(*args, **kwargs)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/optimizer/adam.py", line 422, in step
loss=None, startup_program=None, params_grads=params_grads)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/optimizer/optimizer.py", line 891, in _apply_optimize
optimize_ops = self._create_optimization_pass(params_grads)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/optimizer/optimizer.py", line 696, in _create_optimization_pass
self._append_optimize_op(target_block, param_and_grad)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/optimizer/adam.py", line 331, in _append_optimize_op
'beta2', _beta2, 'multi_precision', find_master)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/multiprocess_utils.py", line 134, in __handler__
core._throw_error_if_process_failed()
SystemError: (Fatal) DataLoader process (pid 1. If run DataLoader by DataLoader.from_generator(...), queue capacity is set by from_generator(..., capacity=xx, ...).
2. If run DataLoader by DataLoader(dataset, ...), queue capacity is set as 2 times of the max value of num_workers and len(places).
3. If run by DataLoader(dataset, ..., use_shared_memory=True), set use_shared_memory=False for not using shared memory.) exited is killed by signal: 12511.
It may be caused by insufficient shared storage space. This problem usually occurs when using docker as a development environment.
Please use command `df -h` to check the storage space of `/dev/shm`. Shared storage space needs to be greater than (DataLoader Num * DataLoader queue capacity * 1 batch data size).
You can solve this problem by increasing the shared storage space or reducing the queue capacity appropriately.
Bus error (at /paddle/paddle/fluid/imperative/data_loader.cc:177)
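The error message gives a rule of thumb: `/dev/shm` needs more than (DataLoader Num * DataLoader queue capacity * 1 batch data size), with the queue capacity being 2 times the max of `num_workers` and `len(places)`. Below is a rough sketch of that arithmetic using the Train settings from the log above (batch size 28, 640x640 crops, 4 workers, one GPU). The helper names are made up for illustration, and it assumes every kept key is a dense float32 array (a 3-channel image plus four single-channel maps), which may not match the real pipeline exactly.

```python
# Rough estimate of the shared memory the training DataLoader may need,
# following the formula printed in the error message:
#   DataLoader Num * DataLoader queue capacity * 1 batch data size

def batch_bytes(batch_size, channels_per_sample, height, width, bytes_per_value=4):
    """Size of one batch, assuming dense arrays of shape (C, H, W), float32 by default."""
    return batch_size * channels_per_sample * height * width * bytes_per_value


def required_shm_bytes(num_loaders, num_workers, num_places, one_batch_bytes):
    """Apply the rule of thumb from the error message."""
    # Per the error message: queue capacity = 2 * max(num_workers, len(places)).
    queue_capacity = 2 * max(num_workers, num_places)
    return num_loaders * queue_capacity * one_batch_bytes


# Values read from the Train section of the log.
# Assumption: 3 image channels + 4 single-channel maps (the keep_keys), all float32.
one_batch = batch_bytes(batch_size=28, channels_per_sample=3 + 4, height=640, width=640)
needed = required_shm_bytes(num_loaders=1, num_workers=4, num_places=1,
                            one_batch_bytes=one_batch)

print(f"one batch: {one_batch / 2**20:.2f} MiB")
print(f"estimated /dev/shm needed: {needed / 2**30:.2f} GiB")
```

Under these assumptions the estimate comes to a couple of GiB, so a smaller `batch_size_per_card` or fewer `num_workers` shrinks the requirement proportionally.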
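AI Studio just has a lot of users right now, so there aren't enough resources to go around.
Shutting the environment down and starting it again fixes it.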
Got it, a restart is all it takes~
Restart, hahaha. I've run into this quite a few times lately.
Speaking of which, how much shared memory does AI Studio actually provide? And how many people can use it at the same time?
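One way to find out for your own session is to look at the `/dev/shm` mount directly, which is what the error message suggests doing with `df -h`. Here is a minimal Python sketch of the same check. The function name `shm_usage` is hypothetical, and the numbers it prints depend entirely on the environment you run it in.

```python
import os
import shutil


def shm_usage(path="/dev/shm"):
    """Return (total, used, free) in bytes for the given mount, or None if it does not exist."""
    if not os.path.isdir(path):
        return None
    usage = shutil.disk_usage(path)
    return usage.total, usage.used, usage.free


info = shm_usage()
if info is None:
    print("/dev/shm not found on this system")
else:
    total, used, free = info
    gib = 2**30
    print(f"/dev/shm total={total / gib:.2f} GiB "
          f"used={used / gib:.2f} GiB free={free / gib:.2f} GiB")
```

If the free space is well below what the training job needs, restarting (as suggested above), lowering `num_workers`, or setting `use_shared_memory=False` as the error message mentions are the options to try.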