How do I fix the "no shared memory" error?

As the title says: training fails with an error about insufficient shared memory. How do I fix it?

 

/home/aistudio/PaddleOCR
[2021/12/23 02:18:34] root INFO: Architecture : 
[2021/12/23 02:18:34] root INFO:     Backbone : 
[2021/12/23 02:18:34] root INFO:         layers : 50
[2021/12/23 02:18:34] root INFO:         name : ResNet
[2021/12/23 02:18:34] root INFO:     Head : 
[2021/12/23 02:18:34] root INFO:         k : 50
[2021/12/23 02:18:34] root INFO:         name : DBHead
[2021/12/23 02:18:34] root INFO:     Neck : 
[2021/12/23 02:18:34] root INFO:         name : DBFPN
[2021/12/23 02:18:34] root INFO:         out_channels : 256
[2021/12/23 02:18:34] root INFO:     Transform : None
[2021/12/23 02:18:34] root INFO:     algorithm : DB
[2021/12/23 02:18:34] root INFO:     model_type : det
[2021/12/23 02:18:34] root INFO: Eval : 
[2021/12/23 02:18:34] root INFO:     dataset : 
[2021/12/23 02:18:34] root INFO:         data_dir : /home/aistudio/icdar2015/text_localization/
[2021/12/23 02:18:34] root INFO:         label_file_list : ['/home/aistudio/icdar2015/text_localization/test_icdar2015_label.txt']
[2021/12/23 02:18:34] root INFO:         name : SimpleDataSet
[2021/12/23 02:18:34] root INFO:         transforms : 
[2021/12/23 02:18:34] root INFO:             DecodeImage : 
[2021/12/23 02:18:34] root INFO:                 channel_first : False
[2021/12/23 02:18:34] root INFO:                 img_mode : BGR
[2021/12/23 02:18:34] root INFO:             DetLabelEncode : None
[2021/12/23 02:18:34] root INFO:             DetResizeForTest : 
[2021/12/23 02:18:34] root INFO:                 image_shape : [736, 1280]
[2021/12/23 02:18:34] root INFO:             NormalizeImage : 
[2021/12/23 02:18:34] root INFO:                 mean : [0.485, 0.456, 0.406]
[2021/12/23 02:18:34] root INFO:                 order : hwc
[2021/12/23 02:18:34] root INFO:                 scale : 1./255.
[2021/12/23 02:18:34] root INFO:                 std : [0.229, 0.224, 0.225]
[2021/12/23 02:18:34] root INFO:             ToCHWImage : None
[2021/12/23 02:18:34] root INFO:             KeepKeys : 
[2021/12/23 02:18:34] root INFO:                 keep_keys : ['image', 'shape', 'polys', 'ignore_tags']
[2021/12/23 02:18:34] root INFO:     loader : 
[2021/12/23 02:18:34] root INFO:         batch_size_per_card : 1
[2021/12/23 02:18:34] root INFO:         drop_last : False
[2021/12/23 02:18:34] root INFO:         num_workers : 8
[2021/12/23 02:18:34] root INFO:         shuffle : False
[2021/12/23 02:18:34] root INFO: Global : 
[2021/12/23 02:18:34] root INFO:     cal_metric_during_train : False
[2021/12/23 02:18:34] root INFO:     checkpoints : None
[2021/12/23 02:18:34] root INFO:     debug : False
[2021/12/23 02:18:34] root INFO:     distributed : False
[2021/12/23 02:18:34] root INFO:     epoch_num : 1200
[2021/12/23 02:18:34] root INFO:     eval_batch_step : [10, 20]
[2021/12/23 02:18:34] root INFO:     infer_img : doc/imgs_en/img_10.jpg
[2021/12/23 02:18:34] root INFO:     log_smooth_window : 20
[2021/12/23 02:18:34] root INFO:     pretrain_weights : ./pretrain_models/ResNet50_vd_ssld_pretrained.pdparams
[2021/12/23 02:18:34] root INFO:     pretrained_model : ./pretrain_models/ResNet50_vd_ssld_pretrained
[2021/12/23 02:18:34] root INFO:     print_batch_step : 10
[2021/12/23 02:18:34] root INFO:     save_epoch_step : 100
[2021/12/23 02:18:34] root INFO:     save_inference_dir : None
[2021/12/23 02:18:34] root INFO:     save_model_dir : ./output/det_r50_vd/
[2021/12/23 02:18:34] root INFO:     save_res_path : ./output/det_db/predicts_db.txt
[2021/12/23 02:18:34] root INFO:     use_gpu : True
[2021/12/23 02:18:34] root INFO:     use_visualdl : False
[2021/12/23 02:18:34] root INFO: Loss : 
[2021/12/23 02:18:34] root INFO:     alpha : 5
[2021/12/23 02:18:34] root INFO:     balance_loss : True
[2021/12/23 02:18:34] root INFO:     beta : 10
[2021/12/23 02:18:34] root INFO:     main_loss_type : DiceLoss
[2021/12/23 02:18:34] root INFO:     name : DBLoss
[2021/12/23 02:18:34] root INFO:     ohem_ratio : 3
[2021/12/23 02:18:34] root INFO: Metric : 
[2021/12/23 02:18:34] root INFO:     main_indicator : hmean
[2021/12/23 02:18:34] root INFO:     name : DetMetric
[2021/12/23 02:18:34] root INFO: Optimizer : 
[2021/12/23 02:18:34] root INFO:     beta1 : 0.9
[2021/12/23 02:18:34] root INFO:     beta2 : 0.999
[2021/12/23 02:18:34] root INFO:     lr : 
[2021/12/23 02:18:34] root INFO:         learning_rate : 0.001
[2021/12/23 02:18:34] root INFO:     name : Adam
[2021/12/23 02:18:34] root INFO:     regularizer : 
[2021/12/23 02:18:34] root INFO:         factor : 0
[2021/12/23 02:18:34] root INFO:         name : L2
[2021/12/23 02:18:34] root INFO: PostProcess : 
[2021/12/23 02:18:34] root INFO:     box_thresh : 0.7
[2021/12/23 02:18:34] root INFO:     max_candidates : 1000
[2021/12/23 02:18:34] root INFO:     name : DBPostProcess
[2021/12/23 02:18:34] root INFO:     thresh : 0.3
[2021/12/23 02:18:34] root INFO:     unclip_ratio : 1.5
[2021/12/23 02:18:34] root INFO: Train : 
[2021/12/23 02:18:34] root INFO:     dataset : 
[2021/12/23 02:18:34] root INFO:         data_dir : /home/aistudio/icdar2015/text_localization
[2021/12/23 02:18:34] root INFO:         label_file_list : ['/home/aistudio/icdar2015/text_localization/train_icdar2015_label.txt']
[2021/12/23 02:18:34] root INFO:         name : SimpleDataSet
[2021/12/23 02:18:34] root INFO:         ratio_list : [1.0]
[2021/12/23 02:18:34] root INFO:         transforms : 
[2021/12/23 02:18:34] root INFO:             DecodeImage : 
[2021/12/23 02:18:34] root INFO:                 channel_first : False
[2021/12/23 02:18:34] root INFO:                 img_mode : BGR
[2021/12/23 02:18:34] root INFO:             DetLabelEncode : None
[2021/12/23 02:18:34] root INFO:             IaaAugment : 
[2021/12/23 02:18:34] root INFO:                 augmenter_args : 
[2021/12/23 02:18:34] root INFO:                     args : 
[2021/12/23 02:18:34] root INFO:                         p : 0.5
[2021/12/23 02:18:34] root INFO:                     type : Fliplr
[2021/12/23 02:18:34] root INFO:                     args : 
[2021/12/23 02:18:34] root INFO:                         rotate : [-10, 10]
[2021/12/23 02:18:34] root INFO:                     type : Affine
[2021/12/23 02:18:34] root INFO:                     args : 
[2021/12/23 02:18:34] root INFO:                         size : [0.5, 3]
[2021/12/23 02:18:34] root INFO:                     type : Resize
[2021/12/23 02:18:34] root INFO:             EastRandomCropData : 
[2021/12/23 02:18:34] root INFO:                 keep_ratio : True
[2021/12/23 02:18:34] root INFO:                 max_tries : 50
[2021/12/23 02:18:34] root INFO:                 size : [640, 640]
[2021/12/23 02:18:34] root INFO:             MakeBorderMap : 
[2021/12/23 02:18:34] root INFO:                 shrink_ratio : 0.4
[2021/12/23 02:18:34] root INFO:                 thresh_max : 0.7
[2021/12/23 02:18:34] root INFO:                 thresh_min : 0.3
[2021/12/23 02:18:34] root INFO:             MakeShrinkMap : 
[2021/12/23 02:18:34] root INFO:                 min_text_size : 8
[2021/12/23 02:18:34] root INFO:                 shrink_ratio : 0.4
[2021/12/23 02:18:34] root INFO:             NormalizeImage : 
[2021/12/23 02:18:34] root INFO:                 mean : [0.485, 0.456, 0.406]
[2021/12/23 02:18:34] root INFO:                 order : hwc
[2021/12/23 02:18:34] root INFO:                 scale : 1./255.
[2021/12/23 02:18:34] root INFO:                 std : [0.229, 0.224, 0.225]
[2021/12/23 02:18:34] root INFO:             ToCHWImage : None
[2021/12/23 02:18:34] root INFO:             KeepKeys : 
[2021/12/23 02:18:34] root INFO:                 keep_keys : ['image', 'threshold_map', 'threshold_mask', 'shrink_map', 'shrink_mask']
[2021/12/23 02:18:34] root INFO:     loader : 
[2021/12/23 02:18:34] root INFO:         batch_size_per_card : 28
[2021/12/23 02:18:34] root INFO:         drop_last : False
[2021/12/23 02:18:34] root INFO:         num_workers : 4
[2021/12/23 02:18:34] root INFO:         shuffle : True
[2021/12/23 02:18:34] root INFO: profiler_options : None
[2021/12/23 02:18:34] root INFO: train with paddle 2.2.1 and device CUDAPlace(0)
[2021/12/23 02:18:34] root INFO: Initialize indexs of datasets:['/home/aistudio/icdar2015/text_localization/train_icdar2015_label.txt']
[2021/12/23 02:18:34] root INFO: Initialize indexs of datasets:['/home/aistudio/icdar2015/text_localization/test_icdar2015_label.txt']
W1223 02:18:34.179519 12460 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.1, Runtime API Version: 10.1
W1223 02:18:34.184252 12460 device_context.cc:465] device: 0, cuDNN Version: 7.6.
[2021/12/23 02:18:39] root WARNING: The shape of model params neck.in2_conv.weight [256, 256, 1, 1] not matched with loaded params out.weight [2048, 1000] !
[2021/12/23 02:18:39] root WARNING: The shape of model params neck.in3_conv.weight [256, 512, 1, 1] not matched with loaded params out.bias [1000] !
[2021/12/23 02:18:39] root INFO: load pretrain successful from ./pretrain_models/ResNet50_vd_ssld_pretrained
[2021/12/23 02:18:39] root INFO: train dataloader has 36 iters
[2021/12/23 02:18:39] root INFO: valid dataloader has 500 iters
[2021/12/23 02:18:39] root INFO: During the training process, after the 10th iteration, an evaluation is run every 20 iterations
[2021/12/23 02:18:39] root INFO: Initialize indexs of datasets:['/home/aistudio/icdar2015/text_localization/train_icdar2015_label.txt']
ERROR: Unexpected BUS error encountered in DataLoader worker. This might be caused by insufficient shared memory (shm), please check whether use_shared_memory is set and storage space in /dev/shm is enough
ERROR: Unexpected BUS error encountered in DataLoader worker. This might be caused by insufficient shared memory (shm), please check whether use_shared_memory is set and storage space in /dev/shm is enough
ERROR: Unexpected BUS error encountered in DataLoader worker. This might be caused by insufficient shared memory (shm), please check whether use_shared_memory is set and storage space in /dev/shm is enough
Traceback (most recent call last):
  File "tools/train.py", line 148, in <module>
    main(config, device, logger, vdl_writer)
  File "tools/train.py", line 125, in main
    eval_class, pre_best_model_dict, logger, vdl_writer, scaler)
  File "/home/aistudio/PaddleOCR/tools/program.py", line 255, in train
    optimizer.step()
  File "", line 2, in step
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/base.py", line 276, in __impl__
    return func(*args, **kwargs)
  File "", line 2, in step
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/wrapped_decorator.py", line 25, in __impl__
    return wrapped_func(*args, **kwargs)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/framework.py", line 229, in __impl__
    return func(*args, **kwargs)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/optimizer/adam.py", line 422, in step
    loss=None, startup_program=None, params_grads=params_grads)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/optimizer/optimizer.py", line 891, in _apply_optimize
    optimize_ops = self._create_optimization_pass(params_grads)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/optimizer/optimizer.py", line 696, in _create_optimization_pass
    self._append_optimize_op(target_block, param_and_grad)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/optimizer/adam.py", line 331, in _append_optimize_op
    'beta2', _beta2, 'multi_precision', find_master)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/multiprocess_utils.py", line 134, in __handler__
    core._throw_error_if_process_failed()
SystemError: (Fatal) DataLoader process (pid   1. If run DataLoader by DataLoader.from_generator(...), queue capacity is set by from_generator(..., capacity=xx, ...).
  2. If run DataLoader by DataLoader(dataset, ...), queue capacity is set as 2 times of the max value of num_workers and len(places).
  3. If run by DataLoader(dataset, ..., use_shared_memory=True), set use_shared_memory=False for not using shared memory.) exited is killed by signal: 12511.
  It may be caused by insufficient shared storage space. This problem usually occurs when using docker as a development environment.
  Please use command `df -h` to check the storage space of `/dev/shm`. Shared storage space needs to be greater than (DataLoader Num * DataLoader queue capacity * 1 batch data size).
  You can solve this problem by increasing the shared storage space or reducing the queue capacity appropriately.
Bus error (at /paddle/paddle/fluid/imperative/data_loader.cc:177)
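
The error message gives a rule of thumb: `/dev/shm` needs more space than (DataLoader Num * DataLoader queue capacity * 1 batch data size), where the queue capacity is 2 times the max of `num_workers` and `len(places)`. Below is a rough sketch of that arithmetic for the Train loader logged above (`batch_size_per_card : 28`, `num_workers : 4`, crop size `[640, 640]`). The helper name `estimate_shm_bytes`, the single-device `num_places=1`, and the guess that the image and the four maps in `keep_keys` are all float32 are assumptions for illustration, not values taken from Paddle.

```python
def estimate_shm_bytes(num_loaders, num_workers, num_places,
                       batch_size, sample_shapes, bytes_per_elem=4):
    """Lower bound on /dev/shm usage, following the formula in the error text."""
    # Queue capacity for DataLoader(dataset, ...) as stated in the error message.
    queue_capacity = 2 * max(num_workers, num_places)
    # Count the elements in one sample across all of its arrays.
    elems_per_sample = 0
    for shape in sample_shapes:
        count = 1
        for dim in shape:
            count *= dim
        elems_per_sample += count
    batch_bytes = batch_size * elems_per_sample * bytes_per_elem
    return num_loaders * queue_capacity * batch_bytes


# Assumed sample layout: a 3x640x640 image plus four 640x640 maps, all float32.
shapes = [(3, 640, 640)] + [(640, 640)] * 4
needed = estimate_shm_bytes(num_loaders=1, num_workers=4, num_places=1,
                            batch_size=28, sample_shapes=shapes)
print(f"estimated shared memory needed: {needed / 2**30:.2f} GiB")
```

Under these assumptions the estimate comes to roughly 2.4 GiB, so a small `/dev/shm` would plausibly trigger the bus error with this batch size.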
All comments (4)
JavaRoom
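#2 Replied 2021-12

AI Studio simply has a lot of users right now, so there are not enough resources to go around.

Closing the project and reopening it fixes the problem.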
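
If reopening does not help, the error text in the question names two other knobs: a smaller queue (fewer `num_workers`) or `use_shared_memory=False`. Here is a minimal sketch of picking those settings from the free space in `/dev/shm`. The function `pick_loader_settings` and its fallback policy are hypothetical; only `num_workers` and `use_shared_memory` are actual `paddle.io.DataLoader` arguments.

```python
def pick_loader_settings(free_bytes, batch_bytes, num_workers, num_places=1):
    """Shrink num_workers until the estimated queue fits in free shared memory.

    Uses the rule from the error message: queue capacity is
    2 * max(num_workers, num_places), and each slot holds one batch.
    """
    workers = num_workers
    while workers > 0:
        needed = 2 * max(workers, num_places) * batch_bytes
        if needed < free_bytes:
            return {"num_workers": workers, "use_shared_memory": True}
        workers -= 1
    # Nothing fits: load in the main process and avoid shared memory.
    return {"num_workers": 0, "use_shared_memory": False}


MIB = 2**20
settings = pick_loader_settings(free_bytes=512 * MIB, batch_bytes=100 * MIB,
                                num_workers=4)
print(settings)
```

The returned dict could be passed as keyword arguments, for example `paddle.io.DataLoader(dataset, batch_size=28, **settings)`. In the run from the question, `num_workers` comes from the `loader` section of the config shown in the log.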

三岁
#3 Replied 2021-12

Yep, a restart is all it takes~

Y_kira
#4 Replied 2021-12

Restart, hahaha. I've run into this many times lately.

DeepGeGe
#5 Replied 2021-12

Come to think of it, how large is the shared memory on AI Studio? How many users can it support at the same time?
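
The size of `/dev/shm` in the current session can at least be checked directly; the error message suggests `df -h` for this. Below is a small Python sketch of the same check, assuming a Linux environment where `/dev/shm` is mounted. The helper name `shm_report` is made up for illustration, and the check says nothing about how many users share the machine.

```python
import os
import shutil


def shm_report(path="/dev/shm"):
    """Return (total, used, free) in bytes for the given mount, or None if absent."""
    if not os.path.isdir(path):
        return None
    usage = shutil.disk_usage(path)
    return usage.total, usage.used, usage.free


report = shm_report()
if report is None:
    print("/dev/shm not found on this system")
else:
    total, used, free = report
    print(f"/dev/shm total={total / 2**20:.0f} MiB, "
          f"used={used / 2**20:.0f} MiB, free={free / 2**20:.0f} MiB")
```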
