Paddle 2.1 error in the GPU environment on AI Studio

When I train on my own dataset on AI Studio, I get the following error with the DataLoader set to 0 (num_workers=0):

2021-07-23 16:04:28,904 INFO ('n_thread', 1)
2021-07-23 16:04:28,904 INFO ('epoch', 200)
2021-07-23 16:04:28,904 INFO ('train_path', 'PaddleCV_EffV1Mul/datasets/train.txt')
2021-07-23 16:04:28,904 INFO ('val_path', 'PaddleCV_EffV1Mul/datasets/val.txt')
2021-07-23 16:04:28,904 INFO ('model_path', 'PaddleCV_EffV1Mul/model_save')
2021-07-23 16:04:28,904 INFO ('pre_train_model1', 'PaddleCV_EffV1Mul/EfficientNetB0_pretrained')
2021-07-23 16:04:28,904 INFO ('pre_train_model2', 'PaddleCV_EffV1Mul/EfficientNetB0_pretrained')
2021-07-23 16:04:28,904 INFO ('epoch:', 1, 'begin_train')
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/tensor/creation.py:125: DeprecationWarning: `np.object` is a deprecated alias for the builtin `object`. To silence this warning, use `object` by itself. Doing this will not modify any behavior and is safe.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
if data.dtype == np.object:


--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------
0 std::thread::_Impl >::_M_run()
1 std::__future_base::_State_baseV2::_M_do_set(std::function ()>*, bool*)
2 paddle::framework::SignalHandle(char const*, int)
3 paddle::platform::GetCurrentTraceBackString[abi:cxx11]()

----------------------
Error Message Summary:
----------------------
FatalError: `Segmentation fault` is detected by the operating system.
[TimeInfo: *** Aborted at 1627027469 (unix time) try "date -d @1627027469" if you are using GNU date ***]
[SignalInfo: *** SIGSEGV (@0x0) received by PID 3027 (TID 0x7ff9ab230700) from PID 0 ***]

Segmentation fault (core dumped)

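Setting num_workers to 1 fails as well. With num_workers set to 2 it runs, but of the 200 epochs I configured, training crashes at the eighth epoch with an error saying /dev/shm is out of space; the shared memory allocated there is too small.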
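Since the failure only appears at the eighth epoch, it may help to record how full /dev/shm is as training proceeds. The following is a minimal sketch using only the Python standard library; the helper names `shm_usage` and `log_shm` are hypothetical and are not part of the project or of Paddle.

```python
import shutil


def shm_usage(path="/dev/shm"):
    """Return (total, used, free) in MiB for the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    mib = 1024 * 1024
    return usage.total // mib, usage.used // mib, usage.free // mib


def log_shm(epoch, path="/dev/shm"):
    """Print and return one line describing shared-memory usage for this epoch."""
    total, used, free = shm_usage(path)
    line = f"epoch {epoch}: {path} total={total}MiB used={used}MiB free={free}MiB"
    print(line)
    return line
```

Calling `log_shm(epoch)` at the start of each epoch would show whether usage grows steadily (which would suggest segments are not being released) or is simply close to the limit from the first epoch.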

All comments (3)
JavaRoom
#2 replied 2021-07

Is shm too small?

StaveZhou
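#3 replied 2021-07
Quote: Is shm too small?

Yes. It runs locally, or on CPU. Advice online says to turn off multithreading, but in the cloud environment the multithreading cannot be turned off.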
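One option that may be worth trying, assuming the project builds its loader with `paddle.io.DataLoader` (the thread does not show that code): keep the worker processes but stop passing batches through /dev/shm by setting `use_shared_memory=False`. The sketch below only builds the keyword arguments, so it runs without Paddle installed. The helper name `choose_loader_kwargs`, the `batch_bytes` estimate, and the `safety` factor are assumptions made for illustration, not values from the thread.

```python
import shutil


def choose_loader_kwargs(batch_bytes, num_workers=2, shm_path="/dev/shm", safety=4):
    """Pick DataLoader keyword arguments based on free shared memory.

    batch_bytes: rough size of one batch in bytes (caller's estimate).
    safety: assumed headroom factor; tune for the actual workload.
    """
    free = shutil.disk_usage(shm_path).free
    # Assumed heuristic: each worker may hold a few batches in flight.
    needed = batch_bytes * max(num_workers, 1) * safety
    return {
        "num_workers": num_workers,
        # Only use shared memory when there appears to be enough room.
        "use_shared_memory": free >= needed,
    }


# Hypothetical usage (train_dataset and the batch size are placeholders):
# kwargs = choose_loader_kwargs(batch_bytes=64 * 3 * 224 * 224 * 4)
# loader = paddle.io.DataLoader(train_dataset, batch_size=64, **kwargs)
```

With shared memory disabled, batches are copied between processes through the ordinary queue instead, which is likely slower but does not depend on the size of /dev/shm.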

JavaRoom
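#4 replied 2021-07
Quote: Yes. It runs locally, or on CPU. Advice online says to turn off multithreading, but in the cloud environment the multithreading cannot be turned off.

As soon as I saw how large the data was, I got lazy and did not want to touch it...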
