PaddleCheckError: cudaGetDeviceCount failed in paddle::platform::GetCUDADeviceCountImpl, error code : 3
1) PaddleCloud, paddle-fluid-v1.6.1
2) GPU: V100
1) Single machine, single GPU
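Running the same code on PaddleCloud:
with a small amount of training data (2 images), training completes correctly;
with the full training data in a formal training run, it fails with the error below.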
Traceback (most recent call last):
  File "train.py", line 777, in <module>
    train(args)
  File "train.py", line 164, in train
    place = fluid.CUDAPlace(0)
paddle.fluid.core_avx.EnforceNotMet:
C++ Call Stacks (More useful to developers):
0 std::string paddle::platform::GetTraceBackString<char const*>(char const*&&, char const*, int)
1 paddle::platform::EnforceNotMet::EnforceNotMet(std::__exception_ptr::exception_ptr, char const*, int)
2 paddle::platform::GetCUDADeviceCount()
Error Message Summary:
PaddleCheckError: cudaGetDeviceCount failed in paddle::platform::GetCUDADeviceCountImpl, error code : 3, Please see detail in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html#group__CUDART__TYPES_1g3f51e3575c2178246db0a94a430e0038: initialization error at [/paddle/paddle/fluid/platform/gpu_info.cc:67]
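
Since the summary reports an "initialization error" from cudaGetDeviceCount, it may help to record what the job can see of the GPU before `fluid.CUDAPlace(0)` is created. The following is only a minimal diagnostic sketch, not a fix: the helper name `describe_gpu_environment` is hypothetical, it uses only the Python standard library, and it assumes `nvidia-smi` may or may not be on the PATH of the PaddleCloud worker.

```python
import os
import shutil
import subprocess


def describe_gpu_environment():
    """Collect basic facts about GPU visibility without importing paddle.

    Hypothetical helper for debugging; not part of the Paddle API.
    """
    info = {
        # None means the variable is unset; an empty string hides every GPU.
        "CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES"),
        # None means nvidia-smi was not found on PATH.
        "nvidia_smi_path": shutil.which("nvidia-smi"),
        "nvidia_smi_output": None,
    }
    if info["nvidia_smi_path"] is not None:
        try:
            # "nvidia-smi -L" lists the GPUs the driver exposes to this process.
            result = subprocess.run(
                [info["nvidia_smi_path"], "-L"],
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,
                universal_newlines=True,
                timeout=30,
            )
            info["nvidia_smi_output"] = result.stdout.strip()
        except (OSError, subprocess.TimeoutExpired) as exc:
            info["nvidia_smi_output"] = "nvidia-smi failed: {}".format(exc)
    return info


if __name__ == "__main__":
    # Print one "key: value" line per collected fact for the job log.
    for key, value in describe_gpu_environment().items():
        print("{}: {}".format(key, value))
```

Calling this at the top of `train(args)` in both the 2-image run and the full run would show whether the two jobs differ in `CUDA_VISIBLE_DEVICES` or in the GPUs the driver lists.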