Memory leak caused by improper use of Paddle?

回来的不完美k, posted 2020-07

I load Paddle's .so shared library in my own project.

When checking for memory leaks with gperftools, I get the following report:

Leak of 800 bytes in 10 objects allocated from:
@ 7f7ba4c311c8 unknown
@ 00007f7ba4c27699 paddle::memory::detail::MemoryBlock::split ??:0
@ 00007f7ba4c25fe1 paddle::memory::detail::BuddyAllocator::SplitToAlloc ??:0
@ 00007f7ba4c2651c paddle::memory::detail::BuddyAllocator::Alloc ??:0
@ 00007f7ba4c2225d paddle::memory::legacy::Alloc ??:0
@ 00007f7ba4c227f7 paddle::memory::allocation::NaiveBestFitAllocator::AllocateImpl ??:0
@ 00007f7ba4c3d195 paddle::memory::allocation::RetryAllocator::AllocateImpl ??:0
@ 00007f7ba4c1ac41 paddle::memory::allocation::AllocatorFacade::Alloc ??:0
@ 00007f7ba3e6e4ca paddle::memory::Alloc ??:0
@ 00007f7ba178ae18 paddle::memory::Alloc ??:0
@ 00007f7ba178af49 paddle::platform::CudnnWorkspaceHandle::ReallocWorkspace ??:0
@ 00007f7ba314f55e paddle::operators::CUDNNConvOpKernel::Compute ??:0
@ 00007f7ba31501fe std::_Function_handler::_M_invoke ??:0
@ 00007f7ba3dc8f26 paddle::framework::OperatorWithKernel::RunImpl ??:0
@ 00007f7ba3dc9df6 paddle::framework::OperatorWithKernel::RunImpl ??:0
@ 00007f7ba3dc1b1d paddle::framework::OperatorBase::Run ??:0
@ 00007f7ba18f2af9 paddle::framework::NaiveExecutor::Run ??:0
@ 00007f7ba173bc27 paddle::AnalysisPredictor::ZeroCopyRun ??:0
@ 00007f7bc9175937 baidu_retina::RetinaDlInferenceLayer::pdl_do_inference /data/inference_layer.cpp:235
@ 00007f7bc9175a01 baidu_retina::RetinaDlInferenceLayer::pdl_infer /data/inference_layer.cpp:245
@ 00007f7bc9196cea baidu_retina::DiscCupDetectionLayer::infer /data/detection_layer.cpp:78
@ 00007f7bc9113a6c baidu_retina::AbstractLayer::forward /data/layer.hpp:41
@ 00007f7bc9160949 baidu_retina::RetinaSolver::process_layer /data/solver.cpp:166
@ 00007f7bc9160cd6 baidu_retina::RetinaSolver::process /data/solver.cpp:189
@ 00007f7bc919e2bc baidu_retina::RetinaSolver::solve /data/solver.hpp:52
@ 00007f7bc919dc16 merge_for_report /data/report.cpp:54
@ 000055a1349ac379 main

The main code is as follows:

int RetinaDlInferenceLayer::pdl_do_inference() {
    _pdl_predictor->ZeroCopyRun();
    return 0;
}

int RetinaDlInferenceLayer::pdl_infer() {
    pdl_infer_init();
    set_cpu_io_buf();
    pdl_set_input();
    pdl_do_inference();
    pdl_get_output();

    return 0;
}

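Several modules in the code call this function.

Under stress testing, multiple tools, including valgrind and gperftools, confirm a memory leak at this point.

How should the memory that Paddle allocates here be released?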
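The trace above passes through paddle::memory::detail::BuddyAllocator. If that allocator keeps blocks in a pool for reuse, as buddy allocators typically do, a heap checker can report the pooled blocks as unreleased even though they do not grow per call. One way to tell the two cases apart under a stress test is to sample live memory after each pdl_infer() call and check whether it keeps rising once the first few calls are done. The sketch below is a minimal, hypothetical helper for that comparison. The function name, the warm-up count, and the tolerance are assumptions for illustration, not part of Paddle or gperftools, and how the samples are collected is left open.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a Paddle or gperftools API).
// live_bytes[i] is the live memory measured after the i-th inference call.
// Returns true if memory at the end of the run exceeds the value measured
// right after the warm-up phase by more than tolerance_bytes.
bool keeps_growing_after_warmup(const std::vector<std::size_t>& live_bytes,
                                std::size_t warmup_calls,
                                std::size_t tolerance_bytes) {
    // There must be at least one sample after the warm-up phase.
    assert(warmup_calls < live_bytes.size());

    const std::size_t baseline = live_bytes[warmup_calls];
    const std::size_t last = live_bytes.back();

    // A pool that is filled once stays near the baseline;
    // a per-call leak ends well above it.
    return last > baseline + tolerance_bytes;
}
```

If the samples flatten out after warm-up, the report more likely reflects memory held for reuse than a per-call leak. If they keep climbing, the growth figures are worth attaching when raising the question with the Paddle developers.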

All comments (3)

回来的不完美k

#2, replied 2020-07

Additional information:

@ 7f7b8b93cdcd unknown
@ 00007f7b8b948295 cuVDPAUCtxCreate ??:0
@ 00007f7b8b949e86 cuVDPAUCtxCreate ??:0
@ 00007f7b8b94a3e7 cuVDPAUCtxCreate ??:0
@ 00007f7b8b742999 cuMemGetAttribute ??:0
@ 00007f7b8b743331 cuMemGetAttribute ??:0
@ 00007f7b8b8a2daa cuMemAlloc_v2 ??:0
@ 00007f7ba4d30712 cudart::driverHelper::mallocPtr :0
@ 00007f7ba4d0bcea cudart::cudaApiMalloc :0
@ 00007f7ba4d4128b cudaMalloc :0
@ 00007f7ba4c31769 paddle::memory::detail::GPUAllocator::Alloc ??:0
@ 00007f7ba4c25d4c paddle::memory::detail::BuddyAllocator::RefillPool ??:0
@ 00007f7ba4c2659a paddle::memory::detail::BuddyAllocator::Alloc ??:0
@ 00007f7ba4c2225d paddle::memory::legacy::Alloc ??:0
@ 00007f7ba4c227f7 paddle::memory::allocation::NaiveBestFitAllocator::AllocateImpl ??:0
@ 00007f7ba4c3d195 paddle::memory::allocation::RetryAllocator::AllocateImpl ??:0
@ 00007f7ba4c1ac41 paddle::memory::allocation::AllocatorFacade::Alloc ??:0
@ 00007f7ba4c1acf9 paddle::memory::allocation::AllocatorFacade::AllocShared ??:0
@ 00007f7ba3e6e46a paddle::memory::AllocShared ??:0
@ 00007f7ba18b17b5 paddle::framework::Tensor::mutable_data ??:0
@ 00007f7ba1b4c162 paddle::framework::Tensor::mutable_data ??:0
@ 00007f7ba2b69595 paddle::operators::BatchNormKernel::Compute ??:0
@ 00007f7ba2b6b45e std::_Function_handler::_M_invoke ??:0
@ 00007f7ba3dc8f26 paddle::framework::OperatorWithKernel::RunImpl ??:0
@ 00007f7ba3dc9df6 paddle::framework::OperatorWithKernel::RunImpl ??:0
@ 00007f7ba3dc1b1d paddle::framework::OperatorBase::Run ??:0
@ 00007f7ba18f2af9 paddle::framework::NaiveExecutor::Run ??:0
@ 00007f7ba173bc27 paddle::AnalysisPredictor::ZeroCopyRun ??:0

Additional information: Leak of 496 bytes in 1 objects allocated from:

@ 7f7b8b93cdcd unknown
@ 00007f7b8b948295 cuVDPAUCtxCreate ??:0
@ 00007f7b8b949e86 cuVDPAUCtxCreate ??:0
@ 00007f7b8b94a3e7 cuVDPAUCtxCreate ??:0
@ 00007f7b8b805544 cudbgApiDetach ??:0
@ 00007f7b8b8058c3 cudbgApiDetach ??:0
@ 00007f7b8b849571 cuEGLApiInit ??:0
@ 00007f7b8b752201 cuMemGetAttribute_v2 ??:0
@ 00007f7b8b7522bf cuMemGetAttribute_v2 ??:0
@ 00007f7b8b8abd69 cuStreamCreateWithPriority ??:0
@ 00007f7ae9ab7a31 cudnnDropoutForward ??:0
@ 00007f7ae9ab7bfb cudnnDropoutForward ??:0
@ 00007f7ae9ae59d1 cudnnDropoutForward ??:0
@ 00007f7ae8510aad cudnnCreate ??:0
@ 00007f7ba178b3ff paddle::platform::CUDADeviceContext::CUDADeviceContext ??:0
@ 00007f7ba1796658 std::_Function_handler::_M_invoke ??:0
@ 00007f7ba178e44d std::__future_base::_State_baseV2::_M_do_set :0
@ 00007f7bcf35c826 __pthread_once_slow /build/glibc-OTsEL5/glibc-2.27/nptl/pthread_once.c:116
@ 00007f7ba17902e1 std::__future_base::_State_baseV2::_M_set_result :0
@ 00007f7ba1790450 std::__future_base::_Deferred_state::_M_complete_async ??:0
@ 00007f7ba178aaeb paddle::platform::DeviceContextPool::Get ??:0
@ 00007f7ba174e7a1 paddle::ZeroCopyTensor::copy_from_cpu ??:0
@ 00007f7bc91756d4 baidu_retina::RetinaDlInferenceLayer::pdl_set_input /data/inference_layer.cpp:221
@ 00007f7bc91759ea baidu_retina::RetinaDlInferenceLayer::pdl_infer /data/inference_layer.cpp:244

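Additional information:

Directly allocated memory / unreleased memory % / cumulative unreleased memory % / indirectly allocated memory / indirectly unreleased memory % / function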

500.0 64.9% 64.9% 500.0 64.9% paddle::memory::detail::AlignedMalloc
219.6 28.5% 93.4% 264.5 34.3% cuEGLApiInit
36.6 4.8% 98.1% 37.4 4.9% cuVDPAUCtxCreate
7.9 1.0% 99.2% 19.0 2.5% cuMemGetAttribute
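In the rows above, the second column is each function's share of the total and the third is the running sum of those shares. The following sketch reproduces that arithmetic from the first column as a sanity check. The total is not given in the post; the value used in the usage note is inferred from the first row (500.0 at 64.9%) and is an assumption, as are the struct and function names. The units are not stated in the post either.

```cpp
#include <cassert>
#include <vector>

// One output row: the raw value, its share of the total,
// and the running share up to and including this row.
struct Row {
    double flat;
    double flat_pct;
    double sum_pct;
};

// Hypothetical helper: computes per-row and cumulative percentages.
std::vector<Row> with_percentages(const std::vector<double>& flat, double total) {
    assert(total > 0.0);
    std::vector<Row> rows;
    double running = 0.0;
    for (double f : flat) {
        running += f;
        rows.push_back({f, 100.0 * f / total, 100.0 * running / total});
    }
    return rows;
}
```

Calling with_percentages({500.0, 219.6, 36.6, 7.9}, 500.0 / 0.649) gives cumulative shares within 0.1 of the reported 93.4% and 99.2% for the second and fourth rows. The first row, paddle::memory::detail::AlignedMalloc, accounts for roughly two thirds of the total on its own.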

HolliZhao

#3, replied 2020-07

You will need to ask the PaddlePaddle RD engineers about this. You can raise the question here: https://github.com/PaddlePaddle/Paddle/issues

小猪PPA

#4, replied 2021-04

Hello, did you manage to solve this?