Paddle-TensorRT
Paddle Framework · Q&A · Deep Learning · 1248 views · 8 replies

TensorRT
When I run a Detection model with Paddle-TensorRT:
The batch size is already 1; the input shape is [1, 3, 640, 640].
Yet it inexplicably reports that I set a batch size of 1000, even though 1000 appears nowhere in my project.

I1223 18:11:28.561244 27333 analysis_predictor.cc:470] ======= optimize end =======
I1223 18:11:28.561568 27333 naive_executor.cc:105] --- skip [feed], feed -> im_shape
I1223 18:11:28.561578 27333 naive_executor.cc:105] --- skip [feed], feed -> im_info
I1223 18:11:28.561583 27333 naive_executor.cc:105] --- skip [feed], feed -> image
I1223 18:11:28.562793 27333 naive_executor.cc:105] --- skip [save_infer_model/scale_0], fetch -> fetch
W1223 18:11:28.570389 27333 device_context.cc:236] Please NOTE: device: 0, CUDA Capability: 61, Driver API Version: 10.1, Runtime API Version: 9.0
W1223 18:11:28.570479 27333 device_context.cc:244] device: 0, cuDNN Version: 7.6.
terminate called after throwing an instance of 'paddle::platform::EnforceNotMet'
what():


C++ Call Stacks (More useful to developers):

0 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > paddle::platform::GetTraceBackString<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&, char const*, int)
1 paddle::platform::EnforceNotMet::EnforceNotMet(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, char const*, int)
2 paddle::operators::TensorRTEngineOp::RunTrt(paddle::framework::Scope const&, paddle::platform::Place const&, paddle::inference::tensorrt::TensorRTEngine*) const
3 paddle::operators::TensorRTEngineOp::RunImpl(paddle::framework::Scope const&, paddle::platform::Place const&) const
4 paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, paddle::platform::Place const&)
5 paddle::framework::NaiveExecutor::Run()
6 paddle::AnalysisPredictor::ZeroCopyRun()


Error Message Summary:

Error: Paddle internal Check failed. (Please help us create a new issue, here we need to find the developer to add a user friendly error message)
[Hint: Expected runtime_batch <= max_batch_size_, but received runtime_batch:1000 > max_batch_size_:1.] at (/home/a/lztnew/train_net_work/baidu/Paddle/paddle/fluid/operators/tensorrt/tensorrt_engine_op.h:272)

All comments (8)
Sorted by time
AIStudio783230
#2 · replied 2019-12

Here is the code:

auto predictor = CreatePaddlePredictor(config);
const int channels = 3;
const int height = 640;
const int width = 640;
// Dummy all-zero image data (std::vector instead of a variable-length stack array).
std::vector<float> input(1 * channels * height * width, 0.f);
// im_info: {height, width, scale}.
std::vector<float> image_infos = {640.f, 640.f, 1.f};
auto input_names = predictor->GetInputNames();
auto input_t = predictor->GetInputTensor(input_names[0]);
auto im_info_tensor = predictor->GetInputTensor(input_names[1]);
auto input_im_shape = predictor->GetInputTensor(input_names[2]);
input_t->Reshape({1, channels, height, width});
input_t->copy_from_cpu(input.data());
im_info_tensor->Reshape({1, 3});
im_info_tensor->copy_from_cpu(image_infos.data());
// The original snippet fetches input_im_shape but never reshapes or feeds it;
// feeding it the same {h, w, scale} values here is an assumption, not from the thread.
input_im_shape->Reshape({1, 3});
input_im_shape->copy_from_cpu(image_infos.data());
// run
predictor->ZeroCopyRun();

AIStudio783230
#3 · replied 2019-12

config->SetModel(FLAGS_dirname + "/model",
                 FLAGS_dirname + "/params");
config->EnableUseGpu(100, 0);
// We use ZeroCopyTensor here, so we set config->SwitchUseFeedFetchOps(false)
config->SwitchUseFeedFetchOps(false);
config->EnableTensorRtEngine(1 << 20, 1, 30, AnalysisConfig::Precision::kFloat32, false, false);
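For context on the arguments: to my reading of the 1.x-era AnalysisConfig API, the parameters of EnableTensorRtEngine are, in order, workspace_size, max_batch_size, min_subgraph_size, precision, use_static, and use_calib_mode. Annotated under that assumption, the call above reads:

    config->EnableTensorRtEngine(
        1 << 20,                              // workspace_size: TensorRT workspace, in bytes
        1,                                    // max_batch_size: the max_batch_size_:1 in the error above
        30,                                   // min_subgraph_size: minimum ops for a subgraph to go to TRT
        AnalysisConfig::Precision::kFloat32,  // precision
        false,                                // use_static: cache the serialized engine
        false);                               // use_calib_mode: INT8 calibration switch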

AIStudio783230
#4 · replied 2019-12

Could you tell me where the problem is?

AIStudio783230
#5 · replied 2019-12

config->EnableTensorRtEngine(1 << 20, 1, 30, AnalysisConfig::Precision::kFloat32, false, false);
I have already enabled it.

AIStudio783231
#6 · replied 2019-12

Change the 30 on that line to 40 and give it a try.

AIStudio783230
#7 · replied 2019-12

Hello, could you explain why? Or is it just black magic?

AIStudio783231
#8 · replied 2019-12

This parameter controls the minimum number of ops a subgraph must contain before it is run on TensorRT.

Yours is a two-stage model: in the RCNN part, the first dimension of the conv inputs is not the batch size (it is the proposal count, which is where the runtime_batch:1000 in the error comes from). Increasing min_subgraph_size keeps the RCNN part from running on TensorRT.
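Concretely, the fix suggested in #6 is a one-argument change to the call from #3; everything else stays the same:

    // Raise min_subgraph_size from 30 to 40: the RCNN subgraph then falls
    // below the threshold and runs on the native executor instead of TRT,
    // so its proposal dimension is no longer mistaken for the batch size.
    config->EnableTensorRtEngine(1 << 20, 1, 40,
                                 AnalysisConfig::Precision::kFloat32,
                                 false, false);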

AIStudio783230
#9 · replied 2019-12

Thanks for the guidance, I learned something; that is a really nice design. So does TRT then accelerate the backbone, the most time-consuming part? And can a two-stage model be accelerated with INT8?
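The INT8 question goes unanswered in the thread. For what it's worth, the same call accepts AnalysisConfig::Precision::kInt8, usually paired with use_calib_mode set to true so the engine is calibrated during warm-up runs; whether the two-stage model benefits depends on which subgraphs actually end up on TensorRT. A sketch under those assumptions:

    // Hypothetical INT8 variant (an assumption, not confirmed in this thread):
    config->EnableTensorRtEngine(1 << 20, 1, 40,
                                 AnalysisConfig::Precision::kInt8,
                                 false,  // use_static
                                 true);  // use_calib_mode: calibrate INT8 ranges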
