[Installation error] Building Paddle from source with -DWITH_XBYAK=OFF causes an error on import after installation

Problem description

Docker build, image: hub.baidubce.com/paddlepaddle/paddle:latest-gpu-cuda9.0-cudnn7-dev

  • Using the latest code on the develop branch, build from source with -DWITH_XBYAK=OFF, for example:
cmake .. -DPY_VERSION=2.7 -DWITH_GPU=ON -DWITH_TESTING=ON -DCMAKE_BUILD_TYPE=Release -DWITH_XBYAK=OFF
  • After the build finishes and the package is installed, import fails as follows:
λ paddle-gpu /workspace/Paddle/build {develop} python
Python 2.7.12 (default, Nov 12 2018, 14:36:49)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import paddle.fluid as fluid
grep: warning: GREP_OPTIONS is deprecated; please use an alias or script
F0217 05:33:03.666159 22983 init.cc:195] This version is compiled on higher instruction(AVX) system, you may encounter illegal instruction error running on your local CPU machine. Please reinstall the NonAVX version or compile from source code.
*** Check failure stack trace: ***
    @     0x7fe444bf525d  google::LogMessage::Fail()
    @     0x7fe444bf75a8  google::LogMessage::SendToLog()
    @     0x7fe444bf4d6b  google::LogMessage::Flush()
    @     0x7fe444bf847e  google::LogMessageFatal::~LogMessageFatal()
    @     0x7fe444d17a6d  paddle::framework::InitDevices()
    @     0x7fe444d17b7c  paddle::framework::InitDevices()
    @     0x7fe4448a69f2  _ZZN8pybind1112cpp_function10initializeIZN6paddle6pybindL22pybind11_init_core_avxERNS_6moduleEEUlbE107_vJbEJNS_4nameENS_5scopeENS_7siblingEEEEvOT_PFT0_DpT1_EDpRKT2_ENUlRNS_6detail13function_callEE1_4_FUNESN_
    @     0x7fe4449322f1  pybind11::cpp_function::dispatcher()
    @           0x4bc4aa  PyEval_EvalFrameEx
    @           0x4b9b66  PyEval_EvalCodeEx
    @           0x4c1f56  PyEval_EvalFrameEx
    @           0x4b9b66  PyEval_EvalCodeEx
    @           0x4b9856  PyEval_EvalCode
    @           0x4b978f  PyImport_ExecCodeModuleEx
    @           0x4b2bc6  (unknown)
    @           0x4b40ec  (unknown)
    @           0x4a4be1  (unknown)
    @           0x4a4667  PyImport_ImportModuleLevel
    @           0x4a5ae4  (unknown)
    @           0x4a587e  PyObject_Call
    @           0x4c5ef0  PyEval_CallObjectWithKeywords
    @           0x4bec4b  PyEval_EvalFrameEx
    @           0x4b9b66  PyEval_EvalCodeEx
    @           0x4eb69f  (unknown)
    @           0x44a7c2  PyRun_InteractiveOneFlags
    @           0x44a58d  PyRun_InteractiveLoopFlags
    @           0x430b76  (unknown)
    @           0x4938ce  Py_Main
    @     0x7fe499ef9830  __libc_start_main
    @           0x493299  _start
    @              (nil)  (unknown)
Aborted (core dumped)
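  • Other notes:
    • Tested on both a physical GPU machine and a physical CPU machine; the error above occurs whenever the build sets -DWITH_XBYAK=OFF
    • Setting both -DWITH_XBYAK=OFF and -DWITH_AVX=OFF, however, works normally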
AIStudio785465
#2 Replied 2020-02
@bingyanghuang

Could you help take a look? XBYAK is a JIT assembler.

AIStudio791338
#3 Replied 2020-02

This error happens because init.cc (lines 193-197) checks at startup whether AVX is supported:

#ifdef __AVX__
  if (!platform::MayIUse(platform::avx)) {
    AVX_GUIDE(AVX, NonAVX);
  }
#endif
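
To connect this check to the log line in the report, here is a minimal sketch of how a stringizing macro could produce that message. The macro body is an assumption reconstructed from the log text above, not a copy of Paddle's definition, and AVX_GUIDE_MESSAGE and AvxGuideMessage are hypothetical names that build a string rather than logging fatally.

```cpp
#include <string>

// Hypothetical stand-in for AVX_GUIDE: returns the message as a string
// instead of emitting a fatal log. The wording is taken from the log output
// in the report; the macro body itself is an assumption.
#define AVX_GUIDE_MESSAGE(compiletime, runtime)                               \
  std::string("This version is compiled on higher instruction(" #compiletime \
              ") system, you may encounter illegal instruction error "       \
              "running on your local CPU machine. Please reinstall the "     \
              #runtime " version or compile from source code.")

// The '#' operator turns the macro arguments AVX and NonAVX into the
// string literals "AVX" and "NonAVX", which are then concatenated.
inline std::string AvxGuideMessage() {
  return AVX_GUIDE_MESSAGE(AVX, NonAVX);
}
```
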

The platform::MayIUse() function has two implementations, selected by the PADDLE_WITH_XBYAK flag (cpu_info.cc, lines 106-145):

#ifdef PADDLE_WITH_XBYAK
static Xbyak::util::Cpu cpu;
bool MayIUse(const cpu_isa_t cpu_isa) {
  ...
}
#else
bool MayIUse(const cpu_isa_t cpu_isa) {
  ...
}
#endif

Xbyak is used here to determine whether the CPU supports a specific instruction set (AVX, AVX2, AVX512F, ...). The implementation without XBYAK is as follows:

bool MayIUse(const cpu_isa_t cpu_isa) {
  if (cpu_isa == isa_any) {
    return true;
  } else {
    return false;
  }
}

So it returns false for every argument except MayIUse(platform::isa_any).

This is why building Paddle without XBYAK makes init.cc abort on the !platform::MayIUse(platform::avx) condition, and why building without AVX "fixes" the problem: with __AVX__ undefined, the #ifdef __AVX__ block is compiled out and the check never runs.
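
The interaction of the two flags can be summarized with a small model. This is a sketch of the reasoning above, not Paddle code: BuildConfig, ModelMayIUseAvx and ModelImportAborts are hypothetical names, and the model assumes that WITH_AVX=ON corresponds to __AVX__ being defined and WITH_XBYAK=ON to PADDLE_WITH_XBYAK being defined.

```cpp
// Hypothetical model of the behaviour described above (not Paddle code).
struct BuildConfig {
  bool with_avx;    // assumed to model -DWITH_AVX=ON, i.e. __AVX__ defined
  bool with_xbyak;  // assumed to model -DWITH_XBYAK=ON, i.e. PADDLE_WITH_XBYAK
};

// Models MayIUse(platform::avx): with Xbyak the answer reflects the CPU,
// without Xbyak the fallback returns false for anything but isa_any.
inline bool ModelMayIUseAvx(const BuildConfig& cfg, bool cpu_has_avx) {
  return cfg.with_xbyak ? cpu_has_avx : false;
}

// Models the startup check in init.cc: it is only compiled in when
// __AVX__ is defined, and it aborts when MayIUse(avx) reports false.
inline bool ModelImportAborts(const BuildConfig& cfg, bool cpu_has_avx) {
  return cfg.with_avx && !ModelMayIUseAvx(cfg, cpu_has_avx);
}
```

On an AVX-capable CPU, the model aborts only for the AVX=ON, XBYAK=OFF combination, which matches the reported behaviour.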

I'm not sure what the expected behavior is in such a scenario, so please comment on that.

AIStudio785465
#4 Replied 2020-02
@grygielski

I'm not sure what the expected behavior is in such a scenario, so please comment on that.

We think all four of the following scenarios should work.

  • WITH_XBYAK=ON, WITH_AVX=ON
  • WITH_XBYAK=OFF, WITH_AVX=ON
  • WITH_XBYAK=ON, WITH_AVX=OFF: Does this combination exist? I.e., must WITH_XBYAK=ON depend on WITH_AVX=ON?
  • WITH_XBYAK=OFF, WITH_AVX=OFF
AIStudio791338
#5 Replied 2020-02
@luotao1

The main problem is that without XBYAK we can't check at runtime whether AVX instructions are available. Many algorithm implementations rely on the MayIUse() function, which always returns false if the program has been built without XBYAK, so the AVX versions of many algorithms will not be executed. I still don't know what the expected outcome of this configuration (WITH_XBYAK=OFF, WITH_AVX=ON) is. Should it simply not crash and pretend everything is OK (as I mentioned, all implementations that check for AVX flags will be disabled due to the lack of XBYAK)? In my opinion these 2 flags should be correlated (either ON-ON or OFF-OFF), because I can't see the point of compiling with AVX but without XBYAK, or the other way around.
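
For reference, runtime detection does not strictly require Xbyak. The following is a minimal sketch of one alternative, assuming a GCC- or Clang-compatible compiler on x86, where the __builtin_cpu_supports builtin is available. CpuIsa and MayIUseSketch are hypothetical names for illustration; this is not Paddle's implementation and not necessarily the approach any fix takes.

```cpp
// Hypothetical ISA tags for this sketch; Paddle's real enum is cpu_isa_t.
enum class CpuIsa { kAny, kSse42, kAvx, kAvx2, kAvx512f };

// Sketch of a MayIUse-style query that does not depend on Xbyak.
inline bool MayIUseSketch(CpuIsa isa) {
  if (isa == CpuIsa::kAny) return true;
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
  // Compiler builtins that query the CPU's feature bits at runtime.
  __builtin_cpu_init();
  switch (isa) {
    case CpuIsa::kSse42:
      return __builtin_cpu_supports("sse4.2");
    case CpuIsa::kAvx:
      return __builtin_cpu_supports("avx");
    case CpuIsa::kAvx2:
      return __builtin_cpu_supports("avx2");
    case CpuIsa::kAvx512f:
      return __builtin_cpu_supports("avx512f");
    default:
      return false;
  }
#else
  // Other compilers or architectures: keep the conservative fallback.
  return false;
#endif
}
```

With a query like this, the (WITH_XBYAK=OFF, WITH_AVX=ON) build could still choose AVX code paths on capable CPUs instead of disabling them unconditionally.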

AIStudio785465
#6 Replied 2020-02

In my opinion, these 2 flags should be correlated (either ON-ON or OFF-OFF)

XBYAK is a JIT assembler for x86(IA-32)/x64(AMD64, x86-64) MMX/SSE/SSE2/SSE3/SSSE3/SSE4/FPU/AVX/AVX2/AVX-512, provided as a C++ header; thus it supports the NO_AVX case as well.

But tying the flags together (either ON-ON or OFF-OFF) may be a temporary solution.

AIStudio791338
#7 Replied 2020-02

#22716 should fix this issue for every case that was mentioned. If you can, please confirm that on your machines.

AIStudio785465
#8 Replied 2020-02
@zhangting2020

will help confirm it.

AIStudio791360
#9 Replied 2020-02

I tested the four cases above, and this issue has been fixed. However, I ran into another issue, #22757; please help take a look.
