Compiling Paddle on Ubuntu 22.04 (CUDA 11.6 + cuDNN 8.4)

 

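Why build it yourself

  • The official packages do not support Ubuntu 22.04 (the system gcc and glibc versions are too new). Importing the official package fails with: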
ImportError: /home/ubuntu/anaconda3/envs/paddle/lib/python3.9/site-packages/paddle/fluid/core_avx.so: undefined symbol: _dl_sym, version GLIBC_PRIVATE

Environment

  • System Python (it does not really matter here; shown only for reference)
$ /usr/bin/python3 --version
Python 3.10.4
  • CMake (install a reasonably recent version; 3.19 or later appears to be required)
$ cmake --version
cmake version 3.22.1

CMake suite maintained and supported by Kitware (kitware.com/cmake).
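If you want to check the CMake version from a script before starting a long build, here is a minimal Python sketch. The 3.19 minimum is only the guess stated above, not a confirmed Paddle requirement, and `parse_cmake_version` and `meets_minimum` are hypothetical helper names.

```python
import re
import shutil
import subprocess


def parse_cmake_version(output):
    """Extract (major, minor, patch) from the text printed by `cmake --version`."""
    match = re.search(r"cmake version (\d+)\.(\d+)(?:\.(\d+))?", output)
    if match is None:
        return None
    # A missing patch component is treated as 0.
    return tuple(int(group) if group else 0 for group in match.groups())


def meets_minimum(version, minimum=(3, 19, 0)):
    """Return True if the parsed version is at least `minimum` (assumed threshold)."""
    return version is not None and version >= minimum


if __name__ == "__main__":
    exe = shutil.which("cmake")
    if exe is None:
        print("cmake not found on PATH")
    else:
        out = subprocess.run([exe, "--version"], capture_output=True, text=True).stdout
        version = parse_cmake_version(out)
        print(version, "OK" if meets_minimum(version) else "too old or unrecognized")
```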
  • System gcc (it does not really matter here; shown only for reference)
$ gcc --version
gcc (Ubuntu 11.2.0-19ubuntu1) 11.2.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
  • System description
$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=22.04
DISTRIB_CODENAME=jammy
DISTRIB_DESCRIPTION="Ubuntu 22.04 LTS"
  • CUDA
$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_Mar__8_18:18:20_PST_2022
Cuda compilation tools, release 11.6, V11.6.124
Build cuda_11.6.r11.6/compiler.31057947_0
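  • cuDNN
  • The gcc installed by conda does not search the system C/C++ include directories, so cuDNN has to be installed from the tar archive.
  • The tar.xz archive used here: cudnn-linux-x86_64-8.4.1.50_cuda11.6-archive.tar.xz
  • Quick installation steps: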
tar -xvf cudnn-linux-x86_64-8.4.1.50_cuda11.6-archive.tar.xz
cd cudnn-linux-x86_64-8.4.1.50_cuda11.6-archive
sudo cp -r include/* /usr/local/cuda/include/
sudo cp -r lib/* /usr/local/cuda/lib64
# Refresh the library cache and check the result
sudo ldconfig -v | grep libcudnn

# Output: cuDNN 8.4.1 is now detected
/sbin/ldconfig.real: Path `/usr/lib' given more than once
(from :0 and :0)
	libcudnn_ops_train.so.8 -> libcudnn_ops_train.so.8.4.1
	libcudnn_cnn_train.so.8 -> libcudnn_cnn_train.so.8.4.1
	libcudnn_ops_infer.so.8 -> libcudnn_ops_infer.so.8.4.1
	libcudnn_adv_infer.so.8 -> libcudnn_adv_infer.so.8.4.1
	libcudnn.so.8 -> libcudnn.so.8.4.1
	libcudnn_adv_train.so.8 -> libcudnn_adv_train.so.8.4.1
	libcudnn_cnn_infer.so.8 -> libcudnn_cnn_infer.so.8.4.1
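ldconfig only confirms the shared libraries; the build also needs the headers that were copied. A minimal Python sketch that reads the version macros from the header, assuming it landed in /usr/local/cuda/include as in the cp command above (`parse_cudnn_version` is a hypothetical helper name):

```python
import re
from pathlib import Path


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) from the text of cudnn_version.h, or None."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(rf"^\s*#\s*define\s+{name}\s+(\d+)", header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


if __name__ == "__main__":
    # Assumed location, matching the copy destination used in this post.
    header = Path("/usr/local/cuda/include/cudnn_version.h")
    if header.is_file():
        print("cuDNN version:", parse_cudnn_version(header.read_text()))
    else:
        print("header not found:", header)
```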
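  • Install NCCL (it is only needed for multi-GPU use, but the build still seems to ask for it even with the multi-GPU option turned off, so install it anyway; it has to come from the archive or from source)
  • Official method (archive install)
  • Download page: https://developer.nvidia.com/nccl/nccl-download
  • Choose "O/S agnostic local installer"
  • Then extract it and copy the files into the CUDA directory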
tar -xvf nccl_2.13.4-1+cuda11.7_x86_64.txz
cd nccl_2.13.4-1+cuda11.7_x86_64
sudo cp -r include/* /usr/local/cuda/include/
sudo cp -r lib/* /usr/local/cuda/lib64
  • Build from source (recommended, since a build against your own CUDA should be more compatible)
git clone https://github.com/NVIDIA/nccl.git
cd nccl
git checkout v2.13.4-1
make pkg.txz.build -j12
# If you get a flood of sm_35 deprecation warnings, you can delete -gencode=arch=compute_35,code=sm_35 from makefiles/common.mk; leaving it in is also fine.
# Before
CUDA8_GENCODE = -gencode=arch=compute_35,code=sm_35 \
							-gencode=arch=compute_50,code=sm_50 \
							-gencode=arch=compute_60,code=sm_60 \
							-gencode=arch=compute_61,code=sm_61
# After
CUDA8_GENCODE = -gencode=arch=compute_50,code=sm_50 \
							-gencode=arch=compute_60,code=sm_60 \
							-gencode=arch=compute_61,code=sm_61
# The build takes about 20 minutes.
cd build/pkg/txz
tar -xvf nccl_2.13.4-1+cuda11.6_x86_64.txz
cd nccl_2.13.4-1+cuda11.6_x86_64
sudo cp -r include/* /usr/local/cuda/include/
sudo cp -r lib/* /usr/local/cuda/lib64

Preparation

  • Create and activate a virtual environment
conda create -n paddle python==3.9.12
conda activate paddle
  • Collect the Python paths
find `dirname $(dirname $(which python3))` -name "libpython3.so" > /tmp/temp1 && export PYTHON_LIBRARY=$(cat /tmp/temp1 | xargs -L 1)

export PATH=${PYTHON_LIBRARY}:$PATH

find `dirname $(dirname $(which python3))`/include -name "python3.9" > /tmp/temp2 && export PYTHON_INCLUDE_DIRS=$(cat /tmp/temp2 | xargs -L 1)

export PYTHON3_EXECUTABLE=$(which python3)
  • Print the Python variables
echo PYTHON_LIBRARY=${PYTHON_LIBRARY}
echo PYTHON_INCLUDE_DIRS=${PYTHON_INCLUDE_DIRS}
echo PYTHON3_EXECUTABLE=${PYTHON3_EXECUTABLE}

# Output
PYTHON_LIBRARY=/home/tlntin/anaconda3/envs/paddle/lib/libpython3.so
PYTHON_INCLUDE_DIRS=/home/tlntin/anaconda3/envs/paddle/include/python3.9
PYTHON3_EXECUTABLE=/home/tlntin/anaconda3/envs/paddle/bin/python3
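The same three values can also be asked from the interpreter itself instead of searching the filesystem. A minimal sketch using the standard sysconfig module; run it with the Python of the activated conda environment. `python_build_paths` is a hypothetical helper, and the library it reports (for example libpython3.9.so or libpython3.9.a, depending on how Python was built) may differ from the libpython3.so found by the find command above.

```python
import sys
import sysconfig


def python_build_paths():
    """Collect the interpreter, header directory and libpython path of the running Python."""
    libdir = sysconfig.get_config_var("LIBDIR")
    ldlibrary = sysconfig.get_config_var("LDLIBRARY")
    return {
        "PYTHON3_EXECUTABLE": sys.executable,
        "PYTHON_INCLUDE_DIRS": sysconfig.get_paths()["include"],
        # Empty when the build configuration does not report a library.
        "PYTHON_LIBRARY": f"{libdir}/{ldlibrary}" if libdir and ldlibrary else "",
    }


if __name__ == "__main__":
    # Print lines that can be pasted into the shell.
    for name, value in python_build_paths().items():
        print(f"export {name}={value}")
```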
  • Install numpy
pip install numpy

export PYTHON3_NUMPY_INCLUDE_DIRS=$(python -c "import numpy as np; print(np.get_include())")
echo PYTHON3_NUMPY_INCLUDE_DIRS=$PYTHON3_NUMPY_INCLUDE_DIRS
  • Install protobuf
pip install protobuf==3.20.0
  • Install patchelf
pip install patchelf
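  • Install gcc 8, g++ 8 and the glibc 2.17 sysroot (the protobuf that Paddle uses supports gcc 8 at most)
# A proxy is recommended, otherwise the download is slow
# Set the proxy
conda config --set proxy_servers.http http://xxxx
# Install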
conda install -c conda-forge gcc=8 gxx=8 sysroot_linux-64=2.17
  • Check the gcc/g++ versions again (this only affects the virtual environment, not the system)
$ gcc --version
gcc (conda-forge gcc 8.5.0-16) 8.5.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ g++ --version
g++ (conda-forge gcc 8.5.0-16) 8.5.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
  • Install PyYAML (the build complained that the yaml module was missing)
pip install pyyaml

Build

  1. Fetch the source and switch to the latest release branch
git clone https://github.com/PaddlePaddle/Paddle.git
cd Paddle
git checkout release/2.3
  2. Create and enter the build directory
mkdir build && cd build
  3. Set the target Paddle version
export PADDLE_VERSION="2.3.1"
  4. Configure the build (TensorRT disabled)
cmake  .. \
-DWITH_CONTRIB=OFF \
-DWITH_MKL=ON \
-DWITH_MKLDNN=ON  \
-DWITH_TESTING=OFF \
-DCMAKE_BUILD_TYPE=Release \
-DWITH_INFERENCE_API_TEST=OFF \
-DWITH_GPU=ON \
-DCUDNN_ROOT=/usr/local/cuda \
-DON_INFER=ON \
-DWITH_PYTHON=ON \
-D PYTHON3_EXECUTABLE=${PYTHON3_EXECUTABLE} \
-D PYTHON3_INCLUDE_DIR=${PYTHON_INCLUDE_DIRS} \
-D PYTHON3_LIBRARY=${PYTHON_LIBRARY} \
-D PYTHON3_NUMPY_INCLUDE_DIRS=${PYTHON3_NUMPY_INCLUDE_DIRS}  \
-D WITH_TENSORRT=OFF
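With this many flags it is easy to list one twice or mistype a value. A minimal sketch, assuming you would rather drive cmake from a Python script: keeping the options in a dict means each option can appear only once. `cmake_command` is a hypothetical helper, not part of Paddle or CMake, and the options shown are just a few of those used in this post.

```python
def cmake_command(options, source_dir=".."):
    """Build a cmake argument list from an {option: value} mapping."""
    args = ["cmake", source_dir]
    for name, value in options.items():
        # Booleans are written the way CMake switches are usually spelled.
        if isinstance(value, bool):
            value = "ON" if value else "OFF"
        args.append(f"-D{name}={value}")
    return args


if __name__ == "__main__":
    options = {
        "CMAKE_BUILD_TYPE": "Release",
        "WITH_GPU": True,
        "WITH_TENSORRT": False,
        "CUDNN_ROOT": "/usr/local/cuda",
    }
    # Only prints the command; pass the list to subprocess.run to execute it.
    print(" ".join(cmake_command(options)))
```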
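  5. Run the actual build. make pulls third-party sources from GitHub, so this step needs a connection that can reach GitHub reliably (use a proxy if necessary). Expect it to take roughly 1-2 hours.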
make -j10
  • If the build stops partway with a "too many open files" error, raise the open-file limit, which defaults to 1024
# Before the change: 1024
$ ulimit -Sn
1024
# Raise it to 9192
ulimit -n 9192
# After the change
$ ulimit -Sn
9192
  • Then resume the build; the progress made so far is kept
make -j10
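ulimit only changes the current shell. If the build is launched from a Python script instead, the limit can be raised with the standard resource module before make is started as a child process. A minimal sketch; `raise_nofile_limit` is a hypothetical helper, and 9192 is simply the value used above.

```python
import resource


def raise_nofile_limit(target=9192):
    """Raise the soft open-file limit towards `target`, capped by the hard limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft  # already high enough, leave it alone
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    print("soft open-file limit:", raise_nofile_limit())
```

The new limit applies to this Python process and to anything it starts afterwards, so a call such as subprocess.run(["make", "-j10"]) placed after raise_nofile_limit() would inherit it.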
  6. Collect the wheel. It is in python/dist under the build directory; the file looks like this:
cd python/dist
ls -lh
.rw-r--r-- ubuntu ubuntu 167 MB Wed Jul 27 17:33:44 2022  paddlepaddle_gpu-0.0.0-cp39-cp39-linux_x86_64.whl
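  7. Install the wheel, using the filename that ls actually printed (the listing above shows version 0.0.0, so adjust the version in the command below if yours differs). In theory the wheel works on any Ubuntu 22.04/20.04 machine with a 30-series GPU and the same CUDA/cuDNN/NCCL versions as mine, provided cuDNN and NCCL were both installed from archives.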
pip install paddlepaddle_gpu-2.3.1-cp39-cp39-linux_x86_64.whl
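Because the version in the wheel filename depends on how the build was configured, it can help to read it from the file that is actually there rather than typing it by hand. A minimal sketch that splits a wheel filename according to the standard wheel naming scheme; `parse_wheel_name` is a hypothetical helper, and python/dist is the directory from step 6, relative to the build directory.

```python
from pathlib import Path


def parse_wheel_name(filename):
    """Split a wheel filename into (distribution, version); None if it is not a wheel name."""
    if not filename.endswith(".whl"):
        return None
    # Wheel names are {distribution}-{version}[-{build}]-{python}-{abi}-{platform}.whl
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        return None
    return parts[0], parts[1]


if __name__ == "__main__":
    dist = Path("python/dist")
    if dist.is_dir():
        for wheel in sorted(dist.glob("*.whl")):
            print(wheel.name, parse_wheel_name(wheel.name))
    else:
        print("directory not found:", dist)
```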
  8. Verify the installation
$ python3
Python 3.9.12 (main, Jun  1 2022, 11:38:51)
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import paddle
>>> paddle.utils.run_check()
Running verify PaddlePaddle program ...
W0727 17:46:03.775210 12918 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 11.7, Runtime API Version: 11.6
W0727 17:46:03.796252 12918 gpu_resources.cc:91] device: 0, cuDNN Version: 8.4.
PaddlePaddle works well on 1 GPU.
I0727 17:46:06.006351 12918 parallel_executor.cc:486] Cross op memory reuse strategy is enabled, when build_strategy.memory_optimize = True or garbage collection strategy is disabled, which is not recommended
PaddlePaddle works well on 1 GPUs.
PaddlePaddle is installed successfully! Let's start deep learning with PaddlePaddle now.
  9. Run the official test code. It also appears to work, and training on the GPU runs normally.
$ python3 test_paddle.py
数据集标签共有10种, 分别为:[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
W0727 17:54:46.313586 13232 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 11.7, Runtime API Version: 11.6
W0727 17:54:46.325191 13232 gpu_resources.cc:91] device: 0, cuDNN Version: 8.4.
The loss value printed in the log is the current step, and the metric is the average value of previous steps.
Epoch 1/5
step 938/938 [==============================] - loss: 0.1149 - acc: 0.9398 - 14ms/step
Epoch 2/5
step 938/938 [==============================] - loss: 0.0688 - acc: 0.9760 - 13ms/step
Epoch 3/5
step 938/938 [==============================] - loss: 0.0354 - acc: 0.9809 - 11ms/step
Epoch 4/5
step 938/938 [==============================] - loss: 0.0052 - acc: 0.9833 - 13ms/step
Epoch 5/5
step 938/938 [==============================] - loss: 0.0110 - acc: 0.9855 - 12ms/step
  • The script:
import paddle
from paddle.vision.datasets import MNIST
from paddle.vision.models import LeNet
from paddle.vision.transforms import Normalize

# Use the first GPU
paddle.device.set_device("gpu:0")


# ### Load the dataset

transform = Normalize(mean=[127.5], std=[127.5], data_format="CHW")
train_dataset = MNIST(mode="train", transform=transform)
valid_dataset = MNIST(mode="test", transform=transform)


# ### Collect the dataset classes

y_list = [da[1][0] for da in train_dataset]
num_list = list(set(y_list))
num_classes = len(num_list)
# The message is kept in Chinese so that it matches the logged output;
# it reads "The dataset has N label classes: [...]"
print(f"数据集标签共有{num_classes}种, 分别为:{num_list}")

# Build the model
pre_model = LeNet(num_classes=num_classes)
model = paddle.Model(pre_model)
adam = paddle.optimizer.Adam(learning_rate=1e-3, parameters=model.parameters())
model.prepare(adam, loss=paddle.nn.CrossEntropyLoss(), metrics=paddle.metric.Accuracy())

# Train the model
model.fit(train_data=train_dataset, batch_size=64, verbose=1, epochs=5)
  • GPU usage looks normal
$ nvidia-smi
Wed Jul 27 17:55:58 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.57       Driver Version: 516.59       CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0 Off |                  N/A |
| 36%   34C    P2   120W / 370W |   3228MiB / 24576MiB |      4%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     13353      C   /python3.9                      N/A      |
+-----------------------------------------------------------------------------+
All comments (4)
李长安
#2 Replied 2022-07

Great work!

Tlntin
#3 Replied 2022-07

The forum does not support Markdown formatting, so I converted the post to an image and attached the related files.

zhujiehaode
#4 Replied 2022-07

Nice.

李长安
#5 Replied 2022-08

The next release should support it.
