PaddleClas is PaddlePaddle's image classification toolkit, built for both industry and academia to help users train better vision models and bring them into production.
GitHub - PaddlePaddle/PaddleClas: A treasure chest for visual recognition powered by PaddlePaddle
As of 2020-12-10, the latest PaddleClas requires PaddlePaddle 2.0rc or later; some of the newer methods are not available in the stable 1.8.5 release.
Installation (from the official PaddlePaddle site):
python -m pip install paddlepaddle-gpu==2.0.0rc0 -f https://paddlepaddle.org.cn/whl/stable.html
My environment:
Python 3.7
CUDA 10.2
cuDNN >= 8.0.3
Clone the PaddleClas repository:
cd path_to_clone_PaddleClas
git clone https://github.com/PaddlePaddle/PaddleClas.git
Install the Python dependencies:
The dependencies are listed in requirements.txt and can be installed with:
pip install --upgrade -r requirements.txt
Data preparation:
Copy the data into the project's dataset directory, with the structure shown in the figure above:
battery is the dataset directory;
binary_0 and binary_1 are the two class directories, each holding the images of its class;
test.txt, train.txt and val.txt list the image paths and class indices, and are generated by the split script below.
Dataset split script:
import os
import numpy as np

img_dir = r'battery'
classes_dict = {}
split_sets = {'train': [], 'val': [], 'test': []}

# Walk the dataset directory and collect "relative/path label" entries per class
for path, dirs, file_list in os.walk(img_dir):
    for c in dirs:
        classes_dict[c] = []
    for filename in file_list:
        if filename.endswith(('jpg', 'png', 'bmp')):
            class_name = os.path.basename(path)
            label = list(classes_dict.keys()).index(class_name)
            classes_dict[class_name].append(
                os.path.join(class_name, filename) + ' ' + str(label))

# Shuffle each class and split it by ratio
train_ratio = 0.8
val_ratio = 0.1
np.random.seed(100)
for k, v in classes_dict.items():
    np.random.shuffle(v)
    train_num = int(len(v) * train_ratio)
    val_num = int(len(v) * val_ratio)
    split_sets['train'].extend(v[:train_num])
    split_sets['val'].extend(v[train_num:train_num + val_num])
    split_sets['test'].extend(v[train_num + val_num:])

# Write the txt list files, one "path label" entry per line
for name, entries in split_sets.items():
    with open(os.path.join(img_dir, '%s.txt' % name), 'w') as list_file:
        for entry in entries:
            list_file.write(entry + '\n')

print('count:', len(split_sets['train'] + split_sets['val'] + split_sets['test']))
After splitting, the generated txt files look like this:
the text before the space is the image path, and the number after it is the class index.
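Each line in the list files therefore has the form `relative/path.jpg<space>label`. A minimal sketch of how such a line can be parsed back (the file name below is just the example from later in this article):

```python
def parse_list_line(line):
    """Split one PaddleClas list-file line into (image path, class index)."""
    path, label = line.strip().rsplit(' ', 1)
    return path, int(label)

# Example entry in the format produced by the split script above
path, label = parse_list_line('binary_1/123(7)19_1.jpg 1\n')
```

Using `rsplit` with a single split keeps any spaces that might appear inside the file name itself intact.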
Training:
We use the ResNet50_vd model.
Download the pretrained model:
Download the required pretrained weights with tools/download.py:
python tools/download.py -a ResNet50_vd_ssld -p ./pretrained -d True
Configure the yaml file:
The file is copied from configs/quick_start/ResNet50_vd_ssld_finetune.yaml:
mode: 'train'
ARCHITECTURE:
    name: 'ResNet50_vd'
    params:
        lr_mult_list: [0.1, 0.1, 0.2, 0.2, 0.3]
pretrained_model: "./pretrained/ResNet50_vd_ssld_pretrained"
load_static_weights: True
model_save_dir: "./output/"
classes_num: 2
total_images: 5674
save_interval: 1
validate: True
valid_interval: 1
epochs: 20
topk: 1
image_shape: [3, 224, 224]

LEARNING_RATE:
    function: 'Cosine'
    params:
        lr: 0.00375

OPTIMIZER:
    function: 'Momentum'
    params:
        momentum: 0.9
    regularizer:
        function: 'L2'
        factor: 0.000001

TRAIN:
    batch_size: 32
    num_workers: 4
    file_list: "./dataset/battery/train.txt"
    data_dir: "./dataset/battery/"
    shuffle_seed: 0
    transforms:
        - DecodeImage:
            to_rgb: True
            to_np: False
            channel_first: False
        - ResizeImage:
            size: 224
        - RandFlipImage:
            flip_code: 1
        - NormalizeImage:
            scale: 1./255.
            mean: [0.485, 0.456, 0.406]
            std: [0.229, 0.224, 0.225]
            order: ''
        - ToCHWImage:

VALID:
    batch_size: 20
    num_workers: 4
    file_list: "./dataset/battery/val.txt"
    data_dir: "./dataset/battery/"
    shuffle_seed: 0
    transforms:
        - DecodeImage:
            to_rgb: True
            to_np: False
            channel_first: False
        - ResizeImage:
            size: 224
        - NormalizeImage:
            scale: 1.0/255.0
            mean: [0.485, 0.456, 0.406]
            std: [0.229, 0.224, 0.225]
            order: ''
        - ToCHWImage:
A few parameters need to be changed:
pretrained_model: "./pretrained/ResNet50_vd_ssld_pretrained"  (path to the pretrained weights)
classes_num: 2  (number of classes)
total_images: 5674  (number of images; the split script printed this count)
topk: 1  (adjust according to the number of classes)
image_shape: [3, 224, 224]  (adjust to your actual image size)
TRAIN:
file_list: "./dataset/battery/train.txt"  (path to train.txt)
data_dir: "./dataset/battery/"
VALID: same as above, but pointing file_list at val.txt
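total_images must match the actual dataset size. Instead of copying the number from the split script's console output, it can be counted directly from the generated list files; a small sketch, assuming the battery layout used in this article:

```python
import os

def count_list_images(data_dir, list_names=('train.txt', 'val.txt', 'test.txt')):
    """Count non-empty lines across the dataset list files."""
    total = 0
    for name in list_names:
        with open(os.path.join(data_dir, name)) as f:
            total += sum(1 for line in f if line.strip())
    return total

# print(count_list_images('./dataset/battery'))  # use this value for total_images
```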
Start training:
python -m paddle.distributed.launch --selected_gpus="0" tools/train.py -c ./configs/quick_start/ResNet50_vd_ssld_finetune_battery.yaml
The training results are saved under the output directory.
We train for 20 epochs in total; with save_interval: 1 a checkpoint is saved after every epoch, and we use the weights ppcls.pdopt and ppcls.pdparams from the best_model directory.
Model evaluation:
Edit the configs/eval.yaml file:
mode: 'valid'
ARCHITECTURE:
    name: "ResNet50_vd"
pretrained_model: "./output/ResNet50_vd/best_model/ppcls"
classes_num: 2
total_images: 5674
topk: 1
image_shape: [3, 224, 224]

VALID:
    batch_size: 16
    num_workers: 4
    file_list: "./dataset/battery/val.txt"
    data_dir: "./dataset/battery/"
    shuffle_seed: 0
    transforms:
        - DecodeImage:
            to_rgb: True
            to_np: False
            channel_first: False
        - ResizeImage:
            size: 224
        - NormalizeImage:
            scale: 1.0/255.0
            mean: [0.485, 0.456, 0.406]
            std: [0.229, 0.224, 0.225]
            order: ''
        - ToCHWImage:
The main parameters to change:
pretrained_model: "./output/ResNet50_vd/best_model/ppcls"  (path to the trained weights)
classes_num: 2  (number of classes)
total_images: 5674  (number of images)
topk: 1  (adjust according to the number of classes)
image_shape: [3, 224, 224]  (adjust to your actual image size)
Run the following command to evaluate the model:
python -m paddle.distributed.launch --selected_gpus="0" tools/eval.py -c ./configs/eval.yaml
Model inference:
PaddlePaddle offers three ways to run inference; here is how to use the prediction engine. First, export the trained model:
python tools/export_model.py --model=ResNet50_vd --pretrained_model=output/ResNet50_vd/best_model/ppcls --output_path=inference/ResNet50_vd --class_dim=2
Note that the ppcls prefix must be appended to the pretrained_model path here.
This generates the model files (ResNet50_vd.pdmodel and ResNet50_vd.pdiparams) in the inference folder.
Then run inference with the prediction engine:
python ./tools/infer/predict.py -i=./dataset/battery/binary_1/123(7)19_1.jpg --model_file=./inference/ResNet50_vd.pdmodel --params_file=./inference/ResNet50_vd.pdiparams --use_gpu=1
Output:
Current image file: ./dataset/battery/binary_1/123(7)19_1.jpg
top-1 class: 1
top-1 score: 0.99908447265625
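The predictor only prints a class index. Mapping it back to a readable label has to follow the same order the split script assigned indices in, i.e. the insertion order of classes_dict (for the battery dataset in this article that is binary_0 -> 0, binary_1 -> 1; verify the order for your own data):

```python
# Class names in the index order the split script produced them
# (assumed order for the battery dataset; adjust for your own classes)
CLASS_NAMES = ['binary_0', 'binary_1']

def index_to_label(class_index, class_names=CLASS_NAMES):
    """Translate a predicted class index back to its directory name."""
    return class_names[class_index]
```

With the assumed ordering, the top-1 class 1 printed above corresponds to the binary_1 directory.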
Batch testing:
A batch-test script I wrote myself:
from paddle.inference import Config
from paddle.inference import create_predictor
import tools.infer.utils as utils
import os, cv2
import numpy as np

args = utils.parse_args()
args.model_file = r'../inference_cap/ResNet50_vd.pdmodel'
args.params_file = r'../inference_cap/ResNet50_vd.pdiparams'
args.use_gpu = True
img_dir = r'../dataset/cap/crooked'

config = Config(args.model_file, args.params_file)
if args.use_gpu:
    config.enable_use_gpu(args.gpu_mem, 0)
else:
    config.disable_gpu()
    if args.enable_mkldnn:
        # cache 10 different shapes for mkldnn to avoid memory leak
        config.set_mkldnn_cache_capacity(10)
        config.enable_mkldnn()

config.disable_glog_info()
config.switch_ir_optim(args.ir_optim)  # default true
if args.use_tensorrt:
    config.enable_tensorrt_engine(
        precision_mode=Config.Precision.Half
        if args.use_fp16 else Config.Precision.Float32,
        max_batch_size=args.batch_size)

config.enable_memory_optim()
# use zero copy
config.switch_use_feed_fetch_ops(False)
predictor = create_predictor(config)

img_list = os.listdir(img_dir)
input_names = predictor.get_input_names()
input_tensor = predictor.get_input_handle(input_names[0])
output_names = predictor.get_output_names()
output_tensor = predictor.get_output_handle(output_names[0])

for img_name in img_list:
    img = cv2.imread(os.path.join(img_dir, img_name))
    assert img is not None, "Error in loading image: {}".format(img_name)
    img = img[:, :, ::-1]  # BGR -> RGB
    inputs = utils.preprocess(img, args)
    inputs = np.expand_dims(inputs, axis=0).repeat(args.batch_size, axis=0).copy()
    input_tensor.copy_from_cpu(inputs)
    predictor.run()
    output = output_tensor.copy_to_cpu()
    classes, scores = utils.postprocess(output, args)
    print("Current image file: {}".format(img_name))
    print("\ttop-1 class: {0}".format(classes[0]))
    print("\ttop-1 score: {0}".format(scores[0]))