Introduction to the COCO Dataset
MS COCO stands for Microsoft Common Objects in Context; it originates from the Microsoft COCO dataset, which Microsoft funded and annotated starting in 2014. COCO is a large, rich dataset for object detection, segmentation, and captioning. It is aimed at scene understanding: images are drawn mainly from complex everyday scenes, and objects are localized with precise segmentation masks. The dataset contains 91 object categories, 328,000 images, and 2,500,000 labeled instances. It focuses on three problems: object detection, contextual relationships between objects, and precise 2D localization of objects.
Official website: http://cocodataset.org
COCO Dataset Format
COCO_2017/
├── val2017                              # validation images
├── train2017                            # training images
├── annotations                          # COCO annotations
│   ├── instances_train2017.json         # object instances: training-set annotations for detection/instance segmentation
│   ├── instances_val2017.json           # object instances: validation-set annotations for detection/instance segmentation
│   ├── person_keypoints_train2017.json  # object keypoints: training-set annotations for keypoint detection
│   ├── person_keypoints_val2017.json    # object keypoints: validation-set annotations for keypoint detection
│   ├── captions_train2017.json          # image captions: training-set annotations for captioning
│   ├── captions_val2017.json            # image captions: validation-set annotations for captioning
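To check that an annotation file in this layout is readable, it can be loaded with the pycocotools package. The snippet below is a minimal sketch, assuming pycocotools is installed (pip install pycocotools) and the file sits at the path shown in the tree above.

from pycocotools.coco import COCO

# Load the instance annotations for the validation split (path assumed from the tree above).
coco = COCO('COCO_2017/annotations/instances_val2017.json')

# Categories, images, and annotations are all indexed by id.
cat_ids = coco.getCatIds()
img_ids = coco.getImgIds()
print(len(cat_ids), 'categories,', len(img_ids), 'images')

# Fetch all annotations belonging to the first image.
ann_ids = coco.getAnnIds(imgIds=img_ids[0])
anns = coco.loadAnns(ann_ids)
print(len(anns), 'annotations for image', img_ids[0])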
Building a COCO Dataset
COCO covers five different tasks: object detection, keypoint detection, semantic segmentation, scene (stuff) segmentation, and image captioning. The annotations are stored as JSON files, and there are three kinds of official annotation files: captions_type.json, instances_type.json, and person_keypoints_type.json, where type is train, val, or test plus the year (e.g. train2017).
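For orientation, an instances_*.json file is a single JSON object whose main top-level keys are images, categories, and annotations (the official files also carry info and licenses). The sketch below is a simplified illustration with made-up values, listing only the fields that the conversion script later in this post fills in.

# Simplified view of an instances_*.json file (illustrative, not the full specification).
coco_instances = {
    "images": [
        {"id": 1, "file_name": "000000000001.jpg", "height": 480, "width": 640},
    ],
    "categories": [
        {"id": 1, "name": "person", "supercategory": "person"},
    ],
    "annotations": [
        {
            "id": 1,                    # annotation id, unique across the file
            "image_id": 1,              # which entry in "images" this annotation belongs to
            "category_id": 1,           # which entry in "categories" it refers to
            "bbox": [10.0, 20.0, 100.0, 200.0],   # [x, y, width, height]
            "area": 20000.0,
            "segmentation": [[10.0, 20.0, 110.0, 20.0, 110.0, 220.0, 10.0, 220.0]],
            "iscrowd": 0,
        },
    ],
}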
- Prepare the directory layout
  - Create a folder named COCO
  - Inside COCO, create images/ and annotations/
- Annotate the dataset with labelme
  - Install labelme in your anaconda environment with pip install labelme.
  - After installation, run labelme to launch the tool.
  - Click Open Dir and select the folder of images to annotate.
  - Click Create Polygons and start annotating.
  - Save the generated json files to a dedicated folder.
- Convert the labelme json files to COCO format with the script below.
# -*- coding:utf-8 -*-
# !/usr/bin/env python
import argparse
import json
import matplotlib.pyplot as plt
import skimage.io as io
import cv2
from labelme import utils
import numpy as np
import glob
import PIL.Image
import PIL.ImageDraw  # needed by polygons_to_mask below


class MyEncoder(json.JSONEncoder):
    """JSON encoder that converts numpy types into plain Python types."""

    def default(self, obj):
        if isinstance(obj, np.integer):
            return int(obj)
        elif isinstance(obj, np.floating):
            return float(obj)
        elif isinstance(obj, np.ndarray):
            return obj.tolist()
        else:
            return super(MyEncoder, self).default(obj)


class labelme2coco(object):
    def __init__(self, labelme_json=[], save_json_path='./tran.json'):
        '''
        :param labelme_json: list of paths to all the labelme json files
        :param save_json_path: where to save the COCO-style json
        '''
        self.labelme_json = labelme_json
        self.save_json_path = save_json_path
        self.images = []
        self.categories = []
        self.annotations = []
        # self.data_coco = {}
        self.label = []
        self.annID = 1
        self.height = 0
        self.width = 0
        self.save_json()

    def data_transfer(self):
        for num, json_file in enumerate(self.labelme_json):
            with open(json_file, 'r') as fp:
                data = json.load(fp)  # load one labelme json file
                self.images.append(self.image(data, num))
                for shapes in data['shapes']:
                    label = shapes['label']
                    if label not in self.label:
                        self.categories.append(self.categorie(label))
                        self.label.append(label)
                    points = shapes['points']  # a rectangle annotation has only two points; expand to four if needed
                    # points.append([points[0][0], points[1][1]])
                    # points.append([points[1][0], points[0][1]])
                    self.annotations.append(self.annotation(points, label, num))
                    self.annID += 1

    def image(self, data, num):
        image = {}
        img = utils.img_b64_to_arr(data['imageData'])  # decode the image embedded in the labelme json
        # img = io.imread(data['imagePath'])  # or open the image by its path
        # img = cv2.imread(data['imagePath'], 0)
        height, width = img.shape[:2]
        img = None
        image['height'] = height
        image['width'] = width
        image['id'] = num + 1
        # image['file_name'] = data['imagePath'].split('/')[-1]
        image['file_name'] = data['imagePath'][3:14]  # slice specific to the author's file names; the line above is the general form
        self.height = height
        self.width = width
        return image

    def categorie(self, label):
        categorie = {}
        categorie['supercategory'] = 'Cancer'  # supercategory from the author's project; change as needed
        categorie['id'] = len(self.label) + 1  # 0 is reserved for the background
        categorie['name'] = label
        return categorie

    def annotation(self, points, label, num):
        annotation = {}
        annotation['segmentation'] = [list(np.asarray(points).flatten())]
        annotation['iscrowd'] = 0
        annotation['image_id'] = num + 1
        # annotation['bbox'] = str(self.getbbox(points))  # saving as a list raised an error when dumping json
        # convert back with: list(map(int, a[1:-1].split(','))) where a = annotation['bbox']
        annotation['bbox'] = list(map(float, self.getbbox(points)))
        annotation['area'] = annotation['bbox'][2] * annotation['bbox'][3]
        # annotation['category_id'] = self.getcatid(label)
        annotation['category_id'] = self.getcatid(label)  # note: the original version hard-coded this to 1
        annotation['id'] = self.annID
        return annotation

    def getcatid(self, label):
        for categorie in self.categories:
            if label == categorie['name']:
                return categorie['id']
        return 1

    def getbbox(self, points):
        # img = np.zeros([self.height, self.width], np.uint8)
        # cv2.polylines(img, [np.asarray(points)], True, 1, lineType=cv2.LINE_AA)  # draw the polygon outline
        # cv2.fillPoly(img, [np.asarray(points)], 1)  # fill the polygon; interior pixels become 1
        polygons = points
        mask = self.polygons_to_mask([self.height, self.width], polygons)
        return self.mask2box(mask)

    def mask2box(self, mask):
        '''Recover the bounding box from a mask.
        mask: [h, w] array of 0/1 where 1 marks the object; the box is obtained
        from the min/max row and column indices of the 1 pixels.
        '''
        # np.where(mask == 1)
        index = np.argwhere(mask == 1)
        rows = index[:, 0]
        clos = index[:, 1]
        # top-left corner
        left_top_r = np.min(rows)  # y
        left_top_c = np.min(clos)  # x
        # bottom-right corner
        right_bottom_r = np.max(rows)
        right_bottom_c = np.max(clos)
        # return [(left_top_r, left_top_c), (right_bottom_r, right_bottom_c)]
        # return [(left_top_c, left_top_r), (right_bottom_c, right_bottom_r)]
        # return [left_top_c, left_top_r, right_bottom_c, right_bottom_r]  # [x1, y1, x2, y2]
        return [left_top_c, left_top_r, right_bottom_c - left_top_c,
                right_bottom_r - left_top_r]  # [x, y, w, h], the COCO bbox format

    def polygons_to_mask(self, img_shape, polygons):
        mask = np.zeros(img_shape, dtype=np.uint8)
        mask = PIL.Image.fromarray(mask)
        xy = list(map(tuple, polygons))
        PIL.ImageDraw.Draw(mask).polygon(xy=xy, outline=1, fill=1)
        mask = np.array(mask, dtype=bool)
        return mask

    def data2coco(self):
        data_coco = {}
        data_coco['images'] = self.images
        data_coco['categories'] = self.categories
        data_coco['annotations'] = self.annotations
        return data_coco

    def save_json(self):
        self.data_transfer()
        self.data_coco = self.data2coco()
        # write the COCO-style json file
        json.dump(self.data_coco, open(self.save_json_path, 'w'), indent=4, cls=MyEncoder)  # indent=4 makes it easier to read


labelme_json = glob.glob('./Annotations/*.json')
# labelme_json = ['./Annotations/*.json']
labelme2coco(labelme_json, './json/test.json')
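After running the script, a quick sanity check of the generated ./json/test.json (the output path used above) might look like this sketch:

import json

# Load the converted annotation file and print basic counts.
with open('./json/test.json') as f:
    coco = json.load(f)

print(len(coco['images']), 'images')
print(len(coco['categories']), 'categories')
print(len(coco['annotations']), 'annotations')

# Every annotation should reference an existing image and category.
image_ids = {img['id'] for img in coco['images']}
category_ids = {cat['id'] for cat in coco['categories']}
assert all(ann['image_id'] in image_ids for ann in coco['annotations'])
assert all(ann['category_id'] in category_ids for ann in coco['annotations'])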
Introduction to the Pascal VOC Dataset
The PASCAL VOC challenge (PASCAL Visual Object Classes) is a world-class computer vision challenge. PASCAL stands for Pattern Analysis, Statistical Modelling and Computational Learning, an EU-funded network organization. Many models have been introduced and benchmarked on this dataset, such as YOLO and SSD in object detection.
VOC Dataset Format
├── Annotations
├── ImageSets
│ ├── Action
│ ├── Layout
│ ├── Main
│ └── Segmentation
├── JPEGImages
├── SegmentationClass
└── SegmentationObject
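Annotations holds one XML file per image, carrying the file name, the image size, and one object node per bounding box. The helper below is a minimal parsing sketch using only the standard library; the field names follow the usual VOC layout, and the example path at the bottom is hypothetical.

import xml.etree.ElementTree as ET

def parse_voc_xml(xml_path):
    """Read one VOC annotation file and return its boxes as dictionaries."""
    root = ET.parse(xml_path).getroot()
    size = root.find('size')
    info = {
        'filename': root.findtext('filename'),
        'width': int(size.findtext('width')),
        'height': int(size.findtext('height')),
        'objects': [],
    }
    for obj in root.iter('object'):
        bndbox = obj.find('bndbox')
        info['objects'].append({
            'name': obj.findtext('name'),
            'xmin': int(float(bndbox.findtext('xmin'))),
            'ymin': int(float(bndbox.findtext('ymin'))),
            'xmax': int(float(bndbox.findtext('xmax'))),
            'ymax': int(float(bndbox.findtext('ymax'))),
        })
    return info

# Example: parse one annotation file (path is hypothetical).
# print(parse_voc_xml('Annotations/000001.xml'))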
Building a VOC Dataset
- Create the directory structure shown above
- Install labelimg with the pip command
- Annotate the images as in the COCO section above, then put the finished annotation files into the Annotations folder
- Generate the four split txt files (trainval.txt, train.txt, val.txt, test.txt) with the script below
# -*- coding: utf-8 -*-
# @Author : matthew
# @File : make_train_val_test_set.py
# @Software: PyCharm
import os
import random


def _main():
    # trainval_percent of all annotations are sampled into trainval.txt; within that
    # pool, train_percent go to test.txt and the rest to val.txt; everything outside
    # the pool goes to train.txt.
    trainval_percent = 0.1
    train_percent = 0.9
    xmlfilepath = 'F:/jupyter/process/VOC2007/Annotation/'
    total_xml = os.listdir(xmlfilepath)

    num = len(total_xml)
    list = range(num)
    tv = int(num * trainval_percent)
    tr = int(tv * train_percent)
    trainval = random.sample(list, tv)
    train = random.sample(trainval, tr)

    ftrainval = open('F:/jupyter/process/VOC2007/ImageSets/Main/trainval.txt', 'w')
    ftest = open('F:/jupyter/process/VOC2007/ImageSets/Main/test.txt', 'w')
    ftrain = open('F:/jupyter/process/VOC2007/ImageSets/Main/train.txt', 'w')
    fval = open('F:/jupyter/process/VOC2007/ImageSets/Main/val.txt', 'w')

    # Write each annotation's base name (without the .xml extension) into the
    # split file it was sampled into.
    for i in list:
        name = total_xml[i][:-4] + '\n'
        if i in trainval:
            ftrainval.write(name)
            if i in train:
                ftest.write(name)
            else:
                fval.write(name)
        else:
            ftrain.write(name)

    ftrainval.close()
    ftrain.close()
    fval.close()
    ftest.close()


if __name__ == '__main__':
    _main()
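The four txt files only contain image ids (file names without the extension); downstream code usually maps them back to image and annotation paths. A small sketch, reusing the VOC2007 root path from the script above:

import os

# Read one split file and build the corresponding image/annotation paths.
voc_root = 'F:/jupyter/process/VOC2007'
with open(os.path.join(voc_root, 'ImageSets/Main/train.txt')) as f:
    ids = [line.strip() for line in f if line.strip()]

jpg_paths = [os.path.join(voc_root, 'JPEGImages', i + '.jpg') for i in ids]
xml_paths = [os.path.join(voc_root, 'Annotations', i + '.xml') for i in ids]
print(len(ids), 'training samples')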
Dataset Conversion
To convert between VOC and COCO, you can use the tools built into PaddleX and PaddleDetection, or you can of course write the conversion yourself, as sketched below.
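If you would rather not depend on PaddleX or PaddleDetection, a hand-rolled VOC-to-COCO conversion is not much code. The sketch below is only an illustration of the idea, not either library's tool; the paths in the example call are assumptions.

import glob
import json
import os
import xml.etree.ElementTree as ET

def voc_to_coco(xml_dir, save_path):
    """Convert a folder of VOC XML annotations into a single COCO-style json file."""
    images, annotations, categories = [], [], []
    name_to_id = {}
    ann_id = 1
    for img_id, xml_path in enumerate(sorted(glob.glob(os.path.join(xml_dir, '*.xml'))), start=1):
        root = ET.parse(xml_path).getroot()
        size = root.find('size')
        images.append({
            'id': img_id,
            'file_name': root.findtext('filename'),
            'width': int(size.findtext('width')),
            'height': int(size.findtext('height')),
        })
        for obj in root.iter('object'):
            name = obj.findtext('name')
            if name not in name_to_id:
                name_to_id[name] = len(name_to_id) + 1
                categories.append({'id': name_to_id[name], 'name': name, 'supercategory': 'none'})
            box = obj.find('bndbox')
            xmin, ymin = float(box.findtext('xmin')), float(box.findtext('ymin'))
            xmax, ymax = float(box.findtext('xmax')), float(box.findtext('ymax'))
            w, h = xmax - xmin, ymax - ymin
            annotations.append({
                'id': ann_id,
                'image_id': img_id,
                'category_id': name_to_id[name],
                'bbox': [xmin, ymin, w, h],   # COCO uses [x, y, width, height]
                'area': w * h,
                'iscrowd': 0,
                'segmentation': [],
            })
            ann_id += 1
    with open(save_path, 'w') as f:
        json.dump({'images': images, 'annotations': annotations, 'categories': categories}, f, indent=4)

# Example call (paths are assumptions):
# voc_to_coco('VOC2007/Annotations', 'voc_as_coco.json')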
There are many ways to build VOC and COCO datasets; labelme and labelimg, used in this post, are only two of the available tools.
I am still new to this, so corrections are welcome.