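Feature overview
Performs structured recognition of business licenses in various layouts and returns key fields such as the certificate number, social credit code, company name, address, legal representative, type, date of establishment, validity period, and business scope.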
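Use cases
- Merchant qualification review: recognizes and enters company information in structured form for merchant onboarding review in e-commerce, retail, O2O, and similar industries, automating the review and entry of merchant information and substantially improving service standards and operational efficiency
- Corporate financial services: automatically recognizes and enters company information for corporate bank account opening, mortgage lending, and similar financial services, substantially improving data-entry efficiency and helping to control business risk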
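API description
Supports structured recognition of key fields on business licenses of different layouts: certificate number, social credit code, company name, address, legal representative, type, date of establishment, validity period, and business scope.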
Request
- HTTP method: POST
- Request URL: https://aip.baidubce.com/rest/2.0/ocr/v1/business_license
- URL parameter: access_token
- Header parameter: Content-Type = application/x-www-form-urlencoded
- Body parameters: see the table below
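The request described above can be sketched in C++ with the standard library alone. This is a minimal sketch, not the code from util.cpp: it assumes the base64-encoded image travels in a form field named `image` (the body-parameter table is not reproduced here, so that field name is an assumption), and that the value is percent-encoded because base64 output may contain `+`, `/`, and `=`. The helper names `urlEncode`, `buildUrl`, and `buildBody` are hypothetical.

```cpp
#include <cctype>
#include <string>

// Percent-encode every byte except the RFC 3986 unreserved characters.
std::string urlEncode(const std::string& in)
{
    static const char* hex = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : in)
    {
        if (std::isalnum(c) || c == '-' || c == '_' || c == '.' || c == '~')
        {
            out += static_cast<char>(c);
        }
        else
        {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}

// The access token is passed as a URL parameter.
std::string buildUrl(const std::string& accessToken)
{
    return "https://aip.baidubce.com/rest/2.0/ocr/v1/business_license?access_token=" + accessToken;
}

// Assumed body layout: a single form field "image" holding the encoded picture.
std::string buildBody(const std::string& imgBase64)
{
    return "image=" + urlEncode(imgBase64);
}
```

The two strings would then be handed to whatever HTTP client is in use, together with the `Content-Type: application/x-www-form-urlencoded` header.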
Response
The response parameters are listed in the table below:
A sample response:
{
"words_result": {
"社会信用代码": {
"words": "10440119MA06M8503",
"location": {
"top": 296,
"left": 237,
"width": 178,
"height": 18
}
},
"组成形式": {
"words": "无",
"location": {
"top": -1,
"left": -1,
"width": 0,
"height": 0
}
},
"经营范围": {
"words": "商务服务业",
"location": {
"top": 587,
"left": 378,
"width": 91,
"height": 18
}
},
"成立日期": {
"words": "2019年01月01日",
"location": {
"top": 482,
"left": 1045,
"width": 119,
"height": 19
}
},
"法人": {
"words": "方平",
"location": {
"top": 534,
"left": 377,
"width": 39,
"height": 19
}
},
"注册资本": {
"words": "200万元",
"location": {
"top": 429,
"left": 1043,
"width": 150,
"height": 19
}
},
"证件编号": {
"words": "921MA190538210301",
"location": {
"top": 216,
"left": 298,
"width": 146,
"height": 16
}
},
"地址": {
"words": "广州市",
"location": {
"top": 585,
"left": 1041,
"width": 55,
"height": 19
}
},
"单位名称": {
"words": "有限公司",
"location": {
"top": 429,
"left": 382,
"width": 72,
"height": 19
}
},
"有效期": {
"words": "长期",
"location": {
"top": 534,
"left": 1045,
"width": 0,
"height": 0
}
},
"类型": {
"words": "有限责任公司(自然人投资或控股)",
"location": {
"top": 482,
"left": 382,
"width": 260,
"height": 18
}
}
},
"log_id": 1310106134421438464,
"words_result_num": 11
}
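In the sample response, `组成形式` comes back with `top` and `left` set to -1 and a zero-sized box, and `有效期` also has zero width and height. A minimal sketch of a guard that could be applied before drawing a rectangle for a field, assuming a plain `Location` struct that mirrors the four JSON keys (both `Location` and `hasDrawableBox` are hypothetical names, not taken from util.h):

```cpp
// Mirrors the "location" object of one field in words_result.
struct Location
{
    int top;
    int left;
    int width;
    int height;
};

// A box is worth drawing only if its origin is non-negative and its area is non-zero.
bool hasDrawableBox(const Location& loc)
{
    return loc.top >= 0 && loc.left >= 0 && loc.width > 0 && loc.height > 0;
}
```

With the values from the sample, `社会信用代码` passes this check, while `组成形式` and `有效期` do not.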
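Calling the API from C++
This section assumes the prerequisites are already in place. If they are not, read the following articles first; otherwise skip ahead.
(Basics 01) Creating the application in the console https://yangshun.win/blogs/dea770b9/
(Basics 02) Setting up a Baidu AI image recognition C++ development environment on Windows with Vcpkg (VS2017) https://yangshun.win/blogs/3b103680/
(Basics 03) Getting an access token in C++ https://yangshun.win/blogs/49f400d2/
(Basics 04) How base64 encoding and decoding works, with a C++ implementation https://yangshun.win/blogs/3f2fcf2e/
Some of the functions used in the code below are defined in util.h and util.cpp: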
screenshot/functions/util.h https://github.com/busyboxs/screenshot/blob/master/functions/util.h
screenshot/functions/util.cpp https://github.com/busyboxs/screenshot/blob/master/functions/util.cpp
For convenience, a struct is first defined from the response parameters. It holds the fields of the response, as follows:
struct BusinessLicenseInfo
{
    uint64_t logID{};
    uint32_t resultNumber{};
    // field name (UTF-8) -> recognized text and location
    std::map<std::string, ResultPart> wordsResult{};

    // print every recognized field to the console
    void print()
    {
        std::cout << "log id: " << logID << '\n';
        std::cout << "words result number: " << resultNumber << '\n';
        for (auto& [name, res] : wordsResult)
        {
            std::cout << "\t" << UTF8ToGB(name.c_str()) << ": ";
            res.print();
        }
    }

    // draw every recognized field onto the image
    void draw(cv::Mat& img)
    {
        for (auto& [name, res] : wordsResult)
        {
            res.draw(img);
        }
    }
};
Next, a class is defined to call the API and fetch the result:
class BusinessLicense
{
public:
    BusinessLicense();
    ~BusinessLicense();
    Json::Value request(std::string imgBase64, std::map<std::string, std::string>& options);
    void getResult(BusinessLicenseInfo& result);

private:
    Json::Value m_obj;
    std::string m_url;
    // file to save token key
    std::string m_filename;
};
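The private member m_obj holds the JSON object of the response, m_url is the request URL, and m_filename is the name of the file in which the access token is stored.
The request function takes the base64 encoding of the image together with the request parameters and returns a JSON object containing the result of the request.
getResult parses the result of the request into the custom struct so that it can be used later for printing, drawing, and so on.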
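The control flow inside request (send the request and, if the service reports error code 110 or 111, fetch a fresh access token and try again) can be sketched independently of the HTTP and JSON libraries. This is a minimal sketch under stated assumptions: `Response`, `isTokenError`, `requestWithRefresh`, and the two callbacks are hypothetical stand-ins for the real HTTP call and token helper, and the number of refresh attempts is capped, which is a choice made for this sketch rather than something the original code does.

```cpp
#include <functional>
#include <string>

// Hypothetical stand-in for the parsed JSON response.
struct Response
{
    int errorCode;      // 0 means success
    std::string body;
};

// 110 and 111 are the codes the request function treats as
// "access token invalid or expired".
bool isTokenError(int errorCode)
{
    return errorCode == 110 || errorCode == 111;
}

// Sends the request with the current token; on a token error, refreshes the
// token and sends again, at most maxRefreshes times.
Response requestWithRefresh(std::string token,
                            const std::function<Response(const std::string&)>& send,
                            const std::function<std::string()>& refreshToken,
                            int maxRefreshes = 1)
{
    Response res = send(token);
    for (int i = 0; i < maxRefreshes && isTokenError(res.errorCode); ++i)
    {
        token = refreshToken();
        res = send(token);
    }
    return res;
}
```

Capping the retries means a token endpoint that keeps handing out rejected tokens ends with an error response instead of an endless loop.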
The complete code is as follows
BusinessLicense.h: screenshot/functions/BusinessLicense.h https://github.com/busyboxs/screenshot/blob/master/functions/BusinessLicense.h
#pragma once
#include <iostream>
#include <string>
#include <map>
#include <json/json.h>
#include <opencv2/opencv.hpp>
#include "util.h"
#include "customVariables.h"

struct BusinessLicenseInfo
{
    uint64_t logID{};
    uint32_t resultNumber{};
    // field name (UTF-8) -> recognized text and location
    std::map<std::string, ResultPart> wordsResult{};

    // print every recognized field to the console
    void print()
    {
        std::cout << "log id: " << logID << '\n';
        std::cout << "words result number: " << resultNumber << '\n';
        for (auto& [name, res] : wordsResult)
        {
            std::cout << "\t" << UTF8ToGB(name.c_str()) << ": ";
            res.print();
        }
    }

    // draw every recognized field onto the image
    void draw(cv::Mat& img)
    {
        for (auto& [name, res] : wordsResult)
        {
            res.draw(img);
        }
    }
};

class BusinessLicense
{
public:
    BusinessLicense();
    ~BusinessLicense();
    Json::Value request(std::string imgBase64, std::map<std::string, std::string>& options);
    void getResult(BusinessLicenseInfo& result);

private:
    Json::Value m_obj;
    std::string m_url;
    // file to save token key
    std::string m_filename;
};

void BusinessLicenseTest();
BusinessLicenseInfo BusinessLicenseDetect(std::string imgPath);
BusinessLicense.cpp: screenshot/functions/BusinessLicense.cpp https://github.com/busyboxs/screenshot/blob/master/functions/BusinessLicense.cpp
#include "BusinessLicense.h"

BusinessLicense::BusinessLicense()
{
    m_url = "https://aip.baidubce.com/rest/2.0/ocr/v1/business_license";
    m_filename = "tokenKey";
}

BusinessLicense::~BusinessLicense()
{
}
Json::Value BusinessLicense::request(std::string imgBase64, std::map<std::string, std::string>& options)
{
    std::string response;
    Json::Value obj;
    std::string token;

    // 1. build the HTTP POST body
    std::string body;
    mergeHttpPostBody(body, imgBase64, options);

    // 2. build the request URL with the access token
    std::string url = m_url;
    getHttpPostUrl(url, m_filename, token);

    // 3. send the POST request; response stores the raw result
    int status_code = httpPostRequest(url, body, response);
    if (status_code != CURLcode::CURLE_OK) {
        obj["curl_error_code"] = status_code;
        m_obj = obj;
        return obj; // TODO: maybe should exit
    }

    // 4. parse the response string into a JSON object
    generateJson(response, obj);

    // if the access token is invalid or expired (error 110 or 111),
    // fetch a new one, save it, and send the request again
    if (obj["error_code"].asInt() == 110 || obj["error_code"].asInt() == 111) {
        token = getTokenKey();
        writeFile(m_filename, token);
        return request(imgBase64, options);
    }

    m_obj = obj;
    //checkErrorWithExit(obj);
    return obj;
}
void BusinessLicense::getResult(BusinessLicenseInfo& result)
{
    // an error response carries error_code and error_msg instead of words_result;
    // isMember is used because get() with a string default never yields a null value
    if (m_obj.isMember("error_code"))
    {
        result.wordsResult["error_code"].words = m_obj.get("error_code", "null").asString();
        result.wordsResult["error_msg"].words = m_obj.get("error_msg", "null").asString();
        return;
    }

    result.logID = m_obj["log_id"].asUInt64();
    result.resultNumber = m_obj["words_result_num"].asUInt();

    // copy every field of words_result into the map, keyed by field name
    Json::Value::Members keys = m_obj["words_result"].getMemberNames();
    for (auto it = keys.begin(); it != keys.end(); ++it)
    {
        ResultPart resultPart;
        getResultPart(m_obj["words_result"][*it], resultPart);
        result.wordsResult[*it] = resultPart;
    }
}
void BusinessLicenseTest()
{
    // read the test image and base64-encode it
    std::string img_file = "./images/businesslicense_test.png";
    std::string out;
    readImageFile(img_file.c_str(), out);
    std::string img_base64 = base64_encode(out.c_str(), (int)out.size());

    // set options
    std::map<std::string, std::string> options;
    options["detect_direction"] = "false";
    //options["accuracy"] = "normal";

    // call the API and parse the response
    Json::Value obj;
    BusinessLicense businessLicenseObj;
    BusinessLicenseInfo result;
    obj = businessLicenseObj.request(img_base64, options);
    businessLicenseObj.getResult(result);
    result.print();

    // draw the recognized fields on the image, then show and save it
    cv::Mat img = cv::imread(img_file);
    result.draw(img);
    cv::namedWindow("businessLicense", cv::WINDOW_NORMAL);
    cv::imshow("businessLicense", img);
    cv::imwrite("./images/businesslicense_result.jpg", img);
    cv::waitKey();
}
BusinessLicenseInfo BusinessLicenseDetect(std::string imgPath)
{
    // read the image and base64-encode it
    std::string out;
    readImageFile(imgPath.c_str(), out);
    std::string img_base64 = base64_encode(out.c_str(), (int)out.size());

    // set options
    std::map<std::string, std::string> options;
    options["detect_direction"] = "false";

    // call the API and parse the response
    Json::Value obj;
    BusinessLicense businessLicenseObj;
    BusinessLicenseInfo result;
    obj = businessLicenseObj.request(img_base64, options);
    businessLicenseObj.getResult(result);
    return result;
}
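Once BusinessLicenseDetect returns, individual fields can be read from wordsResult by their key names as they appear in the sample response. A minimal sketch of a lookup with a fallback value, assuming only that the mapped type has a `words` string member (which is how ResultPart is used in getResult). `Part` and `wordsOr` are hypothetical names for illustration.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for ResultPart; only the member used here is modelled.
struct Part
{
    std::string words;
};

// Returns the recognized text stored under key, or fallback if the field is absent.
// find() is used instead of operator[] so a missing key does not insert an empty entry.
template <typename T>
std::string wordsOr(const std::map<std::string, T>& fields,
                    const std::string& key,
                    const std::string& fallback)
{
    auto it = fields.find(key);
    return it != fields.end() ? it->second.words : fallback;
}
```

For example, `wordsOr(result.wordsResult, "单位名称", "")` would give the company name if that field was recognized, and an empty string otherwise.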
Test image
Test result