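Feature overview
The service performs structured recognition of business licenses in any layout and returns the key fields: certificate number, unified social credit code, company name, address, legal representative, company type, date of establishment, validity period, and business scope.
Use cases
- Merchant qualification review: recognizes and records company information in structured form for merchant onboarding in e-commerce, retail, O2O, and similar industries, so that merchant information can be reviewed and entered automatically, which improves service quality and operating efficiency.
- Corporate financial services: recognizes and records company information automatically for scenarios such as opening corporate bank accounts and mortgage loans, which speeds up data entry and helps keep business risk under control.
API description
The API supports structured recognition of the certificate number, unified social credit code, company name, address, legal representative, company type, date of establishment, validity period, business scope, and other key fields on business licenses of different layouts.
Request
- HTTP method: POST
- Request URL: https://aip.baidubce.com/rest/2.0/ocr/v1/business_license
- URL parameter: access_token
- Header parameter: Content-Type = application/x-www-form-urlencoded
- Body parameters: see the table below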
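As a minimal sketch of how the request described above could be assembled: the token goes into the URL query string and the body is form-urlencoded. The helper names `urlEncode`, `buildRequestUrl`, and `buildFormBody` are hypothetical and are not the helpers from util.h, and the body field name `image` is an assumption, since the body parameter table is not reproduced here.

```cpp
#include <cctype>
#include <map>
#include <string>

// Percent-encode a string for an application/x-www-form-urlencoded body.
// RFC 3986 unreserved characters pass through; every other byte becomes %XX.
// This matters for base64 data, which can contain '+', '/' and '='.
std::string urlEncode(const std::string& in)
{
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    out.reserve(in.size() * 3);
    for (unsigned char c : in) {
        if (std::isalnum(c) || c == '-' || c == '_' || c == '.' || c == '~') {
            out.push_back(static_cast<char>(c));
        } else {
            out.push_back('%');
            out.push_back(hex[c >> 4]);
            out.push_back(hex[c & 0x0F]);
        }
    }
    return out;
}

// Append the access token as the URL query parameter.
std::string buildRequestUrl(const std::string& endpoint, const std::string& accessToken)
{
    return endpoint + "?access_token=" + urlEncode(accessToken);
}

// Build the POST body. "image" is an assumed field name (hypothetical here).
std::string buildFormBody(const std::string& imageBase64,
                          const std::map<std::string, std::string>& options)
{
    std::string body = "image=" + urlEncode(imageBase64);
    for (const auto& [key, value] : options) {
        body += '&' + urlEncode(key) + '=' + urlEncode(value);
    }
    return body;
}
```

The resulting URL and body would then be handed to whatever HTTP client is in use, together with the Content-Type header listed above.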
Response
The response parameters are listed in the table below:
A sample response:
{
  "words_result": {
    "社会信用代码": {
      "words": "10440119MA06M8503",
      "location": { "top": 296, "left": 237, "width": 178, "height": 18 }
    },
    "组成形式": {
      "words": "无",
      "location": { "top": -1, "left": -1, "width": 0, "height": 0 }
    },
    "经营范围": {
      "words": "商务服务业",
      "location": { "top": 587, "left": 378, "width": 91, "height": 18 }
    },
    "成立日期": {
      "words": "2019年01月01日",
      "location": { "top": 482, "left": 1045, "width": 119, "height": 19 }
    },
    "法人": {
      "words": "方平",
      "location": { "top": 534, "left": 377, "width": 39, "height": 19 }
    },
    "注册资本": {
      "words": "200万元",
      "location": { "top": 429, "left": 1043, "width": 150, "height": 19 }
    },
    "证件编号": {
      "words": "921MA190538210301",
      "location": { "top": 216, "left": 298, "width": 146, "height": 16 }
    },
    "地址": {
      "words": "广州市",
      "location": { "top": 585, "left": 1041, "width": 55, "height": 19 }
    },
    "单位名称": {
      "words": "有限公司",
      "location": { "top": 429, "left": 382, "width": 72, "height": 19 }
    },
    "有效期": {
      "words": "长期",
      "location": { "top": 534, "left": 1045, "width": 0, "height": 0 }
    },
    "类型": {
      "words": "有限责任公司(自然人投资或控股)",
      "location": { "top": 482, "left": 382, "width": 260, "height": 18 }
    }
  },
  "log_id": 1310106134421438464,
  "words_result_num": 11
}
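In the sample response, some fields come back without a usable box: "组成形式" has top and left of -1, and "有效期" has zero width and height. The sketch below shows one way to filter such fields out before drawing. The `Location` and `Field` types and both function names are hypothetical, and treating these values as "no box" is an assumption read off the sample, not documented behavior.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical mirror of one "location" object from the response.
struct Location {
    int top = -1;
    int left = -1;
    int width = 0;
    int height = 0;
};

// Hypothetical mirror of one entry in "words_result".
struct Field {
    std::string words;
    Location location;
};

// Assumption: a box is usable only if its origin is non-negative
// and it has a positive width and height.
bool hasBoundingBox(const Location& loc)
{
    return loc.top >= 0 && loc.left >= 0 && loc.width > 0 && loc.height > 0;
}

// Return the names of the fields that can be drawn onto the image.
std::vector<std::string> drawableFields(const std::map<std::string, Field>& fields)
{
    std::vector<std::string> names;
    for (const auto& [name, field] : fields) {
        if (hasBoundingBox(field.location)) {
            names.push_back(name);
        }
    }
    return names;
}
```

A drawing routine could call `hasBoundingBox` first and skip any field for which it returns false.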
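Calling the API from C++
This section assumes the prerequisites are already in place. If they are not, read the following articles first; otherwise skip ahead.
(Basics 01) Creating the application in the console https://yangshun.win/blogs/dea770b9/
(Basics 02) Setting up a Baidu AI image recognition C++ development environment on Windows with Vcpkg (VS2017) https://yangshun.win/blogs/3b103680/
(Basics 03) Getting an access token in C++ https://yangshun.win/blogs/49f400d2/
(Basics 04) Base64 encoding and decoding in C++: principle and implementation https://yangshun.win/blogs/3f2fcf2e/
Some of the functions used in the code below are defined in util.h and util.cpp:
screenshot/functions/util.h https://github.com/busyboxs/screenshot/blob/master/functions/util.h
screenshot/functions/util.cpp https://github.com/busyboxs/screenshot/blob/master/functions/util.cpp
For convenience, first define a struct that mirrors the response parameters: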
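struct BusinessLicenseInfo
{
    uint64_t logID{};
    uint32_t resultNumber{};
    // Field name (UTF-8, as returned by the API) -> recognized text and location.
    std::map<std::string, ResultPart> wordsResult{};

    // Print every recognized field to the console.
    void print()
    {
        std::cout << "log id: " << logID << '\n';
        std::cout << "words result number: " << resultNumber << '\n';

        for (auto& [name, res] : wordsResult) {
            // Convert the UTF-8 field name to GB encoding for the Windows console.
            std::cout << "\t" << UTF8ToGB(name.c_str()) << ": ";
            res.print();
        }
    }

    // Draw the bounding box of every recognized field onto img.
    void draw(cv::Mat& img)
    {
        for (auto& [name, res] : wordsResult) {
            res.draw(img);
        }
    }
};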
Next, define a class that calls the API and retrieves the result:
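class BusinessLicense
{
public:
    BusinessLicense();
    ~BusinessLicense();

    // Send the base64-encoded image and the request options; return the parsed JSON response.
    Json::Value request(std::string imgBase64, std::map<std::string, std::string>& options);

    // Convert the last response into a BusinessLicenseInfo.
    void getResult(BusinessLicenseInfo& result);

private:
    Json::Value m_obj;       // JSON object of the last response
    std::string m_url;       // request URL
    std::string m_filename;  // file that stores the access token
};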
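The private member m_obj holds the JSON object of the response, m_url holds the request URL, and m_filename is the name of the file that stores the access token.
The request function takes the base64 encoding of the image and the request parameters, and returns a JSON object containing the result of the request.
The getResult function parses that result into the custom struct defined above, so that it can be used later for printing and drawing.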
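One detail of the request flow is worth isolating: when the API reports that the token is invalid or expired (error codes 110 and 111 in the full listing), a new token is fetched and the request is sent again. The sketch below illustrates that idea with a bounded number of refreshes. `ApiReply`, `isTokenError`, `requestWithTokenRefresh`, and the two callbacks are hypothetical stand-ins for the real HTTP and token helpers, not part of the original code.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical, simplified view of a response: 0 means success in this sketch.
struct ApiReply {
    int errorCode = 0;
    std::string payload;
};

// Error codes that the code in this post treats as "token invalid or expired".
bool isTokenError(int errorCode)
{
    return errorCode == 110 || errorCode == 111;
}

// Send a request; on a token error, refresh the token and resend,
// at most maxRefreshes times, so a persistent failure cannot loop forever.
ApiReply requestWithTokenRefresh(
    const std::function<ApiReply(const std::string& token)>& send,
    const std::function<std::string()>& refreshToken,
    std::string token,
    int maxRefreshes = 1)
{
    assert(maxRefreshes >= 0);
    ApiReply reply = send(token);
    for (int attempt = 0; attempt < maxRefreshes && isTokenError(reply.errorCode); ++attempt) {
        token = refreshToken();
        reply = send(token);
    }
    return reply;
}
```

Capping the number of refreshes is a design choice of this sketch: if the freshly issued token is rejected as well, the caller gets the error reply back instead of retrying indefinitely.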
The complete code follows.
BusinessLicense.h: screenshot/functions/BusinessLicense.h https://github.com/busyboxs/screenshot/blob/master/functions/BusinessLicense.h
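#pragma once
#include <cstdint>
#include <iostream>
#include <map>
#include <string>
#include <json/json.h>
#include <opencv2/opencv.hpp>
#include "util.h"
#include "customVariables.h"

struct BusinessLicenseInfo
{
    uint64_t logID{};
    uint32_t resultNumber{};
    // Field name (UTF-8, as returned by the API) -> recognized text and location.
    std::map<std::string, ResultPart> wordsResult{};

    // Print every recognized field to the console.
    void print()
    {
        std::cout << "log id: " << logID << '\n';
        std::cout << "words result number: " << resultNumber << '\n';

        for (auto& [name, res] : wordsResult) {
            // Convert the UTF-8 field name to GB encoding for the Windows console.
            std::cout << "\t" << UTF8ToGB(name.c_str()) << ": ";
            res.print();
        }
    }

    // Draw the bounding box of every recognized field onto img.
    void draw(cv::Mat& img)
    {
        for (auto& [name, res] : wordsResult) {
            res.draw(img);
        }
    }
};

class BusinessLicense
{
public:
    BusinessLicense();
    ~BusinessLicense();

    // Send the base64-encoded image and the request options; return the parsed JSON response.
    Json::Value request(std::string imgBase64, std::map<std::string, std::string>& options);

    // Convert the last response into a BusinessLicenseInfo.
    void getResult(BusinessLicenseInfo& result);

private:
    Json::Value m_obj;       // JSON object of the last response
    std::string m_url;       // request URL
    std::string m_filename;  // file that stores the access token
};

void BusinessLicenseTest();
BusinessLicenseInfo BusinessLicenseDetect(std::string imgPath);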
BusinessLicense.cpp: screenshot/functions/BusinessLicense.cpp https://github.com/busyboxs/screenshot/blob/master/functions/BusinessLicense.cpp
#include "BusinessLicense.h"

BusinessLicense::BusinessLicense()
{
    m_url = "https://aip.baidubce.com/rest/2.0/ocr/v1/business_license";
    m_filename = "tokenKey";
}

BusinessLicense::~BusinessLicense()
{
}

Json::Value BusinessLicense::request(std::string imgBase64, std::map<std::string, std::string>& options)
{
    std::string response;
    Json::Value obj;
    std::string token;

    // 1. Build the HTTP POST body from the image and the options.
    std::string body;
    mergeHttpPostBody(body, imgBase64, options);

    // 2. Build the request URL, including the access token.
    std::string url = m_url;
    getHttpPostUrl(url, m_filename, token);

    // 3. Send the POST request; the raw result is stored in response.
    int status_code = httpPostRequest(url, body, response);
    if (status_code != CURLcode::CURLE_OK) {
        obj["curl_error_code"] = status_code;
        m_obj = obj;
        return obj; // TODO: maybe should exit
    }

    // 4. Parse the response string into a JSON object.
    generateJson(response, obj);

    // If the access token is invalid or expired, fetch a new one and retry.
    // The retry is recursive, so a token error that persists would recurse without bound.
    if (obj["error_code"].asInt() == 110 || obj["error_code"].asInt() == 111) {
        token = getTokenKey();
        writeFile(m_filename, token);
        return request(imgBase64, options);
    }

    m_obj = obj;
    // checkErrorWithExit(obj);
    return obj;
}

void BusinessLicense::getResult(BusinessLicenseInfo& result)
{
    // An API error carries a non-null "error_code"; a failed HTTP request
    // leaves only "curl_error_code" in m_obj. Report both as an error.
    const Json::Value errorCode = m_obj.get("error_code", Json::Value());
    if (!errorCode.isNull() || m_obj.isMember("curl_error_code")) {
        result.wordsResult["error_code"].words = m_obj.get("error_code", "null").asString();
        result.wordsResult["error_msg"].words = m_obj.get("error_msg", "null").asString();
        return;
    }

    result.logID = m_obj["log_id"].asUInt64();
    result.resultNumber = m_obj["words_result_num"].asUInt();

    // Copy every recognized field into the result map.
    Json::Value::Members keys = m_obj["words_result"].getMemberNames();
    for (const auto& key : keys) {
        ResultPart resultPart;
        getResultPart(m_obj["words_result"][key], resultPart);
        result.wordsResult[key] = resultPart;
    }
}

void BusinessLicenseTest()
{
    // Read the test image and encode it as base64.
    std::string img_file = "./images/businesslicense_test.png";
    std::string out;
    readImageFile(img_file.c_str(), out);
    std::string img_base64 = base64_encode(out.c_str(), (int)out.size());

    // Set the request options.
    std::map<std::string, std::string> options;
    options["detect_direction"] = "false";
    //options["accuracy"] = "normal";

    Json::Value obj;
    BusinessLicense businessLicenseObj;
    BusinessLicenseInfo result;
    obj = businessLicenseObj.request(img_base64, options);

    businessLicenseObj.getResult(result);
    result.print();

    // Draw the recognized regions on the image, then show and save it.
    cv::Mat img = cv::imread("./images/businesslicense_test.png");
    result.draw(img);
    cv::namedWindow("businessLicense", cv::WINDOW_NORMAL);
    cv::imshow("businessLicense", img);
    cv::imwrite("./images/businesslicense_result.jpg", img);
    cv::waitKey();
}

BusinessLicenseInfo BusinessLicenseDetect(std::string imgPath)
{
    // Read the image and encode it as base64.
    std::string out;
    readImageFile(imgPath.c_str(), out);
    std::string img_base64 = base64_encode(out.c_str(), (int)out.size());

    // Set the request options.
    std::map<std::string, std::string> options;
    options["detect_direction"] = "false";

    Json::Value obj;
    BusinessLicense businessLicenseObj;
    BusinessLicenseInfo result;
    obj = businessLicenseObj.request(img_base64, options);

    businessLicenseObj.getResult(result);
    return result;
}
Test image
Test results