Also published at: https://yangshun.win/blogs/e48e9a13/
GitHub code: https://github.com/busyboxs/BaiDuAICPP
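Brand logo recognition can identify more than 20,000 categories of product logos, lets users build their own brand logo library, and accurately returns the name of the brand logo in an image. It suits business scenarios that need brand information quickly.
Application scenarios
- Brand information lookup: recognize the product logo name from a photo, which speeds up access to brand information, gives consumers an easy and efficient way to get that information, and promotes the conversion of consumers into investors.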
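Interface description
This request detects and recognizes logo information in an image, such as TV station logos and brand trademarks. Given one input image (decodable, with a suitable aspect ratio), it outputs the name, location, and confidence of each logo in the image.
In practice, you can call the logo recognition - search interface directly; it recognizes more than 20,000 logo names. If the results are poor, you can build a sub-library (create an application in the console and apply for library creation), add your own logos through the logo recognition - add interface, and then call the logo recognition - search interface again, choosing to search within the custom logo library to improve recognition.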
Request description
- HTTP method: POST
- Request URL: https://aip.baidubce.com/rest/2.0/image-classify/v2/logo
- URL parameter: access_token
- Header parameter: Content-Type = application/x-www-form-urlencoded
- Body parameters: see the table below
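The request description above can be sketched in code. This is only a minimal sketch under stated assumptions, not the code from util.cpp: `urlEncode`, `buildRequestUrl`, and `buildRequestBody` are hypothetical helper names, and the body field name `image` and the option `custom_lib` are assumptions (`custom_lib` appears only in a commented-out line of the test code in this post). The point it illustrates is that base64 text contains `+`, `/`, and `=`, which have to be percent-encoded in an application/x-www-form-urlencoded body.

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <string>

// Percent-encode a string for an application/x-www-form-urlencoded body.
// RFC 3986 unreserved characters pass through; everything else becomes %XX.
std::string urlEncode(const std::string& in) {
    static const char* kHex = "0123456789ABCDEF";
    std::string out;
    out.reserve(in.size() * 3);
    for (unsigned char c : in) {
        if (std::isalnum(c) || c == '-' || c == '_' || c == '.' || c == '~') {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += kHex[c >> 4];
            out += kHex[c & 0x0F];
        }
    }
    return out;
}

// Append the access token as the URL query parameter access_token.
std::string buildRequestUrl(const std::string& baseUrl,
                            const std::string& accessToken) {
    assert(!accessToken.empty());
    return baseUrl + "?access_token=" + urlEncode(accessToken);
}

// Build the POST body: the base64 image (assumed field name "image")
// followed by any optional key/value pairs.
std::string buildRequestBody(const std::string& imgBase64,
                             const std::map<std::string, std::string>& options) {
    std::string body = "image=" + urlEncode(imgBase64);
    for (const auto& kv : options) {
        body += '&' + urlEncode(kv.first) + '=' + urlEncode(kv.second);
    }
    return body;
}
```

The resulting URL and body would then be handed to whatever HTTP client is in use, together with the Content-Type header listed above.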
Response description
The response parameters are listed in the table below:
A sample response:
{
  "log_id": 843411868,
  "result_num": 1,
  "result": [
    {
      "type": 0,
      "name": "科颜氏",
      "probability": 0.99998807907104,
      "location": {
        "width": 296,
        "top": 20,
        "height": 128,
        "left": 23
      }
    }
  ]
}
Learn more about logo recognition - add[https://ai.baidu.com/ai-doc/IMAGERECOGNITION/Ok3bcxc59#logo%E8%AF%86%E5%88%AB%E5%85%A5%E5%BA%93] and logo recognition - delete[https://ai.baidu.com/ai-doc/IMAGERECOGNITION/Ok3bcxc59#logo%E8%AF%86%E5%88%AB%E5%88%A0%E9%99%A4]
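Calling the API from C++
This assumes the environment is already set up. For setup instructions, see Configuring the Baidu AI image recognition C++ development environment with Vcpkg on Windows (VS2017)[https://yangshun.win/blogs/3b103680/].
For convenience, two structs are first defined from the response parameters; they hold the fields of the response, as follows: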
struct Location
{
    int left;
    int top;
    int width;
    int height;

    void print()
    {
        std::cout << "\n\t left: " << left << " top: " << top
                  << " width: " << width << " height: " << height << '\n';
    }

    void draw(cv::Mat &img)
    {
        cv::Rect rect(left, top, width, height);
        cv::rectangle(img, rect, cv::Scalar(255, 0, 255), 3);
    }
};

struct LogoInfo
{
    int type;
    std::string name;
    float probability;
    Location location;

    void print()
    {
        std::cout << std::setw(30) << std::setfill('-') << '\n';
        std::cout << "type: " << type << "\n";
        std::cout << "name: " << name << "\n";
        std::cout << "probability: " << std::fixed << std::setprecision(4) << probability << "\n";
        std::cout << "location: ";
        location.print();
    }

    void draw(cv::Mat &img)
    {
        location.draw(img);
    }
};
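The Location struct defines a print method that prints the logo's position, and a draw method that draws the logo's bounding box on the image.
The LogoInfo struct defines a print method that prints the full logo result; its draw method forwards to Location::draw to draw the bounding box.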
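Drawing with cv::rectangle tolerates a box that extends past the image edge, but cropping a cv::Mat with an out-of-range rectangle does not. If the returned box is ever used for cropping, it may be worth clipping it to the image first. The following is a minimal standalone sketch of that clipping step; `Box` and `clampToImage` are hypothetical names introduced for illustration, and whether the API can actually return out-of-range boxes is not something this post establishes.

```cpp
#include <algorithm>
#include <cassert>

// Same four fields as Location, kept separate so the sketch needs no OpenCV.
struct Box {
    int left;
    int top;
    int width;
    int height;
};

// Clip a box to an image of size imgWidth x imgHeight.
// The result has zero width or height when the box lies outside the image.
Box clampToImage(const Box& box, int imgWidth, int imgHeight) {
    assert(imgWidth > 0 && imgHeight > 0);
    const int x0 = std::max(0, box.left);
    const int y0 = std::max(0, box.top);
    const int x1 = std::min(imgWidth, box.left + box.width);
    const int y1 = std::min(imgHeight, box.top + box.height);

    Box out;
    out.left = x0;
    out.top = y0;
    out.width = std::max(0, x1 - x0);
    out.height = std::max(0, y1 - y0);
    return out;
}
```

A caller would check that the clipped width and height are both positive before building a cv::Rect from them.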
Next, a class is defined to call the interface and fetch the results:
class Logo
{
public:
    Logo();
    ~Logo();

    Json::Value request(std::string imgBase64, std::map<std::string, std::string>& options);

    uint32_t getResultNum();

    // get all return results
    void getAllResult(std::vector<LogoInfo>& results);

    // only get first result
    void getResult(LogoInfo& result);

private:
    Json::Value obj_;
    std::string url_;
    uint32_t result_num_;
    // file to save token key
    std::string filename_;
};
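Among the private members, obj_ is the JSON object holding the response, url_ is the request URL, result_num_ is the number of results returned, and filename_ is the name of the file used to store the access token.
The request function takes the base64 encoding of the image and the request options, and returns a JSON object containing the result of the request.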
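getAllResult fetches every result in the response, result_num entries in total.
getResult fetches only the first entry, result[0], which this post treats as the highest-confidence result.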
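If you would rather not depend on the order of the result array, the highest-confidence entry can be picked explicitly. The sketch below is an illustration under assumptions, not part of the Logo class: `LogoHit`, `bestHit`, and `filterByProbability` are hypothetical names, and `LogoHit` keeps only the two fields needed here. It also covers the empty case, which getResult does not check.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Reduced stand-in for LogoInfo: just the name and the confidence.
struct LogoHit {
    std::string name;
    float probability;
};

// Return a pointer to the hit with the highest probability,
// or nullptr when the list is empty.
const LogoHit* bestHit(const std::vector<LogoHit>& hits) {
    if (hits.empty()) {
        return nullptr;
    }
    auto it = std::max_element(
        hits.begin(), hits.end(),
        [](const LogoHit& a, const LogoHit& b) {
            return a.probability < b.probability;
        });
    return &*it;
}

// Keep only the hits whose probability is at least minProbability.
std::vector<LogoHit> filterByProbability(const std::vector<LogoHit>& hits,
                                         float minProbability) {
    assert(minProbability >= 0.0f && minProbability <= 1.0f);
    std::vector<LogoHit> kept;
    for (const auto& hit : hits) {
        if (hit.probability >= minProbability) {
            kept.push_back(hit);
        }
    }
    return kept;
}
```

The returned pointer refers into the vector passed in, so it stays valid only as long as that vector is alive and unmodified.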
The complete code follows.
For util.h and util.cpp, see (Simple Call Series 01) Advanced General Object and Scene Recognition - Simple C++ Call[https://yangshun.win/blogs/cd08a730/]
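The test code in this post relies on imgToBase64 from util.cpp to turn the image file into base64 text. That implementation lives in the linked article; for readers who want to see what the encoding step involves, here is a minimal standard-library sketch. `base64Encode` is a hypothetical name, it works on bytes already read into a std::string, and it is not claimed to match the author's helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Encode raw bytes as standard base64 (RFC 4648 alphabet, '=' padding).
std::string base64Encode(const std::string& bytes) {
    static const char kTable[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    out.reserve((bytes.size() + 2) / 3 * 4);

    std::size_t i = 0;
    while (i < bytes.size()) {
        // Pack up to three bytes into one 24-bit group, zero-padded.
        const std::size_t remaining = bytes.size() - i;
        std::uint32_t group = static_cast<unsigned char>(bytes[i]) << 16;
        if (remaining > 1) {
            group |= static_cast<unsigned char>(bytes[i + 1]) << 8;
        }
        if (remaining > 2) {
            group |= static_cast<unsigned char>(bytes[i + 2]);
        }

        // Emit four 6-bit symbols; missing input bytes become '='.
        out += kTable[(group >> 18) & 0x3F];
        out += kTable[(group >> 12) & 0x3F];
        out += remaining > 1 ? kTable[(group >> 6) & 0x3F] : '=';
        out += remaining > 2 ? kTable[group & 0x3F] : '=';
        i += 3;
    }
    assert(out.size() % 4 == 0);
    return out;
}
```

To use it on an image, the file would first be read in binary mode into a std::string and then passed to this function.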
Logo.h is as follows:
#pragma once
#include "util.h"

struct Location
{
    int left;
    int top;
    int width;
    int height;

    void print()
    {
        std::cout << "\n\t left: " << left << " top: " << top
                  << " width: " << width << " height: " << height << '\n';
    }

    void draw(cv::Mat &img)
    {
        cv::Rect rect(left, top, width, height);
        cv::rectangle(img, rect, cv::Scalar(255, 0, 255), 3);
    }
};

struct LogoInfo
{
    int type;
    std::string name;
    float probability;
    Location location;

    void print()
    {
        std::cout << std::setw(30) << std::setfill('-') << '\n';
        std::cout << "type: " << type << "\n";
        std::cout << "name: " << name << "\n";
        std::cout << "probability: " << std::fixed << std::setprecision(4) << probability << "\n";
        std::cout << "location: ";
        location.print();
    }

    void draw(cv::Mat &img)
    {
        location.draw(img);
    }
};

class Logo
{
public:
    Logo();
    ~Logo();

    Json::Value request(std::string imgBase64, std::map<std::string, std::string>& options);

    uint32_t getResultNum();

    // get all return results
    void getAllResult(std::vector<LogoInfo>& results);

    // only get first result
    void getResult(LogoInfo& result);

private:
    Json::Value obj_;
    std::string url_;
    uint32_t result_num_;
    // file to save token key
    std::string filename_;
};

void logoTest();
Logo.cpp is as follows:
#include "Logo.h"

Logo::Logo()
{
    filename_ = "tokenKey";
    url_ = "https://aip.baidubce.com/rest/2.0/image-classify/v2/logo";
}

Logo::~Logo()
{
}

Json::Value Logo::request(std::string imgBase64, std::map<std::string, std::string>& options)
{
    std::string response;
    Json::Value obj;
    std::string token;

    // 1. get HTTP post body
    std::string body;
    mergeHttpPostBody(body, imgBase64, options);

    // 2. get HTTP url with access token
    std::string url = url_;
    getHttpPostUrl(url, filename_, token);

    // 3. post request, response store the result
    int status_code = httpPostRequest(url, body, response);
    if (status_code != CURLcode::CURLE_OK) {
        obj["curl_error_code"] = status_code;
        obj_ = obj;
        return obj; // TODO: maybe should exit
    }

    // 4. make string to json object
    generateJson(response, obj);

    // if access token is invalid or expired, we will get a new one
    if (obj["error_code"].asInt() == 110 || obj["error_code"].asInt() == 111) {
        token = getTokenKey();
        writeFile(filename_, token);
        return request(imgBase64, options);
    }

    obj_ = obj;
    // check for other error code
    checkErrorWithExit(obj);

    return obj;
}

uint32_t Logo::getResultNum()
{
    return obj_["result_num"].asInt();
}

void Logo::getAllResult(std::vector<LogoInfo>& results)
{
    result_num_ = getResultNum();
    results.reserve(result_num_);

    LogoInfo tmp;
    for (uint32_t i = 0; i < result_num_; ++i) {
        tmp.type = obj_["result"][i]["type"].asInt();
        // convert the UTF-8 name to GB encoding for the Windows console
        tmp.name = UTF8ToGB(obj_["result"][i]["name"].asString().c_str());
        tmp.probability = obj_["result"][i]["probability"].asFloat();
        tmp.location.left = obj_["result"][i]["location"]["left"].asInt();
        tmp.location.top = obj_["result"][i]["location"]["top"].asInt();
        tmp.location.width = obj_["result"][i]["location"]["width"].asInt();
        tmp.location.height = obj_["result"][i]["location"]["height"].asInt();
        results.push_back(tmp);
    }
}

void Logo::getResult(LogoInfo & result)
{
    result.type = obj_["result"][0]["type"].asInt();
    result.name = UTF8ToGB(obj_["result"][0]["name"].asString().c_str());
    result.probability = obj_["result"][0]["probability"].asFloat();
    result.location.left = obj_["result"][0]["location"]["left"].asInt();
    result.location.top = obj_["result"][0]["location"]["top"].asInt();
    result.location.width = obj_["result"][0]["location"]["width"].asInt();
    result.location.height = obj_["result"][0]["location"]["height"].asInt();
}

void logoTest()
{
    std::cout << "size: " << sizeof(LogoInfo) << "\n";

    // read image and encode to base64
    std::string imgFile = "./images/logo_test.jpg";
    std::string imgBase64;
    imgToBase64(imgFile, imgBase64);

    // set options
    std::map<std::string, std::string> options;
    // options["custom_lib"] = "true";

    Json::Value obj;
    Logo logoObj;
    obj = logoObj.request(imgBase64, options);

    // show the first result
    LogoInfo result;
    logoObj.getResult(result);
    result.print();

    cv::Mat img = cv::imread(imgFile);
    result.draw(img);
    cv::namedWindow("Logo Test", cv::WINDOW_NORMAL);
    cv::imshow("Logo Test", img);

    // show all results
    std::vector<LogoInfo> results;
    logoObj.getAllResult(results);

    cv::Mat img1 = cv::imread(imgFile);
    cv::namedWindow("Logo Tests", cv::WINDOW_NORMAL);
    for (auto & vec : results) {
        vec.print();
        vec.draw(img1);
    }
    cv::imshow("Logo Tests", img1);
    cv::waitKey();
}
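In Logo::request, the token-refresh path (error codes 110 and 111) calls request again recursively with no limit, so a token that keeps being rejected would recurse indefinitely. One possible alternative is a bounded retry loop. The sketch below shows only that control flow, with the HTTP call and the token refresh passed in as callables; `ApiReply`, `isTokenError`, and `requestWithRetry` are hypothetical names, not part of the author's code.

```cpp
#include <cassert>
#include <functional>

// Minimal stand-in for a parsed response: 0 means success.
struct ApiReply {
    int errorCode;
};

// Error codes 110 and 111 are the ones Logo::request treats as
// "access token invalid or expired".
bool isTokenError(int errorCode) {
    return errorCode == 110 || errorCode == 111;
}

// Call send(); while the reply is a token error, refresh the token and
// try again, at most maxRetries times. Returns the last reply received.
ApiReply requestWithRetry(const std::function<ApiReply()>& send,
                          const std::function<void()>& refreshToken,
                          int maxRetries) {
    assert(maxRetries >= 0);
    ApiReply reply = send();
    for (int attempt = 0;
         attempt < maxRetries && isTokenError(reply.errorCode);
         ++attempt) {
        refreshToken();
        reply = send();
    }
    return reply;
}
```

With this shape, a caller can tell a persistent token failure apart from success by checking the error code of the returned reply.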
main.cpp is as follows:
#include "util.h"
#include "Logo.h"
#include <cstdlib>

int main()
{
    logoTest();

    // keep the console window open (Windows only)
    system("pause");
    return EXIT_SUCCESS;
}
Results
Test image