Notes: Wide & Deep Learning

才能我浪费99 · Posted 2018-02 · Views: 3412 · Replies: 0

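Ever since I came across a figure about it a couple of days ago, I had been meaning to read the related paper. I finally found time over the last two days to go through it: Wide & Deep Learning for Recommender Systems.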

The paper opens with an introduction that explains what Wide and Deep mean:
Wide: logistic regression (LR) over high-dimensional features plus feature crosses. From the paper: "Generalized linear models with nonlinear feature transformations are widely used for large-scale regression and classification problems with sparse inputs. Memorization of feature interactions through a wide set of cross-product feature transformations are effective and interpretable, while generalization requires more feature engineering effort."
Deep: a deep neural network. From the paper: "With less feature engineering, deep neural networks can generalize better to unseen feature combinations through low-dimensional dense embeddings learned for the sparse features. However, deep neural networks with embeddings can over-generalize and recommend less relevant items when the user-item interactions are sparse and high-rank."
The rest of the paper then describes how to combine Wide and Deep into one model.
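
To make the combination concrete, here is a minimal NumPy sketch of one forward pass, written from my reading of the paper: the wide logit (a linear model over the raw binary features and their cross-products) and the deep logit (embeddings followed by a hidden layer) are summed and passed through a single sigmoid. The function names, parameter names, layer sizes and random weights are all made up for illustration and are not taken from the paper's implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cross_product(x, pairs):
    """Cross-product transformation: AND together pairs of binary features."""
    return np.array([x[i] * x[j] for i, j in pairs], dtype=float)

def wide_deep_predict(x_sparse, ids, params, pairs):
    # Wide part: linear model over raw binary features plus their crosses.
    wide_in = np.concatenate([x_sparse, cross_product(x_sparse, pairs)])
    wide_logit = params["w_wide"] @ wide_in
    # Deep part: look up an embedding per categorical id, concatenate,
    # then apply one ReLU hidden layer (a real model would stack several).
    emb = np.concatenate([params["emb"][i] for i in ids])
    hidden = np.maximum(0.0, params["W1"] @ emb + params["b1"])
    deep_logit = params["w_deep"] @ hidden
    # Joint model: the two logits are added before a single sigmoid.
    return sigmoid(wide_logit + deep_logit + params["bias"])

# Toy inputs (hypothetical): 4 binary features, 2 crosses, 2 categorical ids.
rng = np.random.default_rng(0)
x_sparse = np.array([1.0, 0.0, 1.0, 1.0])
pairs = [(0, 2), (1, 3)]
ids = [2, 5]
params = {
    "w_wide": rng.normal(size=6),    # 4 raw features + 2 crosses
    "emb": rng.normal(size=(8, 3)),  # vocabulary of 8 ids, 3-dim embeddings
    "W1": rng.normal(size=(4, 6)),   # 2 ids * 3 dims -> 4 hidden units
    "b1": np.zeros(4),
    "w_deep": rng.normal(size=4),
    "bias": 0.0,
}
p = wide_deep_predict(x_sparse, ids, params, pairs)
print(cross_product(x_sparse, pairs), float(p))
```

Because both parts feed the same sigmoid, a single log loss can send gradients to the wide weights and the deep weights at the same time, which is what makes training the two parts together possible.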

Main content:
Two interesting concepts, Memorization and Generalization:
Memorization can be loosely defined as learning the frequent co-occurrence of items or features and exploiting the correlation available in the historical data.
Generalization, on the other hand, is based on transitivity of correlation and explores new feature combinations that have never or rarely occurred in the past.

The paper reviews the LR and deep learning approaches.

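It then describes their production setup, with some details:
The prediction target is app acquisition.
It compares joint training with an ensemble. In an ensemble the models are trained disjointly; joint training optimizes the whole model together.
During training, the LR (wide) part uses FTRL with L1 regularization, and the deep part uses AdaGrad.
The training data is 500 billion examples. How do they even get to that number? Impressive.
Continuous values are first normalized to [0,1] with the cumulative distribution function (CDF) and then bucketed into discrete levels. That is a nice trick.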
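
The CDF trick in the last item can be sketched roughly as follows. As I read the paper, a value x is mapped to P(X <= x), the [0,1] range is split into n_q quantiles, and a value falling in the i-th quantile is replaced by (i-1)/(n_q-1). The function name `quantile_normalize` and the use of an empirical CDF computed from the sample itself are my own assumptions for illustration.

```python
import numpy as np

def quantile_normalize(values, n_q):
    """Map continuous values to [0, 1] via the empirical CDF, then bucket
    them into n_q quantiles (hypothetical helper, not the paper's code)."""
    values = np.asarray(values, dtype=float)
    # Empirical CDF: fraction of samples less than or equal to each value.
    sorted_vals = np.sort(values)
    cdf = np.searchsorted(sorted_vals, values, side="right") / len(values)
    # Quantile index i in 1..n_q for each value.
    i = np.clip(np.ceil(cdf * n_q), 1, n_q).astype(int)
    # Normalized value for the i-th quantile.
    return (i - 1) / (n_q - 1)

raw = np.arange(1, 9)  # toy feature values 1..8
normalized = quantile_normalize(raw, n_q=4)
print(normalized)
```

With n_q = 4 the eight toy values collapse to only four distinct levels between 0 and 1, so heavy-tailed features such as counts no longer dominate the input scale.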

The paper is short and quite an interesting read; it is worth going through in detail on your own.
