The reader in the Seq2Seq model zoo example

There are a few things in the reader of the Seq2Seq model zoo example that I don't understand, and I'd appreciate some help.

def read_all_line(filenam):
    data = []
    with io.open(filename, "r", encoding='utf-8') as f:
        for line in f.readlines():
            data.append(line.strip())

The parameter is declared as `filenam`, but the body opens `filename`, so the name looks misspelled.
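A minimal sketch of what the corrected function might look like, assuming the intent was simply for the parameter to be spelled `filename`. The `return data` line is also an assumption: the snippet quoted above has no return statement, so the list would otherwise be discarded.

```python
import io


def read_all_line(filename):
    """Read a UTF-8 text file and return its lines with surrounding whitespace stripped."""
    data = []
    with io.open(filename, "r", encoding="utf-8") as f:
        # Iterating over the file object yields one line at a time.
        for line in f:
            data.append(line.strip())
    # Assumption: the quoted snippet does not return anything; returning
    # the list makes the function usable by a caller.
    return data
```

With the name mismatch left in place, calling the function would raise a `NameError` unless a global variable named `filename` happened to exist.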

 

Also, in `_para_file_to_ids`:

def _para_file_to_ids(src_file, tar_file, src_vocab, tar_vocab):
    src_data = []
    with io.open(src_file, "r", encoding='utf-8') as f_src:
        for line in f_src.readlines():
            arra = line.strip().split()
            ids = [src_vocab[w] if w in src_vocab else UNK_ID for w in arra]
            ids = ids

            src_data.append(ids)

    tar_data = []
    with io.open(tar_file, "r", encoding='utf-8') as f_tar:
        for line in f_tar.readlines():
            arra = line.strip().split()
            ids = [tar_vocab[w] if w in tar_vocab else UNK_ID for w in arra]

            ids = [1] + ids + [2]

            tar_data.append(ids)

    return src_data, tar_data

For `tar_data`, the ids are built as `ids = [1] + ids + [2]`.

What do the 1 and 2 mean here? Haven't ids 1 and 2 already been assigned to two words in `_build_vocab`?

GitHub link: https://github.com/PaddlePaddle/models/blob/release/1.8/dygraph/seq2seq/reader.py
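One possible reading, offered as a sketch rather than a confirmed answer: in many seq2seq pipelines the target sequence is wrapped in a start-of-sentence and an end-of-sentence token. If `_build_vocab` assigns ids by line order and the vocabulary file lists `<unk>`, `<s>`, `</s>` on its first three lines (both are assumptions, not verified against the linked repository), then ids 1 and 2 are exactly those two special tokens, and the literals do not collide with ordinary words. The helper names `build_vocab_from_lines` and `target_to_ids`, the constants `BOS_ID` and `EOS_ID`, and the value `UNK_ID = 0` are hypothetical and used only for illustration.

```python
UNK_ID = 0  # assumption: id of "<unk>", the first vocab line
BOS_ID = 1  # hypothetical name for the literal 1 (start of sentence, "<s>")
EOS_ID = 2  # hypothetical name for the literal 2 (end of sentence, "</s>")


def build_vocab_from_lines(lines):
    """Map each vocab entry to its line index (assumed behavior of _build_vocab)."""
    return {word.strip(): idx for idx, word in enumerate(lines)}


def target_to_ids(sentence, vocab):
    """Convert a target sentence to ids and wrap it in start/end markers."""
    ids = [vocab.get(w, UNK_ID) for w in sentence.strip().split()]
    return [BOS_ID] + ids + [EOS_ID]


# Assumed vocab layout: special tokens first, ordinary words after.
vocab = build_vocab_from_lines(["<unk>", "<s>", "</s>", "hello", "world"])
example = target_to_ids("hello there world", vocab)
print(example)  # [1, 3, 0, 4, 2]
```

Under these assumptions the decoder sees `<s>` as its first input and learns to emit `</s>` when the sentence is finished, which is why only the target side is wrapped. Checking the first three lines of the actual vocab file would confirm or refute this reading.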

All comments (4)
AIStudio784238
#2 Replied 2020-10

I need help. Where can I ask for help with this?

thinc
#3 Replied 2020-10

Well... the formatting could be better. There is an insert/edit code sample option at the top of the editor.

AIStudio810258
#4 Replied 2020-10

Please use the code format; the indentation has been lost.

AIStudio784238
#5 Replied 2020-10

There are a few things in the reader of the Seq2Seq model zoo example that I don't understand, and I'd appreciate some help.

def read_all_line(filenam):
    data = []
    with io.open(filename, "r", encoding='utf-8') as f:
        for line in f.readlines():
            data.append(line.strip())

The parameter is declared as `filenam`, but the body opens `filename`, so the name looks misspelled.

Also, in `_para_file_to_ids`:

def _para_file_to_ids(src_file, tar_file, src_vocab, tar_vocab):
    src_data = []
    with io.open(src_file, "r", encoding='utf-8') as f_src:
        for line in f_src.readlines():
            arra = line.strip().split()
            ids = [src_vocab[w] if w in src_vocab else UNK_ID for w in arra]
            ids = ids

            src_data.append(ids)

    tar_data = []
    with io.open(tar_file, "r", encoding='utf-8') as f_tar:
        for line in f_tar.readlines():
            arra = line.strip().split()
            ids = [tar_vocab[w] if w in tar_vocab else UNK_ID for w in arra]

            ids = [1] + ids + [2]

            tar_data.append(ids)

    return src_data, tar_data

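For `tar_data`, the ids are built as `ids = [1] + ids + [2]`.

What do the 1 and 2 mean here? Haven't ids 1 and 2 already been assigned to two words in `_build_vocab`?

GitHub link: https://github.com/PaddlePaddle/models/blob/release/1.8/dygraph/seq2seq/reader.py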
