How do I write attention code?
The docs do include an attention example, but some things are unclear. For instance, how should I write it if my start token is not 0? Take the decoder code:
```python
trg_embedding = paddle.layer.embedding(
    input=paddle.layer.data(
        name='target_language_word',
        type=paddle.data_type.integer_value_sequence(target_dict_dim)),
    size=word_vector_dim,
    param_attr=paddle.attr.ParamAttr(name='_target_language_embedding'))
group_inputs.append(trg_embedding)
```
How is this trg_embedding obtained? And if my initial start token id is not 0, what do I need to change?
As far as I know, the labels must start from 0. Where is the example you mentioned? Please share a link.
https://github.com/PaddlePaddle/book/tree/develop/08.machine_translation
"How is this trg_embedding obtained?"
It is obtained by passing the input data through the embedding layer. What exactly is your question here?
"The initial start label is not 0"
Could you explain this part a bit more?
If you are training a new task from scratch, the label ids can always be made to start from 0.
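One hedged way to do that, sketched in plain Python (this is preprocessing on your side, not a PaddlePaddle API): build a mapping from whatever ids the corpus uses to a dense 0-based range before feeding the data in.

```python
# Remap arbitrary token/label ids to a dense 0-based range,
# since the embedding layer indexes its parameter matrix from 0.
def build_id_map(ids):
    """Assign consecutive 0-based ids in first-seen order."""
    mapping = {}
    for i in ids:
        if i not in mapping:
            mapping[i] = len(mapping)
    return mapping

raw = [7, 3, 7, 42, 3]          # hypothetical corpus ids
id_map = build_id_map(raw)
remapped = [id_map[i] for i in raw]
print(remapped)                 # [0, 1, 0, 2, 1]
```

Keep the mapping around so you can translate the decoder's 0-based outputs back to the original ids at inference time.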
trg_embedding works like a table lookup: the layer's parameter is an N×M matrix, where N is the number of token ids (valid range [0, N-1]) and M is the dimension of the embedding vector.
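The table-lookup view above can be sketched in plain Python (toy sizes, learned weights replaced by a hand-built matrix; this is an illustration, not the PaddlePaddle implementation):

```python
# Embedding as a lookup table: an N x M matrix indexed by token id.
N, M = 5, 3  # N = vocab size (ids in [0, N-1]), M = embedding dim
# A toy parameter matrix; in training these values would be learned.
table = [[float(i * M + j) for j in range(M)] for i in range(N)]

def embed(token_ids, table):
    """Map a sequence of token ids to their embedding vectors."""
    return [table[t] for t in token_ids]

vectors = embed([0, 4, 2], table)
print(vectors[1])  # row 4 of the table: [12.0, 13.0, 14.0]
```

This is also why the ids must start at 0: an id outside [0, N-1] has no row in the table.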
OK, I changed it so the labels start from 0, but now I'm hitting a new error:

```
F0522 15:32:31.800976 32765 Vector.h:139] Check failed: start + size <= static_cast(getSize()) (485 vs. 477)
*** Check failure stack trace: ***
@ 0x7f346227a2ed google::LogMessage::Fail()
@ 0x7f346227dd9c google::LogMessage::SendToLog()
@ 0x7f3462279e13 google::LogMessage::Flush()
@ 0x7f346227f2ae google::LogMessageFatal::~LogMessageFatal()
@ 0x7f3462060a24 paddle::Argument::concat()
@ 0x7f3461e500cf paddle::MultiGradientMachine::getOutArgs()
@ 0x7f3461e52454 paddle::MultiGradientMachine::forwardImp()
@ 0x7f3461df798d _wrap_GradientMachine_forward
@ 0x7f34bcd81bad PyEval_EvalFrameEx
@ 0x7f34bcd82c3e PyEval_EvalCodeEx
@ 0x7f34bcd821f7 PyEval_EvalFrameEx
@ 0x7f34bcd82c3e PyEval_EvalCodeEx
@ 0x7f34bcd821f7 PyEval_EvalFrameEx
@ 0x7f34bccf239c gen_send_ex
@ 0x7f34bcd7c4f1 PyEval_EvalFrameEx
@ 0x7f34bccf239c gen_send_ex
@ 0x7f34bcd7c4f1 PyEval_EvalFrameEx
@ 0x7f34bcd82c3e PyEval_EvalCodeEx
@ 0x7f34bcd821f7 PyEval_EvalFrameEx
@ 0x7f34bcd82c3e PyEval_EvalCodeEx
@ 0x7f34bcd821f7 PyEval_EvalFrameEx
@ 0x7f34bcd82c3e PyEval_EvalCodeEx
@ 0x7f34bcd821f7 PyEval_EvalFrameEx
@ 0x7f34bcd82c3e PyEval_EvalCodeEx
@ 0x7f34bcd821f7 PyEval_EvalFrameEx
@ 0x7f34bcd82c3e PyEval_EvalCodeEx
@ 0x7f34bcd82d52 PyEval_EvalCode
@ 0x7f34bcda3450 PyRun_FileExFlags
@ 0x7f34bcda362f PyRun_SimpleFileExFlags
@ 0x7f34bcdb8fd4 Py_Main
@ 0x7f34bbfbdc05 __libc_start_main
```

What is going on here?
To be more specific, here is how I call it:
```python
def infer_batch_probs(self, infer_data, feeding_dict):
    """Infer the prob matrices for a batch of speech utterances.

    :param infer_data: List of utterances to infer, with each utterance
                       consisting of a tuple of audio features and
                       transcription text (empty string).
    :type infer_data: list
    :param feeding_dict: Feeding is a map of field name and tuple index
                         of the data that reader returns.
    :type feeding_dict: dict|list
    :return: List of 2-D probability matrices, each consisting of prob
             vectors for one speech utterance.
    :rtype: List of matrix
    """
    # define inferer
    if self._inferer is None:
        self._inferer = paddle.inference.Inference(
            output_layer=self._loss, parameters=self._parameters)
    adapted_feeding_dict = self._adapt_feeding_dict(feeding_dict)
    adapted_infer_data = self._adapt_data(infer_data)
    # run inference
    infer_results = self._inferer.infer(
        input=adapted_infer_data, feeding=adapted_feeding_dict)
    pdb.set_trace()
```
Here self._loss is the beam_gen returned by the network below:
```python
decode_first = paddle.layer.first_seq(input=back_ward_gru)
decoder_boot = paddle.layer.mixed(
    size=decoder_size,
    act=paddle.activation.Tanh(),
    input=paddle.layer.full_matrix_projection(decode_first))

def gru_decoder_with_attention(enc_vec, enc_proj, current_word):
    decoder_mem = paddle.layer.memory(
        name='gru_decoder', size=decoder_size, boot_layer=decoder_boot)
    context = paddle.networks.simple_attention(
        encoded_sequence=enc_vec,
        encoded_proj=enc_proj,
        decoder_state=decoder_mem)
    # decoder_inputs = paddle.layer.fc(
    #     act=paddle.activation.Linear(),
    #     size=decoder_size * 3,
    #     bias_attr=False,
    #     input=[context, current_word],
    #     layer_attr=paddle.attr.ExtraLayerAttribute(
    #         error_clipping_threshold=100.0))
    decoder_inputs = paddle.layer.mixed(
        size=decoder_size * 3,
        input=[
            paddle.layer.full_matrix_projection(input=context),
            paddle.layer.full_matrix_projection(input=current_word)
        ])
    gru_step = paddle.layer.gru_step(
        name='gru_decoder',
        input=decoder_inputs,
        output_mem=decoder_mem,
        size=decoder_size)
    out = paddle.layer.mixed(
        size=dict_size,
        bias_attr=True,
        act=paddle.activation.Softmax(),
        input=paddle.layer.full_matrix_projection(input=gru_step))
    # out = paddle.layer.fc(
    #     size=dict_size,
    #     bias_attr=True,
    #     act=paddle.activation.Softmax(),
    #     input=gru_step)
    return out

decoder_group_name = 'decoder_group'
group_input1 = paddle.layer.StaticInput(input=encoded_vector)
group_input2 = paddle.layer.StaticInput(input=encoded_proj)
group_inputs = [group_input1, group_input2]
trg_embedding = paddle.layer.GeneratedInput(
    size=dict_size,
    embedding_name='_target_language_embedding',
    embedding_size=1024)
group_inputs.append(trg_embedding)
beam_gen = paddle.layer.beam_search(
    name=decoder_group_name,
    step=gru_decoder_with_attention,
    input=group_inputs,
    bos_id=0,
    eos_id=1,
    beam_size=5)
return beam_gen
```
From the log, the frame `@ 0x7f3462060a24 paddle::Argument::concat()` is where it fails. Your config doesn't call that layer directly, so I suspect that somewhere a layer with multiple inputs has shapes that don't line up, for example:
```python
decoder_inputs = paddle.layer.mixed(
    size=decoder_size * 3,
    input=[
        paddle.layer.full_matrix_projection(input=context),
        paddle.layer.full_matrix_projection(input=current_word)
    ])
```
I'm not saying that this particular code is wrong; rather, go through the layers that take multiple inputs and check whether their shapes actually match.
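The failed check, `start + size <= getSize() (485 vs. 477)`, is the kind of invariant a concat has to maintain: each slice being copied must fit inside the destination buffer. A minimal sketch of that idea in plain Python (hypothetical sizes; this is not the PaddlePaddle implementation):

```python
# Sketch of the consistency check behind paddle::Argument::concat():
# slices being concatenated must fit inside the destination buffer.
def check_concat(part_sizes, total_size):
    """Mimic 'start + size <= getSize()' for each slice in turn."""
    start = 0
    for size in part_sizes:
        assert start + size <= total_size, (
            "Check failed: %d vs. %d" % (start + size, total_size))
        start += size
    return start == total_size

print(check_concat([100, 200, 177], 477))  # True: slices fit exactly
# check_concat([100, 200, 185], 477) raises "Check failed: 485 vs. 477",
# the same kind of mismatch as in the log above.
```

So a 485-vs-477 failure means some input produced 8 more elements than the destination expects, which is consistent with a sequence-length or layer-size mismatch among the inputs.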