Paper Reproduction Training Camp Notes, Day 3
Today's task is to implement a Transformer block.

Looking at the parameters, the target is ViT, but the function signatures appear to be borrowed from Swin Transformer, so I wrote it following the Swin code and stripped out the Swin-specific parts. One thing to watch: the assignment passes `norm_layer` as a string rather than a class, so you have to resolve it with `eval()` before instantiating it. PS: no GPU today, so training was slow; by the time I started writing there were only 4 hours left before the deadline, so I won't post results. Below is my implementation:
# Block implements one layer of the Transformer encoder:
# pre-norm multi-head attention and an MLP, each wrapped in a residual connection.
class Block(nn.Layer):
    def __init__(self, dim, num_heads, mlp_ratio=4., qkv_bias=False, qk_scale=None,
                 drop=0., attn_drop=0., drop_path=0., act_layer=nn.GELU,
                 norm_layer='nn.LayerNorm', epsilon=1e-5):
        super().__init__()
        self.dim = dim
        self.num_heads = num_heads
        self.mlp_ratio = mlp_ratio
        # norm_layer arrives as a string (e.g. 'nn.LayerNorm'), so resolve it with eval()
        self.norm1 = eval(norm_layer)(dim, epsilon=epsilon)
        self.attn = Attention(self.dim, num_heads=self.num_heads, qkv_bias=qkv_bias,
                              qk_scale=qk_scale, attn_drop=attn_drop, proj_drop=drop)
        # Stochastic depth: randomly drop the residual branch during training
        self.drop_path = DropPath(drop_path) if drop_path > 0. else Identity()
        self.norm2 = eval(norm_layer)(dim, epsilon=epsilon)
        mlp_hidden_dim = int(dim * mlp_ratio)
        self.mlp = Mlp(in_features=dim, hidden_features=mlp_hidden_dim,
                       act_layer=act_layer, drop=drop)

    def forward(self, x):
        # Pre-norm residual attention: x + DropPath(Attn(LN(x)))
        shortcut = x
        x = self.norm1(x)
        attn = self.attn(x)
        x = shortcut + self.drop_path(attn)
        # Pre-norm residual MLP: x + DropPath(MLP(LN(x)))
        x = x + self.drop_path(self.mlp(self.norm2(x)))
        return x
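The string-vs-class issue for `norm_layer` can be shown in isolation. This is a minimal sketch using a hypothetical stand-in class for `paddle.nn.LayerNorm` (so it runs without Paddle installed); it also shows an `eval()`-free alternative via `getattr`, which some prefer since `eval` will execute arbitrary code in the string:

```python
import types

class LayerNorm:
    """Stand-in for paddle.nn.LayerNorm, just to show the mechanics."""
    def __init__(self, dim, epsilon=1e-5):
        self.dim, self.epsilon = dim, epsilon

# Fake 'nn' module namespace so the string 'nn.LayerNorm' resolves
nn = types.SimpleNamespace(LayerNorm=LayerNorm)

norm_layer = 'nn.LayerNorm'                          # the assignment passes a string
norm1 = eval(norm_layer)(64)                         # eval() turns the string into the class
norm2 = getattr(nn, norm_layer.split('.')[-1])(64)   # eval-free: look the name up on the module
print(type(norm1).__name__)  # LayerNorm
```

Either way yields the same class; `getattr` just restricts the lookup to the `nn` namespace instead of evaluating arbitrary Python.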