Paper Reproduction Training Camp Log — Day 3
Today's task was to write the Transformer block.

Looking at the parameters, the target is ViT, but the function signatures appear to be borrowed from Swin Transformer, so I wrote it following the Swin code and stripped out the Swin-specific parts. One thing to note: the assignment passes `norm_layer` as a string rather than a class, so you have to wrap it in `eval()` to resolve it back into a class. P.S.: no GPU today, so training was slow; by the time I started writing there were only 4 hours left before the deadline, so I'm not posting results. Below is my implementation.
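To illustrate the string-vs-class point, here is a minimal framework-free sketch of the pattern. `FakeLayerNorm` and `registry` are illustrative names I made up, not part of the assignment; the real code resolves the string `'nn.LayerNorm'` the same way.

```python
# Stand-in for nn.LayerNorm, just to demonstrate the resolution pattern.
class FakeLayerNorm:
    def __init__(self, dim):
        self.dim = dim

# The assignment passes a string, not the class itself:
norm_layer = 'FakeLayerNorm'

# eval() turns the string back into the class object before instantiating it.
norm1 = eval(norm_layer)(64)
assert isinstance(norm1, FakeLayerNorm) and norm1.dim == 64

# A safer alternative to eval() is an explicit lookup table:
registry = {'FakeLayerNorm': FakeLayerNorm}
norm2 = registry[norm_layer](64)
```

The `eval()` route works here because the string is a trusted constant from the assignment; for anything user-supplied, the dictionary lookup is the safer choice.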
```python
import paddle.nn as nn
# Attention, Mlp, DropPath and Identity are defined earlier in the notebook.

# Block implements one encoder layer of the Transformer
class Block(nn.Layer):
    def __init__(self, dim, num_heads, mlp_ratio=4., qkv_bias=False,
                 qk_scale=None, drop=0., attn_drop=0., drop_path=0.,
                 act_layer=nn.GELU, norm_layer='nn.LayerNorm', epsilon=1e-5):
        super().__init__()
        # --- code added here ---
        self.dim = dim
        self.num_heads = num_heads
        self.mlp_ratio = mlp_ratio
        # norm_layer arrives as a string, so eval() resolves it to the class;
        # pass epsilon through so the argument is actually used
        self.norm1 = eval(norm_layer)(dim, epsilon=epsilon)
        self.attn = Attention(self.dim,
                              num_heads=self.num_heads,
                              qkv_bias=qkv_bias,
                              qk_scale=qk_scale,
                              attn_drop=attn_drop,
                              proj_drop=drop)
        self.drop_path = DropPath(drop_path) if drop_path > 0. else Identity()
        self.norm2 = eval(norm_layer)(dim, epsilon=epsilon)
        mlp_hidden_dim = int(dim * mlp_ratio)
        self.mlp = Mlp(in_features=dim,
                       hidden_features=mlp_hidden_dim,
                       act_layer=act_layer,
                       drop=drop)

    def forward(self, x):
        # --- code added here ---
        # pre-norm attention with a residual connection
        shortcut = x
        x = self.norm1(x)
        attn = self.attn(x)
        x = shortcut + self.drop_path(attn)
        # pre-norm MLP with a second residual connection
        x = x + self.drop_path(self.mlp(self.norm2(x)))
        return x
```
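The `DropPath` used around both residual branches above is stochastic depth: it zeroes whole samples and rescales the survivors. Here is a framework-agnostic NumPy sketch of that behavior; the function name and arguments are mine, written only to show the mechanics, not the actual paddle implementation.

```python
import numpy as np

def drop_path(x, drop_prob=0., training=True, rng=None):
    """Sketch of stochastic depth: randomly zero whole samples in a batch
    and rescale the kept ones so the expected output equals the input."""
    if drop_prob == 0. or not training:
        return x  # identity at inference time, like Identity() in the Block
    rng = rng or np.random.default_rng()
    keep_prob = 1.0 - drop_prob
    # one Bernoulli draw per sample, broadcast over all remaining dims
    shape = (x.shape[0],) + (1,) * (x.ndim - 1)
    mask = (rng.random(shape) < keep_prob).astype(x.dtype)
    return x / keep_prob * mask  # divide by keep_prob to preserve the mean

x = np.ones((4, 3, 8), dtype=np.float32)
y = drop_path(x, drop_prob=0.5, training=False)  # inference: unchanged
```

In training mode each sample is either dropped entirely (its residual branch contributes nothing) or scaled up by `1 / keep_prob`, which is why the Block falls back to `Identity()` when `drop_path == 0`.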