As the title says: I'm using an LSTM for time-series classification with cross-entropy loss. The train loss quickly drops to 0.69 and then keeps decreasing slowly, but the val loss just fluctuates around 0.69 from the very start of training, with no sign of dropping or converging. Some of my settings are below. I've searched around and already tried changing the parameter initialization, adding BN, changing the number of LSTM layers, and even swapping out Adam, but the loss simply won't go down. Please help!! (The same data classified with an SVM gives decent accuracy, but with the LSTM I can't figure out what's wrong.)
import paddle

class NewModel(paddle.nn.Layer):
    def __init__(self, input_size, hidden_size, num_layers, drop_out, num_classes):
        super(NewModel, self).__init__()
        self.lstm = paddle.nn.LSTM(input_size=input_size, hidden_size=hidden_size,
                                   num_layers=num_layers, dropout=drop_out)
        self.bn = paddle.nn.BatchNorm1D(hidden_size)
        self.fc = paddle.nn.Linear(hidden_size, out_features=num_classes,
                                   weight_attr=paddle.ParamAttr(initializer=paddle.nn.initializer.XavierNormal()),
                                   bias_attr=paddle.ParamAttr(initializer=paddle.nn.initializer.Constant(0.2)))

    def forward(self, inputs, sequence_len):
        # y: [batch_size, seq_len, hidden_size]; h, c: [num_layers, batch_size, hidden_size]
        y, (h, c) = self.lstm(inputs, sequence_length=sequence_len)
        # y[:, -1, :] is already [batch_size, hidden_size]
        outputs = paddle.squeeze(y[:, -1, :], axis=1)
        outputs = self.bn(outputs)
        outputs = self.fc(outputs)
        return outputs
network = NewModel(input_size=feauture_n, hidden_size=128, num_layers=2,
                   drop_out=0, num_classes=2)
LR = 1e-5
epoches = 200
batch_size = 128 * 3
model = paddle.Model(network)
optimizer = paddle.optimizer.Adam(learning_rate=LR, parameters=model.parameters(),
                                  weight_decay=5e-4)
loss = paddle.nn.CrossEntropyLoss()
metric = paddle.metric.Accuracy()
Same problem here, no idea how to solve it.
Could it be underfitting?
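For reference on why 0.69 keeps coming up: with two classes, a model whose predictions are effectively 50/50 has a cross-entropy loss of -ln(0.5) ≈ 0.693, so a val loss pinned near 0.69 means chance-level performance on the validation set. A quick check:

import math

# Cross-entropy of a uniform 50/50 prediction over two classes
print(-math.log(0.5))  # 0.6931471805599453, the "stuck at 0.69" value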
Try commenting out this line? outputs=paddle.squeeze(y[:,-1,:],axis=1)
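To expand on that suggestion: y[:, -1, :] is already [batch_size, hidden_size], so the squeeze itself is a no-op, but the fixed index -1 reads the padding positions of any sequence shorter than the batch's padded length. A minimal sketch (my rewrite, not the original poster's code, assuming zero-padded batches) that instead pools each sequence's true final step via the last layer's hidden state h[-1], which, as I understand Paddle's LSTM, is taken at the last valid step when sequence_length is passed:

    def forward(self, inputs, sequence_len):
        y, (h, c) = self.lstm(inputs, sequence_length=sequence_len)
        # h[-1]: last layer's state at each sequence's final *valid* step,
        # shape [batch_size, hidden_size]
        outputs = self.bn(h[-1])
        return self.fc(outputs)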
Lower the learning rate, reduce the model complexity, raise the dropout rate, and increase the number of minibatches.
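Roughly what that advice looks like applied to the settings above; the concrete numbers are illustrative assumptions, not tuned values. One caveat: paddle.nn.LSTM's dropout argument only acts between stacked layers, so reducing to a single layer would need a separate paddle.nn.Dropout before the classifier instead.

# Illustrative values only -- hypothetical, not tested on this task
network = NewModel(input_size=feauture_n, hidden_size=64,  # smaller model
                   num_layers=2, drop_out=0.5,             # higher dropout between the stacked layers
                   num_classes=2)
model = paddle.Model(network)
batch_size = 64                                            # smaller batches, i.e. more minibatches per epoch
optimizer = paddle.optimizer.Adam(learning_rate=5e-6,      # lower learning rate
                                  parameters=model.parameters(),
                                  weight_decay=5e-4)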
Did you ever solve this? I'm running into the same problem.