PaddleClas: top1 values during training do not match the batch size (Solved)

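The batch size is 8 images, so why do the top1 values during training look so odd? With 8 images per batch I would expect multiples of 1/8 (0.125). The top1 values during validation look normal.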
[2022/03/13 18:03:04] root INFO: [Train][Epoch 50/50][Iter: 0/5]lr: 0.00041, CELoss: 0.56853, loss: 0.56853, top1: 0.87500, top3: 1.00000, batch_cost: 0.32509s, reader_cost: 0.11848, ips: 24.60891 images/sec, eta: 0:00:01
[2022/03/13 18:03:04] root INFO: [Train][Epoch 50/50][Iter: 1/5]lr: 0.00038, CELoss: 0.32264, loss: 0.32264, top1: 0.93750, top3: 1.00000, batch_cost: 0.32469s, reader_cost: 0.11800, ips: 24.63901 images/sec, eta: 0:00:01
[2022/03/13 18:03:05] root INFO: [Train][Epoch 50/50][Iter: 2/5]lr: 0.00036, CELoss: 0.21675, loss: 0.21675, top1: 0.95833, top3: 1.00000, batch_cost: 0.32431s, reader_cost: 0.11753, ips: 24.66740 images/sec, eta: 0:00:00
[2022/03/13 18:03:05] root INFO: [Train][Epoch 50/50][Iter: 3/5]lr: 0.00033, CELoss: 0.16291, loss: 0.16291, top1: 0.96875, top3: 1.00000, batch_cost: 0.32394s, reader_cost: 0.11706, ips: 24.69624 images/sec, eta: 0:00:00
[2022/03/13 18:03:05] root INFO: [Train][Epoch 50/50][Iter: 4/5]lr: 0.00031, CELoss: 0.13069, loss: 0.13069, top1: 0.97500, top3: 1.00000, batch_cost: 0.32356s, reader_cost: 0.11659, ips: 24.72523 images/sec, eta: 0:00:00
[2022/03/13 18:03:05] root INFO: [Train][Epoch 50/50][Avg]CELoss: 0.13069, loss: 0.13069, top1: 0.97500, top3: 1.00000
[2022/03/13 18:03:05] root INFO: [Eval][Epoch 50][Iter: 0/5]CELoss: 2.76934, loss: 2.76934, top1: 0.12500, top3: 1.00000, batch_cost: 0.14596s, reader_cost: 0.02998, ips: 54.80825 images/sec
[2022/03/13 18:03:05] root INFO: [Eval][Epoch 50][Iter: 1/5]CELoss: 4.25078, loss: 4.25078, top1: 0.00000, top3: 1.00000, batch_cost: 0.08149s, reader_cost: 0.01499, ips: 98.17481 images/sec
[2022/03/13 18:03:05] root INFO: [Eval][Epoch 50][Iter: 2/5]CELoss: 2.22130, loss: 2.22130, top1: 0.50000, top3: 1.00000, batch_cost: 0.05899s, reader_cost: 0.00999, ips: 135.61383 images/sec
[2022/03/13 18:03:05] root INFO: [Eval][Epoch 50][Iter: 3/5]CELoss: 0.13056, loss: 0.13056, top1: 0.87500, top3: 1.00000, batch_cost: 0.04774s, reader_cost: 0.00750, ips: 167.57107 images/sec
[20

Albertt
#2 Replied 2022-03

Found the problem. It is in trainer.py:

if iter_id % print_batch_step == 0:
    lr_msg = "lr: {:.5f}".format(lr_sch.get_lr())
    metric_msg = ", ".join([
        # Perhaps this should be changed to output_info[key].val
        "{}: {:.5f}".format(key, output_info[key].avg)
        for key in output_info
    ])
    print(output_info['top1'].avg)
    print(metric_msg)
    time_msg = "s, ".join([
        "{}: {:.5f}".format(key, time_info[key].avg)
        for key in time_info
    ])
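
To illustrate why printing .avg produces values that are not multiples of 1/8, here is a minimal sketch. The AverageMeter class below is a simplified stand-in written for this example, not PaddleClas's actual implementation, and the per-batch accuracies (7 of 8 correct in the first batch, then 8 of 8) are an assumption chosen because they are consistent with the training log above.

```python
class AverageMeter:
    """Simplified stand-in: tracks the latest value (.val) and the running average (.avg)."""

    def __init__(self):
        self.val = 0.0
        self.sum = 0.0
        self.count = 0

    @property
    def avg(self):
        # Running average over every sample seen so far in the epoch.
        return self.sum / self.count if self.count else 0.0

    def update(self, val, n=1):
        # val is the metric for the current batch, n is the batch size.
        self.val = val
        self.sum += val * n
        self.count += n


def replay(per_batch_top1, batch_size=8):
    """Feed per-batch accuracies into a meter and record (val, avg) after each batch."""
    meter = AverageMeter()
    rows = []
    for acc in per_batch_top1:
        meter.update(acc, batch_size)
        rows.append((meter.val, round(meter.avg, 5)))
    return rows


# Assumed per-batch top1 values, not taken from the actual run.
rows = replay([7 / 8, 1.0, 1.0, 1.0, 1.0])
for val, avg in rows:
    print("val: {:.5f}, avg: {:.5f}".format(val, avg))
```

Under these assumptions the avg column comes out as 0.87500, 0.93750, 0.95833, 0.96875, 0.97500, the same sequence as the top1 column in the training log, while the val column stays on multiples of 0.125. This suggests the logged top1 is the average accumulated over the epoch so far rather than the accuracy of the current batch.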
