Help: corrupted files raise errors during image training, looking for a solution [Solved]

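I am training on a dataset of 20,000 images and ran into errors while training was running. Any advice on how to fix this would be appreciated.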

The loss value printed in the log is the current step, and the metric is the average value of previous step.
Epoch 1/100

/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/utils.py:77: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
return (isinstance(seq, collections.Sequence) and
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/nn/layer/norm.py:648: UserWarning: When training, we now always track global mean and variance.
"When training, we now always track global mean and variance.")

step 60/229 [======>.......................] - loss: 3.1190 - acc_top1: 0.0799 - acc_top5: 0.2940 - ETA: 20:21 - 7s/ste
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/PIL/TiffImagePlugin.py:770: UserWarning: Possibly corrupt EXIF data. Expecting to read 23836229632 bytes but only got 0. Skipping tag 0
" Skipping tag %s" % (size, len(data), tag)

step 229/229 [==============================] - loss: 3.1398 - acc_top1: 0.0979 - acc_top5: 0.3521 - 7s/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 30/58 [==============>...............] - loss: 3.5679 - acc_top1: 0.0578 - acc_top5: 0.3641 - ETA: 3:18 - 7s/st
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/PIL/TiffImagePlugin.py:788: UserWarning: Corrupt EXIF data. Expecting to read 12 bytes but only got 6.
warnings.warn(str(msg))
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/PIL/JpegImagePlugin.py:793: UserWarning: Image appears to be a malformed MPO file, it will be interpreted as a base JPEG file
"Image appears to be a malformed MPO file, it will be "

TowerNet
Solved
#37 replied 2021-12
This can be solved with EasyData.

All comments (35)
琦琦的小老鼠
#2 replied 2021-03

Bump, don't let this thread sink.
AIStudio810258
#3 replied 2021-03

Clean the data.
琦琦的小老鼠
#4 replied 2021-03
In reply to: Clean the data.

At which layer do I define this data cleaning?
乌拉__----
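#5 replied 2021-03

Try reading every image in a loop with cv2.imread().

A likely cause is that some images display fine in a viewer but have a problem with their RGB encoding when read.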
乌拉__----
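#6 replied 2021-03
In reply to: Try reading every image in a loop with cv2.imread(). A likely cause is that some images display fine in a viewer but have a problem with their RGB encoding when read.

For images like that, mpimg.imread() reads them normally but cv2.imread() returns None, so the bad images can be located by their paths.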
AIStudio810259
#7 replied 2021-03

Just wrap it in a try.
琦琦的小老鼠
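#8 replied 2021-03
In reply to: For images like that, mpimg.imread() reads them normally but cv2.imread() returns None, so the bad images can be located by their paths.

Boss, I am not sure how to write even this small bit of code.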

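import cv2

prompt = "Image path (or 'quit' to exit): "
while True:
    path = input(prompt)
    if path == "quit":
        break
    img = cv2.imread(path)
    if img is None:
        # cv2.imread returns None (it does not raise) when it cannot decode a file
        print("cannot read:", path)
        continue
    cv2.namedWindow('img', 0)
    cv2.imshow('img', img)
    cv2.waitKey()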

琦琦的小老鼠
#9 replied 2021-03

import os
import shutil

import cv2

dirName = r'E:\PaddleClas-release-2.0\dataset\cat_12\cat_12_train'
# Collect the path of every jpg file under dirName into all_path
all_path = []
for root, dirs, files in os.walk(dirName):
    for file in files:
        if "jpg" in file:
            all_path.append(os.path.join(root, file))
all_path.sort()

bad = []
# Folder that bad images are moved into
badpath = r'E:\PaddleClas-release-2.0\dataset\bad'
os.makedirs(badpath, exist_ok=True)

for org in all_path:
    # cv2.imread returns None (it does not raise) when a file cannot be decoded
    img = cv2.imread(org)
    if img is None:
        bad.append(org)
        shutil.move(org, badpath)

print('Found %s bad images' % len(bad))
print(bad)

乌拉__----
#10 replied 2021-03
In reply to the script posted in #9.

No wonder the code looked familiar, I wrote it a while back:

https://aistudio.baidu.com/aistudio/projectdetail/1133588
乌拉__----
#11 replied 2021-03
In reply to: Clean the data.

Hahaha
琦琦的小老鼠
#12 replied 2021-03
In reply to: No wonder the code looked familiar, I wrote it a while back: https://aistudio.baidu.com/aistudio/projectdetail/1133588

So you are the one who wrote it, hahaha
AIStudio810258
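#13 replied 2021-03
In reply to: At which layer do I define this data cleaning?

It belongs to data preprocessing; you can do it when the images are read.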
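For anyone wondering what "doing it when the images are read" might look like, here is a rough sketch. It is not Paddle's API: `SkipBadImages` and its `read` parameter are hypothetical names made up for illustration, and the default reader assumes OpenCV is installed. The idea is simply that a sample whose image cannot be decoded is replaced by the next readable one instead of crashing the loader.

```python
def _cv2_read(path):
    import cv2  # lazy import; OpenCV is assumed to be installed
    return cv2.imread(path)  # None when the file cannot be decoded


class SkipBadImages:
    """Hypothetical list-like dataset that skips samples it cannot read."""

    def __init__(self, samples, read=_cv2_read):
        self.samples = list(samples)  # (path, label) pairs
        self.read = read              # any callable: path -> image or None

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        n = len(self.samples)
        if not 0 <= idx < n:
            raise IndexError(idx)
        # Walk forward (wrapping around) until a readable sample is found.
        for offset in range(n):
            path, label = self.samples[(idx + offset) % n]
            try:
                img = self.read(path)
            except Exception:
                img = None  # treat a reader exception like an unreadable file
            if img is not None:
                return img, label
        raise RuntimeError("no readable image in the dataset")
```

The trade-off is that a neighbouring sample gets returned twice whenever a bad file is hit, so with many bad files it is probably cleaner to move them out of the dataset once, before training.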
AIStudio810258
#14 replied 2021-03
In reply to: So you are the one who wrote it, hahaha

The author takes responsibility ~
七年期限
#15 replied 2021-03
In reply to: Clean the data.

I remember discussing this issue with 坤哥.
琦琦的小老鼠
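#16 replied 2021-03

I ran this code over every image in the dataset, but I still get errors after starting the program. Any suggestions are welcome. My dataset folder is laid out like this:

root folder
    - class a
    - class b
    - class c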

import os
import shutil

import cv2

dirName = '/home/aistudio/work/dataset'
# Collect the path of every jpeg file under dirName into all_path
all_path = []
for root, dirs, files in os.walk(dirName):
    for file in files:
        if "jpeg" in file:
            all_path.append(os.path.join(root, file))
all_path.sort()

bad = []
# Folder that bad images are moved into
badpath = '/home/aistudio/bad'
os.makedirs(badpath, exist_ok=True)

for org in all_path:
    # cv2.imread returns None (it does not raise) when a file cannot be decoded
    img = cv2.imread(org)
    if img is None:
        bad.append(org)
        shutil.move(org, badpath)

print('Found %s bad images' % len(bad))
print(bad)

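One possible reason the cv2 scan above comes back clean: the messages in the original log are UserWarnings raised inside PIL (TiffImagePlugin.py and JpegImagePlugin.py), and cv2.imread does not go through PIL. A sketch of a PIL-based scan that turns those warnings into exceptions, so each one can be tied to a file, might look like this. It assumes Pillow 6.0 or later (for `getexif`); `pil_load` and `scan_images` are names invented here, and which call triggers a given warning may vary between Pillow versions.

```python
import warnings


def pil_load(path):
    """Open and fully decode one image with Pillow (assumed installed)."""
    from PIL import Image  # lazy import so this module loads without Pillow

    with Image.open(path) as img:
        img.load()     # force the full decode
        img.getexif()  # parse EXIF metadata as well


def scan_images(paths, load=pil_load):
    """Return (path, reason) for every file that warns or fails while loading."""
    problems = []
    for path in paths:
        with warnings.catch_warnings():
            # Promote UserWarning to an exception so it can be tied to a path.
            warnings.simplefilter("error", UserWarning)
            try:
                load(path)
            except Exception as exc:
                problems.append((path, f"{type(exc).__name__}: {exc}"))
    return problems
```

Passing the `all_path` list from the script above to `scan_images` would give a list of suspect files and the message each one produced; whether to move, re-encode, or keep them is a separate decision.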
AIStudio810258
#17 replied 2021-03

I just remembered a pitfall: check whether there are any black-and-white (grayscale) images.
AIStudio810258
#18 replied 2021-03

Some APIs read a grayscale image as three channels, the same as RGB, while others return a single channel.
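A minimal sketch of how one might look for such images, assuming OpenCV is installed. With its default flag cv2.imread converts everything to three channels, so the sketch reads with cv2.IMREAD_UNCHANGED to see the stored layout instead. The function names and the `read` parameter are made up for illustration.

```python
def _cv2_read_unchanged(path):
    import cv2  # lazy import; OpenCV is assumed to be installed
    # IMREAD_UNCHANGED keeps the stored channel layout instead of converting to BGR
    return cv2.imread(path, cv2.IMREAD_UNCHANGED)


def channel_count(img):
    """Channels of an array-like image: 1 for a 2-D array, else the last axis size."""
    shape = img.shape
    return 1 if len(shape) == 2 else shape[2]


def find_non_three_channel(paths, read=_cv2_read_unchanged):
    """Return (path, channels) for readable images without exactly 3 channels."""
    odd = []
    for path in paths:
        img = read(path)
        if img is None:  # unreadable files are a separate problem; skip them here
            continue
        n = channel_count(img)
        if n != 3:
            odd.append((path, n))
    return odd
```

If any files turn up, converting to three channels at load time (for example with PIL's `Image.convert('RGB')`) is one common way to make the whole dataset consistent.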
AIStudio810258
#19 replied 2021-03

As I recall, reading with matplotlib failed for me, and switching to cv2 fixed it.
TowerNet
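#20 replied 2021-03
In reply to: I just remembered a pitfall: check whether there are any black-and-white (grayscale) images.

How would I write the code for that?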
七年期限
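#21 replied 2021-03
In reply to: I just remembered a pitfall: check whether there are any black-and-white (grayscale) images.

Does that make a difference too?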