Dataset splitting fails when using PaddleX
In the AI Studio Paddle 2.0.2 environment, splitting a dataset with the following PaddleX command-line invocation:
paddlex --split_dataset --format VOC --dataset_dir MyDataset --val_value 0.2 --test_value 0.1
fails. The output is:
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/utils.py:26: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
def convert_to_list(value, n, name, dtype=np.int):
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/distributed/parallel.py:119: UserWarning: Currently not a parallel execution environment, `paddle.distributed.init_parallel_env` will not do anything.
"Currently not a parallel execution environment, `paddle.distributed.init_parallel_env` will not do anything."
Traceback (most recent call last):
File "/opt/conda/envs/python35-paddle120-env/bin/paddlex", line 6, in
from paddlex.command import main
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlex/__init__.py", line 20, in
from . import cv
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlex/cv/__init__.py", line 15, in
from . import models
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlex/cv/models/__init__.py", line 15, in
from .segmenter import *
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlex/cv/models/segmenter.py", line 24, in
from paddlex.cv.transforms import arrange_transforms
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlex/cv/transforms/__init__.py", line 16, in
from .batch_operators import BatchRandomResize, BatchRandomResizeByShort, _BatchPadding
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlex/cv/transforms/batch_operators.py", line 22, in
from paddle.fluid.dataloader.collate import default_collate_fn
ModuleNotFoundError: No module named 'paddle.fluid.dataloader.collate'
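The last line of the traceback says the installed Paddle does not provide the submodule that this PaddleX build imports. A quick way to confirm this before running the CLI is sketched below. The helper name `module_available` is made up for illustration; it relies only on the standard library's `importlib.util.find_spec`.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the dotted module name can be located, False otherwise.

    Note: for a dotted name, find_spec imports the parent packages,
    so checking a paddle submodule will import paddle itself.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # Raised (as ModuleNotFoundError) when a parent package is missing.
        return False


if __name__ == "__main__":
    # Module path taken from the traceback above.
    target = "paddle.fluid.dataloader.collate"
    print(target, "available:", module_available(target))
```

If this prints `False`, the installed PaddleX and Paddle versions likely do not match, and the `paddlex` command will fail on import regardless of which subcommand is used.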
深渊上的坑
Resolved
#4
Replied 2021-07
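The project uses the dynamic-graph dygraph directory in the dev branch of PaddleX. In a Paddle 2.1.0 environment, I installed the release 2.1 versions of PaddleDet, PaddleSeg, and PaddleClas, resolved the duplicated and conflicting dependencies, and then installed the PaddleX version that supports dynamic graph. With this setup, dataset splitting runs normally.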
If I switch the environment to Paddle 1.8.4, the problem does not occur. Is this a bug introduced during the PaddleX upgrade?
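The problem is solved. The main cause was a version mismatch: the installed PaddleX imports paddle.fluid.dataloader.collate, which the Paddle installed in that environment does not provide.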
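To see which versions are actually installed side by side, something like the following sketch may help. The distribution names `paddlepaddle`, `paddlepaddle-gpu`, and `paddlex` are assumptions about how the packages were installed with pip, and `importlib.metadata` requires Python 3.8 or newer (the environment in the traceback is Python 3.7, where `pip list` would serve the same purpose).

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed pip distribution names; adjust to match your environment.
    for name, version in installed_versions(
        ["paddlepaddle", "paddlepaddle-gpu", "paddlex"]
    ).items():
        print(f"{name}: {version or 'not installed'}")
```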
Installation instructions for the PaddleX dynamic-graph develop version: https://github.com/PaddlePaddle/PaddleX/blob/develop/dygraph/docs/install.md#paddlex-develop安装
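While the environment is being sorted out, the ratio split itself can be done without importing PaddleX at all. The following is only a minimal sketch, not PaddleX's own implementation: the function name `split_names` and the `seed` parameter are hypothetical, and it only partitions a list of sample names using the same `val_value` and `test_value` ratios as the command above. Writing the resulting list files in whatever layout your training code expects is left out.

```python
import random
from typing import List, Sequence, Tuple


def split_names(
    names: Sequence[str],
    val_value: float,
    test_value: float,
    seed: int = 0,
) -> Tuple[List[str], List[str], List[str]]:
    """Shuffle sample names and split them into (train, val, test) lists."""
    if val_value < 0 or test_value < 0 or val_value + test_value >= 1:
        raise ValueError("val_value and test_value must be >= 0 and sum to < 1")

    # Sort first so the result depends only on the seed, not on input order.
    shuffled = sorted(names)
    random.Random(seed).shuffle(shuffled)

    n_val = int(len(shuffled) * val_value)
    n_test = int(len(shuffled) * test_value)

    val = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]
    return train, val, test


if __name__ == "__main__":
    samples = [f"img_{i:03d}" for i in range(10)]
    train, val, test = split_names(samples, val_value=0.2, test_value=0.1)
    print(len(train), len(val), len(test))
```

A fixed seed keeps the split reproducible across runs, which makes it easier to compare results before and after changing the Paddle version.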