Dataset split fails with an error when using PaddleX
In the Paddle 2.0.2 environment on AI Studio, splitting a dataset with the following PaddleX command-line command
paddlex --split_dataset --format VOC --dataset_dir MyDataset --val_value 0.2 --test_value 0.1
fails with an error. The error message is:
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/utils.py:26: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  def convert_to_list(value, n, name, dtype=np.int):
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/distributed/parallel.py:119: UserWarning: Currently not a parallel execution environment, `paddle.distributed.init_parallel_env` will not do anything.
  "Currently not a parallel execution environment, `paddle.distributed.init_parallel_env` will not do anything."
Traceback (most recent call last):
  File "/opt/conda/envs/python35-paddle120-env/bin/paddlex", line 6, in <module>
    from paddlex.command import main
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlex/__init__.py", line 20, in <module>
    from . import cv
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlex/cv/__init__.py", line 15, in <module>
    from . import models
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlex/cv/models/__init__.py", line 15, in <module>
    from .segmenter import *
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlex/cv/models/segmenter.py", line 24, in <module>
    from paddlex.cv.transforms import arrange_transforms
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlex/cv/transforms/__init__.py", line 16, in <module>
    from .batch_operators import BatchRandomResize, BatchRandomResizeByShort, _BatchPadding
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlex/cv/transforms/batch_operators.py", line 22, in <module>
    from paddle.fluid.dataloader.collate import default_collate_fn
ModuleNotFoundError: No module named 'paddle.fluid.dataloader.collate'
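The last frame of the traceback fails while importing `paddle.fluid.dataloader.collate`, so the `paddlex` command aborts before any splitting starts. A minimal sketch for checking whether a given dotted module exists in the current environment is shown below. The helper name `module_available` is hypothetical and is not part of Paddle or PaddleX; it relies only on the standard library.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the dotted module `name` can be located.

    Note: for a dotted name, find_spec imports the parent packages,
    so calling this for a Paddle module will import `paddle` itself.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # ModuleNotFoundError (a subclass of ImportError) is raised when a
        # parent package is missing; a parent may also fail to import.
        return False


if __name__ == "__main__":
    # Module path taken from the traceback above.
    target = "paddle.fluid.dataloader.collate"
    print(target, "available:", module_available(target))
```

If this reports that the module is not available, the installed Paddle does not provide the module that this PaddleX build expects, which points to a version mismatch rather than a problem with the dataset.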
深渊上的坑
Solved
4#
Replied 2021-07
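The project uses the dynamic-graph dygraph directory on the dev branch of PaddleX. In a Paddle 2.1.0 environment, I installed the release 2.1 versions of PaddleDet, PaddleSeg and PaddleClas, resolved the duplicated and conflicting dependencies, and then installed the PaddleX version that supports dynamic graphs. After that, the dataset split runs normally.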
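Since the fix comes down to matching versions, it can help to list what is actually installed before and after changing the environment. The sketch below is one possible way to do that with the standard library, assuming Python 3.8 or later (the environment in the traceback is Python 3.7, where `importlib.metadata` is not available). The helper `installed_versions` is hypothetical, and the distribution names `paddlepaddle`, `paddlepaddle-gpu` and `paddlex` are assumptions about how the packages were installed with pip.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_versions(names):
    """Map each distribution name to its installed version, or None."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed pip distribution names; adjust to the actual installation.
    candidates = ["paddlepaddle", "paddlepaddle-gpu", "paddlex"]
    for name, version in installed_versions(candidates).items():
        print(f"{name}: {version or 'not installed'}")
```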
If I switch the environment to Paddle 1.8.4, the problem does not occur. Is this a bug introduced during a PaddleX upgrade?
The problem is solved. The main cause was a version mismatch.
How to install the dynamic-graph (develop) version of PaddleX: https://github.com/PaddlePaddle/PaddleX/blob/develop/dygraph/docs/install.md#paddlex-develop安装