/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/nn/layer/norm.py:641: UserWarning: When training, we now always track global mean and variance.
"When training, we now always track global mean and variance.")
Traceback (most recent call last):
File "train.py", line 193, in <module>
main(args)
File "train.py", line 188, in main
fp16=args.fp16)
File "/home/aistudio/work/PaddleSeg/paddleseg/core/train.py", line 231, in train
loss_iters=iters)
File "/home/aistudio/work/PaddleSeg/paddleseg/core/train.py", line 75, in loss_computation
loss_list.append(losses['coef'][i] * loss_i(logits, labels))
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 902, in __call__
outputs = self.forward(*inputs, **kwargs)
File "/home/aistudio/work/PaddleSeg/paddleseg/models/losses/cross_entropy_loss.py", line 81, in forward
weight=self.weight)
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/nn/functional/loss.py", line 1416, in cross_entropy
if label_min < 0 or label_max >= input.shape[-1]:
File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/math_op_patch.py", line 250, in __impl__
return math_op(self, other_var, 'axis', axis)
SystemError: (Fatal) Operator less_than raises an thrust::system::system_error exception.
The exception content is
:transform: failed to synchronize: cudaErrorIllegalAddress: an illegal memory access was encountered. (at /paddle/paddle/fluid/imperative/tracer.cc:192)
terminate called without an active exception
--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------
0 paddle::framework::SignalHandle(char const*, int)
1 paddle::platform::GetCurrentTraceBackString[abi:cxx11]()
----------------------
Error Message Summary:
----------------------
FatalError: `Process abort signal` is detected by the operating system.
[TimeInfo: *** Aborted at 1631973346 (unix time) try "date -d @1631973346" if you are using GNU date ***]
[SignalInfo: *** SIGABRT (@0x3e800001250) received by PID 4688 (TID 0x7fb0ef4f4700) from PID 4688 ***]
Aborted (core dumped)
Is this caused by the format in which I defined weight? Could someone please take a look?
Most likely, yes: the number of entries in weight does not match the number of label classes.
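To check this before training, something like the sketch below may help. It is not part of PaddleSeg: the helper name `check_labels_and_weights` is made up for illustration, and `ignore_index=255` is an assumption that should be set to whatever your loss config uses. It takes a flat sequence of label values (for example, a flattened annotation image) and reports a weight list whose length differs from the number of classes, as well as label values outside `[0, num_classes)`. Either mismatch can plausibly end in an illegal memory access on the GPU like the one in the traceback.

```python
def check_labels_and_weights(labels, num_classes, weight=None, ignore_index=255):
    """Return a list of problem descriptions; an empty list means nothing was found.

    labels       : flat iterable of integer label values
    num_classes  : number of classes the model predicts
    weight       : per-class loss weights, or None if unweighted
    ignore_index : label value skipped by the loss (assumed 255 here)
    """
    problems = []

    # The loss expects one weight per class.
    if weight is not None and len(weight) != num_classes:
        problems.append(
            "weight has %d entries but num_classes is %d"
            % (len(weight), num_classes)
        )

    # Every label other than ignore_index must be a valid class id.
    out_of_range = sorted(
        {int(v) for v in labels
         if int(v) != ignore_index and not (0 <= int(v) < num_classes)}
    )
    if out_of_range:
        problems.append(
            "label values outside [0, %d): %s" % (num_classes, out_of_range)
        )

    return problems


if __name__ == "__main__":
    # Hypothetical data: 3 classes, but only 2 weights and a stray label 3.
    for problem in check_labels_and_weights([0, 1, 2, 255, 3], 3, [1.0, 2.0]):
        print(problem)
```

If the helper reports nothing for every annotation file and the weight list, the cause is probably elsewhere.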