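Copyright notice: this is an original article by the author. Reposting is welcome; please credit the source. Contact: 460356155@qq.com
RuntimeError: received 0 items of ancdata is raised while the DataLoader is loading data. The cause is that PyTorch shares tensors between DataLoader worker processes by passing open file descriptors, and the number of files a process may hold open is limited. The current limits can be listed with: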
ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 128088
max locked memory (kbytes, -l) 16384
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 128088
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
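The open files row shows the limit (1024 here). When the number of tensors that need to be shared exceeds the open files limit, this error occurs.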
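The same limit can also be read from inside the Python process that runs the DataLoader, which helps confirm what the training script actually sees. This is a minimal sketch that assumes a Unix-like system (the standard-library resource module is not available on Windows); the helper name open_file_limits is made up for illustration.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft, hard = open_file_limits()
# The soft value corresponds to the "open files (-n)" row printed by ulimit -a.
print(f"open files: soft={soft}, hard={hard}")
```

The soft limit is the one that is enforced; the hard limit is the ceiling up to which an unprivileged process may raise the soft limit.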
There are two solutions:
1. Raise the open files limit:
Running sudo ulimit -n does not work; instead, run:
sudo sh -c "ulimit -n 65535 && exec su $LOGNAME"
The explanation is as follows:
ulimit is a shell builtin like cd, not a separate program. sudo looks for a binary to run, but there is no ulimit binary, which is why you get the error message. You need to run it in a shell. However, while you do need to be root to raise the limit to 65535, you probably don’t want to run your program as root. So after you raise the limit you should switch back to the current user. To do this, run: sudo sh -c "ulimit -n 65535 && exec su $LOGNAME" and you will get a new shell, without root privileges, but with the raised limit. The exec causes the new shell to replace the process with sudo privileges, so after you exit that shell, you won’t accidentally end up as root again.
2. Change the tensor sharing strategy used for multiprocessing to file_system (the default is file_descriptor, which is bound by the open files limit):
import torch.multiprocessing
torch.multiprocessing.set_sharing_strategy('file_system')
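As a variation on solution 1, the soft limit can often be raised from inside the training script itself, without sudo, as long as the target does not exceed the hard limit. The sketch below is not from the original post: it assumes a Unix-like system, the helper name raise_open_file_limit is hypothetical, and the default of 65535 simply mirrors the value used in the shell command above.

```python
import resource


def raise_open_file_limit(target=65535):
    """Raise the soft open-files limit toward `target`, capped at the hard limit.

    Returns the soft limit in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # An unprivileged process cannot go above the hard limit.
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    # Nothing to do if the current soft limit is already high enough.
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return target


new_soft = raise_open_file_limit()
print(f"soft open-files limit is now {new_soft}")
```

Child processes inherit the limit, so this would need to run before the DataLoader starts its workers. If the hard limit is itself too low, the shell approach from solution 1 or the file_system strategy from solution 2 is still required.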