Dockerfile

I was trying to install torchvision from source in a Dockerfile:

WORKDIR vision-0.8.1

WORKDIR works like cd: it switches into the directory, and the instructions that follow are executed from there.

RUN python3 setup.py install --user
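The difference between WORKDIR and a plain cd can be sketched as follows. This is a minimal illustration, not the original Dockerfile: the base image `python:3.6` and the `/opt` directory are assumptions chosen only so the fragment builds on its own.

```dockerfile
# Hypothetical base image and directory, used only to make the sketch buildable.
FROM python:3.6
WORKDIR /opt
RUN mkdir vision-0.8.1

# A cd inside RUN only lasts for that single instruction.
# This step prints /opt/vision-0.8.1 ...
RUN cd vision-0.8.1 && pwd
# ... but the next step is back in /opt.
RUN pwd

# WORKDIR persists for every instruction that follows.
# A relative path is resolved against the previous WORKDIR.
WORKDIR vision-0.8.1
# This step prints /opt/vision-0.8.1.
RUN pwd
```

With BuildKit the output of each RUN step may be collapsed; `docker build --progress=plain .` shows it in full.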

The install step failed with the following error:

Traceback (most recent call last):
  File "setup.py", line 12, in <module>
    import torch
  File "/usr/local/lib/python3.6/site-packages/torch/__init__.py", line 189, in <module>
    _load_global_deps()
  File "/usr/local/lib/python3.6/site-packages/torch/__init__.py", line 142, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/usr/local/lib/python3.6/ctypes/__init__.py", line 348, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libmpi_cxx.so.20: cannot open shared object file: No such file or directory
The command '/bin/sh -c python3 setup.py install --user' returned a non-zero code: 1

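At first I read this as a missing libopenblas, but the library named in the error, libmpi_cxx.so.20, actually belongs to OpenMPI. Both were already installed in an earlier step of the Dockerfile, so one possible cause is that the installed OpenMPI package ships a different library version than the .so.20 this torch build was linked against: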

RUN apt-get install python3-pip libopenblas-base libopenmpi-dev libomp-dev -y
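To narrow down an error like this, it can help to ask the dynamic linker what it can actually find. The fragment below is a diagnostic sketch meant to be dropped in just before the failing RUN. The path to `libtorch_global_deps.so` is an assumption derived from the site-packages path in the traceback and the usual layout of a torch install; adjust it if your install differs.

```dockerfile
# Diagnostic steps only; remove them once the cause is known.

# List every libmpi library the dynamic linker knows about.
# "|| true" keeps the build going when grep finds nothing.
RUN ldconfig -p | grep libmpi || true

# Show which shared-library dependencies of torch cannot be resolved.
# Assumed path: torch/lib under the site-packages directory from the traceback.
RUN ldd /usr/local/lib/python3.6/site-packages/torch/lib/libtorch_global_deps.so \
    | grep "not found" || true
```

If the first command lists a libmpi_cxx with a different number after `.so`, the installed OpenMPI version does not match the one torch expects.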

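The workaround I settled on was to run the install with CMD instead of RUN. Note that CMD is not executed during docker build; it runs when the container starts. The build therefore no longer hits the import error, but torchvision is installed at container start rather than baked into the image, and the same library lookup happens then. In the exec form, each argument must be its own array element:

CMD ["python3", "setup.py", "install", "--user"]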

PS

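Do not use sudo apt-get in a Dockerfile; call apt-get directly. Build steps run as root by default, and sudo is usually not installed in base images. The following line switches the default apt source to the Aliyun mirror for faster downloads, and also sets the time zone to Asia/Shanghai: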

RUN sed -i s/deb.debian.org/mirrors.aliyun.com/g /etc/apt/sources.list && ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime && echo 'Asia/Shanghai' >/etc/timezone

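To chain several commands in one RUN, join them with &&, and end a line with \ to continue the instruction on the next line. Remember -y on apt-get install, since the build cannot answer the confirmation prompt. For example:

RUN apt-get clean && apt-get update \
    && apt-get install -y python3-pip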
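The same chaining is commonly used to keep the package index update, the install, and the cleanup in a single layer. The sketch below shows that pattern with the packages mentioned earlier. The base image `python:3.6` is an assumption, and `--no-install-recommends` and the cleanup step are optional additions that were not part of the original Dockerfile.

```dockerfile
# Hypothetical base image; any Debian-based image should behave the same way.
FROM python:3.6

# update and install share one RUN so a cached, stale package index is never reused.
# -y answers the confirmation prompt; --no-install-recommends skips optional packages.
# Removing /var/lib/apt/lists keeps the resulting layer smaller.
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
        python3-pip libopenblas-base libopenmpi-dev libomp-dev \
    && rm -rf /var/lib/apt/lists/*
```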

PS2

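I found out more recently that if you want to run PyTorch in a container on a Jetson, you can simply use the image from NVIDIA's official catalog: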

NVIDIA L4T PyTorch | NVIDIA NGC: PyTorch is a GPU accelerated tensor computational framework with a Python front end. This container contains PyTorch and torchvision pre-installed in a Python 3.6 environment to get up & running quickly with PyTorch on Jetson.
https://catalog.ngc.nvidia.com/orgs/nvidia/containers/l4t-pytorch

Just pull one of the tags, for example:

l4t-pytorch:r32.6.1-pth1.8-py3
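Building on that image could look like the sketch below. The `nvcr.io/nvidia/` registry prefix is an assumption based on how NGC images are normally addressed, and the version check in CMD is only an illustration. The L4T release in the tag (r32.6.1 here) generally needs to match the L4T version on the Jetson host.

```dockerfile
# Assumed full image path; the tag is the one named above.
FROM nvcr.io/nvidia/l4t-pytorch:r32.6.1-pth1.8-py3

# torch and torchvision are already in the image, so no source build is needed.
# Print both versions at container start as a quick sanity check.
CMD ["python3", "-c", "import torch, torchvision; print(torch.__version__, torchvision.__version__)"]
```

On a Jetson, a container like this is typically started with the NVIDIA runtime, e.g. `docker run --runtime nvidia ...`, so that the GPU libraries from the host are available inside it.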