月度归档:2017年11月

从零开始搭建深度学习服务器: 深度学习工具安装(TensorFlow + PyTorch + Torch)

Deep Learning Specialization on Coursera

这个系列写了好几篇文章,这是相关文章的索引,仅供参考:

以下是相关深度学习工具包的安装,包括Tensorflow, PyTorch, Torch等:

1. TensorFlow:

首先安装libcupti-dev

sudo apt-get install libcupti-dev

然后用 virtualenv 方式安装 Tensorflow(当前是1.4版本)

sudo apt-get install python-pip python-dev python-virtualenv 
mkdir tensorflow
cd tensorflow
virtualenv --system-site-packages venv
source venv/bin/activate
pip install --upgrade tensorflow-gpu

测试GPU:

Python 2.7.12 (default, Nov 19 2016, 06:48:10) 
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
...
2017-10-24 20:37:24.290049: I tensorflow/core/common_runtime/gpu/gpu_device.cc:955] Found device 0 with properties: 
name: GeForce GTX 1080 Ti
major: 6 minor: 1 memoryClockRate (GHz) 1.6575
pciBusID 0000:01:00.0
Total memory: 10.91GiB
Free memory: 10.52GiB
...
2017-10-24 20:37:24.387363: I tensorflow/core/common_runtime/gpu/gpu_device.cc:955] Found device 1 with properties: 
name: GeForce GTX 1080 Ti
major: 6 minor: 1 memoryClockRate (GHz) 1.6575
pciBusID 0000:02:00.0
Total memory: 10.91GiB
Free memory: 10.76GiB
2017-10-24 20:37:24.388168: I tensorflow/core/common_runtime/gpu/gpu_device.cc:976] DMA: 0 1 
2017-10-24 20:37:24.388176: I tensorflow/core/common_runtime/gpu/gpu_device.cc:986] 0:   Y Y 
2017-10-24 20:37:24.388179: I tensorflow/core/common_runtime/gpu/gpu_device.cc:986] 1:   Y Y 
2017-10-24 20:37:24.388186: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1045] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0)
2017-10-24 20:37:24.388189: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1045] Creating TensorFlow device (/gpu:1) -> (device: 1, name: GeForce GTX 1080 Ti, pci bus id: 0000:02:00.0)
Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0
/job:localhost/replica:0/task:0/gpu:1 -> device: 1, name: GeForce GTX 1080 Ti, pci bus id: 0000:02:00.0
2017-10-24 20:37:24.449867: I tensorflow/core/common_runtime/direct_session.cc:300] Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0
/job:localhost/replica:0/task:0/gpu:1 -> device: 1, name: GeForce GTX 1080 Ti, pci bus id: 0000:02:00.0
>>> 

2. PyTorch:

首先在PyTorch的官网下载对应的pip安装文件:

然后用virtualenv的方式安装,非常方便:

mkdir pytorch
cd pytorch/
virtualenv venv
source venv/bin/activate
pip install /path/to/torch-0.2.0.post3-cp27-cp27mu-manylinux1_x86_64.whl 
pip install torchvision 

3. Torch

首先按照Torch官方的方法进行安装:http://torch.ch/docs/getting-started.html

git clone https://github.com/torch/distro.git ~/torch --recursive
cd ~/torch; bash install-deps;
./install.sh

如无意外,可以顺利安装,如果遇到了如下两个问题,可按下述方法修改:

1) 执行./install.sh时出现Moses>=1.错误

Missing dependencies for nn:moses >= 1.,有时候执行./install.sh时,会出现这个问题。

用这个方法解决:

sudo apt install luarocks
sudo luarocks install moses

2) install.sh 过程中提示“error -- unsupported GNU version! gcc versions later than 5 are not supported!”

ubuntu17.04自带gcc 6.x 版本,所以降级安装gcc 4.9版本解决问题:

sudo apt-get install g++-4.9  
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.9 20  
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-4.9 20 

成功执行安装脚本后后提示:

Do you want to automatically prepend the Torch install location
to PATH and LD_LIBRARY_PATH in your /home/yourpath/.bashrc? (yes/no)
[yes] >>>
yes

安装脚本会自动将torch的安装路径写入到 .bashrc里,然后输入 th试试:

如果你想用Lua5.2替代LuaJIT的方式安装Torch(If you want to install torch with Lua 5.2 instead of LuaJIT, simply run),可按如下方式安装:

git clone https://github.com/torch/distro.git torch --recursive
cd torch

# clean old torch installation
./clean.sh

在 ~/.bashrec中设置lua的环境:
TORCH_LUA_VERSION=LUA52
并执行 source ~/.bashrc, 然后运行:

./install.sh

遇到第一个问题:

cmake: not found

安装cmake解决:
sudo apt-get install cmake

第二个问题:
readline.c:8:31: fatal error: readline/readline.h: 没有那个文件或目录

安装libreadine-dev解决:
sudo apt-get install libreadline-dev

第三个问题:安装过程依然提示“error -- unsupported GNU version! gcc versions later than 5 are not supported!”

ubuntu17.04自带gcc 6.x 版本,所以降级安装gcc 4.9版本解决问题:

sudo apt-get install g++-4.9
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.9 20
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-4.9 20

安装完毕依然会提示:

Not updating your shell profile.
You might want to
add the following lines to your shell profile:

. /home/textminer/torch/torch/install/bin/torch-activate

在 ~/.profile 文件末尾加上这行 ". /home/textminer/torch/torch/install/bin/torch-activate " 并执行 source ~/.profile,然后输入 th试试。

注:原创文章,转载请注明出处及保留链接“我爱自然语言处理”:http://www.52nlp.cn

本文链接地址:从零开始搭建深度学习服务器: 深度学习工具安装(TensorFlow + PyTorch + Torch) http://www.52nlp.cn/?p=10008

从零开始搭建深度学习服务器: 基础环境配置(Ubuntu + GTX 1080 TI + CUDA + cuDNN)

Deep Learning Specialization on Coursera

这个系列写了好几篇文章,这是相关文章的索引,仅供参考:

去年上半年配置了一台GTX1080深度学习主机:深度学习主机攒机小记,然后分别写了两篇深度学习环境配置的文章:深度学习主机环境配置: Ubuntu16.04+Nvidia GTX 1080+CUDA8.0深度学习主机环境配置: Ubuntu16.04+GeForce GTX 1080+TensorFlow,得到了很多同学留言,不过这个一年多以前完成的深度学习环境配置方案显得有些落伍了。这一年里,深度学习领域继续高歌猛进,包括 Andrew Ng 也离开百度出来创业了,他的第一个项目是deeplearning.ai,和Coursera合作推出了一个深度学习专项课程系列: Andrew Ng 深度学习课程小记。另外GTX1080的升级版1080TI显卡的发售也刺激了深度学习服务器的配置升级,我也机缘巧合的配置了3台1080TI深度学习服务器:从零开始搭建深度学习服务器:硬件选择。同时深度学习工具的开发迭代速度也惊人,Theano在完成了自己的历史使命后选择了停止更新,以这样的方式了退出了深度学习的舞台,而 TensorFlow,Torch,Pytorch 等工具和周边也发展迅猛。因为一次偶然事件,我又一次为老机器重装了系统环境,并且选则了最新的cuda9, cudnn7.0等基础工具版本: 深度学习服务器环境配置: Ubuntu17.04+Nvidia GTX 1080+CUDA 9.0+cuDNN 7.0+TensorFlow 1.3。不过回过头来,发现这种源代码方式编译 TensorFlow GPU 版本的方式在国内的网络环境下并不方便,而我更喜欢 CUDA8 + cuDNN6 + Tensorflow + Pytorch + Torch 的安装方案,简明扼要并且比较方便,于是在新的深度学习主机里我分别在Ubunu17.04和Ubuntu16.04的系统环境下配置了这样的深度学习服务器环境,下面就是相关的安装记录,希望这能成为一份简单的深度学习服务器环境配置指南。

1. 安装Ubuntu系统: Ubuntu16.04 或者 Ubuntu17.04

从Ubuntu官方直接下载Ubuntu镜像(Ubuntu16.04或者Ubuntu17.04,采用的是desktop amd64版本),用U盘和Ubuntu镜像制作安装盘。在MAC下制作 Ubuntu USB 安装盘的方法可参考: 在MAC下使用ISO制作Linux的安装USB盘,之后通过Bios引导U盘启动安装Ubuntu系统。如果安装的时候出现类似黑屏或者类似 "nouveau ... fifo ..."之类的报错信息,重启电脑,进入安装界面时候长按e,进入图形界面,按F6,选择 nomodeset 或者手动添加,进行Ubuntu系统的安装。参考《深度学习主机环境配置: Ubuntu16.04+Nvidia GTX 1080+CUDA8.0》。

2. Source源和Pip源设置:

系统安装完毕后建议设置一下source源和pip源,这样可以加速安装相关的工具包。

cd /etc/apt/
sudo cp sources.list sources.list.bak
sudo vi sources.list

对于Ubuntu16.04,我用的是阿里云的源,把下面的这些源添加到source.list文件头部:

deb-src http://archive.ubuntu.com/ubuntu xenial main restricted #Added by software-properties
deb http://mirrors.aliyun.com/ubuntu/ xenial main restricted
deb-src http://mirrors.aliyun.com/ubuntu/ xenial main restricted multiverse universe #Added by software-properties
deb http://mirrors.aliyun.com/ubuntu/ xenial-updates main restricted
deb-src http://mirrors.aliyun.com/ubuntu/ xenial-updates main restricted multiverse universe #Added by software-properties
deb http://mirrors.aliyun.com/ubuntu/ xenial universe
deb http://mirrors.aliyun.com/ubuntu/ xenial-updates universe
deb http://mirrors.aliyun.com/ubuntu/ xenial multiverse
deb http://mirrors.aliyun.com/ubuntu/ xenial-updates multiverse
deb http://mirrors.aliyun.com/ubuntu/ xenial-backports main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ xenial-backports main restricted universe multiverse #Added by software-properties
deb http://archive.canonical.com/ubuntu xenial partner
deb-src http://archive.canonical.com/ubuntu xenial partner
deb http://mirrors.aliyun.com/ubuntu/ xenial-security main restricted
deb-src http://mirrors.aliyun.com/ubuntu/ xenial-security main restricted multiverse universe #Added by software-properties
deb http://mirrors.aliyun.com/ubuntu/ xenial-security universe
deb http://mirrors.aliyun.com/ubuntu/ xenial-security multiverse

对于Ubuntu17.04,我使用的是网易的源:

deb http://mirrors.163.com/ubuntu/ zesty main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ zesty-security main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ zesty-updates main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ zesty-proposed main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ zesty-backports main restricted universe multiverse
deb-src http://mirrors.163.com/ubuntu/ zesty main restricted universe multiverse
deb-src http://mirrors.163.com/ubuntu/ zesty-security main restricted universe multiverse
deb-src http://mirrors.163.com/ubuntu/ zesty-updates main restricted universe multiverse
deb-src http://mirrors.163.com/ubuntu/ zesty-proposed main restricted universe multiverse
deb-src http://mirrors.163.com/ubuntu/ zesty-backports main restricted universe multiverse

最后更新一下:

sudo apt-get update
sudo apt-get upgrade

另外一个事情是将pip源指向阿里云的源镜像:http://mirrors.aliyun.com/help/pypi,具体添加一个 ~/.config/pip/pip.conf 文件,设置为:

[global]
trusted-host =  mirrors.aliyun.com
index-url = http://mirrors.aliyun.com/pypi/simple

或者清华的pip源,刚好安装的那两天清华的pip源抽风,所以就换阿里云的了。

3. 安装1080TI显卡驱动:

sudo apt-get purge nvidia*
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update && sudo apt-get install nvidia-384 nvidia-settings

安装完毕后重启机器,运行 nvidia-smi,看看生效的显卡驱动:

4. 安装CUDA:

因为Tensorflow和Pytorch目前官方提供的PIP版本只支持CUDA8, 所以我选择了安装CUDA8.0。不过目前英文达官方网站的 CUDA-TOOLKIT页面默认提供的是CUDA9.0的下载,所以需要在英文达官方提供的另一个 CUDA Toolkit Archive 页面选择CUDA8,这个页面包含了CUDA所有的历史版本和当前的CUDA9.0版本。点击 CUDA Toolkit 8.0 GA2 (Feb 2017) 这个页面,选择"cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64.deb" 和 "cuBLAS Patch Update to CUDA 8":

sudo dpkg -i cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64.deb
sudo apt-get update
sudo apt-get install cuda

如果之前没有安装上述"cuBLAS Patch Update to CUDA 8",可以用如下方式安装更新:

sudo dpkg -i cuda-repo-ubuntu1604-8-0-local-cublas-performance-update_8.0.61-1_amd64.deb
sudo apt-get update  
sudo apt-get upgrade cuda

在 ~/.bashrc 中设置环境变量:

export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
export CUDA_HOME=/usr/local/cuda

运行 source ~/.bashrc 使其生效

4. 安装cuDNN:

cuDNN7.0 虽然出来了,但是 CUDA8 的最佳拍档依然是cuDNN6.0,在NIVIDA开发者官网上,找到cudnn的下载页面: https://developer.nvidia.com/rdp/cudnn-download ,选择"Download cuDNN v6.0 (April 27, 2017), for CUDA 8.0" 中的 "cuDNN v6.0 Library for Linux":

下载后安装非常简单,就是解压然后拷贝到相应的系统CUDA路径下,注意最后一行拷贝时 "-d"不能少, 否则会提示.so不是symbol link:

tar -zxvf cudnn-8.0-linux-x64-v6.0.tgz 
sudo cp cuda/include/cudnn.h /usr/local/cuda/include/
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64/ -d

以上是安装均在Ubunt16.04和Ubuntu17.04环境下测试通过,最后鉴于最近一些相关文章评论有同学留言无法从官方下载CUDA和cuDNN,亲测可能与国内环境有关,我将cuda8.0, cuda9.0, cudnn6.0, cudnn7.0的相关工具包上传到了百度网盘,提供两个下载地址:

CUDA8.0 & CUDA9.0下载地址:链接: https://pan.baidu.com/s/1gfaS4lt 密码 ddji ,包括:

1) CUDA8.0 for Ubuntu16.04: cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64.deb
2) CUDA8.0 for Ubuntu16.04 更新: cuda-repo-ubuntu1604-8-0-local-cublas-performance-update_8.0.61-1_amd64
3) CUDA9.0 for Ubuntu16.04: cuda-repo-ubuntu1604-9-0-local_9.0.176-1_amd64.deb
4) CUDA9.0 for Ubuntu17.04: cuda-repo-ubuntu1704-9-0-local_9.0.176-1_amd64

cuDNN6.0 & cuDNN7.0下载地址:链接: https://pan.baidu.com/s/1qXIZqpA 密码 cwch ,包括:

1) cudnn6.0 for CUDA8: cudnn-8.0-linux-x64-v6.0.tgz
2) cudnn7.0 for CUDA8: cudnn-8.0-linux-x64-v7.tgz
3) cudnn7.0 for CUDA9: cudnn-9.0-linux-x64-v7.tgz

注:原创文章,转载请注明出处及保留链接“我爱自然语言处理”:http://www.52nlp.cn

本文链接地址:从零开始搭建深度学习服务器: 基础环境配置(Ubuntu + GTX 1080 TI + CUDA + cuDNN) http://www.52nlp.cn/?p=9823

Andrew Ng 深度学习课程系列第四门课程卷积神经网络开课

Deep Learning Specialization on Coursera

Andrew Ng 深度学习课程系列第四门课程卷积神经网络(Convolutional Neural Networks)将于11月6日开课 ,不过课程资料已经放出,现在注册课程已经可以听课了 ,这门课程属于Coursera上的深度学习专项系列 ,这个系列有5门课,前三门已经开过好几轮,但是第4、第5门课程一直处于待定状态,新的一轮将于11月7号开始,感兴趣的同学可以关注:Deep Learning Specialization

This course will teach you how to build convolutional neural networks and apply it to image data. Thanks to deep learning, computer vision is working far better than just two years ago, and this is enabling numerous exciting applications ranging from safe autonomous driving, to accurate face recognition, to automatic reading of radiology images. You will: - Understand how to build a convolutional neural network, including recent variations such as residual networks. - Know how to apply convolutional networks to visual detection and recognition tasks. - Know to use neural style transfer to generate art. - Be able to apply these algorithms to a variety of image, video, and other 2D or 3D data. This is the fourth course of the Deep Learning Specialization.

个人认为这是目前互联网上最适合入门深度学习的课程系列了,Andrew Ng 老师善于讲课,另外用Python代码抽丝剥茧扣作业,课程学起来非常舒服,参考我之前写得两篇小结:

Andrew Ng 深度学习课程小记

Andrew Ng (吴恩达) 深度学习课程小结

额外推荐: 深度学习课程资源整理