标签归档:TensorFlow

从零开始搭建深度学习服务器: 1080TI四卡并行(Ubuntu16.04+CUDA9.2+cuDNN7.1+TensorFlow+Keras)

Deep Learning Specialization on Coursera

这个系列写了好几篇文章,这是相关文章的索引,仅供参考:

最近公司又弄了一套4卡1080TI机器,配置基本上和之前是一致的,只是显卡换成了技嘉的伪公版1080TI:技嘉GIGABYTE GTX1080Ti 涡轮风扇108TTURBO-11GD

部件	型号	价格	链接	备注
CPU	英特尔(Intel)酷睿六核i7-6850K 盒装CPU处理器 	4599	http://item.jd.com/11814000696.html	
散热器	美商海盗船 H55 水冷	449	https://item.jd.com/10850633518.html	
主板	华硕(ASUS)华硕 X99-E WS/USB 3.1工作站主板	4759	
内存	美商海盗船(USCORSAIR) 复仇者LPX DDR4 3000 32GB(16Gx4条)  	2799 * 2	https://item.jd.com/1990572.html	
SSD	三星(SAMSUNG) 960 EVO 250G M.2 NVMe 固态硬盘	599	https://item.jd.com/3739097.html		
硬盘	希捷(SEAGATE)酷鱼系列 4TB 5900转 台式机机械硬盘 * 2 	629 * 2	https://item.jd.com/4220257.html	
电源	美商海盗船 AX1500i 全模组电源 80Plus金牌	3699	https://item.jd.com/10783917878.html
机箱	美商海盗船 AIR540 USB3.0 	949	http://item.jd.com/12173900062.html
显卡	技嘉(GIGABYTE) GTX1080Ti 11GB 非公版高端游戏显卡深度学习涡轮 * 4 7400 * 4    https://item.jd.com/10583752777.html

这台深度学习主机大概是这样的:

深度学习主机

安装完Ubuntu16.04之后,我又开始了CUDA、cuDnn等深度学习环境和工具的安装之旅,时隔大半年,又有了很多变化,特别是CUDA9.x和cuDnn7.x已经成了标配,这里记录一下。

安装CUDA9.x

注:如果还需要安装Tensorflow1.8,建议这里安装CUDA9.0,我在另一台机器上遇到了一点问题,怀疑和我这台机器先安装CUDA9.0,再安装CUDA9.2有关。

依然从英伟达官方下载当前的CUDA版本,我选择了最新的CUDA9.2:

点选完对应Ubuntu16.04的CUDA9.2 deb版本之后,英伟达官方主页会给出安装提示:

Installation Instructions:
`sudo dpkg -i cuda-repo-ubuntu1604-9-2-local_9.2.88-1_amd64.deb`
`sudo apt-key add /var/cuda-repo-/7fa2af80.pub`
`sudo apt-get update`
`sudo apt-get install cuda`

在下载完大概1.2G的cuda deb版本之后,实际安装命令是这样的:

sudo dpkg -i cuda-repo-ubuntu1604-9-2-local_9.2.88-1_amd64.deb
sudo apt-key add /var/cuda-repo-9-2-local/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda

官方CUDA下载下载页面还附带了一个cuBLAS 9.2 Patch更新,官方强烈建议安装:

This update includes fix to cublas GEMM APIs on V100 Tensor Core GPUs when used with default algorithm CUBLAS_GEMM_DEFAULT_TENSOR_OP. We strongly recommend installing this update as part of CUDA Toolkit 9.2 installation.

可以用如下方式安装这个Patch更新:

sudo dpkg -i cuda-repo-ubuntu1604-9-2-local-cublas-update-1_1.0-1_amd64.deb 
sudo apt-get update  
sudo apt-get upgrade cuda

CUDA9.2安装完毕之后,1080TI的显卡驱动也附带安装了,可以重启机器,然后用 nvidia-smi 命令查看一下:

最后在在 ~/.bashrc 中设置环境变量:

export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
export CUDA_HOME=/usr/local/cuda

运行 source ~/.bashrc 使其生效。

安装cuDNN7.x

同样去英伟达官网的cuDNN下载页面:https://developer.nvidia.com/rdp/cudnn-download,最新版本是cuDNN7.1.4,有三个版本可以选择,分别面向CUDA8.0, CUDA9.0, CUDA9.2:

cudnn7.1.4 cuda9.2 ubuntu16.04

下载完cuDNN7.1的压缩包之后解压,然后将相关文件拷贝到cuda的系统路径下即可:

tar -zxvf cudnn-9.2-linux-x64-v7.1.tgz
sudo cp cuda/include/cudnn.h /usr/local/cuda/include/
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64/ -d 
sudo chmod a+r /usr/local/cuda/include/cudnn.h
sudo chmod a+r /usr/local/cuda/lib64/libcudnn*

安装TensorFlow 1.8

TensorFlow的安装变得越来越简单,现在TensorFlow的官网也有中文安装文档了:https://www.tensorflow.org/install/install_linux?hl=zh-cn , 我们Follow这个文档,用Virtualenv的安装方式进行TensorFlow的安装,不过首先要配置一下基础环境。

首先在Ubuntu16.04里安装 libcupti-dev 库:

这是 NVIDIA CUDA 分析工具接口。此库提供高级分析支持。要安装此库,请针对 CUDA 工具包 8.0 或更高版本发出以下命令:

$ sudo apt-get install cuda-command-line-tools
并将其路径添加到您的 LD_LIBRARY_PATH 环境变量中:

$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/extras/CUPTI/lib64
对于 CUDA 工具包 7.5 或更低版本,请发出以下命令:

$ sudo apt-get install libcupti-dev

然而我运行“sudo apt-get install cuda-command-line-tools”命令后得到的却是:

E: 无法定位软件包 cuda-command-line-tools

Google后发现其实在安装CUDA9.2的时候,这个包已经安装了,在CUDA的路径下这个库已经有了:

/usr/local/cuda/extras/CUPTI/lib64$ ls
libcupti.so  libcupti.so.9.2  libcupti.so.9.2.88

现在只需要将其加入到环境变量中,在~/.bashrc中添加如下声明并令source ~/.bashrc另其生效即可:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/extras/CUPTI/lib64

剩下的就更简单了:

sudo apt-get install python-pip python-dev python-virtualenv 
virtualenv --system-site-packages tensorflow1.8
source tensorflow1.8/bin/activate
easy_install -U pip
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple --upgrade tensorflow-gpu

强烈建议将清华的pip源写到配置文件里,这样就更方便快捷了。

最后测试一下TensorFlow1.8:

Python 2.7.12 (default, Dec  4 2017, 14:50:18) 
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
2018-06-17 12:15:34.158680: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-06-17 12:15:34.381812: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties: 
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:05:00.0
totalMemory: 10.91GiB freeMemory: 5.53GiB
2018-06-17 12:15:34.551451: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 1 with properties: 
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:06:00.0
totalMemory: 10.92GiB freeMemory: 5.80GiB
2018-06-17 12:15:34.780350: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 2 with properties: 
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:09:00.0
totalMemory: 10.92GiB freeMemory: 5.80GiB
2018-06-17 12:15:34.959199: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 3 with properties: 
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:0a:00.0
totalMemory: 10.92GiB freeMemory: 5.80GiB
2018-06-17 12:15:34.966403: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0, 1, 2, 3
2018-06-17 12:15:36.373745: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-06-17 12:15:36.373785: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929]      0 1 2 3 
2018-06-17 12:15:36.373798: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0:   N Y Y Y 
2018-06-17 12:15:36.373804: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 1:   Y N Y Y 
2018-06-17 12:15:36.373808: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 2:   Y Y N Y 
2018-06-17 12:15:36.373814: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 3:   Y Y Y N 
2018-06-17 12:15:36.374516: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5307 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:05:00.0, compute capability: 6.1)
2018-06-17 12:15:36.444426: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 5582 MB memory) -> physical GPU (device: 1, name: GeForce GTX 1080 Ti, pci bus id: 0000:06:00.0, compute capability: 6.1)
2018-06-17 12:15:36.506340: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 5582 MB memory) -> physical GPU (device: 2, name: GeForce GTX 1080 Ti, pci bus id: 0000:09:00.0, compute capability: 6.1)
2018-06-17 12:15:36.614736: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:3 with 5582 MB memory) -> physical GPU (device: 3, name: GeForce GTX 1080 Ti, pci bus id: 0000:0a:00.0, compute capability: 6.1)
Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:05:00.0, compute capability: 6.1
/job:localhost/replica:0/task:0/device:GPU:1 -> device: 1, name: GeForce GTX 1080 Ti, pci bus id: 0000:06:00.0, compute capability: 6.1
/job:localhost/replica:0/task:0/device:GPU:2 -> device: 2, name: GeForce GTX 1080 Ti, pci bus id: 0000:09:00.0, compute capability: 6.1
/job:localhost/replica:0/task:0/device:GPU:3 -> device: 3, name: GeForce GTX 1080 Ti, pci bus id: 0000:0a:00.0, compute capability: 6.1
2018-06-17 12:15:36.689345: I tensorflow/core/common_runtime/direct_session.cc:284] Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:05:00.0, compute capability: 6.1
/job:localhost/replica:0/task:0/device:GPU:1 -> device: 1, name: GeForce GTX 1080 Ti, pci bus id: 0000:06:00.0, compute capability: 6.1
/job:localhost/replica:0/task:0/device:GPU:2 -> device: 2, name: GeForce GTX 1080 Ti, pci bus id: 0000:09:00.0, compute capability: 6.1
/job:localhost/replica:0/task:0/device:GPU:3 -> device: 3, name: GeForce GTX 1080 Ti, pci bus id: 0000:0a:00.0, compute capability: 6.1

安装Keras2.1.x

Keras的后端支持TensorFlow, Theano, CNTK,在安装完TensorFlow GPU版本之后,继续安装Keras非常简单,在TensorFlow的虚拟环境中,直接"pip install keras"即可,安装的版本是Keras2.1.6:

Installing collected packages: h5py, scipy, pyyaml, keras
Successfully installed h5py-2.7.1 keras-2.1.6 pyyaml-3.12 scipy-1.1.0

测试一下:

Python 2.7.12 (default, Dec  4 2017, 14:50:18) 
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import keras
Using TensorFlow backend.

注:原创文章,转载请注明出处及保留链接“我爱自然语言处理”:http://www.52nlp.cn

本文链接地址:从零开始搭建深度学习服务器: 1080TI四卡并行(Ubuntu16.04+CUDA9.2+cuDNN7.1+TensorFlow+Keras) http://www.52nlp.cn/?p=10334

AI Challenger 2017 奇遇记

Deep Learning Specialization on Coursera

本文记录一下去年下半年参加的AI Challenger比赛的过程,有那么一点意思,之所以说是奇遇,看完文章就明白了。

去年8月,由创新工场、搜狗、今日头条联合举办的“AI challenger全球AI挑战赛”首届比赛正式开赛。比赛共设6个赛道,包括英中机器同声传译、英中机器文本翻译、场景分类、图像中文描述、人体骨骼关键点预测以及虚拟股票趋势预测,一时汇集了众多关注的目光:

“AI Challenger 全球AI挑战赛”是面向全球人工智能(AI)人才的开放数据集和编程竞赛平台,致力于打造大型、全面的科研数据集与世界级竞赛平台,从科研角度出发,满足学术界对高质量数据集的需求,推进人工智能在科研与商业领域的结合,促进世界范围内人工智能研发人员共同探索前沿领域的技术突破及应用创新。在2017年的首届大赛中,AI Challenger发布了千万量级的机器翻译数据集、百万量级的计算机视觉数据集,一系列兼具学术前沿性和产业应用价值的竞赛以及超过200万人民币的奖金,吸引了来自全球65个国家的8892支团队参赛,成为目前国内规模最大的科研数据集平台、以及最大的非商业化竞赛平台。 AI Challenger以服务、培养AI高端人才为使命,打造良性可持续的AI科研新生态。

不过AI Challenger 最吸引我的不是每项比赛数十万元的奖金(这个掂量一下也拿不到),而是英中机器翻译提供的高达1千万的中英双语句对语料,这个量级,在开放的中英语料里仅次于联合国平行语料库,相当的有诱惑力:

简介
英中机器文本翻译作为此次比赛的任务之一,目标是评测各个团队机器翻译的能力。本次机器翻译语言方向为英文到中文。测试文本为口语领域数据。参赛队伍需要根据评测方提供的数据训练机器翻译系统,可以自由的选择机器翻译技术。例如,基于规则的翻译技术、统计机器翻译及神经网络机器翻译等。参赛队伍可以使用系统融合技术,但是系统融合系统不参与排名。需要指出,神经网络机器翻译常见的Ensemble方法,本次评测不认定为系统融合技术。

数据说明
我们将所有数据分割成为训练集、验证集和测试集合。我们提供了超过1000万的英中对照的句子对作为数据集合。其中,训练集合占据绝大部分,验证集合8000对,测试集A 8000条,测试集B 8000条。训练数据主要来源于英语学习网站和电影字幕,领域为口语领域。所有双语句对经过人工检查,数据集从规模、相关度、质量上都有保障。一个英中对照的句子对,包含一句英文和一句中文文本,中文句子由英文句子人工翻译而成。中英文句子分别保存到两个文件中,两个文件中的中英文句子以行号形成一一对应的关系。验证集和测试集最终是以标准的XML格式发布给参赛方。

训练条件
本次评测只允许参赛方使用使用评测方指定的数据训练机器翻译系统,并对其排名。参赛方需遵守以下关于训练方式的说明。参赛方可以使用基本的自然语言处理工具,例如中文分词和命名实体识别。

大概十年前我读研期间做得是统计机器翻译,那个时候能接触到的中英句对最多到过2、3百万,用得最多的工具是知名的开源统计机器翻译工具Moses,也在这里写了不少相关的文章。后来工作先后从事过机器翻译、广告文本挖掘相关的工作,与机器翻译渐行渐远。这一两年,我花了很多时间在专利数据挖掘上,深知专利数据翻译的重要性,也了解到机器翻译对于专利翻译有天然的吸引力。加之这几年来深度学习如火如荼,神经网络机器翻译横空出世,Google, 微软,Facebook等公司关于机器翻译的PR一浪高过一浪,大有“取代”人翻译的感觉,这些都都给了我很大的触动,但是一直没有机会走进神经网络机器翻译。刚好这个时候自己又在家里重新组了一台1080TI深度学习主机,加上AI Challenger提供的机器翻译数据机会,我把这次参赛的目标定为:

  • 了解目前神经网络机器翻译NMT的发展趋势
  • 学习并调研相关的NMT开源工具
  • 将NMT应用在中英日三语之间的专利翻译产品上

相对于统计机器翻译,神经网络机器翻译的开源工具更加丰富,这也和最近几年深度学习开源平台遍地开花有关,每个深度学习平台基本上都附有一两个典型的神经网络机器翻译工具和例子。不过需要说明的是,以下这些关于NMT工具的记录大多数是去年9月到12月期间的调研,很多神经网络机器翻译工具还在不断的迭代和演进中,下面的一些描述可能都有了变化。

虽然之前也或多或少的碰到过一些NMT工具,但是这一次我的神经网络机器翻译开源工具之旅是从OpenNMT开启的,这个开源NMT工具由哈佛NLP组推出,诞生于2016年年末,不过主版本基于Torch, 默认语言是Lua,对于喜爱Python的我来说还不算太方便。所以首先尝试了OpenNMT的Pytorch版本: OpenNMT-py,用AI Challenger官方平台提供中英翻译句对中的500万句对迅速跑了一个OpenNMT-py的默认模型:

Step 2: Train the model
python train.py -data data/demo -save_model demo-model
The main train command is quite simple. Minimally it takes a data file and a save file. This will run the default model, which consists of a 2-layer LSTM with 500 hidden units on both the encoder/decoder.

然后走了一遍AI Challenger的比赛流程,第一次提交记录如下:

2017.09.26 第一次提交:训练数据500万, opennmt-py, default,线下验证集结果:0.2325,线上提交测试集结果:0.22670

走完了比赛流程,接下来我要认真的审视这次英中机器翻译比赛了,在第二轮训练模型开始前,我首先对数据做了标准化的预处理:

  1. 数据shuf之后选择了8000句对作为开发集,8000句对作为测试集,剩下的980多万句对作为训练集;
  2. 英文数据按照统计机器翻译工具Moses 的预处理流程进行了tokenize和truecase;中文数据直接用Jieba中文分词工具进行分词;

这一次我将目光瞄准了Google的NMT系统:GNMT, Google的Research Blog是一个好地方: Building Your Own Neural Machine Translation System in TensorFlow,我从这篇文章入手,然后学习使用Tensorflow的NMT开源工具: Tensorflow-NMT,第一次使用subword bpe处理数据,训练了一个4层的gnmt英中模型,记录如下:

2017.10.05 第二次提交:训练集988万句对, tf-nmt, gnmt-4-layer,bpe16000, 线下验证集结果0.2739,线上提交测试集结果:0.26830

这次的结果不错,BLEU值较第一次提交有4个点的提升,我继续尝试使用bpe处理,一周后,做了第三次提交:

2017.10.12 第三次提交:训练集988万句对,tf-nmt, gnmt-4-layer,bpe32000, 线下验证集结果0.2759,线上提交测试集结果:0.27180

依然有一些提高,不过幅度不大。这一次,为了调研各种NMT开源工具,我又把目光锁定到OpenNMT,事实上,到目前为止,接触到的几个神经网络机器翻译开源工具中,和统计机器翻译开源工具Moses最像的就是OpenNMT,有自己独立的官网,文档相当详细,论坛活跃度很高,并且有不同的分支版本,包括主版本 OpenNMT-lua, Pytorch版本 OpenNMT-py, TensorFlow版本 OpenNMT-tf 。所以为了这次实验我在深度学习主机中安装了Torch和OpenNMT-lua版本,接下来半个月做了两次OpenNMT训练英中神经网络翻译模型的尝试,不过在验证集的结果和上面的差不多或者略低,没有实质性提高,所以我放弃了这两次提交。

也在这个阶段,从不同途径了解到Google新推的Transformer模型很牛,依然从Google Research Blog入手:Transformer: A Novel Neural Network Architecture for Language Understanding ,学习这篇神文:《Attention Is All You Need》 和尝试相关的Transformer开源工具 TensorFlow-Tensor2Tensor。一图胜千言,谷歌AI博客上给得这个图片让人无比期待,不过实际操作中还是踩了很多坑:

还是和之前学习使用开源工具的方法类似,我第一次的目标主要是走通tensor2tensor,所以跑了一个 wmt32k base_single 的英中transformer模型,不过结果一般,记录如下:

2017.11.03 第六次实验:t2t transformer wmt32k base_single, 线下验证集BLEU: 0.2605,未提交

之后我又换为wmt32k big_single的设置,再次训练英中transformer模型,这一次,终于在线下验证集的BLEU值上,达到了之前GNMT最好的结果,所以我做了第四次线上提交,不过测试集A的结果还略低一些,记录如下:

2017.11.06 第七次实验:t2t transformer wmt32k big_single,线下验证集结果 0.2759, 线上测试集得分:0.26950

不过这些结果和博客以及论文里宣称的结果相差很大,我开始去检查差异点,包括tensor2tensor的issue以及论文,其实论文里关于实验的部分交代的很清楚:

On the WMT 2014 English-to-German translation task, the big transformer model (Transformer (big) in Table 2) outperforms the best previously reported models (including ensembles) by more than 2.0 BLEU, establishing a new state-of-the-art BLEU score of 28.4. The configuration of this model is listed in the bottom line of Table 3. Training took 3.5 days on 8 P100 GPUs. Even our base model surpasses all previously published models and ensembles, at a fraction of the training cost of any of the competitive models.

On the WMT 2014 English-to-French translation task, our big model achieves a BLEU score of 41.0, outperforming all of the previously published single models, at less than 1/4 the training cost of the previous state-of-the-art model. The Transformer (big) model trained for English-to-French used dropout rate Pdrop = 0.1, instead of 0.3.

For the base models, we used a single model obtained by averaging the last 5 checkpoints, which were written at 10-minute intervals. For the big models, we averaged the last 20 checkpoints. We used beam search with a beam size of 4 and length penalty α = 0.6 . These hyperparameters were chosen after experimentation on the development set. We set the maximum output length during inference to input length + 50, but terminate early when possible.

总结起来有2个地方可以改进:第一,是对checkpoints进行average, 这个效果立竿见影:

2017.11.07 第八次实验:t2t transformer wmt32k big_single average model, 线下验证集得分 0.2810 , 提交测试集得分:0.27330

第二,要有高性能的深度学习服务器。谷歌实验中最好的结果是在8块 P100 GPU的机器上训练了3.5天,对我的单机1080TI深度学习主机来说,一方面训练时对参数做了取舍,另一方面用时间换空间,尝试增加训练步数,直接将训练步数增加到100万次,结果还是不错的:

2017.11.15 第九次实验:t2t transformer wmt32k big_single 1000k 10beam,线下验证集得分0.2911,线上提交测试集得分0.28560

然后继续average checkpoints:
2017.11.16 第十次提交: t2t transformer wmt32k big_single 1000k average 10beam, 线下验证集得分0.2930,线上提交测试集得分0.28780

这两个方法确实能有效提高BLEU值,所以我继续沿用这个策略,按着训练时间推算了一下,估计这台机器在12月初比赛正式结束前大概可以训练一个250万次的模型,当然,这个给自己预留了最后提交比赛结果的时间。不过在11月27日,我在英中机器翻译比赛测试集A结束提交前提交了一个训练了140万次,并做了模型average的提交,算是这个赛道Test A关闭前的最后一次提交:

2017.11.27 第十一次提交 t2t transformer wmt32k big_single 1400k.beam10.a0.9.average, 验证集 0.2938 测试集 0.28950

12月1日凌晨测试集B正式放出,这个是最终排名的重要依据,只有2次提交机会,并且结果不会实时更新,只有等到12月3号之后才会放出最终排名。我的英中2500k Transformer模型大概在12月2号训练完毕,我做了Test B的第一次提交:

2017.12.2 average b10 a0.9: 0.2972(验证集)

之后,我逐一检查了保留的20个checkpoint在验证集上的得分,最终选择了高于平均值的11个checkpoint的average又做了第二次提交,虽然验证集只高了0.0001, 但是在这样的比赛中,“蚊子肉也是肉啊”:

2017.12.3 average select 11 b10 a0.9: 0.2973(验证集)

这就是我在英中机器文本翻译比赛中的整个历程,在Test A的最终排名大概在二十几名,但是最后一次模型的结果应该还能提高,所以预期是前20,剩下的就是等待TEST B的最终排名结果了。做到这个份上,其实我还挺满意的,不过故事如果真的到此就结束了,那算不上奇遇,有意思的事情才刚开始。

AI Challenger 2017有两个赛道和机器翻译有关,一个是英中机器文本翻译比赛(最高奖金30万),另外一个是英中机器同声传译比赛(最高奖金40万),一开始报名的时候,直观上觉得后者比较复杂,一方面奖金部分说明了问题,另外赛题描述部分也让人觉得涉及到语音处理,比较复杂:

简介
随着最近深度学习在语音、自然语言处理里面的应用,语音识别的错误率在不断降低,机器翻译的效果也在不断提高。语音处理和机器翻译的进步也推动机器同声传译的进步。如果竞赛任务同时考虑语音识别、机器翻译和语音合成这些项目,参赛队伍遇到的难度会很大。所以本次评测重点也在语音识别后的文本处理和机器翻译任务。翻译语言方向为英文到中文。

语音识别后处理模块:语音识别后的文本与书面语有很多不同。识别后文本具有(1)包含有识别错误;(2)识别结果没有标点符号;(3)源端为比较长的句子,例如对40~50s的语音标注后的文本,没有断句;(4)口语化文本,夹杂语气词等特点。由于本次比赛没有提供错误和正确对照的文本用于训练纠错模块。本次比赛提供的测试集合的源端文本是人工对语音标注后的文本,不包含识别错误。针对其它的特点,参赛队伍可以这几个方面考虑优化,但不限于以下几个方面:

1. 针对无标点的情况,参赛方可以利用提供的英文单语数据训练自动标点模块。用自动标点模块对测试集合文本进行添加标点。自动标点也属于序列标注任务,选手可以使用统计模型或是神经网络的模型进行建模。

2. 针对断句:源端文本都是比较长的文本,不利于机器翻译,参赛者可以设定断句策略。例如,参赛者可以依据标点来进行断句,将每个小的分句送入机器翻译系统。

3. 针对口语化:参赛队伍可以制定一些去除口语词的规则来处理测试集合。

机器翻译模块:将识别后处理的文本翻译成目标语言。参赛队伍需要根据评测方提供的数据训练机器翻译系统,可以自由的选择机器翻译技术。例如,基于规则的翻译技术、基于实例的翻译技术、统计机器翻译及神经网络机器翻译等。参赛队伍可以使用系统融合技术,但是系统融合系统不参与排名。

数据说明
机器翻译训练集。我们提供了1000万左右英中对照的句子对作为训练集合。训练数据领域为口语领域。所有双语句对经过人工检查,数据集从规模、相关度、质量上都有保障。一个英中对照的句子对,包含一句英文和一句中文文本,中文句子由英文句子人工翻译而成。

自动标点训练数据。选手可以利用提供的1000万文本训练自动标点系统。

验证集和测试集。我们会分别选取多个英语演讲的题材的音频,总时长在3~6小时之间,然后按照内容切分成30s~50s不等长度的音频数据,人工标注出音频对应的英文文本。人工标注的文本不翻译识别错误、无标点、含有语气词等。人工标注的好的英文文本会由专业译员翻译成中文文本,就形成了英中对照的句子对。抽取的英中对照的句子对会被分割为验证集和测试集。验证集和测试集最终是以标准的XML格式提供给选手。

我在一开始的时候考虑到这个比赛同样提供上千万句对的语料,所以当时顺手报名了这个同声传译比赛,但是直到最后一刻,我还没有仔细看过或者准备过这个任务。不过12月2号当我第一次完成英中机器翻译比赛的测试集B提交后,以完成作业的心态了解了一下这个英中机器同传比赛的题意以及数据集,发现这里提供的训练集和英中机器翻译比赛的数据是一致的,也就是说机器翻译模块可以复用之前训练的英中Transformer模型,而真正需要解决的,是标点符号自动标注模块以及断句模块。

感谢Google、Github和开源世界,在测试了几个自动标点标注模块后,我把目光锁定在 punctuator2(A bidirectional recurrent neural network model with attention mechanism for restoring missing punctuation in unsegmented text), 一个带attention机制的双向RNN无标点文本标点符号还原工具,通过它很快的构建了英文文本自动标点标注模块,并且用在了英中机器同声传译比赛的验证集和测试集上,验证集结果不算太差,所以对应英中机器翻译的模型,我也做了两次测试集B的提交,但是至于结果如何,我根本无法判断,因为在测试集A上,我没有提交过一次,所以无法判断测试集和验证集的正相关性。但是完成了 AI Challenger 的相关“作业“,我基本上心满意足了,至于结果如何,Who Care?

大约一个周之后测试集B上的结果揭晓,我在英中机器翻译文本比赛上进了前20,英中同声传译比赛上进了前10,不过前者的参数队伍有150多支,后者不足30支,特别是测试集B的提交队伍不到15支,有点诡异。原本以为这就结束了,不过到了12月中旬的某个周末,我微信突然收到了AI Challenger小助手的催收信息,大意是需要提交什么代码验证,问我为什么一直没有提交?我一脸错愕,她让我赶紧查看邮件,原来早在一个周之前的12月9号,AI Challenger发了一封邮件,主题是这样的:“AI Challenger 2017 TOP10 选手通知”

亲爱的AI Challenger,

恭喜你,过五关斩六将进入了TOP10,进入前十的机率是0.56%,每一位都是千里挑一的人才。非常不容易也非常优秀!

为了保证竞赛公平公正性,您还需要在12月10日中午12点前按如下格式提交您的代码至大赛核验邮箱aichallenger@chuangxin.com

邮件格式:
主题:AI ChallengerTOP10代码提交-队伍名称-赛道
正文:
队伍名称
全体队员信息:姓名-AI Challenger昵称-电话-邮箱-所在机构-专业&年级

附件:(文件名称)
1- 代码

非常感谢您的合作。

原来测试集B上的前10名同学需要提交代码复核,我原来以为只有前5名需要去北京现场答辩的同学要做这个,没想到前10名都需要做,赶紧和AI Challenger小助手沟通了一下,因为自己几乎都是通过开源工具完成的比赛,就简单的提交了一份说明文档过去了。正是在参加AI Challenger比赛的同一时期,我们的专利机器翻译产品也马不停蹄的开展了,出于对两个赛道前几名队伍BLEU值的仰望,我准备去北京旁听一下现场答辩,所以当天还和AI Challenger小助手沟通了一下现场观摩的问题,小助手说,前十名可以直接来,所以我觉得进入前十名还是不错的。

没想到第二天一早又收到Challenger小助手的微信留言,大意是:你不用自己买票来观摩比赛了,因为前面有几支队伍因种种原因放弃现场答辩,你自动递补为第5名,需要来北京参加12月21日的现场决赛答辩和颁奖礼,我们给你买机票和定酒店。吃不吃惊?意不意外?我当时的第一反应这真是2017年本人遇到最奇特的一件事情。。。然后很快收到了一封决赛邀请函:

亲爱的AI Challenger,

恭喜你,过五关斩六将走到了决赛,进入决赛的机率是0.28%,每一位都是千里挑一的人才。非常不容易也非常优秀!

“AI Challenger 全球AI挑战赛”面向人工智能领域科研人才,致力于打造大型、全面的科研数据集与世界级竞赛平台。由创新工场、搜狗、今日头条联合创建,旨在从科研角度出发,满足学术界对高质量数据集的需求,推进人工智能在科研与商业领域的结合,促进世界范围内人工智能研发人员共同探索前沿领域的技术突破及应用创新。

2017年是AI Challenger的诞生年,我们公布了百万量级的计算机视觉数据集、千万量级的机器翻译数据集,并主办多条细分赛道的AI竞赛。本次英中机器同传竞赛主要任务为集中优化语音识别后处理和机器翻译模块,旨在解决机器同声传译中的技术问题。

......

恭喜所有的入围选手!所有的入围者将在12月21日到中国北京进行现场答辩,本次大赛将以最终榜单排名结合答辩表现,加权计算总成绩,决出最终的大奖。

在答辩之前,我们需要Top5团队于12月18日下午17点前提交包括:
1-答辩PPT、
2-队员情况(个人姓名、个人高清半身照片、个人学校-年级-专业/公司-部门-职务、是否有指导老师-如有,请附上老师150字内简介)
3-团队出席名单(涉及报销事宜)
4-代码(供审查,如有作弊情况将按大赛规则处理)
5-150字内个人简介-选手手册素材(建议为三段话,第一段话是背景介绍,包括你的学校、实验室、师从老师等信息;第二段话可以介绍你的技术优势,包括Paper、竞赛履历、实习履历、项目经历;第三段话支持自由发挥,个人主页、你的爱好,让我们发现一个独一无二的你)
......

虽然去北京参加现场决赛也只是陪太子读书,不过最终还是决定去参加现场答辩,当然这里还有一关需要验证,前10名只需要提交代码或者代码描述即可,前5名参加决赛的同学还要复现整个流程,我很快被小助手拉入一个小群,里面有来自搜狗的工程师同学,他们给我提供了一台深度学习机器,让我复现整个过程以及最终核验比赛结果。当然,留给我的时间比较紧张,12月21号要去北京参加现场答辩,当时已经是12月18号了,所以Challenger小助手特地给我将时间留到了最后一刻。准备PPT和复现整个流程同时进行(复现并不是等于重新训练一遍,譬如机器翻译模型可以直接上传之前训练好的),终于赶在最后时刻完工。不过我自己答辩现场的感觉匆匆忙忙,效果也一般,但是学习了一圈其他获奖队伍的思路,很有收获:Transformer是主流获奖模型,但是很多功夫在细节,包括数据预处理阶段的筛选,数据 & 模型后处理的比拼,当然,牛逼的深度学习机器也是不可或缺的。

附上当时现场答辩PPT上写得几点思考,抛砖引玉,欢迎大家一起探讨机器翻译特别是神经网络机器翻译的现状和未来:

  • NMT开源工具的生态问题,这个过程中我们尝试了OpenNMT, OpenNMT-py, OpenNMT-tf, Tensorflow-nmt, Tensor2Tensor等工具, 总体感觉OpenNMT的生态最完备,很像SMT时代的Moses
  • NMT的工程化和产品化问题,从学术产品到工程产品,还有很多细节要打磨
  • 面向垂直领域的机器翻译:专利机器翻译是一个多领域的机器翻译问题
  • 由衷感谢这些从idea到开源工具都无私奉献的研究者和从业者们,我们只是站在了你们的肩膀上

当然,参加完AI Challenger比赛之后我们并没有停止对于神经网络机器翻译应用的探索,也有了一些新的体会。这半年来我们一直在打磨AIpatent机器翻译引擎,目标是面向中英专利翻译、中日专利翻译、日英专利翻译提供专业的专利翻译引擎,欢迎有这方面需求的同学试用我们的引擎,目前还在不断迭代中。

注:原创文章,转载请注明出处及保留链接“我爱自然语言处理”:http://www.52nlp.cn

本文链接地址:AI Challenger 2017 奇遇记 http://www.52nlp.cn/?p=10218

Andrew Ng 深度学习公开课系列第五门课程序列模型开课

Deep Learning Specialization on Coursera

Andrew Ng 深度学习课程系列第五门课程序列模型(Sequence Models)在1月的尾巴终于开课 ,在跳票了几次之后,这门和NLP比较相关的深度学习课程终于开课了。这门课程属于Coursera上的深度学习专项系列 ,这个系列有5门课,目前终于完备,感兴趣的同学可以关注:Deep Learning Specialization

This course will teach you how to build models for natural language, audio, and other sequence data. Thanks to deep learning, sequence algorithms are working far better than just two years ago, and this is enabling numerous exciting applications in speech recognition, music synthesis, chatbots, machine translation, natural language understanding, and many others. You will: - Understand how to build and train Recurrent Neural Networks (RNNs), and commonly-used variants such as GRUs and LSTMs. - Be able to apply sequence models to natural language problems, including text synthesis. - Be able to apply sequence models to audio applications, including speech recognition and music synthesis. This is the fifth and final course of the Deep Learning Specialization.

这门课程主要面向自然语言,语音和其他序列数据进行深度学习建模,将会学习递归神经网络,GRU,LSTM等内容,以及如何将其应用到语音识别,机器翻译,自然语言理解等任务中去。个人认为这是目前互联网上最适合入门深度学习的系列系列课程了,Andrew Ng 老师善于讲课,另外用Python代码抽丝剥茧扣作业,课程学起来非常舒服,希望最后这门RNN课程也不负众望。参考我之前写得两篇小结:

Andrew Ng 深度学习课程小记

Andrew Ng (吴恩达) 深度学习课程小结

额外推荐: 深度学习课程资源整理

从零开始搭建深度学习服务器: 深度学习工具安装(Theano + MXNet)

Deep Learning Specialization on Coursera

这个系列写了好几篇文章,这是相关文章的索引,仅供参考:

以下是相关深度学习工具包的安装,包括Theano, MXNet

4. Theano

Theano虽然官宣不在更新,但是它的价值依然很大,很多早期深度学习工具的底层依然依赖的是它。在Ubuntu下安装Theano有两种模式,一种是通过Conda安装,Theano的官方安装文档给得是这个方式;另外一种是pip安装模式,官方文档没有给出很好的描述,我参考了网上其他的文章,安装过程中遇到了几个小问题,不过顺利解决。首先安装相关的依赖:

sudo apt-get install python-numpy python-scipy python-dev python-pip python-nose g++ libopenblas-dev git

这个时候可以先尝试用pip的方式安装Theano:

pip install Theano

测试时会遇到类似找不到pygpu模块的提示,而这个模块,是无法用pip安装的,必须通过Theano提供的libgpuarray编译,官方安装文档也给了专门的说明

git clone https://github.com/Theano/libgpuarray.git
cd libgpuarray/
mkdir Build
cd Build/
cmake .. -DCMAKE_BUILD_TYPE=Release
make
sudo make install
cd ..
sudo pip install Cython(如果提示cython没有安装需要先安装Cython)
sudo python setup.py build
sudo python setup.py install
sudo ldconfig

还有最后一步,配置文件

vim ~/.theanorc

[global]
floatX=float32
device=cuda
[cuda]
root=/usr/local/cuda
[nvcc]
flags=-D_FORCE_INLINES

然后可以试一下在ipython中导入Theano是否ok:

Python 2.7.13 (default, Jan 19 2017, 14:48:08) 
Type "copyright", "credits" or "license" for more information.
 
IPython 5.1.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.
 
In [1]: import theano
Using cuDNN version 6021 on context None
Mapped name None to device cuda: GeForce GTX 1080 Ti (0000:05:00.0)

5. MXNet

MXNet的安装还是比较方便的,按照MXNet官方的安装指南,我是在Ubuntu17.04的环境下用virtualenv安装的:

Python2.x的安装方式如下:

如果没有安装python环境和virtualenv,可以先安装:
sudo apt-get update
sudo apt-get install -y python-dev python-virtualenv

然后用virtualenv生成MXNet的虚拟环境:
virtualenv --system-site-packages venv
source venv/bin/activate

要升级pip到最新版(不清楚是为什么):
pip install --upgrade pip

目前MXNet的最新版是1.0:
pip install mxnet-cu80==1.0.0

如果需要可视化训练过程,则可以选择安装graphviz:
sudo apt-get install graphviz
pip install graphviz

最后测试一下MXNet在GPU环境下是否生效:

Python 2.7.13 (default, Nov 23 2017, 15:37:09) 
[GCC 6.3.0 20170406] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import mxnet as mx
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/textminer/mxnet/venv/local/lib/python2.7/site-packages/mxnet/__init__.py", line 25, in <module>
    from . import engine
  File "/home/textminer/mxnet/venv/local/lib/python2.7/site-packages/mxnet/engine.py", line 23, in <module>
    from .base import _LIB, check_call
  File "/home/textminer/mxnet/venv/local/lib/python2.7/site-packages/mxnet/base.py", line 111, in <module>
    _LIB = _load_lib()
  File "/home/textminer/mxnet/venv/local/lib/python2.7/site-packages/mxnet/base.py", line 103, in _load_lib
    lib = ctypes.CDLL(lib_path[0], ctypes.RTLD_LOCAL)
  File "/usr/lib/python2.7/ctypes/__init__.py", line 362, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libgfortran.so.3: cannot open shared object file: No such file or directory

报了如上libgfortran.so.3的错误,google了一下,需要安装gfortran:

sudo apt-get install gfortran

再次测试,就没有问题了:

(venv) textminer@textminer:~/mxnet$ python
Python 2.7.13 (default, Nov 23 2017, 15:37:09) 
[GCC 6.3.0 20170406] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import mxnet as mx
>>> a = mx.nd.ones((2,3), mx.gpu())
>>> b = a * 2 + 1
>>> b.asnumpy()
array([[ 3.,  3.,  3.],
       [ 3.,  3.,  3.]], dtype=float32)

Python3.x下的安装基本上过程相同。

注:原创文章,转载请注明出处及保留链接“我爱自然语言处理”:http://www.52nlp.cn

本文链接地址:从零开始搭建深度学习服务器: 深度学习工具安装(Theano + MXNet) http://www.52nlp.cn/?p=10058

深度学习课程及深度学习公开课资源整理

Deep Learning Specialization on Coursera

这里整理一批深度学习课程或者深度学习相关公开课的资源,持续更新,仅供参考。

1. Andrew Ng (吴恩达) 深度学习专项课程 by Coursera and deeplearning.ai

这是 Andrew Ng 老师离开百度后推出的第一个深度学习项目(deeplearning.ai)的一个课程: Deep Learning Specialization ,课程口号是:Master Deep Learning, and Break into AI. 作为 Coursera 联合创始人 和 机器学习网红课程 "Machine Learning" 的授课者,Andrew Ng 老师引领了数百万同学进入了机器学习领域,而这门深度学习课程的口号也透露了他的野心:继续带领百万人进入深度学习的圣地。

作为 Andrew Ng 老师的粉丝,依然推荐这门课程作为深度学习入门课程首选,并且建议花费上 Coursera 上的课程,一方面可以做题,另外还有证书,最重要的是它的编程作业,是理解课程内容的关键点,仅仅看视频绝对是达不到这个效果的。参考:《Andrew Ng 深度学习课程小记》和《Andrew Ng (吴恩达) 深度学习课程小结》。

2. Geoffrey Hinton 大神的 面向机器学习的神经网络(Neural Networks for Machine Learning)

Geoffrey Hinton大神的这门深度学习课程 2012年在 Coursera 上开过一轮,之后一直沉寂,直到 Coursera 新课程平台上线,这门课程已开过多轮次,来自课程图谱网友的评论:

"Deep learning必修课"

"宗派大师+开拓者直接讲课,秒杀一切二流子"

这门深度学习课程相对上面 Andrew Ng深度学习课程有一定难道,但是没有编程作业,只有Quiz.

3. 牛津大学深度学习课程(2015): Deep learning at Oxford 2015

这门深度学习课程名字虽然是 "Machine Learning 2014-2015",不过主要聚焦在深度学习的内容上,可以作为一门很系统的机器学习深度学习课程:

Machine learning techniques enable us to automatically extract features from data so as to solve predictive tasks, such as speech recognition, object recognition, machine translation, question-answering, anomaly detection, medical diagnosis and prognosis, automatic algorithm configuration, personalisation, robot control, time series forecasting, and much more. Learning systems adapt so that they can solve new tasks, related to previously encountered tasks, more efficiently.

The course focuses on the exciting field of deep learning. By drawing inspiration from neuroscience and statistics, it introduces the basic background on neural networks, back propagation, Boltzmann machines, autoencoders, convolutional neural networks and recurrent neural networks. It illustrates how deep learning is impacting our understanding of intelligence and contributing to the practical design of intelligent machines.

视频Playlist:https://www.youtube.com/playlist?list=PLE6Wd9FR--EfW8dtjAuPoTuPcqmOV53Fu

参考:“牛津大学Nando de Freitas主讲的机器学习课程,重点介绍深度学习,还请来Deepmind的Alex Graves和Karol Gregor客座报告,内容、讲解都属一流,强烈推荐! 云: http://t.cn/RA2vSNX

4. Udacity 深度学习(中/英)by Google

Udacity (优达学城)上由Google工程师主讲的免费深度学习课程,结合Google自己的深度学习工具 Tensorflow ,很不错:

机器学习是发展最快、最令人兴奋的领域之一,而深度学习则代表了机器学习中最前沿但也最有风险的一部分。在本课内容中,你将透彻理解深度学习的动机,并设计用于了解复杂和/或大量数据库的智能系统。

我们将教授你如何训练和优化基本神经网络、卷积神经网络和长短期记忆网络。你将通过项目和任务接触完整的机器学习系统 TensorFlow。你将学习解决一系列曾经以为非常具有挑战性的新问题,并在你用深度学习方法轻松解决这些问题的过程中更好地了解人工智能的复杂属性。

我们与 Google 的首席科学家兼 Google 智囊团技术经理 Vincent Vanhoucke 联合开发了本课内容。此课程提供中文版本。

5. Udacity 纳米基石学位项目:深度学习

Udacity的纳米基石学位项目,收费课程,不过据说更注重实战:

人工智能正颠覆式地改变着我们的世界,而背后推动这场进步的,正是深度学习技术。优达学城和硅谷技术明星一起,带来这门帮你系统性入门的课程。你将通过充满活力的硅谷课程内容、独家实战项目和专业代码审阅,快速掌握深度学习的基础知识和前沿应用。

你在实战项目中的每行代码都会获得专业审阅和反馈,还可以在同步学习小组中,接受学长、导师全程的辅导和督促

6. fast.ai 上的深度学习系列课程

fast.ai上提供了几门深度学习课程,课程标语很有意思:Making neural nets uncool again ,并且 Our courses (all are free and have no ads):

Deep Learning Part 1: Practical Deep Learning for Coders
Why we created the course
What we cover in the course
Deep Learning Part 2: Cutting Edge Deep Learning for Coders
Computational Linear Algebra: Online textbook and Videos
Providing a Good Education in Deep Learning—our teaching philosophy
A Unique Path to Deep Learning Expertise—our teaching approach

7. 台大李宏毅老师深度学习课程:Machine Learning and having it Deep and Structured

难得的免费中文深度学习课程:

课程主页:http://speech.ee.ntu.edu.tw/~tlkagk/courses_MLDS17.html
课程视频Playlist: https://www.youtube.com/playlist?list=PLJV_el3uVTsPMxPbjeX7PicgWbY7F8wW9
B站搬运深度学习课程视频: https://www.bilibili.com/video/av9770302/

8. 台大陈缊侬老师深度学习应用课程:Applied Deep Learning / Machine Learning and Having It Deep and Structured

据说是美女老师,这门课程16年秋季开过一次,不过没有视频,最新的这期是17年秋季课程,刚刚开课,Youtube上正在陆续放出课程视频:

16年课程主页,有Slides等相关资料:https://www.csie.ntu.edu.tw/~yvchen/f105-adl/index.html
17年课程主页,资料正在陆续放出:https://www.csie.ntu.edu.tw/~yvchen/f106-adl/
Youtube视频,目前没有playlist,可以关注其官方号放出的视频:https://www.youtube.com/channel/UCyB2RBqKbxDPGCs1PokeUiA/videos

9. Yann Lecun 深度学习公开课

"Yann Lecun 在 2016 年初于法兰西学院开课,这是其中关于深度学习的 8 堂课。当时是用法语授课,后来加入了英文字幕。
作为人工智能领域大牛和 Facebook AI 实验室(FAIR)的负责人,Yann Lecun 身处业内机器学习研究的最前沿。他曾经公开表示,现有的一些机器学习公开课内容已经有些过时。通过 Yann Lecun 的课程能了解到近几年深度学习研究的最新进展。该系列可作为探索深度学习的进阶课程。"

10. 2016 年蒙特利尔深度学习暑期班

推荐理由:看看嘉宾阵容吧,Yoshua Bengio 教授循环神经网络,Surya Ganguli 教授理论神经科学与深度学习理论,Sumit Chopra 教授 reasoning summit 和 attention,Jeff Dean 讲解 TensorFlow 大规模机器学习,Ruslan Salakhutdinov 讲解学习深度生成式模型,Ryan Olson 讲解深度学习的 GPU 编程,等等。

11. 斯坦福大学深度学习应用课程:CS231n: Convolutional Neural Networks for Visual Recognition

这门面向计算机视觉的深度学习课程由Fei-Fei Li教授掌舵,内容面向斯坦福大学学生,货真价实,评价颇高:

Computer Vision has become ubiquitous in our society, with applications in search, image understanding, apps, mapping, medicine, drones, and self-driving cars. Core to many of these applications are visual recognition tasks such as image classification, localization and detection. Recent developments in neural network (aka “deep learning”) approaches have greatly advanced the performance of these state-of-the-art visual recognition systems. This course is a deep dive into details of the deep learning architectures with a focus on learning end-to-end models for these tasks, particularly image classification. During the 10-week course, students will learn to implement, train and debug their own neural networks and gain a detailed understanding of cutting-edge research in computer vision. The final assignment will involve training a multi-million parameter convolutional neural network and applying it on the largest image classification dataset (ImageNet). We will focus on teaching how to set up the problem of image recognition, the learning algorithms (e.g. backpropagation), practical engineering tricks for training and fine-tuning the networks and guide the students through hands-on assignments and a final course project. Much of the background and materials of this course will be drawn from the ImageNet Challenge.

12. 斯坦福大学深度学习应用课程: Natural Language Processing with Deep Learning

这门课程由NLP领域的大牛 Chris Manning 和 Richard Socher 执掌,绝对是学习深度学习自然语言处理的不二法门。

Natural language processing (NLP) is one of the most important technologies of the information age. Understanding complex language utterances is also a crucial part of artificial intelligence. Applications of NLP are everywhere because people communicate most everything in language: web search, advertisement, emails, customer service, language translation, radiology reports, etc. There are a large variety of underlying tasks and machine learning models behind NLP applications. Recently, deep learning approaches have obtained very high performance across many different NLP tasks. These models can often be trained with a single end-to-end model and do not require traditional, task-specific feature engineering. In this winter quarter course students will learn to implement, train, debug, visualize and invent their own neural network models. The course provides a thorough introduction to cutting-edge research in deep learning applied to NLP. On the model side we will cover word vector representations, window-based neural networks, recurrent neural networks, long-short-term-memory models, recursive neural networks, convolutional neural networks as well as some recent models involving a memory component. Through lectures and programming assignments students will learn the necessary engineering tricks for making neural networks work on practical problems.

这门课程融合了两位授课者之前在斯坦福大学的授课课程,分别是自然语言处理课程 cs224n (Natural Language Processing)和面向自然语言处理的深度学习课程 cs224d (Deep Learning for Natural Language Processing).

13. 斯坦福大学深度学习课程: CS 20SI: Tensorflow for Deep Learning Research

准确的说,这门课程主要是针对深度学习工具Tensorflow的:

Tensorflow is a powerful open-source software library for machine learning developed by researchers at Google Brain. It has many pre-built functions to ease the task of building different neural networks. Tensorflow allows distribution of computation across different computers, as well as multiple CPUs and GPUs within a single machine. TensorFlow provides a Python API, as well as a less documented C++ API. For this course, we will be using Python.

This course will cover the fundamentals and contemporary usage of the Tensorflow library for deep learning research. We aim to help students understand the graphical computational model of Tensorflow, explore the functions it has to offer, and learn how to build and structure models best suited for a deep learning project. Through the course, students will use Tensorflow to build models of different complexity, from simple linear/logistic regression to convolutional neural network and recurrent neural networks with LSTM to solve tasks such as word embeddings, translation, optical character recognition. Students will also learn best practices to structure a model and manage research experiments.

14. 牛津大学 & DeepMind 联合的面向NLP的深度学习应用课程: Deep Learning for Natural Language Processing: 2016-2017

课程主页:https://www.cs.ox.ac.uk/teaching/courses/2016-2017/dl/

github课程项目页面:https://github.com/oxford-cs-deepnlp-2017/

课程视频Playlist: https://www.youtube.com/playlist?list=PL613dYIGMXoZBtZhbyiBqb0QtgK6oJbpm

B站搬运视频: https://www.bilibili.com/video/av9817911/

15. 卡耐基梅隆大学(CMU)深度学习应用课程:CMU CS 11-747, Fall 2017 Neural Networks for NLP

课程主页:http://phontron.com/class/nn4nlp2017/

课程视频Playlist: https://www.youtube.com/watch?v=Sss2EA4hhBQ&list=PL8PYTP1V4I8ABXzdqtOpB_eqBlVAz_xPT

16. MIT组织的一个为期一周的深度学习课程: 6.S191: Introduction to Deep Learning http://introtodeeplearning.com/

17. 奈良先端科学技術大学院大学(NAIST) 2014年推出的一个深度学习短期课程(英文授课):Deep Learning and Neural Networks

18. Deep Learning course: lecture slides and lab notebooks

欢迎大家推荐其他没有覆盖到的深度学习课程。

注:本文首发“课程图谱博客”:http://blog.coursegraph.com ,同步发布到这里,原文链接地址:http://blog.coursegraph.com/深度学习课程资源整理,转载请注明出处。

从零开始搭建深度学习服务器: 深度学习工具安装(TensorFlow + PyTorch + Torch)

Deep Learning Specialization on Coursera

这个系列写了好几篇文章,这是相关文章的索引,仅供参考:

以下是相关深度学习工具包的安装,包括Tensorflow, PyTorch, Torch等:

1. TensorFlow:

首先安装libcupti-dev

sudo apt-get install libcupti-dev

然后用 virtualenv 方式安装 Tensorflow(当前是1.4版本)

sudo apt-get install python-pip python-dev python-virtualenv 
mkdir tensorflow
cd tensorflow
virtualenv --system-site-packages venv
source venv/bin/activate
pip install --upgrade tensorflow-gpu

测试GPU:

Python 2.7.12 (default, Nov 19 2016, 06:48:10) 
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
...
2017-10-24 20:37:24.290049: I tensorflow/core/common_runtime/gpu/gpu_device.cc:955] Found device 0 with properties: 
name: GeForce GTX 1080 Ti
major: 6 minor: 1 memoryClockRate (GHz) 1.6575
pciBusID 0000:01:00.0
Total memory: 10.91GiB
Free memory: 10.52GiB
...
2017-10-24 20:37:24.387363: I tensorflow/core/common_runtime/gpu/gpu_device.cc:955] Found device 1 with properties: 
name: GeForce GTX 1080 Ti
major: 6 minor: 1 memoryClockRate (GHz) 1.6575
pciBusID 0000:02:00.0
Total memory: 10.91GiB
Free memory: 10.76GiB
2017-10-24 20:37:24.388168: I tensorflow/core/common_runtime/gpu/gpu_device.cc:976] DMA: 0 1 
2017-10-24 20:37:24.388176: I tensorflow/core/common_runtime/gpu/gpu_device.cc:986] 0:   Y Y 
2017-10-24 20:37:24.388179: I tensorflow/core/common_runtime/gpu/gpu_device.cc:986] 1:   Y Y 
2017-10-24 20:37:24.388186: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1045] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0)
2017-10-24 20:37:24.388189: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1045] Creating TensorFlow device (/gpu:1) -> (device: 1, name: GeForce GTX 1080 Ti, pci bus id: 0000:02:00.0)
Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0
/job:localhost/replica:0/task:0/gpu:1 -> device: 1, name: GeForce GTX 1080 Ti, pci bus id: 0000:02:00.0
2017-10-24 20:37:24.449867: I tensorflow/core/common_runtime/direct_session.cc:300] Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0
/job:localhost/replica:0/task:0/gpu:1 -> device: 1, name: GeForce GTX 1080 Ti, pci bus id: 0000:02:00.0
>>> 

2. PyTorch:

首先在PyTorch的官网下载对应的pip安装文件:

然后用virtualenv的方式安装,非常方便:

mkdir pytorch
cd pytorch/
virtualenv venv
source venv/bin/activate
pip install /path/to/torch-0.2.0.post3-cp27-cp27mu-manylinux1_x86_64.whl 
pip install torchvision 

3. Torch

首先按照Torch官方的方法进行安装:http://torch.ch/docs/getting-started.html

git clone https://github.com/torch/distro.git ~/torch --recursive
cd ~/torch; bash install-deps;
./install.sh

如无意外,可以顺利安装,如果遇到了如下两个问题,可按下述方法修改:

1) 执行./install.sh时出现Moses>=1.错误

Missing dependencies for nn:moses >= 1.,有时候执行./install.sh时,会出现这个问题。

用这个方法解决:

sudo apt install luarocks
sudo luarocks install moses

2) install.sh 过程中提示“error -- unsupported GNU version! gcc versions later than 5 are not supported!”

ubuntu17.04自带gcc 6.x 版本,所以降级安装gcc 4.9版本解决问题:

sudo apt-get install g++-4.9  
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.9 20  
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-4.9 20 

成功执行安装脚本后后提示:

Do you want to automatically prepend the Torch install location
to PATH and LD_LIBRARY_PATH in your /home/yourpath/.bashrc? (yes/no)
[yes] >>>
yes

安装脚本会自动将torch的安装路径写入到 .bashrc里,然后输入 th试试:

如果你想用Lua5.2替代LuaJIT的方式安装Torch(If you want to install torch with Lua 5.2 instead of LuaJIT, simply run),可按如下方式安装:

git clone https://github.com/torch/distro.git torch --recursive
cd torch

# clean old torch installation
./clean.sh

在 ~/.bashrec中设置lua的环境:
TORCH_LUA_VERSION=LUA52
并执行 source ~/.bashrc, 然后运行:

./install.sh

遇到第一个问题:

cmake: not found

安装cmake解决:
sudo apt-get install cmake

第二个问题:
readline.c:8:31: fatal error: readline/readline.h: 没有那个文件或目录

安装libreadine-dev解决:
sudo apt-get install libreadline-dev

第三个问题:安装过程依然提示“error -- unsupported GNU version! gcc versions later than 5 are not supported!”

ubuntu17.04自带gcc 6.x 版本,所以降级安装gcc 4.9版本解决问题:

sudo apt-get install g++-4.9
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.9 20
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-4.9 20

安装完毕依然会提示:

Not updating your shell profile.
You might want to
add the following lines to your shell profile:

. /home/textminer/torch/torch/install/bin/torch-activate

在 ~/.profile 文件末尾加上这行 ". /home/textminer/torch/torch/install/bin/torch-activate " 并执行 source ~/.profile,然后输入 th试试。

注:原创文章,转载请注明出处及保留链接“我爱自然语言处理”:http://www.52nlp.cn

本文链接地址:从零开始搭建深度学习服务器: 深度学习工具安装(TensorFlow + PyTorch + Torch) http://www.52nlp.cn/?p=10008

Andrew Ng 深度学习课程系列第四门课程卷积神经网络开课

Deep Learning Specialization on Coursera

Andrew Ng 深度学习课程系列第四门课程卷积神经网络(Convolutional Neural Networks)将于11月6日开课 ,不过课程资料已经放出,现在注册课程已经可以听课了 ,这门课程属于Coursera上的深度学习专项系列 ,这个系列有5门课,前三门已经开过好几轮,但是第4、第5门课程一直处于待定状态,新的一轮将于11月7号开始,感兴趣的同学可以关注:Deep Learning Specialization

This course will teach you how to build convolutional neural networks and apply it to image data. Thanks to deep learning, computer vision is working far better than just two years ago, and this is enabling numerous exciting applications ranging from safe autonomous driving, to accurate face recognition, to automatic reading of radiology images. You will: - Understand how to build a convolutional neural network, including recent variations such as residual networks. - Know how to apply convolutional networks to visual detection and recognition tasks. - Know to use neural style transfer to generate art. - Be able to apply these algorithms to a variety of image, video, and other 2D or 3D data. This is the fourth course of the Deep Learning Specialization.

个人认为这是目前互联网上最适合入门深度学习的课程系列了,Andrew Ng 老师善于讲课,另外用Python代码抽丝剥茧扣作业,课程学起来非常舒服,参考我之前写得两篇小结:

Andrew Ng 深度学习课程小记

Andrew Ng (吴恩达) 深度学习课程小结

额外推荐: 深度学习课程资源整理

Andrew Ng (吴恩达) 深度学习课程小结

Deep Learning Specialization on Coursera

Andrew Ng (吴恩达) 深度学习课程从宣布到现在大概有一个月了,我也在第一时间加入了这个Coursera上的深度学习系列课程,并且在完成第一门课“Neural Networks and Deep Learning(神经网络与深度学习)”的同时写了关于这门课程的一个小结:Andrew Ng 深度学习课程小记。之后我断断续续的完成了第二门深度学习课程“Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization"和第三门深度学习课程“Structuring Machine Learning Projects”的相关视频学习和作业练习,也拿到了课程证书。平心而论,对于一个有经验的工程师来说,这门课程的难度并不高,如果有时间,完全可以在一个周内完成三门课程的相关学习工作。但是对于一个完全没有相关经验但是想入门深度学习的同学来说,可以预先补习一下Python机器学习的相关知识,如果时间允许,建议先修一下 CourseraPython系列课程Python for Everybody Specialization 和 Andrew Ng 本人的 机器学习课程

吴恩达这个深度学习系列课 (Deep Learning Specialization) 有5门子课程,截止目前,第四门"Convolutional Neural Networks" 和第五门"Sequence Models"还没有放出,不过上周四 Coursera 发了一封邮件给学习这门课程的用户:

Dear Learners,

We hope that you are enjoying Structuring Machine Learning Projects and your experience in the Deep Learning Specialization so far!

As we are nearing the one month anniversary of the Deep Learning Specialization, we wanted to thank you for your feedback on the courses thus far, and communicate our timelines for when the next courses of the Specialization will be available.

We plan to begin the first session of Course 4, Convolutional Neural Networks, in early October, with Course 5, Sequence Models, following soon after. We hope these estimated course launch timelines will help you manage your subscription as appropriate.

If you’d like to maintain full access to current course materials on Coursera’s platform for Courses 1-3, you should keep your subscription active. Note that if you only would like to access your Jupyter Notebooks, you can save these locally. If you do not need to access these materials on platform, you can cancel your subscription and restart your subscription later, when the new courses are ready. All of your course progress in the Specialization will be saved, regardless of your decision.

Thank you for your patience as we work on creating a great learning experience for this Specialization. We look forward to sharing this content with you in the coming weeks!

Happy Learning,

Coursera

大意是第四门深度学习课程 CNN(卷积神经网络)将于10月上旬推出,第五门深度学习课程 Sequence Models(序列模型, RNN等)将紧随其后。对于付费订阅的用户,如果你想随时随地获取当前3门深度学习课程的所有资料,最好保持订阅;如果你仅仅想访问 Jupyter Notebooks,也就是获取相关的编程作业,可以先本地保存它们。你也可以现在取消订阅这门课程,直到之后的课程开始后重新订阅,你的所有学习资料将会保存。所以一个比较省钱的办法,就是现在先离线保存相关课程资料,特别是编程作业等,然后取消订阅。当然对于视频,也可以离线下载,不过现在免费访问这门课程的视频有很多办法,譬如Coursera本身的非订阅模式观看视频,或者网易云课堂免费提供了这门课程的视频部分。不过我依然觉得,吴恩达这门深度学习课程,如果仅仅观看视频,最大的功效不过30%,这门课程的精华就在它的练习和编程作业部分,特别是编程作业,非常值得揣摩,花钱很值。

再次回到 Andrew Ng 这门深度学习课程的子课程上,第二门课程是“Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization",有三周课程,包括是深度神经网络的调参、正则化方法和优化算法讲解:

第一周课程是关于深度学习的实践方面的经验 (Practical aspects of Deep Learning), 包括训练集/验证集/测试集的划分,Bias 和
Variance的问题,神经网络中解决过拟合 (Overfitting) 的 Regularization 和 Dropout 方法,以及Gradient Check等:


这周课程依然强大在编程作业上,有三个编程作业需要完成:

完成编程的作业的过程也是一个很好的回顾课程视频的过程,可以把一些听课中容易忽略的点补上。

第二周深度学习课程是关于神经网络中用到的优化算法 (Optimization algorithms),包括 Mini-batch gradient descent,RMSprop, Adam等优化算法:

编程作业也很棒,在老师循循善诱的预设代码下一步一步完成了几个优化算法。

第三周深度学习课程主要关于神经网络中的超参数调优和深度学习框架问题(Hyperparameter tuning , Batch Normalization and Programming Frameworks),顺带讲了一下多分类问题和 Softmax regression, 特别是最后一个视频简单介绍了一下 TensorFlow , 并且编程作业也是和TensorFlow相关,对于还没有学习过Tensorflow的同学,刚好是一个入门学习机会,视频介绍和作业设计都很棒:


第三门深度学习课程Structuring Machine Learning Projects”更简单一些,只有两周课程,只有 Quiz, 没有编程作业,算是Andrew Ng 老师关于深度学习或者机器学习项目方法论的一个总结:

第一周课程主要关于机器学习的策略、项目目标(可量化)、训练集/开发集/测试集的数据分布、和人工评测指标对比等:


课程虽然没有提供编程作业,但是Quiz练习是一个关于城市鸟类识别的机器学习案例研究,通过这个案例串联15个问题,对应着课程视频中的相关经验,值得玩味。

第二周课程的学习目标是:

“Understand what multi-task learning and transfer learning are
Recognize bias, variance and data-mismatch by looking at the performances of your algorithm on train/dev/test sets”

主要讲解了错误分析(Error Analysis), 不匹配训练数据和开发/测试集数据的处理(Mismatched training and dev/test set),机器学习中的迁移学习(Transfer learning)和多任务学习(Multi-task learning),以及端到端深度学习(End-to-end deep learning):

这周课程的选择题作业仍然是一个案例研究,关于无人驾驶的:Autonomous driving (case study),还是用15个问题串起视频中得知识点,体验依然很棒。

最后,关于Andrew Ng (吴恩达) 深度学习课程系列,Coursera上又启动了新一轮课程周期,9月12号开课,对于错过了上一轮学习的同学,现在加入新的一轮课程刚刚好。不过相信 Andrew Ng 深度学习课程会成为他机器学习课程之后 Coursera 上又一个王牌课程,会不断滚动推出的,所以任何时候加入都不会晚。另外,如果已经加入了这门深度学习课程,建议在学习的过程中即使保存资料,我都是一边学习一边保存这门深度学习课程的相关资料的,包括下载了课程视频用于离线观察,完成Quiz和编程作业之后都会保存一份到电脑上,方便随时查看。

索引:Andrew Ng 深度学习课程小记

注:原创文章,转载请注明出处及保留链接“我爱自然语言处理”:http://www.52nlp.cn

本文链接地址:Andrew Ng (吴恩达) 深度学习课程小结 http://www.52nlp.cn/?p=9761

深度学习服务器环境配置: Ubuntu17.04+Nvidia GTX 1080+CUDA 9.0+cuDNN 7.0+TensorFlow 1.3

Deep Learning Specialization on Coursera

这个系列写了好几篇文章,这是相关文章的索引,仅供参考:

一年前,我配置了一套“深度学习服务器”,并且写过两篇关于深度学习服务器环境配置的文章:《深度学习主机环境配置: Ubuntu16.04+Nvidia GTX 1080+CUDA8.0》 和 《深度学习主机环境配置: Ubuntu16.04+GeForce GTX 1080+TensorFlow》 , 获得了很多关注和引用。 这一年来,深度学习的大潮继续,特别是前段时间,吴恩达(Andrew Ng)在Coursera上推出了深度学习系列课程,这门面向初学者的深度学习课程,更是进一步的将深度学习的门槛降低。

前段时间这台主机出了点问题,本着“不折腾毋宁死”的原则,我重新安装了系统,并且选择了最新的Ubuntu17.04,CUDA9.0,cuDNN7.0, TensorFlow1.3,然后又是一堆坑,另外所能Google到的国内外资料目前为止基本上覆盖的还是CUDA8.0, 和cuDNN6.0, 5.0, 所以这里再次记录一下本次深度学习主机环境配置之旅。

1. 准备工作

Ubuntu17.04系统安装完毕之后,首先做两个准备工作,一个是更新apt-get的源,这次用的是网易的源:

deb http://mirrors.163.com/ubuntu/ zesty main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ zesty-security main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ zesty-updates main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ zesty-proposed main restricted universe multiverse
deb http://mirrors.163.com/ubuntu/ zesty-backports main restricted universe multiverse
deb-src http://mirrors.163.com/ubuntu/ zesty main restricted universe multiverse
deb-src http://mirrors.163.com/ubuntu/ zesty-security main restricted universe multiverse
deb-src http://mirrors.163.com/ubuntu/ zesty-updates main restricted universe multiverse
deb-src http://mirrors.163.com/ubuntu/ zesty-proposed main restricted universe multiverse
deb-src http://mirrors.163.com/ubuntu/ zesty-backports main restricted universe multiverse

另外一个事情是将pip源指向清华大学的源镜像:https://mirrors.tuna.tsinghua.edu.cn/help/pypi/,具体添加一个 ~/.config/pip/pip.conf 文件,设置为:

[global]
index-url = https://pypi.tuna.tsinghua.edu.cn/simple

这两件事情都可以加速安装相关工具包的速度,事半功倍。

然后就是给GTX1080显卡安装驱动,参考了这篇文章《How to install Nvidia Drivers on Ubuntu 17.04 & below, Linux Mint》,并且选择了这篇文章所指的最新的381.09驱动:


sudo apt-get purge nvidia*
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update && sudo apt-get install nvidia-381 nvidia-settings

安装完毕后重启电脑即可,运行nvidia-smi即可检验驱动是否安装成功。不过之后在安装CUDA9的时候,又被安利了一次384.69显卡驱动,所以我不太清楚这个过程是否有必要。

2. 安装CUDA TOOLKIT

依然前往NVIDIA的CUDA官方页面,登录后可以选择CUDA9.0版本下载:CUDA Toolkit 9.0 Release Candidate Downloads, 这次我选择的是面向ubuntu17.04的deb版本:

下载完deb文件之后按照官方给的方法按如下方式安装CUDA9:

sudo dpkg -i cuda-repo-ubuntu1704-9-0-local-rc_9.0.103-1_amd64.deb
sudo apt-key add /var/cuda-repo-9-0-local-rc/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda

安装过程中发现貌似又一次安装了显卡驱动,版本是384.69,安装完毕后运行“nvidia-smi”提示错误:Failed to initialize NVML: Driver/library version mismatch,这个时候是需要重启机器让新的版本的显卡驱动生效,再次运行“nvidia-smi”:

之后可以测试一下CUDA的相关例子,我将cuda9.0下的sample拷贝到一个临时目录下进行编译:


cp -r /usr/local/cuda-9.0/samples/ .
cd samples/
make

然后运行几个例子看一下:

textminer@textminer:~/cuda_sample/samples/1_Utilities/bandwidthTest$ ./bandwidthTest

[CUDA Bandwidth Test] - Starting...
Running on...

Device 0: GeForce GTX 1080
Quick Mode

Host to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 11258.6

Device to Host Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 12875.1

Device to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 231174.2

Result = PASS

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

textminer@textminer:~/cuda_sample/samples/6_Advanced/c++11_cuda$ ./c++11_cuda

GPU Device 0: "GeForce GTX 1080" with compute capability 6.1

Read 3223503 byte corpus from ./warandpeace.txt
counted 107310 instances of 'x', 'y', 'z', or 'w' in "./warandpeace.txt"

最后在 ~/.bashrc 里再设置一下cuda的环境变量:

export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
export CUDA_HOME=/usr/local/cuda

同时 source ~/.bashrc 让其生效。

3. 安装cuDNN

安装cuDNN很简单,不过同样需要前往NVIDIA官网:https://developer.nvidia.com/cudnn,这次我们选择的是cuDNN7, 关于cuDNN7,NVIDIA官方主页是这样写的:

What’s New in cuDNN 7?
Deep learning frameworks using cuDNN 7 can leverage new features and performance of the Volta architecture to deliver up to 3x faster training performance compared to Pascal GPUs. cuDNN 7 is now available as a free download to the members of the NVIDIA Developer Program. Highlights include:

Up to 2.5x faster training of ResNet50 and 3x faster training of NMT language translation LSTM RNNs on Tesla V100 vs. Tesla P100
Accelerated convolutions using mixed-precision Tensor Cores operations on Volta GPUs
Grouped Convolutions for models such as ResNeXt and Xception and CTC (Connectionist Temporal Classification) loss layer for temporal classification

我选择的是这个版本:cuDNN v7.0 (August 3, 2017), for CUDA 9.0 RC --- cuDNN v7.0 Library for Linux

下载完毕后解压,然后将相关文件拷贝到cuda安装目录下即可:

tar -zxvf cudnn-9.0-linux-x64-v7.tgz
sudo cp cuda/include/cudnn.h /usr/local/cuda/include/
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64/ -d
sudo chmod a+r /usr/local/cuda/include/cudnn.h
sudo chmod a+r /usr/local/cuda/lib64/libcudnn*

4. 安装Tensorflow1.3

在安装Tensorflow之前,按照Tensorflow官方安装文档的说明,先安装一个libcupti-dev库:

The libcupti-dev library, which is the NVIDIA CUDA Profile Tools Interface. This library provides advanced profiling support. To install this library, issue the following command:

$ sudo apt-get install libcupti-dev

然后通过virtualenv 的方式安装Tensorflow1.3 GUP版本,注意我用的是Python2.7:

sudo apt-get install python-pip python-dev python-virtualenv
virtualenv --system-site-packages tensorflow1.3
source tensorflow1.3/bin/activate
(tensorflow1.3) textminer@textminer:~/tensorflow/tensorflow1.3$ pip install --upgrade tensorflow-gpu

通过清华的pip源,用这种方式安装tensorflow-gpu版本速度很快:

Collecting tensorflow-gpu
Downloading https://pypi.tuna.tsinghua.edu.cn/packages/ca/c4/e39443dcdb80631a86c265fb07317e2c7ea5defe73cb531b7cd94692f8f5/tensorflow_gpu-1.3.0-cp27-cp27mu-manylinux1_x86_64.whl (158.8MB)
21% |███████ | 34.7MB 958kB/s eta 0:02:10

Successfully built markdown html5lib
Installing collected packages: backports.weakref, protobuf, funcsigs, pbr, mock, numpy, markdown, html5lib, bleach, werkzeug, tensorflow-tensorboard, tensorflow-gpu
Successfully installed backports.weakref-1.0rc1 bleach-1.5.0 funcsigs-1.0.2 html5lib-0.9999999 markdown-2.6.9 mock-2.0.0 numpy-1.13.1 pbr-3.1.1 protobuf-3.4.0 tensorflow-gpu-1.3.0 tensorflow-tensorboard-0.1.5 werkzeug-0.12.2

这种方式安装TensorFlow很方便,并且切换tensorflow的版本也很容易,如果不是下面的坑,这是我安装Tensorflow的第一选择。然后尝试运行一下tensorflow,满心期待会出现顺利导入并且有GPU的相关信息出现:

(tensorflow1.3) textminer@textminer:~/tensorflow/tensorflow1.3$ python
Python 2.7.13 (default, Jan 19 2017, 14:48:08)
[GCC 6.3.0 20170118] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf

可是却报如下错误:

File "/home/textminer/tensorflow/tensorflow1.3/local/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
ImportError: libcusolver.so.8.0: cannot open shared object file: No such file or directory

Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/install_sources#common_installation_problems

我看了一下 /usr/local/cuda/lib64/ 下有 libcusolver.so.9.0 这个文件,同时google了一下相关信息,基本上确定这是由于Tensorflow官方版本目前不支持CUDA9, 支撑CUDA8的缘故,所以这个pip版本默认找得是CUDA8.0的后缀文件: libcusolver.so.8.0 。

好在天无绝人之路,虽然这方面的资料很少,还是通过google找到了github上tensorflow的最近的两条issue: Upgrade to CuDNN 7 and CUDA 9 CUDA 9RC + cuDNN7 。前一条是请求TensorFlow官方版本支持CUDA9和cuDNN7的讨论:Please upgrade TensorFlow to support CUDA 9 and CuDNN 7. Nvidia claims this will provide a 2x performance boost on Pascal GPUs. 后一条是一个非官方方式在Tensorflow中支持CUDA9和cuDNN7的源代码安装方案:This is an unofficial and very not supported patch to make it possible to compile TensorFlow with CUDA9RC and cuDNN 7 or CUDA8 + cuDNN 7.

又是源代码安装Tensorflow, 这个方式我是不推荐的,还记得去年夏天用源代码安装Tensorflow的种种痛苦,特别是国内网络不便的情况下,这种方式更是不愿意推荐,不过不得已,我必须试一下。特别声明,如果之后Tensorflow官方版本已经支持CUDA9和cuDNN7了,请直接按上述pip方式安装,以下可以忽略。

5. 源代码方式安装Tensorflow

平心而论,严格按照github上这个10天前的issue的方法做基本上是没问题的:

git clone https://github.com/tensorflow/tensorflow.git
wget https://storage.googleapis.com/tf-performance/public/cuda9rc_patch/0001-CUDA-9.0-and-cuDNN-7.0-support.patch
wget https://storage.googleapis.com/tf-performance/public/cuda9rc_patch/eigen.f3a22f35b044.cuda9.diff
cd tensorflow/
git status
git checkout db596594b5653b43fcb558a4753b39904bb62cbd~
git apply ../0001-CUDA-9.0-and-cuDNN-7.0-support.patch
./configure
bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

但是我还是遇到了一点问题在configure之后用bazel编译tensorflow的时候遇到了如下错误:

ERROR: Skipping '//tensorflow/tools/pip_package:build_pip_package': error loading package 'tensorflow/tools/pip_package': Encountered error while reading extension file 'cuda/build_defs.bzl': no such package '@local_config_cuda//cuda

google了一下之后发现我用的是最新版的bazel_0.5.4, 回退版本是个解决方案,所以回退到了bazel_0.5.2,问题解决。这里特别备注一下configure过程的选择,仅供参考:

Please specify the location of python. [Default is /usr/bin/python]:
Found possible Python library paths:
/usr/local/lib/python2.7/dist-packages
/usr/lib/python2.7/dist-packages
Please input the desired Python library path to use. Default is /usr/local/lib/python2.7/dist-packages
Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]: Y
jemalloc as malloc support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Google Cloud Platform support? [y/N]: N
No Google Cloud Platform support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Hadoop File System support? [y/N]: N
No Hadoop File System support will be enabled for TensorFlow.

Do you wish to build TensorFlow with XLA JIT support? [y/N]:
No XLA JIT support will be enabled for TensorFlow.

Do you wish to build TensorFlow with VERBS support? [y/N]:
No VERBS support will be enabled for TensorFlow.

Do you wish to build TensorFlow with OpenCL support? [y/N]:
No OpenCL support will be enabled for TensorFlow.

Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.

Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 8.0]: 9.0
Please specify the location where CUDA 9.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
"Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 6.0]: 7
Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 6.1]
Do you want to use clang as CUDA compiler? [y/N]: N
nvcc will be used as CUDA compiler.

Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:
Do you wish to build TensorFlow with MPI support? [y/N]:
No MPI support will be enabled for TensorFlow.

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:
Add "--config=mkl" to your bazel command to build with MKL support.
Please note that MKL on MacOS or windows is still not supported.
If you would like to use a local MKL instead of downloading, please set the environment variable "TF_MKL_ROOT" every time before build.
Configuration finished

即使bazel版本正确和configure无误,第一次用bazel编译 Tensorflow 还是会遇到问题:

bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

不过这个是上述issue中专门提到的,并且给了一个Eigen patch解决方案:

Attempt to build TensorFlow, so that Eigen is downloaded. This build will fail if building for CUDA9RC but will succeed for CUDA8
bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

Apply the Eigen patch:

    cd -P bazel-out/../../../external/eigen_archive
    patch -p1 < ~/Downloads/eigen.f3a22f35b044.cuda9.diff

Build TensorFlow successfully
    cd -
    bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

再次编译Tensorflow成功,最后编译tensorflow的pip安装文件:

bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
ls /tmp/tensorflow_pkg/
tensorflow-1.3.0rc1-cp27-cp27mu-linux_x86_64.whl
sudo pip install /tmp/tensorflow_pkg/tensorflow-1.3.0rc1-cp27-cp27mu-linux_x86_64.whl

我们在ipython中试一下新安装好的Tensorflow:

Python 2.7.13 (default, Jan 19 2017, 14:48:08) 
Type "copyright", "credits" or "license" for more information.
 
IPython 5.1.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.
 
In [1]: import tensorflow as tf
 
In [2]: hello = tf.constant('Hello, Tensorflow')
 
In [3]: sess = tf.Session()
2017-09-01 13:32:08.828776: I tensorflow/core/common_runtime/gpu/gpu_device.cc:955] Found device 0 with properties: 
name: GeForce GTX 1080
major: 6 minor: 1 memoryClockRate (GHz) 1.835
pciBusID 0000:01:00.0
Total memory: 7.92GiB
Free memory: 7.62GiB
2017-09-01 13:32:08.828808: I tensorflow/core/common_runtime/gpu/gpu_device.cc:976] DMA: 0 
2017-09-01 13:32:08.828813: I tensorflow/core/common_runtime/gpu/gpu_device.cc:986] 0:   Y 
2017-09-01 13:32:08.828823: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1045] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
 
In [4]: print(sess.run(hello))
Hello, Tensorflow

终于看到GPU的相关信息了,接下来,尽情享受Tensorflow GPU版本带来的效率提升吧。

注:原创文章,转载请注明出处及保留链接“我爱自然语言处理”:http://www.52nlp.cn

本文链接地址:深度学习服务器环境配置: Ubuntu17.04+Nvidia GTX 1080+CUDA 9.0+cuDNN 7.0+TensorFlow 1.3 http://www.52nlp.cn/?p=9704

自然语言处理工具包spaCy介绍

Deep Learning Specialization on Coursera

spaCy 是一个Python自然语言处理工具包,诞生于2014年年中,号称“Industrial-Strength Natural Language Processing in Python”,是具有工业级强度的Python NLP工具包。spaCy里大量使用了 Cython 来提高相关模块的性能,这个区别于学术性质更浓的Python NLTK,因此具有了业界应用的实际价值。

安装和编译 spaCy 比较方便,在ubuntu环境下,直接用pip安装即可:

sudo apt-get install build-essential python-dev git
sudo pip install -U spacy

不过安装完毕之后,需要下载相关的模型数据,以英文模型数据为例,可以用"all"参数下载所有的数据:

sudo python -m spacy.en.download all

或者可以分别下载相关的模型和用glove训练好的词向量数据:


# 这个过程下载英文tokenizer,词性标注,句法分析,命名实体识别相关的模型
python -m spacy.en.download parser

# 这个过程下载glove训练好的词向量数据
python -m spacy.en.download glove

下载好的数据放在spacy安装目录下的data里,以我的ubuntu为例:

textminer@textminer:/usr/local/lib/python2.7/dist-packages/spacy/data$ du -sh *
776M en-1.1.0
774M en_glove_cc_300_1m_vectors-1.0.0

进入到英文数据模型下:

textminer@textminer:/usr/local/lib/python2.7/dist-packages/spacy/data/en-1.1.0$ du -sh *
424M deps
8.0K meta.json
35M ner
12M pos
84K tokenizer
300M vocab
6.3M wordnet

可以用如下命令检查模型数据是否安装成功:


textminer@textminer:~$ python -c "import spacy; spacy.load('en'); print('OK')"
OK

也可以用pytest进行测试:


# 首先找到spacy的安装路径:
python -c "import os; import spacy; print(os.path.dirname(spacy.__file__))"
/usr/local/lib/python2.7/dist-packages/spacy

# 再安装pytest:
sudo python -m pip install -U pytest

# 最后进行测试:
python -m pytest /usr/local/lib/python2.7/dist-packages/spacy --vectors --model --slow
============================= test session starts ==============================
platform linux2 -- Python 2.7.12, pytest-3.0.4, py-1.4.31, pluggy-0.4.0
rootdir: /usr/local/lib/python2.7/dist-packages/spacy, inifile:
collected 318 items

../../usr/local/lib/python2.7/dist-packages/spacy/tests/test_matcher.py ........
../../usr/local/lib/python2.7/dist-packages/spacy/tests/matcher/test_entity_id.py ....
../../usr/local/lib/python2.7/dist-packages/spacy/tests/matcher/test_matcher_bugfixes.py .....
......
../../usr/local/lib/python2.7/dist-packages/spacy/tests/vocab/test_vocab.py .......Xx
../../usr/local/lib/python2.7/dist-packages/spacy/tests/website/test_api.py x...............
../../usr/local/lib/python2.7/dist-packages/spacy/tests/website/test_home.py ............

============== 310 passed, 5 xfailed, 3 xpassed in 53.95 seconds ===============

现在可以快速测试一下spaCy的相关功能,我们以英文数据为例,spaCy目前主要支持英文和德文,对其他语言的支持正在陆续加入:


textminer@textminer:~$ ipython
Python 2.7.12 (default, Jul 1 2016, 15:12:24)
Type "copyright", "credits" or "license" for more information.

IPython 2.4.1 -- An enhanced Interactive Python.
? -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help -> Python's own help system.
object? -> Details about 'object', use 'object??' for extra details.

In [1]: import spacy

# 加载英文模型数据,稍许等待
In [2]: nlp = spacy.load('en')

Word tokenize功能,spaCy 1.2版本加了中文tokenize接口,基于Jieba中文分词:

In [3]: test_doc = nlp(u"it's word tokenize test for spacy")

In [4]: print(test_doc)
it's word tokenize test for spacy

In [5]: for token in test_doc:
print(token)
...:
it
's
word
tokenize
test
for
spacy

英文断句:


In [6]: test_doc = nlp(u'Natural language processing (NLP) deals with the application of computational models to text or speech data. Application areas within NLP include automatic (machine) translation between languages; dialogue systems, which allow a human to interact with a machine using natural language; and information extraction, where the goal is to transform unstructured text into structured (database) representations that can be searched and browsed in flexible ways. NLP technologies are having a dramatic impact on the way people interact with computers, on the way people interact with each other through the use of language, and on the way people access the vast amount of linguistic data now in electronic form. From a scientific viewpoint, NLP involves fundamental questions of how to structure formal models (for example statistical models) of natural language phenomena, and of how to design algorithms that implement these models.')

In [7]: for sent in test_doc.sents:
print(sent)
...:
Natural language processing (NLP) deals with the application of computational models to text or speech data.
Application areas within NLP include automatic (machine) translation between languages; dialogue systems, which allow a human to interact with a machine using natural language; and information extraction, where the goal is to transform unstructured text into structured (database) representations that can be searched and browsed in flexible ways.
NLP technologies are having a dramatic impact on the way people interact with computers, on the way people interact with each other through the use of language, and on the way people access the vast amount of linguistic data now in electronic form.
From a scientific viewpoint, NLP involves fundamental questions of how to structure formal models (for example statistical models) of natural language phenomena, and of how to design algorithms that implement these models.


词干化(Lemmatize):


In [8]: test_doc = nlp(u"you are best. it is lemmatize test for spacy. I love these books")

In [9]: for token in test_doc:
print(token, token.lemma_, token.lemma)
...:
(you, u'you', 472)
(are, u'be', 488)
(best, u'good', 556)
(., u'.', 419)
(it, u'it', 473)
(is, u'be', 488)
(lemmatize, u'lemmatize', 1510296)
(test, u'test', 1351)
(for, u'for', 480)
(spacy, u'spacy', 173783)
(., u'.', 419)
(I, u'i', 570)
(love, u'love', 644)
(these, u'these', 642)
(books, u'book', 1011)

词性标注(POS Tagging):


In [10]: for token in test_doc:
print(token, token.pos_, token.pos)
....:
(you, u'PRON', 92)
(are, u'VERB', 97)
(best, u'ADJ', 82)
(., u'PUNCT', 94)
(it, u'PRON', 92)
(is, u'VERB', 97)
(lemmatize, u'ADJ', 82)
(test, u'NOUN', 89)
(for, u'ADP', 83)
(spacy, u'NOUN', 89)
(., u'PUNCT', 94)
(I, u'PRON', 92)
(love, u'VERB', 97)
(these, u'DET', 87)
(books, u'NOUN', 89)

命名实体识别(NER):


In [11]: test_doc = nlp(u"Rami Eid is studying at Stony Brook University in New York")

In [12]: for ent in test_doc.ents:
print(ent, ent.label_, ent.label)
....:
(Rami Eid, u'PERSON', 346)
(Stony Brook University, u'ORG', 349)
(New York, u'GPE', 350)

名词短语提取:


In [13]: test_doc = nlp(u'Natural language processing (NLP) deals with the application of computational models to text or speech data. Application areas within NLP include automatic (machine) translation between languages; dialogue systems, which allow a human to interact with a machine using natural language; and information extraction, where the goal is to transform unstructured text into structured (database) representations that can be searched and browsed in flexible ways. NLP technologies are having a dramatic impact on the way people interact with computers, on the way people interact with each other through the use of language, and on the way people access the vast amount of linguistic data now in electronic form. From a scientific viewpoint, NLP involves fundamental questions of how to structure formal models (for example statistical models) of natural language phenomena, and of how to design algorithms that implement these models.')

In [14]: for np in test_doc.noun_chunks:
print(np)
....:
Natural language processing
Natural language processing (NLP) deals
the application
computational models
text
speech
data
Application areas
NLP
automatic (machine) translation
languages
dialogue systems
a human
a machine
natural language
information extraction
the goal
unstructured text
structured (database) representations
flexible ways
NLP technologies
a dramatic impact
the way
people
computers
the way
people
the use
language
the way
people
the vast amount
linguistic data
electronic form
a scientific viewpoint
NLP
fundamental questions
formal models
example
natural language phenomena
algorithms
these models

基于词向量计算两个单词的相似度:


In [15]: test_doc = nlp(u"Apples and oranges are similar. Boots and hippos aren't.")

In [16]: apples = test_doc[0]

In [17]: print(apples)
Apples

In [18]: oranges = test_doc[2]

In [19]: print(oranges)
oranges

In [20]: boots = test_doc[6]

In [21]: print(boots)
Boots

In [22]: hippos = test_doc[8]

In [23]: print(hippos)
hippos

In [24]: apples.similarity(oranges)
Out[24]: 0.77809414836023805

In [25]: boots.similarity(hippos)
Out[25]: 0.038474555379008429

当然,spaCy还包括句法分析的相关功能等。另外值得关注的是 spaCy 从1.0版本起,加入了对深度学习工具的支持,例如 Tensorflow 和 Keras 等,这方面具体可以参考官方文档给出的一个对情感分析(Sentiment Analysis)模型进行分析的例子:Hooking a deep learning model into spaCy.

参考:
spaCy官方文档
Getting Started with spaCy

注:原创文章,转载请注明出处及保留链接“我爱自然语言处理”:http://www.52nlp.cn

本文链接地址:自然语言处理工具包spaCy介绍 http://www.52nlp.cn/?p=9386