A Reading Guide to the Conditional Random Fields Literature

By 52nlp

May 24, 2010 #Brown90, #CRF, #Hanna Wallach, #John D. Lafferty, #literature, #maximum entropy models, #machine learning, #conditional random fields, #natural language processing

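Like the maximum entropy model, conditional random fields (CRFs) are a machine learning model that performs well in many areas of natural language processing, such as part-of-speech tagging, Chinese word segmentation, and named entity recognition. CRFs were first proposed by John D. Lafferty, who is also one of the authors of Brown90. Like Jelinek, he continued with academic research after leaving IBM, in his case at Carnegie Mellon University, and in 2001 he published, as first author, the classic CRF paper "Conditional random fields: Probabilistic models for segmenting and labeling sequence data".
For references and other material on CRFs, the "conditional random fields" page that Hanna Wallach compiled and maintained in 2005 is very good. It covers many of the classic papers published since CRFs were introduced in 2001 (although it seems to stop at 2005 and has not been updated since) as well as several related toolkits (although CRF++ is not among them). It is still well worth consulting for readers getting started with CRFs. What follows is excerpted from that page.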

introduction

Conditional random fields (CRFs) are a probabilistic framework for labeling and segmenting structured data, such as sequences, trees and lattices. The underlying idea is that of defining a conditional probability distribution over label sequences given a particular observation sequence, rather than a joint distribution over both label and observation sequences. The primary advantage of CRFs over hidden Markov models is their conditional nature, resulting in the relaxation of the independence assumptions required by HMMs in order to ensure tractable inference. Additionally, CRFs avoid the label bias problem, a weakness exhibited by maximum entropy Markov models (MEMMs) and other conditional Markov models based on directed graphical models. CRFs outperform both MEMMs and HMMs on a number of real-world tasks in many fields, including bioinformatics, computational linguistics and speech recognition.
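To make the idea of a conditional distribution over label sequences more concrete, here is a minimal Python sketch of a linear-chain CRF. It is only an illustration under stated assumptions, not code from any of the papers or toolkits listed on this page. The names `emit` and `trans` are hypothetical: `emit[t][y]` stands for the weighted feature score of label `y` at position `t` (which may depend on the whole observation sequence), and `trans[a][b]` is the score of moving from label `a` to label `b`. The normalizer Z(x) is computed with the forward algorithm in log space.

```python
import itertools
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def sequence_score(labels, emit, trans):
    """Unnormalized log-score of one label sequence."""
    score = sum(emit[t][y] for t, y in enumerate(labels))
    score += sum(trans[a][b] for a, b in zip(labels, labels[1:]))
    return score


def log_partition(emit, trans):
    """log Z(x): log of the summed exp-scores of all label sequences."""
    n_labels = len(trans)
    alpha = list(emit[0])  # forward scores at position 0
    for t in range(1, len(emit)):
        alpha = [
            emit[t][j]
            + log_sum_exp([alpha[i] + trans[i][j] for i in range(n_labels)])
            for j in range(n_labels)
        ]
    return log_sum_exp(alpha)


def conditional_prob(labels, emit, trans):
    """p(labels | observations) for the hypothetical scores above."""
    return math.exp(sequence_score(labels, emit, trans) - log_partition(emit, trans))


if __name__ == "__main__":
    # Toy scores for 3 positions and 2 labels (made-up numbers).
    emit = [[1.0, 0.2], [0.3, 0.9], [0.5, 0.5]]
    trans = [[0.6, -0.4], [-0.2, 0.8]]
    total = sum(
        conditional_prob(list(y), emit, trans)
        for y in itertools.product(range(2), repeat=3)
    )
    print(total)  # probabilities over all label sequences sum to 1 (up to rounding)
```

Because the model only normalizes over label sequences for a given observation sequence, the scores in `emit` are free to use arbitrary, overlapping features of the input, which is the relaxation of independence assumptions that the paragraph above describes.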

tutorial

Hanna M. Wallach. Conditional Random Fields: An Introduction. Technical Report MS-CIS-04-21. Department of Computer and Information Science, University of Pennsylvania, 2004.

papers by year

2001

John Lafferty, Andrew McCallum, Fernando Pereira. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. In Proceedings of the Eighteenth International Conference on Machine Learning (ICML-2001), 2001.

2002

Hanna Wallach. Efficient Training of Conditional Random Fields. M.Sc. thesis, Division of Informatics, University of Edinburgh, 2002.

Thomas G. Dietterich. Machine Learning for Sequential Data: A Review. In Structural, Syntactic, and Statistical Pattern Recognition; Lecture Notes in Computer Science, Vol. 2396, T. Caelli (Ed.), pp. 15–30, Springer-Verlag, 2002.

2003

Fei Sha and Fernando Pereira. Shallow Parsing with Conditional Random Fields. In Proceedings of the 2003 Human Language Technology Conference and North American Chapter of the Association for Computational Linguistics (HLT/NAACL-03), 2003.

Andrew McCallum. Efficiently Inducing Features of Conditional Random Fields. In Proceedings of the 19th Conference in Uncertainty in Artificial Intelligence (UAI-2003), 2003.

David Pinto, Andrew McCallum, Xing Wei and W. Bruce Croft. Table Extraction Using Conditional Random Fields. In Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2003), 2003.

Andrew McCallum and Wei Li. Early Results for Named Entity Recognition with Conditional Random Fields, Feature Induction and Web-Enhanced Lexicons. In Proceedings of the Seventh Conference on Natural Language Learning (CoNLL), 2003.

Wei Li and Andrew McCallum. Rapid Development of Hindi Named Entity Recognition Using Conditional Random Fields and Feature Induction. In ACM Transactions on Asian Language Information Processing (TALIP), 2003.

Yasemin Altun and Thomas Hofmann. Large Margin Methods for Label Sequence Learning. In Proceedings of 8th European Conference on Speech Communication and Technology (EuroSpeech), 2003.

Simon Lacoste-Julien. Combining SVM with graphical models for supervised classification: an introduction to Max-Margin Markov Networks. CS281A Project Report, UC Berkeley, 2003.

2004

Andrew McCallum, Khashayar Rohanimanesh and Charles Sutton. Dynamic Conditional Random Fields for Jointly Labeling Multiple Sequences. Workshop on Syntax, Semantics, Statistics; 16th Annual Conference on Neural Information Processing Systems (NIPS 2003), 2004.

Kevin Murphy, Antonio Torralba and William T.F. Freeman. Using the forest to see the trees: a graphical model relating features, objects and scenes. In Advances in Neural Information Processing Systems 16 (NIPS 2003), 2004.

Sanjiv Kumar and Martial Hebert. Discriminative Fields for Modeling Spatial Dependencies in Natural Images. In Advances in Neural Information Processing Systems 16 (NIPS 2003), 2004.

Ben Taskar, Carlos Guestrin and Daphne Koller. Max-Margin Markov Networks. In Advances in Neural Information Processing Systems 16 (NIPS 2003), 2004.

Burr Settles. Biomedical Named Entity Recognition Using Conditional Random Fields and Rich Feature Sets. To appear in Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and its Applications (NLPBA), 2004.

A demo of the system can be downloaded here.

Charles Sutton, Khashayar Rohanimanesh and Andrew McCallum. Dynamic Conditional Random Fields: Factorized Probabilistic Models for Labeling and Segmenting Sequence Data. In Proceedings of the Twenty-First International Conference on Machine Learning (ICML 2004), 2004.

John Lafferty, Xiaojin Zhu and Yan Liu. Kernel conditional random fields: representation and clique selection. In Proceedings of the Twenty-First International Conference on Machine Learning (ICML 2004), 2004.

Xuming He, Richard Zemel, and Miguel Á. Carreira-Perpiñán. Multiscale conditional random fields for image labelling. In Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2004), 2004.

Yasemin Altun, Alex J. Smola, Thomas Hofmann. Exponential Families for Conditional Random Fields. In Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence (UAI-2004), 2004.

Michelle L. Gregory and Yasemin Altun. Using Conditional Random Fields to Predict Pitch Accents in Conversational Speech. In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL 2004), 2004.

Brian Roark, Murat Saraclar, Michael Collins and Mark Johnson. Discriminative Language Modeling with Conditional Random Fields and the Perceptron Algorithm. In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL 2004), 2004.

Ryan McDonald and Fernando Pereira. Identifying Gene and Protein Mentions in Text Using Conditional Random Fields. BioCreative, 2004.

Trausti T. Kristjansson, Aron Culotta, Paul Viola and Andrew McCallum. Interactive Information Extraction with Constrained Conditional Random Fields. In Proceedings of the Nineteenth National Conference on Artificial Intelligence (AAAI 2004), 2004.

Thomas G. Dietterich, Adam Ashenfelter and Yaroslav Bulatov. Training Conditional Random Fields via Gradient Tree Boosting. In Proceedings of the Twenty-First International Conference on Machine Learning (ICML 2004), 2004.

John Lafferty, Yan Liu and Xiaojin Zhu. Kernel Conditional Random Fields: Representation, Clique Selection, and Semi-Supervised Learning. Technical Report CMU-CS-04-115, Carnegie Mellon University, 2004.

Fuchun Peng and Andrew McCallum (2004). Accurate Information Extraction from Research Papers using Conditional Random Fields. In Proceedings of Human Language Technology Conference and North American Chapter of the Association for Computational Linguistics (HLT/NAACL-04), 2004.

Yasemin Altun, Thomas Hofmann and Alexander J. Smola. Gaussian process classification for segmenting and annotating sequences. In Proceedings of the Twenty-First International Conference on Machine Learning (ICML 2004), 2004.

Yasemin Altun and Thomas Hofmann. Gaussian Process Classification for Segmenting and Annotating Sequences. Technical Report CS-04-12, Department of Computer Science, Brown University, 2004.

2005

Cristian Sminchisescu, Atul Kanaujia, Zhiguo Li and Dimitris Metaxas. Conditional Models for Contextual Human Motion Recognition. In Proceedings of the International Conference on Computer Vision (ICCV 2005), Beijing, China, 2005.

Ariadna Quattoni, Michael Collins and Trevor Darrell. Conditional Random Fields for Object Recognition. In Advances in Neural Information Processing Systems 17 (NIPS 2004), 2005.

Joseph Bockhorst and Mark Craven. Markov Networks for Detecting Overlapping Elements in Sequence Data. In Advances in Neural Information Processing Systems 17 (NIPS 2004), 2005.

Antonio Torralba, Kevin P. Murphy, William T. Freeman. Contextual models for object detection using boosted random fields. In Advances in Neural Information Processing Systems 17 (NIPS 2004), 2005.

Sunita Sarawagi and William W. Cohen. Semi-Markov Conditional Random Fields for Information Extraction. In Advances in Neural Information Processing Systems 17 (NIPS 2004), 2005.

Yuan Qi, Martin Szummer and Thomas P. Minka. Bayesian Conditional Random Fields. To appear in Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics (AISTATS 2005), 2005.

Aron Culotta, David Kulp and Andrew McCallum. Gene Prediction with Conditional Random Fields. Technical Report UM-CS-2005-028. University of Massachusetts, Amherst, 2005.

Yang Wang and Qiang Ji. A Dynamic Conditional Random Field Model for Object Segmentation in Image Sequences. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2005), Volume 1, 2005.

2010

An Introduction to Conditional Random Fields. Charles Sutton, Andrew McCallum. Foundations and Trends in Machine Learning. To appear. 2011.
(Note: this paper was recommended by boycat, an expert on the NLP board of the 水木 (Shuimu) BBS. Many thanks.)

software

MALLET: A Machine Learning for Language Toolkit.

MALLET is an integrated collection of Java code useful for statistical natural language processing, document classification, clustering, information extraction, and other machine learning applications to text.

ABNER: A Biomedical Named Entity Recognizer.

ABNER is a text analysis tool for molecular biology. It is essentially an interactive, user-friendly interface to a system designed as part of the NLPBA/BioNLP 2004 Shared Task challenge.

MinorThird.

MinorThird is a collection of Java classes for storing text, annotating text, and learning to extract entities and categorize text.

Kevin Murphy's MATLAB CRF code.

Conditional random fields (chains, trees and general graphs; includes BP code).

Sunita Sarawagi's CRF package.

The CRF package is a Java implementation of conditional random fields for sequential labeling.

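Finally, I recommend CRF++: Yet Another CRF toolkit. Readers interested in Chinese word segmentation based on character tagging can use this toolkit to quickly build a CRF-based Chinese word segmenter, and its performance is quite good too.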
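As a rough illustration of what character tagging means for segmentation, the sketch below converts a whitespace-segmented sentence into per-character training lines and turns predicted tags back into words. The four-tag B/M/E/S scheme and the function names are my own assumptions for illustration; other tag sets are also used. The output layout follows my understanding of the CRF++ training format: one token per line, columns separated by whitespace, the label in the last column, and an empty line between sentences.

```python
def word_to_tags(word):
    """Tag each character of a word: S for single, else B, M..., E."""
    if len(word) == 1:
        return ["S"]
    return ["B"] + ["M"] * (len(word) - 2) + ["E"]


def to_crfpp_lines(segmented_sentence):
    """Turn 'w1 w2 w3' into 'char<TAB>tag' lines, one per character."""
    lines = []
    for word in segmented_sentence.split():
        for ch, tag in zip(word, word_to_tags(word)):
            lines.append(f"{ch}\t{tag}")
    return lines


def tags_to_words(chars, tags):
    """Rebuild words from characters and their predicted tags."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):  # a word ends here
            words.append(current)
            current = ""
    if current:  # tolerate a sequence that ends mid-word
        words.append(current)
    return words


if __name__ == "__main__":
    for line in to_crfpp_lines("我 爱 自然语言 处理"):
        print(line)
    print()  # an empty line separates sentences in the training file
```

A trained tagger then only has to predict one of the four tags per character, and `tags_to_words` recovers the segmentation, which is why unknown words need no separate handling step in this setup.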

Note: when reposting, please credit the source, "我爱自然语言处理" (I Love Natural Language Processing): www.52nlp.cn

Permalink: https://www.52nlp.cn/条件随机场文献阅读指南


37 comments on "A Reading Guide to the Conditional Random Fields Literature"

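luojia said:

May 31, 2010, 19:06

Some papers split Chinese word segmentation into two stages: a coarse segmentation in the first stage, then unknown-word recognition and other work in the second. Does this two-stage approach have any advantage over segmentation done in a single step?

52nlp replied:
June 1, 2010 at 00:45
Unknown-word recognition is a very important part of Chinese word segmentation. The most basic segmentation can reach about 90% accuracy given a sufficiently large dictionary, but for the remaining 10% the main problem probably lies with these unknown words, so this step is crucial. Even character-tagging-based segmentation implicitly handles unknown words; it simply treats them the same way as everything else in the process. I recommend reading Professor Huang Changning's 《中文分词十年综述》 (a ten-year survey of Chinese word segmentation), which should give you a clearer picture.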

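沙克 said:

July 8, 2010, 23:17

There are also some good CRF learning examples and CRF programs from Chinese authors (for example pocket-CRF, a higher-order CRF program), but because searching for and downloading papers is very limited outside the education network, they have not spread widely.

You are welcome to join the CRF discussion group: 109709858

52nlp replied:
July 9, 2010 at 06:58
Thank you very much!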

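一体同观分：文献综述 | 六月流火 said:

October 18, 2010, 06:11

[...] or first write out a small part like this: Like the maximum entropy model, conditional random fields (CRFs) are a machine learning model that performs well in many areas of natural language processing, such as part-of-speech tagging, Chinese word segmentation, and named entity recognition. CRFs were first proposed by John D. Lafferty, who is also one of the authors of Brown90. Like Jelinek, he continued with academic research after leaving IBM, in his case at Carnegie Mellon University, and in 2001 he published, as first author, the classic CRF paper "Conditional random fields: Probabilistic models for segmenting and labeling sequence data". For references and other material on CRFs, the "conditional random fields" page that Hanna Wallach compiled and maintained in 2005 is very good. It covers many of the classic papers published since CRFs were introduced in 2001 (although it seems to stop at 2005 and has not been updated since) as well as several related toolkits (although CRF++ is not among them). It is still well worth consulting for readers getting started with CRFs. What follows is excerpted from that page [...]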
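Tsing said:

December 12, 2010, 16:51

Hello, I am learning to use CRF++, but it does not run properly under Windows 7 (using the train.data and template from the example directory): it exits abnormally right after reading the data. I wonder whether 52nlp (^__^ I am not sure that is the right way to address you) has any experience using CRF++ on Windows.

52nlp replied:
December 12, 2010 at 19:24
Sorry, I have never used Win7 and have no experience using CRF++ on Windows. Why not consider using it on Linux?

HelloWord.exe replied:
October 24, 2014 at 16:02
The latest version of CRF++ runs fine on Win7; I don't know about earlier versions.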

丕子 said:

December 21, 2010, 13:34

I have been reading about CRFs recently.

52nlp replied:
December 21, 2010 at 21:31
I read your blog post. It is well written!

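沙克 said:

January 29, 2011, 13:45

Hello, I have recently been trying to implement a CRF and have some questions about feature templates (specifically, how exactly they are defined). Do you have any good learning material to recommend? Thanks.

52nlp replied:
January 30, 2011 at 11:09
I am not very clear on the details of CRF implementation either. By "feature template", do you mean a template defined for a particular application (such as Chinese word segmentation), or the generalized template inside the CRF itself?
As for implementation details, you could look at how several open-source implementations define them. Sorry, that is about all I know.

沙克 replied:
January 30, 2011 at 12:04
I mean the templates used for a specific task. In CRF++, the templates for the different tasks (chunk, seg..) all differ, but they are all similar. I will go and look for more resources. Thanks :-D

52nlp replied:
January 30, 2011 at 13:11
The templates defined for these tasks (chunk, seg...) presumably come from the feature extraction methods described in the related papers. For chunk and seg in particular, the character-tagging approach has had a large influence, and these papers follow a traceable line of development, so reading them should help you find your way.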
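learningboy said:

February 19, 2011, 15:31

Can CRFs be used for image labeling, that is, image segmentation?

52nlp replied:
February 19, 2011 at 21:10
I have not worked on images, so I am not sure.

emnlp replied:
April 20, 2011 at 11:18
Yes, hehe, I will answer for the blogger.
But you need to build the model according to your own needs.

sabrina replied:
May 22, 2013 at 15:44
Do you know CRF2D? Is there a corresponding paper?

52nlp replied:
May 22, 2013 at 20:47
Sorry, I don't know.

Lily replied:
October 23, 2014 at 19:31
Hello, I would like to ask about using CRFs for image segmentation.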

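emnlp said:

April 20, 2011, 11:27

Blogger, please add this one.
It is good.
An Introduction to Conditional Random Fields. Charles Sutton, Andrew McCallum. Foundations and Trends in Machine Learning. To appear. 2011.

52nlp replied:
April 20, 2011 at 14:04
Thanks! I will add it shortly. This place has been much livelier these past couple of days because of you, haha!

emnlp replied:
April 22, 2011 at 23:18
Hehe :)
I have had some free time lately.
I am just using a different alias (mj), hoho.

52nlp replied:
April 23, 2011 at 13:39
And who are you without the alias?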
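maomaoyu10 said:

September 6, 2011, 14:58

An Introduction to Conditional Random Fields. Charles Sutton, Andrew McCallum. Foundations and Trends in Machine Learning. To appear. 2011.
Looking at the table of contents, this seems to be an extended version of the same two authors' 2006 "An Introduction to Conditional Random Fields for Relational Learning". Compared with its 90 pages, I think the 2006 paper is a better fit for beginners. It goes from HMMs to linear-chain CRFs and then to general CRFs; it is short and concise, with a very clear thread, touching each point just enough, and it has highlights such as the comparison between generative and discriminative models. However, the last four equations in the "From HMMs to CRFs" subsection of that short paper all seem to be missing the sum over time t, which is restored right afterwards in the parameter estimation part. It is probably a slip, so just keep it in mind when reading.

yky replied:
March 20, 2013 at 09:02
How should I understand the statement in the generative versus discriminative section that "naive Bayes cannot handle data with highly correlated features, whereas LR can"? I really do not understand what the authors mean, and no example is given.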

HAHA said:

December 17, 2012, 00:04

What are the connections and differences between CRFs and Structured SVM and Structured Perceptron?

52nlp replied:
December 20, 2012 at 19:16
Sorry, I am not very familiar with the latter two.

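Yu said:

January 13, 2013, 17:27

Knowing pitifully little about CRFs, I was pointed here by an expert.
Thanks to the blogger for such an impressive

Also, I have reposted the full text at http://argcandargv.com/archives/983
If the attribution needs changing, or if you feel my repost infringes on you in any way, please let me know and I will modify or delete it as you wish as quickly as possible.

52nlp replied:
January 14, 2013 at 09:12
No need to be so formal. Crediting the source when reposting is enough.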

Yu said:

January 13, 2013, 19:54

Also, some of the links are broken.
If possible, I hope you can maintain them. Many thanks.
I found one:
Biomedical Named Entity Recognition Using Conditional Random Fields and Rich Feature Sets
The correct link for it is:
http://www.cs.cmu.edu/~bsettles/pub/settles.nlpba04.pdf

52nlp replied:
January 14, 2013 at 09:14
I have updated this one, thanks.

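中文分词入门之字标注法4 | 我爱自然语言处理 said:

December 31, 2013, 13:00

[…] For background on conditional random fields (CRFs), I recommend reading some of the classic literature: 《条件随机场文献阅读指南》 (A Reading Guide to the Conditional Random Fields Literature). I also recommend one more tutorial, "Classical Probabilistic Models and Conditional Random Fields". It introduces CRFs through the relationships among probabilistic models (NB, HMM, ME, CRF) and through the background of probabilistic graphical models, and is fairly clear: […]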
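likehood said:

December 29, 2014, 18:04

I just skimmed through the HMM material. Thanks for your hard work, 52nlp! This may be my entry point into probabilistic graphical models 🙂
Having finished that, I came here to CRFs. To the recommended material I would like to add Notes for a tutorial at CIKM'08 by Charles Elkan
Log-linear models and conditional random fields: http://www.cs.columbia.edu/~smaskey/CS6998-0412/supportmaterial/cikmtutorial.pdf

Starting from log-linear models, it extends the model, learning and prediction of what the reader already knows (maximum entropy) to CRFs, unifying them under a single framework.

52nlp replied:
December 30, 2014 at 00:15
thanks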

LucasYang said:

March 11, 2015, 16:10

Hello, I would like to read CRF source code. Do you have any recommendations or suggestions? Thanks.

52nlp replied:
March 13, 2015 at 11:57
I suggest CRF++.
