1. Andrew Ng 老师的 机器学习课程（Machine Learning）
机器学习入门首选课程，没有之一。这门课程从一开始诞生就备受瞩目，据说全世界有数百万人通过这门课程入门机器学习。课程的级别是入门级别的，对学习者的背景要求不高，Andrew Ng 老师讲解的又很通俗易懂，所以强烈推荐从这门课程开始走入机器学习。课程简介：
机器学习是一门研究在非特定编程条件下让计算机采取行动的学科。最近二十年，机器学习为我们带来了自动驾驶汽车、实用的语音识别、高效的网络搜索，让我们对人类基因的解读能力大大提高。当今机器学习技术已经非常普遍，您很可能在毫无察觉情况下每天使用几十次。许多研究者还认为机器学习是人工智能（AI）取得进展的最有效途径。在本课程中，您将学习最高效的机器学习技术，了解如何使用这些技术，并自己动手实践这些技术。更重要的是，您将不仅将学习理论知识，还将学习如何实践，如何快速使用强大的技术来解决新问题。最后，您将了解在硅谷企业如何在机器学习和AI领域进行创新。 本课程将广泛介绍机器学习、数据挖掘和统计模式识别。相关主题包括：(i) 监督式学习（参数和非参数算法、支持向量机、核函数和神经网络）。(ii) 无监督学习（集群、降维、推荐系统和深度学习）。(iii) 机器学习实例（偏见/方差理论；机器学习和AI领域的创新）。课程将引用很多案例和应用，您还需要学习如何在不同领域应用学习算法，例如智能机器人（感知和控制）、文本理解（网络搜索和垃圾邮件过滤）、计算机视觉、医学信息学、音频、数据库挖掘等领域。
Machine learning is the study that allows computers to adaptively improve their performance with experience accumulated from the data observed. Our two sister courses teach the most fundamental algorithmic, theoretical and practical tools that any user of machine learning needs to know. This first course of the two would focus more on mathematical tools, and the other course would focus more on algorithmic tools. [機器學習旨在讓電腦能由資料中累積的經驗來自我進步。我們的兩項姊妹課程將介紹各領域中的機器學習使用者都應該知道的基礎演算法、理論及實務工具。本課程將較為著重數學類的工具，而另一課程將較為著重方法類的工具。]
Machine learning is the study that allows computers to adaptively improve their performance with experience accumulated from the data observed. Our two sister courses teach the most fundamental algorithmic, theoretical and practical tools that any user of machine learning needs to know. This second course of the two would focus more on algorithmic tools, and the other course would focus more on mathematical tools. [機器學習旨在讓電腦能由資料中累積的經驗來自我進步。我們的兩項姊妹課程將介紹各領域中的機器學習使用者都應該知道的基礎演算法、理論及實務工具。本課程將較為著重方法類的工具，而另一課程將較為著重數學類的工具。
4. 华盛顿大学的 "机器学习专项课程（Machine Learning Specialization）"
这个系列课程包含4门子课程，分别是 机器学习基础：案例研究 , 机器学习：回归 , 机器学习：分类， 机器学习：聚类与检索:
This Specialization from leading researchers at the University of Washington introduces you to the exciting, high-demand field of Machine Learning. Through a series of practical case studies, you will gain applied experience in major areas of Machine Learning including Prediction, Classification, Clustering, and Information Retrieval. You will learn to analyze large and complex datasets, create systems that adapt and improve over time, and build intelligent applications that can make predictions from data.
Do you have data and wonder what it can tell you? Do you need a deeper understanding of the core ways in which machine learning can improve your business? Do you want to be able to converse with specialists about anything from regression and classification to deep learning and recommender systems? In this course, you will get hands-on experience with machine learning from a series of practical case-studies. At the end of the first course you will have studied how to predict house prices based on house-level features, analyze sentiment from user reviews, retrieve documents of interest, recommend products, and search for images. Through hands-on practice with these use cases, you will be able to apply machine learning methods in a wide range of domains. This first course treats the machine learning method as a black box. Using this abstraction, you will focus on understanding tasks of interest, matching these tasks to machine learning tools, and assessing the quality of the output. In subsequent courses, you will delve into the components of this black box by examining models and algorithms. Together, these pieces form the machine learning pipeline, which you will use in developing intelligent applications. Learning Outcomes: By the end of this course, you will be able to: -Identify potential applications of machine learning in practice. -Describe the core differences in analyses enabled by regression, classification, and clustering. -Select the appropriate machine learning task for a potential application. -Apply regression, classification, clustering, retrieval, recommender systems, and deep learning. -Represent your data as features to serve as input to machine learning models. -Assess the model quality in terms of relevant error metrics for each task. -Utilize a dataset to fit a model to analyze new data. -Build an end-to-end application that uses machine learning at its core. -Implement these techniques in Python.
这门课程关注机器学习里面的一个基本问题: 回归（Regression)， 也通过案例研究（预测房价）的方式进行回归问题的学习，最终通过Python实现相关的机器学习算法。
Case Study - Predicting Housing Prices In our first case study, predicting house prices, you will create models that predict a continuous value (price) from input features (square footage, number of bedrooms and bathrooms,...). This is just one of the many places where regression can be applied. Other applications range from predicting health outcomes in medicine, stock prices in finance, and power usage in high-performance computing, to analyzing which regulators are important for gene expression. In this course, you will explore regularized linear regression models for the task of prediction and feature selection. You will be able to handle very large sets of features and select between models of various complexity. You will also analyze the impact of aspects of your data -- such as outliers -- on your selected models and predictions. To fit these models, you will implement optimization algorithms that scale to large datasets. Learning Outcomes: By the end of this course, you will be able to: -Describe the input and output of a regression model. -Compare and contrast bias and variance when modeling data. -Estimate model parameters using optimization algorithms. -Tune parameters with cross validation. -Analyze the performance of the model. -Describe the notion of sparsity and how LASSO leads to sparse solutions. -Deploy methods to select between models. -Exploit the model to form predictions. -Build a regression model to predict prices using a housing dataset. -Implement these techniques in Python.
这门课程关注机器学习里面的另一个基本问题: 分类（Classification)， 通过两个案例研究进行学习：情感分析和贷款违约预测，最终通过Python实现相关的算法（也可以选择其他语言，但是强烈推荐Python)。
Case Studies: Analyzing Sentiment & Loan Default Prediction In our case study on analyzing sentiment, you will create models that predict a class (positive/negative sentiment) from input features (text of the reviews, user profile information,...). In our second case study for this course, loan default prediction, you will tackle financial data, and predict when a loan is likely to be risky or safe for the bank. These tasks are an examples of classification, one of the most widely used areas of machine learning, with a broad array of applications, including ad targeting, spam detection, medical diagnosis and image classification. In this course, you will create classifiers that provide state-of-the-art performance on a variety of tasks. You will become familiar with the most successful techniques, which are most widely used in practice, including logistic regression, decision trees and boosting. In addition, you will be able to design and implement the underlying algorithms that can learn these models at scale, using stochastic gradient ascent. You will implement these technique on real-world, large-scale machine learning tasks. You will also address significant tasks you will face in real-world applications of ML, including handling missing data and measuring precision and recall to evaluate a classifier. This course is hands-on, action-packed, and full of visualizations and illustrations of how these techniques will behave on real data. We've also included optional content in every module, covering advanced topics for those who want to go even deeper! Learning Objectives: By the end of this course, you will be able to: -Describe the input and output of a classification model. -Tackle both binary and multiclass classification problems. -Implement a logistic regression model for large-scale classification. -Create a non-linear model using decision trees. -Improve the performance of any model using boosting. -Scale your methods with stochastic gradient ascent. -Describe the underlying decision boundaries. -Build a classification model to predict sentiment in a product review dataset. -Analyze financial data to predict loan defaults. -Use techniques for handling missing data. -Evaluate your models using precision-recall metrics. -Implement these techniques in Python (or in the language of your choice, though Python is highly recommended).
Case Studies: Finding Similar Documents A reader is interested in a specific news article and you want to find similar articles to recommend. What is the right notion of similarity? Moreover, what if there are millions of other documents? Each time you want to a retrieve a new document, do you need to search through all other documents? How do you group similar documents together? How do you discover new, emerging topics that the documents cover? In this third case study, finding similar documents, you will examine similarity-based algorithms for retrieval. In this course, you will also examine structured representations for describing the documents in the corpus, including clustering and mixed membership models, such as latent Dirichlet allocation (LDA). You will implement expectation maximization (EM) to learn the document clusterings, and see how to scale the methods using MapReduce. Learning Outcomes: By the end of this course, you will be able to: -Create a document retrieval system using k-nearest neighbors. -Identify various similarity metrics for text data. -Reduce computations in k-nearest neighbor search by using KD-trees. -Produce approximate nearest neighbors using locality sensitive hashing. -Compare and contrast supervised and unsupervised learning tasks. -Cluster documents by topic using k-means. -Describe how to parallelize k-means using MapReduce. -Examine probabilistic clustering approaches using mixtures models. -Fit a mixture of Gaussian model using expectation maximization (EM). -Perform mixed membership modeling using latent Dirichlet allocation (LDA). -Describe the steps of a Gibbs sampler and how to use its output to draw inferences. -Compare and contrast initialization techniques for non-convex optimization objectives. -Implement these techniques in Python.
Python机器学习应用课程，这门课程主要聚焦在通过Python应用机器学习，包括机器学习和统计学的区别，机器学习工具包scikit-learn的介绍，有监督学习和无监督学习，数据泛化问题（例如交叉验证和过拟合）等。这门课程同时属于"Python数据科学应用专项课程系列（Applied Data Science with Python Specialization）"。
This course will introduce the learner to applied machine learning, focusing more on the techniques and methods than on the statistics behind these methods. The course will start with a discussion of how machine learning is different than descriptive statistics, and introduce the scikit learn toolkit. The issue of dimensionality of data will be discussed, and the task of clustering data, as well as evaluating those clusters, will be tackled. Supervised approaches for creating predictive models will be described, and learners will be able to apply the scikit learn predictive modelling methods while understanding process issues related to data generalizability (e.g. cross validation, overfitting). The course will end with a look at more advanced techniques, such as building ensembles, and practical limitations of predictive models. By the end of this course, students will be able to identify the difference between a supervised (classification) and unsupervised (clustering) technique, identify which technique they need to apply for a particular dataset and need, engineer features to meet that need, and write python code to carry out an analysis. This course should be taken after Introduction to Data Science in Python and Applied Plotting, Charting & Data Representation in Python and before Applied Text Mining in Python and Applied Social Analysis in Python.
6. 俄罗斯国立高等经济学院和Yandex联合推出的 高级机器学习专项课程系列（Advanced Machine Learning Specialization）
This specialization gives an introduction to deep learning, reinforcement learning, natural language understanding, computer vision and Bayesian methods. Top Kaggle machine learning practitioners and CERN scientists will share their experience of solving real-world problems and help you to fill the gaps between theory and practice. Upon completion of 7 courses you will be able to apply modern machine learning methods in enterprise and understand the caveats of real-world data and settings.
Bayesian methods are used in lots of fields: from game development to drug discovery. They give superpowers to many machine learning algorithms: handling missing data, extracting much more information from small datasets. Bayesian methods also allow us to estimate uncertainty in predictions, which is a really desirable feature for fields like medicine. When Bayesian methods are applied to deep learning, it turns out that they allow you to compress your models 100 folds, and automatically tune hyperparametrs, saving your time and money. In six weeks we will discuss the basics of Bayesian methods: from how to define a probabilistic model to how to make predictions from it. We will see how one can fully automate this workflow and how to speed it up using some advanced techniques. We will also see applications of Bayesian methods to deep learning and how to generate new images with it. We will see how new drugs that cure severe diseases be found with Bayesian methods.
7. 约翰霍普金斯大学的 Practical Machine Learning（机器学习实战）
这门课程从数据科学的角度来应用机器学习进修实战，课程将会介绍机器学习的基础概念譬如训练集，测试集，过拟合和错误率等，同时这门课程也会介绍机器学习的基本模型和算法，例如回归，分类，朴素贝叶斯，以及随机森林。这门课程最终会覆盖一个完整的机器学习实战周期，包括数据采集，特征生成，机器学习算法应用以及结果评估等。这门机器学习实践课程同时属于约翰霍普金斯大学的 数据科学专项课程（Data Science Specialization）系列：
One of the most common tasks performed by data scientists and data analysts are prediction and machine learning. This course will cover the basic components of building and applying prediction functions with an emphasis on practical applications. The course will provide basic grounding in concepts such as training and tests sets, overfitting, and error rates. The course will also introduce a range of model based and algorithmic machine learning methods including regression, classification trees, Naive Bayes, and random forests. The course will cover the complete process of building prediction functions including data collection, feature creation, algorithms, and evaluation.
This course focuses on one of the most important tools in your data analysis arsenal: regression analysis. Using either SAS or Python, you will begin with linear regression and then learn how to adapt when two variables do not present a clear linear relationship. You will examine multiple predictors of your outcome and be able to identify confounding variables, which can tell a more compelling story about your results. You will learn the assumptions underlying regression analysis, how to interpret regression coefficients, and how to use regression diagnostic plots and other tools to evaluate the quality of your regression model. Throughout the course, you will share with others the regression models you have developed and the stories they tell you.
这门课程关注数据分析里的机器学习，机器学习的过程是一个开发、测试和应用预测算法来实现目标的过程，这门课程以 Regression Modeling in Practice（回归模型实战） 为基础，介绍机器学习中的有监督学习概念，同时从基础的分类算法到决策树以及聚类都会覆盖。通过完成这门课程，你将会学习如何应用、测试和解读机器学习算法用来解决实际问题。
Are you interested in predicting future outcomes using your data? This course helps you do just that! Machine learning is the process of developing, testing, and applying predictive algorithms to achieve this goal. Make sure to familiarize yourself with course 3 of this specialization before diving into these machine learning concepts. Building on Course 3, which introduces students to integral supervised machine learning concepts, this course will provide an overview of many additional concepts, techniques, and algorithms in machine learning, from basic classification to decision trees and clustering. By completing this course, you will learn how to apply, test, and interpret machine learning algorithms as alternative methods for addressing your research questions.
10. 加州大学圣地亚哥分校的 Machine Learning With Big Data（大数据机器学习）
Want to make sense of the volumes of data you have collected? Need to incorporate data-driven decisions into your process? This course provides an overview of machine learning techniques to explore, analyze, and leverage data. You will be introduced to tools and algorithms you can use to create machine learning models that learn from data, and to scale those models up to big data problems. At the end of the course, you will be able to: • Design an approach to leverage data using the steps in the machine learning process. • Apply machine learning techniques to explore and prepare data for modeling. • Identify the type of machine learning problem in order to apply the appropriate set of techniques. • Construct models that learn from data using widely available open source tools. • Analyze big data problems using scalable machine learning algorithms on Spark.
11. 俄罗斯搜索巨头Yandex推出的 Big Data Applications: Machine Learning at Scale（大数据应用：大规模机器学习）
Machine learning is transforming the world around us. To become successful, you’d better know what kinds of problems can be solved with machine learning, and how they can be solved. Don’t know where to start? The answer is one button away. During this course you will: - Identify practical problems which can be solved with machine learning - Build, tune and apply linear models with Spark MLLib - Understand methods of text processing - Fit decision trees and boost them with ensemble learning - Construct your own recommender system. As a practical assignment, you will - build and apply linear models for classification and regression tasks; - learn how to work with texts; - automatically construct decision trees and improve their performance with ensemble learning; - finally, you will build your own recommender system! With these skills, you will be able to tackle many practical machine learning tasks. We provide the tools, you choose the place of application to make this world of machines more intelligent.
这门课程同时属于Yandex推出的 面向数据工程师的大数据专项课程系列（Big Data for Data Engineers Specialization）。
PRML读书会第十四章 Combining Models
conditional mixture model条件混合模型：引入概率机制来选择不同模型对某个样本做预测，相比决策树的硬性选择，要有很多优势。
本章主要介绍了这几种混合模型。讲之前，先明确一下混合模型与Bayesian model averaging的区别，贝叶斯模型平均是这样的：假设有H个不同模型h，每个模型的先验概率是p(h)，一个数据集的分布是：
整个数据集X是由一个模型生成的，关于h的概率仅仅表示是由哪个模型来生成的 这件事的不确定性。而本章要讲的混合模型是数据集中，不同的数据点可能由不同模型生成。看后面讲到的内容就明白了。 继续阅读