Machine Learning01:Models and cost function

May 13, 2015

Problems and Models:

machine learning是指通过对一系列含有n个feature，已知具有某种attribute的变量（训练集）进行分析，找到一个hypothesis(或者叫function，model，pattern，就是找到一个模型或者假设)，这个hypothesis接受到一个变量的n个feature后能够分析出这个变量该attribute的值。

比如输入一系列面积为x，价格为y的房屋信息，找出一个hypothesis，向这个hypothesis输入一个X，这个hypothesis就会预测x=X时，y应该是多少。

input: feature:x1,x2,x3 = 200,300,150,…
attribute:y1,y2,y3 = 10000,20000,15000,…
hypothesis: h(x):h(X)=Y,h(400) = 25000,…

这里输入数据只含有一个feature:x,我们要预测的就是y。

我们最终的目的是让h(x)预测的结果和实际的y值相差最小,也就是 minimize(h(x)-y)

hypothesis function:

hypothesis实际上是我们对输入数据表现形态的一个拟合，hypothesis建立以后可以接收一个变量的n个feature后能够分析出这个变量该attribute的值。

假设我们的hypothesis是linear function:h1(x) = ax+b ，假设只含有一个feature x ,如果含有两个feature的话，形式就是h1(x1,x2) = ax1+b*x2+c，可以含有n个feature。

h(x)是一个关于x，关于feature的函数。结果是预测的attribute

如下图所示，图一是一组具有单一feature和单一attribute的样例分布图，图二是根据通过分析这些样例建立的一个hypothesis。

cost function:

对于hypothesis分析出该变量的attribute值attribute(h)和该变量实际上的attribute值attribute(r)，我们找到一个hypothesis，使对训练集中的变量，attribute(h)与attribute(r)差距最小。

对于hypothesis h1(x) = a*x+b

cost function J(h1) = J(a,b) = 1/n*((h1(x1)-y1)+(h1(x2)-y2)+…)
= average(sum(h1(x)-y)))

cost function是一个关于a，b即h(x)中参数的函数，结果是h(x)预测结果和实际值的差距，cost function在某点的值就是该hypothesis对数据拟合的符合程度。

我们的最终目的也就是找到minimize(J(a,b))的a,b，对应的h(x):h(x)=a*x+b。

在多feature的情况下，cost function通常用如下公式计算。

θ是hypothesis中的参数，即a,b,…组成的一个向量。

matlab代码示例


temp = X*theta - y;
J = sum ( temp' * temp ) / ( 2 * m )