
Optimizing Logistic Regression with SGD

(2012-07-11 20:56:12)
Tags: Miscellaneous

Over the past few days I revisited Logistic Regression (LR), wanting to optimize its parameters with SGD (stochastic gradient descent), which is a good method for online learning. The main subtlety is the cost function, which I had never studied systematically; I took some detours here, because the learning result depends heavily on the choice of cost function.

The method is as follows:

The prediction function is f(x); from its form you can see that the predicted value lies between 0 and 1, and f(x) is exactly the probability that sample x belongs to the positive class.
$$f(x) = \frac{1}{1 + e^{-w^{\top} x}}$$
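As a quick sanity check, here is a minimal sketch of evaluating this prediction function on made-up values of w and x:

import math

def f(w, x):
    # sigmoid of the linear score w . x
    z = sum(w[i] * x[i] for i in range(len(x)))
    return 1.0 / (1.0 + math.exp(-z))

w = [0.5, -0.25]    # hypothetical weights
x = [1.0, 2.0]      # hypothetical sample
print(f(w, x))      # 0.5: here w . x = 0, so the model is maximally uncertain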

Now the question is how to find the parameter vector w. There are many methods for this, such as maximum likelihood; here I mainly want to talk about optimizing this parameter with stochastic gradient descent.

First, from an optimization point of view, we should set up a cost function.

Cost function 1, the per-sample negative log-likelihood, is as follows:

$$J(w) = -\big[\, y \log f(x) + (1 - y) \log\big(1 - f(x)\big) \,\big]$$
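To get a feel for this cost, a minimal sketch with made-up (y, f(x)) pairs; the cost is small when the prediction matches the label and grows without bound as they disagree:

import math

def cross_entropy(y, p):
    # cost function 1 for a single sample
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

print(cross_entropy(1, 0.9))    # ~0.105: confident and correct
print(cross_entropy(1, 0.1))    # ~2.303: confident and wrong
print(cross_entropy(0, 0.999))  # ~6.908: log(1 - p) is heading toward -inf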



A note on building cost functions: we want the cost to be convex, so that a local optimum is also the global optimum.

But there is a problem here: once f(x) -> 1, the log goes out of range (for y = 0, log(1 - f(x)) -> -inf). To address this, I make the following modification:


$$J(w) = y \log\big(1 + e^{-f(x)}\big) + (1 - y) \log\big(1 + e^{-(1 - f(x))}\big)$$
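A minimal sketch of the modified cost on the same kind of made-up values; since f(x) lies in [0, 1], both log arguments stay within [1 + 1/e, 2], so the cost stays finite even at f(x) = 1:

import math

def modified_cost(y, p):
    # cost function 2: log arguments are bounded, no overflow
    return y * math.log(1 + math.exp(-p)) + (1 - y) * math.log(1 + math.exp(-(1 - p)))

print(modified_cost(0, 0.0))    # ~0.313: correct prediction, low cost
print(modified_cost(0, 0.999))  # ~0.693: wrong prediction, but finite (cost 1 gave ~6.9)
print(modified_cost(0, 1.0))    # ~0.693 = log(2), finite even at the boundary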




The update step for w is then:

$$w_j \leftarrow w_j - r \left( \frac{\partial J}{\partial w_j} + \lambda\, w_j \right)$$

where λ is the L2 regularization weight (the lamb term in the code below).

Here, r is the learning rate, a very small value. We can set it with the schedule r = sqrt(t0 / (t0 + t)), where t is the iteration index and t0 is a small constant.
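A minimal sketch of this schedule, using the t0 = 0.003 that appears in the code below; r starts at 1 for t = 0 and decays as t grows:

import math

t0 = 0.003   # same constant as in logistic_train below
for t in [0, 1, 10, 100, 499]:
    print("t = %3d  r = %.6f" % (t, math.sqrt(t0 / (t0 + t))))
# t =   0  r = 1.000000
# t =   1  r = 0.054690
# t =  10  r = 0.017318
# t = 100  r = 0.005477
# t = 499  r = 0.002452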



Below is the Python code I wrote:
# author: dylan_fan
# written: 2012/07/07


import math


class SGD_LR:
    def __init__(self):
        self.alpha = 0.0000012   # learning rate (reset by the schedule in logistic_train)
        self.lamb = 0.000001     # L2 regularization weight
        self.weight = {}         # sparse weight vector: feature id -> weight
        self.itera_times = 500   # number of passes over the training set
            
    def logistic_train(self, Y, X):
        if len(Y) != len(X):
            print "Y and X length is not matched"
            return
        for i in range(self.itera_times):
            cost_f = 0.0
            # learning rate schedule r = sqrt(t0 / (t0 + t)) with t0 = 0.003
            self.alpha = math.sqrt(0.003 / (0.003 + i))
           
            for k in range(len(Y)):
                label = Y[k]
                feature = X[k]
                predict_value = self.predict(feature)
                error = label - predict_value
                cost_f += math.fabs(error)   # track mean absolute error (raw log-loss can overflow, see above)

                # -dJ/dp for the modified cost J = y*log(1+e^-p) + (1-y)*log(1+e^-(1-p)), p = f(x)
                tmp1 = label * math.exp(-predict_value) / (1 + math.exp(-predict_value)) \
                       - (1 - label) * math.exp(-(1 - predict_value)) / (1 + math.exp(-(1 - predict_value)))

                # recompute the linear score z = w . x for the sigmoid derivative
                weight_sum = 0.0
                for f, v in feature.items():
                    weight_sum += self.weight[f] * v
                linear_sum = math.exp(-weight_sum)
                # dp/dz = e^-z / (1 + e^-z)^2
                derivation = linear_sum / (linear_sum * linear_sum + 2 * linear_sum + 1)

                # update rule by gradient descent, with L2 regularization
                for f in feature.keys():
                    self.weight[f] += self.alpha * (tmp1 * derivation * feature[f] - self.lamb * self.weight[f])
            print 'iteration', i, cost_f / len(Y), 'done'
               
        return
   
    def logistic_save_model(self, model_file):
        fw = open(model_file, "w")
        for f , w in self.weight.items():
            fw.write("%d : %f\n" %(f,w))
        fw.close()
        print "model save ok..."
        
    def logistic_load_model(self, model_file):
        fr = open(model_file, "r")
        for line in fr.readlines():
            f, w = line.split(":")
            self.weight[int(f)] = float(w)
        fr.close()
        print "model load ok..."
   
    def predict(self, feature):
        weight_sum = 0
        for f, v in feature.items():
            self.weight.setdefault(f, 0)   # unseen features start with weight 0
            weight_sum += self.weight[f] * v
      
        return 1.0 / (1.0 + math.exp(-weight_sum))
   
    def logistic_predict(self, Y, X):
        if len(Y) != len(X):
            print "Y and X length is not matched"
            return [], []
        rmse = 0.0
        p_vals = []
        p_labels = []
        accuracy = 0.0
        for i in range(len(Y)):
            feature = X[i]
            label = Y[i]
            predict_value = self.predict(feature)
            rmse += (label - predict_value) * (label - predict_value)
            p_vals.append([predict_value])
            if predict_value > 0.5:
                p_label = 1
            else:
                p_label = 0
            p_labels.append(p_label)
            if p_label == label:
                accuracy += 1.0

        rmse = math.sqrt(rmse / len(Y))   # root mean squared error
        accuracy /= len(Y)
        print "rmse : %f" % (rmse)
        print "Accuracy : %g%%" % (accuracy * 100)
        return p_labels, p_vals
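For completeness, here is a minimal usage sketch on made-up data; the toy feature scheme and the file name "lr.model" are just assumptions to show the calling convention (features are sparse dicts mapping feature id to value, as logistic_train expects):

import random

random.seed(0)
# toy data (hypothetical): feature 1 carries the label signal, feature 2 is noise
X, Y = [], []
for _ in range(200):
    label = random.randint(0, 1)
    X.append({1: label + random.uniform(-0.3, 0.3), 2: random.uniform(0, 1)})
    Y.append(label)

lr = SGD_LR()
lr.logistic_train(Y, X)
lr.logistic_save_model("lr.model")   # "lr.model" is an arbitrary file name
p_labels, p_vals = lr.logistic_predict(Y, X)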
