
Optimizing Logistic Regression with SGD

(2012-07-11 20:56:12)
Tags: Miscellaneous

Over the past few days I revisited Logistic Regression (LR), wanting to optimize its parameters with SGD (stochastic gradient descent), which is a good method for online learning. The main subtlety is the cost function, which I had never studied systematically; I took some detours here, because the learning result depends heavily on the choice of cost function.

The method is as follows:

The prediction function is f(x); from its form you can see that the predicted value lies between 0 and 1, and f(x) is exactly the probability that sample x belongs to the positive class.
$$f(x) = \frac{1}{1 + e^{-w^{\top} x}}$$
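As a quick sanity check, here is a minimal sketch of evaluating this prediction function on made-up values of w and x:

import math

def f(w, x):
    # sigmoid of the linear score w . x
    z = sum(w[i] * x[i] for i in range(len(x)))
    return 1.0 / (1.0 + math.exp(-z))

w = [0.5, -0.25]    # hypothetical weights
x = [1.0, 2.0]      # hypothetical sample
print(f(w, x))      # 0.5: here w . x = 0, so the model is maximally uncertain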

Now the question is how to find the parameter vector w. There are many methods for this, such as maximum likelihood; here I mainly want to talk about optimizing this parameter with stochastic gradient descent.

First, from an optimization point of view, we should set up a cost function.

Cost function 1, the per-sample negative log-likelihood, is as follows:

$$J(w) = -\big[\, y \log f(x) + (1 - y) \log\big(1 - f(x)\big) \,\big]$$
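To get a feel for this cost, a minimal sketch with made-up (y, f(x)) pairs; the cost is small when the prediction matches the label and grows without bound as they disagree:

import math

def cross_entropy(y, p):
    # cost function 1 for a single sample
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

print(cross_entropy(1, 0.9))    # ~0.105: confident and correct
print(cross_entropy(1, 0.1))    # ~2.303: confident and wrong
print(cross_entropy(0, 0.999))  # ~6.908: log(1 - p) is heading toward -inf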



A note on building cost functions: we want the cost to be convex, so that a local optimum is also the global optimum.

But there is a problem here: once f(x) -> 1, the log goes out of range (for y = 0, log(1 - f(x)) -> -inf). To address this, I make the following modification:


$$J(w) = y \log\big(1 + e^{-f(x)}\big) + (1 - y) \log\big(1 + e^{-(1 - f(x))}\big)$$
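A minimal sketch of the modified cost on the same kind of made-up values; since f(x) lies in [0, 1], both log arguments stay within [1 + 1/e, 2], so the cost stays finite even at f(x) = 1:

import math

def modified_cost(y, p):
    # cost function 2: log arguments are bounded, no overflow
    return y * math.log(1 + math.exp(-p)) + (1 - y) * math.log(1 + math.exp(-(1 - p)))

print(modified_cost(0, 0.0))    # ~0.313: correct prediction, low cost
print(modified_cost(0, 0.999))  # ~0.693: wrong prediction, but finite (cost 1 gave ~6.9)
print(modified_cost(0, 1.0))    # ~0.693 = log(2), finite even at the boundary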




The update step for w is then:

$$w_j \leftarrow w_j - r \left( \frac{\partial J}{\partial w_j} + \lambda\, w_j \right)$$

where λ is the L2 regularization weight (the lamb term in the code below).

Here, r is the learning rate, a very small value. We can set it with the schedule r = sqrt(t0 / (t0 + t)), where t is the iteration index and t0 is a small constant.
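A minimal sketch of this schedule, using the t0 = 0.003 that appears in the code below; r starts at 1 for t = 0 and decays as t grows:

import math

t0 = 0.003   # same constant as in logistic_train below
for t in [0, 1, 10, 100, 499]:
    print("t = %3d  r = %.6f" % (t, math.sqrt(t0 / (t0 + t))))
# t =   0  r = 1.000000
# t =   1  r = 0.054690
# t =  10  r = 0.017318
# t = 100  r = 0.005477
# t = 499  r = 0.002452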



Below is the Python code I wrote:
# author: dylan_fan
# written: 2012/07/07


import math


class SGD_LR:
    def __init__(self):
        self.alpha = 0.0000012   # learning rate (reset by the schedule in logistic_train)
        self.lamb = 0.000001     # L2 regularization weight
        self.weight = {}         # sparse weight vector: feature id -> weight
        self.itera_times = 500   # number of passes over the training set
            
    def logistic_train(self, Y, X):
        if len(Y) != len(X):
            print "Y and X length is not matched"
            return
        for i in range(self.itera_times):
            cost_f = 0.0
            # learning rate schedule r = sqrt(t0 / (t0 + t)) with t0 = 0.003
            self.alpha = math.sqrt(0.003 / (0.003 + i))
           
            for k in range(len(Y)):
                label = Y[k]
                feature = X[k]
                predict_value = self.predict(feature)
                error = label - predict_value
                cost_f += math.fabs(error)   # track mean absolute error (raw log-loss can overflow, see above)

                # -dJ/dp for the modified cost J = y*log(1+e^-p) + (1-y)*log(1+e^-(1-p)), p = f(x)
                tmp1 = label * math.exp(-predict_value) / (1 + math.exp(-predict_value)) \
                       - (1 - label) * math.exp(-(1 - predict_value)) / (1 + math.exp(-(1 - predict_value)))

                # recompute the linear score z = w . x for the sigmoid derivative
                weight_sum = 0.0
                for f, v in feature.items():
                    weight_sum += self.weight[f] * v
                linear_sum = math.exp(-weight_sum)
                # dp/dz = e^-z / (1 + e^-z)^2
                derivation = linear_sum / (linear_sum * linear_sum + 2 * linear_sum + 1)

                # update rule by gradient descent, with L2 regularization
                for f in feature.keys():
                    self.weight[f] += self.alpha * (tmp1 * derivation * feature[f] - self.lamb * self.weight[f])
            print 'iteration', i, cost_f / len(Y), 'done'
               
        return
   
    def logistic_save_model(self, model_file):
        fw = open(model_file, "w")
        for f , w in self.weight.items():
            fw.write("%d : %f\n" %(f,w))
        fw.close()
        print "model save ok..."
        
    def logistic_load_model(self, model_file):
        fr = open(model_file, "r")
        for line in fr.readlines():
            f, w = line.split(":")
            self.weight[int(f)] = float(w)
        fr.close()
        print "model load ok..."
   
    def predict(self, feature):
        weight_sum = 0
        for f, v in feature.items():
            self.weight.setdefault(f, 0)   # unseen features start with weight 0
            weight_sum += self.weight[f] * v
      
        return 1.0 / (1.0 + math.exp(-weight_sum))
   
    def logistic_predict(self, Y, X):
        if len(Y) != len(X):
            print "Y and X length is not matched"
            return [], []
        rmse = 0.0
        p_vals = []
        p_labels = []
        accuracy = 0.0
        for i in range(len(Y)):
            feature = X[i]
            label = Y[i]
            predict_value = self.predict(feature)
            rmse += (label - predict_value) * (label - predict_value)
            p_vals.append([predict_value])
            if predict_value > 0.5:
                p_label = 1
            else:
                p_label = 0
            p_labels.append(p_label)
            if p_label == label:
                accuracy += 1.0

        rmse = math.sqrt(rmse / len(Y))   # root mean squared error
        accuracy /= len(Y)
        print "rmse : %f" % (rmse)
        print "Accuracy : %g%%" % (accuracy * 100)
        return p_labels, p_vals
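For completeness, here is a minimal usage sketch on made-up data; the toy feature scheme and the file name "lr.model" are just assumptions to show the calling convention (features are sparse dicts mapping feature id to value, as logistic_train expects):

import random

random.seed(0)
# toy data (hypothetical): feature 1 carries the label signal, feature 2 is noise
X, Y = [], []
for _ in range(200):
    label = random.randint(0, 1)
    X.append({1: label + random.uniform(-0.3, 0.3), 2: random.uniform(0, 1)})
    Y.append(label)

lr = SGD_LR()
lr.logistic_train(Y, X)
lr.logistic_save_model("lr.model")   # "lr.model" is an arbitrary file name
p_labels, p_vals = lr.logistic_predict(Y, X)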
