359. Somers' D Concordance Statistic_jingju

http://blog.sina.com.cn/u/2745771574

(2016-03-20 12:58:18)

Tags: somer's, d, ordinal, coarse-classifying

Category: Statistics Sharing

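Somers' D measures the strength and direction of the association between two ordinal variables. In SAS, PROC FREQ reports it when the MEASURES option is specified. It can also be computed directly from a formula or estimated by simulation. The example below assumes a 3 x 2 contingency table: three attribute classes (owner, renter, others) by two outcomes (good, bad).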

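* counts of good and bad accounts per attribute class;
data have;
   input x :$12. good bad;
   pg = good / (good + bad);   * good rate of the class;
   datalines;
owner 6000 300
renter 1950 540
others 1050 160
;

* rank the classes by good rate so that xnew runs from 1 (lowest) to 3 (highest);
proc rank data=have out=have;
   var pg;
   ranks xnew;
run;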
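
As a cross-check outside SAS, the ranking step can be mirrored in a few lines of Python. This is only a sketch, not part of the original program; the dictionary names are illustrative, and the counts are copied from the datalines above.

```python
# Counts copied from the datalines above: class -> (good, bad)
counts = {"owner": (6000, 300), "renter": (1950, 540), "others": (1050, 160)}

# Good rate per class, the same quantity as pg in the data step
good_rate = {name: g / (g + b) for name, (g, b) in counts.items()}

# Rank 1 = lowest good rate, mirroring the default ascending order of PROC RANK
ordered = sorted(good_rate, key=good_rate.get)
xnew = {name: rank for rank, name in enumerate(ordered, start=1)}

print(xnew)
```

With these counts, renter gets rank 1, others rank 2, and owner rank 3, so the arrays used later are indexed in that order.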

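* one row per class and outcome with the count carried as a frequency weight;
data have2;
   set have;
   y = 1; w = good; output;
   y = 0; w = bad;  output;
run;

* MEASURES prints the Somers D statistics and SMDRC requests the exact test for D(R|C);
proc freq data=have2;
   tables xnew*y / chisq measures;
   weight w;
   exact SMDRC;   * treat R (xnew: attribute) as dependent and C (y: outcome) as independent;
run;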
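
To see what the D(R|C) statistic requested above amounts to, the following Python sketch computes it from the 3 x 2 table using the usual pair-counting definition: twice the difference between concordant and discordant pairs, divided by the number of ordered pairs that are not tied on the column variable. The function name and table layout are assumptions made for illustration; the counts come from the example data.

```python
def somers_d_row_given_col(table):
    """Somers' D(R|C) for a contingency table given as a list of rows of counts."""
    n_rows, n_cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):      # only rows below row i
                for l in range(n_cols):
                    if l > j:                   # higher row and higher column
                        concordant += table[i][j] * table[k][l]
                    elif l < j:                 # higher row but lower column
                        discordant += table[i][j] * table[k][l]
    n = sum(sum(row) for row in table)
    col_totals = [sum(row[j] for row in table) for j in range(n_cols)]
    # Ordered pairs that differ on the column variable
    denom = n * n - sum(c * c for c in col_totals)
    return 2 * (concordant - discordant) / denom

# Rows: xnew = 1 (renter), 2 (others), 3 (owner); columns: y = 0 (bad), y = 1 (good)
table = [[540, 1950], [160, 1050], [300, 6000]]
d = somers_d_row_given_col(table)
print(round(d, 6))
```

With only two outcome columns, the denominator reduces to twice the product of the number of goods and the number of bads, which is why the shorter formula in the next data step gives the same value.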

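data formula;
   set have2 end=Eof;
   * counts of goods and bads indexed by class rank;
   array gg[3] _temporary_; array bb[3] _temporary_;
   if y=1 then gg[xnew]=w;
   if y=0 then bb[xnew]=w;
   if Eof then do;
      do i = 1 to dim(gg);
         * t1 and t2 hold the bads and goods in the classes below class i;
         t1 = 0; t2 = 0;
         do j = 1 to i-1;
            t1 + bb[j]; t2 + gg[j];
         end;
         * concordant minus discordant good-bad pairs involving class i;
         s + (t1*gg[i] - t2*bb[i]);
      end;
      D = s / (sum(of gg[*]) * sum(of bb[*]));
   end;
run;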
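
Written out, the data step above appears to evaluate the following expression, where g_i and b_i (notation introduced here, not in the original) are the numbers of goods and bads in the class with rank i. The worked numbers use the example counts in rank order renter, others, owner.

```latex
D = \frac{\sum_{i}\left( g_i \sum_{j<i} b_j \;-\; b_i \sum_{j<i} g_j \right)}
         {\left(\sum_i g_i\right)\left(\sum_i b_i\right)}

% Example: g = (1950, 1050, 6000), b = (540, 160, 300)
% i = 2:  1050 \cdot 540 - 160 \cdot 1950 = 255\,000
% i = 3:  6000 \cdot 700 - 300 \cdot 3000 = 3\,300\,000
D = \frac{255\,000 + 3\,300\,000}{9000 \cdot 1000} = \frac{3\,555\,000}{9\,000\,000} = 0.395
```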

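data simulation;
   call streaminit(12345);
   set have2 end=Eof;
   array gg[3] _temporary_; array bb[3] _temporary_;
   if y=1 then gg[xnew]=w;
   if y=0 then bb[xnew]=w;
   if Eof then do;
      * convert counts to class probabilities within the goods and within the bads;
      sg = sum(of gg[*]);
      sb = sum(of bb[*]);
      do i = 1 to dim(gg);
         gg[i] = gg[i]/sg; bb[i] = bb[i]/sb;
      end;
      do i = 1 to 1e8;
         xb = rand('table', of bb[*]);   *pick attribute of bad from the bads;
         xg = rand('table', of gg[*]);   *pick attribute of good from the goods;
         bltg + (xb < xg);   * bad falls in a lower class than good;
         bgtg + (xb > xg);   * bad falls in a higher class than good;
         bgeg + (xb = xg);   * both fall in the same class;
      end;
      * i equals 1e8 + 1 after the loop so i-1 is the number of draws;
      D = (1*bltg - 1*bgtg + 0*bgeg) / (i-1);
   end;
run;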

In the simulation, the concordance statistic is built from the probability that, when a good is picked at random from the goods and a bad at random from the bads, the bad's attribute, xb, falls in a lower class than the good's attribute, xg. D is this probability minus the probability of the reverse ordering, with ties contributing zero. The higher D is, the better the ordering of the characteristic's classes reflects the good-bad split in the population.

From the simulation, D = 0.39502673, compared with the exact value of 0.395 (3,555,000 / 9,000,000) given by the formula.
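
The same Monte Carlo idea can be sketched in Python with far fewer draws. This is an illustrative re-implementation rather than the original SAS step: the function name, the seed, and the number of draws are arbitrary choices, so the estimate will not reproduce the SAS figure digit for digit.

```python
import random

# Counts by class rank (1 = renter, 2 = others, 3 = owner), from the example data
goods = [1950, 1050, 6000]
bads = [540, 160, 300]
classes = [1, 2, 3]

def simulate_d(n_draws, seed=12345):
    """Estimate D by drawing one good and one bad at random, n_draws times."""
    rng = random.Random(seed)
    xg = rng.choices(classes, weights=goods, k=n_draws)  # class of a random good
    xb = rng.choices(classes, weights=bads, k=n_draws)   # class of a random bad
    # +1 when the bad sits in a lower class, -1 when higher, 0 on a tie
    score = sum((b < g) - (b > g) for b, g in zip(xb, xg))
    return score / n_draws

d_hat = simulate_d(200_000)
print(d_hat)
```

Each draw contributes a value in {-1, 0, 1}, so the standard error of the estimate shrinks roughly with the square root of the number of draws; a few hundred thousand draws already land close to 0.395.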

When coarse classifying a characteristic for scorecard development, a higher value of Somers' D indicates a classing that separates goods from bads more sharply.
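
As a small illustration of that use, the Python sketch below compares the three-class split with two coarser alternatives obtained by merging adjacent classes. The candidate groupings are hypothetical and chosen only for this example; the function follows the same cumulative formula as the SAS data step.

```python
def somers_d(goods, bads):
    """D for classes already ordered from lowest to highest good rate."""
    score = 0
    cum_g = cum_b = 0  # goods and bads in the classes below the current one
    for g, b in zip(goods, bads):
        score += g * cum_b - b * cum_g
        cum_g += g
        cum_b += b
    return score / (sum(goods) * sum(bads))

# Hypothetical candidate classings built from the example counts
candidates = {
    "renter | others | owner": ([1950, 1050, 6000], [540, 160, 300]),
    "renter+others | owner": ([3000, 6000], [700, 300]),
    "renter | others+owner": ([1950, 7050], [540, 460]),
}

results = {name: somers_d(g, b) for name, (g, b) in candidates.items()}
for name, value in results.items():
    print(f"{name}: D = {value:.4f}")
```

For these counts the three-class split gives D = 0.3950, merging renter with others gives 0.3667, and merging others with owner gives 0.3233, so the first merge gives up less separation than the second.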
