8. MAD (Median Absolute Deviation)_jingju

http://blog.sina.com.cn/u/2745771574

8. MAD (Median Absolute Deviation)

(2012-03-29 04:58:29)

Tags: mad, proc, iml, linear, regression

Category: Statistics Sharing

Summary:

From WIKI: For a univariate data set X₁, X₂, ..., X_n, the MAD is defined as the median of the absolute deviations from the data's median:

$$\operatorname{MAD} = \operatorname{median}_i\left(\left|X_i - \operatorname{median}_j(X_j)\right|\right)$$

that is, starting with the residuals (deviations) from the data's median, the MAD is the median of their absolute values.

Computing the MAD statistic is straightforward in PROC IML, although many people may never have paid attention to this statistic. For normally distributed data, the scale factor K, i.e., the ratio of the standard deviation (STD) to the MAD, K = STD/MAD, is well known and is proven theoretically. The code below tests whether K is correct; because K itself is not in doubt, the test is really a check on the validity of the generated random numbers. The SAS code was written and tested in IML/Studio 4.3. As far as these statements are concerned, nothing differs from PROC IML, so the code should work once it is placed inside PROC IML.
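The value of K follows from the quartiles of the normal distribution. A short derivation is sketched here; it is the standard textbook argument rather than something taken from the original post, and the symbols $\Phi$ (standard normal CDF), $\mu$ and $\sigma$ (mean and standard deviation of $X$) are introduced only for this illustration:

```latex
\begin{aligned}
X \sim N(\mu,\sigma^2):\quad
\tfrac12 &= P\bigl(|X-\mu| \le \mathrm{MAD}\bigr)
          = 2\,\Phi\!\left(\frac{\mathrm{MAD}}{\sigma}\right) - 1 \\
\Rightarrow\quad \Phi\!\left(\frac{\mathrm{MAD}}{\sigma}\right) &= \tfrac34
\quad\Rightarrow\quad \mathrm{MAD} = \sigma\,\Phi^{-1}\!\left(\tfrac34\right) \\
\Rightarrow\quad K = \frac{\sigma}{\mathrm{MAD}} &= \frac{1}{\Phi^{-1}(3/4)}
\approx \frac{1}{0.67449} \approx 1.4826
\end{aligned}
```

This is the same quantity that the SAS code computes as 1/quantile('normal', 3/4).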

Results:

How MAD was calculated:

<direct MAD from SAS function>

<original vector>   <median of vector>   <absolute deviation vector>   <computed MAD>
        1         *         2          *              1              *        1
        1                                             1
        2                                             0
        2                                             0
        4                                             2
        6                                             4
        9                                             7

Regression test if K correct:

<Scale factor K>   Estimated value   <P value:Estimated = K?>
     1.4826            1.48211                0.724

SAS code (from IML/Studio):

*compute MAD statistic;
c = {1, 1, 2, 2, 4, 6, 9};
mad0 = mad(c);               /* MAD from the built-in function */
median0 = median(c);         /* median of the data */
c1 = abs(c - median0);       /* absolute deviations from the median */
mad2 = median(c1);           /* MAD computed by hand */
print "How MAD was calculated:",,
mad0[label ='<direct MAD from SAS function>' format =best.],
c[label ="<original vector>"] '*' median0[label ="<median of vector>"] '*'
c1[label ="<absolute deviation vector>"] '*' mad2[label ="<computed MAD>"];

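*simulate and calculate Scale factor K;
call randseed(1234);           /* seed the generator once, before the loop */
x = j(1000, 1000);             /* 1000 samples (columns) of 1000 observations each */
m = j(ncol(x), 2);             /* column 1: MAD, column 2: STD of each sample */
do i = 1 to ncol(x);
   _x = x[,i];
   call randgen(_x, "normal"); /* fill the sample with standard normal draws */
   m[i,1] = mad(_x);
   m[i,2] = std(_x);
end;
k = 1/quantile('normal', 3/4); /* theoretical scale factor, about 1.4826 */
x = m[, 1]; y = m[, 2];        /* regress STD (y) on MAD (x), no intercept */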

start Regress;
   xpxi = inv(x`*x);
   beta = xpxi * (x`*y);            /* least-squares slope: the estimate of K */
   yhat = x*beta;
   resid = y-yhat;
   sse = ssq(resid);                /* error sum of squares */
   n = nrow(x);
   dfe = nrow(x)-ncol(x);           /* error degrees of freedom */
   mse = sse/dfe;
   cssy = ssq(y-sum(y)/n);          /* corrected total sum of squares */
   rsquare = (cssy-sse)/cssy;
   stdb = sqrt(vecdiag(xpxi)*mse);  /* standard error of beta */
   t = (beta-k)/stdb;               /* t statistic for H0: beta = K */
   prob = 1-probf(t#t,1,dfe);       /* p-value from F = t squared */
   print "Regression test if K correct:",,
         k[label ="<Scale factor K>" format =best7.5]
         beta[label ="Estimated value" format =best7.5]
         prob[label ='<P value:Estimated = K?>' format =pvalue6.3];
finish Regress;
run Regress;
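As a quicker sanity check than the regression, the ratio STD/MAD of one large normal sample can be compared with K directly. The following is only a minimal sketch, not part of the original post: the sample size of 100000 and the variable names ratio and nmad are arbitrary choices made here, and it assumes that the "NMAD" option of the MAD function returns the MAD scaled by about 1.4826, as the SAS/IML documentation describes.

```sas
proc iml;
call randseed(1234);                /* arbitrary seed, chosen for this sketch */
k = 1 / quantile("Normal", 0.75);   /* theoretical scale factor K */
x = j(100000, 1);                   /* one large sample (size is an assumption) */
call randgen(x, "Normal");          /* fill x with standard normal draws */
s = std(x);                         /* sample standard deviation */
ratio = s / mad(x);                 /* sample estimate of K = STD/MAD */
nmad = mad(x, "NMAD");              /* normalized MAD; assumed to be about K*MAD */
print k ratio, s nmad;
quit;
```

If the random numbers behave as expected, ratio should be close to k, and nmad should be close to s. Unlike the regression above, this sketch gives no p-value, so it is a rough check rather than a formal test.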
