Common MATLAB Functions: pdist (小吕子)

http://blog.sina.com.cn/u/1274058845


Common MATLAB Functions: pdist

(2014-07-26 10:01:48)


Category: Matlab

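pdist

D = pdist(X) returns a vector containing the Euclidean distance between every pair of rows (observations) of the M-by-N matrix X. D is a row vector of size 1-by-(M*(M-1)/2), one entry per unordered pair of rows.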
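As a quick check of the output size and ordering described above, here is a minimal sketch. It assumes the Statistics and Machine Learning Toolbox (which provides pdist) is installed; the four sample points and the variable name Dmanual are made up for illustration.

```matlab
% Four points in the plane: M = 4 observations, N = 2 coordinates.
X = [0 0; 3 4; 6 8; 0 1];
M = size(X, 1);

D = pdist(X);              % default metric is 'euclidean'

% One entry per unordered pair of rows, so M*(M-1)/2 entries in total.
nPairs = M*(M-1)/2;
disp(size(D));             % D is a 1-by-nPairs row vector

% Recompute the same distances with explicit loops for comparison.
% The loop visits pairs in the order (1,2),(1,3),(1,4),(2,3),(2,4),(3,4).
Dmanual = zeros(1, nPairs);
k = 0;
for i = 1:M-1
    for j = i+1:M
        k = k + 1;
        Dmanual(k) = norm(X(i,:) - X(j,:));
    end
end

disp(max(abs(D - Dmanual)));   % should be zero up to rounding error
```

The loop form is only there to make the pairing explicit; for real data pdist alone is enough.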

Other usage: D = pdist(X, DISTANCE) computes D using DISTANCE. Choices are:

'euclidean' - Euclidean distance (default).

'seuclidean' - Standardized Euclidean distance. Each coordinate difference between rows in X is scaled by dividing by the corresponding element of the standard deviation S = NANSTD(X). To specify another value for S, use D = pdist(X,'seuclidean',S).

'cityblock' - City block distance.

'minkowski' - Minkowski distance. The default exponent is 2. To specify a different exponent, use D = pdist(X,'minkowski',P), where the exponent P is a positive scalar.

'chebychev' - Chebychev distance (maximum coordinate difference).

'mahalanobis' - Mahalanobis distance, using the sample covariance of X as computed by NANCOV. To compute the distance with a different covariance, use D = pdist(X,'mahalanobis',C), where the matrix C is symmetric and positive definite.

'cosine' - One minus the cosine of the included angle between observations (treated as vectors).

'correlation' - One minus the sample linear correlation between observations (treated as sequences of values).

'spearman' - One minus the sample Spearman's rank correlation between observations (treated as sequences of values).

'hamming' - Hamming distance, the fraction of coordinates that differ.

'jaccard' - One minus the Jaccard coefficient, the fraction of nonzero coordinates that differ.

function - A distance function specified using @, for example @DISTFUN.

A distance function must be of the form

function D2 = DISTFUN(XI, XJ),

taking as arguments a 1-by-N vector XI containing a single row of X and an M2-by-N matrix XJ containing multiple rows of X, and returning an M2-by-1 vector of distances D2, whose Jth element is the distance between the observations XI and XJ(J,:).
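The calling convention above can be sketched with a small custom metric. The weighted city block distance, the weight vector w, and the handle names wcity and ucity below are illustrative choices, not part of pdist; the sketch again assumes the Statistics and Machine Learning Toolbox.

```matlab
X = [1 2 3; 4 6 3; 0 2 7];     % M = 3 observations, N = 3 coordinates
w = [0.5; 0.3; 0.2];           % hypothetical coordinate weights, N-by-1

% XI is 1-by-N and XJ is M2-by-N. bsxfun subtracts XI from every row
% of XJ, and the product with the N-by-1 vector w gives an M2-by-1 result.
wcity = @(XI, XJ) abs(bsxfun(@minus, XJ, XI)) * w;

D = pdist(X, wcity);           % pairs in order (1,2), (1,3), (2,3)
disp(D);

% With unit weights the handle should reproduce the built-in 'cityblock'.
ucity = @(XI, XJ) abs(bsxfun(@minus, XJ, XI)) * ones(size(X, 2), 1);
Dunit = pdist(X, ucity);
Dcity = pdist(X, 'cityblock');
disp(max(abs(Dunit - Dcity)));
```

Returning a column vector with one distance per row of XJ is the part that is easiest to get wrong; a handle that returns a row vector or a scalar does not satisfy the form given above.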

The output D is arranged in the order ((2,1),(3,1),...,(M,1),(3,2),...,(M,2),...,(M,M-1)), i.e. the lower left triangle of the full M-by-M distance matrix in column order. To get the distance between the Ith and Jth observations (I < J), either use the formula D((I-1)*(M-I/2)+J-I), or use the helper function Z = SQUAREFORM(D), which returns an M-by-M square symmetric matrix whose (I,J) entry equals the distance between observation I and observation J.
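The two lookup methods can be compared directly. This is a small sketch with made-up sample points; the names dFormula, dSquare and dDirect are introduced only for the comparison.

```matlab
X = [0 0; 3 4; 6 8; 0 1; 2 2];   % M = 5 observations
M = size(X, 1);

D = pdist(X);                    % condensed vector, length M*(M-1)/2
Z = squareform(D);               % full M-by-M symmetric matrix

% Distance between observations I and J, with I < J.
I = 2;
J = 4;
k = (I-1)*(M - I/2) + J - I;     % position of the pair (I,J) inside D

dFormula = D(k);                       % lookup via the index formula
dSquare  = Z(I, J);                    % lookup via squareform
dDirect  = norm(X(I,:) - X(J,:));      % direct computation for reference

disp([dFormula, dSquare, dDirect]);    % all three should agree
```

The index formula avoids building the M-by-M matrix, which matters when M is large; squareform is more convenient when many pairs are looked up.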

Example:

% Compute the ordinary Euclidean distance
X = randn(100, 5);                 % some random points
D = pdist(X, 'euclidean');         % euclidean distance

% Compute the Euclidean distance with each coordinate difference
% scaled by the standard deviation
Dstd = pdist(X,'seuclidean');

% Use a function handle to compute a distance that weights each
% coordinate contribution differently
Wgts = [.1 .3 .3 .2 .1];           % coordinate weights
weuc = @(XI,XJ,W)(sqrt(bsxfun(@minus,XI,XJ).^2 * W'));
Dwgt = pdist(X, @(Xi,Xj) weuc(Xi,Xj,Wgts));
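Some of the named metrics listed earlier can be cross-checked against each other, which is a handy sanity test when choosing one. The sketch below assumes X contains no NaN values (so std and NANSTD agree) and uses rng(0) only to make the random data reproducible; the variable names are illustrative.

```matlab
rng(0);                          % fixed seed, for reproducibility only
X = randn(20, 4);                % 20 observations, 4 coordinates

% Minkowski with exponent 1 is the city block distance.
Dcity  = pdist(X, 'cityblock');
Dmink1 = pdist(X, 'minkowski', 1);

% Minkowski with the default exponent 2 is the Euclidean distance.
Deuc   = pdist(X, 'euclidean');
Dmink2 = pdist(X, 'minkowski');

% Standardized Euclidean equals plain Euclidean on data whose columns
% have been divided by their standard deviations.
S       = std(X);
Dseuc   = pdist(X, 'seuclidean');
Dscaled = pdist(bsxfun(@rdivide, X, S));

% Each difference should be zero up to rounding error.
disp(max(abs(Dcity - Dmink1)));
disp(max(abs(Deuc  - Dmink2)));
disp(max(abs(Dseuc - Dscaled)));
```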
