The SET statement in SAS_AshleyYuan

http://blog.sina.com.cn/u/2949712125


(2017-03-24 16:38:57)

Tags: culture, IT, education

Category: SAS

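The SET statement:

What is the SET statement for?

Suppose you want to add a column to a data set (a constant column or a computed one), add a new variable, or create a subset.

Below are a DATA step and a PROC SQL way of creating a computed column and adding a constant column.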



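data me(keep=name newVariable total);
    set sashelp.class;
    if sex='M';             /* subsetting IF: Sex is coded 'M'/'F' in sashelp.class */
    newVariable=.;          /* constant column: numeric missing value */
    total=height+weight;    /* computed column */
run;
proc print data=me noobs; run;

proc sql;
    /* . (not '.') keeps newVariable numeric, matching the DATA step */
    select name, . as newVariable, height+weight as total
    from sashelp.class
    where sex='M';
quit;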


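For two data sets that are already sorted, there are two ways to combine them so that the result is still sorted:

Method 1: set data1 data2; and then run proc sort.

Method 2: set data1 data2; by variable; (interleaving). This is more efficient than method 1. The book says so without explaining why... My guess is that it comes down to how many times the data are read: method 2 needs a single pass, while method 1 needs two (one to concatenate, one to sort).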
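The two methods can be compared side by side. The following is only a minimal sketch: the sample rows and the key variable id are made up for illustration, and data1 and data2 are assumed to be already sorted by id.

```sas
/* Hypothetical sample data, each already sorted by id */
data data1;
    input id v $;
    datalines;
1 a
3 c
5 e
;
run;

data data2;
    input id v $;
    datalines;
2 b
4 d
;
run;

/* Method 1: concatenate first, then sort in a separate step */
data stacked;
    set data1 data2;
run;
proc sort data=stacked;
    by id;
run;

/* Method 2: interleave; the BY statement keeps the output in id order */
data interleaved;
    set data1 data2;
    by id;
run;
```

Both stacked and interleaved should end up ordered by id; method 2 gets there without the extra PROC SORT step, but it requires every input to be sorted by the BY variable beforehand.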

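The SET statement reads observations from one or more SAS data sets and concatenates them vertically. Each time a SET statement executes, SAS reads one observation into the PDV (program data vector). A DATA step can contain several SET statements, each SET statement can list several data sets, and each SET statement keeps its own read pointer.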
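To make the difference between one SET statement with several data sets and several SET statements concrete, here is a small sketch. The data sets a and b and their variables are hypothetical and exist only for this illustration.

```sas
/* Hypothetical inputs: a has 3 observations, b has 2 */
data a;
    input id x;
    datalines;
1 10
2 20
3 30
;
run;

data b;
    input y;
    datalines;
100
200
;
run;

/* One SET statement, two data sets: rows of b are appended after rows of a */
data concat;
    set a b;
run;

/* Two SET statements: each has its own pointer, so every iteration
   reads one observation from a AND one from b into the same PDV */
data side_by_side;
    set a;
    set b;
run;
```

With these inputs, concat should contain 5 observations, while side_by_side should contain 2, because the DATA step stops as soon as any SET statement tries to read past the end of its data set.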

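By default, SET reads every observation and every variable from the input data sets, unless you restrict it in some way (for example with data set options or a subsetting statement).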

SET <SAS-data-set(s) <(data-set-options(s) )>> <options>;

(data-set-options) specifies actions SAS is to take when it reads variables or observations into the program data vector for processing.

Tip: Data set options that apply to a data set list apply to all of the data sets in the list. Data set options specify actions that apply only to the SAS data set with which they appear. They let you perform the following operations:

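The four main operations are listed below, with examples:

renaming variables, e.g. set sashelp.class(rename=(name=name_new));

selecting only the first or last n observations for processing, e.g. sashelp.class(firstobs=3 obs=6); rows can also be filtered by a condition with sashelp.class(where=(sex='M')); note that the values of where= and rename= must be enclosed in parentheses

dropping variables from processing or from the output data set, e.g. sashelp.class(drop=name sex); or, to keep only the listed variables, sashelp.class(keep=name sex);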

specifying a password for a data set
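Several of these options can be combined in one set of parentheses. The sketch below is illustrative only: the output name boys is made up, and the options are chosen so that they do not touch the same variable, which avoids having to reason about the order in which SAS applies them.

```sas
data boys;
    /* where= filters rows on sex, drop= removes age,
       rename= changes name to name_new in the DATA step */
    set sashelp.class(where=(sex='M')
                      drop=age
                      rename=(name=name_new));
run;

proc print data=boys noobs; run;
```

The result should contain only the male students, with the variables name_new, sex, height and weight.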

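Writing two output data sets

/* Without explicit OUTPUT statements, every observation goes to both
   d1 and d2; only the variables kept in each data set differ. */
data d1(keep=name) d2(keep=name sex);
    set sashelp.class(keep=name sex);
run;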
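If the two output data sets should receive different rows rather than just different columns, an explicit OUTPUT statement can route each observation. This is a sketch under that assumption; the names boys and girls are hypothetical.

```sas
data boys(keep=name) girls(keep=name sex);
    set sashelp.class(keep=name sex);
    /* send each observation to exactly one of the two outputs */
    if sex='M' then output boys;
    else output girls;
run;
```

Here sex is still available for the IF test even though boys does not keep it, because keep= on an output data set only takes effect when the observation is written.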

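Using the IN= option

IN= is an option, not a variable. The variable it names is temporary and is not written to the output data set, so to keep its value you have to copy it into an ordinary variable with an assignment statement. The main use of IN= is to identify which data set an observation came from.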



data one;
    input x y$;
    cards;
    1 a
    2 b
    ;
run;
data two;
    input x z$;
    cards;
    3 c
    2 d
    ;
run;

data me;
    /* ina is 1 when the current observation comes from one,
       inb is 1 when it comes from two */
    set one(in=ina) two(in=inb);
    if ina=1 then flag=1;
    else flag=0;
run;


Result:
http://images.cnitblog.com/i/561890/201406/180210596617372.jpg
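In case the screenshot is not available, the result can be reproduced by printing it. The sketch below assumes the data sets one and two created above; me_src and source are illustrative names that are not part of the original example.

```sas
data me_src;
    set one(in=ina) two(in=inb);
    length source $3;
    /* copy the temporary IN= variables into ordinary variables
       so that they survive in the output data set */
    if ina then source='one';
    else if inb then source='two';
    flag=ina;
run;

proc print data=me_src noobs; run;
```

The two rows read from one should show flag=1 and a blank z, and the two rows read from two should show flag=0 and a blank y.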

/* read only observations 3 through 6 */
data me;
    set sashelp.class(firstobs=3 obs=6);
run;

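* Get the total number of observations in a data set;

data me(keep=total);
    /* Improved form: if 0 then set sashelp.Slkwxl nobs=total_obs;
       SAS compiles before it executes, so the SET does not have to run;
       the information gathered at compile time is enough. */
    set sashelp.Slkwxl nobs=total_obs;
    total=total_obs;
    output;
    stop;   /* reading one observation is enough: write total, then end the step */
run;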

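SET works like this: it reads the first observation, then execution continues with total=total_obs; and the statements after it. On reaching STOP the step ends; otherwise, as long as no error occurs, control returns to the top of the DATA step and SET reads the second observation. So if you leave out the STOP statement, you get as many rows as there are observations, each holding the same total.

1: At compile time SAS first processes the NOBS= option. The count is stored in the data set's header (descriptor) information, and nobs=total_obs passes the total number of observations to the temporary variable total_obs.

2: The data set is read into the PDV, and all of its variables are placed in the PDV.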
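Because NOBS= is resolved at compile time, the SET statement never has to execute. A minimal sketch of that variant follows; sashelp.class is used here only as a convenient input and is an assumption of this example.

```sas
data me(keep=total);
    /* the condition is always false, so no observation is ever read;
       total_obs was already filled in when the step was compiled */
    if 0 then set sashelp.class nobs=total_obs;
    total=total_obs;
    output;
    stop;   /* end the step explicitly, since SET never reaches end of file */
run;

proc print data=me noobs; run;
```

Compared with the version that actually reads the first observation, this form does no data reading at all, which matters mostly for very large data sets.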

This article was excerpted from blogs found online.
