SAS Programming Tips -- Macro Loops vs. DATA Step Loops
(2013-01-08 21:18:51)
Tags: Miscellaneous | Category: SAS_Quick_Tips
SAS for beginners
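To illustrate the difference between the two kinds of loop, let's first look at the SAS code each approach ultimately generates, along with its run time.
1) DATA step loop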
data temp;
    retain rule_statement;
    length rule_statement $5000;
    set rule_data(where = (rule_id = "r1"));
    if 2 = 2 then do;
        rule_statement = 'if '||"rc_term1 = "||quote(trim(rc_term1));
        if 3 = 2 then do;
            rule_statement = trim(left(rule_statement))||' then do ; ';
        end;
    end;
    else if 2 = 3 then do;
        rule_statement = trim(left(rule_statement))||" and rc_term1 = "||quote(trim(rc_term1))||' then do ; ';
    end;
    else do;
        rule_statement = trim(left(rule_statement))||" and rc_term1 = "||quote(trim(rc_term1));
    end;
    if 3 = 2 then do;
        rule_statement = 'if '||"rc_term2 = "||quote(trim(rc_term2));
        if 3 = 2 then do;
            rule_statement = trim(left(rule_statement))||' then do ; ';
        end;
    end;
    else if 3 = 3 then do;
        rule_statement = trim(left(rule_statement))||" and rc_term2 = "||quote(trim(rc_term2))||' then do ; ';
    end;
    else do;
        rule_statement = trim(left(rule_statement))||" and rc_term2 = "||quote(trim(rc_term2));
    end;
    if 2 = 2 then do;
        rule_statement = trim(left(rule_statement))||"_ra_term1 = "||quote(trim(ra_term1))||';';
    end;
    else if 2 = 4 then do;
        rule_statement = trim(left(rule_statement))||' sequence = '||left(sequence)||'; output rule_result; end;';
        call symput('rule_statement'||left(put(_n_,15.)), rule_statement);
        call symput("rule_statement_count",_n_);
    end;
    else do;
        rule_statement = trim(left(rule_statement))||" _ra_term1 = "||quote(trim(ra_term1))||';';
    end;
    if 3 = 2 then do;
        rule_statement = trim(left(rule_statement))||"_ra_term2 = "||quote(trim(ra_term2))||';';
    end;
    else if 3 = 4 then do;
        rule_statement = trim(left(rule_statement))||' sequence = '||left(sequence)||'; output rule_result; end;';
        call symput('rule_statement'||left(put(_n_,15.)), rule_statement);
        call symput("rule_statement_count",_n_);
    end;
    else do;
        rule_statement = trim(left(rule_statement))||" _ra_term2 = "||quote(trim(ra_term2))||';';
    end;
    if 4 = 2 then do;
        rule_statement = trim(left(rule_statement))||"_ra_term3 = "||quote(trim(ra_term3))||';';
    end;
    else if 4 = 4 then do;
        rule_statement = trim(left(rule_statement))||' sequence = '||left(sequence)||'; output rule_result; end;';
        call symput('rule_statement'||left(put(_n_,15.)), rule_statement);
        call symput("rule_statement_count",_n_);
    end;
    else do;
        rule_statement = trim(left(rule_statement))||" _ra_term3 = "||quote(trim(ra_term3))||';';
    end;
run;
NOTE: There were 1500 observations read from the data set WORK.RULE_DATA.
NOTE: The data set WORK.TEMP has 1500 observations and 9 variables.
NOTE: DATA statement used (Total process time):
2) Macro loop
data temp;
do;';
rule_statement=trim(rule_statement)||'ra_term1='||quote(trim(ra_term1))||';'||'ra_term2='||quote(trim(ra_term2))||';'||'ra_term3='||quote(trim(ra_term3))||';'||' sequence='||strip(sequence)||';output rule_result;end;';
NOTE: There were 1500 observations read from the data set WORK.RULE_DATA.
NOTE: The data set WORK.TEMP has 1500 observations and 9 variables.
NOTE: DATA statement used (Total process time):
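I ran this on a virtual machine whose performance is only average. Judging from the test results, the performance gap between the two versions is nearly ...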
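The run times in this post come from the "DATA statement used" NOTE in the log. As a rough way to reproduce such a comparison, the sketch below measures wall-clock time around a single step. The names t0 and elapsed and the stand-in workload are illustrative assumptions, not part of the original code.

```sas
/* Hypothetical timing harness: t0 and elapsed are illustrative names. */
%let t0 = %sysfunc(datetime());

data _null_;
    do i = 1 to 1000000;
        x = sqrt(i);   /* stand-in workload; replace with the step under test */
    end;
run;

/* %sysevalf is used because datetime() returns a fractional number. */
%let elapsed = %sysevalf(%sysfunc(datetime()) - &t0);
%put NOTE: Step took &elapsed seconds (wall clock).;
```

Setting options fullstimer; before the step is another option if you want the log itself to report more detailed resource usage.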
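Below is the full code (mainly the part that generates each rule's conditional tests and assignment statements). There are three macros:
1) testdata
Generates the test data. The number of rows is controlled by the macro variable rule_data_count.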
%let rule_data_count=1500;
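The body of testdata is not shown in this post, so the following is only a guess at what such a generator might look like. The data set name rule_data_sketch, the column lengths, and the generated values are all assumptions; the column names are taken from the generated code earlier in the post.

```sas
%let rule_data_count=1500;

/* Sketch only: column lengths and values are assumptions. */
data rule_data_sketch;
    length rule_id $8 rc_term1-rc_term3 ra_term1-ra_term3 $32;
    do sequence = 1 to &rule_data_count;   /* one row per iteration */
        rule_id  = "r1";
        rc_term1 = cats("rc_term_value1_", sequence);
        rc_term2 = cats("rc_term_value2_", sequence);
        rc_term3 = cats("rc_term_value3_", sequence);
        ra_term1 = cats("ra_term_value1_", sequence);
        ra_term2 = cats("ra_term_value2_", sequence);
        ra_term3 = cats("ra_term_value3_", sequence);
        output;
    end;
run;
```

Changing rule_data_count is then enough to scale the test up or down.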
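2) loop
Uses a macro loop. Because SAS macro quoting is needed to handle special characters such as single quotes and parentheses, the code that assigns the macro variables is fairly complex. The DATA step code produced once macro substitution completes, however, is concise and its logic is clear. Performance and code simplicity are sometimes inversely related, so weigh the trade-offs before choosing. If the data volume is small, prefer simple, maintainable code over complex, clever code.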
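To make the idea concrete, here is a minimal sketch of a macro-time loop that emits one compact assignment statement. It is not the author's loop macro: build_condition_sketch, rule_data_sketch, and the sample values are hypothetical, and the macro quoting the post mentions is left out by keeping the input free of special characters.

```sas
/* One-row stand-in for rule_data (hypothetical values). */
data rule_data_sketch;
    rc_term1 = "A";
    rc_term2 = "B";
run;

%macro build_condition_sketch(terms);
    %local i n term;
    %let n = %sysfunc(countw(&terms,:));   /* number of colon-separated terms */
    data _null_;
        length rule_statement $5000;
        set rule_data_sketch;
        /* %do and %if run at macro time, so only the text that is */
        /* actually needed ends up in the generated statement.     */
        rule_statement = 'if '
        %do i = 1 %to &n;
            %let term = %scan(&terms, &i, :);
            %if &i > 1 %then || ' and ';
            || "&term = " || quote(trim(&term))
        %end;
        || ' then do;';
        put rule_statement=;
    run;
%mend build_condition_sketch;

options mprint;
%build_condition_sketch(rc_term1:rc_term2)
```

With mprint on, the log should show a single concatenation covering rc_term1 and rc_term2, with no constant comparisons left for the DATA step to evaluate on every row.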
3) loop_org
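Uses a DATA step loop. The code is simple and easy to follow, but the DATA step generated after macro substitution contains many unnecessary if...else statements, which makes the code very inefficient and obscures the business logic.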
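The constant comparisons such as "if 2 = 2" in the generated code can be reproduced with a small sketch like the one below, where the loop counter is substituted into ordinary DATA step if statements instead of being tested with %if. The macro name runtime_branch_sketch and the message text are hypothetical.

```sas
%macro runtime_branch_sketch(n);
    %local i;
    data _null_;
        length msg $200;
        %do i = 1 %to &n;
            /* &i and &n resolve to constants, so every branch is emitted */
            /* and the comparison is repeated at run time for each row.   */
            if &i = 1 then msg = "first term";
            else if &i = &n then msg = catx(' ', msg, "last term");
            else msg = catx(' ', msg, "middle term");
        %end;
        put msg=;
    run;
%mend runtime_branch_sketch;

options mprint;
%runtime_branch_sketch(3)
```

The mprint output contains three copies of the if/else chain, one per iteration, even though only one branch of each copy can ever execute.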
%let rule_data_count=1500;
%let rule_count=2;
%let mvar_rc_term1=rc_term1;
%let mvar_rc_term2=rc_term2;
%let mvar_rc_term3=rc_term3;
%let mvar_ra_term1=ra_term1;
%let mvar_ra_term2=ra_term2;
%let mvar_ra_term3=ra_term3;
%let rc_term1=rc_term_value1;
%let rc_term2=rc_term_value2;
%let rc_term3=rc_term_value3;
%let ra_term1=ra_term_value1;
%let ra_term2=ra_term_value2;
%let ra_term3=ra_term_value3;
%let rulecondition1=rule1condition:mvar_rc_term1:mvar_rc_term2;
%let ruleaction1=rule1action:mvar_ra_term1:mvar_ra_term2:mvar_ra_term3;
%let rulecondition2=rule2condition:mvar_rc_term1:mvar_rc_term3;
%let ruleaction2=rule2action:mvar_ra_term1;
%let rule1=r1;
%let rule2=r2;
%macro testdata;
%let sequence=0;
data rule_data;
run;
%mend;
%macro loop_org;
%do rule_no = 1 %to &rule_count;
%end;
%mend;
%macro loop;
%do rule_no = 1 %to &rule_count;
%end;
%mend;
options mprint;
%testdata;
%loop;
%loop_org;
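The macro bodies are not reproduced in the listing, so how they consume the colon-delimited rule definitions is not shown. One plausible reading, sketched here as an assumption, is that each rule string is looked up by rule number, split with %scan, and each mvar_ name is then resolved indirectly to a column name. The helper variables cond, name, and column are hypothetical.

```sas
%let mvar_rc_term1=rc_term1;
%let rulecondition1=rule1condition:mvar_rc_term1:mvar_rc_term2;
%let rule_no=1;

/* && defers resolution: &&rulecondition&rule_no becomes &rulecondition1 */
%let cond = &&rulecondition&rule_no;

/* second colon-separated item: mvar_rc_term1 */
%let name = %scan(&cond, 2, :);

/* &&&name becomes &mvar_rc_term1, which resolves to rc_term1 */
%let column = &&&name;

%put cond=&cond name=&name column=&column;
```

The %put line should report cond=rule1condition:mvar_rc_term1:mvar_rc_term2, name=mvar_rc_term1, and column=rc_term1.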