Oracle start with ... connect by用法简介
(2012-08-26 22:57:37)
标签:
startwithconnectby杂谈 |
通过START WITH . . . CONNECT BY . . .子句来实现SQL的层次查询.
自从Oracle 9i开始,可以通过 SYS_CONNECT_BY_PATH
函数实现将父节点到当前行内容以“path”或者层次元素列表的形式显示出来。
自从Oracle 10g 中,还有其他更多关于层次查询的新特性
。例如,有的时候用户更关心的是每个层次分支中等级最低的内容。
那么你就可以利用伪列函数CONNECT_BY_ISLEAF来判断当前行是不是叶子。如果是叶子就会在伪列中显示“1”,
如果不是叶子而是一个分支(例如当前内容是其他行的父亲)就显示“0”。
在Oracle 10g 之前的版本中,如果在你的树中出现了环状循环(如一个孩子节点引用一个父亲节点),
Oracle 就会报出一个错误提示:“ ORA-01436: CONNECT BY loop in user
data”。如果不删掉对父亲的引用就无法执行查询操作。
而在 Oracle 10g
中,只要指定“NOCYCLE”就可以进行任意的查询操作。与这个关键字相关的还有一个伪列——CONNECT_BY_ISCYCLE,
如果在当前行中引用了某个父亲节点的内容并在树中出现了循环,那么该行的伪列中就会显示“1”,否则就显示“0”。
The start with .. connect by clause can be used to select data
that has a hierarchical relationship
(usually some sort of parent->child,
boss->employee or
thing->parts).
It is also being used when an sql execution plan is
explained.
syntax:
select ... [start with initial-condition] connect by [nocycle]
recurse-condition
level
With level it is possible to show the level in the
hierarchical relation of all the data.
--oracle 9i
sys_connect_by_path
With sys_connect_by_path it is possible to show the entire
path from the top level down to the 'actual' child.
--oracle 10g
connect_by_root
connect_by_root is a new operator that comes with Oracle 10g
and enhances the ability to perform hierarchical queries.
connect_by_is_leaf
connect_by_isleaf is a new operator that comes with Oracle 10g
and enhances the ability to perform hierarchical
queries.
connect_by_iscycle
connect_by_is_cycle is a new operator that comes with Oracle
10g and enhances the ability to perform hierarchical
queries.
--start with ... connect by ... 的处理机制
How must a start with ... connect by select statement be read
and interpreted?
If Oracle encounters such an SQL statement, it proceeds as
described in the following pseude code.
for rec in (select * from some_table) loop
end loop;
procedure RECURSE (rec in MATCHES_SELECT_STMT, new_parent IN
field_type) is
end procedure RECURSE;
created by zhouwf0726 2006.
*******************************************************************************/
--创建测试表,增加测试数据
create table test(superid varchar2(20),id varchar2(20));
insert into test values('0','1');
insert into test values('0','2');
insert into test values('1','11');
insert into test values('1','12');
insert into test values('2','21');
insert into test values('2','22');
insert into test values('11','111');
insert into test values('11','112');
insert into test values('12','121');
insert into test values('12','122');
insert into test values('21','211');
insert into test values('21','212');
insert into test values('22','221');
insert into test values('22','222');
commit;
--层次查询示例
select level||'层',lpad(' ',level*5)||id
id
from test
start with superid = '0' connect by prior id=superid;
select level||'层',connect_by_isleaf,lpad(' ',level*5)||id
id
from test
start with superid = '0' connect by prior id=superid;
--给出两个以前在"数据库字符串分组相加之四"中的例子来理解start with ... connect by
...
--功能:实现按照superid分组,把id用";"连接起来
--实现:以下两个例子都是通过构造2个伪列来实现connect by连接的。
select superid,ltrim(max(sys_connect_by_path(id,';')),';')
from(
select superid,id,row_number() over(partition by superid order
by superid) id1,
row_number() over(order by superid) + dense_rank() over(order
by superid) id2
from test
)
start with id1=1 connect by prior id2 = id2 -1
group by superid order by superid;
select distinct superid,ltrim(first_value(id) over(partition
by superid order by l desc),';')
from(
select superid,level l,sys_connect_by_path(id,';')
id
from(
select superid,id,superid||rownum
parent_rn,superid||to_char(rownum-1) rn
from test
)
connect by prior parent_rn = rn
);
--下面的例子实现把一个整数的各个位上的数字相加,通过这个例子我们再次理解connect by.
create or replace function f_digit_add(innum integer) return
number
is
outnum integer;
begin
end f_digit_add;
/
select f_digit_add(123456) from dual;
一个常见的高级计算机科学问题可以在“有向图”的范畴之下描述。有向图是由一组向量和边所连接的一组有限的节点。
例如,一个节点可以想象为一座“城市”,而每个向量可以想象为两座城市间的一个“航线”。
有很多算法和论文讲到如何解决每种可能路线的遍历问题以及寻找最短路径或者最小代价路径的问题。
这些算法中大部分都是过程化的,或者是使用递归方面来解决的。然而 SQL
的声明性语言使得解决复杂的有向图问题更加容易,
而且不需要很多代码。
让我们以两座城市之间的航线为例子,创建一个表保存一些假想数据:
create table airports
(
);
insert into airports values ('LHR','London Heathrow,
UK');
insert into airports values ('JFK','New York-Kennedy,
USA');
insert into airports values ('GRU','Sao Paulo, Brazil');
create table fares
(
);
insert into fares values('LHR','JFK',700);
insert into fares values('JFK','GRU',600);
insert into fares values('LHR','GRU',1500);
insert into fares
values('GRU','LHR',1600);
不能使用CONNECT BY 语法来解决如何从伦敦到圣保罗,因为在图中有数据产生一个环(从圣保罗飞回):
select * from fares connect by prior arrive = depart start
with depart = 'LHR';
ERROR:
ORA-01436: CONNECT BY loop in user data
要解决有向图问题,我们需要创建一个临时表来保存两个节点之间所有可能的路径。我们必须注意不复制已经处理过的路径,
而且在这种情况下,我们不想路径走回开始处的同一个地点。我还希望跟踪到达目的地所需航程的数目,以及所走路线的描述。
临时表使用以下脚本创建:
create global temporary table faretemp
(
);
一个简单的视图可以在稍微简化这个例子中使用的代码。视图可以根据 fares 表中的单个航程计算从 faretemp
表中的一个路径
到达一下一个航程的数据:
create or replace view nexthop
as
/
show errors;
这个算法相当简单。首先,使用 fares 表中的数据填充 faretemp
表,作为初始的航程。然后,取到我们刚才插入的所有数据,
使用它们建立所有可能的二航程(two-hop)路径。重复这一过程,直至在两个节点之间创建了新路径。
循环过程将在节点间所有可能的路径都被描述之后退出。如果我们只对某个开始条件感兴趣,
那么我们还可以限制第一次的插入从而减少装载数据的量。下面是发现路径的代码:
truncate table faretemp;
begin
end;
/
show errors;
select * from faretemp order by depart,arrive;
可以在表 A 中查看输出。
前面的数据有一个小问题。数据是点之间最短路径(最小航程数)的集合。然而,从伦敦到圣保罗的航程却不是最便宜的一个。
要解决最便宜的费用问题,需要对我们的循环做一个改进,当在一个航程中发现一个更便宜的路线时使用这个路线代替原来的路线。
修改后的代码如下:
truncate table faretemp;
declare
begin
end;
/
show errors;
select * from faretemp order by depart,arrive;
可能在表 B中查看输出。
算法发现LHR、JFK、GRU 路线比 LHR、GRU
路线便宜,所以用前者代替了后者。循环将在没有更便宜的费用,
并且没有其它可能路线时退出。
SQL> select level||'层',connect_by_root,lpad('
',level*5)||id id
select level||'层',connect_by_root,lpad(' ',level*5)||id
id
ERROR 位于第1行:
ORA-00904: 无效列名
connect_by_root似乎应该去掉啊。不然出错。
select
superid,ltrim(max(sys_connect_by_path(id,';')),';')
from(
)
start with id1=1 connect by prior id2 = id2 -1
group by superid order by superid;
ERROR 位于第3行:
ORA-00923: 未找到预期 FROM 关键字
写这篇文章前边的一些理论参考了不少资料。我都是从网上搜索的。
可以按照下边两个主题搜索:
start with connect by
sql有向图
数字求和的可以参考:
http://www.itpub.net/619949,1.html
分组相加的可以参考:
http://www.itpub.net/showthread.php?threadid=614563
一个高级SQL语句问题
假设有一张表,A和B字段都是NUMBER,
A B
1 2
2 3
3 4
4
有这样一些数据
现在想用一条SQL语句,查询出这样的数据
1-》2-》3—》4
就是说,A和B的数据表示一种连接的关系,现在想通过A的一个值,去查询A所对应的B值,直到B为NULL为止,
不知道这个SQL语句怎么写?请教高手!谢谢
*********************************************************************************************/
--以下是利用start with connect by的一个简单回答
CREATE TABLE TEST(COL1 NUMBER(18,0),COL2 NUMBER(18,0));
INSERT INTO TEST VALUES(1,2);
INSERT INTO TEST VALUES(2,3);
INSERT INTO TEST VALUES(3,4);
INSERT INTO TEST VALUES(4,NULL);
INSERT INTO TEST VALUES(5,6);
INSERT INTO TEST VALUES(6,7);
INSERT INTO TEST VALUES(7,8);
INSERT INTO TEST VALUES(8,NULL);
INSERT INTO TEST VALUES(9,10);
INSERT INTO TEST VALUES(10,NULL);
INSERT INTO TEST VALUES(11,12);
INSERT INTO TEST VALUES(12,13);
INSERT INTO TEST VALUES(13,14);
INSERT INTO TEST VALUES(14,NULL);
select max(col) from(
select SUBSTR(col,1,CASE WHEN
INSTR(col,'->')>0 THEN
INSTR(col,'->') - 1 ELSE LENGTH(col) END) FLAG,col
from(
select
ltrim(sys_connect_by_path(col1,'->'),'->')
col from (
select col1,col2,CASE WHEN LAG(COL2,1,NULL) OVER(ORDER BY
ROWNUM) IS NULL THEN 1 ELSE 0 END FLAG
from test
)
start with flag=1 connect by col1=prior col2
)
)
group by flag;
没人顶?自己顶一个!
刚才抽时间大致看了一下:
作者Scott Stephens的例子用他自己的数据没有问题,但是如果改成以下数据就会有问题。
delete from fares;
delete from airports;
insert into airports values('BJ','BEIJING');
insert into airports values('SH','SHANGHAI');
insert into airports values('GZ','GUANGZHOU');
INSERT INTO FARES VALUES('BJ','SH',500);
INSERT INTO FARES VALUES('SH','GZ',1500);
INSERT INTO FARES VALUES('BJ','GZ',1800);
INSERT INTO FARES VALUES('GZ','BJ',1600);
执行:
truncate table faretemp;
begin
-- initial connections
insert into faretemp
select depart,arrive,1,depart||','||arrive,price from
fares;
while sql%rowcount > 0 loop
insert into faretemp
select depart,arrive,hops,route,price from nexthop
where (depart,arrive)
not in (select depart,arrive from faretemp);
end loop;
end;
/
show errors;
select * from faretemp order by depart,arrive;
DEPART ARRIVE
HOPS ROUTE
PRICE
---------- ---------- ---------------------------------------
------------------------------ ----------
BJ
GZ
1 BJ,GZ
1800
BJ
SH
1 BJ,SH
500
GZ
BJ
1 GZ,BJ
1600
GZ
SH
2 GZ,BJ,SH
2100
SH
BJ
2 SH,GZ,BJ
3100
SH
GZ
1 SH,GZ
1500
6 rows selected
明显少了一条遍历数据!
自己也写了个SQL暂时能解决这些测试数据:
select path,price from(
select l,path,lag(price) over(order by rownum) price
from(
select l,flag,path,sum(price) over(partition by flag order by
rownum) price from(
select l,depart,arrive,path,
SUBSTR(path,1,CASE WHEN
INSTR(path,'->')>0 THEN
INSTR(path,'->') - 1 ELSE LENGTH(path) END)
FLAG,price
from(
select level
l,depart,arrive,ltrim(sys_connect_by_path(depart,'->'),'->')
path,price
--ltrim(sys_connect_by_path(price,'+'),'+') price
from fares connect by nocycle depart=prior arrive
)
)
)
where SUBSTR(path,1,CASE WHEN
INSTR(path,'->')>0 THEN
INSTR(path,'->') - 1 ELSE LENGTH(path) END)
!=
SUBSTR(path,CASE WHEN
INSTR(reverse(path),'>-')>0 THEN -
INSTR(reverse(path),'>-') + 1 ELSE - LENGTH(path)
END)
) where l=3;
PATH
PRICE
--------------------------------------------------------------------------------
----------
BJ->SH->GZ
2000
GZ->BJ->SH
2100
SH->GZ->BJ
3100
但是如果再往fare表加入数据,
insert into fares values('GZ','SH',1300);
再次执行:
select path,price from(
select l,path,lag(price) over(order by rownum) price
from(
select l,flag,path,sum(price) over(partition by flag order by
rownum) price from(
select l,depart,arrive,path,
SUBSTR(path,1,CASE WHEN
INSTR(path,'->')>0 THEN
INSTR(path,'->') - 1 ELSE LENGTH(path) END)
FLAG,price
from(
select level
l,depart,arrive,ltrim(sys_connect_by_path(depart,'->'),'->')
path,price
--ltrim(sys_connect_by_path(price,'+'),'+') price
from fares connect by nocycle depart=prior arrive
)
)
)
where SUBSTR(path,1,CASE WHEN
INSTR(path,'->')>0 THEN
INSTR(path,'->') - 1 ELSE LENGTH(path) END)
!=
SUBSTR(path,CASE WHEN
INSTR(reverse(path),'>-')>0 THEN -
INSTR(reverse(path),'>-') + 1 ELSE - LENGTH(path)
END)
) where l=3;
没有把 BJ->GZ->SH
3100这条数据给查出来,这个是否connect
by处理的问题?考虑可能是用nocycle参数oracle遇到第一个闭环就停止了,所以没有继续连接下边的数据。
执行Scott Stephens的脚本:
truncate table faretemp;
declare
l_count integer;
begin
-- initial connections
insert into faretemp
select depart,arrive,1,depart||','||arrive,price from
fares;
l_count := sql%rowcount;
while l_count > 0 loop
update faretemp
set (hops,route,price) =
(select hops,route,price from nexthop
where depart = faretemp.depart
and arrive = faretemp.arrive)
where (depart,arrive) in
(select depart,arrive from nexthop
where price < faretemp.price);
l_count := sql%rowcount;
insert into faretemp
select depart,arrive,hops,route,price from nexthop
where (depart,arrive)
not in (select depart,arrive from faretemp);
l_count := l_count + sql%rowcount;
end loop;
end;
/
show errors;
select * from faretemp order by depart,arrive;
select * from faretemp order by depart,arrive;
DEPART ARRIVE
HOPS ROUTE
PRICE
---------- ---------- ---------------------------------------
------------------------------ ----------
BJ
GZ
1 BJ,GZ
1800
BJ
SH
1 BJ,SH
500
GZ
BJ
1 GZ,BJ
1600
GZ
SH
1 GZ,SH
1300
SH
BJ
2 SH,GZ,BJ
3100
SH
GZ
1 SH,GZ
1500
6 rows selected
就能得到一条遍历数据。
前一篇:Oracle BOM分解[转]
后一篇:Oracle求数字求和 分组连接