merge的使用方法
(2012-07-11 12:10:21)
标签:
杂谈 |
分类: r |
http://mcube.nju.edu.cn/jokergoo/newblog/post/usage-of-merge/
发表于 2010/10/1 周五 晚上 22:17
merge函数可以将两个数据框按照指定的方式连接在一起,如果把数据框看做数据库中的一张表时,merge的功能可以看作是对两个表join
使用方法:
缺省下, merge等价于在两个表中的NATURAL JOIN,或者设置by的值使得merge等价于INNER
JOIN,也可以设置all的值来使得merge等价于OUTER
JOIN或者FULL,在缺省条件下,如果在两个数据框中没有相同的列名,则使用CROSS JOIN 比如有如下两个数据框
merge(x, y, by = , by.x = , by.y = , all = , all.x = , all.y = , sort = , suffixes = , incomparables = , ...)其中的参数解释如下
| Argument | Description | Default |
|---|---|---|
| x | 数据框1 | |
| y | 数据框2 | |
| by | 连接的列名 | intersect(names(x), names(y)) |
| by.x | 第一个数据框要连接的列名 | by |
| by.y | 第二个数据框要连接的列名 | by |
| all | 如果连接后没有这一行的记录,是否要包含此行 | FALSE |
| all.x | 如果连接后第一个数据框中没有这一行的记录,是否要包含此行 | all |
| all.y | 如果连接后第二个数据框中没有这一行的记录,是否要包含此行 | all |
> persons id_p firstname lastname address city 1 1 John Adams Oxford Street London 2 2 George Bush Fifth Avenue New York 3 3 Thomas Carter Changa Street Beijing > orders id_o orderno id_p 1 1 77985 3 2 2 44678 3 3 3 22456 1 4 4 24562 1 5 5 34764 65因为其中有同名的列名,使用merge而不指定要连接的列名
> merge(orders, persons) id_p id_o orderno firstname lastname address city 1 1 3 22456 John Adams Oxford Street London 2 1 4 24562 John Adams Oxford Street London 3 3 1 77985 Thomas Carter Changa Street Beijing 4 3 2 44678 Thomas Carter Changa Street Beijing这个如下的内连接相同
> merge(orders, persons, by="id_p") > merge(orders, persons, by.x="id_p", by.y="id_p")等价于如下的SQL语句
SELECT o.id_p, o.id_o, o.orderno, p.firstname, p.lastname, p.address, p.city FROM orders AS o AND persons AS p WHERE o.id_p = op.id_p
> merge(orders, persons, by.x = "id_p", by.y = "id_p", all.x = TRUE) id_p id_o orderno firstname lastname address city 1 1 3 22456 John Adams Oxford Street London 2 1 4 24562 John Adams Oxford Street London 3 3 1 77985 Thomas Carter Changa Street Beijing 4 3 2 44678 Thomas Carter Changa Street Beijing 5 65 5 34764 <NA> <NA> <NA> <NA> > merge(persons, orders, by.x = "id_p", by.y = "id_p", all.y = TRUE)等价于
SELECT o.id_p, o.id_o, o.orderno, p.firstname, p.lastname, p.address, p.city FROM orders AS o LEFT JOIN persons AS p ON o.id_p = op.id_p SELECT o.id_p, o.id_o, o.orderno, p.firstname, p.lastname, p.address, p.city FROM persons AS p RIGHT JOIN orders AS o ON o.id_p = op.id_p
> merge(orders, persons, by.x = "id_p", by.y = "id_p", all = TRUE) id_p id_o orderno firstname lastname address city 1 1 3 22456 John Adams Oxford Street London 2 1 4 24562 John Adams Oxford Street London 3 2 NA NA George Bush Fifth Avenue New York 4 3 1 77985 Thomas Carter Changa Street Beijing 5 3 2 44678 Thomas Carter Changa Street Beijing 6 65 5 34764 <NA> <NA> <NA> <NA>等价于
SELECT o.id_p, o.id_o, o.orderno, p.firstname, p.lastname, p.address, p.city FROM orders AS o OUTER JOIN persons AS p
前一篇:如何使用R语言删除一行数据
后一篇:debian install r

加载中…