[Repost] COCA Corpus Query Instructions

(2016-10-27 03:35:39)
Original post: COCA Corpus Query Instructions. Author: 清水
 
WORD(S)   1  
COLLOCATES   2        3       4
POS LIST   5  


You enter the basic search string in [1]. Optionally, you can enter "COLLOCATES" words in [2], and indicate how many words away this is with [3] and [4]. You can use the "Part of Speech" list in [5] to enter tags into [1] or into [2] (when [2] is visible). You can toggle whether to use [COLLOCATES] and [POS LIST] by clicking on these words.

You would use just [1] when you are searching for an exact number of words. For example, [n*] chip in [1] finds a two-word string: chip immediately preceded by a noun.

With the COLLOCATES / CONTEXT searches, on the other hand, the word in [2] can occur anywhere "near" [1], as specified by [3]-[4]. For example, chip in [1], [n*] in [2], and 5 in both [3] and [4] finds all cases of a noun within five words to the left or the right of chip.

You can also use a combination of the WORDS and COLLOCATES field (actually, CONTEXT in this case) to carry out syntactic searches involving a variable number of words between the two fields.

 


Although the corpus is not parsed, it is still possible to use part of speech tags and a variable number of words between two parts of the construction, to approximate searches involving noun phrases, relative clauses, and so on. To look for the following constructions, you would enter [-] in WORD(S), [-] in COLLOCATES (actually CONTEXT, in these cases), and [-] for the maximum length in words (up to nine words, left and right) that [-] can be from [-]. Just click on [Click to see] below to run the queries.

 

# words  Construction                                              Overall chart   List
A        [vv*] NOUN PHRASE into [v?g*]                             Click to see    Click to see
1        [vv*] her into [v?g*]   (e.g. talked her into staying)
2        [vv*] the people into [v?g*]
4        [vv*] my best friend into [v?g*]

B        what|all RELATIVE CLAUSE do [be] [v*]                     Click to see    Click to see
4        what|all he wants to do [be] [v*]   (e.g. what|all he wants to do is complain)
5        what|all they expected Fred to do [be] [v*]
7        what|all any of these crazy people can do [be] [v*]
8        what|all your best friend can possibly hope to do [be] [v*]

C        [expect] [a*]|[d*]|[n*]|[p*] NOUN PHRASE [v?i*]           Click to see    Click to see
2        [expect] them to [v?i*]                    (them = [p*] pronoun)
3        [expect] Bill Clinton to [v?i*]            (Bill = [np*] proper noun)
4        [expect] those six people to [v?i*]        (those = [d*] demonstrative)
5        [expect] the people in Florida to [v?i*]   (the = [a*] article)


Note


Use [a*]|[d*]|[n*]|[p*] to look for the first word of a noun phrase (you may want to refine this further). You can also use the negator - to indicate NOT, e.g. -[v*]|[r*] (not a verb or adverb) or -to|will|would (none of these three words). Make sure there is no space to the left or right of | when there is a series of elements.

Notes:

1. Not all of the KWIC entries will in fact be relevant, because we haven't placed any constraints on what is between the yellow and the green parts of the search. But using the yellow portion as an "anchor" is still far better than searching for just the green portion.

2. The yellow (anchor) portion can only have one word, not a sequence of two or three words. For this one word, however, there can be any number of possibilities, such as either what or all in [B] above.
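
The anchor-plus-gap idea in these notes can be sketched in a few lines of Python. This is only an illustration and not how the corpus interface is implemented: the names `pos`, `word` and `gap_search`, the `(word, tag)` token format and the sample sentence are all assumptions made for the example. As in note 1, nothing constrains the words that fall between the anchor and the rest of the construction.

```python
from fnmatch import fnmatchcase

def pos(pattern):
    """Predicate: the token's tag matches a wildcard pattern such as 'vv*'."""
    return lambda tok: fnmatchcase(tok[1], pattern)

def word(*forms):
    """Predicate: the token's word form is one of the given forms."""
    return lambda tok: tok[0].lower() in forms

def gap_search(tokens, anchor, tail, max_gap):
    """Find a one-word anchor followed, after 0..max_gap unconstrained
    words, by the fixed sequence of predicates in `tail`."""
    hits = []
    for i, tok in enumerate(tokens):
        if not anchor(tok):
            continue
        for gap in range(max_gap + 1):
            start = i + 1 + gap
            segment = tokens[start:start + len(tail)]
            if len(segment) == len(tail) and all(p(t) for p, t in zip(tail, segment)):
                hits.append(" ".join(w for w, _ in tokens[i:start + len(tail)]))
                break  # keep only the shortest match for this anchor
    return hits

# Hypothetical hand-tagged sentence (tags loosely follow the CLAWS style).
sentence = [("He", "pphs1"), ("talked", "vvd"), ("my", "appge"), ("best", "jjt"),
            ("friend", "nn1"), ("into", "ii"), ("staying", "vvg"), ("late", "rr")]

# Construction A: [vv*] NOUN PHRASE into [v?g*]
found = gap_search(sentence, pos("vv*"), [word("into"), pos("v?g*")], max_gap=4)
print(found)
```

With a smaller `max_gap` the three-word noun phrase no longer fits and the sentence is not matched, which is why the table above lists a word count per construction.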



One "slot": make sure there is no space, or it will be interpreted as two consecutive words.

Syntax         Meaning                         Example (click to run)           Sample matches
word           One exact word                  mysterious                       mysterious
[pos]          Part of speech (exact)          [vvg]                            going, using
[pos*]         Part of speech (wildcard)       [v*]                             find, does, keeping, started
[lemma]        Lemma (all forms of a word)     [sing]                           sing, singing, sang
                                               [tall]                           tall, taller, tallest
[=word]        Synonyms                        [=strong]                        formidable, muscular, fervent
[user:list]    Customized list                 [mark_davies@byu.edu:clothes]    tie, shirt, blouse
word|word      Any of these words              stunning|gorgeous|charming       stunning, charming, gorgeous
*xx            Wildcard: * = any # letters     un*ly                            unlikely, unusually
x?xx           Wildcard: ? = one letter        s?ng                             sing, sang, song
x?xx*          Wildcards: ? and * combined     s?ng*                            song, singer, songbirds
-word          NOT                             -[nn*]                           the, in, is

The NOT operator can be followed by a PoS tag, lemma, word, etc. It is most useful for "multiple slot" queries; see below.
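
As a rough illustration of how the word-form parts of this syntax behave (the wildcards * and ?, the | alternatives, and the - negator), here is a minimal Python sketch. It is an assumption-laden approximation, not the corpus engine: `slot_to_regex` and `matches` are hypothetical helper names, `\w` stands in for "letter", and part-of-speech tags, lemmas, synonyms and customized lists are not handled.

```python
import re

def slot_to_regex(slot):
    """Compile a word-form slot such as 'un*ly', 's?ng' or 'a|b|c'."""
    alternatives = []
    for alt in slot.split("|"):
        # Escape everything, then turn the two wildcards back into regex.
        pattern = re.escape(alt).replace(r"\*", r"\w*").replace(r"\?", r"\w")
        alternatives.append(pattern)
    return re.compile("(?:%s)" % "|".join(alternatives), re.IGNORECASE)

def matches(slot, word):
    """True if `word` fills the slot; a leading '-' negates the whole slot."""
    negate = slot.startswith("-") and len(slot) > 1
    if negate:
        slot = slot[1:]
    hit = slot_to_regex(slot).fullmatch(word) is not None
    return hit != negate

for slot, candidate in [("un*ly", "unlikely"), ("s?ng", "sang"),
                        ("s?ng", "string"), ("-to|will|would", "can")]:
    print(slot, candidate, matches(slot, candidate))
```

Matching is done against the whole word, so s?ng accepts sing, sang and song but not string.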

Combinations of the preceding (samples)

You can limit to a particular part of speech by adding a period (full stop) and then the part of speech tag in brackets. This is always optional. Make sure there is no space before or after the period (full stop), or it will be interpreted as two consecutive words.

Syntax                 Meaning                                        Example                                 Sample matches
word.[pos]             Exact word and part of speech                  strike.[v*]                             strike (only as a verb)
word*.[pos]            Substring and part of speech                   dis*.[vvd]                              discovered, disappeared, discussed
[lemma].[pos]          Lemma and part of speech                       [strike].[v*]                           strike, struck, striking
[=word].[pos]          Synonym and part of speech                     [=beat].[v*]                            hit, strike, defeat (but not nouns, like rhythm or drumming)

You can add "lemma" to any other type of search, such as synonym or customized list, to see all forms of the matching words. Just use an extra set of brackets.

Syntax                 Meaning                                        Example                                 Sample matches
[[=word]]              Synonym and lemma                              [[=publish]]                            announced, circulating, publishes, issue (no part of speech specified, so some noun uses)
[[user:list]]          Customized list and lemma                      [[mark_davies@byu.edu:clothes]]         tie, tying, socks, socked, shirt, blouses (no part of speech specified, hence tying)

You can also choose lemma and part of speech by combining the preceding symbols.

Syntax                 Meaning                                        Example                                 Sample matches
[[=word]].[pos]        Synonym and lemma and part of speech           [[=clean]].[v*]                         mop, scrubs, polishing
[[user:list]].[pos]    Customized list and lemma and part of speech   [[mark_davies@byu.edu:clothes]].[n*]    tie, ties, sock, socks (i.e. just nouns)
Multiple "slots": create sequences of words, using any of the preceding query types. Note that in each case, there is a space between the word "slots" in the query. These are just a few examples, from an unlimited number of combinations.

Query                                                  Sample matches
nooks and crannies                                     nooks and crannies
fast|quick|rapid [nn*]                                 fast food, rapid transit
pretty -[nn*]                                          pretty smart, pretty as (but not pretty girl, pretty picture, etc.)
[get] her to [v*]                                      get her to stay, got her to sleep
.|,|; nevertheless [p*] [v*]                           . Nevertheless it is / ; nevertheless he said
[break] the [nn*]                                      break the law, broke the story
[[beat]].[v*] * [nn*]                                  beat the Yankees, beaten to death
[=gorgeous] [nn*]                                      beautiful woman, attractive wife
[put] on [ap*] [mark_davies@byu.edu:clothes].[n*]      put on her hat, putting on my pants

Notice that punctuation can be used like any "word"; just make sure that it is separated from words by a space.
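
To make the "one slot per word, separated by spaces" rule concrete, the following Python sketch matches a small subset of the multiple-slot syntax against a hand-tagged sentence. Everything here is an assumption for illustration: the `Token` type, the `KNOWN_TAGS` inventory (used only to tell a [pos] slot from a [lemma] slot), the helper names and the sample tags. Synonyms, customized lists and double-bracket lemma expansion are left out.

```python
from fnmatch import fnmatchcase
from typing import NamedTuple

class Token(NamedTuple):
    word: str
    lemma: str
    pos: str  # tag in a CLAWS-like style, e.g. "nn1", "vvd"

# Assumption: a tiny tag inventory, just enough for this example.
KNOWN_TAGS = {"nn1", "nn2", "vv0", "vvd", "vvg", "vvi", "jj", "at", "to"}

def _match_atom(atom, tok):
    """Match a bare word, a [pos] tag (wildcards allowed) or a [lemma]."""
    if atom.startswith("[") and atom.endswith("]"):
        inner = atom[1:-1].lower()
        if "*" in inner or "?" in inner or inner in KNOWN_TAGS:
            return fnmatchcase(tok.pos.lower(), inner)
        return tok.lemma.lower() == inner
    return fnmatchcase(tok.word.lower(), atom.lower())

def match_slot(slot, tok):
    """Handle the '-' negator, '|' alternatives and the '.[pos]' suffix."""
    negate = slot.startswith("-") and len(slot) > 1
    if negate:
        slot = slot[1:]
    hit = False
    for alt in slot.split("|"):
        head, sep, tag = alt.partition(".[")
        if sep and head:
            ok = _match_atom(head, tok) and fnmatchcase(tok.pos.lower(), tag.rstrip("]").lower())
        else:
            ok = _match_atom(alt, tok)
        hit = hit or ok
    return hit != negate

def find_sequences(query, tokens):
    """Slide over the tokens; every slot must match one consecutive token."""
    slots = query.split()
    hits = []
    for i in range(len(tokens) - len(slots) + 1):
        window = tokens[i:i + len(slots)]
        if all(match_slot(s, t) for s, t in zip(slots, window)):
            hits.append(" ".join(t.word for t in window))
    return hits

tokens = [Token("She", "she", "pphs1"), Token("got", "get", "vvd"),
          Token("her", "she", "ppho1"), Token("to", "to", "to"),
          Token("stay", "stay", "vvi"), Token("and", "and", "cc"),
          Token("broke", "break", "vvd"), Token("the", "the", "at"),
          Token("law", "law", "nn1")]

print(find_sequences("[get] her to [v*]", tokens))
print(find_sequences("[break] the [nn*]", tokens))
```

Because the query is split on spaces, a stray space inside a slot (for example around | or before the period) would be read as an extra slot, which is the behaviour the notes above warn about.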


 

Collocates (context-based) searches let you find the most frequent collocates (nearby words) of a given word, which often give useful insight into its meaning and usage. You can also use this field for advanced syntactic searches involving a variable number of words (such as relative clauses).

WORD(S)   1  
COLLOCATES   2        3       4 

 

Finds [2] within [3] words to the left and [4] words to the right of [1]. Click on any of the links below to run the query.
 

Each entry below gives [1] WORD(S), [2] COLLOCATES and the [3]/[4] span, followed by an explanation, the SORT BY and GROUP BY settings, and sample results.

[1] [thick]   [2] [nn*]   [3]/[4] 0/4
    A form of thick followed by a noun.
    Sort by: Frequency. Group by: Collocates. Examples: glasses, smoke

[1] laugh.[n*]   [2] [j*]   [3]/[4] 5/5
    Adjectives within five words of the noun laugh.
    Sort by: Frequency. Group by: Collocates. Examples: good, little, big

[1] laugh.[n*]   [2] (empty)   [3]/[4] 5/5
    Any words within five words of the noun laugh (sorted by relevance).
    Sort by: Relevance. Group by: Collocates. Examples: hearty, scornful

[1] look into   [2] [nn*]   [3]/[4] 0/6
    Nouns after a form of look + into.
    Sort by: Frequency. Group by: Collocates. Examples: eyes, future

[1] eyes   [2] clos*   [3]/[4] 5/5
    Words starting with clos within five words of eyes.
    Sort by: Frequency. Group by: Collocates. Examples: closed, close

[1] work/job   [2] hard/tough/difficult   [3]/[4] 4/0
    Work or job preceded by hard or tough or difficult.
    Sort by: Frequency. Group by: Both words. Examples: hard//work, tough//job

[1] [feel] like   [2] [*vvg*]   [3]/[4] 0/4
    A form of feel like followed by a gerund.
    Sort by: Frequency. Group by: Collocates. Examples: crying, taking

[1] find   [2] time   [3]/[4] 0/4
    Find followed by time.
    Sort by: Frequency. Group by: Collocates. Examples: time

[1] [=gorgeous]   [2] [n*]   [3]/[4] 0/4
    Nouns after a synonym of gorgeous.
    Sort by: Frequency. Group by: Collocates. Examples: woman, face

[1] [=gorgeous]   [2] [n*]   [3]/[4] 0/4
    Nouns after a synonym of gorgeous.
    Sort by: Frequency. Group by: Both words. Examples: attractive woman, beautiful day

[1] [=expensive]   [2] [[mark_davies@byu.edu:clothes]]   [3]/[4] 0/5
    Synonym of expensive followed by a form of a word in the clothes list created by davies.
    Sort by: Frequency. Group by: Collocates. Examples: shoes, short

[1] [=expensive]   [2] [[mark_davies@byu.edu:clothes]]   [3]/[4] 0/5
    Synonym of expensive followed by a form of a word in the clothes list created by davies.
    Sort by: Frequency. Group by: Both words. Examples: expensive//shoes, pricey shirt

[1] [=beautiful]   [2] [=face].[n*]   [3]/[4] 5/5
    Synonym of beautiful before synonym of the noun face.
    Sort by: Frequency. Group by: Collocates. Examples: happy, delighted

[1] [=beautiful]   [2] [=face].[n*]   [3]/[4] 5/5
    Synonym of beautiful before synonym of the noun face.
    Sort by: Frequency. Group by: Both words. Examples: happy//child, delighted//boy

NOTES:

1. The [COLLOCATES] line of the search form must be visible in order to do a COLLOCATES search. Otherwise, it will simply look for the string in the WORD(S) field.

2. Nearly any search string that is possible for a simple, non-context search is possible for either the WORD(S) or COLLOCATES fields.

3. For queries in which two or more lemmas are possible for both the WORD(S) and COLLOCATES fields (e.g. those with synonyms, word alternates, or customized lists), you will probably want to set [GROUP BY] to [BOTH WORDS] or [BOTH LEMMAS].
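
The span settings [3]/[4] and the difference between grouping by collocates and by both words can be approximated with a short Python sketch. It is a simplified illustration under stated assumptions: the function name `collocates`, the predicate arguments and the toy sentence are invented for the example, tokens are plain lowercase strings, and only frequency counts are produced (sorting by relevance is not covered).

```python
from collections import Counter

def collocates(tokens, node, collocate, left, right, both_words=False):
    """Count words satisfying `collocate` within `left` words before and
    `right` words after every token satisfying `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if not node(tok):
            continue
        lo = max(0, i - left)
        window = tokens[lo:i] + tokens[i + 1:i + 1 + right]
        for other in window:
            if collocate(other):
                # Group by the collocate alone, or by the (node, collocate) pair.
                key = (tok, other) if both_words else other
                counts[key] += 1
    return counts

text = "hard work pays but a tough job and hard work tire you".split()
is_node = lambda w: w in {"work", "job"}
is_collocate = lambda w: w in {"hard", "tough", "difficult"}

# Span 4/0: up to four words to the left, none to the right.
by_collocate = collocates(text, is_node, is_collocate, left=4, right=0)
by_both = collocates(text, is_node, is_collocate, left=4, right=0, both_words=True)
print(by_collocate.most_common())
print(by_both.most_common())
```

Grouping by collocate merges the counts for work and job, while grouping by both words keeps each pairing separate, which is the reason note 3 suggests it for queries with several possible words in each field.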
