OTU clustering with mothur
(2013-03-30 21:00:44)
Tags: mothur, OTU, sequence, DNA, classification
Category: Resources
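mothur is a tool for processing sequences from high-throughput sequencing runs or from clone libraries.
The OTU clustering done here collapses the duplicate sequences in the clone sequencing results and groups the sequences into OTUs at 97% similarity,
i.e. cutoff=0.03.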
mothur v.1.25.1
Last updated: 5/14/2012
by Patrick D. Schloss
Department of Microbiology & Immunology
University of Michigan
pschloss@umich.edu
http://www.mothur.org
When using, please cite:
Schloss, P.D., et al., Introducing mothur: Open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl Environ Microbiol, 2009. 75(23):7537-41.
Distributed under the GNU General Public License
Type 'help()' for information on the commands that are available
Type 'quit()' to exit program
mothur > unique.seqs(fasta=*.fasta)
5**
Output File Names:
It took 0 to read
Aligning sequences from *.unique.fasta ...
Reading in the tmp.fasta template sequences...
It took 0 to read
***
***
Some of your sequences generated alignments that eliminated too many bases, a list is provided in *.unique.flip.accnos. If the reverse compliment proved to be better it was reported.
It took 1 secs to align
Output File Names:
*.unique.align
*.unique.align.report
*.unique.flip.accnos
mothur > filter.seqs(fasta=*.unique.align)
Length of filtered alignment: ***
Number of columns removed: 0
Length of the original alignment: ***
Number of sequences used to construct filter: ***
Output File Names:
mothur > dist.seqs(fasta=*.unique.filter.fasta,calc=onegap,countends=F,cutoff=0.03,output=lt)
Output File Name:
g5gta.unique.filter.phylip.dist
It took 0 to calculate the distances
for
mothur > cluster(phylip=*.unique.filter.phylip.dist,method=furthest,cutoff=0.03)
********************#****#****#****#****#****#****#****#****#****#****#
unique
0.01
0.02
0.03
Output File Names:
*.unique.filter.phylip.fn.sabund
*.unique.filter.phylip.fn.rabund
*.unique.filter.phylip.fn.list
It took 0 seconds to cluster
mothur > bin.seqs(fasta=*.fasta,name=*.names)
Using *.unique.filter.phylip.fn.list as
input file for the list parameter.
unique
0.01
0.02
0.03
Output File Names:
*.unique.filter.phylip.fn.unique.fasta
*.unique.filter.phylip.fn.0.01.fasta
*.unique.filter.phylip.fn.0.02.fasta
*.unique.filter.phylip.fn.0.03.fasta
mothur > get.oturep(phylip=*.unique.filter.phylip.dist,fasta=*.fasta,list=*.unique.filter.phylip.fn.list,label=0.03)
********************#****#****#****#****#****#****#****#****#****#****#
0.03
Output File Names:
*.unique.filter.phylip.fn.0.03.rep.names
*.unique.filter.phylip.fn.0.03.rep.fasta
mothur >
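mothur download page:
http://www.mothur.org/wiki/Download_mothur
Mac, Windows, and Linux are supported.
First start mothur: open cmd and, at the command prompt, change to the working directory with cd xxx
Suppose the FASTA file in that directory is named Great.fasta.
Process it with the following commands: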
1. mothur > unique.seqs(fasta=Great.fasta)
2. mothur > dist.seqs(fasta=Great.unique.fasta,calc=onegap,countends=F,cutoff=0.03,output=lt)
3. mothur > cluster(phylip=Great.unique.phylip.dist,method=furthest,cutoff=0.03)
4. mothur > bin.seqs(fasta=Great.fasta,name=Great.names)
5. mothur > get.oturep(phylip=Great.unique.phylip.dist,fasta=Great.unique.fasta,list=Great.unique.phylip.fn.list,label=0.03)
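The five steps above can also be chained into a single call instead of being typed one by one. The following is a minimal sketch in Python, assuming that mothur is on the PATH and that it accepts a "#"-prefixed, semicolon-separated command string as its only argument (its command-line mode). The helper names build_batch and run_pipeline are made up for this illustration and are not part of mothur.

```python
import shutil
import subprocess


def build_batch(prefix: str, cutoff: float = 0.03) -> str:
    """Return the five commands from the post as one command-line-mode string.

    `prefix` is the FASTA file name without ".fasta" (e.g. "Great").
    """
    c = f"{cutoff:g}"  # 0.03 -> "0.03"
    commands = [
        f"unique.seqs(fasta={prefix}.fasta)",
        f"dist.seqs(fasta={prefix}.unique.fasta,calc=onegap,countends=F,"
        f"cutoff={c},output=lt)",
        f"cluster(phylip={prefix}.unique.phylip.dist,method=furthest,cutoff={c})",
        f"bin.seqs(fasta={prefix}.fasta,name={prefix}.names)",
        f"get.oturep(phylip={prefix}.unique.phylip.dist,fasta={prefix}.unique.fasta,"
        f"list={prefix}.unique.phylip.fn.list,label={c})",
    ]
    return "#" + "; ".join(commands)


def run_pipeline(prefix: str) -> None:
    """Run the batch with mothur (assumption: the executable is named 'mothur')."""
    exe = shutil.which("mothur")
    if exe is None:
        raise FileNotFoundError("mothur executable not found on PATH")
    subprocess.run([exe, build_batch(prefix)], check=True)


# Usage (run from the directory that contains Great.fasta):
# run_pipeline("Great")
```

Building the string separately from running it makes it easy to print and check the commands before mothur is started.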
You can also see how many OTUs there are:
Great.unique.phylip.fn.0.03.rep.names lists the sequences that belong to each OTU; it can be opened in Notepad.
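Counting the OTUs does not have to be done by eye. Here is a minimal sketch in Python, assuming the .rep.names file has two whitespace-separated columns per line: the name of the representative sequence, then a comma-separated list of the sequences in that OTU. The function name parse_rep_names is hypothetical, and the layout should be checked against your own file before relying on it.

```python
def parse_rep_names(lines):
    """Map each representative sequence name to the list of sequences in its OTU.

    Assumed line layout: "<representative>\t<name1>,<name2>,...".
    """
    otus = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        representative, members = fields[0], fields[1].split(",")
        otus[representative] = members
    return otus


# Usage:
# with open("Great.unique.phylip.fn.0.03.rep.names") as handle:
#     otus = parse_rep_names(handle)
# print(len(otus), "OTUs")
# for rep, members in otus.items():
#     print(rep, len(members))
```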
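Note that cluster can read either a phylip-format matrix or a column-format matrix; if it reads a column-format matrix, a names file must be supplied as well.
Also, unique.seqs is run in order to generate the names file.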
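To make the role of the names file concrete, here is a rough Python sketch of the bookkeeping it holds. It assumes that duplicates are sequences with exactly the same string, and that each names line holds the representative name, a tab, and a comma-separated list of all names sharing that sequence. This only illustrates the idea and does not reproduce mothur's own code; names_lines is a hypothetical helper.

```python
def names_lines(records):
    """Group (name, sequence) pairs by identical sequence.

    Returns one "<representative>\t<name1>,<name2>,..." line per unique
    sequence, where the first name seen acts as the representative.
    """
    groups = {}  # sequence -> names, in order of first appearance
    for name, seq in records:
        groups.setdefault(seq, []).append(name)
    return [f"{names[0]}\t{','.join(names)}" for names in groups.values()]


# Usage with made-up records:
# for line in names_lines([("a", "ACGT"), ("b", "ACGT"), ("c", "TTTT")]):
#     print(line)
```

Because only the unique sequences go on to dist.seqs and cluster, the names file is what lets later commands such as bin.seqs map the results back to every original sequence.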
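If running step 1 produces the message [ERROR]: your sequences are not the same length, aborting.
then the sequences have to be aligned to a common length first, using the commands below.
First prepare a template FASTA file: pick one sequence of the most common length among those in the FASTA file above and save it in a new FASTA file named temp.fasta.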
a. mothur > unique.seqs(fasta=Great.fasta)
b. mothur > align.seqs(candidate=Great.unique.fasta,template=temp.fasta,flip=T,processors=2)
c. mothur > filter.seqs(fasta=Great.unique.align)
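Then rename the resulting Great.unique.filter.fasta to Great.fasta and run the five steps above on it. (Preferably do this in a new mothur directory and a new mothur window, to avoid file name clashes.)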
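Picking the template by hand gets tedious with many sequences. The sketch below shows one way to do it in Python, on the assumption that "most common length" means the length shared by the largest number of sequences and that the first sequence of that length is an acceptable template. parse_fasta and pick_template are hypothetical helpers, not mothur commands.

```python
from collections import Counter


def parse_fasta(lines):
    """Return a list of (header, sequence) pairs from FASTA-formatted lines."""
    records, name, chunks = [], None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif line and name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def pick_template(records):
    """Return the first record whose length is the most frequent one."""
    lengths = Counter(len(seq) for _, seq in records)
    target = lengths.most_common(1)[0][0]
    return next(rec for rec in records if len(rec[1]) == target)


# Usage:
# with open("Great.fasta") as handle:
#     name, seq = pick_template(parse_fasta(handle))
# with open("temp.fasta", "w") as out:
#     out.write(f">{name}\n{seq}\n")
```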