
ARC: assembly software for mitochondrial sequencing data

(2015-09-01 19:13:20)
Category: Software installation
1: Software name:

Assembly by Reduced Complexity (ARC). Software website: http://ibest.github.io/ARC/#Contact

2: Reference article:

Assembly by Reduced Complexity (ARC): a hybrid approach for targeted assembly of homologous sequences. (http://biorxiv.org/content/early/2015/02/07/014662)

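3: Principle: the sequencing reads are aligned to a reference mitochondrial genome, the reference is divided into bins, assembly is carried out bin by bin, and the per-bin assemblies are merged at the end. A similar tool, MITObim, was published in Nucleic Acids Research and is worth consulting: https://github.com/chrishah/MITObim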
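
The binning step described above can be illustrated with a small Python sketch. This is not ARC's own code: the function name `bin_reads_by_target` and the read and target names are hypothetical, and the input is assumed to be a list of (read, target) pairs already extracted from a mapper's output. It only shows how mapped reads can be grouped per bin before each bin is assembled separately.

```python
from collections import defaultdict


def bin_reads_by_target(alignments):
    """Group read IDs by the reference target (bin) they mapped to.

    `alignments` is an iterable of (read_id, target_id) pairs. A read that
    hits several targets is placed in each of those bins.
    """
    bins = defaultdict(set)
    for read_id, target_id in alignments:
        bins[target_id].add(read_id)
    # Sort the reads so the result is deterministic.
    return {target: sorted(reads) for target, reads in bins.items()}


# Toy data: hypothetical read and target names, for illustration only.
toy_alignments = [
    ("read1", "COX1"),
    ("read2", "COX1"),
    ("read3", "ND2"),
    ("read2", "ND2"),
]

bins = bin_reads_by_target(toy_alignments)
for target, reads in sorted(bins.items()):
    print(target, reads)
```

Each bin would then be handed to the assembler on its own, which is what keeps the individual assemblies small.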


4: Running the software:
/share/nas2/genome/biosoft/Python/2.7.8/bin/ARC -c ARC_config.txt
The configuration file is as follows:
## Name=value pairs:
## reference: contains reference sequences in fasta format
## numcycles: maximum number of times to try remapping
## mapper: the mapper to use (blat/bowtie2)
## assembler: the assembler to use (newbler/spades)
## nprocs: number of cores to use
## format: fasta or fastq, all must be the same
## verbose: control mapping/assembly log generation (True/False)
## urt: For Newbler, enable use read tips mode (True/False)
## map_against_reads: On iteration 1, skip assembly, map against mapped reads (True/False)
## assemblytimeout: kill assemblies and discard targets if they take longer than N minutes
##
## Columns:
## Sample_ID:Sample_ID
## FileName: path for fasta/fastq file
## FileType: PE1, PE2, or SE
## FileFormat: fasta or fastq
# reference=/share/nas29/shim/testing/ARC/data/targets.fa
# numcycles=10
# mapper=bowtie2
# assembler=spades
# nprocs=7
# format=fastq
# verbose=True
# urt=True
# map_against_reads=False
# assemblytimeout=300
# bowtie2_k=3
# rip=True
# cdna=False
# subsample=1
# maskrepeats=True
# sloppymapping=True
Sample_ID FileName FileType
Sample1 /share/nas29/shim/testing/ARC/data/reads/Lampyridae_S99-D01-I_good_1.fq PE1
Sample1 /share/nas29/shim/testing/ARC/data/reads/Lampyridae_S99-D01-I_good_2.fq PE2
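
To double-check a configuration file before launching a long run, it can help to read it back programmatically. The following is a minimal sketch, not part of ARC: `parse_arc_config` is a hypothetical helper, and it assumes the layout shown above, where `##` lines are comments, `# name=value` lines are settings, and the remaining lines form a whitespace-separated sample table. The file paths in the example are placeholders.

```python
def parse_arc_config(text):
    """Split ARC-style config text into (settings, samples).

    Assumed layout: '##' lines are comments, '# name=value' lines are
    settings, and all other non-empty lines form a whitespace-separated
    table whose first row is the header.
    """
    settings = {}
    header = None
    samples = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("##"):
            continue
        if line.startswith("#"):
            name, sep, value = line[1:].strip().partition("=")
            if sep:
                settings[name.strip()] = value.strip()
            continue
        fields = line.split()
        if header is None:
            header = fields
        else:
            samples.append(dict(zip(header, fields)))
    return settings, samples


# Placeholder paths; substitute the real reference and read files.
example = """\
## Name=value pairs:
# reference=targets.fa
# assembler=spades
# subsample=1
Sample_ID FileName FileType
Sample1 reads_1.fq PE1
Sample1 reads_2.fq PE2
"""

settings, samples = parse_arc_config(example)
print(settings)
print(samples)
```

Because the table is split on whitespace, this sketch would not handle file paths that contain spaces.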

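5: The settings highlighted in red in the configuration file are the ones to check. The assembler chosen here is spades, which must of course be available on your PATH. assemblytimeout=300 is the number of minutes after which an assembly is killed, so the value used here is deliberately long; the configuration file supplied by the original authors uses 1 minute, and the website gives a default of 10. The number of cycles generally does not need to be 10 either; in my experience 3-4 is enough (# numcycles=10).
If your reference genome is distant from the genome being assembled, set map_against_reads to True. If the sequencing depth is high, you can assemble from a subsample of the reads, for example subsample=0.4 (the default is 1).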
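
Since the chosen mapper and assembler have to be on PATH, a quick pre-flight check can save a failed run. The sketch below is only an illustration and is not part of ARC: `preflight` and the `EXECUTABLES` table are hypothetical, and the executable names (`bowtie2`, `blat`, `spades.py`, `runAssembly`) are assumptions that may need adjusting for the local installation. It also checks that subsample is a fraction in (0, 1].

```python
import shutil

# Assumed executable names; adjust to match the local installation.
EXECUTABLES = {
    "bowtie2": "bowtie2",
    "blat": "blat",
    "spades": "spades.py",
    "newbler": "runAssembly",
}


def preflight(settings, which=shutil.which):
    """Return a list of human-readable problems found in `settings`."""
    problems = []
    for key in ("mapper", "assembler"):
        tool = settings.get(key)
        exe = EXECUTABLES.get(tool)
        if exe is None:
            problems.append(f"{key}={tool!r} is not a recognised choice")
        elif which(exe) is None:
            problems.append(f"{exe} (for {key}={tool}) is not on PATH")
    try:
        fraction = float(settings.get("subsample", 1))
    except ValueError:
        fraction = None
    if fraction is None or not 0 < fraction <= 1:
        problems.append("subsample should be a number in (0, 1]")
    return problems


# The result depends on what is installed on the machine running the check.
for problem in preflight({"mapper": "bowtie2", "assembler": "spades", "subsample": "0.4"}):
    print(problem)
```

The `which` argument is there only so the lookup can be replaced in tests; in normal use the default `shutil.which` searches the current PATH.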
