A Reasonably Thorough List of Next-Generation Sequencing Analysis Software (Part 2) (2009-05-26 21:02:50)
de novo Align/Assemble
De novo short-read assembly
* MIRA2 - MIRA (Mimicking Intelligent
Read Assembly) is able to perform true hybrid de-novo assemblies
using reads gathered through 454 sequencing technology (GS20 or GS
FLX). Compatible with 454, Solexa and Sanger data. Linux OS
required.
* SHARCGS - De novo assembly of short
reads. Authors are Dohm JC, Lottaz C, Borodina T and Himmelbauer H.
from the Max-Planck-Institute for Molecular Genetics.
* SSAKE - Version 2.0 of SSAKE (23 Oct
2007) can now handle error-rich sequences. Authors are René Warren,
Granger Sutton, Steven Jones and Robert Holt from Canada's
Michael Smith Genome Sciences Centre. Perl/Linux.
* VCAKE - De novo assembly of short
reads with robust error correction. An improvement on early
versions of SSAKE.
* Velvet - Velvet is a de novo genomic
assembler specially designed for short read sequencing
technologies, such as Solexa or 454. It needs about 20-25X coverage and
paired reads. Developed by Daniel Zerbino and Ewan Birney at the
European Bioinformatics Institute (EMBL-EBI).
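
The 20-25X coverage figure quoted for Velvet can be related to read counts with the usual estimate: average coverage = (number of reads × read length) / genome size. Below is a minimal Python sketch of that arithmetic. The function names, the 5 Mb genome size and the 36 bp read length are illustrative assumptions, not values taken from the Velvet documentation.

```python
import math

def coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Expected average depth: total sequenced bases divided by genome size."""
    return num_reads * read_length / genome_size

def reads_needed(target_coverage: float, read_length: int, genome_size: int) -> int:
    """Smallest number of reads that reaches the target average depth."""
    return math.ceil(target_coverage * genome_size / read_length)

# Hypothetical example: a 5 Mb genome sequenced with 36 bp reads at 25X.
print(reads_needed(25, 36, 5_000_000))
```

This is only an average; real depth varies along the genome, so the estimate is a lower bound for planning rather than a guarantee.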
SNP/Indel Discovery
SNP / indel polymorphism discovery software
* ssahaSNP - ssahaSNP is a polymorphism
detection tool. It detects homozygous SNPs and indels by aligning
shotgun reads to the finished genome sequence. Highly repetitive
elements are filtered out by ignoring those kmer words with high
occurrence numbers. More tuned for ABI Sanger reads. Developers are
Adam Spargo and Zemin Ning from the Sanger Centre. Runs on Compaq Alpha,
Linux-64, Linux-32, Solaris and Mac.
* PolyBayesShort - A re-incarnation of the
PolyBayes SNP discovery tool developed by Gabor Marth at Washington
University. This version is specifically optimized for the analysis
of large numbers (millions) of high-throughput next-generation
sequencer reads, aligned to whole chromosomes of model organism or
mammalian genomes. Developers at Boston College. Linux-64 and
Linux-32.
* PyroBayes - PyroBayes is a novel base
caller for pyrosequences from the 454 Life Sciences sequencing
machines. It was designed to assign more accurate base quality
estimates to the 454 pyrosequences. Developers at Boston
College.
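
The repeat filtering mentioned in the ssahaSNP entry above (ignoring k-mer words with high occurrence numbers) can be illustrated with a toy sketch. This is not ssahaSNP's code. The function names, the word length `k` and the `max_occurrences` cutoff are assumptions chosen for illustration only.

```python
from collections import Counter

def kmer_counts(sequence: str, k: int) -> Counter:
    """Count every overlapping k-mer word in the sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))

def usable_kmers(sequence: str, k: int, max_occurrences: int) -> set:
    """Keep words seen at most max_occurrences times.

    Words above the cutoff are treated as repetitive and ignored.
    """
    counts = kmer_counts(sequence, k)
    return {word for word, n in counts.items() if n <= max_occurrences}

# Hypothetical reference fragment containing a short tandem repeat (ACGT).
reference = "ACGTACGTACGTTTGACCA"
print(sorted(usable_kmers(reference, k=4, max_occurrences=1)))
```

Words from the repeated ACGT stretch exceed the cutoff and drop out, so only words from the unique tail would be used to seed alignments in this simplified picture.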
Genome Annotation/Genome Browser/Alignment
Viewer/Assembly Database
Genome annotation / genome browsing / alignment viewing / assembly database
* STADEN - Includes GAP4. GAP5, once
completed, will handle next-gen sequencing data. A partially
implemented test version is available here.
* EagleView - An information-rich genome
assembler viewer. EagleView can display a dozen different types of
information including base quality and flowgram signal. Developers
at Boston College.
* XMatchView - A visual tool for analyzing
cross_match alignments. Developed by Rene Warren and Steven Jones
at Canada's Michael Smith Genome Sciences Centre. Python/Win or
Linux.
* SAM - Sequence Assembly Manager.
Whole Genome Assembly (WGA) Management and Visualization Tool. It
provides a generic platform for manipulating, analyzing and viewing
WGA data, regardless of input type. Developers are Rene Warren,
Yaron Butterfield, Asim Siddiqui and Steven Jones at Canada's
Michael Smith Genome Sciences Centre. MySQL backend and Perl-CGI
web-based frontend/Linux.
ChIP-Seq/BS-Seq
* FindPeaks - Performs analysis of ChIP-Seq
experiments. It uses a naive algorithm for identifying regions of
high coverage, which represent Chromatin Immunoprecipitation
enrichment of sequence fragments, indicating the location of a
bound protein of interest. Original algorithm by Matthew
Bainbridge, in collaboration with Gordon Robertson. Current code
and implementation by Anthony Fejes. Authors are from Canada's
Michael Smith Genome Sciences Centre. Java/OS independent. Latest
versions available as part of the Vancouver
Short Read Analysis Package
* CHiPSeq - Program used by Johnson et
al. (2007) in their Science publication
* BS-Seq - The source code and data for
the "Shotgun Bisulphite Sequencing of the Arabidopsis Genome
Reveals DNA Methylation Patterning" Nature paper
by Cokus et al. (Steve Jacobsen's lab at UCLA).
POSIX.
* SISSRs - Site Identification from
Short Sequence Reads. BED file input. Raja Jothi @ NIH. Perl.
* QuEST - Quantitative Enrichment of
Sequence Tags. Sidow and Myers Labs at Stanford. From the 2008
publication Genome-wide analysis of transcription factor binding sites
based on ChIP-Seq data. (C++)
**See also this thread for ChIP-Seq, until I get time
to update this list.
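
The FindPeaks entry above describes a naive search for regions of high coverage. A minimal Python sketch of that general idea follows. It is not the FindPeaks implementation. The function names, the half-open `(start, end)` fragment representation and the `min_depth` threshold are assumptions made for illustration.

```python
def coverage_profile(fragments, length):
    """Per-base depth from (start, end) fragments; 0-based, end exclusive."""
    depth = [0] * length
    for start, end in fragments:
        for pos in range(max(start, 0), min(end, length)):
            depth[pos] += 1
    return depth

def high_coverage_regions(depth, min_depth):
    """Return (start, end) runs, end exclusive, where depth >= min_depth."""
    regions = []
    run_start = None
    for pos, d in enumerate(depth):
        if d >= min_depth and run_start is None:
            run_start = pos                      # a run begins here
        elif d < min_depth and run_start is not None:
            regions.append((run_start, pos))     # the run ended just before pos
            run_start = None
    if run_start is not None:                    # run reaches the end of the sequence
        regions.append((run_start, len(depth)))
    return regions

# Hypothetical fragments on a 30 bp toy chromosome.
fragments = [(2, 8), (4, 10), (5, 12), (20, 26)]
depth = coverage_profile(fragments, 30)
print(high_coverage_regions(depth, min_depth=2))
```

The three overlapping fragments pile up into one enriched region, while the isolated fragment at 20-26 never reaches the threshold. A real peak caller would also need to handle strand, fragment-length estimation and background correction, none of which is modelled here.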
Alternate Base Calling
Alternative base calling
* Rolexa - R-based framework for base
calling of Solexa data. Project publication
* Alta-cyclic - "a novel Illumina
Genome-Analyzer (Solexa) base caller"
Expert comments (lh3 at Sanger):
1. In a commercial package, NCGR uses GMAP (http://www.gene.com/share/gmap/) to align
Solexa reads. GMAP is free, though.
2. Synamatix has SXOligoSearch (http://synasite.mgrc.com.my:8080/sxo...ligoSearch.php).
It is commercial and from the online description it looks very
promising.
3. SOAP (http://soap.genomics.org.cn) by Ruiqiang Li, as has
been pointed out by ECO.
4. Maq is also able to find SNPs with its own alignment. It has a
graphical viewer, but again for its own alignment format.
5. Illumina has a software list: http://www.illumina.com/pagesnrn.ilmn?ID=245. But most
of the listed software has been quoted here. :-)
6. Anthony Fejes discussed some software in his blog (http://www.fejes.ca/labels/DNA.html). It may be
helpful to someone, too.
7. SSAHA has been optimized for short-reads, too. But yes, SSAHASNP
appears in your "SNP/INDEL discovery" category.
8. LaDeana from Gabor's group has recently published a paper in
Nature Methods, using their MOSAIK and PolyBayesShort.