[Repost] A list of commonly used genome analysis software

(2015-03-20 22:32:43)
Tags:

Bioinformatics

Category: 2015

DNA sequencing:

DNA sequence quality - Phred provides base calling, chromatogram display, and evaluation and presentation of high-quality sequence regions for up to five sequences simultaneously.

Sequence assembly - you don't need your own contig assembly program when you can use:

  • EGassembler - aligns and merges sequence fragments resulting from shotgun sequencing or gene transcript (EST) fragments in order to reconstruct the original segment or gene (Reference: A. Masoudi-Nejad et al. 2006. Nucl. Acids Res. 34: W459-462).
  • CAP3 (PBIL, France)
  • CAP EST Assembler (Istituto FIRC di Oncologia Molecolare, Italy) - Maximum sequence length for each sequence is 30 kb - Maximum number of sequences 10 kb
  • Divide-and-Conquer Multiple Sequence Alignment (Universität Bielefeld, Germany)

Sequencing errors - if your DNA sequence doesn't match the expected protein sequence, you can check for errors at ERR_WISE or Wise2: Intelligent algorithms for DNA searches (EBI, United Kingdom) or SEQERR: Detection of Frameshift Errors in Coding Regions.

In-silico.com (Dr. Joseba Bikandi & co-workers, Faculty of Pharmacy, University of the Basque Country) allows in silico experiments, including theoretical PCR amplification, AFLP-PCR, restriction analysis and pulsed-field gel electrophoresis (PFGE), with bacterial and archaeal genomes found in the public database.
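
To make the idea of a theoretical restriction analysis concrete, here is a minimal Python sketch of digesting a linear sequence with one enzyme. It is only an illustration and not the code behind In-silico.com; the function name `digest`, the example sequence, and the choice of EcoRI (recognition site GAATTC, cut after the first G) are assumptions made for the example.

```python
def digest(seq, site, cut_offset):
    """Return fragment lengths after cutting a linear sequence at every site.

    cut_offset is the number of bases from the start of the recognition
    site to the cut position on the top strand (1 for EcoRI, G^AATTC).
    Only the given strand is searched, which is sufficient for a
    palindromic site such as GAATTC.
    """
    seq = seq.upper()
    site = site.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [bounds[i + 1] - bounds[i] for i in range(len(bounds) - 1)]


if __name__ == "__main__":
    # Hypothetical 21-base test sequence containing two EcoRI sites.
    example = "AAAGAATTCCCCCGAATTCTT"
    print(digest(example, "GAATTC", 1))
```

The fragment lengths are what one would compare against band sizes on a gel; a circular genome would need the first and last fragments joined.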

Genome comparisons:

GeneOrder 2.0 (D. Seto, Bioinformatics & Computational Biology, George Mason Univ., U.S.A.) is ideal for comparing small GenBank genomes (up to 0.25 Mb), while GeneOrder 3.0 extends the limit to approx. 2.0 Mb. Each gene from the query sequence is compared to all of the genes from the reference database using BLASTP. There are two display formats: graphical and tabular. Currently the graph is an applet and must be saved as a screenshot.

CoreGenes (D. Seto, Bioinformatics & Computational Biology, George Mason Univ., U.S.A.) is designed to analyze two to five genomes simultaneously, generating a table of related genes (orthologs and putative orthologs). These entries are linked to their GenBank data. It has a limit of 0.35 Mb, while the newer version, CoreGenes 2.0, extends the limit to approx. 2.0 Mb. If your data is not present in GenBank, use this site.

CoreGenes 3 (D. Seto & P. Mahadevan, Bioinformatics & Computational Biology, George Mason Univ., U.S.A.) - tallies the total number of genes in common between the two genomes being compared; displays the percent value of genes in common with a specific genome; determines the unique genes contained in a pair of proteomes.

WebACT is the web version of ACT (Artemis Comparison Tool), a DNA sequence comparison viewer based on Artemis (Reference: T.J. Carver et al. Bioinformatics 21: 3422-3423). Visit the database page of EMBL-EBI and select EMBL and "Standard Query Form" to determine the EMBL accession number for the sequence you are interested in.

http://molbiol-tools.ca/WebACT.png

WebGMAP is a public web service for annotating and mapping individual cDNA sequences to the genomes of many eukaryote species, currently including Arabidopsis thaliana, Chlamydomonas reinhardtii, Glycine max, Oryza sativa, Physcomitrella patens and Populus trichocarpa. (Reference: C. Liang et al. 2009. Nucl. Acids Res. 37(Web Server issue): W77-W83).

Panseq (Chad Laing, Public Health Agency of Canada) - a group of tools for the analysis of the 'pan-genome' of a group of genomic sequences. The pan-genome of a bacterial species consists of a core genome and an accessory gene pool, the latter of which allows subpopulations of the organism to adapt to specific environments. The tools include Novel Region Finder, which finds sequences that are unique to a strain or group of strains with respect to another strain or group of strains. Core/Accessory Genome Analysis defines the core and accessory genome of a group of strains, based on the user-configurable parameters of region size and percent sequence identity. Outputs of this feature include concatenated core genomes for each sequence examined, a tab-delimited table depicting the presence/absence of each accessory region among all the sequences examined, a tab-delimited table depicting each SNP among all the sequences examined, and NEXUS format files of both the core and accessory genome. Loci Selector determines the most variable and discriminatory loci from a group.
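
The core/accessory distinction can be illustrated with a small Python sketch that works on a presence/absence table of the kind described above. This is not Panseq's implementation: it assumes the table has already been computed (Panseq derives it from region size and percent identity thresholds), and the function name and data layout are hypothetical.

```python
def split_core_accessory(presence):
    """Split regions into core and accessory sets.

    presence maps a region name to a dict of {strain name: bool}.
    A region present in every strain is core; a region present in at
    least one strain but not all of them is accessory.
    """
    core, accessory = [], []
    for region, by_strain in presence.items():
        flags = list(by_strain.values())
        if flags and all(flags):
            core.append(region)
        elif any(flags):
            accessory.append(region)
    return core, accessory


if __name__ == "__main__":
    # Hypothetical table for three strains.
    table = {
        "region_1": {"strainA": True, "strainB": True, "strainC": True},
        "region_2": {"strainA": True, "strainB": False, "strainC": True},
        "region_3": {"strainA": False, "strainB": False, "strainC": True},
    }
    core, accessory = split_core_accessory(table)
    print("core:", core)
    print("accessory:", accessory)
```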

Genome annotation:

RAST (Rapid Annotation using Subsystem Technology) is a fully automated service for annotating bacterial and archaeal genomes. It provides high-quality genome annotations for these genomes across the whole phylogenetic tree. Requires registration. (Reference: Aziz, R.K. et al. 2008. BMC Genomics 9:75). See also MyRAST under Molecular Biology Freeware for Windows.

BASys Bacterial Annotation Tool - this incredible tool supports automated, in-depth annotation of bacterial genomic sequences. It accepts raw DNA sequence data and an optional list of gene identification information (Glimmer) and provides extensive textual annotation and hyperlinked image output. BASys uses >30 programs to determine 60 annotation subfields for each gene, including gene/protein name, GO function, COG function, possible paralogues and orthologues, molecular weight, isoelectric point, operon structure, subcellular localization, signal peptides, transmembrane regions, secondary structure, 3D structure, reactions and pathways. (Reference: G.H. Van Domselaar et al. 2005. Nucl. Acids Res. 33(Web Server issue): W455-W459).

ORF (Groningen Biomolecular Sciences and Biotechnology Institute, Haren, the Netherlands) - offers a choice of Glimmer, ZCurve or GeneMark predictions, coupled with GenBank or FASTA-formatted output. Works very well and quickly with phage-sized genomes.

BAGEL (Groningen Biomolecular Sciences and Biotechnology Institute, Haren, the Netherlands) - determines, from an existing or unsubmitted GenBank file, the presence of bacteriocins, based on a database of known bacteriocins and adjacent genes involved in bacteriocin activity.

MICheck (MIcrobial genome Checker) - enables rapid verification of sets of annotated genes and frameshifts in previously published bacterial genomes, or genomes for which the user has a *.gbk file. This tool can be seen as a preliminary step before functional re-annotation, to check quickly for missing or wrongly annotated genes. It worked nicely with phage genomes from 43 to 135 kb. (Reference: S. Cruveiller et al. 2005. Nucl. Acids Res. 33: W471-W479).

WebGeSTer (Genome Scanner for Terminators) - my favourite terminator search program is finally web enabled. Please note that if you want to analyze data from a *.gbk file you need to use their conversion program "GenBank2GeSTer" first. The program produces a complete description of each terminator, including a diagram. This site is linked to an extensive database of transcriptional terminators in bacterial genomes, WebGeSTer DB (Reference: Mitra A. et al. 2011. Nucl. Acids Res. 39(Database issue): D129-35).

RibEx: Riboswitch Explorer - scans <40 kb of DNA for potential genes (which are linked to BLASTP) and several hundred regulatory elements, including riboswitches. If you click on "search for attenuators", it finds terminators and antiterminators. It presents the calculated genes and permits BLAST analysis at NCBI (Reference: C. Abreu-Goodger & E. Merino. 2005. Nucl. Acids Res. 33: W690-W692).

TransTerm (Michael Nuhn, Nano+Bio-Center) - TransTerm searches for rho-independent terminators in the vicinity of annotated genes. This TIGR program can be accessed online in two ways. If you have the genome in GenBank format, use this program, since it will only look for terminators in the vicinity of the annotated genes. If the genome has not been annotated, use this site. The latter site combines Glimmer and RBSfinder with TransTerm.

tRNAs: tRNAscan-SE (University of California at San Diego, U.S.A.) and FAStRNA (N. El-Mabrouck, Pasteur Institute, Paris, France). The former site is incredibly sensitive and also provides secondary structure diagrams of the tRNA molecules. Alternatively, use ARAGORN (Reference: Laslett, D. & Canback, B. 2004. Nucleic Acids Research 32:11-16).
Test sequences.

CRISPRfinder - Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) present a curious repeat structure found in many prokaryotic genomes. They show characteristics of both tandem and interspaced repeats. (Reference: I. Grissa et al. 2007. Nucl. Acids Res. 35(Web Server issue): W52-W57).

LTR_Finder - an efficient program for finding full-length LTR retrotransposons in genome sequences. The size of the input file is now limited to 50 MB (Reference: Z. Xu & H. Wang. 2007. Nucl. Acids Res. 35(Web Server issue): W265-W268).

RTAnalyzer finds retrotransposons and detects L1 retrotransposition signatures (Reference: J-F. Lucier et al. 2007. Nucl. Acids Res. 35(Web Server issue): W269-W274).

FancyGene is a fast and user-friendly web-based tool for producing images of one or more genes directly on the corresponding genomic locus. Starting from a variety of input formats, FancyGene rebuilds the basic components of a gene (UTRs, introns, exons). Once the initial representation is obtained, the user can superimpose additional features, such as protein domains and/or a variety of biological markers, at specific positions. (Reference: D. Rambaldi & F.D. Ciccarelli. 2009. Bioinformatics 25: 2281-2282).

SLEP is a pipeline for predicting the localization of bacterial proteins starting from genome sequences. It combines the results of several tools: Glimmer, TMHMM, PRODIV-TMHMM, LipoP, PSortB.

Ori-Finder finds oriCs in bacterial genomes based on an integrated method comprising the analysis of base composition asymmetry using the Z-curve method, the distribution of DnaA boxes, and the occurrence of genes frequently close to oriCs. (Reference: F. Gao & C.-T. Zhang. 2008. BMC Bioinformatics 9:79).
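
The base composition asymmetry mentioned above can be sketched in a few lines of Python. The code below computes the three standard Z-curve components as cumulative base counts along a sequence and derives the GC disparity (cumulative G minus C) from them. It is a minimal illustration, not Ori-Finder itself: the function names are hypothetical, and the DnaA box and gene-neighbourhood steps are left out.

```python
def z_curve(seq):
    """Return the cumulative Z-curve point (x, y, z) after each base.

    x = (A + G) - (C + T)   purine vs. pyrimidine
    y = (A + C) - (G + T)   amino vs. keto
    z = (A + T) - (G + C)   weak vs. strong hydrogen bonding
    Characters other than A, C, G, T leave the counts unchanged.
    """
    counts = {"A": 0, "C": 0, "G": 0, "T": 0}
    points = []
    for base in seq.upper():
        if base in counts:
            counts[base] += 1
        a, c, g, t = counts["A"], counts["C"], counts["G"], counts["T"]
        points.append(((a + g) - (c + t), (a + c) - (g + t), (a + t) - (g + c)))
    return points


def gc_disparity(points):
    """GC disparity (cumulative G - C), which equals (x - y) / 2."""
    return [(x - y) // 2 for x, y, _ in points]


if __name__ == "__main__":
    pts = z_curve("AGGCTG")  # hypothetical toy sequence
    print(pts)
    print(gc_disparity(pts))
```

On a real chromosome one would plot the disparity curve and inspect its extrema as candidate origin regions before checking for DnaA boxes nearby.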

MG-RAST (Metagenome Rapid Annotation using Subsystem Technology) is a fully automated service for annotating metagenome samples. It provides annotation of sequence fragments, their phylogenetic classification and an initial metabolic reconstruction. The service also provides means for comparing phylogenetic classifications and metabolic reconstructions of metagenomes (Reference: F. Meyer et al. 2008. BMC Bioinformatics 9: 386).

Correcting genome annotations:

gbk2tbl (Andre Villegas, Public Health Agency of Canada) - One of the problems with GenBank is that scientists neither update their submission data nor correct errors. In part this is due to laziness, but it is also due to the fact that GenBank is, in most cases, unwilling to accept a new version of the Sequin file. Tbl2asn is a command-line program that automates the creation of sequence records for submission to GenBank but, from my perspective, it is not easy to use. Gbk2tbl will generate a five-column table of the genome features, which can be easily edited in Notepad.
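
For readers who have not seen the five-column feature table, the following Python sketch writes one from a plain list of features. It is not gbk2tbl's code: the function name, the dictionary layout, and the example locus tag are hypothetical, and a real conversion would first parse the features out of the GenBank file.

```python
def feature_table(seq_id, features):
    """Build a five-column, tab-delimited feature table as a string.

    Each feature line holds start, stop and feature key in columns 1-3;
    each qualifier line leaves columns 1-3 empty and holds the qualifier
    key and value in columns 4-5. Minus-strand features are written with
    the larger coordinate first.
    """
    lines = [f">Feature {seq_id}"]
    for feat in features:
        start, end = feat["start"], feat["end"]
        if feat.get("strand", 1) < 0:
            start, end = end, start
        lines.append(f"{start}\t{end}\t{feat['key']}")
        for qualifier, value in feat.get("qualifiers", {}).items():
            lines.append(f"\t\t\t{qualifier}\t{value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical features on a hypothetical sequence "contig1".
    feats = [
        {"start": 190, "end": 255, "key": "gene",
         "qualifiers": {"locus_tag": "EX_0001"}},
        {"start": 300, "end": 900, "key": "CDS", "strand": -1,
         "qualifiers": {"product": "hypothetical protein"}},
    ]
    print(feature_table("contig1", feats), end="")
```

Because the output is plain tab-delimited text, it can be corrected by hand in any text editor before being passed on to the submission tools.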

Genome visualization:

CGView Server is a comparative genomics tool for circular genomes that allows sequence feature information to be visualized in the context of sequence analysis results. A genome sequence is supplied to the program in FASTA, GenBank, EMBL, or raw format. Up to three comparison sequences (or sequence sets) in FASTA format can also be submitted. The CGView Server uses BLAST to compare the genome sequence to the comparison sequences, and then converts the results and any available feature information (from the GenBank, EMBL, or optional GFF file) or analysis information (from an optional GFF file) into a high-quality graphical map showing the entire genome sequence, or a zoomed view of a region of interest. Several options are available for specifying how the BLAST comparisons are conducted, and for controlling how results are displayed. (Reference: Grant J.R. & Stothard P. 2008. Nucleic Acids Res. 36(Web Server issue): W181-184).

GenomeVx makes editable, publication-quality maps of mitochondrial and chloroplast genomes and of large plasmids. These maps show the location of genes and chromosomal features as well as a position scale. The program takes as input either raw feature positions or GenBank records. In the latter case, features are automatically extracted and colored, an example of which is given. Output is in the Adobe Portable Document Format (PDF) and can be edited by programs such as Adobe Illustrator. (Reference: G. Conant & K. Wolfe. 2008. Bioinformatics 24:861-862).

DNAPlotter - an interactive Java application for generating circular and linear representations of genomes. It uses the Artemis libraries to provide a user-friendly method of loading sequence files (EMBL, GenBank, GFF) as well as data from relational databases, and it filters features of interest to display on separate user-definable tracks. It can be used to produce publication-quality images for papers or web pages. (Reference: Carver, T. et al. 2008. Bioinformatics 25:119-120).

http://molbiol-tools.ca/Styphi.jpg

GeneWiz (Center for Biological Sequence Analysis, Danish Technical University) produces linear or circular genome atlases such as the one below. They have ready-made ones for most bacteria, but by uploading custom data in GenBank format (.gbk) you can make your own diagram showing the genetic and physical properties of your genome.

http://molbiol-tools.ca/GeneWiz_map.jpg

Genomic islands:

MobilomeFINDER: web-based tools for in silico and experimental discovery of bacterial genomic islands (Reference: H-Y. Ou et al. Nucl. Acids Res. 35(Web Server issue): W97-W104).

Prophage Finder - this tool predicts potential prophage loci in prokaryotic genome sequences. However, it does not predict whether an identified prophage is functional, and it is important to note that the identified region will most likely not represent the entire prophage. (Reference: Bose, M. & Barber, R. 2006. In Silico Biol. 6: 0020).

Phage_Finder was created to identify prophage regions in completed bacterial genomes. Using a test dataset of 42 bacterial genomes whose prophages have been manually identified, Phage_Finder found 91% of the regions, resulting in 7% false positive and 9% false negative prophages. A search of 302 complete bacterial genomes predicted 403 putative prophage regions, accounting for 2.7% of the total bacterial DNA. Analysis of the 285 putative attachment sites revealed that tRNAs are targets for integration slightly more frequently (33%) than intergenic (31%) or intragenic (28%) regions, while tmRNAs were targeted in 8% of the regions. (Reference: D.E. Fouts. 2006. Nucleic Acids Res. 34: 5839–5851).

Prophinder - a similar program.

IslandViewer - integrates two sequence composition-based genomic island (GI) prediction methods, SIGI-HMM and IslandPath-DIMOB, and a single comparative GI prediction method, IslandPick (Reference: M.G.I. Langille et al. 2008. BMC Bioinformatics 9: 329).

Synthetic genes:

GeneDesign is an excellent resource for designing synthetic genes. It includes tools for codon optimization and removal of restriction sites (Reference: Richardson, S.M. et al. 2006. Genome Research 16:550-556).
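
As an illustration of what removing a restriction site from a coding sequence involves, here is a minimal Python sketch that swaps in synonymous codons until the site is gone, leaving the encoded protein unchanged. It is not GeneDesign's algorithm and it does no codon optimization; the function names and the example sequence are hypothetical, and the input is assumed to be an in-frame coding sequence containing only A, C, G and T.

```python
from itertools import product

# Standard genetic code in NCBI order (bases T, C, A, G; third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Group codons by the amino acid (or stop) they encode.
SYNONYMS = {}
for _codon, _aa in CODON_TABLE.items():
    SYNONYMS.setdefault(_aa, []).append(_codon)


def translate(cds):
    """Translate an in-frame coding sequence, ignoring a trailing partial codon."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds) - 2, 3))


def _one_fix(cds, site):
    """Try one synonymous codon swap that lowers the number of sites."""
    pos = cds.find(site)
    n_sites = cds.count(site)
    # Visit every codon that overlaps the first occurrence of the site.
    for start in range(pos - pos % 3, pos + len(site), 3):
        codon = cds[start:start + 3]
        if len(codon) < 3:
            continue
        for alt in SYNONYMS[CODON_TABLE[codon]]:
            candidate = cds[:start] + alt + cds[start + 3:]
            if candidate.count(site) < n_sites:
                return candidate
    return None


def remove_site(cds, site):
    """Remove all occurrences of site using synonymous changes only."""
    cds, site = cds.upper(), site.upper()
    while site in cds:
        fixed = _one_fix(cds, site)
        if fixed is None:
            raise ValueError("no synonymous change removes the site")
        cds = fixed
    return cds


if __name__ == "__main__":
    original = "ATGGAATTCTAA"  # hypothetical CDS with an EcoRI site (GAATTC)
    edited = remove_site(original, "GAATTC")
    print(edited, translate(original) == translate(edited))
```

Each accepted swap strictly lowers the site count, so the loop always terminates; a production tool would also weigh codon usage when choosing among the synonymous alternatives.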

Metagenomics:

Orphelia - a metagenomic ORF-finding tool for the prediction of protein-coding genes in short, environmental DNA sequences of unknown phylogenetic origin. Orphelia is based on a two-stage machine learning approach introduced by its authors' group. After the initial extraction of ORFs, linear discriminants are used to extract features from those ORFs. Subsequently, an artificial neural network combines the features and computes a gene probability for each ORF in a fragment. A greedy strategy then computes a likely combination of high-scoring ORFs under an overlap constraint. (Reference: K.J. Hoff et al. 2009. Nucl. Acids Res. 37(Web Server issue): W101-W105).
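
The final greedy step can be sketched as follows in Python: candidates are visited from highest to lowest score, and a candidate is kept only if it does not overlap any already kept ORF by more than an allowed number of bases. This is an illustration under stated assumptions rather than Orphelia's code; the function names, the tuple layout, and the 60-base overlap limit are chosen for the example.

```python
def overlap(a, b):
    """Number of bases shared by two (start, end, score) intervals, end exclusive."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def greedy_select(orfs, max_overlap=60):
    """Pick high-scoring ORFs greedily under a pairwise overlap constraint.

    orfs is a list of (start, end, score) tuples. The 60-base default for
    max_overlap is an assumed value for illustration.
    """
    chosen = []
    for orf in sorted(orfs, key=lambda o: o[2], reverse=True):
        if all(overlap(orf, kept) <= max_overlap for kept in chosen):
            chosen.append(orf)
    return sorted(chosen)


if __name__ == "__main__":
    # Hypothetical candidates with gene probabilities as scores.
    candidates = [(0, 300, 0.9), (250, 600, 0.8), (100, 400, 0.7)]
    print(greedy_select(candidates))
```

A greedy pass like this is fast but not guaranteed to find the combination with the highest total score, which is the usual trade-off of the approach.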

For more information, see:

http://molbiol-tools.ca/
