ZhongjieWang's Sina Blog

(2018-01-25 20:40)

Tags:

linux

Category: Bioinformatics

1. Introduction

SortMeRNA's core algorithm is based on approximate seeds and allows for sensitive analysis of NGS reads. The main application of SortMeRNA is filtering rRNA from metatranscriptomic data. SortMeRNA takes as input a file of reads (FASTA or FASTQ format) and one or more rRNA database files, and sorts aligned and rejected reads into two files specified by the user. Additional applications include clustering and taxonomic assignment, available through QIIME v1.9.1 (http://qiime.org). SortMeRNA works with Illumina, Ion Torrent and PacBio data, and can produce SAM and BLAST-like alignments.

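2. Installation

On Ubuntu: sudo apt install sortmerna
On other systems, consult the manual.

3. Usage

Most sequencing data today is paired-end, so you usually get two files, one for the forward reads and one for the reverse reads.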

Orthologous and paralogous genes

(2017-06-24 21:39)

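Tags:

biology

Category: Biology

Orthologs and paralogs

Orthology refers to homology between different species, for example homology between proteins or between DNA sequences. Orthologs are proteins from different species that arose through vertical descent (speciation) and typically retain the same function as the ancestral protein.

Paralogs are proteins within a given species that arose through gene duplication and may evolve new functions related to the original one.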

Four tree-building methods

(2017-04-24 20:41)

Tags:

bioinfo

Category: Bioinformatics

Method 1: UPGMA (unweighted pair group method with arithmetic mean)

Also called the unweighted pair-group arithmetic average method or the unweighted group average method. NTSYS 3.4
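
To make the idea behind UPGMA concrete, here is a minimal sketch in Python. It is not taken from the original post or from NTSYS: the function name `upgma`, the input layout (a dict keyed by `frozenset` pairs), and the toy distances are all assumptions made for illustration. It returns only the tree topology as nested tuples and does not compute branch lengths.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster taxa with UPGMA and return the topology as nested tuples.

    names: list of taxon labels (hypothetical input format).
    dist:  dict mapping frozenset({a, b}) -> distance between labels a and b.
    """
    # Each active cluster id maps to (subtree, number of leaves in it).
    clusters = {name: (name, 1) for name in names}
    d = dict(dist)  # copy so the caller's matrix is not modified
    while len(clusters) > 1:
        # Step 1: find the closest pair of active clusters.
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        (tree_a, n_a), (tree_b, n_b) = clusters.pop(a), clusters.pop(b)
        merged = f"({a},{b})"
        # Step 2: the distance from the merged cluster to every other cluster
        # is the mean over all leaf pairs, i.e. a size-weighted average of
        # the two old distances.
        for c in clusters:
            d[frozenset((merged, c))] = (
                n_a * d[frozenset((a, c))] + n_b * d[frozenset((b, c))]
            ) / (n_a + n_b)
        clusters[merged] = ((tree_a, tree_b), n_a + n_b)
    tree, _ = next(iter(clusters.values()))
    return tree


# Toy example with made-up distances between four taxa.
pairs = {("A", "B"): 2, ("A", "C"): 6, ("B", "C"): 6,
         ("A", "D"): 10, ("B", "D"): 10, ("C", "D"): 10}
toy_dist = {frozenset(p): v for p, v in pairs.items()}
print(upgma(["A", "B", "C", "D"], toy_dist))
```

The closest pair is merged first, so A and B end up as sister taxa, with C and then D joining at greater distances.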

Polyphosphate-related enzymes: differences between the ppk and ppk2 genes

(2017-04-08 16:31)

Tags:

biology

Category: Biology

3 Enzymes involved in microbial polyphosphate accumulation and their regulation

In living organisms, multiple enzymes jointly regulate the synthesis and degradation of poly-P; see Figure 2 (Brown and Kornberg, 2008;

A detailed guide to blastp parameters in BLAST+

(2016-11-25 14:20)

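Tags:

linux

I searched all over the web and found no detailed explanation of the blastp parameters, although there is plenty of material on blastn, blastall, and so on. So I decided to do it myself and explain every parameter.

Below is the overall usage summary for blastp: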
wang@wang-ThinkStation:~/mydata/Accumulibacter/blastp-pro$ blastp -help
USAGE
blastp [-h] [-help] [-import_search_strategy filename]
    [-export_search_strategy filename] [-task task_name] [-db database_name]
    [-dbsize num_letters] [-gilist filename] [-seqidlist filename]
    [-negative_gilist filename] [-entrez_query entrez_query]
    [-db_soft_mask filtering_algorithm] [-db_hard_mask filtering_algorithm]
    [-subject subject_input_file] [-subject_loc range] [-query input_file]
    [-out output_file] [-evalue evalue] [-word_size int_value]
    [-gapopen open_penalty] [-gapextend extend_penalty]
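
As a small illustration of how a few of the options listed above fit together, the following Python sketch assembles a basic blastp command line. It uses only flags that appear in the usage text (-query, -db, -out, -evalue). The helper name `build_blastp_command`, the file names, the database name, and the default E-value are hypothetical choices made for this example.

```python
import shlex


def build_blastp_command(query, db, out, evalue=1e-5):
    """Return the argument list for a basic blastp run.

    The default E-value is an arbitrary illustrative threshold,
    not a recommendation from the BLAST+ documentation.
    """
    return [
        "blastp",
        "-query", query,          # input protein sequences
        "-db", db,                # name of a formatted protein database
        "-out", out,              # file to write results to
        "-evalue", str(evalue),   # expectation value threshold
    ]


# Hypothetical file and database names.
cmd = build_blastp_command("proteins.faa", "my_protein_db", "hits.txt")
print(shlex.join(cmd))

# To actually run the search (requires BLAST+ on PATH and an existing database):
# import subprocess
# subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when file names contain spaces.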

Basic usage of IDBA-UD for genome assembly

(2016-11-24 20:34)

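Tags:

linux

Category: Bioinformatics

Until now I have used a different program, SPAdes, for genome assembly, and the results have been quite good. But I had long heard of IDBA, so now that I have data for two bacterial strains, I am assembling each of them with both programs to compare the results. On the SPAdes website I have seen comparison charts of several assemblers. Unsurprisingly SPAdes ranks first there, but IDBA ranks second, which suggests its assembly quality is decent.

1. Usage instructions

Installation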

If you use the release package, extract it, then use make to compile the source code.

 $ ./c

Basic usage of Gblocks

(2016-11-18 14:40)

Tags:

linux

Category: Bioinformatics

Gblocks is a computer program written in ANSI C language that eliminates poorly aligned positions and divergent regions of an alignment of DNA or protein sequences.
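These positions may not be homologous, or may have been saturated by multiple substitutions, so it is best to remove them before phylogenetic analysis. In general, a less stringent block selection suits short alignments, while a more stringent selection suits long alignments. The default parameters are stringent.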
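
The general idea of discarding poorly aligned columns can be sketched in a few lines of Python. This is a deliberately simplified illustration and not the Gblocks algorithm, which also takes block lengths and flanking positions into account. The function name `filter_columns` and both thresholds are assumptions made for this example.

```python
from collections import Counter


def filter_columns(alignment, max_gap_fraction=0.0, min_conservation=0.5):
    """Keep alignment columns with few gaps and a common majority residue.

    alignment: list of equal-length aligned sequences, with '-' for gaps.
    Both thresholds are illustrative, not Gblocks defaults.
    """
    n_seqs = len(alignment)
    kept = []
    for i in range(len(alignment[0])):
        column = [seq[i] for seq in alignment]
        # Drop columns with too many gaps.
        if column.count("-") / n_seqs > max_gap_fraction:
            continue
        residues = [c for c in column if c != "-"]
        if not residues:
            continue
        # Drop columns where the most common residue is too rare.
        top_count = Counter(residues).most_common(1)[0][1]
        if top_count / n_seqs >= min_conservation:
            kept.append(i)
    # Rebuild each sequence from the retained columns only.
    return ["".join(seq[i] for i in kept) for seq in alignment]


# Toy alignment with one gappy column (index 3).
toy = ["ACG-T", "ACGAT", "ATGCT", "AGG-A"]
print(filter_columns(toy))
print(filter_columns(toy, min_conservation=0.75))
```

Raising `min_conservation` mimics a stricter selection and removes more columns, which mirrors the trade-off between strict and relaxed settings described above.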