mRNA sequence, cDNA sequence, ORF sequence, CDS sequence, Promoter, STS, ETS, strand
(2015-08-21 12:27:51) Category: biology
Gene or transcript sequences can be looked up through Ensembl:
http://www.ensembl.org/info/website/tutorials/sequence.html
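However, if you extract a transcript sequence manually, pay attention to which strand it lies on: + or 1 denotes the forward strand; - or -1 denotes the reverse strand. For what this means in detail, see: https://www.biostars.org/p/3423/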
DNA is double-stranded. By convention, for a reference chromosome, one whole strand is designated the "forward strand" and the other the "reverse strand". This designation is arbitrary. Sometimes the terms "plus strand" and "minus strand" are used instead.
Visually (I'm not talking about the transcription machinery yet), you would typically read the sequence of a strand in the 5'-to-3' direction. For the forward strand, this means reading left-to-right, and for the reverse strand it means right-to-left.
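To make the two reading directions concrete, here is a minimal Python sketch. The function name `reverse_strand_5to3` and the example sequence are hypothetical, and the complement table assumes plain A/C/G/T bases with no ambiguity codes.

```python
# Complement table for unambiguous DNA bases (assumption: no N or IUPAC codes).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_strand_5to3(forward_5to3: str) -> str:
    """Given the forward strand read 5'->3' (left-to-right), return the
    reverse strand read 5'->3' (right-to-left on the same picture)."""
    # Complement each base, then reverse the order of the bases.
    return forward_5to3.translate(COMPLEMENT)[::-1]


forward = "ATGGCCTTA"  # hypothetical forward-strand sequence, 5'->3'
reverse = reverse_strand_5to3(forward)
print(reverse)  # TAAGGCCAT
```

Applying the function twice returns the original sequence, which is a quick sanity check that both strands describe the same piece of DNA.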
A gene can live on a DNA strand in one of two orientations. The gene is said to have a coding strand (also known as its sense strand), and a template strand (also known as its antisense strand). For roughly 50% of genes, the coding strand will correspond to the chromosome's forward strand, and for the other 50% it will correspond to the reverse strand.
The mRNA (and protein) sequence of a gene corresponds to the DNA sequence as read (again, visually) from the gene's coding strand. So the mRNA sequence always corresponds to the 5'-to-3' coding sequence of a gene, with U in place of T.
Now, the RNA polymerase machinery moves along the DNA in the 5'-to-3' orientation of the coding strand (e.g. left-to-right for a forward strand gene). It reads the bases from the template strand (so it is reading in the 3'-to-5' direction from the point of view of the template strand), and builds the mRNA as it goes. This means that the mRNA matches the coding sequence of the gene, not the template sequence. (A diagram on Wikipedia illustrates this.)
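The pairing step described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name `transcribe` and the short sequences are hypothetical, and real transcription involves far more than base pairing.

```python
# Base-pairing rules for building RNA against a DNA template.
RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5: str) -> str:
    """Build the mRNA 5'->3' by pairing one RNA base with each template
    base, with the template given in 3'->5' order."""
    return "".join(RNA_PAIR[base] for base in template_3to5)


coding_5to3 = "ATGGCCTTA"    # hypothetical coding strand, written 5'->3'
template_3to5 = "TACCGGAAT"  # template strand, aligned beneath it, 3'->5'

mrna = transcribe(template_3to5)
print(mrna)  # AUGGCCUUA

# The mRNA equals the coding strand with T replaced by U.
assert mrna == coding_5to3.replace("T", "U")
```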
Annotations such as Ensembl and UCSC are concerned with the coding sequences of genes, so when they say a gene is on the forward strand, it means the gene's coding sequence is on the forward strand. To follow through again, that means that during transcription of this forward-strand gene, the gene's template sequence is read from the reverse strand, producing an mRNA that matches the sequence on the forward strand.
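Note: strand is either 1 for the forward strand or -1 for the reverse strand. In other words, assuming the exon sequences were taken from the forward-strand reference: if strand is 1, the spliced sequence becomes the mRNA sequence after T->U substitution alone, with no complementing; if strand is -1, the spliced sequence must be reverse-complemented (reversed and base-complemented) and then T->U substituted to give the mRNA sequence.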
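Following the explanation quoted above (the mRNA matches the gene's coding strand), manual extraction can be sketched roughly as follows. The function `mrna_from_exons`, the toy chromosome, and the exon coordinates are all hypothetical. The sketch assumes the reference sequence is the forward strand and that exons are given as 0-based, half-open Python slices; Ensembl's 1-based inclusive coordinates would need converting first.

```python
# Complement table for unambiguous DNA bases (assumption: no N or IUPAC codes).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def mrna_from_exons(chrom_seq: str, exons: list[tuple[int, int]], strand: int) -> str:
    """Splice exons out of a forward-strand reference and return the mRNA.

    chrom_seq: forward-strand reference sequence, 5'->3'.
    exons:     (start, end) pairs, 0-based half-open, forward-strand coordinates.
    strand:    1 for a forward-strand gene, -1 for a reverse-strand gene.
    """
    if strand not in (1, -1):
        raise ValueError("strand must be 1 or -1")
    # Join exons in ascending genomic order.
    spliced = "".join(chrom_seq[start:end] for start, end in sorted(exons))
    if strand == -1:
        # Reverse-complementing the whole joined sequence also puts the
        # exons into transcript order for a reverse-strand gene.
        spliced = spliced.translate(COMPLEMENT)[::-1]
    # The mRNA is the coding sequence with T replaced by U.
    return spliced.upper().replace("T", "U")


chrom = "TTATGGCAAAAATTAGCC"  # toy forward-strand reference
exons = [(2, 7), (12, 15)]    # "ATGGC" and "TTA"

print(mrna_from_exons(chrom, exons, 1))   # AUGGCUUA
print(mrna_from_exons(chrom, exons, -1))  # UAAGCCAU
```

Note that for strand 1 no complementing happens at all; only the reverse-strand case needs the reverse complement.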