Annotating non-human genomes with ANNOVAR
(2022-06-08 10:55:29) | Category: Daily notes |
I. Building the required annotation files, using E. coli as an example
1. Download the GTF or GFF file.
wget https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/022/665/GCF_000022665.1_ASM2266v1/GCF_000022665.1_ASM2266v1_genomic.gff.gz
wget https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/022/665/GCF_000022665.1_ASM2266v1/GCF_000022665.1_ASM2266v1_genomic.gtf.gz
2. Create the annotation database directory.
mkdir ECdb
3. Decompress the files.
gunzip GCF_000022665.1_ASM2266v1_genomic.gff.gz
gunzip GCF_000022665.1_ASM2266v1_genomic.gtf.gz
4. Download the gff3ToGenePred or gtfToGenePred tool. GTF is the recommended format, because some GFF3 files may not convert correctly.
wget http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/gtfToGenePred
wget http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/gff3ToGenePred
chmod +x gtfToGenePred gff3ToGenePred
5. Generate the GenePred file. Run only the command that matches your annotation format; both write to EC_refGene.txt.
./gtfToGenePred -genePredExt GCF_000022665.1_ASM2266v1_genomic.gtf EC_refGene.txt
./gff3ToGenePred GCF_000022665.1_ASM2266v1_genomic.gff EC_refGene.txt -useName
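Before moving on, it can help to take a quick look at what the conversion produced. The snippet below is only a sketch: summarize_genepred is a hypothetical helper (not part of ANNOVAR or the UCSC tools), and it assumes the file has no leading bin column, so the sequence name sits in column 2. A file written with -genePredExt would normally be expected to show 15 columns per record.

```shell
# summarize_genepred FILE
# Hypothetical helper: prints the record count, the column counts seen,
# and the distinct sequence names in column 2 of a tab-separated genePred file.
# Assumption: no leading "bin" column, so column 2 is the sequence name.
summarize_genepred() {
    awk -F'\t' '
        {
            ncols[NF]++      # how many records have this many columns
            chroms[$2]++     # how many records sit on this sequence
        }
        END {
            printf "records\t%d\n", NR
            for (n in ncols)  printf "columns=%d\t%d\n", n, ncols[n]
            for (c in chroms) printf "chrom=%s\t%d\n", c, chroms[c]
        }
    ' "$1"
}
```

Usage: summarize_genepred EC_refGene.txt. If more than one column count is reported, or the sequence names look unexpected, the conversion is worth rechecking with the other input format.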
6. Download the genome sequence file, decompress it, and rename it.
wget https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/022/665/GCF_000022665.1_ASM2266v1/GCF_000022665.1_ASM2266v1_genomic.fna.gz
gunzip GCF_000022665.1_ASM2266v1_genomic.fna.gz
mv GCF_000022665.1_ASM2266v1_genomic.fna GCF_000022665.1_ASM2266v1_genomic.dna.fa
7. Generate the transcript FASTA file with retrieve_seq_from_fasta.pl.
perl /apps/annovar/retrieve_seq_from_fasta.pl --format refGene \
  --seqfile GCF_000022665.1_ASM2266v1_genomic.dna.fa EC_refGene.txt \
  --outfile EC_refGeneMrna.fa
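A common reason for an empty or incomplete transcript FASTA is a mismatch between the sequence names in the GenePred file and the headers in the genome FASTA. The following is a minimal sketch for spotting that. missing_contigs is a hypothetical helper, and it assumes names are compared against the first word of each FASTA header and that the GenePred file has no bin column.

```shell
# missing_contigs GENEPRED FASTA
# Hypothetical helper: prints each sequence name from column 2 of GENEPRED
# that has no matching FASTA header (first word after ">").
# No output means every name was found.
missing_contigs() {
    genepred=$1
    fasta=$2
    awk -F'\t' -v fa="$fasta" '
        FILENAME == fa {                       # reading the FASTA file
            if (substr($0, 1, 1) == ">") {
                split(substr($0, 2), parts, /[ \t]/)
                seen[parts[1]] = 1             # remember the header name
            }
            next
        }
        !($2 in seen) && !reported[$2]++ {     # reading the genePred file
            print $2
        }
    ' "$fasta" "$genepred"
}
```

Usage: missing_contigs EC_refGene.txt GCF_000022665.1_ASM2266v1_genomic.dna.fa. Any name it prints would need to be reconciled (for example by using annotation and sequence files from the same assembly release) before rerunning step 7.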
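8. Annotate. With -buildver EC, table_annovar.pl looks for EC_refGene.txt and EC_refGeneMrna.fa inside the database directory, so move both files into ECdb first.
mv EC_refGene.txt EC_refGeneMrna.fa ECdb/
perl /apps/annovar/table_annovar.pl LDTYF06_323.raw.variant.vcf.gz ECdb \
  -buildver EC -out test -remove -protocol refGene -operation g \
  -nastring . -vcfinput --dot2underline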
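To get a first impression of the result, the functional categories in the tab-separated output can be tallied. This is a sketch under assumptions: count_column_values is a hypothetical helper, the output file is assumed to be test.EC_multianno.txt, and the column is assumed to be named Func_refGene (it would be Func.refGene without --dot2underline).

```shell
# count_column_values FILE COLUMN
# Hypothetical helper: finds COLUMN in the header line of a tab-separated
# file and prints each distinct value in that column with its count.
count_column_values() {
    awk -F'\t' -v col="$2" '
        NR == 1 {
            for (i = 1; i <= NF; i++) if ($i == col) idx = i
            if (!idx) { print "column not found: " col > "/dev/stderr"; exit 1 }
            next
        }
        { count[$idx]++ }
        END { for (v in count) printf "%s\t%d\n", v, count[v] }
    ' "$1" | sort
}
```

Usage: count_column_values test.EC_multianno.txt Func_refGene. A result made up almost entirely of intergenic calls would be a hint that the gene annotation was not matched to the variant coordinates.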
