How to batch-download all protein sequences from Ensembl Plants with Python (Python script)

(2014-04-29 09:31:04)
Tags:

python

ensembl

ftp

it

Category: Programming languages

import ftplib
import os
import socket

HOST='ftp.ensemblgenomes.org'
DIRN='pub/release-22/plants/fasta/'

def main():
        try:
                f=ftplib.FTP(HOST)
        except (socket.error, socket.gaierror):
                # A failed connection raises a socket error, not ftplib.error_perm.
                print('Cannot connect to "%s"' % HOST)
                return
        print('Connected to "%s" successfully!' % HOST)

        try:
                f.login('anonymous','1')
        except ftplib.error_perm:
                print("Failed to log in!")
                f.quit()
                return
        print("Logged in successfully!")

        try:
                f.cwd(DIRN)
        except ftplib.error_perm:
                print('Failed to change to directory "%s"' % DIRN)
                f.quit()
                return
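        # Each entry under DIRN is a species directory; its peptide FASTA
        # files live in <species>/pep.
        downloadlist=f.nlst()
        for species in downloadlist:
                try:
                        f.cwd(species+"/pep")
                except ftplib.error_perm:
                        # No pep folder here: skip this entry, otherwise the two
                        # cwd("..") calls below would run from the wrong directory.
                        print('Failed to enter folder "%s/pep"' % species)
                        continue
                files=f.nlst()
                print(files)

                for item in files:
                        if 'pep.all.fa.gz' in item and not os.path.exists(item):
                                # Close the local file even if the transfer fails.
                                with open(item,'wb') as out:
                                        f.retrbinary('RETR %s' % item,out.write)
                                print('File "%s" downloaded successfully!' % item)

                # Go back up from <species>/pep to DIRN.
                f.cwd("..")
                f.cwd("..")

        # Close the FTP connection once every species has been processed.
        f.quit()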
                                               
if __name__ == '__main__':
        main()

print('ok')
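
Once the downloads have finished, it can be worth checking that each archive is readable. The following is only a minimal sketch and is not part of the original script: count_fasta_records is a hypothetical helper name, and it assumes the downloaded files are ordinary gzip-compressed FASTA in which every record starts with a '>' header line.

```python
import gzip


def count_fasta_records(path):
    """Count the sequences in a gzip-compressed FASTA file (hypothetical helper)."""
    count = 0
    # Open in text mode; each FASTA record starts with a '>' header line.
    with gzip.open(path, 'rt') as handle:
        for line in handle:
            if line.startswith('>'):
                count += 1
    return count
```

Calling count_fasta_records on each downloaded pep.all.fa.gz file and printing the result gives a quick check. A truncated download will usually raise an error while being read, in which case the file can be deleted and the script run again.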
