The PDB and MOL2 formats

(2012-03-05 20:55:31)
Tags: pdb, mol2, format, python, misc
Category: molecular simulation
PDB format description: http://www.wwpdb.org/format/sect1.html

COLUMNS   JUSTIFY  DATA TYPE     FIELD        DEFINITION
-------------------------------------------------------------------------------------
 1 -  6   left     Record name   "ATOM  "
 7 - 11   right    Integer       serial       Atom serial number.
13 - 16   left     Atom          name         Atom name. A two-letter element symbol starts in column 13; a one-letter symbol (H, C) leaves column 13 blank and starts in column 14.
17                 Character     altLoc       Alternate location indicator.
18 - 20            Residue name  resName      Residue name.
22                 Character     chainID      Chain identifier.
23 - 26   right    Integer       resSeq       Residue sequence number.
27                 AChar         iCode        Code for insertion of residues.
31 - 38   right    Real(8.3)     x            Orthogonal coordinates for X in Angstroms.
39 - 46   right    Real(8.3)     y            Orthogonal coordinates for Y in Angstroms.
47 - 54   right    Real(8.3)     z            Orthogonal coordinates for Z in Angstroms.
55 - 60   right    Real(6.2)     occupancy    Occupancy.
61 - 66   right    Real(6.2)     tempFactor   Temperature factor.
77 - 78   right    LString(2)    element      Element symbol, right-justified.
79 - 80            LString(2)    charge       Charge on the atom.
Details of the coordinate section: http://www.wwpdb.org/format/sect9.html
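The column table above maps directly onto a fixed-width writer. The following is only a minimal sketch: the function names `format_atom_name` and `format_atom_line`, the default occupancy and temperature factor, and the handling of four-character names are assumptions made for illustration, not part of the original script. It applies the atom-name rule from the table: a two-letter element symbol starts in column 13, while a one-letter symbol leaves column 13 blank.

```python
def format_atom_name(name, element):
    """Place an atom name in columns 13-16 of an ATOM record."""
    name = name.strip()
    if len(element.strip()) == 1 and len(name) < 4:
        # One-letter element (H, C, ...): leave column 13 blank.
        return (" " + name).ljust(4)
    # Two-letter element, or a name that needs all four columns
    # (assumption: 4-character names start in column 13).
    return name.ljust(4)


def format_atom_line(serial, name, res_name, chain, res_seq,
                     x, y, z, element, occupancy=1.0, temp_factor=0.0):
    """Build one 80-column ATOM record (hypothetical helper)."""
    return (
        "ATOM  "                                # columns 1-6
        f"{serial:>5d} "                        # 7-11, column 12 blank
        f"{format_atom_name(name, element)}"    # 13-16
        " "                                     # 17, altLoc left blank
        f"{res_name:>3s} "                      # 18-20, column 21 blank
        f"{chain:1s}"                           # 22
        f"{res_seq:>4d}"                        # 23-26
        " "                                     # 27, iCode left blank
        "   "                                   # 28-30 blank
        f"{x:8.3f}{y:8.3f}{z:8.3f}"             # 31-54
        f"{occupancy:6.2f}{temp_factor:6.2f}"   # 55-66
        "          "                            # 67-76 blank
        f"{element.strip().upper():>2s}"        # 77-78, right-justified
        "  "                                    # 79-80, charge left blank
    )


line = format_atom_line(1, "CA", "ALA", "A", 1, 11.104, 6.134, -6.504, "C")
print(line)
```

Because every field is written at a fixed width, the same 0-based slices used when reading (for example `line[30:38]` for x) recover the values.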


    def ReadPDBLine(self, Line):
        """Fill this atom's fields from one ATOM/HETATM line of a PDB file."""
        self.line = Line
        # Atom name, columns 13-16 (the slice also covers column 12, normally blank).
        self.atomname = Line[11:16].strip()
        self.chain = Line[21:22]  # chain identifier, column 22
        # Residue sequence number, columns 23-26; fall back to 0 if blank.
        if Line[22:26].strip() != "":
            self.resid = int(Line[22:26])
        else:
            self.resid = 0

        # Pad the stripped name: 1 or 2 characters become 3, 3 characters become 4.
        if len(self.atomname) == 1:
            self.atomname = self.atomname + "  "
        elif len(self.atomname) in (2, 3):
            # For 3-character names this padding is necessary for babel to work,
            # though many PDBs in the PDB would have it commented out.
            self.atomname = self.atomname + " "

        # `point` is the 3D point class defined elsewhere in the original script.
        self.coordinates = point(float(Line[30:38]), float(Line[38:46]), float(Line[46:54]))

        # Element symbol given explicitly at the end of the line, columns 77-78.
        # The slice is simply empty when the line is too short.
        self.element = Line[76:78].strip().upper()
        if self.element == "":  # otherwise try to guess the element from the name
            two_letters = self.atomname[0:2].strip().upper()
            if two_letters in ('BR', 'CL', 'BI', 'AS', 'AG', 'LI', 'HG', 'MG', 'RH', 'ZN'):
                self.element = two_letters
            else:  # So, just assume it's the first letter.
                self.element = self.atomname[0:1].strip().upper()

        # Any digit needs to be removed from the element name.
        self.element = ''.join(c for c in self.element if not c.isdigit())

        # Serial number, columns 7-11 (the slice also covers column 12, normally blank).
        self.PDBIndex = Line[6:12].strip()
        # altLoc (column 17) followed by the residue name (columns 18-20).
        self.resname = Line[16:20]
        if self.resname.strip() == "": self.resname = " MOL"

The above is a Python method for reading the atom information from one line of a PDB file.

Reference: "mol2_format.pdf" (SYBYL): http://vdisk.weibo.com/s/3mw_2
MOL2 format
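# lines starting with "#" are comments
@<TRIPOS>MOLECULE
mol_name
num_atoms [num_bonds [num_subst [num_feat [num_sets]]]]   (num_subst is the number of residues; num_feat and num_sets are usually 0 0)
mol_type: SMALL, BIOPOLYMER, PROTEIN, NUCLEIC_ACID, or SACCHARIDE
charge_type: NO_CHARGES, MMFF94_CHARGES, USER_CHARGES, GASTEIGER, etc.
optional lines: status_bits, then a comment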
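As an illustration of the MOLECULE record described above, the sketch below reads its four mandatory lines into a dictionary. The function name `parse_molecule_record` and the sample molecule are hypothetical, and the code assumes, as in the Tripos specification, that only the first count on the counts line is required, so missing trailing counts are filled with 0.

```python
def parse_molecule_record(lines):
    """Parse the data lines that follow @<TRIPOS>MOLECULE (hypothetical helper)."""
    # Drop comment lines, which start with '#'.
    data = [ln.strip() for ln in lines if not ln.lstrip().startswith("#")]
    counts = [int(tok) for tok in data[1].split()]
    # Assumption: only num_atoms is mandatory; pad the rest with zeros.
    counts += [0] * (5 - len(counts))
    keys = ("num_atoms", "num_bonds", "num_subst", "num_feat", "num_sets")
    return {
        "mol_name": data[0],
        **dict(zip(keys, counts)),
        "mol_type": data[2],
        "charge_type": data[3],
    }


# Made-up example record for benzene.
sample = ["# demo", "benzene", "12 12 1 0 0", "SMALL", "NO_CHARGES"]
record = parse_molecule_record(sample)
print(record["num_atoms"], record["mol_type"], record["charge_type"])
```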

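@<TRIPOS>ATOM
atom_id atom_name x y z atom_type [subst_id [subst_name [charge]]]
Typical layout, with fields separated by a space: atom_id (as wide as the largest atom number), atom_name (4 characters, right-aligned), x y z (10 characters each), atom_type (5 characters, left-aligned), subst_id (as wide as the largest residue number), subst_name (left-aligned), charge (8 characters, right-aligned).
The fields up to and including atom_type are required.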
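The ATOM layout above can be sketched as a writer and a matching reader. This is a hedged illustration only: `format_mol2_atom` and `parse_mol2_atom` are hypothetical names, the four decimal places are an assumption (the note gives only the field widths), and the reader relies on whitespace splitting rather than on the widths.

```python
def format_mol2_atom(atom_id, name, x, y, z, atom_type,
                     subst_id=None, subst_name=None, charge=None, id_width=1):
    """Write one MOL2 ATOM line using the widths from the note above."""
    fields = [
        f"{atom_id:>{id_width}d}",   # as wide as the largest atom number
        f"{name:>4s}",               # 4 characters, right-aligned
        f"{x:10.4f}", f"{y:10.4f}", f"{z:10.4f}",  # 10 characters each
        f"{atom_type:<5s}",          # 5 characters, left-aligned
    ]
    # Optional fields are nested: each one requires the previous one.
    for value, spec in ((subst_id, "d"), (subst_name, "s"), (charge, "8.4f")):
        if value is None:
            break
        fields.append(format(value, spec))
    return " ".join(fields)


def parse_mol2_atom(line):
    """Read one MOL2 ATOM line; only the first six fields are required."""
    tok = line.split()
    atom = {
        "atom_id": int(tok[0]),
        "name": tok[1],
        "xyz": tuple(float(t) for t in tok[2:5]),
        "atom_type": tok[5],
    }
    if len(tok) > 6:
        atom["subst_id"] = int(tok[6])
    if len(tok) > 7:
        atom["subst_name"] = tok[7]
    if len(tok) > 8:
        atom["charge"] = float(tok[8])
    return atom


# Made-up aromatic carbon in a 12-atom molecule (id_width=2).
atom_line = format_mol2_atom(1, "C1", 1.207, 2.091, 0.0, "C.ar",
                             1, "BENZENE", -0.062, id_width=2)
atom = parse_mol2_atom(atom_line)
print(atom_line)
```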
@<TRIPOS>BOND
bond_id origin_atom_id target_atom_id bond_type
bond_type: 1 (single), 2 (double), 3 (triple), ar (aromatic)
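To show how the BOND lines are typically consumed, the sketch below parses each line and builds an adjacency list keyed by atom number. The names `parse_mol2_bond` and `build_adjacency` and the sample bonds are hypothetical; the bond type is kept as a string because values such as `ar` are not numeric.

```python
def parse_mol2_bond(line):
    """Read one MOL2 BOND line: bond id, two atom ids, bond type."""
    tok = line.split()
    return {
        "bond_id": int(tok[0]),
        "origin": int(tok[1]),
        "target": int(tok[2]),
        "bond_type": tok[3],  # "1", "2", "3", "ar", ...
    }


def build_adjacency(bond_lines):
    """Map each atom id to a list of (neighbour id, bond type) pairs."""
    adjacency = {}
    for line in bond_lines:
        if not line.strip():
            continue  # skip blank lines
        bond = parse_mol2_bond(line)
        adjacency.setdefault(bond["origin"], []).append(
            (bond["target"], bond["bond_type"]))
        adjacency.setdefault(bond["target"], []).append(
            (bond["origin"], bond["bond_type"]))
    return adjacency


# Made-up fragment: two aromatic bonds and one single bond.
bonds = ["1 1 2 ar", "2 2 3 ar", "3 1 7 1"]
adjacency = build_adjacency(bonds)
print(adjacency[1])
```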

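@<TRIPOS>SUBSTRUCTURE
subst_id subst_name root_atom [subst_type [dict_type [chain [sub_type [inter_bonds [status [comment]]]]]]]
root_atom: the number of the substructure's root atom (possibly the center of mass).
subst_type: temp, perm, residue, group, or domain. Small molecules are mostly GROUP; macromolecules are RESIDUE.
dict_type: the residue type; 1 for protein, 4 for small molecule.
chain: a string of at most 4 characters.
sub_type: the subtype within the chain (for example the residue type), at most 4 characters; for a small molecule it can be ****.
inter_bonds: the number of residues bonded to this residue: 0 for an isolated molecule, 1 for a terminal residue, 3 for a residue in a disulfide bond, and mostly 2 otherwise (macromolecules).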
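The nested optional fields of a SUBSTRUCTURE line can be handled by reading as many leading fields as are present. The sketch below is a minimal illustration under that assumption; `parse_substructure`, the field tuples, and the sample lines are hypothetical, and anything after the ninth field is treated as the free-text comment.

```python
FIELDS = ("subst_id", "subst_name", "root_atom", "subst_type", "dict_type",
          "chain", "sub_type", "inter_bonds", "status")
INT_FIELDS = {"subst_id", "root_atom", "dict_type", "inter_bonds"}


def parse_substructure(line):
    """Read one MOL2 SUBSTRUCTURE line with its nested optional fields."""
    # Split into at most 9 fields plus one trailing free-text comment.
    tok = line.split(None, len(FIELDS))
    if len(tok) < 3:
        raise ValueError("subst_id, subst_name and root_atom are required")
    record = {}
    for key, value in zip(FIELDS, tok):
        record[key] = int(value) if key in INT_FIELDS else value
    if len(tok) > len(FIELDS):
        record["comment"] = tok[len(FIELDS)].strip()
    return record


# Made-up examples: a terminal protein residue and a small molecule.
residue = parse_substructure("1 ALA1 1 RESIDUE 1 A ALA 1 ROOT")
ligand = parse_substructure("1 LIG 1 GROUP 4 **** **** 0 ROOT free ligand")
print(residue["inter_bonds"], ligand["sub_type"], ligand["comment"])
```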
