Protein Data Bank (file format): Difference between revisions
Revision as of 08:29, 15 October 2010
Filename extension | .pdb, .ent, .brk |
---|---|
Internet media type | chemical/x-pdb |
Type of format | chemical file format |
The Protein Data Bank (pdb) file format is a textual file format describing the three-dimensional structures of molecules held in the Protein Data Bank. Most of the information in that database pertains to proteins, and the pdb format accordingly provides for rich description and annotation of protein properties. However, proteins are often crystallized in association with other molecules or ions, such as water, nucleic acids, drug molecules and so on, which can therefore be described in the pdb format as well.
Example
A typical pdb file describing a protein consists of hundreds to thousands of lines like the following (taken from a file describing the structure of a synthetic collagen-like peptide):
HEADER    EXTRACELLULAR MATRIX                    22-JAN-98   1A3I
TITLE     X-RAY CRYSTALLOGRAPHIC DETERMINATION OF A COLLAGEN-LIKE
TITLE    2 PEPTIDE WITH THE REPEATING SEQUENCE (PRO-PRO-GLY)
...
EXPDTA    X-RAY DIFFRACTION
AUTHOR    R.Z.KRAMER,L.VITAGLIANO,J.BELLA,R.BERISIO,L.MAZZARELLA,
AUTHOR   2 B.BRODSKY,A.ZAGARI,H.M.BERMAN
...
REMARK 350 BIOMOLECULE: 1
REMARK 350 APPLY THE FOLLOWING TO CHAINS: A, B, C
REMARK 350   BIOMT1   1  1.000000  0.000000  0.000000        0.00000
REMARK 350   BIOMT2   1  0.000000  1.000000  0.000000        0.00000
...
SEQRES   1 A    9  PRO PRO GLY PRO PRO GLY PRO PRO GLY
SEQRES   1 B    6  PRO PRO GLY PRO PRO GLY
SEQRES   1 C    6  PRO PRO GLY PRO PRO GLY
...
ATOM      1  N   PRO A   1       8.316  21.206  21.530  1.00 17.44           N
ATOM      2  CA  PRO A   1       7.608  20.729  20.336  1.00 17.44           C
ATOM      3  C   PRO A   1       8.487  20.707  19.092  1.00 17.44           C
ATOM      4  O   PRO A   1       9.466  21.457  19.005  1.00 17.44           O
ATOM      5  CB  PRO A   1       6.460  21.723  20.211  1.00 22.26           C
...
HETATM  130  C   ACY   401       3.682  22.541  11.236  1.00 21.19           C
HETATM  131  O   ACY   401       2.807  23.097  10.553  1.00 21.19           O
HETATM  132  OXT ACY   401       4.306  23.101  12.291  1.00 21.19           O
...
- ATOM records
- describe the coordinates of the atoms that are part of the protein. For example, the first ATOM line above describes the backbone nitrogen (N) atom of the first residue of peptide chain A, which is a proline residue; the first three floating-point numbers are its x, y and z coordinates, in units of ångströms.[1] The next three columns are the occupancy, the temperature factor, and the element name, respectively.
- HETATM records
- describe the coordinates of hetero-atoms, that is, atoms that are not part of the protein molecule.
- SEQRES records
- give the sequences of the three peptide chains (named A, B and C), which are very short in this example but usually span multiple lines.
- REMARK records
- can contain free-form annotation, but they also accommodate standardized information; for example, the REMARK 350 BIOMT records describe how to compute the coordinates of the experimentally observed multimer from the explicitly specified coordinates of a single repeating unit.
- HEADER, TITLE and AUTHOR records
- provide the classification and title of the entry and the names of the researchers who determined the structure; numerous other record types are available to provide other kinds of information.
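The role of the REMARK 350 BIOMT records described above can be illustrated with a short Python sketch. It assumes that the three rows BIOMT1 to BIOMT3 of one operator form a 3×4 matrix (a 3×3 rotation followed by a translation column) and that the fields are separated by whitespace. The function names and the sample records are hypothetical and are not taken from the 1A3I file, whose BIOMT3 row is elided in the example above.

```python
def parse_biomt(lines):
    """Collect REMARK 350 BIOMT rows into {operator_number: [row1, row2, row3]}."""
    operators = {}
    for line in lines:
        fields = line.split()
        # Expected shape: REMARK 350 BIOMTn <operator> r1 r2 r3 t
        if len(fields) != 8 or fields[:2] != ["REMARK", "350"]:
            continue
        if fields[2] not in ("BIOMT1", "BIOMT2", "BIOMT3"):
            continue
        row_index = int(fields[2][-1]) - 1  # BIOMT1..BIOMT3 -> 0..2
        operator = int(fields[3])
        rows = operators.setdefault(operator, [None, None, None])
        rows[row_index] = [float(value) for value in fields[4:]]
    return operators


def apply_operator(rows, xyz):
    """Apply one 3x4 operator (rotation plus translation) to a coordinate."""
    x, y, z = xyz
    return tuple(r[0] * x + r[1] * y + r[2] * z + r[3] for r in rows)


# Hypothetical records: operator 1 is the identity, operator 2 rotates by
# 180 degrees about the z axis and then translates by (10, 0, 5).
hypothetical_remarks = [
    "REMARK 350 BIOMOLECULE: 1",
    "REMARK 350   BIOMT1   1  1.000000  0.000000  0.000000        0.00000",
    "REMARK 350   BIOMT2   1  0.000000  1.000000  0.000000        0.00000",
    "REMARK 350   BIOMT3   1  0.000000  0.000000  1.000000        0.00000",
    "REMARK 350   BIOMT1   2 -1.000000  0.000000  0.000000       10.00000",
    "REMARK 350   BIOMT2   2  0.000000 -1.000000  0.000000        0.00000",
    "REMARK 350   BIOMT3   2  0.000000  0.000000  1.000000        5.00000",
]

operators = parse_biomt(hypothetical_remarks)
copies = [apply_operator(rows, (1.0, 2.0, 3.0))
          for _, rows in sorted(operators.items())]
print(copies)
```

Applying every operator to every atom of the listed chains yields the coordinates of the full multimer. Splitting on whitespace is a simplification; a stricter reader would slice the fixed columns instead.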
Through the years the file format has undergone many changes and revisions. Its original format was dictated by the width of computer punch cards (80 columns). The most recent revision is 3.2.[2]
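Because of this fixed 80-column heritage, fields in a record are identified by column position rather than by delimiters. The following Python sketch reads an ATOM or HETATM line by slicing columns. The column ranges are the ones commonly given for the version 3.x coordinate section (serial 7-11, atom name 13-16, residue name 18-20, chain 22, residue number 23-26, x/y/z 31-54, occupancy 55-60, temperature factor 61-66, element 77-78) and should be checked against the specification; the names Atom and parse_atom_line are illustrative only.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    record: str
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float
    occupancy: float
    temp_factor: float
    element: str


def parse_atom_line(line: str) -> Atom:
    """Parse one ATOM/HETATM record by fixed column positions."""
    line = line.rstrip("\n").ljust(80)  # pad so every slice is in range
    record = line[0:6].strip()
    if record not in ("ATOM", "HETATM"):
        raise ValueError(f"not a coordinate record: {record!r}")
    return Atom(
        record=record,
        serial=int(line[6:11]),
        name=line[12:16].strip(),
        res_name=line[17:20].strip(),
        chain_id=line[21].strip(),      # may be blank, e.g. for some HETATM
        res_seq=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
        occupancy=float(line[54:60]),
        temp_factor=float(line[60:66]),
        element=line[76:78].strip(),
    )


# First ATOM record of the example above, laid out in fixed columns.
example = "ATOM      1  N   PRO A   1       8.316  21.206  21.530  1.00 17.44           N"
atom = parse_atom_line(example)
print(atom.name, atom.res_name, atom.chain_id, atom.x, atom.y, atom.z)
```

Slicing by column, rather than splitting on whitespace, keeps the parser correct when adjacent fields touch or when an optional field such as the chain identifier is blank.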
See also
- Chemical file format
- ScientificPython: provides an interface for Python
- Protein Data Bank
Molecular visualization software capable of displaying pdb files:
- Coot
- Jmol
- PyMOL
- RasMol
- VMD
- Gabedit
- Molden
- Molekel
- Software for molecular mechanics modeling
- Cn3D
References
- ^ "wwPDB Format version 3.2: Coordinate Section".
- ^ "Atomic Coordinate Entry Format Version 3.2". wwPDB. 2008.
External links
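- PDB Format Guide (http://www.wwpdb.org/format/): the current version (3.2) of the PDB format specification.
- PDBML (http://pdbml.rcsb.org/): a more recent, alternative XML-based file format for molecular coordinates.
- The RCSB Protein Data Bank (http://www.rcsb.org/pdb/)
- Protein Data Bank in Europe (http://www.pdbe.org)
- The Molecular Modeling DataBase (MMDB) from NCBI (http://www.ncbi.nih.gov/Structure/MMDB/mmdb.shtml)
- The Data Uniformity Project from PDB (http://www.rcsb.org/pdb/uniformity/index.html)
- MakeMultimer (http://watcut.uwaterloo.ca/cgi-bin/makemultimer): an online tool for expanding BIOMT records in pdb files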