Primary Structure Section

The primary structure section of a PDB file contains the sequence of residues in each chain of the macromolecule. Embedded in these records are chain identifiers and sequence numbers that allow other records to link into the sequence.

DBREF

Overview

The DBREF record provides cross-reference links between PDB sequences and the corresponding database entry or entries.

Record Format
 
COLUMNS       DATA TYPE          FIELD          DEFINITION
----------------------------------------------------------------
 1 - 6        Record name        "DBREF "
 8 - 11       IDcode             idCode         ID code of this entry.
13            Character          chainID        Chain identifier.
15 - 18       Integer            seqBegin       Initial sequence number 
                                                of the PDB sequence segment.
19            AChar              insertBegin    Initial insertion code 
                                                of the PDB sequence segment.
21 - 24       Integer            seqEnd         Ending sequence number 
                                                of the PDB sequence segment.
25            AChar              insertEnd      Ending insertion code 
                                                of the PDB sequence segment.
27 - 32       LString            database       Sequence database name. 
34 - 41       LString            dbAccession    Sequence database accession code.
43 - 54       LString            dbIdCode       Sequence database 
                                                 identification code.
56 - 60       Integer            dbseqBegin     Initial sequence number of the
                                                 database segment.
61            AChar              idbnsBeg       Insertion code of initial residue
                                                 of the segment, if PDB is the
                                                 reference.
63 - 67       Integer            dbseqEnd       Ending sequence number of the
                                                 database segment.
68            AChar              dbinsEnd       Insertion code of the ending
                                                 residue of the segment, if PDB is
                                                 the reference.
Details
 
 
    Database name                         database 
                                     (code in columns 27 - 32)
    ----------------------------------------------------------
    GenBank                               GB
    Protein Data Bank                     PDB
    Protein Identification Resource       PIR
    SWISS-PROT                            SWS
    TREMBL                                TREMBL
    UNIPROT                               UNP

The sequence database entry found during PDB's search is compared to that provided by the depositor, and any differences are resolved or annotated. In most cases, only one reference to a sequence database will be provided. PDB does not guarantee that all possible references to the listed databases will be provided.

Relationships to Other Record Types

DBREF represents the sequence as found in SEQRES records.

Example
 
         1         2         3         4         5         6         7
1234567890123456789012345678901234567890123456789012345678901234567890
DBREF  2J83 A   61   322  UNP    Q8TL28   Q8TL28_METAC    61    322
DBREF  2J83 B   61   322  UNP    Q8TL28   Q8TL28_METAC    61    322
DBREF  1ABC B    1B   36  PDB    1ABC     1ABC             1B    36
DBREF  3AKY      3   220  SWS    P07170   KAD1_YEAST       5    222 
DBREF  1HAN      2   288  GB     397884   X66122           1    287 
DBREF  3HSV A    1    92  SWS    P22121   HSF_KLULA      193    284
DBREF  3HSV B    1    92  SWS    P22121   HSF_KLULA      193    284
DBREF  1ARL      1   307  SWS    P00730   CBPA_BOVIN     111    417  
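
Because every DBREF field sits at a fixed column position, a record can be read by slicing rather than by splitting on whitespace. The following is a minimal sketch under that assumption; the names `DbRef`, `cols`, and `parse_dbref` are hypothetical and not part of the format, and the column ranges are taken directly from the record format table above.

```python
from typing import NamedTuple


class DbRef(NamedTuple):
    """One DBREF record; fields mirror the record format table."""
    id_code: str
    chain_id: str
    seq_begin: int
    insert_begin: str
    seq_end: int
    insert_end: str
    database: str
    db_accession: str
    db_id_code: str
    db_seq_begin: int
    db_ins_begin: str
    db_seq_end: int
    db_ins_end: str


def cols(line: str, first: int, last: int) -> str:
    """Return 1-based, inclusive columns first..last with blanks stripped."""
    return line[first - 1:last].strip()


def parse_dbref(line: str) -> DbRef:
    """Parse a single DBREF line by fixed column position."""
    if not line.startswith("DBREF "):
        raise ValueError("not a DBREF record")
    line = line.ljust(80)  # pad so that short lines can still be sliced
    return DbRef(
        id_code=cols(line, 8, 11),
        chain_id=cols(line, 13, 13),        # empty string if the chain is blank
        seq_begin=int(cols(line, 15, 18)),
        insert_begin=cols(line, 19, 19),
        seq_end=int(cols(line, 21, 24)),
        insert_end=cols(line, 25, 25),
        database=cols(line, 27, 32),
        db_accession=cols(line, 34, 41),
        db_id_code=cols(line, 43, 54),
        db_seq_begin=int(cols(line, 56, 60)),
        db_ins_begin=cols(line, 61, 61),
        db_seq_end=int(cols(line, 63, 67)),
        db_ins_end=cols(line, 68, 68),
    )


record = parse_dbref(
    "DBREF  2J83 A   61   322  UNP    Q8TL28   Q8TL28_METAC    61    322"
)
print(record.chain_id, record.database, record.db_accession)
```

Slicing by column keeps insertion codes (columns 19, 25, 61, and 68) separate from the sequence numbers they follow, which a whitespace split would merge into a single token such as `1B`.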

SEQADV

Overview

The SEQADV record identifies conflicts between sequence information in the SEQRES records of the PDB entry and the sequence database entry given on DBREF. Please note that these records were designed to identify differences and not errors. No assumption is made as to which database contains the correct data. PDB may include REMARK records in the entry that reflect the depositor's view of which database has the correct sequence.

Record Format
 
COLUMNS       DATA TYPE          FIELD          DEFINITION
-----------------------------------------------------------------
 1 - 6        Record name        "SEQADV"
 8 - 11       IDcode             idCode         ID code of this entry.
13 - 15       Residue name       resName        Name of the PDB residue in conflict.
17            Character          chainID        PDB chain identifier.
19 - 22       Integer            seqNum         PDB sequence number.
23            AChar              iCode          PDB insertion code.
25 - 28       LString            database       Sequence database name.
30 - 38       LString            dbIdCode       Sequence database accession number.
40 - 42       Residue name       dbRes          Sequence database residue name.
44 - 48       Integer            dbSeq          Sequence database sequence number.
50 - 70       LString            conflict       Conflict comment.
Details
 
 
       Cloning artifact
       Conflict
       Engineered
       Disordered
       Variant
       Insertion
       Deletion
       Microheterogeneity
       D-configuration

SEQADV records are automatically generated by the PDB.

Relationships to Other Record Types

SEQADV refers to the sequence as found in the SEQRES records, and to the sequence database reference found on DBREF. REMARK 999 contains text that explains discrepancies when the explanation is too lengthy to fit in SEQADV.

Example
 
         1         2         3         4         5         6         7
1234567890123456789012345678901234567890123456789012345678901234567890
SEQADV 2J83 ALA A  269  UNP  Q8TL28    CYS   269 ENGINEERED MUTATION
SEQADV 2J83 ALA B  269  UNP  Q8TL28    CYS   269 ENGINEERED MUTATION 
SEQADV 3ABC MET A   -1  SWS  P10725              CLONING ARTIFACT
SEQADV 3ABC GLY A   50  SWS  P10725    VAL    50 ENGINEERED
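
A SEQADV line can be read with the same column-slicing approach. The sketch below is illustrative only: `SeqAdv` and `parse_seqadv` are hypothetical names. It assumes, based on the cloning artifact line in the example above, that the database residue name and sequence number may be blank.

```python
from typing import NamedTuple, Optional


class SeqAdv(NamedTuple):
    """One SEQADV record; fields mirror the record format table."""
    id_code: str
    res_name: str
    chain_id: str
    seq_num: int
    i_code: str
    database: str
    db_id_code: str
    db_res: str
    db_seq: Optional[int]
    conflict: str


def parse_seqadv(line: str) -> SeqAdv:
    """Parse a single SEQADV line by fixed column position."""
    if not line.startswith("SEQADV"):
        raise ValueError("not a SEQADV record")
    line = line.ljust(80)  # pad so that short lines can still be sliced

    def cols(first: int, last: int) -> str:
        # 1-based, inclusive column range with blanks stripped
        return line[first - 1:last].strip()

    db_seq = cols(44, 48)
    return SeqAdv(
        id_code=cols(8, 11),
        res_name=cols(13, 15),
        chain_id=cols(17, 17),
        seq_num=int(cols(19, 22)),   # may be negative, e.g. -1
        i_code=cols(23, 23),
        database=cols(25, 28),
        db_id_code=cols(30, 38),
        db_res=cols(40, 42),         # blank when no database residue is given
        db_seq=int(db_seq) if db_seq else None,
        conflict=cols(50, 70),
    )


adv = parse_seqadv(
    "SEQADV 2J83 ALA A  269  UNP  Q8TL28    CYS   269 ENGINEERED MUTATION"
)
print(adv.res_name, adv.seq_num, adv.db_res, adv.conflict)
```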

SEQRES

Overview

SEQRES records contain the amino acid or nucleic acid sequence of residues in each chain of the macromolecule that was studied.

Record Format
 
COLUMNS      DATA TYPE       FIELD          DEFINITION
-------------------------------------------------------------------
 1 -  6      Record name     "SEQRES"
 9 - 10      Integer         serNum         Serial number of the SEQRES record
                                            for the current chain. Starts at 1
                                            and increments by one each line.
                                            Reset to 1 for each chain.
12           Character       chainID        Chain identifier. This may be any
                                            single legal character, including a
                                            blank which is used if there is
                                            only one chain.
14 - 17      Integer         numRes         Number of residues in the chain.
                                            This value is repeated on every
                                            record.
20 - 22      Residue name    resName        Residue name.
24 - 26      Residue name    resName        Residue name.
28 - 30      Residue name    resName        Residue name.
32 - 34      Residue name    resName        Residue name.
36 - 38      Residue name    resName        Residue name.
40 - 42      Residue name    resName        Residue name.
44 - 46      Residue name    resName        Residue name.
48 - 50      Residue name    resName        Residue name.
52 - 54      Residue name    resName        Residue name.
56 - 58      Residue name    resName        Residue name.
60 - 62      Residue name    resName        Residue name.
64 - 66      Residue name    resName        Residue name.
68 - 70      Residue name    resName        Residue name.

Verification/Validation/Value Authority Control

The residues presented on the SEQRES records must agree with those found in the ATOM records. The SEQRES records are checked by PDB using the sequence databases and information provided by the depositor. SEQRES is compared to the ATOM records during processing, and both are checked against the sequence database. All discrepancies are either resolved or annotated in the entry.

Example
 
         1         2         3         4         5         6         7
1234567890123456789012345678901234567890123456789012345678901234567890
SEQRES   1 A   21  GLY ILE VAL GLU GLN CYS CYS THR SER ILE CYS SER LEU
SEQRES   2 A   21  TYR GLN LEU GLU ASN TYR CYS ASN                    
SEQRES   1 B   30  PHE VAL ASN GLN HIS LEU CYS GLY SER HIS LEU VAL GLU
SEQRES   2 B   30  ALA LEU TYR LEU VAL CYS GLY GLU ARG GLY PHE PHE TYR
SEQRES   3 B   30  THR PRO LYS ALA                                    
SEQRES   1 C   21  GLY ILE VAL GLU GLN CYS CYS THR SER ILE CYS SER LEU
SEQRES   2 C   21  TYR GLN LEU GLU ASN TYR CYS ASN                    
SEQRES   1 D   30  PHE VAL ASN GLN HIS LEU CYS GLY SER HIS LEU VAL GLU
SEQRES   2 D   30  ALA LEU TYR LEU VAL CYS GLY GLU ARG GLY PHE PHE TYR
SEQRES   3 D   30  THR PRO LYS ALA                                    
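
The SEQRES rules stated in the record format (serNum restarts at 1 for each chain and increments by one, numRes is repeated on every line, and up to 13 residue names appear per line) can be checked while the per-chain sequence is assembled. The following is a rough sketch of such a reader; `parse_seqres` is a hypothetical helper, and it reads serNum from columns 9 - 10 exactly as the table above specifies.

```python
from typing import Dict, Iterable, List


def parse_seqres(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Collect residue names per chain, checking serNum and numRes."""
    chains: Dict[str, List[str]] = {}
    declared: Dict[str, int] = {}   # numRes announced for each chain
    serials: Dict[str, int] = {}    # last serNum seen for each chain

    for raw in lines:
        if not raw.startswith("SEQRES"):
            continue
        line = raw.ljust(70)
        ser_num = int(line[8:10])    # columns 9 - 10
        chain_id = line[11]          # column 12, may be a blank
        num_res = int(line[13:17])   # columns 14 - 17

        # 13 residue slots, three columns wide, at columns 20, 24, ..., 68
        names = [line[start:start + 3].strip() for start in range(19, 70, 4)]
        names = [name for name in names if name]

        if chain_id not in chains:
            chains[chain_id] = []
            declared[chain_id] = num_res
            serials[chain_id] = 0

        serials[chain_id] += 1
        if ser_num != serials[chain_id]:
            raise ValueError(f"chain {chain_id!r}: unexpected serNum {ser_num}")
        if num_res != declared[chain_id]:
            raise ValueError(f"chain {chain_id!r}: numRes changed to {num_res}")
        chains[chain_id].extend(names)

    # The residue count must match the numRes declared on the records.
    for chain_id, residues in chains.items():
        if len(residues) != declared[chain_id]:
            raise ValueError(f"chain {chain_id!r}: residue count mismatch")
    return chains


example = [
    "SEQRES   1 A   21  GLY ILE VAL GLU GLN CYS CYS THR SER ILE CYS SER LEU",
    "SEQRES   2 A   21  TYR GLN LEU GLU ASN TYR CYS ASN",
    "SEQRES   1 B   30  PHE VAL ASN GLN HIS LEU CYS GLY SER HIS LEU VAL GLU",
    "SEQRES   2 B   30  ALA LEU TYR LEU VAL CYS GLY GLU ARG GLY PHE PHE TYR",
    "SEQRES   3 B   30  THR PRO LYS ALA",
]
sequences = parse_seqres(example)
print({chain: len(residues) for chain, residues in sequences.items()})
```

Reading the residue names slot by slot, rather than splitting on whitespace, keeps the sketch tied to the column layout in the table.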

Known Problems

Polysaccharides do not lend themselves to being represented in SEQRES. There is no mechanism provided to describe sequence runs when the exact ordering of the sequence is not known. For cyclic peptides, PDB arbitrarily assigns a residue as the N-terminus. No distinction is made between ribo- and deoxyribonucleotides in the SEQRES records. These residues are identified with the same residue name (i.e., A, C, G, T, U).

MODRES

Overview

The MODRES record provides descriptions of modifications (e.g., chemical or post-translational) to protein and nucleic acid residues. Included is a mapping between residue names given in a PDB entry and standard residues.

Record Format
 
COLUMNS    DATA TYPE        FIELD      DEFINITION
----------------------------------------------------
 1 - 6     Record name      "MODRES"
 8 - 11    IDcode           idCode     ID code of this entry.
13 - 15    Residue name     resName    Residue name used in this entry.
17         Character        chainID    Chain identifier.
19 - 22    Integer          seqNum     Sequence number.
23         AChar            iCode      Insertion code.
25 - 27    Residue name     stdRes     Standard residue name.
30 - 70    String           comment    Description of the residue
                                       modification.
Details
 
 
       Glycosylation site
       Post-translational modification
       Designed chemical modification
       Phosphorylation site
       Blocked N-terminus
       Aminated C-terminus
       D-configuration
       Reduced peptide bond

MODRES is generated by the PDB.

Relationships to Other Record Types

MODRES maps ATOM and HETATM records to the standard residue names. SEQADV, HET, and FORMUL may also appear.

Example
 
         1         2         3         4         5         6         7
1234567890123456789012345678901234567890123456789012345678901234567890
MODRES 1ABC ASN A   22A ASN  GLYCOSYLATION SITE
MODRES 2ABC TTQ A   50A TRP  POST-TRANSLATIONAL MODIFICATION
MODRES 3ABC DAL A   32  ALA  POST-TRANSLATIONAL MODIFICATION,D-ALANINE
MODRES 3ABC DAL B   32  ALA  POST-TRANSLATIONAL MODIFICATION,D-ALANINE
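
One way to use MODRES is to build a lookup from a residue position to its standard residue name. The sketch below assumes the column layout in the record format above; `modres_map` and the `(chainID, seqNum, iCode)` key are hypothetical choices for illustration, and the input lines are the format samples from the example, which carry different ID codes.

```python
from typing import Dict, Iterable, Tuple

ResidueKey = Tuple[str, int, str]      # (chainID, seqNum, iCode)
ModInfo = Tuple[str, str, str]         # (resName, stdRes, comment)


def modres_map(lines: Iterable[str]) -> Dict[ResidueKey, ModInfo]:
    """Map each modified residue position to its standard residue name."""
    mapping: Dict[ResidueKey, ModInfo] = {}
    for raw in lines:
        if not raw.startswith("MODRES"):
            continue
        line = raw.ljust(70)
        res_name = line[12:15].strip()   # columns 13 - 15
        chain_id = line[16]              # column 17
        seq_num = int(line[18:22])       # columns 19 - 22
        i_code = line[22].strip()        # column 23, empty if blank
        std_res = line[24:27].strip()    # columns 25 - 27
        comment = line[29:70].strip()    # columns 30 - 70
        mapping[(chain_id, seq_num, i_code)] = (res_name, std_res, comment)
    return mapping


samples = [
    "MODRES 2ABC TTQ A   50A TRP  POST-TRANSLATIONAL MODIFICATION",
    "MODRES 3ABC DAL A   32  ALA  POST-TRANSLATIONAL MODIFICATION,D-ALANINE",
]
modified = modres_map(samples)
for key, (res_name, std_res, comment) in modified.items():
    print(key, res_name, "->", std_res, "|", comment)
```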

© 2007 wwPDB