4. Heterogen Section

The heterogen section of a PDB file contains the complete description of non-standard residues in the entry.


HET

Overview

HET records are used to describe non-standard residues, such as prosthetic groups, inhibitors, solvent molecules, and ions for which coordinates are supplied. Groups are considered HET if they are:

- not one of the standard amino acids, and
- not one of the nucleic acids (C, G, A, T, U, and I), and
- not one of the modified versions of nucleic acids (+C, +G, +A, +T, +U, and +I), and
- not an unknown amino acid or nucleic acid where UNK is used to indicate the unknown residue name.

HET records also describe heterogens whose chemical identity is unknown; such a group is assigned the hetID UNK.
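
The criteria above can be sketched as a name-based test. This is only an illustration: the function name is hypothetical, the set of 20 standard amino acid codes is an assumption (this section does not list them), and a residue name alone cannot tell an unknown amino acid named UNK from an unknown heterogen that is also assigned UNK.

```python
# Assumption: the 20 common three-letter amino acid codes count as "standard".
STANDARD_AMINO_ACIDS = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}
NUCLEIC_ACIDS = {"C", "G", "A", "T", "U", "I"}
MODIFIED_NUCLEIC_ACIDS = {"+" + base for base in NUCLEIC_ACIDS}


def is_het_name(res_name):
    """Return True if the residue name passes the four exclusion tests."""
    name = res_name.strip().upper()
    if name in STANDARD_AMINO_ACIDS:
        return False
    if name in NUCLEIC_ACIDS or name in MODIFIED_NUCLEIC_ACIDS:
        return False
    if name == "UNK":
        # Ambiguous by name alone: treated here as an unknown residue,
        # although UNK is also the hetID of an unknown heterogen.
        return False
    return True
```

Resolving the UNK case would need information beyond the residue name, for example whether the group appears in SEQRES.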

Record Format

COLUMNS        DATA TYPE       FIELD         DEFINITION
---------------------------------------------------------------------------------
 1 -  6        Record name     "HET   "

 8 - 10        LString(3)      hetID         Het identifier, right-justified.

13             Character       ChainID       Chain identifier.

14 - 17        Integer         seqNum        Sequence number.

18             AChar           iCode         Insertion code.

21 - 25        Integer         numHetAtoms   Number of HETATM records for the
                                             group present in the entry.

31 - 70        String          text          Text describing Het group.
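
As an illustration of the column layout above, the following is a minimal sketch of a fixed-column reader. The class and function names (HetRecord, parse_het) are hypothetical and not part of the format; the slices are the table's 1-based column ranges converted to 0-based Python indices.

```python
from dataclasses import dataclass


@dataclass
class HetRecord:
    het_id: str
    chain_id: str
    seq_num: int
    i_code: str
    num_het_atoms: int
    text: str


def parse_het(line):
    """Parse one HET record using the column ranges from the table."""
    if line[0:6].rstrip() != "HET":
        raise ValueError("not a HET record")
    # Pad short lines so that every slice below is defined.
    line = line.rstrip("\n").ljust(70)
    return HetRecord(
        het_id=line[7:10].strip(),      # columns  8 - 10
        chain_id=line[12].strip(),      # column  13
        seq_num=int(line[13:17]),       # columns 14 - 17
        i_code=line[17].strip(),        # column  18
        num_het_atoms=int(line[20:25]), # columns 21 - 25
        text=line[30:70].strip(),       # columns 31 - 70
    )


record = parse_het("HET    STA  I   4      25     PART_OF: HIV INHIBITOR;")
print(record)
```

Blank chain identifiers and insertion codes come back as empty strings. HETATM, HETNAM, and HETSYN lines are rejected because the record name is compared over all six columns.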

Details

* Each HET group is assigned a hetID of not more than three (3) alphanumeric characters. The sequence number, chain identifier, insertion code, and number of coordinate records are given for each occurrence of the HET group in the entry. The chemical name of the HET group is given in the HETNAM record and synonyms for the chemical name are given in the HETSYN records.

* There is a separate HET record for each occurrence of the HET group in an entry.

* A particular HET group is represented in the PDB archives with a unique hetID.

* PDB entries do not have HET records for water molecules.

* The Text field is for descriptive material. The token PART_OF followed by a value may be used to indicate that the HET group is part of a larger group which has been represented by its separate components (e.g., PART_OF: actinomycin). Segment identifiers, columns 73 - 76 of ATOM/HETATM records, may also be used to relate individual components of a large HET group.

* Unknown atoms or ions will be represented as UNX with the chemical formula X1.

Verification/Validation/Value Authority Control

For each HET group that appears in the entry, PDB checks that the corresponding HET, HETNAM, HETSYN, FORMUL, HETATM, and CONECT records appear, if applicable. The HET record is generated automatically by PDB using the het group dictionary and information from the HETATM records.

Each unique hetID represents a unique molecule.
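
One consistency check implied by the numHetAtoms field can be sketched as follows: count the HETATM records for each group and compare the count with the value declared in the HET record. This is not the PDB's own validation procedure. The function names are hypothetical, and the HETATM column positions (resName 18 - 20, chainID 22, resSeq 23 - 26, iCode 27) are taken from the coordinate record layout, not from this section.

```python
from collections import Counter


def het_key(line):
    """Identify a group from a HET record: (hetID, chain, seqNum, iCode)."""
    line = line.rstrip("\n").ljust(25)
    return (line[7:10].strip(), line[12].strip(),
            int(line[13:17]), line[17].strip())


def hetatm_key(line):
    """Identify a group from a HETATM record (assumed column layout)."""
    line = line.rstrip("\n").ljust(27)
    return (line[17:20].strip(), line[21].strip(),
            int(line[22:26]), line[26].strip())


def check_het_counts(lines):
    """Return (group, declared, found) for every HET record whose
    numHetAtoms differs from the number of matching HETATM records."""
    found = Counter(hetatm_key(l) for l in lines if l.startswith("HETATM"))
    mismatches = []
    for l in lines:
        if l[0:6].rstrip() == "HET":
            key = het_key(l)
            declared = int(l[20:25])  # columns 21 - 25 of the HET record
            if declared != found[key]:
                mismatches.append((key, declared, found[key]))
    return mismatches
```

HETATM records for water are counted but never reported, since no HET record refers to them.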

Relationships to Other Record Types

For each HET group that appears in the entry, the corresponding HET, HETNAM, HETSYN, FORMUL, HETATM, and CONECT records must appear, if applicable. LINK records may also appear.

Example

         1         2         3         4         5         6         7
1234567890123456789012345678901234567890123456789012345678901234567890
HET    TRS    975       8

HET    STA  I   4      25     PART_OF: HIV INHIBITOR;

HET    FUC  Y   1      10     PART_OF: NONOATE COMPLEX; L-FUCOSE
HET    GAL  Y   2      11     PART_OF: NONOATE COMPLEX
HET    NAG  Y   3      15     PART_OF: NONOATE COMPLEX
HET    FUC  Y   4      10     PART_OF: NONOATE COMPLEX
HET    NON  Y   5      12     PART_OF: NONOATE COMPLEX

HET    UNX  A 161       1     PSEUDO CARBON ATOM OF UNKNOWN LIGAND
HET    UNX  A 162       1     PSEUDO CARBON ATOM OF UNKNOWN LIGAND
HET    UNX  A 163       1     PSEUDO CARBON ATOM OF UNKNOWN LIGAND

Known Problems

A group that is chemically bound to another may lose atoms (e.g., H, O) in forming the bond, but the PDB has only one representation for each group, that of the complete molecule. However, a few small groups are represented separately as ions, groups, and molecules.

PDB does not include CAS registry and Cambridge Structural Database (CSD) accession numbers.

Large HET groups are broken into recognizable sub-groups to work around the limitations of the atom naming conventions used by the PDB. How to reassemble the full molecule is described in a REMARK. The token PART_OF and segment identifiers may help to describe the larger entity.
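
The PART_OF token can be used to collect the components of such a group. The sketch below is a hedged illustration only: the function names are hypothetical, and treating a semicolon as the end of the PART_OF value is an assumption inferred from the example records, not a rule stated in this section.

```python
from collections import defaultdict


def part_of(text):
    """Return the value that follows PART_OF: in a HET text field, or None."""
    marker = "PART_OF:"
    start = text.find(marker)
    if start == -1:
        return None
    value = text[start + len(marker):]
    # Assumption: a semicolon, if present, ends the value.
    value = value.split(";", 1)[0].strip()
    return value or None


def group_components(het_lines):
    """Map each PART_OF value to its (hetID, chain, seqNum) components."""
    groups = defaultdict(list)
    for line in het_lines:
        parent = part_of(line[30:70])  # text field, columns 31 - 70
        if parent is not None:
            groups[parent].append(
                (line[7:10].strip(), line[12].strip(), int(line[13:17]))
            )
    return dict(groups)
```

Records without a PART_OF token are skipped. The REMARK that describes reassembly is still needed to recover how the components connect.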