The heterogen section of a PDB file contains the complete description of non-standard residues in the entry.
Overview
HET records are used to describe non-standard residues, such as prosthetic groups, inhibitors, solvent molecules, and ions for which coordinates are supplied. Groups are considered HET if they are:
- not one of the standard amino acids, and
- not one of the nucleic acids (C, G, A, T, U, and I), and
- not one of the modified versions of nucleic acids (+C, +G, +A, +T, +U, and +I), and
- not an unknown amino acid or nucleic acid where UNK is used to indicate the unknown residue name.
HET records also describe heterogens whose chemical identity is unknown, in which case the group is assigned the hetID UNK.
Record Format
COLUMNS     DATA TYPE     FIELD         DEFINITION
---------------------------------------------------------------------------------
 1 -  6     Record name   "HET   "
 8 - 10     LString(3)    hetID         Het identifier, right-justified.
13          Character     ChainID       Chain identifier.
14 - 17     Integer       seqNum        Sequence number.
18          AChar         iCode         Insertion code.
21 - 25     Integer       numHetAtoms   Number of HETATM records for the group
                                        present in the entry.
31 - 70     String        text          Text describing Het group.
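The column layout above maps directly onto fixed-width string slices. The following is a minimal Python sketch of such a parser, not an official implementation; the names `HetRecord` and `parse_het` are hypothetical, and the code assumes the line is padded with blanks where optional fields are empty.

```python
from dataclasses import dataclass


@dataclass
class HetRecord:
    het_id: str         # columns 8-10
    chain_id: str       # column 13
    seq_num: int        # columns 14-17
    i_code: str         # column 18
    num_het_atoms: int  # columns 21-25
    text: str           # columns 31-70


def parse_het(line: str) -> HetRecord:
    """Parse one HET record using the fixed columns listed in the format table."""
    # Pad short lines so that every slice below is well defined.
    line = line.rstrip("\n").ljust(70)
    # The record name occupies columns 1-6; this also rejects HETNAM, HETSYN, HETATM.
    if line[0:6] != "HET   ":
        raise ValueError("not a HET record")
    # 1-based inclusive columns a-b become the 0-based slice [a-1:b].
    return HetRecord(
        het_id=line[7:10].strip(),
        chain_id=line[12].strip(),
        seq_num=int(line[13:17]),
        i_code=line[17].strip(),
        num_het_atoms=int(line[20:25]),
        text=line[30:70].strip(),
    )
```

Blank chain identifiers and insertion codes come back as empty strings, which keeps comparisons between records simple.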
Details
* Each HET group is assigned a hetID of not more than three (3) alphanumeric characters. The sequence number, chain identifier, insertion code, and number of coordinate records are given for each occurrence of the HET group in the entry. The chemical name of the HET group is given in the HETNAM record and synonyms for the chemical name are given in the HETSYN records.
* There is a separate HET record for each occurrence of the HET group in an entry.
* A particular HET group is represented in the PDB archives with a unique hetID.
* PDB entries do not have HET records for water molecules.
* The Text field is for descriptive material. The token PART_OF followed by a value may be used to indicate that the HET group is part of a larger group which has been represented by its separate components (e.g., PART_OF: actinomycin). Segment identifiers, columns 73 - 76 of ATOM/HETATM records, may also be used to relate individual components of a large HET group.
* Unknown atoms or ions will be represented as UNX with the chemical formula X1.
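As an illustration of the PART_OF token described above, the sketch below extracts the name of the larger group from a HET text field. The function name `part_of` is hypothetical, and treating a semicolon as the end of the value is an assumption inferred from example records such as "PART_OF: NONOATE COMPLEX; L-FUCOSE"; the format description does not define a delimiter.

```python
from typing import Optional


def part_of(text: str) -> Optional[str]:
    """Return the value following PART_OF in a HET text field, or None."""
    token = "PART_OF"
    pos = text.find(token)
    if pos < 0:
        return None
    # Skip the optional colon and blanks that follow the token.
    rest = text[pos + len(token):].lstrip(" :")
    # Assumption: a semicolon, if present, terminates the value.
    value = rest.split(";", 1)[0].strip()
    return value or None
```

Grouping HET records by the returned value gives one way to collect the components of a larger group that has been split into separate HET groups.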
Verification/Validation/Value Authority Control
For each het group that appears in the entry, PDB checks that the corresponding HET, HETNAM, HETSYN, FORMUL, HETATM, and CONECT records appear, if applicable. The HET record is generated automatically by PDB using the het group dictionary and information from the HETATM records.
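One check of this kind can be sketched as follows: compare the numHetAtoms value of each HET record with the number of HETATM records found for the same group. This is an illustrative Python sketch, not the procedure PDB actually runs. The names `het_key` and `check_het_counts` are hypothetical, and the HETATM column positions (residue name 18-20, chain 22, sequence number 23-26, insertion code 27) come from the coordinate record layout rather than from this section.

```python
from collections import Counter


def het_key(name, chain, seq, icode):
    """Identify one occurrence of a HET group within an entry."""
    return (name.strip(), chain.strip(), int(seq), icode.strip())


def check_het_counts(lines):
    """Return (key, declared, counted) for every HET record whose
    numHetAtoms differs from the number of matching HETATM records."""
    declared = {}
    counted = Counter()
    for raw in lines:
        line = raw.rstrip("\n").ljust(80)
        if line[0:6] == "HET   ":
            key = het_key(line[7:10], line[12], line[13:17], line[17])
            declared[key] = int(line[20:25])
        elif line[0:6] == "HETATM":
            # Assumed HETATM layout: resName 18-20, chainID 22,
            # resSeq 23-26, iCode 27 (1-based columns).
            key = het_key(line[17:20], line[21], line[22:26], line[26])
            counted[key] += 1
    return [(k, n, counted[k]) for k, n in declared.items() if counted[k] != n]
```

An empty result means every declared count agrees with the coordinate records. Atoms listed under several alternate locations are each counted here, which may or may not match how a given entry was prepared.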
Each unique hetID represents a unique molecule.
Relationships to Other Record Types
For each het group that appears in the entry, the corresponding HET, HETNAM, HETSYN, FORMUL, HETATM, and CONECT records must appear, if applicable. LINK records may also appear.
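The relationship above can be checked mechanically. The sketch below lists hetIDs that have a HET record but no HETNAM or FORMUL record; HETSYN and CONECT are left out because they apply only in some cases. The function name `missing_companions` is hypothetical, and the hetID positions for HETNAM (columns 12-14) and FORMUL (columns 13-15) are assumptions taken from the descriptions of those records, not from this section.

```python
def missing_companions(lines):
    """Report hetIDs from HET records that lack HETNAM or FORMUL records."""
    het_ids, named, with_formula = set(), set(), set()
    for raw in lines:
        line = raw.rstrip("\n").ljust(80)
        tag = line[0:6]
        if tag == "HET   ":
            het_ids.add(line[7:10].strip())       # hetID, columns 8-10
        elif tag == "HETNAM":
            named.add(line[11:14].strip())        # assumed hetID, columns 12-14
        elif tag == "FORMUL":
            with_formula.add(line[12:15].strip())  # assumed hetID, columns 13-15
    return {
        "HETNAM": sorted(het_ids - named),
        "FORMUL": sorted(het_ids - with_formula),
    }
```

Both lists are empty when every HET group in the entry has a name and a formula record.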
Example
         1         2         3         4         5         6         7
1234567890123456789012345678901234567890123456789012345678901234567890
HET    TRS    975       8
HET    STA  I   4      25     PART_OF: HIV INHIBITOR;
HET    FUC  Y   1      10     PART_OF: NONOATE COMPLEX; L-FUCOSE
HET    GAL  Y   2      11     PART_OF: NONOATE COMPLEX
HET    NAG  Y   3      15     PART_OF: NONOATE COMPLEX
HET    FUC  Y   4      10     PART_OF: NONOATE COMPLEX
HET    NON  Y   5      12     PART_OF: NONOATE COMPLEX
HET    UNX  A 161       1     PSEUDO CARBON ATOM OF UNKNOWN LIGAND
HET    UNX  A 162       1     PSEUDO CARBON ATOM OF UNKNOWN LIGAND
HET    UNX  A 163       1     PSEUDO CARBON ATOM OF UNKNOWN LIGAND
Known Problems
Even though groups may be chemically bound to others with loss of atoms (e.g., H, O), the PDB has only one representation for the complete molecule. However, a few small groups are represented separately as ions, groups, and molecules.
PDB does not include CAS registry and Cambridge Structural Database (CSD) accession numbers.
Large het groups are broken into recognizable sub-groups to obviate difficulties associated with the limitations of the atom naming conventions used by the PDB. The description of how to reassemble the full molecule is addressed in a REMARK. The token PART_OF and use of segment identifiers may help to describe the larger entity.