Class SubstructureIdentifier

  • All Implemented Interfaces:
    Serializable, StructureIdentifier

    public class SubstructureIdentifier
    extends Object
    implements StructureIdentifier
    This is the canonical way to identify a part of a structure.

    The current syntax allows the specification of a set of residues from the first model of a structure. Future versions may be extended to represent additional properties.

    Identifiers should adhere to the following specification, although some additional forms may be tolerated where unambiguous for backwards compatibility.

                    name          := pdbID
                                   | pdbID '.' chainID
                                   | pdbID '.' range
                    range         := range (',' range)?
                                   | chainID
                                   | chainID '_' resNum '-' resNum
                    pdbID         := [1-9][a-zA-Z0-9]{3}
                                   | PDB_[a-zA-Z0-9]{8}
                    chainID       := [a-zA-Z0-9]+
                    resNum        := [-+]?[0-9]+[A-Za-z]?
     
    For example:
                    1TIM                                    #whole structure (short format)
                    1tim                                    #same as above
                    4HHB.C                                  #single chain
                    3AA0.A,B                                #two chains
                    4GCR.A_1-40                             #substructure
                    3iek.A_17-28,A_56-294,A_320-377         #substructure of 3 disjoint parts
                    PDB_00001TIM                            #whole structure (extended format)
                    pdb_00001tim                            #same as above
                    PDB_00004HHB.C                          #single chain
                    PDB_00003AA0.A,B                        #two chains
                    PDB_00004GCR.A_1-40                     #substructure
                    pdb_00003iek.A_17-28,A_56-294,A_320-377 #substructure of 3 disjoint parts
     
    More options may be added to the specification at a future time.
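
The grammar above can be turned into a regular expression. The following is a minimal sketch, not BioJava's own parser: the class IdentifierGrammarSketch and its methods are hypothetical names introduced for illustration. It assumes the "PDB_" prefix is matched case-insensitively (the examples list both PDB_00001TIM and pdb_00001tim) and it accepts only the forms in the specification, not the additional forms tolerated for backwards compatibility.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of BioJava) that checks strings against the grammar above. */
final class IdentifierGrammarSketch {

    // pdbID := [1-9][a-zA-Z0-9]{3} | PDB_[a-zA-Z0-9]{8}
    // Assumption: the "PDB_" prefix is case-insensitive, as the examples suggest.
    private static final String PDB_ID = "(?:[1-9][a-zA-Z0-9]{3}|(?i:PDB)_[a-zA-Z0-9]{8})";
    // chainID := [a-zA-Z0-9]+
    private static final String CHAIN_ID = "[a-zA-Z0-9]+";
    // resNum := [-+]?[0-9]+[A-Za-z]?
    private static final String RES_NUM = "[-+]?[0-9]+[A-Za-z]?";
    // One range element: a whole chain, or a chain restricted to a residue interval.
    private static final String RANGE = CHAIN_ID + "(?:_" + RES_NUM + "-" + RES_NUM + ")?";

    // name := pdbID, optionally followed by '.' and a comma-separated list of ranges.
    private static final Pattern NAME =
            Pattern.compile("(" + PDB_ID + ")(?:\\.(" + RANGE + "(?:," + RANGE + ")*))?");

    /** Returns true if the whole string matches the grammar. */
    static boolean isValid(String id) {
        return NAME.matcher(id).matches();
    }

    /** Returns the range elements of a valid identifier (empty for a whole structure). */
    static List<String> ranges(String id) {
        Matcher m = NAME.matcher(id);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a valid identifier: " + id);
        }
        List<String> out = new ArrayList<>();
        if (m.group(2) != null) {
            for (String r : m.group(2).split(",")) {
                out.add(r);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String[] samples = {"1TIM", "3AA0.A,B", "pdb_00003iek.A_17-28,A_56-294,A_320-377", "1TIM.A_1"};
        for (String id : samples) {
            System.out.println(id + " valid=" + isValid(id));
        }
    }
}
```

Because resNum may carry a sign, the '-' separator is ambiguous to a naive split; matching the full pattern, as here, avoids splitting a range such as A_-5-10 in the wrong place.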
    Author:
    dmyersturnbull, Spencer Bliven
    See Also:
    Serialized Form
    • Constructor Detail

      • SubstructureIdentifier

        public SubstructureIdentifier(String pdbId,
                                      List<ResidueRange> ranges)
        Create a new identifier based on a set of ranges. If ranges is empty, includes all residues.
        Parameters:
        pdbId - a PDB ID; can't be null
        ranges - the ranges
      • SubstructureIdentifier

        public SubstructureIdentifier(PdbId pdbId,
                                      List<ResidueRange> ranges)
        Create a new identifier based on a set of ranges. If ranges is empty, includes all residues.
        Parameters:
        pdbId - a PDB ID
        ranges - the ranges; if empty, all residues are included
    • Method Detail

      • getIdentifier

        public String getIdentifier()
        Get the String form of this identifier. This provides the canonical form for a StructureIdentifier and has all the information needed to recreate a particular substructure. Example: 3iek.A_17-28,A_56-294
        Specified by:
        getIdentifier in interface StructureIdentifier
        Returns:
        The String form of this identifier
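
To illustrate how a PDB ID and a list of ranges map onto the string form, here is a minimal sketch under stated assumptions. CanonicalFormSketch, its nested Range class, and toIdentifier are hypothetical stand-ins, not BioJava's PdbId or ResidueRange, and the sketch does not normalize the case of the PDB ID, which the real method may do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringJoiner;

/** Hypothetical illustration (not BioJava code) of assembling the string form. */
final class CanonicalFormSketch {

    /** Hypothetical stand-in for a residue range; start and end are null for a whole chain. */
    static final class Range {
        final String chain;
        final String start;
        final String end;

        Range(String chain, String start, String end) {
            this.chain = chain;
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            // chainID, or chainID '_' resNum '-' resNum
            return start == null ? chain : chain + "_" + start + "-" + end;
        }
    }

    /** Joins the PDB ID and ranges as pdbID '.' range (',' range)*. */
    static String toIdentifier(String pdbId, List<Range> ranges) {
        if (pdbId == null) {
            throw new NullPointerException("pdbId can't be null");
        }
        if (ranges.isEmpty()) {
            return pdbId; // an empty range list means all residues
        }
        StringJoiner joiner = new StringJoiner(",", pdbId + ".", "");
        for (Range r : ranges) {
            joiner.add(r.toString());
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        List<Range> ranges = new ArrayList<>();
        ranges.add(new Range("A", "17", "28"));
        ranges.add(new Range("A", "56", "294"));
        System.out.println(toIdentifier("3iek", ranges)); // prints 3iek.A_17-28,A_56-294
    }
}
```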
      • getPdbId

        public PdbId getPdbId()
        Get the PDB identifier part of the SubstructureIdentifier.
        Returns:
        the PDB ID
      • reduce

        public Structure reduce(Structure s)
                         throws StructureException
        Takes a complete structure as input and reduces it to the residues present in the specified ranges.

        The returned structure will be a shallow copy of the input, with shared Chains, Residues, etc.

        Ligands are handled in a special way. If a full chain is selected (e.g. '1ABC.A'), then any waters and ligands with a matching chain name are included. If a residue range is present (e.g. '1ABC.A_1-100'), then any ligands (technically, non-water non-polymer atoms) within StructureTools.DEFAULT_LIGAND_PROXIMITY_CUTOFF of the selected range are included, regardless of chain.

        Specified by:
        reduce in interface StructureIdentifier
        Parameters:
        s - A full structure, e.g. as loaded from the PDB. The structure ID should match that returned by getPdbId().
        Returns:
        The reduced structure: a shallow copy of s restricted to the specified ranges
        Throws:
        StructureException
        See Also:
        StructureTools#getReducedStructure(Structure, String)
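
The shallow-copy behaviour described for reduce can be sketched with a toy model. This is not BioJava's implementation: ReduceSketch, Residue, and Range are hypothetical types, residue numbers are plain integers (no insertion codes), and ligand handling is left out. It only shows that the reduced collection is new while the residue objects in it are shared with the input.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical toy model (not BioJava code) of reducing a structure to a set of ranges. */
final class ReduceSketch {

    /** Hypothetical residue: a chain name plus an integer sequence number. */
    static final class Residue {
        final String chain;
        final int number;

        Residue(String chain, int number) {
            this.chain = chain;
            this.number = number;
        }
    }

    /** Hypothetical inclusive residue range on a single chain. */
    static final class Range {
        final String chain;
        final int start;
        final int end;

        Range(String chain, int start, int end) {
            this.chain = chain;
            this.start = start;
            this.end = end;
        }

        boolean contains(Residue r) {
            return chain.equals(r.chain) && r.number >= start && r.number <= end;
        }
    }

    /** Returns a new list holding the same Residue objects that fall inside any range. */
    static List<Residue> reduce(List<Residue> full, List<Range> ranges) {
        if (ranges.isEmpty()) {
            return new ArrayList<>(full); // no ranges: keep all residues
        }
        List<Residue> reduced = new ArrayList<>();
        for (Residue r : full) {
            for (Range range : ranges) {
                if (range.contains(r)) {
                    reduced.add(r); // shared reference, not a deep copy
                    break;
                }
            }
        }
        return reduced;
    }

    public static void main(String[] args) {
        List<Residue> full = new ArrayList<>();
        for (int i = 1; i <= 5; i++) {
            full.add(new Residue("A", i));
        }
        List<Range> ranges = new ArrayList<>();
        ranges.add(new Range("A", 2, 4));
        List<Residue> reduced = reduce(full, ranges);
        System.out.println("kept " + reduced.size() + " residues; shared object: " + (reduced.get(0) == full.get(1)));
    }
}
```

Because the residues are shared, a change made through the reduced list's elements would also be visible through the full list in this model.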
      • copyLigandsByProximity

        protected static void copyLigandsByProximity(Structure full,
                                                     Structure reduced,
                                                     double cutoff,
                                                     int fromModel,
                                                     int toModel)
        Supplements the reduced structure with ligands from the full structure based on a distance cutoff. Ligand groups are moved (destructively) from full to reduced if they fall within the cutoff of any atom in the reduced structure.
        Parameters:
        full - Structure containing all ligands
        reduced - Structure with a subset of the polymer groups from full
        cutoff - Distance cutoff (Å)
        fromModel - source model in full
        toModel - destination model in reduced
        See Also:
        StructureTools.getLigandsByProximity(java.util.Collection, Atom[], double)
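
The distance-cutoff selection described for copyLigandsByProximity can be sketched as follows. This is an illustration under assumptions, not the BioJava implementation: LigandProximitySketch, Group, and moveLigandsByProximity are hypothetical names, atoms are bare {x, y, z} triples in Å, models are ignored, a distance equal to the cutoff counts as within it, and ligands are compared only against the groups that were in the reduced list before any ligand was moved.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical illustration (not BioJava code) of moving ligands by a distance cutoff. */
final class LigandProximitySketch {

    /** Hypothetical group: a name plus atom coordinates as {x, y, z} triples in Å. */
    static final class Group {
        final String name;
        final double[][] atoms;

        Group(String name, double[][] atoms) {
            this.name = name;
            this.atoms = atoms;
        }
    }

    /** True if any atom of the ligand lies within the cutoff of any atom in the selection. */
    static boolean withinCutoff(Group ligand, List<Group> selection, double cutoff) {
        double cutoffSq = cutoff * cutoff; // compare squared distances to avoid sqrt
        for (double[] a : ligand.atoms) {
            for (Group g : selection) {
                for (double[] b : g.atoms) {
                    double dx = a[0] - b[0];
                    double dy = a[1] - b[1];
                    double dz = a[2] - b[2];
                    if (dx * dx + dy * dy + dz * dz <= cutoffSq) {
                        return true;
                    }
                }
            }
        }
        return false;
    }

    /** Moves matching ligands out of fullLigands and into reduced (destructive on both lists). */
    static void moveLigandsByProximity(List<Group> fullLigands, List<Group> reduced, double cutoff) {
        // Assumption of this sketch: measure against the original selection only,
        // so a moved ligand does not pull in further ligands.
        List<Group> selection = new ArrayList<>(reduced);
        Iterator<Group> it = fullLigands.iterator();
        while (it.hasNext()) {
            Group ligand = it.next();
            if (withinCutoff(ligand, selection, cutoff)) {
                it.remove();
                reduced.add(ligand);
            }
        }
    }

    public static void main(String[] args) {
        List<Group> ligands = new ArrayList<>();
        ligands.add(new Group("NEAR", new double[][] {{3.0, 0.0, 0.0}}));
        ligands.add(new Group("FAR", new double[][] {{10.0, 0.0, 0.0}}));
        List<Group> reduced = new ArrayList<>();
        reduced.add(new Group("ALA", new double[][] {{0.0, 0.0, 0.0}}));
        moveLigandsByProximity(ligands, reduced, 5.0);
        System.out.println("reduced=" + reduced.size() + " remaining=" + ligands.size());
    }
}
```

The all-pairs loop is quadratic in the number of atoms; for large structures a spatial grid or similar index would be the usual replacement.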