Class StructureTools


  • public class StructureTools
    extends Object
    A class that provides some tool methods.
    Since:
    1.0
    Author:
    Andreas Prlic, Jules Jacobsen
    • Method Detail

      • getNrAtoms

        public static int getNrAtoms​(Structure s)
        Count how many Atoms are contained within a Structure object.
        Parameters:
        s - the structure object
        Returns:
        the number of Atoms in this Structure
      • getNrGroups

        public static int getNrGroups​(Structure s)
        Count how many groups are contained within a structure object.
        Parameters:
        s - the structure object
        Returns:
        the number of groups in the structure
      • getAtomArray

        public static Atom[] getAtomArray​(Structure s,
                                          String[] atomNames)
        Returns an array of the requested Atoms from the Structure object. Iterates over all groups and checks if the requested atoms are in this group, no matter if this is a AminoAcid or HetatomImpl group. If the group does not contain all requested atoms then no atoms are added for that group. For structures with more than one model, only model 0 will be used.
        Parameters:
        s - the structure to get the atoms from
        atomNames - contains the atom names to be used.
        Returns:
        an Atom[] array
      • getAtomArrayAllModels

        public static Atom[] getAtomArrayAllModels​(Structure s,
                                                   String[] atomNames)
        Returns an array of the requested Atoms from the Structure object. In contrast to getAtomArray(Structure, String[]) this method iterates over all chains. Iterates over all chains and groups and checks if the requested atoms are in this group, no matter if this is a AminoAcid or HetatomImpl group. If the group does not contain all requested atoms then no atoms are added for that group. For structures with more than one model, only model 0 will be used.
        Parameters:
        s - the structure to get the atoms from
        atomNames - contains the atom names to be used.
        Returns:
        an Atom[] array
      • getAllAtomArray

        public static Atom[] getAllAtomArray​(Structure s)
        Convert all atoms of the structure (all models) into an Atom array
        Parameters:
        s - input structure
        Returns:
        all atom array
      • getAllAtomArray

        public static Atom[] getAllAtomArray​(Structure s,
                                             int model)
        Convert all atoms of the structure (specified model) into an Atom array
        Parameters:
        s - input structure
        Returns:
        all atom array
      • getAllAtomArray

        public static Atom[] getAllAtomArray​(Chain c)
        Returns and array of all atoms of the chain, including Hydrogens (if present) and all HETATOMs. Waters are not included.
        Parameters:
        c - input chain
        Returns:
        all atom array
      • getUnalignedGroups

        public static List<GroupgetUnalignedGroups​(Atom[] ca)
        List of groups from the structure not included in ca (e.g. ligands). Unaligned groups are searched from all chains referenced in ca, as well as any chains in the first model of the structure from ca[0], if any.
        Parameters:
        ca - an array of atoms
        Returns:
      • getLigandsByProximity

        public static List<GroupgetLigandsByProximity​(Collection<Group> target,
                                                        Atom[] query,
                                                        double cutoff)
        Finds all ligand groups from the target which fall within the cutoff distance of some atom from the query set.
        Parameters:
        target - Set of groups including the ligands
        query - Atom selection
        cutoff - Distance from query atoms to consider, in angstroms
        Returns:
        All groups from the target with at least one atom within cutoff of a query atom
        See Also:
        DEFAULT_LIGAND_PROXIMITY_CUTOFF
      • addGroupToStructure

        public static Chain addGroupToStructure​(Structure s,
                                                Group g,
                                                int model,
                                                Chain chainGuess,
                                                boolean clone)
        Adds a particular group to a structure. A new chain will be created if necessary.

        When adding multiple groups, pass the return value of one call as the chainGuess parameter of the next call for efficiency.

         Chain guess = null;
         for(Group g : groups) {
             guess = addGroupToStructure(s, g, guess );
         }
         
        Parameters:
        s - structure to receive the group
        g - group to add
        chainGuess - (optional) If not null, should be a chain from s. Used to improve performance when adding many groups from the same chain
        clone - Indicates whether the input group should be cloned before being added to the new chain
        Returns:
        the chain g was added to
      • addGroupsToStructure

        public static void addGroupsToStructure​(Structure s,
                                                Collection<Group> groups,
                                                int model,
                                                boolean clone)
        Add a list of groups to a new structure. Chains will be automatically created in the new structure as needed.
        Parameters:
        s - structure to receive the group
        groups - groups to add
        model - model number
        clone - Indicates whether the input groups should be cloned before being added to the new chain
      • getAllGroupsFromSubset

        public static Set<GroupgetAllGroupsFromSubset​(Atom[] atoms)
        Expand a set of atoms into all groups from the same structure. If the structure is set, only the first atom is used (assuming all atoms come from the same original structure). If the atoms aren't linked to a structure (for instance, for cloned atoms), searches all chains of all atoms for groups.
        Parameters:
        atoms - Sample of atoms
        Returns:
        All groups from all chains accessible from the input atoms
      • getAllGroupsFromSubset

        public static Set<GroupgetAllGroupsFromSubset​(Atom[] atoms,
                                                        GroupType types)
        Expand a set of atoms into all groups from the same structure. If the structure is set, only the first atom is used (assuming all atoms come from the same original structure). If the atoms aren't linked to a structure (for instance, for cloned atoms), searches all chains of all atoms for groups.
        Parameters:
        atoms - Sample of atoms
        types - Type of groups to return (useful for getting only ligands, for instance). Null gets all groups.
        Returns:
        All groups from all chains accessible from the input atoms
      • getAllNonHAtomArray

        public static final Atom[] getAllNonHAtomArray​(Structure s,
                                                       boolean hetAtoms)
        Returns and array of all non-Hydrogen atoms in the given Structure, optionally including HET atoms or not. Waters are not included.
        Parameters:
        s -
        hetAtoms - if true HET atoms are included in array, if false they are not
        Returns:
      • getAllNonHAtomArray

        public static Atom[] getAllNonHAtomArray​(Structure s,
                                                 boolean hetAtoms,
                                                 int modelNr)
        Returns and array of all non-Hydrogen atoms in the given Structure, optionally including HET atoms or not. Waters are not included.
        Parameters:
        s -
        hetAtoms - if true HET atoms are included in array, if false they are not
        modelNr - Model number to draw atoms from
        Returns:
      • getAllNonHAtomArray

        public static Atom[] getAllNonHAtomArray​(Chain c,
                                                 boolean hetAtoms)
        Returns and array of all non-Hydrogen atoms in the given Chain, optionally including HET atoms or not Waters are not included.
        Parameters:
        c -
        hetAtoms - if true HET atoms are included in array, if false they are not
        Returns:
      • getAllNonHCoordsArray

        public static javax.vecmath.Point3d[] getAllNonHCoordsArray​(Chain c,
                                                                    boolean hetAtoms)
        Returns and array of all non-Hydrogen atoms coordinates in the given Chain, optionally including HET atoms or not Waters are not included.
        Parameters:
        c -
        hetAtoms - if true HET atoms are included in array, if false they are not
        Returns:
      • getAtomArray

        public static Atom[] getAtomArray​(Chain c,
                                          String[] atomNames)
        Returns an array of the requested Atoms from the Chain object. Iterates over all groups and checks if the requested atoms are in this group, no matter if this is a AminoAcid or Hetatom group. If the group does not contain all requested atoms then no atoms are added for that group.
        Parameters:
        c - the Chain to get the atoms from
        atomNames - contains the atom names to be used.
        Returns:
        an Atom[] array
      • getRepresentativeAtomArray

        public static Atom[] getRepresentativeAtomArray​(Chain c)
        Gets a representative atom for each group that is part of the chain backbone. Note that modified aminoacids won't be returned as part of the backbone if the ReducedChemCompProvider was used to load the structure. For amino acids, the representative is a CA carbon. For nucleotides, the representative is the "P". Other group types will be ignored.
        Parameters:
        c -
        Returns:
        representative Atoms of the chain backbone
        Since:
        Biojava 4.1.0
      • cloneAtomArray

        public static Atom[] cloneAtomArray​(Atom[] ca)
        Provides an equivalent copy of Atoms in a new array. Clones everything, starting with parent groups and chains. The chain will only contain groups that are part of the input array.
        Parameters:
        ca - array of representative atoms, e.g. CA atoms
        Returns:
        Atom array
        Since:
        Biojava 4.1.0
      • cloneGroups

        public static Group[] cloneGroups​(Atom[] ca)
        Clone a set of representative Atoms, but returns the parent groups
        Parameters:
        ca - Atom array
        Returns:
        Group array
      • duplicateCA2

        public static Atom[] duplicateCA2​(Atom[] ca2)
        Utility method for working with circular permutations. Creates a duplicated and cloned set of Calpha atoms from the input array.
        Parameters:
        ca2 - atom array
        Returns:
        cloned and duplicated set of input array
      • getRepresentativeAtomArray

        public static Atom[] getRepresentativeAtomArray​(Structure s)
        Gets a representative atom for each group that is part of the chain backbone. Note that modified aminoacids won't be returned as part of the backbone if the ReducedChemCompProvider was used to load the structure. For amino acids, the representative is a CA carbon. For nucleotides, the representative is the "P". Other group types will be ignored.
        Parameters:
        s - Input structure
        Returns:
        representative Atoms of the structure backbone
        Since:
        Biojava 4.1.0
      • getBackboneAtomArray

        public static Atom[] getBackboneAtomArray​(Structure s)
        Return an Atom array of the main chain atoms: CA, C, N, O Any group that contains those atoms will be included, be it a standard aminoacid or not
        Parameters:
        s - the structure object
        Returns:
        an Atom[] array
      • get1LetterCodeAmino

        public static Character get1LetterCodeAmino​(String groupCode3)
        Convert three character amino acid codes into single character e.g. convert CYS to C. Valid 3-letter codes will be those of the standard 20 amino acids plus MSE, CSE, SEC, PYH, PYL (see the aminoAcids map)
        Parameters:
        groupCode3 - a three character amino acid representation String
        Returns:
        the 1 letter code, or null if the given 3 letter code does not correspond to an amino acid code
        See Also:
        get1LetterCode(String)
      • get1LetterCode

        public static Character get1LetterCode​(String groupCode3)
        Convert a three letter amino acid or nucleotide code into a single character code. If the code does not correspond to an amino acid or nucleotide, returns UNKNOWN_GROUP_LABEL. Returned null for nucleotides prior to version 4.0.1.
        Parameters:
        groupCode3 - three letter representation
        Returns:
        The 1-letter abbreviation
      • isNucleotide

        public static boolean isNucleotide​(String groupCode3)
        Test if the three-letter code of an ATOM entry corresponds to a nucleotide or to an aminoacid.
        Parameters:
        groupCode3 - 3-character code for a group.
      • getAtomsInContact

        public static AtomContactSet getAtomsInContact​(Chain chain,
                                                       String[] atomNames,
                                                       double cutoff)
        Returns the set of intra-chain contacts for the given chain for given atom names, i.e. the contact map. Uses a spatial hashing algorithm that speeds up the calculation without need of full distance matrix. The parsing mode FileParsingParameters.setAlignSeqRes(boolean) needs to be set to true for this to work.
        Parameters:
        chain -
        atomNames - the array with atom names to be used. Beware: CA will do both C-alphas an Calciums if null all non-H atoms of non-hetatoms will be used
        cutoff -
        Returns:
      • getAtomsInContact

        public static AtomContactSet getAtomsInContact​(Chain chain,
                                                       double cutoff)
        Returns the set of intra-chain contacts for the given chain for all non-H atoms of non-hetatoms, i.e. the contact map. Uses a spatial hashing algorithm that speeds up the calculation without need of full distance matrix. The parsing mode FileParsingParameters.setAlignSeqRes(boolean) needs to be set to true for this to work.
        Parameters:
        chain -
        cutoff -
        Returns:
      • getRepresentativeAtomsInContact

        public static AtomContactSet getRepresentativeAtomsInContact​(Chain chain,
                                                                     double cutoff)
        Returns the set of intra-chain contacts for the given chain for C-alpha or C3' atoms (including non-standard aminoacids appearing as HETATM groups), i.e. the contact map. Uses a spatial hashing algorithm that speeds up the calculation without need of full distance matrix.
        Parameters:
        chain -
        cutoff -
        Returns:
        Since:
        Biojava 4.1.0
      • getAtomsInContact

        public static AtomContactSet getAtomsInContact​(Chain chain1,
                                                       Chain chain2,
                                                       String[] atomNames,
                                                       double cutoff,
                                                       boolean hetAtoms)
        Returns the set of inter-chain contacts between the two given chains for the given atom names. Uses a spatial hashing algorithm that speeds up the calculation without need of full distance matrix. The parsing mode FileParsingParameters.setAlignSeqRes(boolean) needs to be set to true for this to work.
        Parameters:
        chain1 -
        chain2 -
        atomNames - the array with atom names to be used. For Calphas use {"CA"}, if null all non-H atoms will be used. Note HET atoms are ignored unless this parameter is null.
        cutoff -
        hetAtoms - if true HET atoms are included, if false they are not
        Returns:
      • getAtomsInContact

        public static AtomContactSet getAtomsInContact​(Chain chain1,
                                                       Chain chain2,
                                                       double cutoff,
                                                       boolean hetAtoms)
        Returns the set of inter-chain contacts between the two given chains for all non-H atoms. Uses a spatial hashing algorithm that speeds up the calculation without need of full distance matrix. The parsing mode FileParsingParameters.setAlignSeqRes(boolean) needs to be set to true for this to work.
        Parameters:
        chain1 -
        chain2 -
        cutoff -
        hetAtoms - if true HET atoms are included, if false they are not
        Returns:
      • getGroupDistancesWithinShell

        public static Map<Group,​DoublegetGroupDistancesWithinShell​(Structure structure,
                                                                           Atom centroid,
                                                                           Set<ResidueNumber> excludeResidues,
                                                                           double radius,
                                                                           boolean includeWater,
                                                                           boolean useAverageDistance)
        Finds Groups in structure that contain at least one Atom that is within radius Angstroms of centroid.
        Parameters:
        structure - The structure from which to find Groups
        centroid - The centroid of the shell
        excludeResidues - A list of ResidueNumbers to exclude
        radius - The radius from centroid, in Angstroms
        includeWater - Whether to include Groups whose only atoms are water
        useAverageDistance - When set to true, distances are the arithmetic mean (1-norm) of the distances of atoms that belong to the group and that are within the shell; otherwise, distances are the minimum of these values
        Returns:
        A map of Groups within (or partially within) the shell, to their distances in Angstroms
      • getGroupsWithinShell

        public static Set<GroupgetGroupsWithinShell​(Structure structure,
                                                      Group group,
                                                      double distance,
                                                      boolean includeWater)

        Returns a Set of Groups in a structure within the distance specified of a given group.

        Updated 18-Sep-2015 sroughley to return a Set so only a unique set of Groups returned

        Parameters:
        structure - The structure to work with
        group - The 'query' group
        distance - The cutoff distance
        includeWater - Should water residues be included in the output?
        Returns:
        LinkedHashSet of Groups within at least one atom with distance of at least one atom in group
      • removeModels

        public static Structure removeModels​(Structure s)
        Remove all models from a Structure and keep only the first
        Parameters:
        s - original Structure
        Returns:
        a structure that contains only the first model
        Since:
        3.0.5
      • getStructure

        public static Structure getStructure​(String name,
                                             PDBFileParser parser,
                                             AtomCache cache)
                                      throws IOException,
                                             StructureException
        Flexibly get a structure from an input String. The intent of this method is to allow any reasonable string which could refer to a structure to be correctly parsed. The following are currently supported:
        1. Filename (if name refers to an existing file)
        2. PDB ID
        3. SCOP domains
        4. PDP domains
        5. Residue ranges
        6. Other formats supported by AtomCache
        Parameters:
        name - Some reference to the protein structure
        parser - A clean PDBFileParser to use if it is a file. If null, a PDBFileParser will be instantiated if needed.
        cache - An AtomCache to use if the structure can be fetched from the PDB. If null, a AtomCache will be instantiated if needed.
        Returns:
        A Structure object
        Throws:
        IOException - if name is an existing file, but doesn't parse correctly
        StructureException - if the format is unknown, or if AtomCache throws an exception.
      • cleanUpAltLocs

        public static void cleanUpAltLocs​(Structure structure)
        Cleans up the structure's alternate location (altloc) groups. All alternate location groups should have all atoms (except in the case of microheterogenity) or when a deuterium exists. Ensure that all the alt loc groups have all the atoms in the main group.
        Parameters:
        structure - The Structure to be cleaned up
      • hasNonDeuteratedEquiv

        public static boolean hasNonDeuteratedEquiv​(Atom atom,
                                                    Group currentGroup)
        Check to see if an Deuterated atom has a non deuterated brother in the group.
        Parameters:
        atom - the input atom that is putatively deuterium
        currentGroup - the group the atom is in
        Returns:
        true if the atom is deuterated and it's hydrogen equive exists.
      • hasDeuteratedEquiv

        public static boolean hasDeuteratedEquiv​(Atom atom,
                                                 Group currentGroup)
        Check to see if a Hydrogen has a Deuterated brother in the group.
        Parameters:
        atom - the input atom that is putatively hydorgen
        currentGroup - the group the atom is in
        Returns:
        true if the atom is hydrogen and it's Deuterium equiv exists.
      • reduceToRepresentativeAtoms

        public static void reduceToRepresentativeAtoms​(Structure structure)
        Remove all atoms but the representative atoms (C alphas or phosphates) from the given structure.
        Parameters:
        structure - the structure
        Since:
        5.4.0