oddt.toolkits.extras.rdkit package¶
Submodules¶
oddt.toolkits.extras.rdkit.fixer module¶
-
oddt.toolkits.extras.rdkit.fixer.
AddMissingAtoms
(protein, residue, amap, template)[source]¶ Add missing atoms to the protein molecule, only at the given residue, according to the template.
- Parameters
- protein: rdkit.Chem.rdchem.RWMol
Mol with whole protein. Note that it is modified in place.
- residue:
Mol with residue only
- amap: list
List mapping atom IDs in residue to atom IDs in whole protein (amap[i] = j means that i’th atom in residue corresponds to j’th atom in protein)
- template:
Residue template
- Returns
- protein: rdkit.Chem.rdchem.RWMol
Modified protein
- visited_bonds: list
Bonds that match the template
- is_complete: bool
Indicates whether all atoms in template were found in residue
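The amap convention can be illustrated without RDKit. The sketch below uses plain lists of hypothetical atom names as stand-ins for the residue and protein molecules; the helper names are made up for illustration and are not part of ODDT.

```python
# Hypothetical stand-ins: atom names in a whole protein and in one residue.
protein_atoms = ["N", "CA", "C", "O", "N", "CA", "CB", "C", "O"]
residue_atoms = ["N", "CA", "CB", "C", "O"]

# amap[i] = j: the i-th residue atom is the j-th protein atom.
amap = [4, 5, 6, 7, 8]


def residue_to_protein(residue_idx, amap):
    """Translate a residue atom index to the matching protein atom index."""
    return amap[residue_idx]


def protein_to_residue(protein_idx, amap):
    """Inverse lookup; returns None if the atom is outside the residue."""
    try:
        return amap.index(protein_idx)
    except ValueError:
        return None


# Every mapped pair refers to the same atom name in both lists.
for i, j in enumerate(amap):
    assert residue_atoms[i] == protein_atoms[j]
```

Because the protein is modified in place, indices held in amap refer to the protein as it was when the mapping was built.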
-
oddt.toolkits.extras.rdkit.fixer.
ExtractPocketAndLigand
(mol, cutoff=12.0, expandResidues=True, ligand_residue=None, ligand_residue_blacklist=None, append_residues=None)[source]¶ Extract a ligand (the largest HETATM residue) and the protein pocket within a given distance cutoff. The selection of pocket atoms can be expanded to contain whole residues. Single-atom HETATM residues (metals and waters) are assigned to the pocket.
- Parameters
- mol: rdkit.Chem.rdchem.Mol
Molecule with a protein-ligand complex
- cutoff: float (default=12.)
Distance cutoff for the pocket atoms
- expandResidues: bool (default=True)
Expand selection to whole residues within cutoff.
- ligand_residue: string (default None)
Residue name that explicitly points to the ligand(s).
- ligand_residue_blacklist: array-like, optional (default None)
List of residues to ignore during ligand lookup.
- append_residues: array-like, optional (default None)
List of residues to append to pocket, even if they are HETATM, such as MSE, ATP, AMP, ADP, etc.
- Returns
- pocket: rdkit.Chem.rdchem.RWMol
Pocket constructed of protein residues/atoms around ligand
- ligand: rdkit.Chem.rdchem.RWMol
Largest HETATM residue contained in input molecule
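The interplay of cutoff and expandResidues can be sketched in plain Python. This is a simplified stand-in, not ODDT's implementation: atoms are hypothetical (residue id, coordinates) tuples, the function name is invented, and treating the cutoff as inclusive is an assumption.

```python
import math

# Hypothetical protein atoms: (residue id, xyz coordinates).
protein = [
    ((1, "ALA", "A"), (0.0, 0.0, 0.0)),
    ((1, "ALA", "A"), (20.0, 0.0, 0.0)),
    ((2, "GLY", "A"), (30.0, 0.0, 0.0)),
]
ligand_coords = [(1.0, 0.0, 0.0)]


def select_pocket(protein, ligand_coords, cutoff=12.0, expand_residues=True):
    """Return sorted indices of protein atoms that form the pocket."""
    # Atoms within the cutoff of any ligand atom (inclusive: an assumption).
    close = {
        i
        for i, (_, xyz) in enumerate(protein)
        if any(math.dist(xyz, lig) <= cutoff for lig in ligand_coords)
    }
    if expand_residues:
        # Pull in every atom of each residue that has at least one close atom.
        hit_residues = {protein[i][0] for i in close}
        close = {i for i, (res, _) in enumerate(protein) if res in hit_residues}
    return sorted(close)
```

With expansion enabled, the second ALA atom is selected even though it lies 19 units from the ligand, because it shares a residue with an atom inside the cutoff.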
-
oddt.toolkits.extras.rdkit.fixer.
FetchAffinityTable
(pdbids, affinity_types)[source]¶ Fetch affinity data from RCSB PDB server.
- Parameters
- pdbids: array-like
List of PDB IDs of structures with protein-ligand complexes.
- affinity_types: array-like
List of types of affinity data to retrieve. Available types are: Ki, Kd, EC50, IC50, deltaG, deltaH, deltaS, Ka.
- Returns
- ligand_affinity: pd.DataFrame
Table with protein-ligand binding affinities. The table contains the following columns: structureId, ligandId, ligandFormula, ligandMolecularWeight, plus columns named after the affinity types specified by the user.
-
oddt.toolkits.extras.rdkit.fixer.
FetchStructure
(pdbid, sanitize=False, removeHs=True, cache_dir=None)[source]¶ Fetch the structure in PDB format from RCSB PDB server and read it with rdkit.
- Parameters
- pdbid: str
PDB ID of the structure
- sanitize: bool, optional (default False)
Toggles molecule sanitization
- removeHs: bool, optional (default True)
Indicates whether Hs should be removed during reading
- Returns
- mol: Chem.rdchem.Mol
Retrieved molecule
-
oddt.toolkits.extras.rdkit.fixer.
GetAtomResidueId
(atom)[source]¶ Return (residue number, residue name, chain id) for a given atom
-
oddt.toolkits.extras.rdkit.fixer.
GetResidues
(mol, atom_list=None)[source]¶ Create a dictionary that maps residues to atom IDs: (res number, res name, chain id) –> [atom1 idx, atom2 idx, …]
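The shape of the returned mapping can be shown with a small pure-Python sketch. The per-atom records and the function name are hypothetical, and the reading of atom_list as a restriction to the listed atom indices is an assumption.

```python
from collections import defaultdict

# Hypothetical per-atom records: (residue number, residue name, chain id).
atom_residue_ids = [
    (1, "ALA", "A"),
    (1, "ALA", "A"),
    (2, "GLY", "A"),
    (1, "ALA", "B"),
]


def group_atoms_by_residue(atom_residue_ids, atom_list=None):
    """Map each residue id tuple to the list of atom indices it contains."""
    # Assumption: atom_list, when given, limits grouping to those atoms.
    indices = range(len(atom_residue_ids)) if atom_list is None else atom_list
    residues = defaultdict(list)
    for idx in indices:
        residues[atom_residue_ids[idx]].append(idx)
    return dict(residues)
```

The chain id is part of the key, so residue 1 of chain A and residue 1 of chain B stay separate.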
-
oddt.toolkits.extras.rdkit.fixer.
IsResidueConnected
(mol, atom_ids)[source]¶ Check if residue with given atom IDs is connected to other residues in the molecule.
-
oddt.toolkits.extras.rdkit.fixer.
MolToTemplates
(mol)[source]¶ Prepare set of templates for a given PDB residue.
-
oddt.toolkits.extras.rdkit.fixer.
PrepareComplexes
(pdbids, pocket_dist_cutoff=12.0, affinity_types=None, cache_dir=None)[source]¶ Fetch structures and affinity data from the RCSB PDB server and prepare ligand-pocket pairs for small molecules with known activities.
- Parameters
- pdbids: array-like
List of PDB IDs of structures with protein-ligand complexes.
- pocket_dist_cutoff: float, optional (default 12.)
Distance cutoff for the pocket atoms
- affinity_types: array-like, optional (default None)
List of types of affinity data to retrieve. Available types are: Ki, Kd, EC50, IC50, deltaG, deltaH, deltaS, Ka. If not specified Ki, Kd, EC50, and IC50 are used.
- Returns
- complexes: dict
Dictionary with pocket-ligand pairs, structured as follows: {'pdbid': {'ligid': (pocket_mol, ligand_mol)}}. Ligands have binding affinity data stored as properties.
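A nested dictionary of this shape (PDB ID, then ligand ID, then a pocket-ligand tuple) is convenient to flatten before further processing. The sketch below uses hypothetical IDs and placeholder strings in place of the pocket and ligand Mol objects; iter_pairs is an illustrative helper, not an ODDT function.

```python
# Hypothetical result; strings stand in for the pocket and ligand Mol objects.
complexes = {
    "1abc": {"LIG": ("pocket_1abc_LIG", "ligand_1abc_LIG")},
    "2xyz": {
        "ATP": ("pocket_2xyz_ATP", "ligand_2xyz_ATP"),
        "NAD": ("pocket_2xyz_NAD", "ligand_2xyz_NAD"),
    },
}


def iter_pairs(complexes):
    """Yield (pdbid, ligid, pocket, ligand) for every pocket-ligand pair."""
    for pdbid, ligands in complexes.items():
        for ligid, (pocket, ligand) in ligands.items():
            yield pdbid, ligid, pocket, ligand


pairs = list(iter_pairs(complexes))
```

A structure with several ligands contributes one entry per ligand, each with its own pocket.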
-
oddt.toolkits.extras.rdkit.fixer.
PreparePDBMol
(mol, removeHs=True, removeHOHs=True, residue_whitelist=None, residue_blacklist=None, remove_incomplete=False, add_missing_atoms=False, custom_templates=None, replace_default_templates=False)[source]¶ - Prepares protein molecule by:
Removing hydrogens forcefully, based on atomic number [default=True]
Removing waters (HOH) [default=True]
Assigning bond orders from SMILES of PDB residues (over 24k templates)
Removing bonds to metals
- Parameters
- mol: rdkit.Chem.rdchem.Mol
Mol with whole protein.
- removeHs: bool, optional (default True)
If True, hydrogens will be forcefully removed
- removeHOHs: bool, optional (default True)
If True, remove waters using residue name
- residue_whitelist: array-like, optional (default None)
List of residues to clean. If not specified, all residues present in the structure will be used.
- residue_blacklist: array-like, optional (default None)
List of residues to ignore during cleaning. If not specified, all residues present in the structure will be cleaned.
- remove_incomplete: bool, optional (default False)
If True, remove residues that do not fully match the template
- add_missing_atoms: bool (default=False)
Switch to add missing atoms according to the template SMILES structure.
- custom_templates: str or dict, optional (default None)
Custom templates for residues. Can be either path to SMILES file, or dictionary mapping names to SMILES or Mol objects
- replace_default_templates: bool, optional (default False)
Indicates whether the default templates should be replaced by custom ones. If False, the default templates will be updated with the custom ones. This argument is ignored if custom_templates is None.
- Returns
- new_mol: rdkit.Chem.rdchem.RWMol
Modified protein
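The documented interaction between custom_templates and replace_default_templates can be sketched as a dictionary merge. This is an illustration of the described behavior only: merge_templates is a hypothetical helper, the template tables are tiny made-up examples (including the residue name XYZ), and only the dictionary form of custom_templates is covered.

```python
def merge_templates(default_templates, custom_templates=None,
                    replace_default_templates=False):
    """Combine default and custom residue templates (name -> SMILES)."""
    if custom_templates is None:
        # The replace flag is ignored when no custom templates are given.
        return dict(default_templates)
    if replace_default_templates:
        # Only the custom templates are used.
        return dict(custom_templates)
    merged = dict(default_templates)
    merged.update(custom_templates)  # custom entries override defaults
    return merged


# Hypothetical template tables with illustrative SMILES strings.
defaults = {"GLY": "NCC(=O)O", "ALA": "CC(N)C(=O)O"}
custom = {"ALA": "C[C@H](N)C(=O)O", "XYZ": "CCO"}
```

Updating rather than replacing keeps the default entries for residues that have no custom template.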
-
oddt.toolkits.extras.rdkit.fixer.
PreparePDBResidue
(protein, residue, amap, template)[source]¶ - Parameters
- protein: rdkit.Chem.rdchem.RWMol
Mol with whole protein. Note that it is modified in place.
- residue:
Mol with residue only
- amap: list
List mapping atom IDs in residue to atom IDs in whole protein (amap[i] = j means that i’th atom in residue corresponds to j’th atom in protein)
- template:
Residue template
- Returns
- protein: rdkit.Chem.rdchem.RWMol
Modified protein
- visited_bonds: list
Bonds that match the template
- is_complete: bool
Indicates whether all atoms in template were found in residue
-
oddt.toolkits.extras.rdkit.fixer.
ReadTemplates
(filename, resnames)[source]¶ Load templates from file for specified residues
-
oddt.toolkits.extras.rdkit.fixer.
SimplifyMol
(mol)[source]¶ Change all bonds to single and discharge/dearomatize all atoms. The molecule is modified in-place (no copy is made).
-
oddt.toolkits.extras.rdkit.fixer.
UFFConstrainedOptimize
(mol, moving_atoms=None, fixed_atoms=None, cutoff=5.0, verbose=False)[source]¶ Minimize a molecule using UFF forcefield with a set of moving/fixed atoms. If both moving and fixed atoms are provided, fixed_atoms parameter will be ignored. The minimization is done in-place (without copying molecule).
- Parameters
- mol: rdkit.Chem.rdchem.Mol
Molecule to be minimized.
- moving_atoms: array-like (default=None)
Indices of freely moving atoms. If None, fixed atoms are assigned based on fixed_atoms. These two arguments are mutually exclusive.
- fixed_atoms: array-like (default=None)
Indices of fixed atoms. If None, fixed atoms are assigned based on moving_atoms. These two arguments are mutually exclusive.
- cutoff: float (default=5.)
Distance cutoff for the UFF minimization
- Returns
- mol: rdkit.Chem.rdchem.Mol
Molecule with minimized moving_atoms
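The precedence between moving_atoms and fixed_atoms can be sketched as follows. This is a minimal illustration under stated assumptions, not ODDT's code: resolve_moving_atoms is a hypothetical helper, the two sets are treated as exact complements over all atoms (the role of cutoff is ignored), and letting every atom move when neither argument is given is an assumption.

```python
def resolve_moving_atoms(n_atoms, moving_atoms=None, fixed_atoms=None):
    """Return (moving, fixed) index lists following the documented precedence."""
    if moving_atoms is not None:
        # moving_atoms wins; any fixed_atoms argument is ignored.
        moving = sorted(set(moving_atoms))
    elif fixed_atoms is not None:
        fixed_set = set(fixed_atoms)
        moving = [i for i in range(n_atoms) if i not in fixed_set]
    else:
        # Assumption: with neither argument, every atom is free to move.
        moving = list(range(n_atoms))
    moving_set = set(moving)
    # Assumption: fixed atoms are the complement of the moving atoms.
    fixed = [i for i in range(n_atoms) if i not in moving_set]
    return moving, fixed
```

For example, with five atoms, moving_atoms=[1, 2] and fixed_atoms=[0], the fixed_atoms value is discarded and atoms 0, 3 and 4 end up fixed.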
Module contents¶
-
oddt.toolkits.extras.rdkit.
AtomListToSubMol
(mol, amap, includeConformer=False)[source]¶ - Parameters
- mol: rdkit.Chem.rdchem.Mol
Molecule
- amap: array-like
List of atom indices (zero-based)
- includeConformer: bool (default=False)
Toggle to include atom coordinates in the submolecule.
- Returns
- submol: rdkit.Chem.rdchem.RWMol
Submol determined by specified atom list
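Conceptually, building a submolecule from an atom list means keeping the bonds whose two atoms are both selected and renumbering the atoms. The sketch below shows that idea on plain index pairs; sub_graph is a hypothetical helper, and the assumption that submolecule atoms follow the order of amap is not confirmed by the documentation.

```python
def sub_graph(bonds, amap):
    """Keep bonds whose atoms are both in amap and renumber them.

    bonds: list of (begin_idx, end_idx) pairs in the parent molecule.
    amap: zero-based parent atom indices to keep, in submolecule order.
    """
    # Assumption: the k-th entry of amap becomes atom k of the submolecule.
    new_index = {old: new for new, old in enumerate(amap)}
    return [
        (new_index[a], new_index[b])
        for a, b in bonds
        if a in new_index and b in new_index
    ]


# Hypothetical 5-atom chain 0-1-2-3-4.
parent_bonds = [(0, 1), (1, 2), (2, 3), (3, 4)]
```

Selecting non-adjacent atoms, such as 0 and 2 in the chain above, yields a submolecule with no bonds.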
-
oddt.toolkits.extras.rdkit.
MolFromPDBBlock
(molBlock, sanitize=True, removeHs=True, flavor=0)[source]¶
-
oddt.toolkits.extras.rdkit.
MolFromPDBQTBlock
(block, sanitize=True, removeHs=True)[source]¶ Read PDBQT block to a RDKit Molecule
- Parameters
- block: string
String containing the PDBQT block to read.
- sanitize: bool (default=True)
Should the sanitization be performed
- removeHs: bool (default=True)
Should hydrogens be removed when reading molecule.
- Returns
- mol: rdkit.Chem.rdchem.Mol
Molecule read from PDBQT
-
oddt.toolkits.extras.rdkit.
MolToPDBQTBlock
(mol, flexible=True, addHs=False, computeCharges=False)[source]¶ Write RDKit Molecule to a PDBQT block
- Parameters
- mol: rdkit.Chem.rdchem.Mol
Molecule to encode as PDBQT
- flexible: bool (default=True)
Should the molecule encode torsions. Ligands should be flexible, proteins in turn can be rigid.
- addHs: bool (default=False)
The PDBQT format requires at least polar Hs on donors. By default Hs are not added.
- computeCharges: bool (default=False)
Should the partial charges be automatically computed. If the Hs are added, the charges must and will be recomputed. If there is no partial charge information, the charges are set to 0.0.
- Returns
- block: str
String with PDBQT-encoded molecule