oddt.toolkits.extras.rdkit package

Submodules

oddt.toolkits.extras.rdkit.fixer module

exception oddt.toolkits.extras.rdkit.fixer.AddAtomsError[source]

Bases: Exception

oddt.toolkits.extras.rdkit.fixer.AddMissingAtoms(protein, residue, amap, template)[source]

Add missing atoms to the protein molecule, only at the given residue, according to the template.

Parameters
protein: rdkit.Chem.rdchem.RWMol

Mol with whole protein. Note that it is modified in place.

residue:

Mol with residue only

amap: list

List mapping atom IDs in residue to atom IDs in whole protein (amap[i] = j means that i’th atom in residue corresponds to j’th atom in protein)

template:

Residue template

Returns
protein: rdkit.Chem.rdchem.RWMol

Modified protein

visited_bonds: list

Bonds that match the template

is_complete: bool

Indicates whether all atoms in template were found in residue
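
The amap convention can be illustrated without RDKit. The following is a minimal sketch with made-up indices; the helper residue_bond_to_protein is hypothetical and is not part of ODDT.

```python
# Hypothetical indices: a 4-atom residue embedded in a larger protein.
# amap[i] = j means that residue atom i is protein atom j.
amap = [10, 11, 12, 15]


def residue_bond_to_protein(bond, amap):
    """Translate a (begin, end) bond from residue to protein atom indices."""
    begin, end = bond
    return amap[begin], amap[end]


# Bonds expressed in residue numbering ...
residue_bonds = [(0, 1), (1, 2), (2, 3)]
# ... and the same bonds expressed in protein numbering.
protein_bonds = [residue_bond_to_protein(b, amap) for b in residue_bonds]
print(protein_bonds)
```

Because the protein is modified in place, any index translated through amap refers to the caller's own protein object.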

oddt.toolkits.extras.rdkit.fixer.ExtractPocketAndLigand(mol, cutoff=12.0, expandResidues=True, ligand_residue=None, ligand_residue_blacklist=None, append_residues=None)[source]

Extract a ligand (the largest HETATM residue) and the protein pocket within a given distance cutoff. The selection of pocket atoms can be expanded to contain whole residues. Single-atom HETATM residues (metals and waters) are assigned to the pocket.

Parameters
mol: rdkit.Chem.rdchem.Mol

Molecule with a protein-ligand complex

cutoff: float (default=12.)

Distance cutoff for the pocket atoms

expandResidues: bool (default=True)

Expand selection to whole residues within cutoff.

ligand_residue: string (default None)

Residue name that explicitly points to the ligand(s).

ligand_residue_blacklist: array-like, optional (default None)

List of residues to ignore during ligand lookup.

append_residues: array-like, optional (default None)

List of residues to append to pocket, even if they are HETATM, such as MSE, ATP, AMP, ADP, etc.

Returns
pocket: rdkit.Chem.rdchem.RWMol

Pocket constructed of protein residues/atoms around ligand

ligand: rdkit.Chem.rdchem.RWMol

Largest HETATM residue contained in input molecule
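
The effect of cutoff and expandResidues can be sketched on toy data. This is an illustration only, not ODDT's implementation: the select_pocket helper, the tuple layout, and the inclusive distance comparison are all assumptions.

```python
import math

# Toy atoms: (atom index, residue id, (x, y, z)). Residue ids follow the
# (residue number, residue name, chain id) convention used in this module.
protein_atoms = [
    (0, (1, "ALA", "A"), (0.0, 0.0, 0.0)),
    (1, (1, "ALA", "A"), (20.0, 0.0, 0.0)),
    (2, (2, "GLY", "A"), (30.0, 0.0, 0.0)),
]
ligand_coords = [(1.0, 0.0, 0.0)]


def select_pocket(protein_atoms, ligand_coords, cutoff=12.0, expand_residues=True):
    """Return sorted indices of protein atoms that form the pocket."""
    # Atoms within the cutoff of any ligand atom (inclusive: an assumption).
    close = {
        idx
        for idx, _, xyz in protein_atoms
        if any(math.dist(xyz, lig) <= cutoff for lig in ligand_coords)
    }
    if expand_residues:
        # Pull in every atom of any residue that has at least one close atom.
        hit_residues = {res for idx, res, _ in protein_atoms if idx in close}
        close = {idx for idx, res, _ in protein_atoms if res in hit_residues}
    return sorted(close)


whole_residues = select_pocket(protein_atoms, ligand_coords)
atoms_only = select_pocket(protein_atoms, ligand_coords, expand_residues=False)
```

With residue expansion the second ALA atom is selected although it lies 19 units from the ligand, because it shares a residue with an atom inside the cutoff.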

oddt.toolkits.extras.rdkit.fixer.FetchAffinityTable(pdbids, affinity_types)[source]

Fetch affinity data from RCSB PDB server.

Parameters
pdbids: array-like

List of PDB IDs of structures with protein-ligand complexes.

affinity_types: array-like

List of types of affinity data to retrieve. Available types are: Ki, Kd, EC50, IC50, deltaG, deltaH, deltaS, Ka.

Returns
ligand_affinity: pd.DataFrame

Table with protein-ligand binding affinities. The table contains the following columns: structureId, ligandId, ligandFormula, ligandMolecularWeight, plus columns named after the affinity types specified by the user.

oddt.toolkits.extras.rdkit.fixer.FetchStructure(pdbid, sanitize=False, removeHs=True, cache_dir=None)[source]

Fetch the structure in PDB format from RCSB PDB server and read it with rdkit.

Parameters
pdbid: str

PDB ID of the structure

sanitize: bool, optional (default False)

Toggles molecule sanitation

removeHs: bool, optional (default True)

Indicates whether Hs should be removed during reading

Returns
mol: Chem.rdchem.Mol

Retrieved molecule

exception oddt.toolkits.extras.rdkit.fixer.FixerError[source]

Bases: Exception

oddt.toolkits.extras.rdkit.fixer.GetAtomResidueId(atom)[source]

Return (residue number, residue name, chain id) for a given atom

oddt.toolkits.extras.rdkit.fixer.GetResidues(mol, atom_list=None)[source]

Create a dictionary that maps residues to atom IDs: (res number, res name, chain id) -> [atom1 idx, atom2 idx, …]
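
The shape of the returned mapping can be sketched in plain Python. The group_atoms_by_residue helper and the toy residue ids are hypothetical; treating atom_list as a restriction to the listed atom indices is an assumption.

```python
from collections import defaultdict

# Toy per-atom residue ids, listed in atom-index order.
atom_residue_ids = [
    (1, "ALA", "A"),
    (1, "ALA", "A"),
    (2, "GLY", "A"),
    (1, "ALA", "B"),
]


def group_atoms_by_residue(atom_residue_ids, atom_list=None):
    """Map (res number, res name, chain id) to a list of atom indices."""
    residues = defaultdict(list)
    # Assumption: atom_list, when given, restricts the atoms considered.
    indices = range(len(atom_residue_ids)) if atom_list is None else atom_list
    for idx in indices:
        residues[atom_residue_ids[idx]].append(idx)
    return dict(residues)


all_residues = group_atoms_by_residue(atom_residue_ids)
subset = group_atoms_by_residue(atom_residue_ids, atom_list=[1, 2])
```

Including the chain id in the key keeps residues with the same number and name on different chains apart.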

oddt.toolkits.extras.rdkit.fixer.IsResidueConnected(mol, atom_ids)[source]

Check if residue with given atom IDs is connected to other residues in the molecule.

oddt.toolkits.extras.rdkit.fixer.MolToTemplates(mol)[source]

Prepare set of templates for a given PDB residue.

oddt.toolkits.extras.rdkit.fixer.PrepareComplexes(pdbids, pocket_dist_cutoff=12.0, affinity_types=None, cache_dir=None)[source]

Fetch structures and affinity data from RCSB PDB server and prepare ligand-pocket pairs for small molecules with known activities.

Parameters
pdbids: array-like

List of PDB IDs of structures with protein-ligand complexes.

pocket_dist_cutoff: float, optional (default 12.)

Distance cutoff for the pocket atoms

affinity_types: array-like, optional (default None)

List of types of affinity data to retrieve. Available types are: Ki, Kd, EC50, IC50, deltaG, deltaH, deltaS, Ka. If not specified, Ki, Kd, EC50, and IC50 are used.

Returns
complexes: dict

Dictionary with pocket-ligand pairs, structured as follows: {'pdbid': {'ligid': (pocket_mol, ligand_mol)}}. Ligands have binding affinity data stored as properties.
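
The nested layout of the returned dictionary can be traversed as sketched here. The PDB IDs, ligand IDs, and placeholder strings are made up for illustration; in real output the tuple members are RDKit molecules.

```python
# Placeholder strings stand in for the RDKit pocket and ligand molecules.
complexes = {
    "1abc": {"LIG": ("pocket_mol_1", "ligand_mol_1")},
    "2xyz": {
        "ATP": ("pocket_mol_2", "ligand_mol_2"),
        "NAG": ("pocket_mol_3", "ligand_mol_3"),
    },
}


def iter_pairs(complexes):
    """Yield (pdbid, ligid, pocket, ligand) for every pocket-ligand pair."""
    for pdbid, ligands in complexes.items():
        for ligid, (pocket, ligand) in ligands.items():
            yield pdbid, ligid, pocket, ligand


pairs = list(iter_pairs(complexes))
for pdbid, ligid, pocket, ligand in pairs:
    print(pdbid, ligid)
```

A structure with several ligands contributes one entry per ligand ID, each with its own pocket.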

oddt.toolkits.extras.rdkit.fixer.PreparePDBMol(mol, removeHs=True, removeHOHs=True, residue_whitelist=None, residue_blacklist=None, remove_incomplete=False, add_missing_atoms=False, custom_templates=None, replace_default_templates=False)[source]
Prepare a protein molecule by:
  • Removing Hs forcefully, by atomic number [default=True]

  • Removing HOH [default=True]

  • Assigning bond orders from SMILES of PDB residues (over 24k templates)

  • Removing bonds to metals

Parameters
mol: rdkit.Chem.rdchem.Mol

Mol with whole protein.

removeHs: bool, optional (default True)

If True, hydrogens will be forcefully removed

removeHOHs: bool, optional (default True)

If True, remove waters using residue name

residue_whitelist: array-like, optional (default None)

List of residues to clean. If not specified, all residues present in the structure will be used.

residue_blacklist: array-like, optional (default None)

List of residues to ignore during cleaning. If not specified, all residues present in the structure will be cleaned.

remove_incomplete: bool, optional (default False)

If True, remove residues that do not fully match the template

add_missing_atoms: bool (default=False)

Switch to add missing atoms according to the template SMILES structure.

custom_templates: str or dict, optional (default None)

Custom templates for residues. Can be either a path to a SMILES file, or a dictionary mapping names to SMILES or Mol objects

replace_default_templates: bool, optional (default False)

Indicates whether default templates should be replaced by custom ones. If False, default templates will be updated with custom ones. This argument is ignored if custom_templates is None.

Returns
new_mol: rdkit.Chem.rdchem.RWMol

Modified protein
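
The interaction between custom_templates and replace_default_templates described above can be sketched as a dictionary merge. The resolve_templates helper and the toy SMILES are hypothetical; ODDT's own default templates and loading code may differ.

```python
def resolve_templates(default_templates, custom_templates=None,
                      replace_default_templates=False):
    """Return the residue-name -> template mapping that would be used."""
    if custom_templates is None:
        # replace_default_templates is ignored without custom templates.
        return dict(default_templates)
    if replace_default_templates:
        # Only the custom templates are used.
        return dict(custom_templates)
    # Defaults are updated with (and overridden by) the custom templates.
    merged = dict(default_templates)
    merged.update(custom_templates)
    return merged


# Toy SMILES for illustration only.
defaults = {"ALA": "CC(N)C(=O)O", "GLY": "NCC(=O)O"}
custom = {"GLY": "NCC(O)=O", "LIG": "c1ccccc1"}

updated = resolve_templates(defaults, custom)
replaced = resolve_templates(defaults, custom, replace_default_templates=True)
```

In the updated case a residue present in both mappings takes the custom template, while residues without a custom entry keep the default one.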

oddt.toolkits.extras.rdkit.fixer.PreparePDBResidue(protein, residue, amap, template)[source]
Parameters
protein: rdkit.Chem.rdchem.RWMol

Mol with whole protein. Note that it is modified in place.

residue:

Mol with residue only

amap: list

List mapping atom IDs in residue to atom IDs in whole protein (amap[i] = j means that i’th atom in residue corresponds to j’th atom in protein)

template:

Residue template

Returns
protein: rdkit.Chem.rdchem.RWMol

Modified protein

visited_bonds: list

Bonds that match the template

is_complete: bool

Indicates whether all atoms in template were found in residue

oddt.toolkits.extras.rdkit.fixer.ReadTemplates(filename, resnames)[source]

Load templates from file for specified residues

exception oddt.toolkits.extras.rdkit.fixer.SanitizeError[source]

Bases: Exception

oddt.toolkits.extras.rdkit.fixer.SimplifyMol(mol)[source]

Change all bonds to single and discharge/dearomatize all atoms. The molecule is modified in-place (no copy is made).

exception oddt.toolkits.extras.rdkit.fixer.SubstructureMatchError[source]

Bases: Exception

oddt.toolkits.extras.rdkit.fixer.UFFConstrainedOptimize(mol, moving_atoms=None, fixed_atoms=None, cutoff=5.0, verbose=False)[source]

Minimize a molecule using the UFF force field with a set of moving/fixed atoms. If both moving and fixed atoms are provided, the fixed_atoms parameter will be ignored. The minimization is done in-place (without copying the molecule).

Parameters
mol: rdkit.Chem.rdchem.Mol

Molecule to be minimized.

moving_atoms: array-like (default=None)

Indices of freely moving atoms. If None, fixed atoms are assigned based on fixed_atoms. These two arguments are mutually exclusive.

fixed_atoms: array-like (default=None)

Indices of fixed atoms. If None, fixed atoms are assigned based on moving_atoms. These two arguments are mutually exclusive.

cutoff: float (default=5.)

Distance cutoff for the UFF minimization

Returns
mol: rdkit.Chem.rdchem.Mol

Molecule with minimized moving_atoms
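
One plausible reading of the moving_atoms/fixed_atoms precedence is sketched here. The resolve_fixed_atoms helper is hypothetical, it ignores the cutoff parameter entirely, and both the "everything not moving is fixed" rule and the behaviour when neither argument is given are assumptions rather than ODDT's documented behaviour.

```python
def resolve_fixed_atoms(n_atoms, moving_atoms=None, fixed_atoms=None):
    """Return sorted indices of atoms kept fixed during minimization."""
    if moving_atoms is not None:
        # moving_atoms takes precedence; fixed_atoms is ignored.
        moving = set(moving_atoms)
        return [i for i in range(n_atoms) if i not in moving]
    if fixed_atoms is not None:
        return sorted(set(fixed_atoms))
    # Assumption: with neither argument given, nothing is fixed.
    return []


only_moving = resolve_fixed_atoms(5, moving_atoms=[1, 2])
both_given = resolve_fixed_atoms(5, moving_atoms=[1], fixed_atoms=[0])
only_fixed = resolve_fixed_atoms(5, fixed_atoms=[3, 0])
```

Passing only one of the two arguments avoids relying on the precedence rule.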

Module contents

oddt.toolkits.extras.rdkit.AtomListToSubMol(mol, amap, includeConformer=False)[source]
Parameters
mol: rdkit.Chem.rdchem.Mol

Molecule

amap: array-like

List of atom indices (zero-based)

includeConformer: bool (default=False)

Toggle to include atom coordinates in the submolecule.

Returns
submol: rdkit.Chem.rdchem.RWMol

Submol determined by specified atom list

oddt.toolkits.extras.rdkit.MolFromPDBBlock(molBlock, sanitize=True, removeHs=True, flavor=0)[source]
oddt.toolkits.extras.rdkit.MolFromPDBQTBlock(block, sanitize=True, removeHs=True)[source]

Read PDBQT block to a RDKit Molecule

Parameters
block: string

String with the PDBQT block to read.

sanitize: bool (default=True)

Should the sanitization be performed

removeHs: bool (default=True)

Should hydrogens be removed when reading molecule.

Returns
mol: rdkit.Chem.rdchem.Mol

Molecule read from PDBQT

oddt.toolkits.extras.rdkit.MolToPDBQTBlock(mol, flexible=True, addHs=False, computeCharges=False)[source]

Write RDKit Molecule to a PDBQT block

Parameters
mol: rdkit.Chem.rdchem.Mol

Molecule with a protein ligand complex

flexible: bool (default=True)

Should the molecule encode torsions. Ligands should be flexible, proteins in turn can be rigid.

addHs: bool (default=False)

The PDBQT format requires at least polar Hs on donors. By default Hs are not added.

computeCharges: bool (default=False)

Should the partial charges be automatically computed. If the Hs are added, the charges must and will be recomputed. If there is no partial charge information, the charges are set to 0.0.

Returns
block: str

String with the PDBQT-encoded molecule

oddt.toolkits.extras.rdkit.PDBQTAtomLines(mol, donors, acceptors)[source]

Create a list with PDBQT atom lines for each atom in molecule. Donors and acceptors are given as a list of atom indices.

oddt.toolkits.extras.rdkit.PathFromAtomList(mol, amap)[source]