oddt package¶

Submodules¶

oddt.datasets module¶

Datasets wrapped in convenient models

class oddt.datasets.CASF(home)[source]¶

Bases: object

Load CASF dataset as described in Li, Y. et al. Comparative Assessment of Scoring Functions on an Updated Benchmark: 2. Evaluation Methods and General Results. J. Chem. Inf. Model. 54, 1717-1736. (2014) http://dx.doi.org/10.1021/ci500081m

Parameters

home: string: Path to CASF dataset main directory

Methods

`precomputed_score`([scoring_function])	Load precomputed results of scoring power test for various scoring functions.
`precomputed_screening`([scoring_function, …])	Load precomputed results of screening power test for various scoring functions

precomputed_score(scoring_function=None)[source]¶

Load precomputed results of scoring power test for various scoring functions.

Parameters

scoring_function: string (default=None): Name of the scoring function to get results If None, all results are returned.

precomputed_screening(scoring_function=None, cluster_id=None)[source]¶

Load precomputed results of screening power test for various scoring functions

Parameters

scoring_function: string (default=None): Name of the scoring function to get results If None, all results are returned
cluster_id: int (default=None): Number of the protein cluster to get results If None, all results are returned

class oddt.datasets.dude(home)[source]¶

Bases: object

A wrapper for DUD-E (A Database of Useful Decoys: Enhanced) http://dude.docking.org/

Parameters

homestr: Path to files from dud-e

class oddt.datasets.pdbbind(home, version=None, default_set=None, opt=None)[source]¶

Bases: object

Attributes

activities
ids

property activities¶

property ids¶

oddt.fingerprints module¶

Module checks interactions between two molecules and creates interacion fingerprints.

oddt.fingerprints.ECFP(mol, depth=2, size=4096, count_bits=True, sparse=True, use_pharm_features=False)[source]¶

Extended connectivity fingerprints (ECFP) with an option to include atom features (FCPF). Depth of a fingerprint is counted as bond-steps, thus the depth for ECFP2 = 1, ECPF4 = 2, ECFP6 = 3, etc.

Reference: Rogers D, Hahn M. Extended-connectivity fingerprints. J Chem Inf Model. 2010;50: 742-754. http://dx.doi.org/10.1021/ci100050t

Parameters

moloddt.toolkit.Molecule object: Input molecule for the FP calculations
depthint (deafult = 2): The depth of the fingerprint, i.e. the number of bonds in Morgan algorithm. Note: For ECFP2: depth = 1, ECFP4: depth = 2, etc.
sizeint (default = 4096): Final size of fingerprint to which it is folded.
count_bitsbool (default = True): Should the bits be counted or unique. In dense representation it translates to integer array (count_bits=True) or boolean array if False.
sparsebool (default=True): Should fingerprints be dense (contain all bits) or sparse (just the on bits).
use_pharm_featuresbool (default=False): Switch to use pharmacophoric features as atom representation instead of explicit atomic numbers etc.

Returns

fingerprintnumpy array: Calsulated FP of fixed size (dense) or on bits indices (sparse). Dtype is either integer or boolean.

oddt.fingerprints.InteractionFingerprint(ligand, protein, strict=True)[source]¶

Interaction fingerprint accomplished by converting the molecular interaction of ligand-protein into bit array according to the residue of choice and the interaction. For every residue (One row = one residue) there are eight bits which represent eight type of interactions:

(Column 0) hydrophobic contacts
(Column 1) aromatic face to face
(Column 2) aromatic edge to face
(Column 3) hydrogen bond (protein as hydrogen bond donor)
(Column 4) hydrogen bond (protein as hydrogen bond acceptor)
(Column 5) salt bridges (protein positively charged)
(Column 6) salt bridges (protein negatively charged)
(Column 7) salt bridges (ionic bond with metal ion)

Parameters

ligand, proteinoddt.toolkit.Molecule object: Molecules, which are analysed in order to find interactions.
strictbool (deafult = True): If False, do not include condition, which informs whether atoms form ‘strict’ H-bond (pass all angular cutoffs).

Returns

InteractionFingerprintnumpy array: Vector of calculated IFP (size = no residues * 8 type of interaction)

oddt.fingerprints.PLEC(ligand, protein, depth_ligand=2, depth_protein=4, distance_cutoff=4.5, size=16384, count_bits=True, sparse=True, ignore_hoh=True, bits_info=None)[source]¶

Protein ligand extended connectivity fingerprint. For every pair of atoms in contact, compute ECFP and then hash every single, corresponding depth.

Parameters

ligand, proteinoddt.toolkit.Molecule object: Molecules, which are analysed in order to find interactions.
depth_ligand, depth_proteinint (deafult = (2, 4)): The depth of the fingerprint, i.e. the number of bonds in Morgan algorithm. Note: For ECFP2: depth = 1, ECFP4: depth = 2, etc.
size: int (default = 16384): SPLIF is folded to given size.
distance_cutoff: float (default=4.5): Cutoff distance for close contacts.
sparsebool (default = True): Should fingerprints be dense (contain all bits) or sparse (just the on bits).
count_bitsbool (default = True): Should the bits be counted or unique. In dense representation it translates to integer array (count_bits=True) or boolean array if False.
ignore_hohbool (default = True): Should the water molecules be ignored. This is based on the name of the residue (‘HOH’).
bits_infodict or None (default = None): If dictionary is provided it is filled with information about bit contents. Root atom index and depth is provided for both ligand and protein. Dictionary is modified in-place.

Returns

PLECnumpy array: fp (size = atoms in contacts * max(depth_protein, depth_ligand))

oddt.fingerprints.SPLIF(ligand, protein, depth=1, size=4096, distance_cutoff=4.5)[source]¶

Calculates structural protein-ligand interaction fingerprint (SPLIF), based on http://pubs.acs.org/doi/abs/10.1021/ci500319f.

Parameters

ligand, proteinoddt.toolkit.Molecule object: Molecules, which are analysed in order to find interactions.
depthint (deafult = 1): The depth of the fingerprint, i.e. the number of bonds in Morgan algorithm. Note: For ECFP2: depth = 1, ECFP4: depth = 2, etc.
size: int (default = 4096): SPLIF is folded to given size.
distance_cutoff: float (default=4.5): Cutoff distance for close contacts.

Returns

SPLIFnumpy array: Calculated SPLIF.shape = (no. of atoms, ). Every row consists of three elements:

row[0] = index of hashed atoms row[1].shape = (7, 3) -> ligand’s atom coords and 6 his neigbor’s row[2].shape = (7, 3) -> protein’s atom coords and 6 his neigbor’s

oddt.fingerprints.SimpleInteractionFingerprint(ligand, protein, strict=True)[source]¶

Based on http://dx.doi.org/10.1016/j.csbj.2014.05.004. Every IFP consists of 8 bits per amino acid (One row = one amino acid) and present eight type of interaction:

(Column 0) hydrophobic contacts
(Column 1) aromatic face to face
(Column 2) aromatic edge to face
(Column 3) hydrogen bond (protein as hydrogen bond donor)
(Column 4) hydrogen bond (protein as hydrogen bond acceptor)
(Column 5) salt bridges (protein positively charged)
(Column 6) salt bridges (protein negatively charged)
(Column 7) salt bridges (ionic bond with metal ion)

Returns matrix, which is sorted according to this pattern : ‘ALA’, ‘ARG’, ‘ASN’, ‘ASP’, ‘CYS’, ‘GLN’, ‘GLU’, ‘GLY’, ‘HIS’, ‘ILE’, ‘LEU’, ‘LYS’, ‘MET’, ‘PHE’, ‘PRO’, ‘SER’, ‘THR’, ‘TRP’, ‘TYR’, ‘VAL’, ‘’. The ‘’ means cofactor. Index of amino acid in pattern coresponds to row in returned matrix.

Parameters

ligand, proteinoddt.toolkit.Molecule object: Molecules, which are analysed in order to find interactions.
strictbool (deafult = True): If False, do not include condition, which informs whether atoms form ‘strict’ H-bond (pass all angular cutoffs).

Returns

InteractionFingerprintnumpy array: Vector of calculated IFP (size = 168)

oddt.fingerprints.dice(a, b, sparse=False)[source]¶

Calculates the Dice coefficient, the ratio of the bits in common to the arithmetic mean of the number of ‘on’ bits in the two fingerprints. Supports integer and boolean fingerprints.

Parameters

a, bnumpy array: Interaction fingerprints, which are compared in order to determine similarity.
sparsebool (default=False): Type of FPs to use. Defaults to dense form.

Returns

scorefloat: Similarity between a, b.

oddt.fingerprints.similarity_SPLIF(reference, query, rmsd_cutoff=1.0)[source]¶

Calculates similarity between structural interaction fingerprints, based on doi:http://pubs.acs.org/doi/abs/10.1021/ci500319f.

Parameters

reference, query: numpy.array: SPLIFs, which are compared in order to determine similarity.
rmsd_cutoffint (default = 1): Specific treshold for which, bits are considered as fully matching.

Returns

SimilarityScorefloat: Similarity between given fingerprints.

oddt.fingerprints.tanimoto(a, b, sparse=False)[source]¶

Tanimoto coefficient, supports boolean fingerprints. Integer fingerprints are casted to boolean.

Parameters

a, bnumpy array: Interaction fingerprints, which are compared in order to determine similarity.
sparsebool (default=False): Type of FPs to use. Defaults to dense form.

Returns

scorefloat: Similarity between a, b.

oddt.interactions module¶

Module calculates interactions between two molecules (proein-protein, protein-ligand, small-small). Currently following interacions are implemented:

hydrogen bonds

halogen bonds

pi stacking (parallel and perpendicular)

salt bridges

hydrophobic contacts

pi-cation

metal coordination

pi-metal

oddt.interactions.acceptor_metal(mol1, mol2, tolerance=30, cutoff=4)[source]¶

Returns pairs of acceptor-metal atoms, which meet metal coordination criteria Note: This function is directional (mol1 holds acceptors, mol2 holds metals)

Parameters

mol1, mol2oddt.toolkit.Molecule object: Molecules to compute acceptor and metal pairs
cutofffloat, (default=4): Distance cutoff for A-M pairs
toleranceint, (default=30): Range (+/- tolerance) from perfect direction defined by atoms hybridization in metal coordination are considered as strict.

Returns

a, datom_dict-type numpy array: Aligned arrays of atoms forming metal coordination, firstly acceptors, secondly metals.
strictnumpy array, dtype=bool: Boolean array align with atom pairs, informing whether atoms form ‘strict’ metal coordination (pass all angular cutoffs). If false, only distance cutoff is met, therefore the interaction is ‘crude’.

oddt.interactions.close_contacts(x, y, cutoff, x_column='coords', y_column='coords', cutoff_low=0.0)[source]¶

Returns pairs of atoms which are within close contac distance cutoff. The cutoff is semi-inclusive, i.e (cutoff_low, cutoff].

Parameters

x, yatom_dict-type numpy array: Atom dictionaries generated by oddt.toolkit.Molecule objects.
cutofffloat: Cutoff distance for close contacts
x_column, ycolumnstring, (default=’coords’): Column containing coordinates of atoms (or pseudo-atoms, i.e. ring centroids)
cutoff_lowfloat (default=0.): Lower bound of contacts to find (exclusive). Zero by default. .. versionadded:: 0.6

Returns

x_, y_atom_dict-type numpy array: Aligned pairs of atoms in close contact for further processing.

oddt.interactions.halogenbond_acceptor_halogen(mol1, mol2, tolerance=30, cutoff=4)[source]¶

Returns pairs of acceptor-halogen atoms, which meet halogen bond criteria

Parameters

mol1, mol2oddt.toolkit.Molecule object: Molecules to compute halogen bond acceptor and halogen pairs
cutofffloat, (default=4): Distance cutoff for A-H pairs
toleranceint, (default=30): Range (+/- tolerance) from perfect direction defined by atoms hybridization in which halogen bonds are considered as strict.

Returns

a, hatom_dict-type numpy array: Aligned arrays of atoms forming halogen bond, firstly acceptors, secondly halogens
strictnumpy array, dtype=bool: Boolean array align with atom pairs, informing whether atoms form ‘strict’ halogen bond (pass all angular cutoffs). If false, only distance cutoff is met, therefore the bond is ‘crude’.

oddt.interactions.halogenbonds(mol1, mol2, cutoff=4, tolerance=30)[source]¶

Calculates halogen bonds between molecules

Parameters

mol1, mol2oddt.toolkit.Molecule object: Molecules to compute halogen bond acceptor and halogen pairs
cutofffloat, (default=4): Distance cutoff for A-H pairs
toleranceint, (default=30): Range (+/- tolerance) from perfect direction defined by atoms hybridization in which halogen bonds are considered as strict.

Returns

mol1_atoms, mol2_atomsatom_dict-type numpy array: Aligned arrays of atoms forming halogen bond
strictnumpy array, dtype=bool: Boolean array align with atom pairs, informing whether atoms form ‘strict’ halogen bond (pass all angular cutoffs). If false, only distance cutoff is met, therefore the bond is ‘crude’.

oddt.interactions.hbond_acceptor_donor(mol1, mol2, cutoff=3.5, tolerance=30, donor_exact=False)[source]¶

Returns pairs of acceptor-donor atoms, which meet H-bond criteria

Parameters

mol1, mol2oddt.toolkit.Molecule object: Molecules to compute H-bond acceptor and H-bond donor pairs
cutofffloat, (default=3.5): Distance cutoff for A-D pairs
toleranceint, (default=30): Range (+/- tolerance) from perfect direction defined by acceptor/donor hybridization in which H-bonds are considered as strict.
donor_exactbool: Use exact protonation states for donors, i.e. require Hs on donor. By default ODDT implies some tautomeric structures as protonated, even if there is no H on specific atom.

Returns

a, datom_dict-type numpy array: Aligned arrays of atoms forming H-bond, firstly acceptors, secondly donors.
strictnumpy array, dtype=bool: Boolean array align with atom pairs, informing whether atoms form ‘strict’ H-bond (pass all angular cutoffs). If false, only distance cutoff is met, therefore the bond is ‘crude’.

oddt.interactions.hbonds(mol1, mol2, cutoff=3.5, tolerance=30, mol1_exact=False, mol2_exact=False)[source]¶

Calculates H-bonds between molecules

Parameters

mol1, mol2oddt.toolkit.Molecule object: Molecules to compute H-bond acceptor and H-bond donor pairs
cutofffloat, (default=3.5): Distance cutoff for A-D pairs
toleranceint, (default=30): Range (+/- tolerance) from perfect direction defined by atoms hybridization in which H-bonds are considered as strict.
mol1_exact, mol2_exactbool: If set to true, exact protonations states are used for respective molecules, i.e. require Hs. By default ODDT implies some tautomeric structures as protonated, even if there is no H on specific atom.

Returns

mol1_atoms, mol2_atomsatom_dict-type numpy array: Aligned arrays of atoms forming H-bond
strictnumpy array, dtype=bool: Boolean array align with atom pairs, informing whether atoms form ‘strict’ H-bond (pass all angular cutoffs). If false, only distance cutoff is met, therefore the bond is ‘crude’.

oddt.interactions.hydrophobic_contacts(mol1, mol2, cutoff=4)[source]¶

Calculates hydrophobic contacts between molecules

Parameters

mol1, mol2oddt.toolkit.Molecule object: Molecules to compute hydrophobe pairs
cutofffloat, (default=4): Distance cutoff for hydrophobe pairs

Returns

mol1_atoms, mol2_atomsatom_dict-type numpy array: Aligned arrays of atoms forming hydrophobic contacts

oddt.interactions.pi_cation(mol1, mol2, cutoff=5, tolerance=30, cation_exact=False)[source]¶

Returns pairs of ring-cation atoms, which meet pi-cation criteria

Parameters

mol1, mol2oddt.toolkit.Molecule object: Molecules to compute ring-cation pairs
cutofffloat, (default=5): Distance cutoff for Pi-cation pairs
toleranceint, (default=30): Range (+/- tolerance) from perfect direction (perpendicular) in which pi-cation are considered as strict.
cation_exactbool: Requires interacting atoms to have non-zero formal charge.

Returns

r1ring_dict-type numpy array: Aligned rings forming pi-stacking
plus2atom_dict-type numpy array: Aligned cations forming pi-cation
strict_parallelnumpy array, dtype=bool: Boolean array align with ring-cation pairs, informing whether they form ‘strict’ pi-cation. If false, only distance cutoff is met, therefore the interaction is ‘crude’.

oddt.interactions.pi_metal(mol1, mol2, cutoff=5, tolerance=30)[source]¶

Returns pairs of ring-metal atoms, which meet pi-metal criteria

Parameters

mol1, mol2oddt.toolkit.Molecule object: Molecules to compute ring-metal pairs
cutofffloat, (default=5): Distance cutoff for Pi-metal pairs
toleranceint, (default=30): Range (+/- tolerance) from perfect direction (perpendicular) in which pi-metal are considered as strict.

Returns

r1ring_dict-type numpy array: Aligned rings forming pi-metal
matom_dict-type numpy array: Aligned metals forming pi-metal
strict_parallelnumpy array, dtype=bool: Boolean array align with ring-metal pairs, informing whether they form ‘strict’ pi-metal. If false, only distance cutoff is met, therefore the interaction is ‘crude’.

oddt.interactions.pi_stacking(mol1, mol2, cutoff=5, tolerance=30)[source]¶

Returns pairs of rings, which meet pi stacking criteria

Parameters

mol1, mol2oddt.toolkit.Molecule object: Molecules to compute ring pairs
cutofffloat, (default=5): Distance cutoff for Pi-stacking pairs
toleranceint, (default=30): Range (+/- tolerance) from perfect direction (parallel or perpendicular) in which pi-stackings are considered as strict.

Returns

r1, r2ring_dict-type numpy array: Aligned arrays of rings forming pi-stacking
strict_parallelnumpy array, dtype=bool: Boolean array align with ring pairs, informing whether rings form ‘strict’ parallel pi-stacking. If false, only distance cutoff is met, therefore the stacking is ‘crude’.
strict_perpendicularnumpy array, dtype=bool: Boolean array align with ring pairs, informing whether rings form ‘strict’ perpendicular pi-stacking (T-shaped, T-face, etc.). If false, only distance cutoff is met, therefore the stacking is ‘crude’.

oddt.interactions.salt_bridge_plus_minus(mol1, mol2, cutoff=4, cation_exact=False, anion_exact=False)[source]¶

Returns pairs of plus-mins atoms, which meet salt bridge criteria

Parameters

mol1, mol2oddt.toolkit.Molecule object: Molecules to compute plus and minus pairs
cutofffloat, (default=4): Distance cutoff for A-H pairs
cation_exact, anion_exactbool: Requires interacting atoms to have non-zero formal charge.

Returns

plus, minusatom_dict-type numpy array: Aligned arrays of atoms forming salt bridge, firstly plus, secondly minus

oddt.interactions.salt_bridges(mol1, mol2, cutoff=4, mol1_exact=False, mol2_exact=False)[source]¶

Calculates salt bridges between molecules

Parameters

mol1, mol2oddt.toolkit.Molecule object: Molecules to compute plus and minus pairs
cutofffloat, (default=4): Distance cutoff for plus-minus pairs
cation_exact, anion_exactbool: Requires interacting atoms to have non-zero formal charge.

Returns

mol1_atoms, mol2_atomsatom_dict-type numpy array: Aligned arrays of atoms forming salt bridges

oddt.metrics module¶

Metrics for estimating performance of drug discovery methods implemented in ODDT

oddt.metrics.auc(x, y)[source]¶

Compute Area Under the Curve (AUC) using the trapezoidal rule.

This is a general function, given points on a curve. For computing the area under the ROC-curve, see roc_auc_score(). For an alternative way to summarize a precision-recall curve, see average_precision_score().

Parameters

xndarray of shape (n,): x coordinates. These must be either monotonic increasing or monotonic decreasing.
yndarray of shape, (n,): y coordinates.

Returns

aucfloat

oddt.pandas module¶

oddt.shape module¶

oddt.shape.common_usr(molecule, ctd=None, cst=None, fct=None, ftf=None, atoms_type=None)[source]¶

Function used in USR and USRCAT function

Parameters

moleculeoddt.toolkit.Molecule: Molecule to compute USR shape descriptor
ctdnumpy array or None (default = None): Coordinates of the molecular centroid If ‘None’, the point is calculated
cstnumpy array or None (default = None): Coordinates of the closest atom to the molecular centroid If ‘None’, the point is calculated
fctnumpy array or None (default = None): Coordinates of the farthest atom to the molecular centroid If ‘None’, the point is calculated
ftfnumpy array or None (default = None): Coordinates of the farthest atom to the farthest atom to the molecular centroid If ‘None’, the point is calculated
atoms_typestr or None (default None): Type of atoms to be selected from atom_dict If ‘None’, all atoms are used to calculate shape descriptor

Returns

shape_descriptornumpy array, shape = (12): Array describing shape of molecule

oddt.shape.electroshape(mol)[source]¶

Computes shape descriptor based on Armstrong, M. S. et al. ElectroShape: fast molecular similarity calculations incorporating shape, chirality and electrostatics. J Comput Aided Mol Des 24, 789-801 (2010). http://dx.doi.org/doi:10.1007/s10822-010-9374-0

Aside from spatial coordinates, atoms’ charges are also used as the fourth dimension to describe shape of the molecule.

Parameters

moloddt.toolkit.Molecule: Molecule to compute Electroshape descriptor

Returns

shape_descriptornumpy array, shape = (15): Array describing shape of molecule

oddt.shape.usr(molecule)[source]¶

Computes USR shape descriptor based on Ballester PJ, Richards WG (2007). Ultrafast shape recognition to search compound databases for similar molecular shapes. Journal of computational chemistry, 28(10):1711-23. http://dx.doi.org/10.1002/jcc.20681

Parameters

moleculeoddt.toolkit.Molecule: Molecule to compute USR shape descriptor

Returns

shape_descriptornumpy array, shape = (12): Array describing shape of molecule

oddt.shape.usr_cat(molecule)[source]¶

Computes USRCAT shape descriptor based on Adrian M Schreyer, Tom Blundell (2012). USRCAT: real-time ultrafast shape recognition with pharmacophoric constraints. Journal of Cheminformatics, 2012 4:27. http://dx.doi.org/10.1186/1758-2946-4-27

Parameters

moleculeoddt.toolkit.Molecule: Molecule to compute USRCAT shape descriptor

Returns

shape_descriptornumpy array, shape = (60): Array describing shape of molecule

oddt.shape.usr_similarity(mol1_shape, mol2_shape, ow=1.0, hw=1.0, rw=1.0, aw=1.0, dw=1.0)[source]¶

Computes similarity between molecules

Parameters

mol1_shapenumpy array: USR shape descriptor
mol2_shapenumpy array: USR shape descriptor
owfloat (default = 1.): Scaling factor for all atoms Only used for USRCAT, ignored for other types
hwfloat (default = 1.): Scaling factor for hydrophobic atoms Only used for USRCAT, ignored for other types
rwfloat (default = 1.): Scaling factor for aromatic atoms Only used for USRCAT, ignored for other types
awfloat (default = 1.): Scaling factor for acceptors Only used for USRCAT, ignored for other types
dwfloat (default = 1.): Scaling factor for donors Only used for USRCAT, ignored for other types

Returns

similarityfloat from 0 to 1: Similarity between shapes of molecules, 1 indicates identical molecules

oddt.spatial module¶

Spatial functions included in ODDT Mainly used by other modules, but can be accessed directly.

oddt.spatial.angle(p1, p2, p3)[source]¶

Returns an angle from a series of 3 points (point #2 is centroid). Angle is returned in degrees.

Parameters

p1,p2,p3numpy arrays, shape = [n_points, n_dimensions]: Triplets of points in n-dimensional space, aligned in rows.

Returns

anglesnumpy array, shape = [n_points]: Series of angles in degrees

oddt.spatial.angle_2v(v1, v2)[source]¶

Returns an angle between two vecors.Angle is returned in degrees.

Parameters

v1,v2numpy arrays, shape = [n_vectors, n_dimensions]: Pairs of vectors in n-dimensional space, aligned in rows.

Returns

anglesnumpy array, shape = [n_vectors]: Series of angles in degrees

oddt.spatial.dihedral(p1, p2, p3, p4)[source]¶

Returns an dihedral angle from a series of 4 points. Dihedral is returned in degrees. Function distingishes clockwise and antyclockwise dihedrals.

Parameters

p1, p2, p3, p4numpy arrays, shape = [n_points, n_dimensions]: Quadruplets of points in n-dimensional space, aligned in rows.

Returns

anglesnumpy array, shape = [n_points]: Series of angles in degrees

oddt.spatial.distance(x, y)[source]¶

Computes distance between each pair of points from x and y.

Parameters

xnumpy arrays, shape = [n_x, 3]: Array of poinds in 3D
ynumpy arrays, shape = [n_y, 3]: Array of poinds in 3D

Returns

dist_matrixnumpy arrays, shape = [n_x, n_y]: Distance matrix

oddt.spatial.rmsd(ref, mol, ignore_h=True, method=None, normalize=False)[source]¶

Computes root mean square deviation (RMSD) between two molecules (including or excluding Hydrogens). No symmetry checks are performed.

Parameters

refoddt.toolkit.Molecule object

Reference molecule for the RMSD calculation

moloddt.toolkit.Molecule object

Query molecule for RMSD calculation

ignore_hbool (default=False)

Flag indicating to ignore Hydrogen atoms while performing RMSD calculation. This toggle works only with ‘hungarian’ method and without sorting (method=None).

methodstr (default=None)

The method to be used for atom asignment between ref and mol. None means that direct matching is applied, which is the default behavior. Available methods:

canonize - match heavy atoms using canonical ordering (it forces

ignoring H’s) - hungarian - minimize RMSD using Hungarian algorithm - min_symmetry - makes multiple molecule-molecule matches and finds minimal RMSD (the slowest). Hydrogens are ignored.

normalizebool (default=False)

Normalize RMSD by square root of rot. bonds

Returns

rmsdfloat: RMSD between two molecules

oddt.spatial.rotate(coords, alpha, beta, gamma)[source]¶

Rotate coords by cerain angle in X, Y, Z. Angles are specified in radians.

Parameters

coordsnumpy arrays, shape = [n_points, 3]: Coordinates in 3-dimensional space.
alpha, beta, gamma: float: Angles to rotate the coordinates along X, Y and Z axis. Angles are specified in radians.

Returns

new_coordsnumpy arrays, shape = [n_points, 3]: Rorated coordinates in 3-dimensional space.

oddt.surface module¶

This module generates and does computation with molecular surfaces.

oddt.surface.find_surface_residues(molecule, max_dist=None, scaling=1.0)[source]¶

Finds residues close to the molecular surface using generate_surface_marching_cubes. Ignores hydrogens and waters present in the molecule.

Parameters

moleculeoddt.toolkit.Molecule: Molecule to find surface residues in.
max_distarray_like, numeric or None (default = None): Maximum distance from the surface where residues would still be considered close. If None, compares distances to radii of respective atoms.
scalingfloat (default = 1.0): Expands the grid in which computation is done by generate_surface_marching_cubes by a factor of scaling. Results in a more accurate representation of the surface, and therefore more accurate computation of distances but increases computation time.

Returns

atom_dictnumpy array: An atom_dict containing only the surface residues from the original molecule.

oddt.surface.generate_surface_marching_cubes(molecule, remove_hoh=False, scaling=1.0, probe_radius=1.4)[source]¶

Generates a molecular surface mesh using the marching_cubes method from scikit-image. Ignores hydrogens present in the molecule.

Parameters

moleculeoddt.toolkit.Molecule object: Molecule for which the surface will be generated
remove_hohbool (default = False): If True, remove waters from the molecule before generating the surface. Requires molecule.protein to be set to True.
scalingfloat (default = 1.0): Expands the grid in which computation is done by a factor of scaling. Results in a more accurate representation of the surface, but increases computation time.
probe_radiusfloat (default = 1.4): Radius of a ball used to patch up holes inside the molecule resulting from some molecular distances being larger (usually in protein). Basically reduces the surface to one accesible by other molecules of radius smaller than probe_radius.

Returns

vertsnumpy array: Spatial coordinates for mesh vertices.
facesnumpy array: Faces are defined by referencing vertices from verts.

oddt.utils module¶

Common utilities for ODDT

oddt.utils.check_molecule(mol, force_protein=False, force_coords=False, non_zero_atoms=False)[source]¶

Universal validator of molecule objects. Usage of positional arguments is allowed only for molecule object, otherwise it is prohibitted (i.e. the order of arguments will change). Desired properties of molecule are validated based on specified arguments. By default only the object type is checked. In case of discrepancy to the specification a ValueError is raised with appropriate message.

New in version 0.6.

Parameters

mol: oddt.toolkit.Molecule object: Object to verify
force_protein: bool (default=False): Force the molecule to be marked as protein (mol.protein).
force_coords: bool (default=False): Force the molecule to have non-zero coordinates.
non_zero_atoms: bool (default=False): Check if molecule has at least one atom.

oddt.utils.chunker(iterable, chunksize=100)[source]¶: Generate chunks from a generator object. If iterable is passed which is not a generator it will be converted to one with iter().

New in version 0.6.

oddt.utils.compose_iter(iterable, funcs)[source]¶: Chain functions and apply them to iterable, by exhausting the iterable. Functions are executed in the order from funcs.

New in version 0.6.

oddt.utils.is_molecule(obj)[source]¶: Check whether an object is an oddt.toolkits.{rdk,ob}.Molecule instance.

New in version 0.6.

oddt.utils.is_openbabel_molecule(obj)[source]¶: Check whether an object is an oddt.toolkits.ob.Molecule instance.

New in version 0.6.

oddt.utils.is_rdkit_molecule(obj)[source]¶: Check whether an object is an oddt.toolkits.rdk.Molecule instance.

New in version 0.6.

oddt.utils.method_caller(obj, methodname, *args, **kwargs)[source]¶: Helper function to workaround Python 2 pickle limitations to parallelize methods and generator objects

oddt.virtualscreening module¶

ODDT pipeline framework for virtual screening

class oddt.virtualscreening.virtualscreening(n_cpu=- 1, verbose=False, chunksize=100)[source]¶

Bases: object

Virtual Screening pipeline stack

Parameters

n_cpu: int (default=-1): The number of parallel procesors to use
verbose: bool (default=False): Verbosity flag for some methods

Methods

`apply_filter`(expression[, soft_fail])	Filtering method, can use raw expressions (strings to be evaled in if statement, can use oddt.toolkit.Molecule methods, eg. mol.molwt < 500) Currently supported presets: * Lipinski Rule of 5 (‘ro5’ or ‘l5’) * Fragment Rule of 3 (‘ro3’) * PAINS filter (‘pains’).
`dock`(engine, protein, args, *kwargs)	Docking procedure.
`fetch`()	A method to exhaust the pipeline.
`load_ligands`(fmt, ligands_file, **kwargs)	Loads file with ligands.
`score`(function[, protein])	Scoring procedure compatible with any scoring function implemented in ODDT and other pickled SFs which are subclasses of oddt.scoring.scorer.
`similarity`(method, query[, cutoff, protein])	Similarity filter. Supported structural methods:
`write`(fmt, filename[, csv_filename])	Outputs molecules to a file
`write_csv`(csv_filename[, fields, keep_pipe])	Outputs molecules to a csv file

apply_filter(expression, soft_fail=0)[source]¶

Filtering method, can use raw expressions (strings to be evaled in if statement, can use oddt.toolkit.Molecule methods, eg. mol.molwt < 500) Currently supported presets:

Lipinski Rule of 5 (‘ro5’ or ‘l5’)

Fragment Rule of 3 (‘ro3’)

PAINS filter (‘pains’)

Parameters

expression: string or list of strings: Expresion(s) to be used while filtering.
soft_fail: int (default=0): The number of faulures molecule can have to pass filter, aka. soft-fails.

dock(engine, protein, *args, **kwargs)[source]¶

Docking procedure.

Parameters

engine: string: Which docking engine to use.

Notes

Additional parameters are passed directly to the engine. Following docking engines are supported:

1. Audodock Vina (`engine="autodock_vina"`), see oddt.docking.autodock_vina.

fetch()[source]¶: A method to exhaust the pipeline. Itself it is lazy (a generator)

load_ligands(fmt, ligands_file, **kwargs)[source]¶

Loads file with ligands.

Parameters

file_type: string: Type of molecular file
ligands_file: string: Path to a file, which is loaded to pipeline

score(function, protein=None, *args, **kwargs)[source]¶

Scoring procedure compatible with any scoring function implemented in ODDT and other pickled SFs which are subclasses of oddt.scoring.scorer.

Parameters

function: string: Which scoring function to use.
protein: oddt.toolkit.Molecule: Default protein to use as reference

Notes

Additional parameters are passed directly to the scoring function.

similarity(method, query, cutoff=0.9, protein=None)[source]¶

Similarity filter. Supported structural methods:

ift: interaction fingerprints
sift: simple interaction fingerprints
usr: Ultrafast Shape recognition
usr_cat: Ultrafast Shape recognition, Credo Atom Types
electroshape: Electroshape, an USR method including partial charges

Parameters

method: string: Similarity method used to compare molecules. Avaiale methods: * ifp - interaction fingerprint (requires a receptor) * sifp - simple interaction fingerprint (requires a receptor) * usr - Ultrafast Shape Reckognition * usr_cat - USR, with CREDO atom types * electroshape - Electroshape, USR with moments representing partial charge
query: oddt.toolkit.Molecule or list of oddt.toolkit.Molecule: Query molecules to compare the pipeline to.
cutoff: float: Similarity cutoff for filtering molecules. Any similarity lower than it will be filtered out.
protein: oddt.toolkit.Molecule (default = None): Protein for underling method. By default it’s empty, but sturctural fingerprints need one.

write(fmt, filename, csv_filename=None, **kwargs)[source]¶

Outputs molecules to a file

Parameters

file_type: string: Type of molecular file
ligands_file: string: Path to a output file
csv_filename: string: Optional path to a CSV file

write_csv(csv_filename, fields=None, keep_pipe=False, **kwargs)[source]¶

Outputs molecules to a csv file

Parameters

csv_filename: string: Optional path to a CSV file
fields: list (default None): List of fields to save in CSV file
keep_pipe: bool (default=False): If set to True, the ligand pipe is sustained.

Module contents¶

Open Drug Discovery Toolkit¶

Universal and easy to use resource for various drug discovery tasks, ie docking, virutal screening, rescoring.

Attributes¶

toolkitmodule,: Toolkits backend module, currenlty OpenBabel [ob] and RDKit [rdk]. This setting is toolkit-wide, and sets given toolkit as default

oddt package¶

Subpackages¶

Submodules¶

oddt.datasets module¶

oddt.fingerprints module¶

oddt.interactions module¶

oddt.metrics module¶

oddt.pandas module¶

oddt.shape module¶

oddt.spatial module¶

oddt.surface module¶

oddt.utils module¶

oddt.virtualscreening module¶

Module contents¶

Open Drug Discovery Toolkit¶

Attributes¶