oddt package

Submodules

oddt.datasets module

Datasets wrapped in conviniet models

class oddt.datasets.pdbbind(home, version=None, default_set=None, data_file=None, opt={})[source]

Bases: object

Attributes

activities
ids
activities
ids

oddt.interactions module

Module calculates interactions between two molecules (proein-protein, protein-ligand, small-small). Currently following interacions are implemented:

  • hydrogen bonds
  • halogen bonds
  • pi stacking (parallel and perpendicular)
  • salt bridges
  • hydrophobic contacts
  • pi-cation
  • metal coordination
  • pi-metal
oddt.interactions.close_contacts(x, y, cutoff, x_column='coords', y_column='coords')[source]

Returns pairs of atoms which are within close contac distance cutoff.

Parameters:

x, y : atom_dict-type numpy array

Atom dictionaries generated by oddt.toolkit.Molecule objects.

cutoff : float

Cutoff distance for close contacts

x_column, ycolumn : string, (default=’coords’)

Column containing coordinates of atoms (or pseudo-atoms, i.e. ring centroids)

Returns:

x_, y_ : atom_dict-type numpy array

Aligned pairs of atoms in close contact for further processing.

oddt.interactions.hbond_acceptor_donor(mol1, mol2, cutoff=3.5, base_angle=120, tolerance=30)[source]

Returns pairs of acceptor-donor atoms, which meet H-bond criteria

Parameters:

mol1, mol2 : oddt.toolkit.Molecule object

Molecules to compute H-bond acceptor and H-bond donor pairs

cutoff : float, (default=3.5)

Distance cutoff for A-D pairs

base_angle : int, (default=120)

Base angle determining allowed direction of hydrogen bond formation, which is devided by the number of neighbors of acceptor atom to establish final directional angle

tolerance : int, (default=30)

Range (+/- tolerance) from perfect direction (base_angle/n_neighbors) in which H-bonds are considered as strict.

Returns:

a, d : atom_dict-type numpy array

Aligned arrays of atoms forming H-bond, firstly acceptors, secondly donors.

strict : numpy array, dtype=bool

Boolean array align with atom pairs, informing whether atoms form ‘strict’ H-bond (pass all angular cutoffs). If false, only distance cutoff is met, therefore the bond is ‘crude’.

oddt.interactions.hbond(mol1, mol2, *args, **kwargs)[source]

Calculates H-bonds between molecules

Parameters:

mol1, mol2 : oddt.toolkit.Molecule object

Molecules to compute H-bond acceptor and H-bond donor pairs

cutoff : float, (default=3.5)

Distance cutoff for A-D pairs

base_angle : int, (default=120)

Base angle determining allowed direction of hydrogen bond formation, which is devided by the number of neighbors of acceptor atom to establish final directional angle

tolerance : int, (default=30)

Range (+/- tolerance) from perfect direction (base_angle/n_neighbors) in which H-bonds are considered as strict.

Returns:

mol1_atoms, mol2_atoms : atom_dict-type numpy array

Aligned arrays of atoms forming H-bond

strict : numpy array, dtype=bool

Boolean array align with atom pairs, informing whether atoms form ‘strict’ H-bond (pass all angular cutoffs). If false, only distance cutoff is met, therefore the bond is ‘crude’.

oddt.interactions.halogenbond_acceptor_halogen(mol1, mol2, base_angle_acceptor=120, base_angle_halogen=180, tolerance=30, cutoff=4)[source]

Returns pairs of acceptor-halogen atoms, which meet halogen bond criteria

Parameters:

mol1, mol2 : oddt.toolkit.Molecule object

Molecules to compute halogen bond acceptor and halogen pairs

cutoff : float, (default=4)

Distance cutoff for A-H pairs

base_angle_acceptor : int, (default=120)

Base angle determining allowed direction of halogen bond formation, which is devided by the number of neighbors of acceptor atom to establish final directional angle

base_angle_halogen : int (default=180)

Ideal base angle between halogen bond and halogen-neighbor bond

tolerance : int, (default=30)

Range (+/- tolerance) from perfect direction (base_angle/n_neighbors) in which halogen bonds are considered as strict.

Returns:

a, h : atom_dict-type numpy array

Aligned arrays of atoms forming halogen bond, firstly acceptors, secondly halogens

strict : numpy array, dtype=bool

Boolean array align with atom pairs, informing whether atoms form ‘strict’ halogen bond (pass all angular cutoffs). If false, only distance cutoff is met, therefore the bond is ‘crude’.

oddt.interactions.halogenbond(mol1, mol2, **kwargs)[source]

Calculates halogen bonds between molecules

Parameters:

mol1, mol2 : oddt.toolkit.Molecule object

Molecules to compute halogen bond acceptor and halogen pairs

cutoff : float, (default=4)

Distance cutoff for A-H pairs

base_angle_acceptor : int, (default=120)

Base angle determining allowed direction of halogen bond formation, which is devided by the number of neighbors of acceptor atom to establish final directional angle

base_angle_halogen : int (default=180)

Ideal base angle between halogen bond and halogen-neighbor bond

tolerance : int, (default=30)

Range (+/- tolerance) from perfect direction (base_angle/n_neighbors) in which halogen bonds are considered as strict.

Returns:

mol1_atoms, mol2_atoms : atom_dict-type numpy array

Aligned arrays of atoms forming halogen bond

strict : numpy array, dtype=bool

Boolean array align with atom pairs, informing whether atoms form ‘strict’ halogen bond (pass all angular cutoffs). If false, only distance cutoff is met, therefore the bond is ‘crude’.

oddt.interactions.pi_stacking(mol1, mol2, cutoff=5, tolerance=30)[source]

Returns pairs of rings, which meet pi stacking criteria

Parameters:

mol1, mol2 : oddt.toolkit.Molecule object

Molecules to compute ring pairs

cutoff : float, (default=5)

Distance cutoff for Pi-stacking pairs

tolerance : int, (default=30)

Range (+/- tolerance) from perfect direction (parallel or perpendicular) in which pi-stackings are considered as strict.

Returns:

r1, r2 : ring_dict-type numpy array

Aligned arrays of rings forming pi-stacking

strict_parallel : numpy array, dtype=bool

Boolean array align with ring pairs, informing whether rings form ‘strict’ parallel pi-stacking. If false, only distance cutoff is met, therefore the stacking is ‘crude’.

strict_perpendicular : numpy array, dtype=bool

Boolean array align with ring pairs, informing whether rings form ‘strict’ perpendicular pi-stacking (T-shaped, T-face, etc.). If false, only distance cutoff is met, therefore the stacking is ‘crude’.

oddt.interactions.salt_bridge_plus_minus(mol1, mol2, cutoff=4)[source]

Returns pairs of plus-mins atoms, which meet salt bridge criteria

Parameters:

mol1, mol2 : oddt.toolkit.Molecule object

Molecules to compute plus and minus pairs

cutoff : float, (default=4)

Distance cutoff for A-H pairs

Returns:

plus, minus : atom_dict-type numpy array

Aligned arrays of atoms forming salt bridge, firstly plus, secondly minus

oddt.interactions.salt_bridges(mol1, mol2, *args, **kwargs)[source]

Calculates salt bridges between molecules

Parameters:

mol1, mol2 : oddt.toolkit.Molecule object

Molecules to compute plus and minus pairs

cutoff : float, (default=4)

Distance cutoff for plus-minus pairs

Returns:

mol1_atoms, mol2_atoms : atom_dict-type numpy array

Aligned arrays of atoms forming salt bridges

oddt.interactions.hydrophobic_contacts(mol1, mol2, cutoff=4)[source]

Calculates hydrophobic contacts between molecules

Parameters:

mol1, mol2 : oddt.toolkit.Molecule object

Molecules to compute hydrophobe pairs

cutoff : float, (default=4)

Distance cutoff for hydrophobe pairs

Returns:

mol1_atoms, mol2_atoms : atom_dict-type numpy array

Aligned arrays of atoms forming hydrophobic contacts

oddt.interactions.pi_cation(mol1, mol2, cutoff=5, tolerance=30)[source]

Returns pairs of ring-cation atoms, which meet pi-cation criteria

Parameters:

mol1, mol2 : oddt.toolkit.Molecule object

Molecules to compute ring-cation pairs

cutoff : float, (default=5)

Distance cutoff for Pi-cation pairs

tolerance : int, (default=30)

Range (+/- tolerance) from perfect direction (perpendicular) in which pi-cation are considered as strict.

Returns:

r1 : ring_dict-type numpy array

Aligned rings forming pi-stacking

plus2 : atom_dict-type numpy array

Aligned cations forming pi-cation

strict_parallel : numpy array, dtype=bool

Boolean array align with ring-cation pairs, informing whether they form ‘strict’ pi-cation. If false, only distance cutoff is met, therefore the interaction is ‘crude’.

oddt.interactions.acceptor_metal(mol1, mol2, base_angle=120, tolerance=30, cutoff=4)[source]

Returns pairs of acceptor-metal atoms, which meet metal coordination criteria Note: This function is directional (mol1 holds acceptors, mol2 holds metals)

Parameters:

mol1, mol2 : oddt.toolkit.Molecule object

Molecules to compute acceptor and metal pairs

cutoff : float, (default=4)

Distance cutoff for A-M pairs

base_angle : int, (default=120)

Base angle determining allowed direction of metal coordination, which is devided by the number of neighbors of acceptor atom to establish final directional angle

tolerance : int, (default=30)

Range (+/- tolerance) from perfect direction (base_angle/n_neighbors) in metal coordination are considered as strict.

Returns:

a, d : atom_dict-type numpy array

Aligned arrays of atoms forming metal coordination, firstly acceptors, secondly metals.

strict : numpy array, dtype=bool

Boolean array align with atom pairs, informing whether atoms form ‘strict’ metal coordination (pass all angular cutoffs). If false, only distance cutoff is met, therefore the interaction is ‘crude’.

oddt.interactions.pi_metal(mol1, mol2, cutoff=5, tolerance=30)[source]

Returns pairs of ring-metal atoms, which meet pi-metal criteria

Parameters:

mol1, mol2 : oddt.toolkit.Molecule object

Molecules to compute ring-metal pairs

cutoff : float, (default=5)

Distance cutoff for Pi-metal pairs

tolerance : int, (default=30)

Range (+/- tolerance) from perfect direction (perpendicular) in which pi-metal are considered as strict.

Returns:

r1 : ring_dict-type numpy array

Aligned rings forming pi-metal

m : atom_dict-type numpy array

Aligned metals forming pi-metal

strict_parallel : numpy array, dtype=bool

Boolean array align with ring-metal pairs, informing whether they form ‘strict’ pi-metal. If false, only distance cutoff is met, therefore the interaction is ‘crude’.

oddt.metrics module

Metrics for estimating performance of drug discovery methods implemented in ODDT

oddt.metrics.roc(y_true, y_score, pos_label=None, sample_weight=None, drop_intermediate=True)

Compute Receiver operating characteristic (ROC)

Note: this implementation is restricted to the binary classification task.

Read more in the User Guide.

Parameters:

y_true : array, shape = [n_samples]

True binary labels in range {0, 1} or {-1, 1}. If labels are not binary, pos_label should be explicitly given.

y_score : array, shape = [n_samples]

Target scores, can either be probability estimates of the positive class or confidence values.

pos_label : int

Label considered as positive and others are considered negative.

sample_weight : array-like of shape = [n_samples], optional

Sample weights.

drop_intermediate : boolean, optional (default=True)

Whether to drop some suboptimal thresholds which would not appear on a plotted ROC curve. This is useful in order to create lighter ROC curves.

New in version 0.17: parameter drop_intermediate.

Returns:

fpr : array, shape = [>2]

Increasing false positive rates such that element i is the false positive rate of predictions with score >= thresholds[i].

tpr : array, shape = [>2]

Increasing true positive rates such that element i is the true positive rate of predictions with score >= thresholds[i].

thresholds : array, shape = [n_thresholds]

Decreasing thresholds on the decision function used to compute fpr and tpr. thresholds[0] represents no instances being predicted and is arbitrarily set to max(y_score) + 1.

See also

roc_auc_score
Compute Area Under the Curve (AUC) from prediction scores

Notes

Since the thresholds are sorted from low to high values, they are reversed upon returning them to ensure they correspond to both fpr and tpr, which are sorted in reversed order during their calculation.

References

[R1]Wikipedia entry for the Receiver operating characteristic

Examples

>>> import numpy as np
>>> from sklearn import metrics
>>> y = np.array([1, 1, 2, 2])
>>> scores = np.array([0.1, 0.4, 0.35, 0.8])
>>> fpr, tpr, thresholds = metrics.roc_curve(y, scores, pos_label=2)
>>> fpr
array([ 0. ,  0.5,  0.5,  1. ])
>>> tpr
array([ 0.5,  0.5,  1. ,  1. ])
>>> thresholds
array([ 0.8 ,  0.4 ,  0.35,  0.1 ])
oddt.metrics.auc(x, y, reorder=False)[source]

Compute Area Under the Curve (AUC) using the trapezoidal rule

This is a general function, given points on a curve. For computing the area under the ROC-curve, see roc_auc_score().

Parameters:

x : array, shape = [n]

x coordinates.

y : array, shape = [n]

y coordinates.

reorder : boolean, optional (default=False)

If True, assume that the curve is ascending in the case of ties, as for an ROC curve. If the curve is non-ascending, the result will be wrong.

Returns:

auc : float

See also

roc_auc_score
Computes the area under the ROC curve
precision_recall_curve
Compute precision-recall pairs for different probability thresholds

Examples

>>> import numpy as np
>>> from sklearn import metrics
>>> y = np.array([1, 1, 2, 2])
>>> pred = np.array([0.1, 0.4, 0.35, 0.8])
>>> fpr, tpr, thresholds = metrics.roc_curve(y, pred, pos_label=2)
>>> metrics.auc(fpr, tpr)
0.75
oddt.metrics.roc_auc(y_true, y_score, pos_label=None, ascending_score=True)[source]

Computes ROC AUC score

Parameters:

y_true : array, shape=[n_samples]

True binary labels, in range {0,1} or {-1,1}. If positive label is different than 1, it must be explicitly defined.

y_score : array, shape=[n_samples]

Scores for tested series of samples

pos_label: int

Positive label of samples (if other than 1)

ascending_score: bool (default=True)

Indicates if your score is ascendig

Returns:

ef : float

Enrichment Factor for given percenage in range 0:1

oddt.metrics.roc_log_auc(y_true, y_score, pos_label=None, ascending_score=True, log_min=0.001, log_max=1.0)[source]

Computes area under semi-log ROC for random distribution.

Parameters:

y_true : array, shape=[n_samples]

True binary labels, in range {0,1} or {-1,1}. If positive label is different than 1, it must be explicitly defined.

y_score : array, shape=[n_samples]

Scores for tested series of samples

pos_label: int

Positive label of samples (if other than 1)

ascending_score: bool (default=True)

Indicates if your score is ascendig

log_min : float (default=0.001)

Minimum logarithm value for estimating AUC

log_max : float (default=1.)

Maximum logarithm value for estimating AUC.

Returns:

auc : float

semi-log ROC AUC

oddt.metrics.enrichment_factor(y_true, y_score, percentage=1, pos_label=None)[source]

Computes enrichment factor for given percentage, i.e. EF_1% is enrichment factor for first percent of given samples.

Parameters:

y_true : array, shape=[n_samples]

True binary labels, in range {0,1} or {-1,1}. If positive label is different than 1, it must be explicitly defined.

y_score : array, shape=[n_samples]

Scores for tested series of samples

percentage : int or float

The percentage for which EF is being calculated

pos_label: int

Positive label of samples (if other than 1)

Returns:

ef : float

Enrichment Factor for given percenage in range 0:1

oddt.metrics.random_roc_log_auc(log_min=0.001, log_max=1.0)[source]

Computes area under semi-log ROC for random distribution.

Parameters:

log_min : float (default=0.001)

Minimum logarithm value for estimating AUC

log_max : float (default=1.)

Maximum logarithm value for estimating AUC.

Returns:

auc : float

semi-log ROC AUC for random distribution

oddt.metrics.rmse(y_true, y_pred)[source]

Compute Root Mean Squared Error (RMSE)

Parameters:

y_true : array-like of shape = [n_samples] or [n_samples, n_outputs]

Ground truth (correct) target values.

y_pred : array-like of shape = [n_samples] or [n_samples, n_outputs]

Estimated target values.

Returns:

rmse : float

A positive floating point value (the best value is 0.0).

oddt.spatial module

Spatial functions included in ODDT Mainly used by other modules, but can be accessed directly.

oddt.spatial.angle(p1, p2, p3)[source]

Returns an angle from a series of 3 points (point #2 is centroid).Angle is returned in degrees.

Parameters:

p1,p2,p3 : numpy arrays, shape = [n_points, n_dimensions]

Triplets of points in n-dimensional space, aligned in rows.

Returns:

angles : numpy array, shape = [n_points]

Series of angles in degrees

oddt.spatial.angle_2v(v1, v2)[source]

Returns an angle between two vecors.Angle is returned in degrees.

Parameters:

v1,v2 : numpy arrays, shape = [n_vectors, n_dimensions]

Pairs of vectors in n-dimensional space, aligned in rows.

Returns:

angles : numpy array, shape = [n_vectors]

Series of angles in degrees

oddt.spatial.dihedral(p1, p2, p3, p4)[source]

Returns an dihedral angle from a series of 4 points. Dihedral is returned in degrees. Function distingishes clockwise and antyclockwise dihedrals.

Parameters:

p1,p2,p3,p4 : numpy arrays, shape = [n_points, n_dimensions]

Quadruplets of points in n-dimensional space, aligned in rows.

Returns:

angles : numpy array, shape = [n_points]

Series of angles in degrees

oddt.spatial.distance(XA, XB, metric='euclidean', p=2, V=None, VI=None, w=None)

Computes distance between each pair of the two collections of inputs.

The following are common calling conventions:

  1. Y = cdist(XA, XB, 'euclidean')

    Computes the distance between \(m\) points using Euclidean distance (2-norm) as the distance metric between the points. The points are arranged as \(m\) \(n\)-dimensional row vectors in the matrix X.

  2. Y = cdist(XA, XB, 'minkowski', p)

    Computes the distances using the Minkowski distance \(||u-v||_p\) (\(p\)-norm) where \(p \geq 1\).

  3. Y = cdist(XA, XB, 'cityblock')

    Computes the city block or Manhattan distance between the points.

  4. Y = cdist(XA, XB, 'seuclidean', V=None)

    Computes the standardized Euclidean distance. The standardized Euclidean distance between two n-vectors u and v is

    \[\sqrt{\sum {(u_i-v_i)^2 / V[x_i]}}.\]

    V is the variance vector; V[i] is the variance computed over all the i’th components of the points. If not passed, it is automatically computed.

  5. Y = cdist(XA, XB, 'sqeuclidean')

    Computes the squared Euclidean distance \(||u-v||_2^2\) between the vectors.

  6. Y = cdist(XA, XB, 'cosine')

    Computes the cosine distance between vectors u and v,

    \[1 - \frac{u \cdot v} {{||u||}_2 {||v||}_2}\]

    where \(||*||_2\) is the 2-norm of its argument *, and \(u \cdot v\) is the dot product of \(u\) and \(v\).

  7. Y = cdist(XA, XB, 'correlation')

    Computes the correlation distance between vectors u and v. This is

    \[1 - \frac{(u - \bar{u}) \cdot (v - \bar{v})} {{||(u - \bar{u})||}_2 {||(v - \bar{v})||}_2}\]

    where \(\bar{v}\) is the mean of the elements of vector v, and \(x \cdot y\) is the dot product of \(x\) and \(y\).

  8. Y = cdist(XA, XB, 'hamming')

    Computes the normalized Hamming distance, or the proportion of those vector elements between two n-vectors u and v which disagree. To save memory, the matrix X can be of type boolean.

  9. Y = cdist(XA, XB, 'jaccard')

    Computes the Jaccard distance between the points. Given two vectors, u and v, the Jaccard distance is the proportion of those elements u[i] and v[i] that disagree where at least one of them is non-zero.

  10. Y = cdist(XA, XB, 'chebyshev')

Computes the Chebyshev distance between the points. The Chebyshev distance between two n-vectors u and v is the maximum norm-1 distance between their respective elements. More precisely, the distance is given by

\[d(u,v) = \max_i {|u_i-v_i|}.\]
  1. Y = cdist(XA, XB, 'canberra')

Computes the Canberra distance between the points. The Canberra distance between two points u and v is

\[d(u,v) = \sum_i \frac{|u_i-v_i|} {|u_i|+|v_i|}.\]
  1. Y = cdist(XA, XB, 'braycurtis')

Computes the Bray-Curtis distance between the points. The Bray-Curtis distance between two points u and v is

\[d(u,v) = \frac{\sum_i (u_i-v_i)} {\sum_i (u_i+v_i)}\]
  1. Y = cdist(XA, XB, 'mahalanobis', VI=None)
Computes the Mahalanobis distance between the points. The Mahalanobis distance between two points u and v is \((u-v)(1/V)(u-v)^T\) where \((1/V)\) (the VI variable) is the inverse covariance. If VI is not None, VI will be used as the inverse covariance matrix.
  1. Y = cdist(XA, XB, 'yule')
Computes the Yule distance between the boolean vectors. (see yule function documentation)
  1. Y = cdist(XA, XB, 'matching')
Computes the matching distance between the boolean vectors. (see matching function documentation)
  1. Y = cdist(XA, XB, 'dice')
Computes the Dice distance between the boolean vectors. (see dice function documentation)
  1. Y = cdist(XA, XB, 'kulsinski')
Computes the Kulsinski distance between the boolean vectors. (see kulsinski function documentation)
  1. Y = cdist(XA, XB, 'rogerstanimoto')
Computes the Rogers-Tanimoto distance between the boolean vectors. (see rogerstanimoto function documentation)
  1. Y = cdist(XA, XB, 'russellrao')
Computes the Russell-Rao distance between the boolean vectors. (see russellrao function documentation)
  1. Y = cdist(XA, XB, 'sokalmichener')
Computes the Sokal-Michener distance between the boolean vectors. (see sokalmichener function documentation)
  1. Y = cdist(XA, XB, 'sokalsneath')
Computes the Sokal-Sneath distance between the vectors. (see sokalsneath function documentation)
  1. Y = cdist(XA, XB, 'wminkowski')
Computes the weighted Minkowski distance between the vectors. (see wminkowski function documentation)
  1. Y = cdist(XA, XB, f)

Computes the distance between all pairs of vectors in X using the user supplied 2-arity function f. For example, Euclidean distance between the vectors could be computed as follows:

dm = cdist(XA, XB, lambda u, v: np.sqrt(((u-v)**2).sum()))

Note that you should avoid passing a reference to one of the distance functions defined in this library. For example,:

dm = cdist(XA, XB, sokalsneath)

would calculate the pair-wise distances between the vectors in X using the Python function sokalsneath. This would result in sokalsneath being called \({n \choose 2}\) times, which is inefficient. Instead, the optimized C version is more efficient, and we call it using the following syntax:

dm = cdist(XA, XB, 'sokalsneath')
Parameters:

XA : ndarray

An \(m_A\) by \(n\) array of \(m_A\) original observations in an \(n\)-dimensional space. Inputs are converted to float type.

XB : ndarray

An \(m_B\) by \(n\) array of \(m_B\) original observations in an \(n\)-dimensional space. Inputs are converted to float type.

metric : str or callable, optional

The distance metric to use. If a string, the distance function can be ‘braycurtis’, ‘canberra’, ‘chebyshev’, ‘cityblock’, ‘correlation’, ‘cosine’, ‘dice’, ‘euclidean’, ‘hamming’, ‘jaccard’, ‘kulsinski’, ‘mahalanobis’, ‘matching’, ‘minkowski’, ‘rogerstanimoto’, ‘russellrao’, ‘seuclidean’, ‘sokalmichener’, ‘sokalsneath’, ‘sqeuclidean’, ‘wminkowski’, ‘yule’.

w : ndarray, optional

The weight vector (for weighted Minkowski).

p : scalar, optional

The p-norm to apply (for Minkowski, weighted and unweighted)

V : ndarray, optional

The variance vector (for standardized Euclidean).

VI : ndarray, optional

The inverse of the covariance matrix (for Mahalanobis).

Returns:

Y : ndarray

A \(m_A\) by \(m_B\) distance matrix is returned. For each \(i\) and \(j\), the metric dist(u=XA[i], v=XB[j]) is computed and stored in the \(ij\) th entry.

Raises:

ValueError

An exception is thrown if XA and XB do not have the same number of columns.

Examples

Find the Euclidean distances between four 2-D coordinates:

>>> from scipy.spatial import distance
>>> coords = [(35.0456, -85.2672),
...           (35.1174, -89.9711),
...           (35.9728, -83.9422),
...           (36.1667, -86.7833)]
>>> distance.cdist(coords, coords, 'euclidean')
array([[ 0.    ,  4.7044,  1.6172,  1.8856],
       [ 4.7044,  0.    ,  6.0893,  3.3561],
       [ 1.6172,  6.0893,  0.    ,  2.8477],
       [ 1.8856,  3.3561,  2.8477,  0.    ]])

Find the Manhattan distance from a 3-D point to the corners of the unit cube:

>>> a = np.array([[0, 0, 0],
                  [0, 0, 1],
                  [0, 1, 0],
                  [0, 1, 1],
                  [1, 0, 0],
                  [1, 0, 1],
                  [1, 1, 0],
                  [1, 1, 1]])
>>> b = np.array([[ 0.1,  0.2,  0.4]])
>>> distance.cdist(a, b, 'cityblock')
array([[ 0.7],
       [ 0.9],
       [ 1.3],
       [ 1.5],
       [ 1.5],
       [ 1.7],
       [ 2.1],
       [ 2.3]])

oddt.virtualscreening module

ODDT pipeline framework for virtual screening

class oddt.virtualscreening.virtualscreening(n_cpu=-1, verbose=False)[source]

Virtual Screening pipeline stack

Parameters:

n_cpu: int (default=-1)

The number of parallel procesors to use

verbose: bool (default=False)

Verbosity flag for some methods

Methods

apply_filter(expression[, filter_type, ...]) Filtering method, can use raw expressions (strings to be evaled in if statement, can use oddt.toolkit.Molecule methods, eg.
dock(engine, protein, *args, **kwargs) Docking procedure.
fetch()
load_ligands(fmt, ligands_file, *args, **kwargs) Loads file with ligands.
score(function[, protein]) Scoring procedure.
write(fmt, filename[, csv_filename]) Outputs molecules to a file
write_csv(csv_filename[, keep_pipe]) Outputs molecules to a csv file
apply_filter(expression, filter_type='expression', soft_fail=0)[source]

Filtering method, can use raw expressions (strings to be evaled in if statement, can use oddt.toolkit.Molecule methods, eg. ‘mol.molwt < 500’) Currently supported presets:

  • Lipinski Rule of 5 (‘r5’ or ‘l5’)
  • Fragment Rule of 3 (‘r3’)
Parameters:

expression: string or list of strings

Expresion(s) to be used while filtering.

filter_type: ‘expression’ or ‘preset’ (default=’expression’)

Specify filter type: ‘expression’ or ‘preset’. Default strings are treated as expressions.

soft_fail: int (default=0)

The number of faulures molecule can have to pass filter, aka. soft-fails.

dock(engine, protein, *args, **kwargs)[source]

Docking procedure.

Parameters:

engine: string

Which docking engine to use.

fetch()[source]
load_ligands(fmt, ligands_file, *args, **kwargs)[source]

Loads file with ligands.

Parameters:

file_type: string

Type of molecular file

ligands_file: string

Path to a file, which is loaded to pipeline

score(function, protein=None, *args, **kwargs)[source]

Scoring procedure.

Parameters:

function: string

Which scoring function to use.

protein: oddt.toolkit.Molecule

Default protein to use as reference

write(fmt, filename, csv_filename=None, **kwargs)[source]

Outputs molecules to a file

Parameters:

file_type: string

Type of molecular file

ligands_file: string

Path to a output file

csv_filename: string

Optional path to a CSV file

write_csv(csv_filename, keep_pipe=False, **kwargs)[source]

Outputs molecules to a csv file

Parameters:

csv_filename: string

Optional path to a CSV file

keep_pipe: bool (default=False)

If set to True, the ligand pipe is sustained.

Module contents

Open Drug Discovery Toolkit

Universal and easy to use resource for various drug discovery tasks, ie docking, virutal screening, rescoring.

toolkit : module,
Toolkits backend module, currenlty OpenBabel [ob] and RDKit [rdk]. This setting is toolkit-wide, and sets given toolkit as default