oddt.scoring.functions package

Submodules

oddt.scoring.functions.NNScore module

class oddt.scoring.functions.NNScore.nnscore(protein=None, n_jobs=-1)[source]

Bases: oddt.scoring.scorer

NNScore implementation [1]. Based on BINANA descriptors [2] and an ensemble of the 20 best-scoring neural networks, each with a hidden layer of 5 nodes. NNScore predicts binding affinity (pKi/d).

Parameters
protein: oddt.toolkit.Molecule object

Receptor for the scored ligands

n_jobs: int (default=-1)

Number of cores to use for scoring and training. By default (-1) all cores are allocated.
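The -1 default reads like the joblib-style convention in which negative values count back from the total number of cores. As a rough sketch of how such a value could be resolved to a worker count (resolve_n_jobs is a hypothetical helper written for illustration, not part of ODDT):

```python
import os

def resolve_n_jobs(n_jobs):
    """Translate a joblib-style n_jobs value into a worker count (hypothetical helper)."""
    available = os.cpu_count() or 1  # cpu_count() may return None
    if n_jobs == 0:
        raise ValueError("n_jobs == 0 has no meaning")
    if n_jobs < 0:
        # -1 -> all cores, -2 -> all but one, ...; never fewer than one worker
        return max(1, available + 1 + n_jobs)
    return n_jobs

print(resolve_n_jobs(-1))  # number of cores visible on this machine
```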

References

1

Durrant JD, McCammon JA. NNScore 2.0: a neural-network receptor-ligand scoring function. J Chem Inf Model. 2011;51: 2897-2903. doi:10.1021/ci2003889

2

Durrant JD, McCammon JA. BINANA: a novel algorithm for ligand-binding characterization. J Mol Graph Model. 2011;29: 888-893. doi:10.1016/j.jmgm.2011.01.004
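The ensemble idea in the class description can be illustrated with a minimal sketch. It assumes the ensemble prediction is the plain average of the member predictions; the toy models below are stand-ins for trained networks, and ensemble_predict is a hypothetical name, not an ODDT function.

```python
from statistics import mean

def ensemble_predict(models, descriptors):
    """Average the predictions of several models for one descriptor vector."""
    return mean(model(descriptors) for model in models)

# Three toy "networks" standing in for the 20 trained ones (illustrative only).
toy_models = [
    lambda x: sum(x) * 0.10,
    lambda x: sum(x) * 0.12,
    lambda x: sum(x) * 0.08,
]

print(ensemble_predict(toy_models, [10.0, 20.0, 30.0]))
```

Averaging several independently trained models tends to reduce the variance of any single network's prediction, which is the usual motivation for an ensemble.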

Methods

fit(ligands, target, *args, **kwargs)

Trains model on supplied ligands and target values

load([filename, pdbbind_version])

Loads scoring function from a pickle file.

predict(ligands, *args, **kwargs)

Predicts values (e.g. binding affinity) for the supplied ligands.

predict_ligand(ligand)

Local method to score one ligand and update its scores.

predict_ligands(ligands)

Method to score ligands in a lazy fashion.

save(filename)

Saves scoring function to a pickle file.

score(…)

Parameters

set_protein(protein)

Proxy method to update protein in all relevant places.

gen_training_data

train
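The "lazy fashion" of predict_ligands can be pictured as a generator that scores each ligand only when it is requested. The following is a sketch of that pattern under this assumption; score_lazily and its score_one argument are hypothetical and do not reproduce the ODDT implementation.

```python
def score_lazily(ligands, score_one):
    """Yield (ligand, score) pairs one at a time instead of building a list."""
    for ligand in ligands:
        yield ligand, score_one(ligand)

# len() stands in for a real scoring call; strings stand in for molecules.
scored = score_lazily(["lig_a", "lig_bb", "lig_ccc"], score_one=len)
print(next(scored))  # only the first ligand has been scored at this point
```

Because nothing is computed until the iterator is advanced, a large ligand file can be streamed through the scorer without holding every molecule in memory.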

gen_training_data(pdbbind_dir, pdbbind_versions=(2007, 2012, 2013, 2014, 2015, 2016), home_dir=None, use_proteins=False)[source]

classmethod load(filename=None, pdbbind_version=2016)[source]

Loads scoring function from a pickle file.

Parameters
filename: string

Pickle filename

Returns
sf: scorer-like object

Scoring function object loaded from a pickle
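The save/load pair follows the standard pickle round trip. A minimal sketch of that pattern using only the standard library is shown here; save_model, load_model and the dict used as a model are illustrative stand-ins, not ODDT code.

```python
import os
import pickle
import tempfile

def save_model(model, filename):
    """Serialize a model object to a pickle file."""
    with open(filename, "wb") as handle:
        pickle.dump(model, handle)

def load_model(filename):
    """Restore a model object from a pickle file (trusted files only)."""
    with open(filename, "rb") as handle:
        return pickle.load(handle)

# A plain dict stands in for a trained scoring function.
toy_model = {"name": "toy", "weights": [0.5, 1.5]}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "toy_model.pickle")
    save_model(toy_model, path)
    restored = load_model(path)

print(restored == toy_model)
```

Unpickling executes code embedded in the file, so pickles should only be loaded from sources you trust.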

train(home_dir=None, sf_pickle=None, pdbbind_version=2016)[source]

oddt.scoring.functions.PLECscore module

class oddt.scoring.functions.PLECscore.PLECscore(protein=None, n_jobs=-1, version='linear', depth_protein=5, depth_ligand=1, size=65536)[source]

Bases: oddt.scoring.scorer

PLECscore - a novel scoring function based on PLEC fingerprints. The underlying model can be one of:

  • linear regression

  • neural network (dense, 200x200x200)

  • random forest (100 trees)

The scoring function is trained on the PDBbind v2016 database and, even with the linear model, outperforms other machine-learning scoring functions in terms of Pearson correlation coefficient on the “core set”. For details see the PLEC publication. PLECscore predicts binding affinity (pKi/d).

New in version 0.6.
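The predicted quantity, pKi/d, is the negative base-10 logarithm of the inhibition or dissociation constant expressed in mol/L. A short sketch of the conversion (p_affinity is a hypothetical helper name used only for illustration):

```python
import math

def p_affinity(constant_molar):
    """Return -log10 of a dissociation or inhibition constant given in mol/L."""
    if constant_molar <= 0:
        raise ValueError("the constant must be positive")
    return -math.log10(constant_molar)

print(p_affinity(1e-9))  # a 1 nM binder
print(p_affinity(1e-6))  # a 1 uM binder
```

Higher values therefore mean tighter binding, and each unit corresponds to a tenfold change in the constant.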

Parameters
protein: oddt.toolkit.Molecule object

Receptor for the scored ligands

n_jobs: int (default=-1)

Number of cores to use for scoring and training. By default (-1) all cores are allocated.

version: str (default='linear')

The model used by the scoring function: 'linear', 'nn' or 'rf'.

depth_protein: int (default=5)

The depth of ECFP environments generated on the protein side of interaction. By default 6 (0 to 5) environments are generated.

depth_ligand: int (default=1)

The depth of ECFP environments generated on the ligand side of interaction. By default 2 (0 to 1) environments are generated.

size: int (default=65536)

The final size of a folded PLEC fingerprint. This setting does not limit the data encoded in the PLEC fingerprint (tune the depths for that), only its final length. Setting it too low will lead to many collisions.
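The effect of size on collisions can be seen in a small experiment. This sketch assumes simple modulo folding of 32-bit hashed feature IDs, which may differ from the mapping ODDT actually uses; fold and the random IDs are illustrative only.

```python
import random

def fold(hashed_ids, size):
    """Map raw hashed feature IDs onto `size` positions (modulo folding assumed)."""
    return {feature % size for feature in hashed_ids}

rng = random.Random(0)
raw = {rng.getrandbits(32) for _ in range(5000)}  # stand-in for hashed environments

for size in (65536, 1024):
    folded = fold(raw, size)
    print(size, len(raw) - len(folded), "features lost to collisions")
```

With 5000 features and only 1024 positions most features must share a position, whereas at 65536 positions collisions are comparatively rare.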

Methods

fit(ligands, target, *args, **kwargs)

Trains model on supplied ligands and target values

load([filename, version, pdbbind_version, …])

Loads scoring function from a pickle file.

predict(ligands, *args, **kwargs)

Predicts values (e.g. binding affinity) for the supplied ligands.

predict_ligand(ligand)

Local method to score one ligand and update its scores.

predict_ligands(ligands)

Method to score ligands in a lazy fashion.

save(filename)

Saves scoring function to a pickle file.

score(…)

Parameters

set_protein(protein)

Proxy method to update protein in all relevant places.

gen_json

gen_training_data

train

gen_json(home_dir=None, pdbbind_version=2016)[source]

gen_training_data(pdbbind_dir, pdbbind_versions=(2016,), home_dir=None, use_proteins=True)[source]

classmethod load(filename=None, version='linear', pdbbind_version=2016, depth_protein=5, depth_ligand=1, size=65536)[source]

Loads scoring function from a pickle file.

Parameters
filename: string

Pickle filename

Returns
sf: scorer-like object

Scoring function object loaded from a pickle

train(home_dir=None, sf_pickle=None, pdbbind_version=2016, ignore_json=False)[source]

oddt.scoring.functions.RFScore module

class oddt.scoring.functions.RFScore.rfscore(protein=None, n_jobs=-1, version=1, spr=0, **kwargs)[source]

Bases: oddt.scoring.scorer

Scoring function implementing the RF-Score variants. It predicts the binding affinity (pKi/d) of a ligand in a complex using simple descriptors (counts of close atom contacts within 12 Å) with a sophisticated machine-learning model (random forest). The third variant supplements those contacts with Vina partial scores. For further details see the RF-Score publications: v1 [1], v2 [2], v3 [3].

Parameters
protein: oddt.toolkit.Molecule object

Receptor for the scored ligands

n_jobs: int (default=-1)

Number of cores to use for scoring and training. By default (-1) all cores are allocated.

version: int (default=1)

Scoring function variant. The default is the simplest one (v1).

spr: int (default=0)

The minimum number of contacts in each pair of atom types in the training set for the column to be included in training. This removes infrequent and empty contact columns.
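The spr filter can be pictured as a column mask over the training descriptor matrix. The sketch below assumes the threshold is applied to the total count per column with a strict comparison; whether the library uses > or >= is not stated here, and frequent_columns is a hypothetical helper.

```python
def frequent_columns(descriptors, spr):
    """Return indices of columns whose total contact count exceeds `spr`."""
    totals = [sum(column) for column in zip(*descriptors)]
    return [index for index, total in enumerate(totals) if total > spr]

# Rows are complexes, columns are atom-type pairs (toy contact counts).
training = [
    [4, 0, 1],
    [7, 0, 0],
    [2, 0, 0],
]
kept = frequent_columns(training, spr=1)
filtered = [[row[i] for i in kept] for row in training]
print(kept, filtered)
```

Under this reading, spr=0 drops only the columns that are empty across the whole training set, and larger values also drop rarely observed atom-type pairs.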

References

1

Ballester PJ, Mitchell JBO. A machine learning approach to predicting protein-ligand binding affinity with applications to molecular docking. Bioinformatics. 2010;26: 1169-1175. doi:10.1093/bioinformatics/btq112

2

Ballester PJ, Schreyer A, Blundell TL. Does a more precise chemical description of protein-ligand complexes lead to more accurate prediction of binding affinity? J Chem Inf Model. 2014;54: 944-955. doi:10.1021/ci500091r

3

Li H, Leung K-S, Wong M-H, Ballester PJ. Improving AutoDock Vina Using Random Forest: The Growing Accuracy of Binding Affinity Prediction by the Effective Exploitation of Larger Data Sets. Mol Inform. 2015;34: 115-126. doi:10.1002/minf.201400132
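The close-contact descriptors named in the class description amount to counting ligand-protein atom pairs within the distance cutoff, grouped by element pair. A minimal sketch under that assumption follows; contact_counts and the toy coordinates are illustrative and do not reproduce ODDT's descriptor code.

```python
from collections import Counter
from math import dist

def contact_counts(ligand_atoms, protein_atoms, cutoff=12.0):
    """Count ligand-protein element pairs closer than `cutoff` angstroms."""
    counts = Counter()
    for lig_element, lig_xyz in ligand_atoms:
        for prot_element, prot_xyz in protein_atoms:
            if dist(lig_xyz, prot_xyz) < cutoff:
                counts[(lig_element, prot_element)] += 1
    return counts

# Each atom is (element symbol, (x, y, z)) with coordinates in angstroms.
ligand = [("C", (0.0, 0.0, 0.0)), ("O", (1.2, 0.0, 0.0))]
protein = [("N", (3.0, 0.0, 0.0)), ("C", (20.0, 0.0, 0.0))]
print(contact_counts(ligand, protein))
```

Each distinct (ligand element, protein element) key becomes one column of the descriptor vector that the random forest is trained on.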

Methods

fit(ligands, target, *args, **kwargs)

Trains model on supplied ligands and target values

load([filename, version, pdbbind_version])

Loads scoring function from a pickle file.

predict(ligands, *args, **kwargs)

Predicts values (e.g. binding affinity) for the supplied ligands.

predict_ligand(ligand)

Local method to score one ligand and update its scores.

predict_ligands(ligands)

Method to score ligands in a lazy fashion.

save(filename)

Saves scoring function to a pickle file.

score(…)

Parameters

set_protein(protein)

Proxy method to update protein in all relevant places.

gen_training_data

train

gen_training_data(pdbbind_dir, pdbbind_versions=(2007, 2012, 2013, 2014, 2015, 2016), home_dir=None, use_proteins=False)[source]

classmethod load(filename=None, version=1, pdbbind_version=2016)[source]

Loads scoring function from a pickle file.

Parameters
filename: string

Pickle filename

Returns
sf: scorer-like object

Scoring function object loaded from a pickle

train(home_dir=None, sf_pickle=None, pdbbind_version=2016)[source]

Module contents

class oddt.scoring.functions.PLECscore(protein=None, n_jobs=-1, version='linear', depth_protein=5, depth_ligand=1, size=65536)[source]

Bases: oddt.scoring.scorer

PLECscore - a novel scoring function based on PLEC fingerprints. The underlying model can be one of:

  • linear regression

  • neural network (dense, 200x200x200)

  • random forest (100 trees)

The scoring function is trained on the PDBbind v2016 database and, even with the linear model, outperforms other machine-learning scoring functions in terms of Pearson correlation coefficient on the “core set”. For details see the PLEC publication. PLECscore predicts binding affinity (pKi/d).

New in version 0.6.

Parameters
protein: oddt.toolkit.Molecule object

Receptor for the scored ligands

n_jobs: int (default=-1)

Number of cores to use for scoring and training. By default (-1) all cores are allocated.

version: str (default='linear')

The model used by the scoring function: 'linear', 'nn' or 'rf'.

depth_protein: int (default=5)

The depth of ECFP environments generated on the protein side of interaction. By default 6 (0 to 5) environments are generated.

depth_ligand: int (default=1)

The depth of ECFP environments generated on the ligand side of interaction. By default 2 (0 to 1) environments are generated.

size: int (default=65536)

The final size of a folded PLEC fingerprint. This setting does not limit the data encoded in the PLEC fingerprint (tune the depths for that), only its final length. Setting it too low will lead to many collisions.

Methods

gen_json

gen_training_data

train

gen_json(home_dir=None, pdbbind_version=2016)[source]

gen_training_data(pdbbind_dir, pdbbind_versions=(2016,), home_dir=None, use_proteins=True)[source]

classmethod load(filename=None, version='linear', pdbbind_version=2016, depth_protein=5, depth_ligand=1, size=65536)[source]

Loads scoring function from a pickle file.

Parameters
filename: string

Pickle filename

Returns
sf: scorer-like object

Scoring function object loaded from a pickle

train(home_dir=None, sf_pickle=None, pdbbind_version=2016, ignore_json=False)[source]

class oddt.scoring.functions.nnscore(protein=None, n_jobs=-1)[source]

Bases: oddt.scoring.scorer

NNScore implementation [1]. Based on BINANA descriptors [2] and an ensemble of the 20 best-scoring neural networks, each with a hidden layer of 5 nodes. NNScore predicts binding affinity (pKi/d).

Parameters
protein: oddt.toolkit.Molecule object

Receptor for the scored ligands

n_jobs: int (default=-1)

Number of cores to use for scoring and training. By default (-1) all cores are allocated.

References

1

Durrant JD, McCammon JA. NNScore 2.0: a neural-network receptor-ligand scoring function. J Chem Inf Model. 2011;51: 2897-2903. doi:10.1021/ci2003889

2

Durrant JD, McCammon JA. BINANA: a novel algorithm for ligand-binding characterization. J Mol Graph Model. 2011;29: 888-893. doi:10.1016/j.jmgm.2011.01.004

Methods

fit(ligands, target, *args, **kwargs)

Trains model on supplied ligands and target values

load([filename, pdbbind_version])

Loads scoring function from a pickle file.

predict(ligands, *args, **kwargs)

Predicts values (e.g. binding affinity) for the supplied ligands.

predict_ligand(ligand)

Local method to score one ligand and update its scores.

predict_ligands(ligands)

Method to score ligands in a lazy fashion.

save(filename)

Saves scoring function to a pickle file.

score(…)

Parameters

set_protein(protein)

Proxy method to update protein in all relevant places.

gen_training_data

train

gen_training_data(pdbbind_dir, pdbbind_versions=(2007, 2012, 2013, 2014, 2015, 2016), home_dir=None, use_proteins=False)[source]

classmethod load(filename=None, pdbbind_version=2016)[source]

Loads scoring function from a pickle file.

Parameters
filename: string

Pickle filename

Returns
sf: scorer-like object

Scoring function object loaded from a pickle

train(home_dir=None, sf_pickle=None, pdbbind_version=2016)[source]

class oddt.scoring.functions.rfscore(protein=None, n_jobs=-1, version=1, spr=0, **kwargs)[source]

Bases: oddt.scoring.scorer

Scoring function implementing the RF-Score variants. It predicts the binding affinity (pKi/d) of a ligand in a complex using simple descriptors (counts of close atom contacts within 12 Å) with a sophisticated machine-learning model (random forest). The third variant supplements those contacts with Vina partial scores. For further details see the RF-Score publications: v1 [1], v2 [2], v3 [3].

Parameters
protein: oddt.toolkit.Molecule object

Receptor for the scored ligands

n_jobs: int (default=-1)

Number of cores to use for scoring and training. By default (-1) all cores are allocated.

version: int (default=1)

Scoring function variant. The default is the simplest one (v1).

spr: int (default=0)

The minimum number of contacts in each pair of atom types in the training set for the column to be included in training. This removes infrequent and empty contact columns.

References

1

Ballester PJ, Mitchell JBO. A machine learning approach to predicting protein-ligand binding affinity with applications to molecular docking. Bioinformatics. 2010;26: 1169-1175. doi:10.1093/bioinformatics/btq112

2

Ballester PJ, Schreyer A, Blundell TL. Does a more precise chemical description of protein-ligand complexes lead to more accurate prediction of binding affinity? J Chem Inf Model. 2014;54: 944-955. doi:10.1021/ci500091r

3

Li H, Leung K-S, Wong M-H, Ballester PJ. Improving AutoDock Vina Using Random Forest: The Growing Accuracy of Binding Affinity Prediction by the Effective Exploitation of Larger Data Sets. Mol Inform. 2015;34: 115-126. doi:10.1002/minf.201400132

Methods

fit(ligands, target, *args, **kwargs)

Trains model on supplied ligands and target values

load([filename, version, pdbbind_version])

Loads scoring function from a pickle file.

predict(ligands, *args, **kwargs)

Predicts values (e.g. binding affinity) for the supplied ligands.

predict_ligand(ligand)

Local method to score one ligand and update its scores.

predict_ligands(ligands)

Method to score ligands in a lazy fashion.

save(filename)

Saves scoring function to a pickle file.

score(…)

Parameters

set_protein(protein)

Proxy method to update protein in all relevant places.

gen_training_data

train

gen_training_data(pdbbind_dir, pdbbind_versions=(2007, 2012, 2013, 2014, 2015, 2016), home_dir=None, use_proteins=False)[source]

classmethod load(filename=None, version=1, pdbbind_version=2016)[source]

Loads scoring function from a pickle file.

Parameters
filename: string

Pickle filename

Returns
sf: scorer-like object

Scoring function object loaded from a pickle

train(home_dir=None, sf_pickle=None, pdbbind_version=2016)[source]