oddt.scoring.functions package

Submodules

oddt.scoring.functions.NNScore module

class oddt.scoring.functions.NNScore.nnscore(protein=None, n_jobs=-1)

Bases: oddt.scoring.scorer

NNScore implementation [1]. Based on BINANA descriptors [2] and an ensemble of the 20 best-scoring neural networks, each with a single hidden layer of 5 nodes. NNScore predicts binding affinity (pKi/d).
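To illustrate the ensemble idea only, here is a minimal sketch. It assumes the ensemble prediction is the plain mean of the member networks' outputs; that aggregation rule is an assumption, not something stated above. The helper names (`make_network`, `predict_one`, `ensemble_predict`) and the random weights are hypothetical stand-ins, not ODDT's trained models or API.

```python
import math
import random


def make_network(n_inputs, n_hidden=5, seed=0):
    """Build a tiny one-hidden-layer network with random (hypothetical) weights."""
    rng = random.Random(seed)
    # Each hidden node holds a bias followed by one weight per input.
    hidden = [[rng.uniform(-1, 1) for _ in range(n_inputs + 1)]
              for _ in range(n_hidden)]
    # The output node holds a bias followed by one weight per hidden node.
    output = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    return hidden, output


def predict_one(network, x):
    """Forward pass: tanh hidden layer, linear output."""
    hidden, output = network
    activations = [
        math.tanh(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))
        for w in hidden
    ]
    return output[0] + sum(wo * a for wo, a in zip(output[1:], activations))


def ensemble_predict(networks, x):
    """Assumed aggregation: average the predictions of all member networks."""
    predictions = [predict_one(net, x) for net in networks]
    return sum(predictions) / len(predictions)


# 20 networks with 5 hidden nodes each, mirroring the sizes quoted above.
networks = [make_network(n_inputs=3, seed=s) for s in range(20)]
descriptor = [0.2, 1.5, -0.7]  # made-up descriptor vector
ensemble_score = ensemble_predict(networks, descriptor)
```

Averaging over several independently trained networks tends to smooth out the quirks of any single one, which is the usual motivation for this kind of ensemble.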

Parameters

protein: oddt.toolkit.Molecule object: Receptor for the scored ligands
n_jobs: int (default=-1): Number of cores to use for scoring and training. By default (-1) all cores are allocated.

References

1: Durrant JD, McCammon JA. NNScore 2.0: a neural-network receptor-ligand scoring function. J Chem Inf Model. 2011;51: 2897-2903. doi:10.1021/ci2003889
2: Durrant JD, McCammon JA. BINANA: a novel algorithm for ligand-binding characterization. J Mol Graph Model. 2011;29: 888-893. doi:10.1016/j.jmgm.2011.01.004

Methods

`fit`(ligands, target, *args, **kwargs)	Trains model on supplied ligands and target values
`load`([filename, pdbbind_version])	Loads scoring function from a pickle file.
`predict`(ligands, *args, **kwargs)	Predicts values (eg.
`predict_ligand`(ligand)	Local method to score one ligand and update its scores.
`predict_ligands`(ligands)	Method to score ligands in a lazy fashion.
`save`(filename)	Saves scoring function to a pickle file.
`score`(…)	Parameters
`set_protein`(protein)	Proxy method to update protein in all relevant places.

gen_training_data
train
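The `predict_ligands` entry above describes scoring "in a lazy fashion". A minimal sketch of that idea follows, assuming "lazy" means a generator that scores each ligand only when it is requested. `score_lazily`, `toy_score` and the dictionary-based ligands are hypothetical stand-ins, not ODDT objects.

```python
def score_lazily(ligands, score_one):
    """Yield each ligand after scoring it; no work happens until iteration."""
    for ligand in ligands:
        ligand["score"] = score_one(ligand)
        yield ligand


calls = []  # records which ligands have actually been scored


def toy_score(ligand):
    """Hypothetical scorer: uses the name length as a fake score."""
    calls.append(ligand["name"])
    return float(len(ligand["name"]))


ligands = [{"name": "aspirin"}, {"name": "ibuprofen"}]
scored = score_lazily(ligands, toy_score)

# Creating the generator does not score anything yet.
calls_before_iteration = list(calls)

# Requesting the first item scores exactly one ligand.
first = next(scored)
```

The benefit of this pattern is that a large ligand library never has to be scored, or even held in memory, all at once.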

gen_training_data(pdbbind_dir, pdbbind_versions=(2007, 2012, 2013, 2014, 2015, 2016), home_dir=None, use_proteins=False)

classmethod load(filename=None, pdbbind_version=2016)

Loads scoring function from a pickle file.

Parameters

filename: string: Pickle filename

Returns

sf: scorer-like object: Scoring function object loaded from a pickle
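The save/load pair described above is a pickle round trip. The sketch below shows the pattern with a hypothetical `ToyScorer` class; it is not ODDT's implementation. As a simplification, it pickles only a state dictionary rather than the whole object.

```python
import os
import pickle
import tempfile


class ToyScorer:
    """Hypothetical stand-in for a trained scoring function."""

    def __init__(self, weights):
        self.weights = weights

    def save(self, filename):
        # Simplification: store only the state needed to rebuild the object.
        with open(filename, "wb") as handle:
            pickle.dump({"weights": self.weights}, handle)

    @classmethod
    def load(cls, filename):
        # A classmethod can build an instance without an existing object.
        with open(filename, "rb") as handle:
            state = pickle.load(handle)
        return cls(**state)


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "toy_scorer.pickle")
    ToyScorer([0.1, 0.2]).save(path)
    restored = ToyScorer.load(path)
```

Because unpickling can execute arbitrary code, pickle files should only be loaded from trusted sources.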

train(home_dir=None, sf_pickle=None, pdbbind_version=2016)

oddt.scoring.functions.PLECscore module

class oddt.scoring.functions.PLECscore.PLECscore(protein=None, n_jobs=-1, version='linear', depth_protein=5, depth_ligand=1, size=65536)

Bases: oddt.scoring.scorer

PLECscore - a novel scoring function based on PLEC fingerprints. The underlying model can be one of:

linear regression

neural network (dense, 200x200x200)

random forest (100 trees)

The scoring function is trained on the PDBbind v2016 database. Even with the linear model, it outperforms other machine-learning scoring functions in terms of the Pearson correlation coefficient on the “core set”. For details, see the PLEC publication. PLECscore predicts binding affinity (pKi/d).
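For readers unfamiliar with the metric, the Pearson correlation coefficient between measured and predicted affinities can be computed as sketched below. The function name `pearson_r` and the affinity values are made up for illustration; they are not PDBbind data or ODDT results.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical measured and predicted pKi/d values for five complexes.
measured = [4.2, 5.1, 6.3, 7.8, 9.0]
predicted = [4.8, 5.0, 6.9, 7.1, 8.4]
r = pearson_r(measured, predicted)
```

A value close to 1 means the predictions rank and scale with the measurements almost linearly; it says nothing about a constant offset between the two.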

New in version 0.6.

Parameters

protein: oddt.toolkit.Molecule object: Receptor for the scored ligands
n_jobs: int (default=-1): Number of cores to use for scoring and training. By default (-1) all cores are allocated.
version: str (default='linear'): The model underlying the scoring function: 'linear', 'nn' or 'rf'.
depth_protein: int (default=5): The depth of ECFP environments generated on the protein side of the interaction. By default, 6 environments (depths 0 to 5) are generated.
depth_ligand: int (default=1): The depth of ECFP environments generated on the ligand side of the interaction. By default, 2 environments (depths 0 to 1) are generated.
size: int (default=65536): The final size of a folded PLEC fingerprint. This setting does not limit the data encoded in the PLEC fingerprint (tune the depths for that); it only sets the final length. Setting it too low will lead to many collisions.
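The effect of `size` on collisions can be seen in a small sketch. It assumes folding by modulo, which is one common way to fold hashed fingerprints; ODDT's own folding may work differently. The functions `fold_features` and `count_collisions` and the feature IDs are hypothetical.

```python
def fold_features(feature_ids, size):
    """Map raw integer feature IDs onto positions 0..size-1 (assumed: modulo)."""
    return [feature_id % size for feature_id in feature_ids]


def count_collisions(feature_ids, size):
    """Number of distinct raw features lost because they share a position."""
    distinct_raw = len(set(feature_ids))
    distinct_folded = len(set(fold_features(feature_ids, size)))
    return distinct_raw - distinct_folded


# Made-up raw feature IDs, chosen so that several share a remainder.
raw_ids = [3, 19, 35, 65539, 70000, 131075]

collisions_small = count_collisions(raw_ids, 16)
collisions_default = count_collisions(raw_ids, 65536)
```

With only 16 positions, six distinct features collapse onto two positions; with 65536 positions, fewer of them overlap. The encoded information is the same in both cases, only the final length differs.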

Methods

`fit`(ligands, target, *args, **kwargs)	Trains model on supplied ligands and target values
`load`([filename, version, pdbbind_version, …])	Loads scoring function from a pickle file.
`predict`(ligands, *args, **kwargs)	Predicts values (eg.
`predict_ligand`(ligand)	Local method to score one ligand and update its scores.
`predict_ligands`(ligands)	Method to score ligands in a lazy fashion.
`save`(filename)	Saves scoring function to a pickle file.
`score`(…)	Parameters
`set_protein`(protein)	Proxy method to update protein in all relevant places.

gen_json
gen_training_data
train

gen_json(home_dir=None, pdbbind_version=2016)

gen_training_data(pdbbind_dir, pdbbind_versions=(2016), home_dir=None, use_proteins=True)

classmethod load(filename=None, version='linear', pdbbind_version=2016, depth_protein=5, depth_ligand=1, size=65536)

Loads scoring function from a pickle file.

Parameters

filename: string: Pickle filename

Returns

sf: scorer-like object: Scoring function object loaded from a pickle

train(home_dir=None, sf_pickle=None, pdbbind_version=2016, ignore_json=False)

oddt.scoring.functions.RFScore module

class oddt.scoring.functions.RFScore.rfscore(protein=None, n_jobs=-1, version=1, spr=0, **kwargs)

Bases: oddt.scoring.scorer

Scoring function implementing RF-Score variants. It predicts the binding affinity (pKi/d) of a ligand in a complex using simple descriptors (close contacts of atoms < 12 Å) with a sophisticated machine-learning model (random forest). The third variant supplements those contacts with Vina partial scores. For further details see the RF-Score publications: v1 [1], v2 [2], v3 [3].
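The close-contact descriptors can be pictured as counts of protein-ligand atom pairs within the distance cutoff, grouped by element pair. The sketch below shows that counting step only, with made-up coordinates. `count_close_contacts` and the tuple-based atom representation are hypothetical and do not reflect ODDT's actual atom typing or implementation.

```python
import math
from collections import Counter


def count_close_contacts(protein_atoms, ligand_atoms, cutoff=12.0):
    """Count (protein element, ligand element) pairs closer than `cutoff` angstroms."""
    counts = Counter()
    for p_element, p_xyz in protein_atoms:
        for l_element, l_xyz in ligand_atoms:
            if math.dist(p_xyz, l_xyz) < cutoff:
                counts[(p_element, l_element)] += 1
    return counts


# Made-up atoms: (element, (x, y, z)) with coordinates in angstroms.
protein_atoms = [
    ("C", (0.0, 0.0, 0.0)),
    ("N", (5.0, 0.0, 0.0)),
    ("O", (30.0, 0.0, 0.0)),  # too far from every ligand atom
]
ligand_atoms = [
    ("C", (1.0, 0.0, 0.0)),
    ("O", (0.0, 11.0, 0.0)),
]

contacts = count_close_contacts(protein_atoms, ligand_atoms)
```

Each distinct element pair becomes one column of the descriptor vector that the random forest is trained on.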

Parameters

protein: oddt.toolkit.Molecule object: Receptor for the scored ligands
n_jobs: int (default=-1): Number of cores to use for scoring and training. By default (-1) all cores are allocated.
version: int (default=1): Scoring function variant. The default is the simplest one (v1).
spr: int (default=0): The minimum number of contacts in each pair of atom types in the training set for the column to be included in training. This removes infrequent and empty contacts.

References

1: Ballester PJ, Mitchell JBO. A machine learning approach to predicting protein-ligand binding affinity with applications to molecular docking. Bioinformatics. 2010;26: 1169-1175. doi:10.1093/bioinformatics/btq112
2: Ballester PJ, Schreyer A, Blundell TL. Does a more precise chemical description of protein-ligand complexes lead to more accurate prediction of binding affinity? J Chem Inf Model. 2014;54: 944-955. doi:10.1021/ci500091r
3: Li H, Leung K-S, Wong M-H, Ballester PJ. Improving AutoDock Vina Using Random Forest: The Growing Accuracy of Binding Affinity Prediction by the Effective Exploitation of Larger Data Sets. Mol Inform. WILEY-VCH Verlag; 2015;34: 115-126. doi:10.1002/minf.201400132

Methods

`fit`(ligands, target, *args, **kwargs)	Trains model on supplied ligands and target values
`load`([filename, version, pdbbind_version])	Loads scoring function from a pickle file.
`predict`(ligands, *args, **kwargs)	Predicts values (eg.
`predict_ligand`(ligand)	Local method to score one ligand and update its scores.
`predict_ligands`(ligands)	Method to score ligands in a lazy fashion.
`save`(filename)	Saves scoring function to a pickle file.
`score`(…)	Parameters
`set_protein`(protein)	Proxy method to update protein in all relevant places.

gen_training_data
train

gen_training_data(pdbbind_dir, pdbbind_versions=(2007, 2012, 2013, 2014, 2015, 2016), home_dir=None, use_proteins=False)

classmethod load(filename=None, version=1, pdbbind_version=2016)

Loads scoring function from a pickle file.

Parameters

filename: string: Pickle filename

Returns

sf: scorer-like object: Scoring function object loaded from a pickle

train(home_dir=None, sf_pickle=None, pdbbind_version=2016)

Module contents

class oddt.scoring.functions.PLECscore(protein=None, n_jobs=-1, version='linear', depth_protein=5, depth_ligand=1, size=65536)

Bases: oddt.scoring.scorer

PLECscore - a novel scoring function based on PLEC fingerprints. The underlying model can be one of:

linear regression

neural network (dense, 200x200x200)

random forest (100 trees)

The scoring function is trained on the PDBbind v2016 database. Even with the linear model, it outperforms other machine-learning scoring functions in terms of the Pearson correlation coefficient on the “core set”. For details, see the PLEC publication. PLECscore predicts binding affinity (pKi/d).
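The predicted quantity, pKi/d, is the negative base-10 logarithm of an inhibition or dissociation constant expressed in mol/L. A short worked conversion follows; the helper name `pk_from_molar` is hypothetical and not part of ODDT.

```python
import math


def pk_from_molar(constant_molar):
    """Return -log10 of a Ki or Kd value given in mol/L."""
    if constant_molar <= 0:
        raise ValueError("the constant must be positive")
    return -math.log10(constant_molar)


pk_10_nanomolar = pk_from_molar(10e-9)  # Ki or Kd of 10 nM
pk_1_micromolar = pk_from_molar(1e-6)   # Ki or Kd of 1 µM
```

A 10 nM binder therefore has a pKi/d of about 8 and a 1 µM binder about 6, so higher predicted values mean tighter binding.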

New in version 0.6.

Parameters

protein: oddt.toolkit.Molecule object: Receptor for the scored ligands
n_jobs: int (default=-1): Number of cores to use for scoring and training. By default (-1) all cores are allocated.
version: str (default='linear'): The model underlying the scoring function: 'linear', 'nn' or 'rf'.
depth_protein: int (default=5): The depth of ECFP environments generated on the protein side of the interaction. By default, 6 environments (depths 0 to 5) are generated.
depth_ligand: int (default=1): The depth of ECFP environments generated on the ligand side of the interaction. By default, 2 environments (depths 0 to 1) are generated.
size: int (default=65536): The final size of a folded PLEC fingerprint. This setting does not limit the data encoded in the PLEC fingerprint (tune the depths for that); it only sets the final length. Setting it too low will lead to many collisions.

Methods

gen_json
gen_training_data
train

gen_json(home_dir=None, pdbbind_version=2016)

gen_training_data(pdbbind_dir, pdbbind_versions=(2016), home_dir=None, use_proteins=True)

classmethod load(filename=None, version='linear', pdbbind_version=2016, depth_protein=5, depth_ligand=1, size=65536)

Loads scoring function from a pickle file.

Parameters

filename: string: Pickle filename

Returns

sf: scorer-like object: Scoring function object loaded from a pickle

train(home_dir=None, sf_pickle=None, pdbbind_version=2016, ignore_json=False)

class oddt.scoring.functions.nnscore(protein=None, n_jobs=-1)

Bases: oddt.scoring.scorer

NNScore implementation [1]. Based on BINANA descriptors [2] and an ensemble of the 20 best-scoring neural networks, each with a single hidden layer of 5 nodes. NNScore predicts binding affinity (pKi/d).

Parameters

protein: oddt.toolkit.Molecule object: Receptor for the scored ligands
n_jobs: int (default=-1): Number of cores to use for scoring and training. By default (-1) all cores are allocated.

References

1: Durrant JD, McCammon JA. NNScore 2.0: a neural-network receptor-ligand scoring function. J Chem Inf Model. 2011;51: 2897-2903. doi:10.1021/ci2003889
2: Durrant JD, McCammon JA. BINANA: a novel algorithm for ligand-binding characterization. J Mol Graph Model. 2011;29: 888-893. doi:10.1016/j.jmgm.2011.01.004

Methods

`fit`(ligands, target, *args, **kwargs)	Trains model on supplied ligands and target values
`load`([filename, pdbbind_version])	Loads scoring function from a pickle file.
`predict`(ligands, *args, **kwargs)	Predicts values (eg.
`predict_ligand`(ligand)	Local method to score one ligand and update its scores.
`predict_ligands`(ligands)	Method to score ligands in a lazy fashion.
`save`(filename)	Saves scoring function to a pickle file.
`score`(…)	Parameters
`set_protein`(protein)	Proxy method to update protein in all relevant places.

gen_training_data
train

gen_training_data(pdbbind_dir, pdbbind_versions=(2007, 2012, 2013, 2014, 2015, 2016), home_dir=None, use_proteins=False)

classmethod load(filename=None, pdbbind_version=2016)

Loads scoring function from a pickle file.

Parameters

filename: string: Pickle filename

Returns

sf: scorer-like object: Scoring function object loaded from a pickle

train(home_dir=None, sf_pickle=None, pdbbind_version=2016)

class oddt.scoring.functions.rfscore(protein=None, n_jobs=-1, version=1, spr=0, **kwargs)

Bases: oddt.scoring.scorer

Scoring function implementing RF-Score variants. It predicts the binding affinity (pKi/d) of a ligand in a complex using simple descriptors (close contacts of atoms < 12 Å) with a sophisticated machine-learning model (random forest). The third variant supplements those contacts with Vina partial scores. For further details see the RF-Score publications: v1 [1], v2 [2], v3 [3].

Parameters

protein: oddt.toolkit.Molecule object: Receptor for the scored ligands
n_jobs: int (default=-1): Number of cores to use for scoring and training. By default (-1) all cores are allocated.
version: int (default=1): Scoring function variant. The default is the simplest one (v1).
spr: int (default=0): The minimum number of contacts in each pair of atom types in the training set for the column to be included in training. This removes infrequent and empty contacts.

References

1: Ballester PJ, Mitchell JBO. A machine learning approach to predicting protein-ligand binding affinity with applications to molecular docking. Bioinformatics. 2010;26: 1169-1175. doi:10.1093/bioinformatics/btq112
2: Ballester PJ, Schreyer A, Blundell TL. Does a more precise chemical description of protein-ligand complexes lead to more accurate prediction of binding affinity? J Chem Inf Model. 2014;54: 944-955. doi:10.1021/ci500091r
3: Li H, Leung K-S, Wong M-H, Ballester PJ. Improving AutoDock Vina Using Random Forest: The Growing Accuracy of Binding Affinity Prediction by the Effective Exploitation of Larger Data Sets. Mol Inform. WILEY-VCH Verlag; 2015;34: 115-126. doi:10.1002/minf.201400132

Methods

`fit`(ligands, target, *args, **kwargs)	Trains model on supplied ligands and target values
`load`([filename, version, pdbbind_version])	Loads scoring function from a pickle file.
`predict`(ligands, *args, **kwargs)	Predicts values (eg.
`predict_ligand`(ligand)	Local method to score one ligand and update its scores.
`predict_ligands`(ligands)	Method to score ligands in a lazy fashion.
`save`(filename)	Saves scoring function to a pickle file.
`score`(…)	Parameters
`set_protein`(protein)	Proxy method to update protein in all relevant places.

gen_training_data
train

gen_training_data(pdbbind_dir, pdbbind_versions=(2007, 2012, 2013, 2014, 2015, 2016), home_dir=None, use_proteins=False)

classmethod load(filename=None, version=1, pdbbind_version=2016)

Loads scoring function from a pickle file.

Parameters

filename: string: Pickle filename

Returns

sf: scorer-like object: Scoring function object loaded from a pickle

train(home_dir=None, sf_pickle=None, pdbbind_version=2016)