oddt.scoring.functions package
Submodules
oddt.scoring.functions.NNScore module
class oddt.scoring.functions.NNScore.nnscore(protein=None, n_jobs=-1)
Bases: oddt.scoring.scorer
NNScore implementation [1]. Based on BINANA descriptors [2] and an ensemble of the 20 best-scoring neural networks, each with a hidden layer of 5 nodes. NNScore predicts binding affinity (pKi/d).
- Parameters
- protein: oddt.toolkit.Molecule object
Receptor for the scored ligands
- n_jobs: int (default=-1)
Number of cores to use for scoring and training. By default (-1) all cores are allocated.
References
- [1] Durrant JD, McCammon JA. NNScore 2.0: a neural-network receptor-ligand scoring function. J Chem Inf Model. 2011;51: 2897-2903. doi:10.1021/ci2003889
- [2] Durrant JD, McCammon JA. BINANA: a novel algorithm for ligand-binding characterization. J Mol Graph Model. 2011;29: 888-893. doi:10.1016/j.jmgm.2011.01.004
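The ensemble idea in the class description can be illustrated with a minimal sketch. It assumes that the ensemble output is the plain average of the member predictions; that combination rule, the `Network` type, and `ensemble_predict` are illustrative assumptions and not part of the ODDT API.

```python
from statistics import mean
from typing import Callable, Sequence

# Hypothetical stand-in for one trained network:
# maps a descriptor vector to a predicted pKi/d value.
Network = Callable[[Sequence[float]], float]


def ensemble_predict(networks: Sequence[Network], descriptors: Sequence[float]) -> float:
    """Average the predictions of all ensemble members (assumed combination rule)."""
    if not networks:
        raise ValueError("the ensemble needs at least one network")
    return mean(net(descriptors) for net in networks)


# Toy ensemble of 20 "networks": the same linear model with slightly different biases.
toy_networks = [lambda d, b=i * 0.01: sum(d) + b for i in range(20)]
prediction = ensemble_predict(toy_networks, [1.0, 2.0, 3.0])
```

Averaging several independently trained models tends to reduce the variance of the prediction compared with any single member.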
Methods
- fit(ligands, target, *args, **kwargs)
Trains the model on the supplied ligands and target values.
- load([filename, pdbbind_version])
Loads a scoring function from a pickle file.
- predict(ligands, *args, **kwargs)
Predicts values for the supplied ligands.
- predict_ligand(ligand)
Local method to score one ligand and update its scores.
- predict_ligands(ligands)
Scores ligands lazily.
- save(filename)
Saves the scoring function to a pickle file.
- score(…)
- set_protein(protein)
Proxy method to update the protein in all relevant places.
- gen_training_data
- train
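Scoring "in a lazy fashion" generally means that each ligand is scored only when the caller asks for it. The sketch below shows that pattern with a plain Python generator. Ligands are represented as dictionaries, and `score_lazily` and `toy_score` are hypothetical names; this is not how ODDT implements `predict_ligands`, only an illustration of the idea.

```python
from typing import Callable, Iterable, Iterator

calls = []  # records which ligands have actually been scored


def score_lazily(ligands: Iterable[dict], score_one: Callable[[dict], float]) -> Iterator[dict]:
    """Yield ligands one at a time, scoring each only when it is requested."""
    for ligand in ligands:
        ligand["score"] = score_one(ligand)  # update the ligand's own record
        yield ligand


def toy_score(ligand: dict) -> float:
    # Hypothetical scoring rule, used only to make the sketch runnable.
    calls.append(ligand["name"])
    return float(len(ligand["name"]))


ligands = [{"name": "aspirin"}, {"name": "caffeine"}, {"name": "ibuprofen"}]
scored = score_lazily(ligands, toy_score)
first = next(scored)  # only the first ligand has been scored at this point
```

This keeps memory use flat for large ligand sets, because no list of results is built unless the caller collects one.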
gen_training_data(pdbbind_dir, pdbbind_versions=(2007, 2012, 2013, 2014, 2015, 2016), home_dir=None, use_proteins=False)
oddt.scoring.functions.PLECscore module
class oddt.scoring.functions.PLECscore.PLECscore(protein=None, n_jobs=-1, version='linear', depth_protein=5, depth_ligand=1, size=65536)
Bases: oddt.scoring.scorer
PLECscore is a novel scoring function based on PLEC fingerprints. The underlying model can be one of:
- linear regression
- neural network (dense, 200x200x200)
- random forest (100 trees)
The scoring function is trained on the PDBbind v2016 database and, even with the linear model, outperforms other machine-learning scoring functions in terms of the Pearson correlation coefficient on the “core set”. For details, see the PLEC publication. PLECscore predicts binding affinity (pKi/d).
New in version 0.6.
- Parameters
- protein: oddt.toolkit.Molecule object
Receptor for the scored ligands
- n_jobs: int (default=-1)
Number of cores to use for scoring and training. By default (-1) all cores are allocated.
- version: str (default='linear')
The model used by the scoring function: 'linear', 'nn' or 'rf'.
- depth_protein: int (default=5)
The depth of ECFP environments generated on the protein side of the interaction. By default 6 environments (depths 0 to 5) are generated.
- depth_ligand: int (default=1)
The depth of ECFP environments generated on the ligand side of the interaction. By default 2 environments (depths 0 to 1) are generated.
- size: int (default=65536)
The final size of the folded PLEC fingerprint. This setting does not limit the data encoded in the PLEC fingerprint (tune the depths for that), only the final length. Setting it too low leads to many collisions.
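The effect of `size` on collisions can be seen in a small sketch. It assumes that folding maps each on-bit index to `index % size`, which is one common folding scheme; whether ODDT folds exactly this way is an assumption, and `fold`, `collisions`, and the toy fingerprint are illustrative names and data.

```python
from __future__ import annotations


def fold(bits: set[int], size: int) -> set[int]:
    """Fold a sparse fingerprint (set of on-bit indices) to `size` positions via modulo."""
    if size <= 0:
        raise ValueError("size must be positive")
    return {bit % size for bit in bits}


def collisions(bits: set[int], size: int) -> int:
    """Number of on-bits lost because they share a folded position with another bit."""
    return len(bits) - len(fold(bits, size))


# Toy fingerprint: 1000 distinct 32-bit identifiers standing in for hashed environments.
raw_bits = {(i * 2654435761) % 2**32 for i in range(1000)}

# Collision counts for progressively smaller folded sizes.
collision_counts = {size: collisions(raw_bits, size) for size in (65536, 4096, 256)}
```

Because each smaller size here divides the larger one, the collision count can only grow as `size` shrinks, which is why a very small `size` degrades the fingerprint.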
Methods
- fit(ligands, target, *args, **kwargs)
Trains the model on the supplied ligands and target values.
- load([filename, version, pdbbind_version, …])
Loads a scoring function from a pickle file.
- predict(ligands, *args, **kwargs)
Predicts values for the supplied ligands.
- predict_ligand(ligand)
Local method to score one ligand and update its scores.
- predict_ligands(ligands)
Scores ligands lazily.
- save(filename)
Saves the scoring function to a pickle file.
- score(…)
- set_protein(protein)
Proxy method to update the protein in all relevant places.
- gen_json
- gen_training_data
- train
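The `save` and `load` methods above persist a scoring function with pickle. The following sketch shows the general round trip using only the standard library. `ToyScorer` is a hypothetical stand-in and not an ODDT class, and it pickles only its state dictionary, which is a design choice of this sketch rather than a description of ODDT's file format.

```python
import os
import pickle
import tempfile


class ToyScorer:
    """Hypothetical stand-in for a trained scoring function (not an ODDT class)."""

    def __init__(self, weights):
        self.weights = list(weights)

    def predict(self, descriptors):
        # Simple weighted sum, used only to have something to compare after loading.
        return sum(w * x for w, x in zip(self.weights, descriptors))

    def save(self, filename):
        """Write the model state to a pickle file."""
        with open(filename, "wb") as handle:
            pickle.dump({"weights": self.weights}, handle)

    @classmethod
    def load(cls, filename):
        """Rebuild a scorer from a previously saved pickle file."""
        with open(filename, "rb") as handle:
            state = pickle.load(handle)
        return cls(state["weights"])


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "toy_scorer.pickle")
    original = ToyScorer([0.5, -1.0, 2.0])
    original.save(path)
    restored = ToyScorer.load(path)
```

Pickle files can execute arbitrary code when loaded, so only load scoring functions from sources you trust.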
oddt.scoring.functions.RFScore module
class oddt.scoring.functions.RFScore.rfscore(protein=None, n_jobs=-1, version=1, spr=0, **kwargs)
Bases: oddt.scoring.scorer
Scoring function implementing RF-Score variants. It predicts the binding affinity (pKi/d) of a ligand in a complex using simple descriptors (close contacts of atoms within 12 Å) with a sophisticated machine-learning model (random forest). The third variant supplements those contacts with Vina partial scores. For further details see the RF-Score publications: v1 [1], v2 [2], v3 [3].
- Parameters
- protein: oddt.toolkit.Molecule object
Receptor for the scored ligands
- n_jobs: int (default=-1)
Number of cores to use for scoring and training. By default (-1) all cores are allocated.
- version: int (default=1)
Scoring function variant. The default is the simplest one (v1).
- spr: int (default=0)
The minimum number of contacts in each pair of atom types in the training set for the column to be included in training. This removes infrequent and empty contacts.
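The `spr` threshold can be pictured as a column filter on the training descriptor matrix. The sketch below assumes a column is kept when at least one training complex has `spr` or more contacts of that atom-type pair; the exact rule ODDT applies may differ, and `columns_to_keep`, `apply_mask`, and the toy counts are illustrative.

```python
from typing import List, Sequence


def columns_to_keep(train_descriptors: Sequence[Sequence[int]], spr: int) -> List[int]:
    """Indices of columns whose largest contact count reaches `spr` (assumed rule)."""
    if not train_descriptors:
        return []
    n_columns = len(train_descriptors[0])
    return [
        col
        for col in range(n_columns)
        if max(row[col] for row in train_descriptors) >= spr
    ]


def apply_mask(rows: Sequence[Sequence[int]], keep: Sequence[int]) -> List[List[int]]:
    """Drop every column that is not listed in `keep`."""
    return [[row[col] for col in keep] for row in rows]


# Rows are complexes, columns are atom-type pairs (toy contact counts).
train = [
    [12, 0, 1, 0],
    [30, 0, 0, 0],
    [25, 0, 2, 0],
]
keep = columns_to_keep(train, spr=2)
filtered = apply_mask(train, keep)
```

The same `keep` indices have to be applied to the descriptors of any ligand scored later, so that training and prediction use identical columns.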
References
- [1] Ballester PJ, Mitchell JBO. A machine learning approach to predicting protein-ligand binding affinity with applications to molecular docking. Bioinformatics. 2010;26: 1169-1175. doi:10.1093/bioinformatics/btq112
- [2] Ballester PJ, Schreyer A, Blundell TL. Does a more precise chemical description of protein-ligand complexes lead to more accurate prediction of binding affinity? J Chem Inf Model. 2014;54: 944-955. doi:10.1021/ci500091r
- [3] Li H, Leung K-S, Wong M-H, Ballester PJ. Improving AutoDock Vina Using Random Forest: The Growing Accuracy of Binding Affinity Prediction by the Effective Exploitation of Larger Data Sets. Mol Inform. WILEY-VCH Verlag; 2015;34: 115-126. doi:10.1002/minf.201400132
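The close-contact descriptors mentioned in the class description count ligand-protein atom pairs, grouped by element, that lie within the distance cutoff. A minimal standard-library sketch of that counting follows. The `Atom` tuple layout, `close_contacts`, and the coordinates are made up for illustration; ODDT computes its descriptors with its own toolkit objects.

```python
from collections import Counter
from math import dist
from typing import Sequence, Tuple

# (element symbol, xyz coordinates in Å); a hypothetical layout for this sketch.
Atom = Tuple[str, Tuple[float, float, float]]


def close_contacts(ligand: Sequence[Atom], protein: Sequence[Atom], cutoff: float = 12.0) -> Counter:
    """Count (ligand element, protein element) pairs closer than `cutoff` Å."""
    counts: Counter = Counter()
    for lig_element, lig_xyz in ligand:
        for prot_element, prot_xyz in protein:
            if dist(lig_xyz, prot_xyz) < cutoff:
                counts[(lig_element, prot_element)] += 1
    return counts


ligand = [("C", (0.0, 0.0, 0.0)), ("O", (1.2, 0.0, 0.0))]
protein = [("N", (3.0, 0.0, 0.0)), ("C", (0.0, 11.0, 0.0)), ("C", (20.0, 0.0, 0.0))]
contacts = close_contacts(ligand, protein)
```

Each distinct element pair becomes one descriptor column, and the resulting count vector is what the random forest is trained on.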
Methods
- fit(ligands, target, *args, **kwargs)
Trains the model on the supplied ligands and target values.
- load([filename, version, pdbbind_version])
Loads a scoring function from a pickle file.
- predict(ligands, *args, **kwargs)
Predicts values for the supplied ligands.
- predict_ligand(ligand)
Local method to score one ligand and update its scores.
- predict_ligands(ligands)
Scores ligands lazily.
- save(filename)
Saves the scoring function to a pickle file.
- score(…)
- set_protein(protein)
Proxy method to update the protein in all relevant places.
- gen_training_data
- train
gen_training_data(pdbbind_dir, pdbbind_versions=(2007, 2012, 2013, 2014, 2015, 2016), home_dir=None, use_proteins=False)
Module contents
class oddt.scoring.functions.PLECscore(protein=None, n_jobs=-1, version='linear', depth_protein=5, depth_ligand=1, size=65536)
Bases: oddt.scoring.scorer
PLECscore is a novel scoring function based on PLEC fingerprints. The underlying model can be one of:
- linear regression
- neural network (dense, 200x200x200)
- random forest (100 trees)
The scoring function is trained on the PDBbind v2016 database and, even with the linear model, outperforms other machine-learning scoring functions in terms of the Pearson correlation coefficient on the “core set”. For details, see the PLEC publication. PLECscore predicts binding affinity (pKi/d).
New in version 0.6.
- Parameters
- protein: oddt.toolkit.Molecule object
Receptor for the scored ligands
- n_jobs: int (default=-1)
Number of cores to use for scoring and training. By default (-1) all cores are allocated.
- version: str (default='linear')
The model used by the scoring function: 'linear', 'nn' or 'rf'.
- depth_protein: int (default=5)
The depth of ECFP environments generated on the protein side of the interaction. By default 6 environments (depths 0 to 5) are generated.
- depth_ligand: int (default=1)
The depth of ECFP environments generated on the ligand side of the interaction. By default 2 environments (depths 0 to 1) are generated.
- size: int (default=65536)
The final size of the folded PLEC fingerprint. This setting does not limit the data encoded in the PLEC fingerprint (tune the depths for that), only the final length. Setting it too low leads to many collisions.
Methods
- gen_json
- gen_training_data
- train
class oddt.scoring.functions.nnscore(protein=None, n_jobs=-1)
Bases: oddt.scoring.scorer
NNScore implementation [1]. Based on BINANA descriptors [2] and an ensemble of the 20 best-scoring neural networks, each with a hidden layer of 5 nodes. NNScore predicts binding affinity (pKi/d).
- Parameters
- protein: oddt.toolkit.Molecule object
Receptor for the scored ligands
- n_jobs: int (default=-1)
Number of cores to use for scoring and training. By default (-1) all cores are allocated.
References
- [1] Durrant JD, McCammon JA. NNScore 2.0: a neural-network receptor-ligand scoring function. J Chem Inf Model. 2011;51: 2897-2903. doi:10.1021/ci2003889
- [2] Durrant JD, McCammon JA. BINANA: a novel algorithm for ligand-binding characterization. J Mol Graph Model. 2011;29: 888-893. doi:10.1016/j.jmgm.2011.01.004
Methods
- fit(ligands, target, *args, **kwargs)
Trains the model on the supplied ligands and target values.
- load([filename, pdbbind_version])
Loads a scoring function from a pickle file.
- predict(ligands, *args, **kwargs)
Predicts values for the supplied ligands.
- predict_ligand(ligand)
Local method to score one ligand and update its scores.
- predict_ligands(ligands)
Scores ligands lazily.
- save(filename)
Saves the scoring function to a pickle file.
- score(…)
- set_protein(protein)
Proxy method to update the protein in all relevant places.
- gen_training_data
- train
gen_training_data(pdbbind_dir, pdbbind_versions=(2007, 2012, 2013, 2014, 2015, 2016), home_dir=None, use_proteins=False)
class oddt.scoring.functions.rfscore(protein=None, n_jobs=-1, version=1, spr=0, **kwargs)
Bases: oddt.scoring.scorer
Scoring function implementing RF-Score variants. It predicts the binding affinity (pKi/d) of a ligand in a complex using simple descriptors (close contacts of atoms within 12 Å) with a sophisticated machine-learning model (random forest). The third variant supplements those contacts with Vina partial scores. For further details see the RF-Score publications: v1 [1], v2 [2], v3 [3].
- Parameters
- protein: oddt.toolkit.Molecule object
Receptor for the scored ligands
- n_jobs: int (default=-1)
Number of cores to use for scoring and training. By default (-1) all cores are allocated.
- version: int (default=1)
Scoring function variant. The default is the simplest one (v1).
- spr: int (default=0)
The minimum number of contacts in each pair of atom types in the training set for the column to be included in training. This removes infrequent and empty contacts.
References
- [1] Ballester PJ, Mitchell JBO. A machine learning approach to predicting protein-ligand binding affinity with applications to molecular docking. Bioinformatics. 2010;26: 1169-1175. doi:10.1093/bioinformatics/btq112
- [2] Ballester PJ, Schreyer A, Blundell TL. Does a more precise chemical description of protein-ligand complexes lead to more accurate prediction of binding affinity? J Chem Inf Model. 2014;54: 944-955. doi:10.1021/ci500091r
- [3] Li H, Leung K-S, Wong M-H, Ballester PJ. Improving AutoDock Vina Using Random Forest: The Growing Accuracy of Binding Affinity Prediction by the Effective Exploitation of Larger Data Sets. Mol Inform. WILEY-VCH Verlag; 2015;34: 115-126. doi:10.1002/minf.201400132
Methods
- fit(ligands, target, *args, **kwargs)
Trains the model on the supplied ligands and target values.
- load([filename, version, pdbbind_version])
Loads a scoring function from a pickle file.
- predict(ligands, *args, **kwargs)
Predicts values for the supplied ligands.
- predict_ligand(ligand)
Local method to score one ligand and update its scores.
- predict_ligands(ligands)
Scores ligands lazily.
- save(filename)
Saves the scoring function to a pickle file.
- score(…)
- set_protein(protein)
Proxy method to update the protein in all relevant places.
- gen_training_data
- train
gen_training_data(pdbbind_dir, pdbbind_versions=(2007, 2012, 2013, 2014, 2015, 2016), home_dir=None, use_proteins=False)