oddt.toolkits package

Submodules

oddt.toolkits.common module

Code common to all toolkits

oddt.toolkits.common.detect_secondary_structure(res_dict)[source]

Detect alpha helices and beta sheets in res_dict by phi and psi angles
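The idea of labelling residues from their backbone dihedral angles can be sketched in a few lines. This is only an illustration: the function name `classify_backbone` and the angle windows are assumptions (rough textbook Ramachandran regions), not the cut-offs ODDT uses, and the real function works on a whole res_dict rather than on single angle pairs.

```python
def classify_backbone(phi, psi):
    """Label one residue from its backbone dihedral angles (in degrees).

    The windows below are illustrative assumptions, not ODDT's values.
    """
    # Right-handed alpha-helical region of the Ramachandran plot.
    if -100 <= phi <= -30 and -80 <= psi <= -10:
        return "helix"
    # Extended (beta-sheet) region.
    if -180 <= phi <= -40 and (psi >= 90 or psi <= -150):
        return "sheet"
    return "other"


# Hypothetical (phi, psi) pairs, one per residue.
angles = [(-57, -47), (-60, -45), (-120, 130), (60, 45)]
labels = [classify_backbone(phi, psi) for phi, psi in angles]
print(labels)
```

A production implementation would typically also require several consecutive residues to share a label before calling the stretch a helix or a sheet.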

oddt.toolkits.ob module

class oddt.toolkits.ob.Atom(OBAtom)[source]

Bases: pybel.Atom

Attributes

atomicmass
atomicnum
bonds
cidx
coordidx
coords
exactmass
formalcharge
heavyvalence
heterovalence
hyb
idx
implicitvalence
isotope
neighbors
partialcharge
residue
spin
type
valence
vector
class oddt.toolkits.ob.AtomStack(OBMol)[source]

Bases: object

class oddt.toolkits.ob.Bond(OBBond)[source]

Bases: object

Attributes

atoms
isrotor
order
class oddt.toolkits.ob.BondStack(OBMol)[source]

Bases: object

class oddt.toolkits.ob.Fingerprint(fingerprint)[source]

Bases: pybel.Fingerprint

Attributes

bits
raw
class oddt.toolkits.ob.Molecule(OBMol=None, source=None, protein=False)[source]

Bases: pybel.Molecule

Attributes

OBMol
atom_dict
atoms
bonds
canonic_order Returns np.array with canonic order of heavy atoms in the molecule
charge
charges
clone
conformers
coords
data
dim
energy
exactmass
formula
molwt
num_rotors Number of strict rotatable bonds
res_dict
residues
ring_dict
smiles
spin
sssr
title
unitcell

Methods

addh([only_polar]) Add hydrogens
calccharges([model]) Estimates atomic partial charges in the molecule.
calcdesc([descnames]) Calculate descriptor values.
calcfp([fptype]) Calculate a molecular fingerprint.
clone_coords(source)
convertdbonds() Convert Dative Bonds.
draw([show, filename, update, usecoords]) Create a 2D depiction of the molecule.
localopt([forcefield, steps]) Locally optimize the coordinates.
make2D() Generate 2D coordinates for molecule
make3D([forcefield, steps]) Generate 3D coordinates
removeh() Remove hydrogens
write([format, filename, overwrite, opt, size])
OBMol
addh(only_polar=False)[source]

Add hydrogens

atom_dict
atoms
bonds
calccharges(model='mmff94')

Estimates atomic partial charges in the molecule.

Optional parameters:
model – default is “mmff94”. See the charges variable for a list
of available charge models (in shell, obabel -L charges)

This method populates the partialcharge attribute of each atom in the molecule in place.

calcdesc(descnames=[])

Calculate descriptor values.

Optional parameter:
descnames – a list of names of descriptors

If descnames is not specified, all available descriptors are calculated. See the descs variable for a list of available descriptors.

calcfp(fptype='FP2')

Calculate a molecular fingerprint.

Optional parameters:
fptype – the fingerprint type (default is “FP2”). See the
fps variable for a list of available fingerprint types.
canonic_order

Returns np.array with canonic order of heavy atoms in the molecule

charge
charges
clone
clone_coords(source)[source]
conformers
convertdbonds()

Convert Dative Bonds.

coords
data
dim
draw(show=True, filename=None, update=False, usecoords=False)

Create a 2D depiction of the molecule.

Optional parameters:

show – display on screen (default is True)
filename – write to file (default is None)
update – update the coordinates of the atoms to those determined by the structure diagram generator (default is False)
usecoords – don’t calculate 2D coordinates, just use the current coordinates (default is False)

Tkinter and Python Imaging Library are required for image display.

energy
exactmass
formula
localopt(forcefield='mmff94', steps=500)

Locally optimize the coordinates.

Optional parameters:
forcefield – default is “mmff94”. See the forcefields variable
for a list of available forcefields.

steps – default is 500

If the molecule does not have any coordinates, make3D() is called before the optimization. Note that the molecule needs to have explicit hydrogens. If not, call addh().

make2D()[source]

Generate 2D coordinates for molecule

make3D(forcefield='mmff94', steps=50)[source]

Generate 3D coordinates

molwt
num_rotors

Number of strict rotatable bonds

removeh()[source]

Remove hydrogens

res_dict
residues
ring_dict
smiles
spin
sssr
title
unitcell
write(format='smi', filename=None, overwrite=False, opt=None, size=None)[source]
class oddt.toolkits.ob.MoleculeData(obmol)[source]

Bases: pybel.MoleculeData

Methods

clear()
has_key(key)
items()
iteritems()
keys()
to_dict()
update(dictionary)
values()
class oddt.toolkits.ob.Outputfile(format, filename, overwrite=False, opt=None)[source]

Bases: pybel.Outputfile

Methods

close() Close the Outputfile to further writing.
write(molecule) Write a molecule to the output file.
close()

Close the Outputfile to further writing.

write(molecule)

Write a molecule to the output file.

Required parameters:
molecule
class oddt.toolkits.ob.Residue(OBResidue)[source]

Bases: object

Represent a Pybel residue.

Required parameter:
OBResidue – an Open Babel OBResidue
Attributes:
atoms, idx, name.

(refer to the Open Babel library documentation for more info).

The original Open Babel residue can be accessed using the attribute:
OBResidue

Attributes

atoms
idx
name
class oddt.toolkits.ob.ResidueStack(OBMol)[source]

Bases: object

class oddt.toolkits.ob.Smarts(smartspattern)[source]

Bases: pybel.Smarts

Initialise with a SMARTS pattern.

Methods

findall(molecule) Find all matches of the SMARTS pattern to a particular molecule.
match(molecule) Checks if there is any match.
findall(molecule)

Find all matches of the SMARTS pattern to a particular molecule.

Required parameters:
molecule
match(molecule)[source]

Checks if there is any match. Returns True or False

oddt.toolkits.ob.readfile(format, filename, opt=None, lazy=False)[source]

oddt.toolkits.rdk module

rdkit - A Cinfony module for accessing the RDKit from CPython

Global variables:
Chem and AllChem - the underlying RDKit Python bindings
informats - a dictionary of supported input formats
outformats - a dictionary of supported output formats
descs - a list of supported descriptors
fps - a list of supported fingerprint types
forcefields - a list of supported forcefields
class oddt.toolkits.rdk.Atom(Atom)[source]

Bases: object

Represent an rdkit Atom.

Required parameters:
Atom – an RDKit Atom
Attributes:
atomicnum, coords, formalcharge
The original RDKit Atom can be accessed using the attribute:
Atom

Attributes

atomicnum
bonds
coords
formalcharge
idx Note that this index is 1-based, whereas RDKit’s internal index is 0-based.
neighbors
partialcharge
atomicnum
bonds
coords
formalcharge
idx

Note that this index is 1-based, whereas RDKit’s internal index is 0-based. It was changed to be compatible with OpenBabel.

neighbors
partialcharge
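Because idx is 1-based while RDKit counts atoms from 0, code that mixes the wrapper with raw RDKit calls has to shift indices by one. A minimal sketch of that conversion follows; the helper names `to_rdkit_index` and `to_oddt_index` are hypothetical and not part of the package.

```python
def to_rdkit_index(oddt_idx):
    """Convert a 1-based wrapper atom index to RDKit's 0-based index."""
    if oddt_idx < 1:
        raise ValueError("wrapper atom indices start at 1")
    return oddt_idx - 1


def to_oddt_index(rdkit_idx):
    """Convert a 0-based RDKit atom index to the 1-based wrapper index."""
    if rdkit_idx < 0:
        raise ValueError("RDKit atom indices start at 0")
    return rdkit_idx + 1


# The first atom of a molecule is idx 1 in the wrapper and 0 in RDKit.
first_atom_in_rdkit = to_rdkit_index(1)
print(first_atom_in_rdkit)
```

Keeping the shift in one pair of helpers makes off-by-one mistakes easier to spot than scattering `+ 1` and `- 1` through the calling code.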
class oddt.toolkits.rdk.AtomStack(Mol)[source]

Bases: object

class oddt.toolkits.rdk.Bond(Bond)[source]

Bases: object

Attributes

atoms
isrotor
order
class oddt.toolkits.rdk.BondStack(Mol)[source]

Bases: object

class oddt.toolkits.rdk.Fingerprint(fingerprint)[source]

Bases: object

A Molecular Fingerprint.

Required parameters:
fingerprint – a vector calculated by one of the fingerprint methods
Attributes:
fp – the underlying fingerprint object
bits – a list of bits set in the Fingerprint
Methods:

The “|” operator can be used to calculate the Tanimoto coefficient. For example, given two Fingerprints ‘a’ and ‘b’, the Tanimoto coefficient is given by:

tanimoto = a | b
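To make the operator concrete, here is a small stand-alone sketch of how `|` can be overloaded to return a Tanimoto coefficient. `BitFingerprint` is a hypothetical stand-in that stores set-bit indices directly; it is not the package's Fingerprint class, which wraps an RDKit fingerprint object.

```python
class BitFingerprint:
    """Stand-in fingerprint holding the indices of the bits that are set."""

    def __init__(self, bits):
        self.bits = sorted(set(bits))

    def __or__(self, other):
        """Return the Tanimoto coefficient: shared bits / distinct bits."""
        a, b = set(self.bits), set(other.bits)
        union = a | b
        if not union:
            return 0.0  # convention chosen here for two empty fingerprints
        return len(a & b) / len(union)


a = BitFingerprint([1, 5, 9, 12])
b = BitFingerprint([1, 5, 10])
tanimoto = a | b  # 2 shared bits out of 5 distinct bits
print(tanimoto)
```

The coefficient is 1.0 for identical bit sets and 0.0 when no bits are shared.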

Attributes

raw
class oddt.toolkits.rdk.Molecule(Mol=None, source=None, protein=False)[source]

Bases: object

Represent an rdkit Molecule.

Required parameter:
Mol – an RDKit Mol or any type of cinfony Molecule
Attributes:
atoms, data, formula, molwt, title
Methods:
addh(), calcfp(), calcdesc(), draw(), localopt(), make3D(), removeh(), write()
The underlying RDKit Mol can be accessed using the attribute:
Mol

Attributes

Mol
atom_dict
atoms
bonds
canonic_order Returns np.array with canonic order of heavy atoms in the molecule
charges
clone
coords
data
formula
molwt
num_rotors
res_dict
residues
ring_dict
smiles
sssr
title

Methods

addh([only_polar]) Add hydrogens.
calcdesc([descnames]) Calculate descriptor values.
calcfp([fptype, opt]) Calculate a molecular fingerprint.
clone_coords(source)
localopt([forcefield, steps]) Locally optimize the coordinates.
make2D() Generate 2D coordinates for molecule
make3D([forcefield, steps]) Generate 3D coordinates.
removeh(**kwargs) Remove hydrogens.
write([format, filename, overwrite, size]) Write the molecule to a file or return a string.
Mol
addh(only_polar=False, **kwargs)[source]

Add hydrogens.

atom_dict
atoms
bonds
calcdesc(descnames=None)[source]

Calculate descriptor values.

Optional parameter:
descnames – a list of names of descriptors

If descnames is not specified, all available descriptors are calculated. See the descs variable for a list of available descriptors.
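The "no names means all descriptors" behaviour can be sketched as follows. Everything here is a toy: the registry `DESCRIPTORS`, the function `calc_descriptors`, and the list-of-element-symbols input are assumptions made for illustration and do not reflect how the package computes RDKit descriptors.

```python
# Hypothetical registry: descriptor name -> function of a list of element symbols.
DESCRIPTORS = {
    "atom_count": len,
    "carbon_count": lambda atoms: atoms.count("C"),
    "hetero_count": lambda atoms: sum(1 for a in atoms if a not in ("C", "H")),
}


def calc_descriptors(atoms, descnames=None):
    """Return {name: value}; compute every registered descriptor by default."""
    names = list(descnames) if descnames else list(DESCRIPTORS)
    unknown = [name for name in names if name not in DESCRIPTORS]
    if unknown:
        raise ValueError("unknown descriptors: %s" % ", ".join(unknown))
    return {name: DESCRIPTORS[name](atoms) for name in names}


thiophene = ["C", "C", "C", "C", "S"]  # heavy atoms only
all_values = calc_descriptors(thiophene)
selected = calc_descriptors(thiophene, ["hetero_count"])
print(all_values)
print(selected)
```

Using `None` rather than a mutable `[]` as the default avoids sharing one list object between calls.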

calcfp(fptype='rdkit', opt=None)[source]

Calculate a molecular fingerprint.

Optional parameters:
fptype – the fingerprint type (default is “rdkit”). See the
fps variable for a list of available fingerprint types.
opt – a dictionary of options for fingerprints. Currently only used
for radius and bitInfo in Morgan fingerprints.
canonic_order

Returns np.array with canonic order of heavy atoms in the molecule

charges
clone
clone_coords(source)[source]
coords
data
formula
localopt(forcefield='uff', steps=500)[source]

Locally optimize the coordinates.

Optional parameters:
forcefield – default is “uff”. See the forcefields variable
for a list of available forcefields.

steps – default is 500

If the molecule does not have any coordinates, make3D() is called before the optimization.

make2D()[source]

Generate 2D coordinates for molecule

make3D(forcefield='mmff94', steps=50)[source]

Generate 3D coordinates.

Optional parameters:
forcefield – default is “mmff94”. See the forcefields variable
for a list of available forcefields.

steps – default is 50

Once coordinates are generated, a quick local optimization is carried out with the chosen forcefield and number of steps (MMFF94 and 50 steps by default). Call localopt() if you want to improve the coordinates further.

molwt
num_rotors
removeh(**kwargs)[source]

Remove hydrogens.

res_dict
residues
ring_dict
smiles
sssr
title
write(format='smi', filename=None, overwrite=False, size=None, **kwargs)[source]

Write the molecule to a file or return a string.

Optional parameters:
format – see the outformats variable for a list of available
output formats (default is “smi”)

filename – default is None

overwrite – if the output file already exists, should it
be overwritten? (default is False)

If a filename is specified, the result is written to a file. Otherwise, a string is returned containing the result.

To write multiple molecules to the same file you should use the Outputfile class.

class oddt.toolkits.rdk.MoleculeData(Mol)[source]

Bases: object

Store molecule data in a dictionary-type object

Required parameters:
Mol – an RDKit Mol

Methods and accessor methods are like those of a dictionary except that the data is retrieved on-the-fly from the underlying Mol.

Example:
>>> mol = next(readfile("sdf", "head.sdf"))
>>> data = mol.data
>>> print(data)
{'Comment': 'CORINA 2.61 0041 25.10.2001', 'NSC': '1'}
>>> print(len(data), data.keys(), data.has_key("NSC"))
2 ['Comment', 'NSC'] True
>>> print(data['Comment'])
CORINA 2.61 0041 25.10.2001
>>> data['Comment'] = 'This is a new comment'
>>> for k,v in data.items():
...     print(k, "-->", v)
Comment --> This is a new comment
NSC --> 1
>>> del data['NSC']
>>> print(len(data), data.keys(), data.has_key("NSC"))
1 ['Comment'] False
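The "retrieved on-the-fly" behaviour can be illustrated with a mapping that stores nothing itself and forwards every operation to the object it wraps. In this sketch `FakeMol` and `LiveData` are hypothetical stand-ins; `FakeMol` only mimics the property calls (GetPropNames, GetProp, SetProp, ClearProp) that an RDKit Mol offers, so the example runs without RDKit installed.

```python
from collections.abc import MutableMapping


class FakeMol:
    """Stand-in for an RDKit Mol; only the property calls used below exist."""

    def __init__(self, props):
        self._props = dict(props)

    def GetPropNames(self):
        return list(self._props)

    def GetProp(self, key):
        return self._props[key]

    def SetProp(self, key, value):
        self._props[key] = str(value)

    def ClearProp(self, key):
        del self._props[key]


class LiveData(MutableMapping):
    """Dictionary-like view that never caches: every call asks the Mol."""

    def __init__(self, mol):
        self._mol = mol

    def __getitem__(self, key):
        if key not in self._mol.GetPropNames():
            raise KeyError(key)
        return self._mol.GetProp(key)

    def __setitem__(self, key, value):
        self._mol.SetProp(key, value)

    def __delitem__(self, key):
        if key not in self._mol.GetPropNames():
            raise KeyError(key)
        self._mol.ClearProp(key)

    def __iter__(self):
        return iter(self._mol.GetPropNames())

    def __len__(self):
        return len(self._mol.GetPropNames())


mol = FakeMol({"Comment": "CORINA 2.61 0041 25.10.2001", "NSC": "1"})
data = LiveData(mol)
data["Comment"] = "This is a new comment"  # written straight to the Mol
del data["NSC"]
print(len(data), list(data.keys()), "NSC" in data)
```

Subclassing `MutableMapping` supplies keys(), items(), values(), update() and clear() from the five methods defined above.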

Methods

clear()
has_key(key)
items()
iteritems()
keys()
to_dict()
update(dictionary)
values()
class oddt.toolkits.rdk.Outputfile(format, filename, overwrite=False)[source]

Bases: object

Represent a file to which output is to be sent.

Required parameters:
format - see the outformats variable for a list of available
output formats

filename

Optional parameters:
overwrite – if the output file already exists, should it
be overwritten? (default is False)
Methods:
write(molecule), close()

Methods

close() Close the Outputfile to further writing.
write(molecule) Write a molecule to the output file.
close()[source]

Close the Outputfile to further writing.

write(molecule)[source]

Write a molecule to the output file.

Required parameters:
molecule
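The write-then-close life cycle and the overwrite guard described for Outputfile can be sketched with plain text records. `TextOutputfile` is a hypothetical stand-in that writes strings rather than molecules; the exception type and messages are assumptions, not the package's behaviour.

```python
import os
import tempfile


class TextOutputfile:
    """Stand-in writer with the same write()/close() life cycle."""

    def __init__(self, filename, overwrite=False):
        # Refuse to clobber an existing file unless explicitly allowed.
        if os.path.exists(filename) and not overwrite:
            raise IOError("%s already exists; pass overwrite=True" % filename)
        self.filename = filename
        self.total = 0
        self._handle = open(filename, "w")

    def write(self, record):
        if self._handle is None:
            raise IOError("this output file has been closed")
        self._handle.write(record + "\n")
        self.total += 1

    def close(self):
        if self._handle is not None:
            self._handle.close()
            self._handle = None


path = os.path.join(tempfile.mkdtemp(), "out.smi")

out = TextOutputfile(path)
for smiles in ["CCO", "c1ccccc1"]:
    out.write(smiles)
out.close()

try:
    TextOutputfile(path)  # file exists and overwrite defaults to False
    refused = False
except IOError:
    refused = True
print(out.total, refused)
```

Opening the file once and appending records is what makes this pattern preferable to calling a per-molecule write() with the same filename repeatedly.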
class oddt.toolkits.rdk.Residue(ParentMol, atom_path)[source]

Bases: object

Represent an RDKit residue.

Required parameters:
ParentMol – Parent molecule (Mol) object
atom_path – atoms path of a residue
Attributes:
atoms, idx, name.

(refer to the Open Babel library documentation for more info).

The Mol object constructed from the residue’s atoms can be accessed using the attribute:
Residue

Attributes

atoms
idx
name
class oddt.toolkits.rdk.ResidueStack(Mol, paths)[source]

Bases: object

class oddt.toolkits.rdk.Smarts(smartspattern)[source]

Bases: object

Initialise with a SMARTS pattern.

Methods

findall(molecule) Find all matches of the SMARTS pattern to a particular molecule.
match(molecule) Check whether the SMARTS pattern matches a particular molecule.
findall(molecule)[source]

Find all matches of the SMARTS pattern to a particular molecule.

Required parameters:
molecule
match(molecule)[source]

Check whether the SMARTS pattern matches a particular molecule.

Required parameters:
molecule
oddt.toolkits.rdk.base_feature_factory = <rdkit.Chem.rdMolChemicalFeatures.MolChemicalFeatureFactory object>

Global feature factory based on BaseFeatures.fdef

oddt.toolkits.rdk.descs = ['fr_C_O_noCOO', 'PEOE_VSA3', 'Chi4v', 'fr_Ar_COO', 'fr_SH', 'Chi4n', 'SMR_VSA10', 'fr_para_hydroxylation', 'fr_barbitur', 'fr_Ar_NH', 'fr_halogen', 'fr_dihydropyridine', 'fr_priamide', 'SlogP_VSA4', 'fr_guanido', 'MinPartialCharge', 'fr_furan', 'fr_morpholine', 'fr_nitroso', 'NumAromaticCarbocycles', 'fr_COO2', 'fr_amidine', 'SMR_VSA7', 'fr_benzodiazepine', 'ExactMolWt', 'fr_Imine', 'MolWt', 'fr_hdrzine', 'fr_urea', 'NumAromaticRings', 'fr_quatN', 'NumSaturatedHeterocycles', 'NumAliphaticHeterocycles', 'fr_benzene', 'fr_phos_acid', 'fr_sulfone', 'VSA_EState10', 'fr_aniline', 'fr_N_O', 'fr_sulfonamd', 'fr_thiazole', 'TPSA', 'EState_VSA8', 'PEOE_VSA14', 'PEOE_VSA13', 'PEOE_VSA12', 'PEOE_VSA11', 'PEOE_VSA10', 'BalabanJ', 'fr_lactone', 'fr_Al_COO', 'EState_VSA10', 'EState_VSA11', 'HeavyAtomMolWt', 'fr_nitro_arom', 'Chi0', 'Chi1', 'NumAliphaticRings', 'MolLogP', 'fr_nitro', 'fr_Al_OH', 'fr_azo', 'NumAliphaticCarbocycles', 'fr_C_O', 'fr_ether', 'fr_phenol_noOrthoHbond', 'fr_alkyl_halide', 'NumValenceElectrons', 'fr_aryl_methyl', 'fr_Ndealkylation2', 'MinEStateIndex', 'fr_term_acetylene', 'HallKierAlpha', 'fr_C_S', 'fr_thiocyan', 'fr_ketone_Topliss', 'VSA_EState4', 'Ipc', 'VSA_EState6', 'VSA_EState7', 'VSA_EState1', 'VSA_EState2', 'VSA_EState3', 'fr_HOCCN', 'fr_phos_ester', 'BertzCT', 'SlogP_VSA12', 'EState_VSA9', 'SlogP_VSA10', 'SlogP_VSA11', 'fr_COO', 'NHOHCount', 'fr_unbrch_alkane', 'NumSaturatedRings', 'MaxPartialCharge', 'fr_methoxy', 'fr_thiophene', 'SlogP_VSA8', 'SlogP_VSA9', 'MinAbsPartialCharge', 'SlogP_VSA5', 'SlogP_VSA6', 'SlogP_VSA7', 'SlogP_VSA1', 'SlogP_VSA2', 'SlogP_VSA3', 'NumRadicalElectrons', 'fr_NH2', 'fr_piperzine', 'fr_nitrile', 'NumHeteroatoms', 'fr_NH1', 'fr_NH0', 'MaxAbsEStateIndex', 'LabuteASA', 'fr_amide', 'Chi3n', 'fr_imidazole', 'SMR_VSA3', 'SMR_VSA2', 'SMR_VSA1', 'Chi3v', 'SMR_VSA6', 'Kappa3', 'Kappa2', 'EState_VSA6', 'EState_VSA7', 'SMR_VSA9', 'EState_VSA5', 'EState_VSA2', 'EState_VSA3', 'fr_Ndealkylation1', 
'EState_VSA1', 'fr_ketone', 'SMR_VSA5', 'MinAbsEStateIndex', 'fr_diazo', 'SMR_VSA4', 'fr_Ar_N', 'fr_Nhpyrrole', 'fr_ester', 'VSA_EState5', 'EState_VSA4', 'NumHDonors', 'fr_prisulfonamd', 'fr_oxime', 'SMR_VSA8', 'fr_isocyan', 'Chi2n', 'Chi2v', 'HeavyAtomCount', 'fr_azide', 'NumHAcceptors', 'fr_lactam', 'fr_allylic_oxid', 'VSA_EState8', 'fr_oxazole', 'VSA_EState9', 'fr_piperdine', 'fr_Ar_OH', 'fr_sulfide', 'fr_alkyl_carbamate', 'NOCount', 'Chi1n', 'PEOE_VSA8', 'PEOE_VSA7', 'PEOE_VSA6', 'PEOE_VSA5', 'PEOE_VSA4', 'MaxEStateIndex', 'PEOE_VSA2', 'PEOE_VSA1', 'NumSaturatedCarbocycles', 'fr_imide', 'FractionCSP3', 'Chi1v', 'fr_Al_OH_noTert', 'fr_epoxide', 'fr_hdrzone', 'fr_isothiocyan', 'NumAromaticHeterocycles', 'fr_bicyclic', 'Kappa1', 'Chi0n', 'fr_phenol', 'MolMR', 'PEOE_VSA9', 'fr_aldehyde', 'fr_pyridine', 'fr_tetrazole', 'RingCount', 'fr_nitro_arom_nonortho', 'Chi0v', 'fr_ArN', 'NumRotatableBonds', 'MaxAbsPartialCharge']

A list of supported descriptors

oddt.toolkits.rdk.forcefields = ['mmff94', 'uff']

A list of supported forcefields

oddt.toolkits.rdk.fps = ['rdkit', 'layered', 'maccs', 'atompairs', 'torsions', 'morgan']

A list of supported fingerprint types

oddt.toolkits.rdk.informats = {'inchi': 'InChI', 'mol2': 'Tripos MOL2 file', 'sdf': 'MDL SDF file', 'smi': 'SMILES', 'mol': 'MDL MOL file'}

A dictionary of supported input formats

oddt.toolkits.rdk.outformats = {'inchikey': 'InChIKey', 'sdf': 'MDL SDF file', 'can': 'Canonical SMILES', 'smi': 'SMILES', 'mol': 'MDL MOL file', 'inchi': 'InChI'}

A dictionary of supported output formats

oddt.toolkits.rdk.readfile(format, filename, lazy=False, opt=None, *args, **kwargs)[source]

Iterate over the molecules in a file.

Required parameters:
format - see the informats variable for a list of available
input formats

filename

You can access the first molecule in a file by calling next() on the iterator:

mol = next(readfile("smi", "myfile.smi"))

You can make a list of the molecules in a file using:

mols = list(readfile("smi", "myfile.smi"))

You can iterate over the molecules in a file as shown in the following code snippet:

>>> atomtotal = 0
>>> for mol in readfile("sdf", "head.sdf"):
...     atomtotal += len(mol.atoms)
...
>>> print(atomtotal)
43
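The difference between taking one item with next() and materialising everything with list() comes from the reader being an iterator. The sketch below shows the same pattern with a plain generator over a SMILES file; `read_smiles` is a hypothetical helper that yields (smiles, title) tuples rather than Molecule objects, and the file contents are made up for the example.

```python
import os
import tempfile


def read_smiles(filename):
    """Yield one (smiles, title) record per non-empty line, lazily."""
    with open(filename) as handle:
        for line in handle:
            fields = line.split(None, 1)
            if fields:
                title = fields[1].strip() if len(fields) > 1 else ""
                yield fields[0], title


# Build a small input file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "myfile.smi")
with open(path, "w") as handle:
    handle.write("CCO ethanol\nc1ccccc1 benzene\nC1=CC=CS1 thiophene\n")

first = next(read_smiles(path))    # only the first line is parsed
records = list(read_smiles(path))  # the whole file is consumed
print(first, len(records))
```

Iterating lazily keeps memory use flat for large files, at the cost of needing a fresh iterator for each pass over the data.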

oddt.toolkits.rdk.readstring(format, string, **kwargs)[source]

Read in a molecule from a string.

Required parameters:
format - see the informats variable for a list of available
input formats

string

Example:
>>> input = "C1=CC=CS1"
>>> mymol = readstring("smi", input)
>>> len(mymol.atoms)
5

Module contents