oddt.toolkits package
Submodules
oddt.toolkits.ob module
class oddt.toolkits.ob.Residue(OBResidue)
Bases: object
Represent a Pybel residue.
- Required parameter:
- OBResidue – an Open Babel OBResidue
- Attributes:
- atoms, idx, name.
(refer to the Open Babel library documentation for more info).
- The original Open Babel residue can be accessed using the attribute:
- OBResidue
Attributes
atoms
idx
name
- atoms
- idx
- name
oddt.toolkits.rdk module
rdkit - A Cinfony module for accessing the RDKit from CPython
- Global variables:
- Chem and AllChem - the underlying RDKit Python bindings
- informats - a dictionary of supported input formats
- outformats - a dictionary of supported output formats
- descs - a list of supported descriptors
- fps - a list of supported fingerprint types
- forcefields - a list of supported forcefields
class oddt.toolkits.rdk.Atom(Atom)
Bases: object
Represent an rdkit Atom.
- Required parameters:
- Atom – an RDKit Atom
- Attributes:
- atomicnum, coords, formalcharge
- The original RDKit Atom can be accessed using the attribute:
- Atom
Attributes
atomicnum
coords
formalcharge
idx
neighbors
partialcharge
- atomicnum
- coords
- formalcharge
- idx
- neighbors
- partialcharge
class oddt.toolkits.rdk.Fingerprint(fingerprint)
Bases: object
A Molecular Fingerprint.
- Required parameters:
- fingerprint – a vector calculated by one of the fingerprint methods
- Attributes:
- fp – the underlying fingerprint object
- bits – a list of bits set in the Fingerprint
- Methods:
The "|" operator can be used to calculate the Tanimoto coefficient. For example, given two Fingerprints 'a' and 'b', the Tanimoto coefficient is given by:
tanimoto = a | b
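To make the arithmetic behind `a | b` concrete, here is a minimal sketch that computes a Tanimoto coefficient on plain Python sets of bit indices. `BitFingerprint` is a hypothetical stand-in written for this illustration, not the ODDT or RDKit class; it only assumes that a fingerprint can be reduced to the set of its "on" bits.

```python
class BitFingerprint(object):
    """Hypothetical stand-in for a fingerprint: a set of on-bit indices."""

    def __init__(self, bits):
        self.bits = set(bits)

    def __or__(self, other):
        # Tanimoto coefficient: bits set in both, divided by bits set in either.
        union = self.bits | other.bits
        if not union:
            return 0.0  # assumption: two empty fingerprints score 0
        return len(self.bits & other.bits) / float(len(union))


a = BitFingerprint([1, 5, 9, 12])
b = BitFingerprint([1, 5, 20])
tanimoto = a | b  # 2 shared bits out of 5 distinct bits
print(tanimoto)
```

Identical bit sets give 1.0 and disjoint sets give 0.0, so the value can be read directly as a similarity score.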
Attributes
raw
- raw
class oddt.toolkits.rdk.Molecule(Mol=None, source=None, protein=False)
Bases: object
Represent an rdkit Molecule.
- Required parameter:
- Mol – an RDKit Mol or any type of cinfony Molecule
- Attributes:
- atoms, data, formula, molwt, title
- Methods:
- addh(), calcfp(), calcdesc(), draw(), localopt(), make3D(), removeh(), write()
- The underlying RDKit Mol can be accessed using the attribute:
- Mol
Attributes
Mol
atom_dict
atoms
canonic_order
Returns np.array with canonic order of heavy atoms in the molecule
charges
clone
coords
data
formula
molwt
num_rotors
res_dict
ring_dict
sssr
title
Methods
addh() – Add hydrogens.
calcdesc([descnames]) – Calculate descriptor values.
calcfp([fptype, opt]) – Calculate a molecular fingerprint.
clone_coords(source)
draw([show, filename, update, usecoords]) – Create a 2D depiction of the molecule.
localopt([forcefield, steps]) – Locally optimize the coordinates.
make3D([forcefield, steps]) – Generate 3D coordinates.
removeh() – Remove hydrogens.
write([format, filename, overwrite]) – Write the molecule to a file or return a string.
- Mol
- atom_dict
- atoms
calcdesc(descnames=[])
Calculate descriptor values.
- Optional parameter:
- descnames – a list of names of descriptors
If descnames is not specified, all available descriptors are calculated. See the descs variable for a list of available descriptors.
calcfp(fptype='rdkit', opt=None)
Calculate a molecular fingerprint.
- Optional parameters:
- fptype – the fingerprint type (default is "rdkit"). See the fps variable for a list of available fingerprint types.
- opt – a dictionary of options for fingerprints. Currently only used for radius and bitInfo in Morgan fingerprints.
- canonic_order – Returns an np.array with the canonic order of heavy atoms in the molecule
- charges
- clone
- coords
- data
draw(show=True, filename=None, update=False, usecoords=False)
Create a 2D depiction of the molecule.
- Optional parameters:
- show – display on screen (default is True)
- filename – write to file (default is None)
- update – update the coordinates of the atoms to those determined by the structure diagram generator (default is False)
- usecoords – don't calculate 2D coordinates, just use the current coordinates (default is False)
Aggdraw or Cairo is used for 2D depiction. Tkinter and Python Imaging Library are required for image display.
- formula
localopt(forcefield='uff', steps=500)
Locally optimize the coordinates.
- Optional parameters:
- forcefield – default is "uff". See the forcefields variable for a list of available forcefields.
- steps – default is 500
If the molecule does not have any coordinates, make3D() is called before the optimization.
make3D(forcefield='uff', steps=50)
Generate 3D coordinates.
- Optional parameters:
- forcefield – default is "uff". See the forcefields variable for a list of available forcefields.
- steps – default is 50
Once coordinates are generated, a quick local optimization is carried out with 50 steps and the UFF forcefield. Call localopt() if you want to improve the coordinates further.
- molwt
- num_rotors
- res_dict
- ring_dict
- sssr
- title
write(format='smi', filename=None, overwrite=False, **kwargs)
Write the molecule to a file or return a string.
- Optional parameters:
- format – see the outformats variable for a list of available output formats (default is "smi")
- filename – default is None
- overwrite – if the output file already exists, should it be overwritten? (default is False)
If a filename is specified, the result is written to a file. Otherwise, a string is returned containing the result.
To write multiple molecules to the same file you should use the Outputfile class.
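The "write to a file or return a string" convention, together with the overwrite flag, can be sketched in plain Python as follows. `write_text` is a hypothetical helper written for this illustration, not part of ODDT, and raising IOError when the file already exists is an assumption about how a refusal to overwrite might be signalled.

```python
import os
import tempfile


def write_text(text, filename=None, overwrite=False):
    """Return text if no filename is given, otherwise write it to filename."""
    if filename is None:
        return text
    if not overwrite and os.path.isfile(filename):
        # Assumption: refusing to overwrite is signalled with IOError.
        raise IOError("%s already exists; pass overwrite=True" % filename)
    with open(filename, "w") as handle:
        handle.write(text)
    return None


# No filename: the string comes straight back.
smiles = write_text("CCO\n")

# With a filename: the first write succeeds, the second is refused
# until overwrite=True is passed.
path = os.path.join(tempfile.mkdtemp(), "ethanol.smi")
write_text("CCO\n", path)
try:
    write_text("CCO\n", path)
    refused = False
except IOError:
    refused = True
write_text("CCO\n", path, overwrite=True)
```

Defaulting overwrite to False means an existing results file is never replaced silently; the caller has to opt in.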
class oddt.toolkits.rdk.MoleculeData(Mol)
Bases: object
Store molecule data in a dictionary-type object
- Required parameters:
- Mol – an RDKit Mol
Methods and accessor methods are like those of a dictionary except that the data is retrieved on-the-fly from the underlying Mol.
Example:
>>> mol = readfile("sdf", 'head.sdf').next()
>>> data = mol.data
>>> print data
{'Comment': 'CORINA 2.61 0041 25.10.2001', 'NSC': '1'}
>>> print len(data), data.keys(), data.has_key("NSC")
2 ['Comment', 'NSC'] True
>>> print data['Comment']
CORINA 2.61 0041 25.10.2001
>>> data['Comment'] = 'This is a new comment'
>>> for k,v in data.iteritems():
...     print k, "-->", v
Comment --> This is a new comment
NSC --> 1
>>> del data['NSC']
>>> print len(data), data.keys(), data.has_key("NSC")
1 ['Comment'] False
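The idea of a dictionary-like object that fetches its data on the fly from the underlying molecule can be sketched with the standard library's MutableMapping base class. `FakeMol` and `LiveData` are hypothetical names written for this illustration and do not reflect how ODDT implements MoleculeData; the sketch uses Python 3 syntax, where `key in data` and `data.items()` take the place of the Python 2 `has_key()` and `iteritems()` calls on built-in dictionaries.

```python
from collections.abc import MutableMapping


class FakeMol(object):
    """Hypothetical stand-in for a molecule that stores string properties."""

    def __init__(self, props):
        self._props = dict(props)


class LiveData(MutableMapping):
    """Dictionary-like view that reads and writes through to the molecule."""

    def __init__(self, mol):
        self._mol = mol  # nothing is copied; every access goes to the molecule

    def __getitem__(self, key):
        return self._mol._props[key]

    def __setitem__(self, key, value):
        self._mol._props[key] = value

    def __delitem__(self, key):
        del self._mol._props[key]

    def __iter__(self):
        return iter(self._mol._props)

    def __len__(self):
        return len(self._mol._props)


mol = FakeMol({"Comment": "CORINA 2.61 0041 25.10.2001", "NSC": "1"})
data = LiveData(mol)
data["Comment"] = "This is a new comment"  # changes the molecule itself
del data["NSC"]
remaining = sorted(data.keys())
```

Because the view holds no copy of the data, a change made through it is visible on the molecule immediately, and MutableMapping supplies keys(), items(), values(), update() and clear() from the five methods defined above.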
Methods
clear()
has_key(key)
items()
iteritems()
keys()
update(dictionary)
values()
class oddt.toolkits.rdk.Outputfile(format, filename, overwrite=False)
Bases: object
Represent a file to which output is to be sent.
- Required parameters:
- format – see the outformats variable for a list of available output formats
- filename
- Optional parameters:
- overwrite – if the output file already exists, should it be overwritten? (default is False)
- Methods:
- write(molecule), close()
Methods
close() – Close the Outputfile to further writing.
write(molecule) – Write a molecule to the output file.
class oddt.toolkits.rdk.Smarts(smartspattern)
Bases: object
A Smarts Pattern Matcher
- Required parameters:
- smartspattern
- Methods:
- findall(molecule)
Example:
>>> mol = readstring("smi", "CCN(CC)CC") # triethylamine
>>> smarts = Smarts("[#6][#6]") # Matches an ethyl group
>>> print smarts.findall(mol)
[(0, 1), (3, 4), (5, 6)]
The numbers returned are the indices (starting from 0) of the atoms that match the SMARTS pattern. In this case, there are three matches, one for each of the three ethyl groups in the molecule.
Methods
findall(molecule) – Find all matches of the SMARTS pattern to a particular molecule.
oddt.toolkits.rdk.descs = []
A list of supported descriptors
oddt.toolkits.rdk.forcefields = ['uff']
A list of supported forcefields
oddt.toolkits.rdk.fps = ['rdkit', 'layered', 'maccs', 'atompairs', 'torsions', 'morgan']
A list of supported fingerprint types
oddt.toolkits.rdk.informats = {'inchi': 'InChI', 'mol2': 'Tripos MOL2 file', 'sdf': 'MDL SDF file', 'smi': 'SMILES', 'mol': 'MDL MOL file'}
A dictionary of supported input formats
oddt.toolkits.rdk.outformats = {'inchikey': 'InChIKey', 'sdf': 'MDL SDF file', 'can': 'Canonical SMILES', 'smi': 'SMILES', 'mol': 'MDL MOL file', 'inchi': 'InChI'}
A dictionary of supported output formats
oddt.toolkits.rdk.readfile(format, filename, *args, **kwargs)
Iterate over the molecules in a file.
- Required parameters:
- format – see the informats variable for a list of available input formats
- filename
You can access the first molecule in a file using the next() method of the iterator:
mol = readfile("smi", "myfile.smi").next()
You can make a list of the molecules in a file using:
mols = list(readfile("smi", "myfile.smi"))
You can iterate over the molecules in a file as shown in the following code snippet:
>>> atomtotal = 0
>>> for mol in readfile("sdf", "head.sdf"):
...     atomtotal += len(mol.atoms)
...
>>> print atomtotal
43
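The three access patterns above (first item, full list, plain loop) apply to any Python iterator. The sketch below shows them on a generator that reads SMILES lines lazily. `read_smiles` and the file contents are hypothetical and written for this illustration; it is not ODDT's readfile. The built-in `next(iterator)` is used because the `.next()` method exists only on Python 2 iterators.

```python
import os
import tempfile


def read_smiles(filename):
    """Yield one (smiles, title) pair per non-empty line, lazily."""
    with open(filename) as handle:
        for line in handle:
            fields = line.split(None, 1)
            if not fields:
                continue  # skip blank lines
            title = fields[1].strip() if len(fields) > 1 else ""
            yield fields[0], title


# Build a small example file (contents are made up for the illustration).
path = os.path.join(tempfile.mkdtemp(), "myfile.smi")
with open(path, "w") as handle:
    handle.write("CCO ethanol\nCCN(CC)CC triethylamine\nc1ccccc1 benzene\n")

first = next(read_smiles(path))            # first record only
records = list(read_smiles(path))          # everything, as a list
total = sum(1 for _ in read_smiles(path))  # plain iteration
```

Reading lazily keeps memory use flat for large files; building the list is convenient only when every record is needed at once.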