Python Scripting¶
Most of the functionality in msys is exposed in its Python interface.
Overview¶
This section introduces the Python interface and explains how to use it effectively. We begin by introducing some concepts that pervade the Python interface, then move to some examples.
Msys ids¶
In Msys, instances of the Atom, Bond, Residue, and Chain classes
are all Handles, in the sense that they refer to a piece of data held
by the parent System. All Msys handles have an immutable id
property that uniquely identifies them within their parent System.
Objects that hold references to other objects do so through the id
of that object. Two handles of the same type will compare equal to each
other if and only if they belong the same System and possess the same
id
.
When you load a system from a file, or create one from scratch, these
ids
will be numbered consecutively, starting at zero. Deleting
Atoms, Bonds, etc. from the System can introduce gaps in the set of
ids
, but, once created, the id
of an object never changes.
When Msys writes a DMS file, the primary keys of the particles will
be contiguous starting at 0, and will appear in the order in which the
particles appear in the System, even if the ids
of the atoms in the
System are noncontiguous due to deletions. When Msys loads a DMS file,
if the primary keys happen to be noncontiguous, Msys will still create a
System with the usual contiguous ids.
Msys properties¶
Many objects in Msys (in particular, Atoms, Bonds, Terms, and
Params) can have typed attributes given to all members of the set
to which they belong. In Msys, these attributes are referred to as
properties, or props for short, and have a type of either int,
float, or str (string). The available property names and their
types can be queried in the appropriate parent object, using the
props
, atom_props
, etc. properties of the parent.
The value of the property for a given element can be read and modified
using a dictionary-like interface on the element itself:
mol = msys.LoadDMS('input.dms')
# find all distinct values of the 'grp_energy' atom property, if it exists
grp_energy_vals = set()
if 'grp_energy' in mol.atom_props:
for atm in mol.atoms:
grp_energy_vals.add( atm['grp_energy'] )
# add a new property 'foo' of type 'float'
mol.addAtomProp('foo', float)
# Set the value of foo to the z coordinate of the atom
for a in mol.atoms: a['foo'] = a.pos[2]
When you add a property to a set of elements, the initial value will be 0 for int and float types, and the empty string for str types. If a property with the same name and type already exists, no action is taken. An exception is thrown if you try to add a property with the same name but different type from an existing property.
Getting started¶
Once you have your Python environment by loading the appropriate modules, fire up Python, import the msys module, and load a dms or mae file:
import msys
# Load the entire contents of a DMS file
dms=msys.LoadDMS('system.dms')
# Import an MAE file, performing conversions on its forcefield data
mae=msys.LoadMAE('system.mae')
You can also create a new System from scratch:
# mol = msys.CreateSystem()
A System resides entirely in memory; changes to the System will not persist until/unless you write it back out to a file:
# Save the system as a DMS file
msys.SaveDMS(dms, 'output.dms')
# Export to MAE file
msys.SaveMAE(dms, 'output.mae')
Msys also lets you append chemical systems to an existing file, for certain file formats. The supported Save methods will have an ‘append’ option in their function signatures.
The full set of Atoms, Bonds, Residues, Chains, and TermTables are available by fetching them from the system. You can also fetch the bonds involving a particular atom, the atoms in a residue, or the bonds in a chain in a similar way:
# get the number of atoms, and the total charge
atoms = dms.atoms
natoms = len(atoms)
total_charge = sum(a.charge for a in atoms)
# find the atoms participating in double bonds
for chn in mol.chains:
for res in chn.residues:
for atm in res.atoms:
for bnd in atm.bonds:
if bnd.order == 2:
print "atom %d in chain %s residue %s:%d has a double bond" % (
atm.id, chn.name, res.name, res.num)
# iterate over tables, print atoms per term and number of terms
for t in mol.tables:
print "table %s: %d atoms, %d terms" % (t.name, t.natoms, t.nterms)
# fetch the stretch_harm table. Throws an exception if no such table
stretch = mol.table('stretch_harm')
Atom selections let you fetch a list of atoms using the VMD atom selection
language. The select
method returns a list of Atoms, which is
just a subset of the list that would be returned by the atoms
property:
# fetch the backbone atoms. Note that bb is just a Python list
bb = mol.select('backbone')
Once you have the atoms, if you actually want to work with the residues or chains, it’s easy to do:
# get the set of distinct residues in the backbone
bb_residues = set(a.residue for a in bb)
Note that the atoms returned by select
refer back to the original
system. Msys also provides the means to create a new System independent
of the original, using either the CloneSystem
function or the
clone
method of System. When you clone a subset of a System, the
Terms in the forcefield whose atoms are completely contained in the
selection will be copied to the new System:
# get the list of protein atoms
pro_atoms = mol.select('protein')
# construct a new system containing just the protein
protein = msys.CloneSystem(pro_atoms)
# Atoms in the cloned system have the same attributes as the originals,
# but modifications to one do not affect the other
assert pro_atoms[0].charge == protein.atoms[0].charge
pro_atoms[0].charge += 3
assert pro_atoms[0].charge != protein.atoms[0].charge
The clone
method of System is a more concise way of selecting a
set of atoms, then immediately creating a new System from it:
# create a new System with all the hydrogens removed
hless = mol.clone('not hydrogen')
# create a copy of the original
dup = mol.clone()
You can append the structure and associated forcefield from one System
onto another using System’s append
method:
# duplicate the protein by appending to itself
protein.append(protein)
# load a water system and append it to the protein system. Just as for
# CloneSystem, after appending water to protein, modifications to water
# will not affect any atoms in protein.
water = msy.LoadDMS('water.dms')
protein.append(water)
Terms in a system’s forcefield can be accessed and modified by going through the corresponding TermTable:
stretch = protein.table('stretch_harm')
terms = stretch.terms
params = stretch.params
props = params.props # ['fc', 'r0']
print "%d stretch terms, %d stretch params" % (terms.nterms, params.nparams)
You can change the properties of a selected Term using a dictionary-like interface:
# Change the force constant of the first stretch term to 42
stretch.terms[0]['fc] = 42
Adding new forcefield terms¶
Msys provides an interface for adding a TermTable corresponding to a “standard” forcefield term and configuring that table with its category and its the expected set of properties:
# Get the available set of TermTable schemas:
schemas = msys.TableSchemas()
# For bonded, constraint, virtual, and polar terms, as well as
the exclusion table:
table = mol.addTableFromSchema('posre_harm') # position restraints
# Get the available set of nonbonded schemas
nb_schemas = msys.NonbondedSchemas()
# For a nonbonded table:
nb = mol.addNonbondedFromSchema('vdw_12_6')
The addNonbondedFromSchema
also takes care of configuring the
nonbonded_info
properties of the System; see the section on
nonbonded parameters for more details.
If you have a new table type that hasn’t made it into Msys’ canonical
set, you can simply use addTable
and configure the table yourself:
table = mol.addTable('funky_harm', 2)
table.params.addProp('fk', float)
table.params.addProp('r0', float)
If a table with a given name already exists in a System, addTable
and addTableFromSchema
will just return the existing table.
Files with multiple components¶
To examine every structure in a multi-component file without having to load them all into memory at once, use LoadMany. Unlike the Load function, which always returns one System, LoadMany is a generator which iterates over molecular structures in the input file:
for mol in msys.LoadMany('input.mol2'):
print mol.name
Not every file format supports LoadMany; in cases where it doesn’t, LoadMany will stop after a single iteration, yielding just one System.
If you use LoadMany to load a file, each System will have only one Ct. However, if you use Load to import an MAE or DMS file, and the file contains multiple components, the new System will contain Ct elements corresponding to those components:
mol = msys.Load('small_vancomycin_complex.mae')
for ct in mol.cts:
print ct.name
# prints:
# vancomycin_diala_complex
# SPC water box
The ct information wil be preserved when saving the System back to an MAE or DMS file.
You can create a multi-ct system from existing Systems using the
append
method:
pro = msys.Load('protein.dms')
pro.ct(0).name = 'protein'
wat = msys.Load('water.dms')
wat.ct(0).name = 'water'
pro.append(wat)
assert pro.ncts == 2 # assuming there was 1 ct in protein.dms and wat.dms
assert pro.ct(1).name == 'water'
msys.Save(pro, 'combined.dms')
The msys module¶
This is the high-level Python interface for msys, intended for use by chemists.
-
class
msys.
AnnotatedSystem
(sys, allow_bad_charges=False)[source]¶ System that has been annotated with additional chemical information
The AnnotatedSystem class provides chemical annotation useful primarily for evaluating smarts patterns. The system is expected to already have have chemical reasonable bond orders and formal charges, and to have no missing atoms (e.g. hydrogens). If these criteria cannot be met, set allow_bad_charges=True in the constructor to bypass these checks; in that case the AnnotatedSystem can still be used to evaluate smarts patterns, but patterns making use of the electronic state of the system (e.g. aromaticity, hybridization, etc.) will not be correct (the system will appear to be entirely aliphatic). You may also use the AssignBondOrderAndFormalCharge function to assign reasonable bond orders and formal charges, assuming there are no missing atoms.
The AnnotatedSystem defines a model for aromaticity. First, the SSSR (smallest set of smallest rings) is determined. Next, rings which share bonds are detected and grouped into ring systems. Rings are initially marked as nonaromatic. If the ring system taken as a whole is deemed to be aromatic, then all rings within it are aromatic as well; otherwise, individual rings are checked for aromaticity. Rings are checked in this fashion until no new rings are found to be aromatic.
A ring system is deemed to be aromatic if it satisfies Huckel’s 4N+2 rule for the number of electrons in the ring(s). An internal double bond (i.e. a bond between two atoms in the ring) adds 2 to the electron count. An external double bond (a bond between a ring atom and an atom not in that ring) adds 1 to the electron count. An external double bond between a carbon and a nonaromatic carbon makes the ring unconditionally nonaromtic. An atom with a lone pair and no double bonds adds 2 to the electron count.
-
__init__
(sys, allow_bad_charges=False)[source]¶ Construct from System. AnnotatedSystem is not updated if System is subsequently modified.
-
__weakref__
¶ list of weak references to the object (if defined)
-
property
errors
¶ List of errors found during system analysis if allow_bad_charges=True
-
-
msys.
ApplyDihedralGeometry
(a, b, c, r, theta, phi)[source]¶ Return the position of atom d with cd length r, bcd angle theta, and abcd dihedral phi, all in radians.
-
msys.
AssignBondOrderAndFormalCharge
(system_or_atoms, total_charge=None, compute_resonant_charges=False, *, timeout=60.0)[source]¶ Assign bond orders and formal charges to a molecular system.
Determines bond orders and formal charges by preferring neutral charges and placing negative charges with more electronegative atoms, under octet constraints and the total system charge constraint. Assigns the bond orders and formal charges to the system. Can assign to a subset of atoms of the system, provided these atoms form complete connected fragments.
- Parameters
system_or_atoms – either a System or a list of Atoms
total_charge – if not None, integral total charge
compute_resonant_charges (bool) – compute and store resonant charge in atom property ‘resonant_charge’ and resonant bond order in bond property ‘resonant_order’.
timeout (float) – maximum time allowed, in seconds. Note: calling this function on a chemically incomplete system, i.e. just protein backbone, cause msys to hit the timeout.
-
class
msys.
Atom
(_ptr, _id)[source]¶ Represents an atom (or pseudoparticle) in a chemical system
-
property
aromatic
¶
-
property
atomic_number
¶
-
property
bonded_atoms
¶ Atoms bonded to this atom
-
property
bonds
¶ Bonds connected to this atom
-
property
charge
¶
-
property
formal_charge
¶
-
property
fragid
¶
-
property
mass
¶
-
property
name
¶
-
property
nbonds
¶ number of bonds to this atom
-
property
nhydrogens
¶ number of bonded hydrogens
-
property
pos
¶ position
-
property
valence
¶ sum of bond orders
-
property
vel
¶ velocity
-
property
vx
¶
-
property
vy
¶
-
property
vz
¶
-
property
x
¶
-
property
y
¶
-
property
z
¶
-
property
-
class
msys.
Bond
(_ptr, _id)[source]¶ Represents a bond in a System
-
property
atoms
¶ Atoms in this Bond
-
property
first
¶ first Atom in the bond (the one with lower id)
-
property
order
¶ bond order (int)
-
property
second
¶ second Atom in the bond (the one with higher id)
-
property
-
exception
msys.
BrokenBondsError
[source]¶ -
__weakref__
¶ list of weak references to the object (if defined)
-
-
class
msys.
Chain
(_ptr, _id)[source]¶ Represents a chain (of Residues) in a System
-
property
ct
¶ Return the Ct for this chain
-
property
name
¶
-
property
nresidues
¶ number of residues in this chain
-
property
residues
¶ list of Residues in this Chain
-
property
segid
¶
-
property
-
msys.
CloneSystem
(atoms)[source]¶ Call System.clone(atoms) using the System from the first atom.
DEPRECATED. Use System.clone directly instead.
-
msys.
ComputeTopologicalIds
(system)[source]¶ Compute and return the topological ids for the atoms or system
-
msys.
ConvertFromOEChem
(oe_mol, force=False)[source]¶ Construct a System from the given OEChem OEMol
- Parameters
oe_mol (oechem.OEMol) – the OEChem molecule to convert
force (bool) – whether sanity checks should be performed before conversion
- Returns
System
- Return type
mol (System)
-
msys.
ConvertFromRdkit
(rdmol)[source]¶ Construct an msys System from an RDMol :param mol: system :type mol: rdkit.ROMol
- Returns
System
Notes
All atoms will be assigned to the same Residue. Only the first conformer will be used, if any. There must not be any implicit hydrogens. Bonds will be kekulized since msys doesn’t maintain aromaticity. Chiral tags will not be maintained.
-
msys.
ConvertToOEChem
(mol_or_atoms)[source]¶ Construct an OEChem OEMol from the given System
- Parameters
mol – System or [Atoms]
- Returns
oechem.OEMol
Notes
If [Atoms] are give, only bonds involving the specified atoms will be passed to to the OEMol; this is the same behavior as System.clone(atoms).
-
msys.
ConvertToRdkit
(mol, sanitize=True)[source]¶ Construct an RDKit ROMol from the given System
- Parameters
mol (System) – System
sanitize (bool) – whether to sanitize the molecule
- Returns
rdkit.ROMol
Notes: alchemical systems may require sanitize=False
-
class
msys.
Ct
(_ptr, _id)[source]¶ Represents a list of Chains in a System
The Ct class exists mainly to provide a separate namespace for chains. If you merge two systems each of which has a chain A, you probably want the chains to remain separate. Cts accomplish this.
The Ct class also provides a key-value namespace for assigning arbitrary properties to Systems.
-
append
(system)[source]¶ Appends atoms and forcefield from system to self. Returns a list of of the new created atoms in self. Systems must have identical nonbonded_info.vdw_funct. Does not overwrite the global cell information in self.
-
property
atoms
¶ list of Atoms in this Ct
-
property
bonds
¶ list of Bonds in this Ct
-
property
chains
¶ list of Chains in this Ct
-
property
name
¶ Name of Ct
-
property
natoms
¶ number of Atoms in the Ct
-
property
nchains
¶ number of Chains in this Ct
-
-
msys.
FindDistinctFragments
(system, key='graph')[source]¶ Find connected sets of atoms with identical topology.
- Parameters
system – System chemical system
key – str one of ‘graph’, ‘inchi’, ‘oechem_smiles’, or list of strings, one per fragment
- Returns
- mapping from representative fragment id to ids of fragments
having identical topology.
- Return type
dict[int -> [int]]
Notes
Fragments are distinguished according to value given by ‘key’; if ‘graph’, topologically identical fragments will be considered identical even if they have different stereochemistry. Choose one of the other options to include stereochemistry in the fragment disambiguation.
-
msys.
FromSmilesString
(smiles, forbid_stereo=True)[source]¶ Construct a System from a smiles string.
- Parameters
smiles (str) – the smiles string
forbid_stereo (bool) – if True, raise exception if smiles has stereo
EXPERIMENTAL. In particular, stereo information in the smiles string is ignored. Set forbid_stereo=False to permit stereo specifications to be silently ignored. This flag may be removed at a later date once stereo support has been added.
-
msys.
GetBondsAnglesDihedrals
(system)[source]¶ Return bonds, angles and dihedrals deduced from bond topology Returns: dict
-
msys.
GetSSSR
(atoms, all_relevant=False)[source]¶ Get smallest set of smallest rings (SSSR) for a system fragment.
The SSSR is in general not unique; the SSSR of a tetrahedron is any three of its four triangular faces. The set of rings that is the union of all SSSR’s (all relevant rings) may be obtained by setting all_relevant to True.
Arguments: atoms – [msys.Atom, …, msys.Atom] from a single system all_relevant – bool Returns: [[msys.Atom, …, msys.Atom], …, [msys.Atom, …, msys.Atom]]
-
class
msys.
Graph
(system_or_atoms, colors=None)[source]¶ Represents the chemical topology of a System
Used mainly to implement graph isomorphism; see the match() method
-
__init__
(system_or_atoms, colors=None)[source]¶ Construct Graph :param system_or_atoms: System or [Atoms] :param colors: None, [Int] or Callable
If colors is provided, it is used to specify the vertex color for each atom in the graph. By default, atomic number is used. A color may be provided for each atom, or a callable accepting an Atom as argument. Atoms with color zero are ignored for purposes of graph matching, and no mapping will be returned for them.
-
__weakref__
¶ list of weak references to the object (if defined)
-
static
hash_atoms
(atoms)[source]¶ string hash of specified atoms
- Parameters
atoms ([msys.Atom]) – list of atoms
-
match
(graph)[source]¶ Find a graph isomorphism between self and the given Graph. If no isomorphism could be found, return None; otherwise return mapping from atoms in this graph to atoms in that graph.
-
matchAll
(graph, substructure=False)[source]¶ Find all graph isomorphisms between self and the given Graph. If no isomorphism could be found, return empty list; otherwise return list of dicts mapping atoms in this graph to atoms in that graph. If substructure is True, return isomorphisms between self and any subgraph of the given Graph.
-
-
class
msys.
HydrogenBondFinder
(system, donors, acceptors, cutoff=3.5)[source]¶ Find candidate hydrogen bonds.
More hbonds will be found than are “realistic”; further filtering may be performed using the energy attribute of the returned hbonds. A reasonable filter seems to be around -1.0 (more negative is stronger); i.e. energies greater than that are more likely than not to be spurious.
The HydrogenBond class can also be used directly to compute hydrogen bond geometry and energies by supplying donor, acceptor and hydrogen positions.
-
__init__
(system, donors, acceptors, cutoff=3.5)[source]¶ - Parameters
system (System) – msys system
donors – selection string, list of ids, or list of Atoms
acceptors – selection string, list of ids, or list of Atoms
cutoff (float) – distance cutoff for donor and acceptor
Note
If Atoms are provided, they must be members of system.
-
__weakref__
¶ list of weak references to the object (if defined)
-
-
class
msys.
InChI
(system, DoNotAddH=True, SNon=False, FixedH=True)[source]¶ InChI holds an the result of an inchi invocation for a structure
-
__init__
(system, DoNotAddH=True, SNon=False, FixedH=True)[source]¶ Initialize self. See help(type(self)) for accurate signature.
-
__weakref__
¶ list of weak references to the object (if defined)
-
property
auxinfo
¶ Auxiliary info
-
property
key
¶ inchi key for this object’s string.
-
property
message
¶ Message returned by inchi during calculation
-
property
ok
¶ Was an inchi computed successfully?
-
property
string
¶ Computed inchi string
-
-
class
msys.
IndexedFileLoader
(path, idx_path=None)[source]¶ Supports random access to multi-structure files
-
__getitem__
(index)[source]¶ Get structure at given index
- Parameters
index (int) – 0-based index
- Returns
msys System
- Return type
mol (System)
-
__init__
(path, idx_path=None)[source]¶ Open an indexed file loader, creating an index file if needed. :param path: file path. File type is inferred from the extension. :type path: str :param idx_path: index file path. Defaults to $path.idx. :type idx_path: str
Note
You need write permission to the location of the index file.
-
__weakref__
¶ list of weak references to the object (if defined)
-
property
path
¶ path to source file
-
-
msys.
Load
(path, structure_only=False, without_tables=None)[source]¶ Infer the file type of path and load the file.
- Parameters
path – if str, a file path or PDB accession code. if int, an Anton jobid (require ‘yas’ garden module)
structure_only (bool) – Omit force tables and pseudo atoms
without_tables (bool) – Omit force tables.
- Returns
new System
-
msys.
LoadDMS
(path=None, structure_only=False, buffer=None)[source]¶ Load the DMS file at the given path and return a System containing it. If structure_only is True, only Atoms, Bonds, Residues and Chains will be loaded, along with the GlobalCell, and no pseudos (atoms with atomic number less than one) will be loaded.
If the buffer argument is provided, it is expected to hold the contents of a DMS file, and the path argument will be ignored.
-
msys.
LoadMAE
(path=None, ignore_unrecognized=False, buffer=None, structure_only=False)[source]¶ load the MAE file at the given path and return a System containing it. Forcefield tables will be created that attempt to match as closely as possible the force terms in the MAE file; numerical differences are bound to exist. If ignore_unrecognized is True, ignore unrecognized force tables.
If the buffer argument is provided, it is expected to hold the contents of an MAE file, and the path argument will be ignored.
If the contents of the file specified by path, or the contents of buffer, are recognized as being gzip-compressed, they will be decompressed on the fly.
If structure_only is True, no forcefield components will be loaded.
-
msys.
LoadMany
(path, structure_only=False, error_writer=<_io.TextIOWrapper name='<stderr>' mode='w' encoding='UTF-8'>)[source]¶ Iterate over structures in a file, if the file type supports iteration.
for mol in LoadMany(‘input.mol2’): …
If there was an error reading a structure, LoadMany returns None for that iteration. If error_writer is not None, it’s write() method is invoked with the contents of the exception message as argument. error_writer defaults to sys.stderr.
-
msys.
LoadMol2
(path, multiple=False)[source]¶ Load a mol2 file at the given path. If multiple is True, return a list of Systems, one for each MOLECULE record. Otherwise, return just one System corresponding to the first MOLECULE record.
-
msys.
LoadPDB
(path)[source]¶ Load a PDB file at the given path and return a System. No bonds will be created, even if CONECT records are parent.
-
msys.
LoadPrmTop
(path, structure_only=False)[source]¶ Load an Amber7 prmtop file at the given path and return a System. Coordinates and global cell information are not present in the file.
-
msys.
LoadXYZ
(path)[source]¶ Load an xyz file at the given path. Guesses bonds based on guessed atomic numbers based on atom name.
-
msys.
MatchFragments
(mol1, mol2, key='graph')[source]¶ construct an atom to atom mapping for all fragments from mol1 to mol2
- Parameters
mol1 – System
mol2 – System
key – see FindDistinctFragments
- Returns
dict[Atom -> Atom] or None
-
class
msys.
Param
(_ptr, _id)[source]¶ A Param instance is a reference to a row in a ParamTable. Use the
dict
-style interface to get and set values in the row. Msys will take care of converting input values to the type of the corresponding column, and raise an exception if the conversion cannot be performed.-
duplicate
()[source]¶ create a new entry in the parent parameter table with the same values as this one, returning it.
-
property
id
¶ id in parent table
-
property
system
¶ parent System
-
property
table
¶ parent ParamTable
-
-
class
msys.
ParamTable
(_ptr)[source]¶ The ParamTable class is a 2d table, whose rows are indexed by
id
and whose columns are properties; see the discussion of properties in the Overview. A ParamTable is used by TermTables to hold the shared parameters for its Terms.-
addParam
(**kwds)[source]¶ add and return a new Param().
If keyword arguments are supplied, they will be assigned to the newly created Param before returning it.
-
addProp
(name, type)[source]¶ add a new property of the given type, which must be int, float, or str.
-
property
nparams
¶ number of Params
-
property
nprops
¶ number of properties
-
property
params
¶ list of all Params in table
-
property
props
¶ names of the properties
-
-
msys.
ParseSDF
(text)[source]¶ Iterate over blocks in sdf format text. Accepts normal and gzipped text.
-
msys.
ReadCrdCoordinates
(mol, path)[source]¶ Read coordinates from the given Amber crd file into the given System.
-
msys.
ReadPDBCoordinates
(mol, path)[source]¶ Read coordinates and box from the given pdb file into the given System.
-
class
msys.
Residue
(_ptr, _id)[source]¶ Represents a residue (group of Atoms) in a System
-
property
atoms
¶ list of Atoms in this Residue
-
property
center
¶ return geometric center of positions of atoms in residue
-
property
chain
¶ parent chain
-
property
insertion
¶ insertion code
-
property
name
¶ residue name
-
property
natoms
¶ number of atoms in this residue
-
property
resid
¶ the PDB residue identifier
-
property
-
msys.
Save
(mol, path, append=False, structure_only=False)[source]¶ Save the given system to path, using a file format guessed from the path name. Not all formats support both append and structure_only options; see the corresponding SaveXXX functions.
-
msys.
SaveDMS
(system, path, structure_only=False, unbuffered=False)[source]¶ Export the System to a DMS file at the given path.
-
msys.
SaveMAE
(system, path, with_forcefield=True, append=False)[source]¶ Export the System (or list of systems) to an MAE file at the given path.
-
msys.
SaveMol2
(system, path, selection='none', append=False, moe=True)[source]¶ Export the System to a mol2 file at the given path. You should probably call AssignBondOrderAndFormalCharge() before exporting the system.
- Parameters
system (System) – msys system
path (str) – file name to save to
selection (str) – msys selection string to restrict to
append (bool) – if True, don’t clobber path, just append to it
moe (bool) – output guadinium groups with aromatic bonds so that MOE will read correctly
-
msys.
SavePDB
(system, path, append=False, reorder=False)[source]¶ Export the System to a PDB file at the given path.
-
msys.
SerializeMAE
(system, with_forcefield=True)[source]¶ Return the MAE form of the System as bytes.
-
class
msys.
SmartsPattern
(pattern)[source]¶ A class representing a compiled SMARTS pattern
The Msys smarts implementation is similar to that of Daylight smarts <http://www.daylight.com/dayhtml/doc/theory/theory.smarts.html>, with support for arbitrarily nested recursive smarts. A few features are not currently supported; warnings will be generated when these constructs are used in a smarts pattern.
Directional bonds; e.g.
\
and /; these are treated as single bonds (i.e. as a - character).Chiral specification (
@
,@@
, etc); ignored.Implicit hydrogen (
h
): treated as explicitH
.Explicit degree (
D
): treated as bond countX
.Isotopes: (
[12C]
): ignored.Atom class (
[C:6]
): ignored.
On the other hand, Msys does support hybridization using the
^
token, as in OpenBabel:[c^2] select sp2 aromatic carbon
-
__weakref__
¶ list of weak references to the object (if defined)
-
findMatches
(annotated_system, atoms=None)[source]¶ Return list of lists representing ids of matches of this pattern in this system, optionally requiring that the first atom match belongs to the given set of atoms. An AnnotatedSystem must be used here, which can be constructed from a System after calling AssignBondOrderAndFormalCharge.
-
match
(annotated_system)[source]¶ Return True if a match is found anywhere; False otherwise.
This is much faster than checking for an empty result from findMatches.
-
property
natoms
¶ Number of atoms in the compiled smarts pattern
-
property
pattern
¶ The pattern used to initialize the object
-
property
warnings
¶ Warnings, if any, emitted during compilation
-
class
msys.
SpatialHash
(pos, ids=None, box=None)[source]¶ SpatialHash provides an interface for efficient spatial queries on particle positions.
-
__init__
(pos, ids=None, box=None)[source]¶ Construct from particle positions. If ids are provided, they should be a numpy array of type uint32 and specify which rows of the Nx3 pos array are to be hashed. If box is provided, it must be a 3x3 array of doubles, and the search will be performed using periodic boundary conditions.
-
__weakref__
¶ list of weak references to the object (if defined)
-
findContacts
(radius, pos, ids=None, reuse_voxels=False)[source]¶ Find pairs of particles within radius of each other.
- Parameters
radius (float) – search radius
pos (array[float]) – positions
ids (array[uint]) – particle indices
reuse_voxels (bool) – assume voxelize(R>=radius) has already been called
ignore_excluded (bool) – exclude atom pairs in the exclusion table.
- Returns
Mx1 arrays of ids and distances.
- Return type
i, j, dists (tuple)
The first array corresponds to ids in the call to findContacts; the second column to the ids passed to the SpatialHash constructor.
IMPORTANT: pairs with the same id in the constructor and the call the findContacts are excluded from the output set. Therefore, the positions passsed to findContacts should correspond to the same atom indices as the positions passed to the SpatialHash constructor.
-
findNearest
(k, pos, ids=None)[source]¶ Find at most k particles from pos with the smallest minimum distance to some particle in the spatial hash. If ids is not provided, it defaults to arange(len(pos)).
-
findWithin
(radius, pos, ids=None, reuse_voxels=False)[source]¶ Find particles from pos which are within the given radius of some particle in the spatial hash (i.e. provided in the SpatialHash constructor). By default, voxelization is performed at the same resolution as the query radius, but this can be overridden by calling voxelize() manually, then calling findWithin() with reuse_voxels=True. pos is expected to be an Nx3 array of floats. The ids parameter defaults to arange(len(pos)); supply an array of ids to limit the search to a subset of rows in pos.
-
voxelize
(radius)[source]¶ Perform voxelization such that findWithin queries with reuse_voxels=True at a radius equal to or less than the given radius can be performed accurately. For queries at radius less than the voxelization, it may be worthwhile to revoxelize at a smaller radius. Note that, by default, findWithin calls voxelize with the query radius as argument, so it is not strictly necessary ever to use this method.
-
-
class
msys.
System
(_ptr)[source]¶ The System class holds all structure and forcefield data for a single chemical system. Create a new System using
msys.CreateSystem()
, or from a file usingmsys.LoadDMS
ormsys.LoadMAE.
A System organizes the information in a DMS file into several different groups:
Tables - TermTables are grouped and accessed by name
cell - the unit cell vectors for the System, in the form of a 3x3 NumPy array.
nonbonded_info - the NonbondedInfo object describing the type of nonbonded interactions.
provenance - a list of Provenance objects describing how the input file has been processed.
Auxiliary tables: Everything else in the DMS file that does not fit into one of the above categories finds its way into an auxiliary table. Notable denizens of this category include:
cmap tables
forcefield (annotation for parameters in the DMS file)
-
__init__
(_ptr)[source]¶ Construct from SystemPtr. Do not invoke directly; use CreateSystem() instead.
-
addAtomProp
(name, type)[source]¶ add a custom atom property with the given name and type. type should be int, float, or str.
-
addBondProp
(name, type)[source]¶ add a custom bond property with the given name and type. type should be int, float, or str.
-
addChain
(ct=4294967295)[source]¶ add and return a new Chain. If no ct is given, the chain will be added to the first ct, creating one if necessary.
-
addNonbondedFromSchema
(funct, rule='')[source]¶ Add a nonbonded table to the system, and configure the nonbonded info according to funct and rule. funct must be the name of recognized nonbonded type. rule is not checked; at some point in the future we might start requiring that it be one of the valid combining rules for the specified funct. If nonbonded_info’s vdw_funct and vdw_rule are empty, they are overridden by the provided values; otherwise, the corresponding values must agree if funct and rule are not empty. A nonbonded table is returned.
-
addTable
(name, natoms, params=None)[source]¶ add a table with the given name and number of atoms. If a table with the same name already exists, it is returned, otherwise the newly created table is returned. If no ParamTable params is supplied, a new one is created.
-
addTableFromSchema
(type, name=None)[source]¶ Add a table to the system if it not already present, returning it. If optional name field is provided, the table will be added with the given name; otherwise the name is taken from the table schema.
-
analyze
()[source]¶ Assign atom and residue types. This needs to be called manually only if you create a system from scratch, using msys.CreateSystem(); in that case, analyze() should be called before performing any atom selections.
-
append
(system)[source]¶ Appends atoms and forcefield from system to self. Returns a list of of the new created atoms in self. Systems must have identical nonbonded_info.vdw_funct. Overwrites self.global_cell with system.global_cell only when self.global_cell is all zeros.
-
asCapsule
()[source]¶ Return a capsule wrapper of the internal SystemPtr.
The capsule holds a bare pointer and therefore must not outlive self.
-
property
atom_props
¶ return the list of custom atom properties.
-
property
atoms
¶ return list of all atoms in the system
-
atomsGroupedBy
(prop)[source]¶ Return dictionary mapping representative values of the given atom property to lists of atoms having that property. If the property does not exist in this system, returns an empty dictionary.
-
atomsel
(sel)[source]¶ Create and return an atom selection object (Atomsel). :param sel: str atom selection, or list of GIDs (possibly empty). :type sel: object
Note
Even if ids are provided, the ids of the selection are in sorted order.
-
property
auxtable_names
¶ names of the auxiliary tables
-
property
auxtables
¶ all the auxiliary tables
-
property
bond_props
¶ return the list of custom bond properties.
-
property
bonds
¶ return list of all bonds in the system
-
property
cell
¶ The GlobalCell for this System
-
property
center
¶ return geometric center of positions of all atoms
-
property
chains
¶ return list of all chains in the system
-
clone
(sel=None, share_params=False, use_index=False, forbid_broken_bonds=False)[source]¶ Clone the System, returning a new System. If selection is provided, it should be an atom selection string, a list of ids, or a list of Atoms.
If share_params is True, then ParamTables will be shared between the old and new systems. By default, copies of the ParamTables are made, but ParamTables shared _within_ the old system will also be shared in the new system.
If forbid_broken_bonds is True, an exception will be thrown if the selected atoms do not include all atoms connected by bonds.
-
property
cts
¶ return list of all cts in the system
-
findContactIds
(cutoff, ids=None, other=None, pos=None, ignore_excluded=False)[source]¶ Find atoms not bonded to each other which are within cutoff of each other. If ids is not None, consider only atoms with the given ids. If other is not None, consider only atom pairs such that one is in ids and the other is in other. If pos is not None, use pos as positions, which should be natoms x 3 regardless of the size of ids or other. pos may be supplied only when there are no deleted atoms in the structure. If ignore_excluded=True, exclusions from the exclusion table are used. If ignore_excluded=False, bonds in System.bonds are still excluded.
Returns a list of (id 1, id 2, distance) tuples for each contact found.
-
guessBonds
(replace=True, reanalyze=True, periodic=False)[source]¶ Guess bond connectivity based on an atomic-number based atom radius.
Replaces any existing bonds, unless replace=False is specified.
Reanalyzes fragids and atom types unless reanalyze=False is specified. In that case, you MUST call updateFragids() manually before making any use of the fragment assignment (fragids will be out of date).
-
hash
(sorted=True)[source]¶ hash of contents of this system.
The hash is insensitive to provenance and System.name.
-
property
name
¶ The name of the System, taken from the input file name
-
property
natoms
¶ number of atoms
-
property
nbonds
¶ number of bonds
-
property
nchains
¶ number of chains
-
property
ncts
¶ number of Cts
-
property
nonbonded_info
¶ NonbondedInfo for this System
-
property
nresidues
¶ number of residues
-
property
positions
¶ Nx3 list of lists of positions of all atoms
-
property
provenance
¶ return a list of Provenance entries for this system
-
property
residues
¶ return list of all residues in the system
-
save
(path, structure_only=False)[source]¶ Write self to path
- Parameters
path (str) – file path
structure_only (bool) – write only atom information, not forcefield
- Returns
self
-
selectArr
(seltext)[source]¶ Return the ids of the Atoms satisfying the given VMD atom selection as a numpy array of type uint32.
-
selectChain
(name=None, segid=None)[source]¶ Returns a single Chain with the matching name and/or segid, or raises an exception if no single such chain is present.
-
selectCt
(name=None)[source]¶ Return a single Ct with the matching name, or raises an exception if no single such Ct is present
-
selectIds
(seltext, pos=None, box=None)[source]¶ Return the ids of the Atoms satisfying the given VMD atom selection. This can be considerably faster than calling select().
if pos is supplied, it should be an Nx3 numpy array of positions, where N=self.natoms.
If box is supplied, it should be a 3x3 numpy array of cell vectors, like System.cell.
-
sorted
()[source]¶ Return a clone of the system with atoms reordered based on their order of appearance in a depth-first traversal of the structure hierarchy.
-
property
table_names
¶ names of the tables in the System
-
property
tables
¶ all the tables in the System
-
property
topology
¶ list of bonded atoms for each atom in the System
-
class
msys.
SystemImporter
(system)[source]¶ Maps atoms to residues, chains and cts
-
__init__
(system)[source]¶ construct from system to be constructed
- Parameters
system (System) – msys System
-
__weakref__
¶ list of weak references to the object (if defined)
-
addAtom
(chain, segid, resnum, resname, aname, insertion='', ct=0)[source]¶ Add atom to system
- Returns
newly added Atom
-
initialize
(atoms)[source]¶ Process existing atoms in the system
- Parameters
atoms (list[msys.Atom]) –
Note
can be called multiple times; each time clears the internal tables so that subsequent atoms do not share residues, chains, or cts with previously added atoms.
-
property
system
¶ input system
-
-
class
msys.
Term
(_ptr, _id)[source]¶ A Term is a handle for an entry in a TermTable.
The properties of a Term can be read and updated using a dictionary like interface. Both “term properties” and properties from the ParamTable are accessed through the same interface. To add or remove properties, use the provided methods in the TermTable or ParamTable instance. If a Term’s
param
is shared by another Term in any other TermTable, Msys will take care of providing the Term with its own Param containing a copy of the original properties before applying the changes. However, if you a modify a Param through its dictionary interface, you will affect all Terms that happen to share that Param:# fetch the stretch_harm table table = mol.table('stretch_harm') # update the properties of just the first Term table.term(0)['fc'] = 320 # update the properties of all terms that use this param! table.term(0).param['fc'] = 320
-
property
atoms
¶ list of Atoms for this Term
-
property
id
¶ id of this term in its TermTable
-
property
param
¶ The Param corresponding to this Term’s parameters
-
property
system
¶ parent System of parent TermTable
-
property
table
¶ parent TermTable
-
property
-
class
msys.
TermTable
(_ptr)[source]¶ Each TermTable is intended to describe a specific type of interaction, e.g. stretch, angle, Lennard-Jones, constraint_hoh, etc. A TermTable has an arity (given by the natoms property) which specifies how many atoms are involved in each interaction: one for nonbonded terms, two for stretch terms, etc. Each interaction instance is described by a Term. Each Term references the appropriate number of atoms, and exactly one Param, which lives in a ParamTable owned (or possible shared) by the TermTable.
The functional form described by a TermTable is not part of msys; all msys does is represent the forcefield parameters in a generic way.
-
__init__
(_ptr)[source]¶ Construct from TermTablePtr. Do not invoke directly; use System.addTable or System.table instead
-
addTerm
(atoms, param=None)[source]¶ Add a Term to the table, with given initial param. The atoms list must have natoms elements, and each Atom must belong to the same System as the TermTable. If param is not None, it must belong to the ParamTable held by the TermTable.
-
property
category
¶ A string describing what kind of TermTable this is. Possibilities are: bond, constraint, virtual, polar, nonbonded, and exclusion.
-
coalesce
()[source]¶ Reassign param for each Term in this Table to a member of the distinct set of Params used by those Terms.
-
count_overrides
()[source]¶ return the number of pairwise nonbonded interactions which are affected by the overrides in the given term table.
-
findExact
(atoms)[source]¶ return the terms that contain precisely the given atoms in the given order.
-
property
name
¶ name of this table
-
property
natoms
¶ number of atoms in each term
-
property
noverrides
¶ number of parameter overrides
-
property
nterms
¶ number of terms
-
property
override_params
¶ parameter table containing override values
-
overrides
()[source]¶ return a mapping from pairs of Params in self.params to a Param in self.override_params.
-
property
params
¶ The ParamTable for terms in this table.
-
property
props
¶ Table properties
-
replaceWithSortedTerms
()[source]¶ Replace table in self.system with a version of self that has terms sorted by atom ids.
-
setOverride
(pi, pj, op)[source]¶ override the interaction between params pi and pj with the interaction given by op. pi and pj must be Params from self.params; op must be a param from self.override_params, or None to remove the override.
-
property
system
¶ The System whose atoms are referenced by this table.
-
property
term_props
¶ names of the custom properties
-
property
terms
¶ returns a list of all the Terms in the table
-
The Atomsel class¶
-
class
msys.atomsel.
Atomsel
(ptr, seltext)[source]¶ Supports alignment of molecular structures
-
alignCoordinates
(other)[source]¶ If other is an Atomsel instance, align the coordinates of other’s System with self. If other is a numpy array, align the array with self, using corresponding indices.
In either case, return the aligned RMSD.
-
currentRMSD
(other)[source]¶ compute RMS distance to other object, which may be Atomsel or an array of positions. In either it must be the case that len(other) equals len(self) or len(self.system)
-
property
ids
¶ ids of selected atoms in the parent system
-
raw_alignment
(other)[source]¶ Compute alignment to other object. Compute and return aligned rmsd, and rotational and translational transformations.
-
property
system
¶ parent system
-
Dealing with duplicate parameters¶
After performing various modifications to a TermTable, you may find that the associated ParamTable contains many entries whose values are all identical. The redundant parameters can be removed by first “coalescing” the parameter assignments of each Term to a set of distinct Params, then cloning the System. When a System is cloned, only the Params which are referenced by at least one Term in the TermTable are copied to the new System:
import msys
mol=msys.CreateSystem()
a1=mol.addAtom()
a2=mol.addAtom()
a3=mol.addAtom()
table = mol.addTableFromSchema('stretch_harm')
p1=table.params.addParam()
p1['fc']=320
p1['r0']=1.0
t1=table.addTerm([a1,a2], p1)
t2=table.addTerm([a1,a3], p1)
# At this point we have two terms and one param. Suppose we ignore the
# fact that t1 and t2 share a Param, and we just update their properties
# to the same value:
t1['r0']=1.2
t2['r0']=1.2
# Now we have two Params, because when we updated t1, we created a second
# Param that was unshared by t2. When we updated t2, p1 was unshared, so
# no duplicate was made.
assert table.params.nparams==2
# But we could get by with only a single Param. Let's do that:
mol.coalesceTables()
# At this point t1 and t2 are sharing a Param, and the other one is unused:
assert t1.param==t2.param
assert table.params.nparams==2
assert table.nterms==2
# When we clone, the unused params are not copied to the new system.
mol2=mol.clone()
assert mol2.table('stretch_harm').params.nparams==1
-
class
msys.
TermTable
(_ptr)[source]¶ Each TermTable is intended to describe a specific type of interaction, e.g. stretch, angle, Lennard-Jones, constraint_hoh, etc. A TermTable has an arity (given by the natoms property) which specifies how many atoms are involved in each interaction: one for nonbonded terms, two for stretch terms, etc. Each interaction instance is described by a Term. Each Term references the appropriate number of atoms, and exactly one Param, which lives in a ParamTable owned (or possible shared) by the TermTable.
The functional form described by a TermTable is not part of msys; all msys does is represent the forcefield parameters in a generic way.
-
addTerm
(atoms, param=None)[source]¶ Add a Term to the table, with given initial param. The atoms list must have natoms elements, and each Atom must belong to the same System as the TermTable. If param is not None, it must belong to the ParamTable held by the TermTable.
-
property
category
¶ A string describing what kind of TermTable this is. Possibilities are: bond, constraint, virtual, polar, nonbonded, and exclusion.
-
coalesce
()[source]¶ Reassign param for each Term in this Table to a member of the distinct set of Params used by those Terms.
-
count_overrides
()[source]¶ return the number of pairwise nonbonded interactions which are affected by the overrides in the given term table.
-
findExact
(atoms)[source]¶ return the terms that contain precisely the given atoms in the given order.
-
property
name
¶ name of this table
-
property
natoms
¶ number of atoms in each term
-
property
noverrides
¶ number of parameter overrides
-
property
nterms
¶ number of terms
-
property
override_params
¶ parameter table containing override values
-
overrides
()[source]¶ return a mapping from pairs of Params in self.params to a Param in self.override_params.
-
property
params
¶ The ParamTable for terms in this table.
-
property
props
¶ Table properties
-
replaceWithSortedTerms
()[source]¶ Replace table in self.system with a version of self that has terms sorted by atom ids.
-
setOverride
(pi, pj, op)[source]¶ override the interaction between params pi and pj with the interaction given by op. pi and pj must be Params from self.params; op must be a param from self.override_params, or None to remove the override.
-
property
system
¶ The System whose atoms are referenced by this table.
-
property
term_props
¶ names of the custom properties
-
property
terms
¶ returns a list of all the Terms in the table
-
Sharing ParamTables¶
Forcefield developers will (we hope!) appreciate the ability for Msys to parameterize multiple TermTables from potentially different Systems using a single ParamTable instance. Normally, when a System is loaded from an input file, or a TermTable is created using the scripting interface, each TermTable refer to a ParamTable of its very own, and no other TermTable can or will reference it. However, at the time that a TermTable is created, a ParamTable can be provided which will be used to hold the Param entries for the Terms in the TermTable:
# create two independent systems
m1=msys.CreateSystem()
m2=msys.CreateSystem()
# add some atoms
m1.addAtom()
m2.addAtom()
m2.addAtom()
# create a free-standing ParamTable and add some Params
params=msys.CreateParamTable()
p1=params.addParam()
p2=params.addParam()
# create a table in system 1 which uses the free ParamTable
table1=m1.addTable("table", 1, params)
# no other TermTable is using the ParamTable
assert not params.shared
# create a table in system 2 which also uses the free ParamTable
table2=m2.addTable("table", 1, params)
# now the ParamTable is shared
assert params.shared
assert table1.params == table2.params
# Add some terms to each table
t1=table1.addTerm(m1.atoms, p2)
t2=table2.addTerm(m2.atoms[1:], p2)
assert t1.param == t2.param
assert t2.param == p2
# modifications to the the original table and its params are propagated
# to each table
params.addProp("fc", float)
p1['fc']=32
p2['fc']=42
assert t1['fc']==42
assert t2['fc']==42
# p1 is shared by multiple TermTables, but within a TermTable, p1 is not
# shared. Modifications to t1['fc'] will affect t2!
t1['fc'] = 52
assert t2['fc'] == 52
Pfx¶
A high level interface for wrapping, centering, and alignment.
This module provides simple, yet performant methods for manipulating trajectories of systems with connected atoms and periodic boundary conditions.
-
msys.pfx.
aligned_rmsd
(X, Y, weight=None) → mat, rmsd¶ Compute the matrix aligning Y onto X, optionally with weights. Return the matrix and the rmsd.
-
msys.pfx.
inverse_3x3
(A) → Ainv¶
-
msys.pfx.
svd_3x3
(A) → U, w, V¶ svd_3x3 computes the singular value decomposition of the 3x3 matrix A. The result is always calculated and returned in double precision.
What pfx does¶
Pfx can be configured to perform a number of tasks related to postprocessing of trajectories of molecular systems. There are four main issues which Pfx is designed to deal with:
Fixing bonds. If two atoms with a bond between them are found in different periodic images, one of them must be shifted by some integer linear combination of the unit cell vectors so that the distance between them is no greater than half a unit cell vector along each cell vector direction. When there are multiple atoms bonded together, the bond fixing operation must be applied to the bonds composing this connected component in topologically sorted order.
Gluing components. Some molecular systems, such as multimeric ion channels, contain components which are not explicitly bonded to each other, but which do stay together during the simulation and should therefore be kept together during postprocessing. For each set of glued components, Pfx finds the transformations which minimize the square distance between the centers of each component.
Centering and alignment. Pfx can either center a selected set of atoms on the origin, or align a selected set to a reference structure.
Wrapping components. Any of the preceeding operations could place the center of a connected set of atoms outside the unit cell centered at the origin. Pfx shifts each connected component to bring it as close to the origin as possible, which maintaining any glued components. Importantly, if an alignment has been performed in the previous step, then the rotational part of the alignment transformation must be applied to the unit cell before performing the wrapping. Another subtlety is that when alignemnt has been performed, the wrapping should be performed about the center of the reference selection, not necessarily the origin. Otherwise, if the reference structure is far from the origin, wrapping could undo the alignment.
The main work of pfx is does in the apply method. The arguments to apply are a coordinate set and, optionally, a periodic cell and/or a set of velocities. Here’s what happens when you call apply, assuming that both the periodic cell and the velocities have been provided:
Fix bonds.
Glue components.
Translate the entire system to bring the centered or aligned atoms to the origin.
Compute a rotational transformation which aligns the system to the centered reference structure.
Apply the rotation to the input positions, unit cel, and velocities.
Wrap connected and glued components.
Shift the entire system to the center of the reference structure.
Specifying topology¶
A Pfx instance is constructed from a bond topology. The bond topology indicates both how many atoms are in the molecular system as well as which atoms are bonded together. Pfx analyzes this topology to find the connected components comprising the system. If the fixbonds argument is True, then Pfx also computes and caches a topologically sorted list of bonds from the topology, so that the bond fixing step can be performed efficiently.
Specifying glue¶
The glue method of Pfx lets you specify atoms which should be kept together even if there is no explicit bond between them. Suppose the ids of all the protein atoms in a multimeric protein are passed as the argument to glue. Pfx first finds the set of connected components which overlap with the selection. In the glue components step, the centers of these components will be brought together. Moreover, in the wrap components step, all the protein atoms will be treated as a single component for the purpose of wrapping.
Suppose now that only one residue from each monomer of a multicomponent protein is included in a glue selection. The same set of connected components will be kept together as before, when the entire protein was glued; however, the centers of the connected components will be computed from just the glued residue in each monomer, rather than from all atoms of each monomer. The wrap components step will be unchanged.
The glue method can be called multiple times on the same Pfx instance. It is perfectly valid for glue selections in different invocations to overlap.
Performing both centering and alignment¶
When viewing trajectories, chemists often want to specify both “center” and “fit” selections. But what does this mean? If you center on, say, atoms 1-10, and align atoms 10-20, one operation will undo the other. The only sensible approach seems to be to apply the “center” specification to whatever is being used for the reference structure, and then use the “fit” selection to align non-reference frames to that selection.
What about periodicfix?¶
The algorithms in this module are essentially the same as those in periodicfix. So why a new module? Here are some reasons:
Periodicfix isn’t consistent about its use of float and double, and does a lot of interconversion. This makes it slow.
Periodicfix doesn’t have both single and double precision versions available from Python. This one does.
Periodicfix makes you specify weights to align a subset of atoms. Pfx doesn’t use weights; or, if you like, the weights must be zero or one. In practice it’s been found that that’s all anyone needs. Having to specify weights is cumbersome. If someone really wants to have weight support in pfx we can add it some day.
Periodicfix has separate topology and fragment wrapper types, which make the Python interface more cumbersome. Pfx has just one type.
Periodicfix has accreted additional functionality which has nothing to do with periodic images or alignment, including contact finding and a hydrogen bond energy function.
The svd in periodicfix is greatly inferior to the one here. Yes, it would be easy replace the one in periodicfix with this one.
Molfile¶
Structure and coordinate file manipulation library.
Reading a structure file:
reader = molfile.mae.read('/path/to/foo.mae')
Iterating through the frames in a file:
for frame in molfile.dtr.read('/path/to/foo.dtr').frames():
function( frame.pos, frame.vel, frame.time, frame.box )
Random access to frames (only dtr files support this currently):
f27 = molfile.dtr.read('/path/to/foo.dtr').frame(27) # 0-based index
Write a trajectory to a frameset (dtr):
f = msys.molfile.Frame(natoms)
w = msys.molfile.dtr.write('output.dtr', natoms=natoms)
for i, xyz in enumerate(xyzs):
f.pos[:] = xyz
f.time = i
w.frame(f)
w.close()
Convert an mae file to a pdb file:
input=molfile.mae.read('foo.mae')
output=molfile.pdb.write('foo.pdb', atoms=input.atoms)
output.frame(input.frames().next())
output.close()
Write every 10th frame in a dtr to a trr:
input=molfile.dtr.read('big.dtr')
output=molfile.trr.write('out.trr, natoms=input.natoms)
for i in range(0,input.nframes, 10):
output.frame( input.frame(i) )
output.close()
Write a frame with a specified set of gids:
f = molfile.Frame(natoms, with_gids=True
f.gid[:] = my_gids
f.pos[:] = my_positions
w.frame(f)
Read the raw fields from a frameset (dtr):
dtr = molfile.DtrReader('input.dtr') # also works for stk
for i in range(dtr.nframes):
f = dtr.frame(i)
keyvals = dict()
frame = dtr.frame(i, keyvals=keyvals)
## use data in keyvals
Write raw fields to a frameset (dtr):
dtr = molfile.DtrWriter('output.dtr', natoms=natoms)
keyvals = dict( s = "a string",
f = positions.flatten(), # must be 1d arrays
i = numpy.array([1,2,3]),
)
dtr.append( time = my_time, keyvals = keyvals )
-
class
msys.molfile.
Plugin
¶ Interface to readers and writers
-
property
can_read
¶
-
property
can_write
¶
-
property
filename_extensions
¶
-
property
name
¶
-
property
prettyname
¶
-
read
((Plugin)arg1, (str)path[, (bool)double_precision=False]) → Reader :¶ Open a file for reading
-
property
version
¶
-
write
((Plugin)arg1, (str)path[, (object)atoms=None[, (object)natoms=None]]) → Writer :¶ write(path,atoms=None,natoms=None)
-
property
-
class
msys.molfile.
DtrReader
¶ -
fileinfo
((DtrReader)arg1, (int)arg2) → tuple :¶ fileinfo(index) -> path, time, offset, framesize, first, last, filesize, dtrpath, dtrsize file contains frames [first, last)
-
frame
((DtrReader)arg1, (int)index[, (object)bytes=None[, (object)keyvals=None]]) → object :¶ frame(index, bytes=None, keyvals=None) -> Frame Read bytes from disk if bytes are not provided If keyvals is not None, it should be a dict, and raw data from the frame will be provided.
-
frameset_is_compact
((DtrReader)arg1, (int)arg2) → bool¶
-
frameset_path
((DtrReader)arg1, (int)arg2) → str¶
-
frameset_size
((DtrReader)arg1, (int)arg2) → int¶
-
static
fromTimekeys
((str)arg1, (list)arg2, (list)arg3) → DtrReader¶
-
index_ge
((DtrReader)arg1, (float)arg2) → int¶
-
index_gt
((DtrReader)arg1, (float)arg2) → int¶
-
index_le
((DtrReader)arg1, (float)arg2) → int¶
-
index_lt
((DtrReader)arg1, (float)arg2) → int¶
-
index_near
((DtrReader)arg1, (float)arg2) → int¶
-
keyvals
((DtrReader)arg1, (int)arg2) → dict :¶ keyvals(index) -> dict() Read raw fields from frame.
-
property
metadata
¶
-
property
natoms
¶
-
property
nframes
¶
-
property
nframesets
¶
-
property
path
¶
-
reload
((DtrReader)arg1) → int :¶ reload() -> number of timekeys reloaded – reload frames in the dtr/stk
-
times
((DtrReader)arg1) → object¶
-
total_bytes
((DtrReader)arg1) → int¶
-
-
class
msys.molfile.
Frame
¶ -
property
box
¶
-
property
dpos
¶
-
property
dvel
¶
-
property
extended_energy
¶
-
property
fpos
¶
-
property
fvel
¶
-
property
kinetic_energy
¶
-
moveby
((Frame)arg1, (float)x, (float)y, (float)z) → None¶
-
property
pos
¶
-
property
position
¶
-
property
potential_energy
¶
-
property
pressure
¶
-
property
pressure_tensor
¶
-
select
((Frame)arg1, (object)indices) → Frame¶
-
property
temperature
¶
-
property
time
¶
-
property
total_energy
¶
-
property
vel
¶
-
property
velocity
¶
-
property
virial_tensor
¶
-
property
-
class
msys.molfile.
Atom
¶ -
addbond
()¶ addbond(atom) – add atom to self.bonds and self to atom.bonds
-
altloc
¶ segment name
-
anum
¶ atomic number
-
bfactor
¶ temperature factor
-
bonds
¶ bonded atoms
-
chain
¶ chain name
-
charge
¶
-
delbond
()¶ delbond(atom) – remove atom from self.bonds and self from atom.bonds
-
getbondorder
()¶ getbondorder(atom) – bond order for bond with given atom.
-
insertion
¶ segment name
-
mass
¶ mass in amu
-
name
¶ atom name
-
occupancy
¶
-
radius
¶ vdw radius
-
resid
¶ residue id
-
resname
¶ residue name
-
segid
¶ segment name
-
setbondorder
()¶ setbondorder(atom,val) – set bond order for bond with given atom.
-
type
¶ atom type
-
-
class
msys.molfile.
SeqFile
[source]¶ Read csv-like files with column names in the first row
-
filename_extensions
= 'seq'¶
-
name
= 'seq'¶
-
-
class
msys.molfile.
Grid
(data, name='', axis=None, origin=None)[source]¶ -
property
axis
¶
-
property
data
¶
-
property
name
¶
-
property
origin
¶
-
property
Reader¶
-
class
msys.molfile.
Reader
¶ Structure or trajectory open for reading
A Reader is a handle to an open file. Use the atoms member to fetch the atomic structure from the file, assuming it exists. To access frames, there are two methods.
-
frames
()¶ returns a FrameIter object for iteration over frames. FrameIter has two methods: the usual next() method which returns a Frame, and skip(n=1), which advances the iterator by n frames without (necessarily) reading anything. FrameIter is a very poor iterator: once a frame has been read or skipped, it can’t be loaded again; you have use a brand new Reader.
-
frame
(n)¶ returns the nth frame (0-based index). Currently only the dtr plugin supports this method.
-
grid
(n)¶ return the nth grid. For dx and ccp4 files.
-
at_time_ge
((Reader)arg1, (float)time) → Frame¶
-
at_time_gt
((Reader)arg1, (float)time) → Frame¶
-
at_time_le
((Reader)arg1, (float)time) → Frame¶
-
at_time_lt
((Reader)arg1, (float)time) → Frame¶
-
at_time_near
((Reader)arg1, (float)time) → Frame¶
-
property
atoms
¶ list of Atoms
-
property
bondorders
¶ list of bond orders
-
frame
((Reader)arg1, (int)arg2) → Frame¶
-
grid_data
((Reader)arg1, (int)arg2, (object)arg3) → None¶
-
grid_meta
((Reader)arg1, (int)arg2) → object¶
-
property
has_velocities
¶ reads velocities
-
property
natoms
¶ number of atoms
-
next
((Reader)arg1) → Frame :¶ Return the next frame
-
property
nframes
¶ number of frames
-
property
ngrids
¶ number of grids
-
reopen
((Reader)arg1) → Reader :¶ reopen file for reading
-
skip
((Reader)arg1) → None :¶ Skip the next frame
-
property
times
¶ all times for frames in trajectory
-
property
topology
¶ bond adjacency graph
-
Writer¶
-
class
msys.molfile.
Writer
¶ Structure or trajectory open for writing
Writers are initialized with a path and either an array of Atoms or an atom count. If the Writer supports structure writing, Atoms must be provided; if the Writer only writes frames, either one will do.
-
frame
(f)¶ If the writer supports frame writing, appends frame f to the end of the file.
-
grid
(g)¶ If the writer supports grid writing, writes Grid g to the file, where g is an instance of molfile.Grid, either returned from reader.grid(n) or created from scratch.
-
close
()¶ Invoked when the Writer goes out of scope, but it’s not a bad idea to invoke it explicitly.
-
close
((Writer)arg1) → None¶
-
frame
((Writer)arg1, (Frame)arg2) → Writer¶
-
sync
((Writer)arg1) → None¶
-
truncate
((Writer)arg1, (float)arg2) → bool¶
-
AnnotatedSystem¶
-
class
msys.
AnnotatedSystem
(sys, allow_bad_charges=False)[source]¶ System that has been annotated with additional chemical information
The AnnotatedSystem class provides chemical annotation useful primarily for evaluating smarts patterns. The system is expected to already have have chemical reasonable bond orders and formal charges, and to have no missing atoms (e.g. hydrogens). If these criteria cannot be met, set allow_bad_charges=True in the constructor to bypass these checks; in that case the AnnotatedSystem can still be used to evaluate smarts patterns, but patterns making use of the electronic state of the system (e.g. aromaticity, hybridization, etc.) will not be correct (the system will appear to be entirely aliphatic). You may also use the AssignBondOrderAndFormalCharge function to assign reasonable bond orders and formal charges, assuming there are no missing atoms.
The AnnotatedSystem defines a model for aromaticity. First, the SSSR (smallest set of smallest rings) is determined. Next, rings which share bonds are detected and grouped into ring systems. Rings are initially marked as nonaromatic. If the ring system taken as a whole is deemed to be aromatic, then all rings within it are aromatic as well; otherwise, individual rings are checked for aromaticity. Rings are checked in this fashion until no new rings are found to be aromatic.
A ring system is deemed to be aromatic if it satisfies Huckel’s 4N+2 rule for the number of electrons in the ring(s). An internal double bond (i.e. a bond between two atoms in the ring) adds 2 to the electron count. An external double bond (a bond between a ring atom and an atom not in that ring) adds 1 to the electron count. An external double bond between a carbon and a nonaromatic carbon makes the ring unconditionally nonaromtic. An atom with a lone pair and no double bonds adds 2 to the electron count.
-
property
errors
¶ List of errors found during system analysis if allow_bad_charges=True
-
property
SmartsPattern¶
-
class
msys.
SmartsPattern
(pattern)[source]¶ A class representing a compiled SMARTS pattern
The Msys smarts implementation is similar to that of Daylight smarts <http://www.daylight.com/dayhtml/doc/theory/theory.smarts.html>, with support for arbitrarily nested recursive smarts. A few features are not currently supported; warnings will be generated when these constructs are used in a smarts pattern.
Directional bonds; e.g.
\
and /; these are treated as single bonds (i.e. as a - character).Chiral specification (
@
,@@
, etc); ignored.Implicit hydrogen (
h
): treated as explicitH
.Explicit degree (
D
): treated as bond countX
.Isotopes: (
[12C]
): ignored.Atom class (
[C:6]
): ignored.
On the other hand, Msys does support hybridization using the
^
token, as in OpenBabel:[c^2] select sp2 aromatic carbon
-
findMatches
(annotated_system, atoms=None)[source]¶ Return list of lists representing ids of matches of this pattern in this system, optionally requiring that the first atom match belongs to the given set of atoms. An AnnotatedSystem must be used here, which can be constructed from a System after calling AssignBondOrderAndFormalCharge.
-
match
(annotated_system)[source]¶ Return True if a match is found anywhere; False otherwise.
This is much faster than checking for an empty result from findMatches.
-
property
natoms
¶ Number of atoms in the compiled smarts pattern
-
property
pattern
¶ The pattern used to initialize the object
-
property
warnings
¶ Warnings, if any, emitted during compilation