Python Scripting

Most of the functionality in msys is exposed in its Python interface.

Overview

This section introduces the Python interface and explains how to use it effectively. We begin by introducing some concepts that pervade the Python interface, then move to some examples.

Msys ids

In Msys, instances of the Atom, Bond, Residue, and Chain classes are all Handles, in the sense that they refer to a piece of data held by the parent System. All Msys handles have an immutable id property that uniquely identifies them within their parent System. Objects that hold references to other objects do so through the id of that object. Two handles of the same type will compare equal to each other if and only if they belong the same System and possess the same id.

When you load a system from a file, or create one from scratch, these ids will be numbered consecutively, starting at zero. Deleting Atoms, Bonds, etc. from the System can introduce gaps in the set of ids, but, once created, the id of an object never changes.

When Msys writes a DMS file, the primary keys of the particles will be contiguous starting at 0, and will appear in the order in which the particles appear in the System, even if the ids of the atoms in the System are noncontiguous due to deletions. When Msys loads a DMS file, if the primary keys happen to be noncontiguous, Msys will still create a System with the usual contiguous ids.

Msys properties

Many objects in Msys (in particular, Atoms, Bonds, Terms, and Params) can have typed attributes given to all members of the set to which they belong. In Msys, these attributes are referred to as properties, or props for short, and have a type of either int, float, or str (string). The available property names and their types can be queried in the appropriate parent object, using the props, atom_props, etc. properties of the parent. The value of the property for a given element can be read and modified using a dictionary-like interface on the element itself:

mol = msys.LoadDMS('input.dms')
# find all distinct values of the 'grp_energy' atom property, if it exists
grp_energy_vals = set()
if 'grp_energy' in mol.atom_props:
  for atm in mol.atoms:
    grp_energy_vals.add( atm['grp_energy'] )

# add a new property 'foo' of type 'float'
mol.addAtomProp('foo', float)
# Set the value of foo to the z coordinate of the atom
for a in mol.atoms: a['foo'] = a.pos[2]

When you add a property to a set of elements, the initial value will be 0 for int and float types, and the empty string for str types. If a property with the same name and type already exists, no action is taken. An exception is thrown if you try to add a property with the same name but different type from an existing property.

Getting started

Once you have your Python environment by loading the appropriate modules, fire up Python, import the msys module, and load a dms or mae file:

import msys

# Load the entire contents of a DMS file
dms=msys.LoadDMS('system.dms')

# Import an MAE file, performing conversions on its forcefield data
mae=msys.LoadMAE('system.mae')

You can also create a new System from scratch:

# mol = msys.CreateSystem()

A System resides entirely in memory; changes to the System will not persist until/unless you write it back out to a file:

# Save the system as a DMS file
msys.SaveDMS(dms, 'output.dms')

# Export to MAE file
msys.SaveMAE(dms, 'output.mae')

Msys also lets you append chemical systems to an existing file, for certain file formats. The supported Save methods will have an ‘append’ option in their function signatures.

The full set of Atoms, Bonds, Residues, Chains, and TermTables are available by fetching them from the system. You can also fetch the bonds involving a particular atom, the atoms in a residue, or the bonds in a chain in a similar way:

# get the number of atoms, and the total charge
atoms = dms.atoms
natoms = len(atoms)
total_charge = sum(a.charge for a in atoms)

# find the atoms participating in double bonds
for chn in mol.chains:
  for res in chn.residues:
    for atm in res.atoms:
      for bnd in atm.bonds:
        if bnd.order == 2:
          print "atom %d in chain %s residue %s:%d has a double bond" % (
            atm.id, chn.name, res.name, res.num)

# iterate over tables, print atoms per term and number of terms
for t in mol.tables:
  print "table %s: %d atoms, %d terms" % (t.name, t.natoms, t.nterms)

# fetch the stretch_harm table.  Throws an exception if no such table
stretch = mol.table('stretch_harm')

Atom selections let you fetch a list of atoms using the VMD atom selection language. The select method returns a list of Atoms, which is just a subset of the list that would be returned by the atoms property:

# fetch the backbone atoms.  Note that bb is just a Python list
bb = mol.select('backbone')

Once you have the atoms, if you actually want to work with the residues or chains, it’s easy to do:

# get the set of distinct residues in the backbone
bb_residues = set(a.residue for a in bb)

Note that the atoms returned by select refer back to the original system. Msys also provides the means to create a new System independent of the original, using either the CloneSystem function or the clone method of System. When you clone a subset of a System, the Terms in the forcefield whose atoms are completely contained in the selection will be copied to the new System:

# get the list of protein atoms
pro_atoms = mol.select('protein')

# construct a new system containing just the protein
protein = msys.CloneSystem(pro_atoms)

# Atoms in the cloned system have the same attributes as the originals,
# but modifications to one do not affect the other
assert pro_atoms[0].charge == protein.atoms[0].charge
pro_atoms[0].charge += 3
assert pro_atoms[0].charge != protein.atoms[0].charge

The clone method of System is a more concise way of selecting a set of atoms, then immediately creating a new System from it:

# create a new System with all the hydrogens removed
hless = mol.clone('not hydrogen')

# create a copy of the original
dup = mol.clone()

You can append the structure and associated forcefield from one System onto another using System’s append method:

# duplicate the protein by appending to itself
protein.append(protein)

# load a water system and append it to the protein system.  Just as for
# CloneSystem, after appending water to protein, modifications to water
# will not affect any atoms in protein.
water = msy.LoadDMS('water.dms')
protein.append(water)

Terms in a system’s forcefield can be accessed and modified by going through the corresponding TermTable:

stretch = protein.table('stretch_harm')
terms = stretch.terms
params = stretch.params
props = params.props # ['fc', 'r0']
print "%d stretch terms, %d stretch params" % (terms.nterms, params.nparams)

You can change the properties of a selected Term using a dictionary-like interface:

# Change the force constant of the first stretch term to 42
stretch.terms[0]['fc] = 42

Adding new forcefield terms

Msys provides an interface for adding a TermTable corresponding to a “standard” forcefield term and configuring that table with its category and its the expected set of properties:

# Get the available set of TermTable schemas:
schemas = msys.TableSchemas()

# For bonded, constraint, virtual, and polar terms, as well as
the exclusion table:
table = mol.addTableFromSchema('posre_harm')  # position restraints

# Get the available set of nonbonded schemas
nb_schemas = msys.NonbondedSchemas()

# For a nonbonded table:
nb = mol.addNonbondedFromSchema('vdw_12_6')

The addNonbondedFromSchema also takes care of configuring the nonbonded_info properties of the System; see the section on nonbonded parameters for more details.

If you have a new table type that hasn’t made it into Msys’ canonical set, you can simply use addTable and configure the table yourself:

table = mol.addTable('funky_harm', 2)
table.params.addProp('fk', float)
table.params.addProp('r0', float)

If a table with a given name already exists in a System, addTable and addTableFromSchema will just return the existing table.

Files with multiple components

To examine every structure in a multi-component file without having to load them all into memory at once, use LoadMany. Unlike the Load function, which always returns one System, LoadMany is a generator which iterates over molecular structures in the input file:

for mol in msys.LoadMany('input.mol2'):
   print mol.name

Not every file format supports LoadMany; in cases where it doesn’t, LoadMany will stop after a single iteration, yielding just one System.

If you use LoadMany to load a file, each System will have only one Ct. However, if you use Load to import an MAE or DMS file, and the file contains multiple components, the new System will contain Ct elements corresponding to those components:

mol = msys.Load('small_vancomycin_complex.mae')
for ct in mol.cts:
   print ct.name

# prints:
# vancomycin_diala_complex
# SPC water box

The ct information wil be preserved when saving the System back to an MAE or DMS file.

You can create a multi-ct system from existing Systems using the append method:

pro = msys.Load('protein.dms')
pro.ct(0).name = 'protein'
wat = msys.Load('water.dms')
wat.ct(0).name = 'water'
pro.append(wat)
assert pro.ncts == 2     # assuming there was 1 ct in protein.dms and wat.dms
assert pro.ct(1).name == 'water'
msys.Save(pro, 'combined.dms')

The msys module

This is the high-level Python interface for msys, intended for use by chemists.

class msys.AnnotatedSystem(sys, allow_bad_charges=False)[source]

System that has been annotated with additional chemical information

The AnnotatedSystem class provides chemical annotation useful primarily for evaluating smarts patterns. The system is expected to already have have chemical reasonable bond orders and formal charges, and to have no missing atoms (e.g. hydrogens). If these criteria cannot be met, set allow_bad_charges=True in the constructor to bypass these checks; in that case the AnnotatedSystem can still be used to evaluate smarts patterns, but patterns making use of the electronic state of the system (e.g. aromaticity, hybridization, etc.) will not be correct (the system will appear to be entirely aliphatic). You may also use the AssignBondOrderAndFormalCharge function to assign reasonable bond orders and formal charges, assuming there are no missing atoms.

The AnnotatedSystem defines a model for aromaticity. First, the SSSR (smallest set of smallest rings) is determined. Next, rings which share bonds are detected and grouped into ring systems. Rings are initially marked as nonaromatic. If the ring system taken as a whole is deemed to be aromatic, then all rings within it are aromatic as well; otherwise, individual rings are checked for aromaticity. Rings are checked in this fashion until no new rings are found to be aromatic.

A ring system is deemed to be aromatic if it satisfies Huckel’s 4N+2 rule for the number of electrons in the ring(s). An internal double bond (i.e. a bond between two atoms in the ring) adds 2 to the electron count. An external double bond (a bond between a ring atom and an atom not in that ring) adds 1 to the electron count. An external double bond between a carbon and a nonaromatic carbon makes the ring unconditionally nonaromtic. An atom with a lone pair and no double bonds adds 2 to the electron count.

__init__(sys, allow_bad_charges=False)[source]

Construct from System. AnnotatedSystem is not updated if System is subsequently modified.

__repr__()[source]

Return repr(self).

__weakref__

list of weak references to the object (if defined)

aromatic(atom_or_bond)[source]

Is atom or bond aromatic

degree(atom)[source]

Number of (non-pseudo) bonds

property errors

List of errors found during system analysis if allow_bad_charges=True

hcount(atom)[source]

Number of bonded hydrogens

hybridization(atom)[source]

Atom hybridization – 1=sp, 2=sp2, 3=sp3, 4=sp3d, etc.

Equal to 0 for hydrogen and atoms with no bonds, otherwise max(1, a.degree() + (a.lone_electrons+1)/2 - 1).

loneelectrons(atom)[source]

Number of lone electrons

ringbondcount(atom)[source]

Number of ring bonds

valence(atom)[source]

Sum of bond orders of all (non-pseudo) bonds

msys.ApplyDihedralGeometry(a, b, c, r, theta, phi)[source]

Return the position of atom d with cd length r, bcd angle theta, and abcd dihedral phi, all in radians.

msys.AssignBondOrderAndFormalCharge(system_or_atoms, total_charge=None, compute_resonant_charges=False, *, timeout=60.0)[source]

Assign bond orders and formal charges to a molecular system.

Determines bond orders and formal charges by preferring neutral charges and placing negative charges with more electronegative atoms, under octet constraints and the total system charge constraint. Assigns the bond orders and formal charges to the system. Can assign to a subset of atoms of the system, provided these atoms form complete connected fragments.

Parameters
  • system_or_atoms – either a System or a list of Atoms

  • total_charge – if not None, integral total charge

  • compute_resonant_charges (bool) – compute and store resonant charge in atom property ‘resonant_charge’ and resonant bond order in bond property ‘resonant_order’.

  • timeout (float) – maximum time allowed, in seconds. Note: calling this function on a chemically incomplete system, i.e. just protein backbone, cause msys to hit the timeout.

class msys.Atom(_ptr, _id)[source]

Represents an atom (or pseudoparticle) in a chemical system

__contains__(key)[source]

does atom property key exist?

__getitem__(key)[source]

get atom property key

__lt__(that)[source]

Return self<value.

__repr__()[source]

Return repr(self).

__setitem__(key, val)[source]

set atom property key to val

addBond(other)[source]

create and return a Bond from self to other

property aromatic
property atomic_number
property bonded_atoms

Atoms bonded to this atom

property bonds

Bonds connected to this atom

property charge
findBond(other)[source]

Find the bond between self and Atom other; None if not found

property formal_charge
property fragid
property mass
property name
property nbonds

number of bonds to this atom

property nhydrogens

number of bonded hydrogens

property pos

position

remove()[source]

remove this Atom from the System

property valence

sum of bond orders

property vel

velocity

property vx
property vy
property vz
property x
property y
property z
class msys.Bond(_ptr, _id)[source]

Represents a bond in a System

__contains__(key)[source]

does custom Bond property exist?

__getitem__(key)[source]

get custom Bond property

__lt__(that)[source]

Return self<value.

__repr__()[source]

Return repr(self).

__setitem__(key, val)[source]

set custom Bond property

property atoms

Atoms in this Bond

property first

first Atom in the bond (the one with lower id)

property order

bond order (int)

other(atom)[source]

atom in bond not the same as given atom

remove()[source]

remove this Bond from the System

property second

second Atom in the bond (the one with higher id)

exception msys.BrokenBondsError[source]
__weakref__

list of weak references to the object (if defined)

msys.CalcAngle(a, b, c)[source]

Angle in radians of atoms or positions a-b-c.

msys.CalcDihedral(a, b, c, d)[source]

Dihedral angle in radians of atoms or positions a-b-c-d

msys.CalcDistance(a, b)[source]

Distance between atoms or positions a and b

msys.CalcPlanarity(pos_or_atoms)[source]

Planarity of positions or atoms

class msys.Chain(_ptr, _id)[source]

Represents a chain (of Residues) in a System

__repr__()[source]

Return repr(self).

addResidue()[source]

append a new Residue to this Chain and return it

property ct

Return the Ct for this chain

property name
property nresidues

number of residues in this chain

remove()[source]

remove this Chain from the System

property residues

list of Residues in this Chain

property segid
selectResidue(resid=None, name=None, insertion=None)[source]

Returns a single Residue with the given resid, name, and/or insertion code. If no such residue is found, returns None. If multiple such residues are found within this chain, raises an exception.

msys.CloneSystem(atoms)[source]

Call System.clone(atoms) using the System from the first atom.

DEPRECATED. Use System.clone directly instead.

msys.ComputeTopologicalIds(system)[source]

Compute and return the topological ids for the atoms or system

msys.ConvertFromOEChem(oe_mol, force=False)[source]

Construct a System from the given OEChem OEMol

Parameters
  • oe_mol (oechem.OEMol) – the OEChem molecule to convert

  • force (bool) – whether sanity checks should be performed before conversion

Returns

System

Return type

mol (System)

msys.ConvertFromRdkit(rdmol)[source]

Construct an msys System from an RDMol :param mol: system :type mol: rdkit.ROMol

Returns

System

Notes

All atoms will be assigned to the same Residue. Only the first conformer will be used, if any. There must not be any implicit hydrogens. Bonds will be kekulized since msys doesn’t maintain aromaticity. Chiral tags will not be maintained.

msys.ConvertToOEChem(mol_or_atoms)[source]

Construct an OEChem OEMol from the given System

Parameters

mol – System or [Atoms]

Returns

oechem.OEMol

Notes

If [Atoms] are give, only bonds involving the specified atoms will be passed to to the OEMol; this is the same behavior as System.clone(atoms).

msys.ConvertToRdkit(mol, sanitize=True)[source]

Construct an RDKit ROMol from the given System

Parameters
  • mol (System) – System

  • sanitize (bool) – whether to sanitize the molecule

Returns

rdkit.ROMol

Notes: alchemical systems may require sanitize=False

msys.CreateParamTable()[source]

Create a new, empty ParamTable

msys.CreateSystem()[source]

Create a new, empty System

class msys.Ct(_ptr, _id)[source]

Represents a list of Chains in a System

The Ct class exists mainly to provide a separate namespace for chains. If you merge two systems each of which has a chain A, you probably want the chains to remain separate. Cts accomplish this.

The Ct class also provides a key-value namespace for assigning arbitrary properties to Systems.

__delitem__(key)[source]

remove property key

__getitem__(key)[source]

get ct property key

__setitem__(key, val)[source]

set ct property key to val

addChain()[source]

append a new Chain to this Ct and return it

append(system)[source]

Appends atoms and forcefield from system to self. Returns a list of of the new created atoms in self. Systems must have identical nonbonded_info.vdw_funct. Does not overwrite the global cell information in self.

property atoms

list of Atoms in this Ct

property bonds

list of Bonds in this Ct

property chains

list of Chains in this Ct

get(key, d=None)[source]

get ct property key, else d, which defaults to None

keys()[source]

available Ct properties

property name

Name of Ct

property natoms

number of Atoms in the Ct

property nchains

number of Chains in this Ct

remove()[source]

remove this Ct from the System

msys.FindDistinctFragments(system, key='graph')[source]

Find connected sets of atoms with identical topology.

Parameters
  • system – System chemical system

  • key – str one of ‘graph’, ‘inchi’, ‘oechem_smiles’, or list of strings, one per fragment

Returns

mapping from representative fragment id to ids of fragments

having identical topology.

Return type

dict[int -> [int]]

Notes

Fragments are distinguished according to value given by ‘key’; if ‘graph’, topologically identical fragments will be considered identical even if they have different stereochemistry. Choose one of the other options to include stereochemistry in the fragment disambiguation.

msys.FormatDMS(system)[source]

Return the DMS form of the system as bytes

msys.FormatJson(system)[source]

Json formatted system (EXPERIMENTAL)

msys.FormatSDF(mol)[source]

Return System in sdf format

msys.FromSmilesString(smiles, forbid_stereo=True)[source]

Construct a System from a smiles string.

Parameters
  • smiles (str) – the smiles string

  • forbid_stereo (bool) – if True, raise exception if smiles has stereo

EXPERIMENTAL. In particular, stereo information in the smiles string is ignored. Set forbid_stereo=False to permit stereo specifications to be silently ignored. This flag may be removed at a later date once stereo support has been added.

msys.GetBondsAnglesDihedrals(system)[source]

Return bonds, angles and dihedrals deduced from bond topology Returns: dict

msys.GetRingSystems(atoms)[source]

Get ring systems for the given atoms

msys.GetSSSR(atoms, all_relevant=False)[source]

Get smallest set of smallest rings (SSSR) for a system fragment.

The SSSR is in general not unique; the SSSR of a tetrahedron is any three of its four triangular faces. The set of rings that is the union of all SSSR’s (all relevant rings) may be obtained by setting all_relevant to True.

Arguments: atoms – [msys.Atom, …, msys.Atom] from a single system all_relevant – bool Returns: [[msys.Atom, …, msys.Atom], …, [msys.Atom, …, msys.Atom]]

class msys.Graph(system_or_atoms, colors=None)[source]

Represents the chemical topology of a System

Used mainly to implement graph isomorphism; see the match() method

__init__(system_or_atoms, colors=None)[source]

Construct Graph :param system_or_atoms: System or [Atoms] :param colors: None, [Int] or Callable

If colors is provided, it is used to specify the vertex color for each atom in the graph. By default, atomic number is used. A color may be provided for each atom, or a callable accepting an Atom as argument. Atoms with color zero are ignored for purposes of graph matching, and no mapping will be returned for them.

__weakref__

list of weak references to the object (if defined)

atoms()[source]

ordered atoms in graph

hash()[source]

string hash of atoms and bonds in graph

static hash_atoms(atoms)[source]

string hash of specified atoms

Parameters

atoms ([msys.Atom]) – list of atoms

match(graph)[source]

Find a graph isomorphism between self and the given Graph. If no isomorphism could be found, return None; otherwise return mapping from atoms in this graph to atoms in that graph.

matchAll(graph, substructure=False)[source]

Find all graph isomorphisms between self and the given Graph. If no isomorphism could be found, return empty list; otherwise return list of dicts mapping atoms in this graph to atoms in that graph. If substructure is True, return isomorphisms between self and any subgraph of the given Graph.

size()[source]

number of atoms in graph

msys.GuessHydrogenPositions(atoms)[source]

Experimental

class msys.HydrogenBondFinder(system, donors, acceptors, cutoff=3.5)[source]

Find candidate hydrogen bonds.

More hbonds will be found than are “realistic”; further filtering may be performed using the energy attribute of the returned hbonds. A reasonable filter seems to be around -1.0 (more negative is stronger); i.e. energies greater than that are more likely than not to be spurious.

The HydrogenBond class can also be used directly to compute hydrogen bond geometry and energies by supplying donor, acceptor and hydrogen positions.

__init__(system, donors, acceptors, cutoff=3.5)[source]
Parameters
  • system (System) – msys system

  • donors – selection string, list of ids, or list of Atoms

  • acceptors – selection string, list of ids, or list of Atoms

  • cutoff (float) – distance cutoff for donor and acceptor

Note

If Atoms are provided, they must be members of system.

__weakref__

list of weak references to the object (if defined)

find(pos=None)[source]

Find hydrogen bonds for the given positions, defaulting to the current positions of the input system.

class msys.InChI(system, DoNotAddH=True, SNon=False, FixedH=True)[source]

InChI holds an the result of an inchi invocation for a structure

__init__(system, DoNotAddH=True, SNon=False, FixedH=True)[source]

Initialize self. See help(type(self)) for accurate signature.

__str__()[source]

Return str(self).

__weakref__

list of weak references to the object (if defined)

property auxinfo

Auxiliary info

property key

inchi key for this object’s string.

property message

Message returned by inchi during calculation

property ok

Was an inchi computed successfully?

property string

Computed inchi string

class msys.IndexedFileLoader(path, idx_path=None)[source]

Supports random access to multi-structure files

__getitem__(index)[source]

Get structure at given index

Parameters

index (int) – 0-based index

Returns

msys System

Return type

mol (System)

__init__(path, idx_path=None)[source]

Open an indexed file loader, creating an index file if needed. :param path: file path. File type is inferred from the extension. :type path: str :param idx_path: index file path. Defaults to $path.idx. :type idx_path: str

Note

You need write permission to the location of the index file.

__len__()[source]

number of entries

__weakref__

list of weak references to the object (if defined)

property path

path to source file

msys.LineIntersectsTriangle(r, s, a, b, c)[source]

True if line segment rs intersects triangle abc

msys.Load(path, structure_only=False, without_tables=None)[source]

Infer the file type of path and load the file.

Parameters
  • path – if str, a file path or PDB accession code. if int, an Anton jobid (require ‘yas’ garden module)

  • structure_only (bool) – Omit force tables and pseudo atoms

  • without_tables (bool) – Omit force tables.

Returns

new System

msys.LoadDMS(path=None, structure_only=False, buffer=None)[source]

Load the DMS file at the given path and return a System containing it. If structure_only is True, only Atoms, Bonds, Residues and Chains will be loaded, along with the GlobalCell, and no pseudos (atoms with atomic number less than one) will be loaded.

If the buffer argument is provided, it is expected to hold the contents of a DMS file, and the path argument will be ignored.

msys.LoadMAE(path=None, ignore_unrecognized=False, buffer=None, structure_only=False)[source]

load the MAE file at the given path and return a System containing it. Forcefield tables will be created that attempt to match as closely as possible the force terms in the MAE file; numerical differences are bound to exist. If ignore_unrecognized is True, ignore unrecognized force tables.

If the buffer argument is provided, it is expected to hold the contents of an MAE file, and the path argument will be ignored.

If the contents of the file specified by path, or the contents of buffer, are recognized as being gzip-compressed, they will be decompressed on the fly.

If structure_only is True, no forcefield components will be loaded.

msys.LoadMany(path, structure_only=False, error_writer=<_io.TextIOWrapper name='<stderr>' mode='w' encoding='UTF-8'>)[source]

Iterate over structures in a file, if the file type supports iteration.

for mol in LoadMany(‘input.mol2’): …

If there was an error reading a structure, LoadMany returns None for that iteration. If error_writer is not None, it’s write() method is invoked with the contents of the exception message as argument. error_writer defaults to sys.stderr.

msys.LoadMol2(path, multiple=False)[source]

Load a mol2 file at the given path. If multiple is True, return a list of Systems, one for each MOLECULE record. Otherwise, return just one System corresponding to the first MOLECULE record.

msys.LoadPDB(path)[source]

Load a PDB file at the given path and return a System. No bonds will be created, even if CONECT records are parent.

msys.LoadPrmTop(path, structure_only=False)[source]

Load an Amber7 prmtop file at the given path and return a System. Coordinates and global cell information are not present in the file.

msys.LoadXYZ(path)[source]

Load an xyz file at the given path. Guesses bonds based on guessed atomic numbers based on atom name.

msys.MatchFragments(mol1, mol2, key='graph')[source]

construct an atom to atom mapping for all fragments from mol1 to mol2

Parameters
  • mol1 – System

  • mol2 – System

  • key – see FindDistinctFragments

Returns

dict[Atom -> Atom] or None

msys.NonbondedSchemas()[source]

available nonbonded schemas for System.addNonbondedFromSchema

class msys.Param(_ptr, _id)[source]

A Param instance is a reference to a row in a ParamTable. Use the dict-style interface to get and set values in the row. Msys will take care of converting input values to the type of the corresponding column, and raise an exception if the conversion cannot be performed.

__getitem__(prop)[source]

get the value of prop

__repr__()[source]

Return repr(self).

__setitem__(prop, val)[source]

update the value of prop with val

duplicate()[source]

create a new entry in the parent parameter table with the same values as this one, returning it.

property id

id in parent table

keys()[source]

sorted list of available properties

property system

parent System

property table

parent ParamTable

class msys.ParamTable(_ptr)[source]

The ParamTable class is a 2d table, whose rows are indexed by id and whose columns are properties; see the discussion of properties in the Overview. A ParamTable is used by TermTables to hold the shared parameters for its Terms.

__eq__(x)[source]

Return self==value.

__hash__()[source]

Return hash(self).

__init__(_ptr)[source]

Initialize self. See help(type(self)) for accurate signature.

__ne__(x)[source]

Return self!=value.

addParam(**kwds)[source]

add and return a new Param().

If keyword arguments are supplied, they will be assigned to the newly created Param before returning it.

addProp(name, type)[source]

add a new property of the given type, which must be int, float, or str.

delProp(name)[source]

removes the property with the given name.

find(name, value)[source]

return the Params with the given value for name

property nparams

number of Params

property nprops

number of properties

param(id)[source]

fetch the Param with the given id

property params

list of all Params in table

propType(name)[source]

type of the property with the given name

property props

names of the properties

msys.ParseSDF(text)[source]

Iterate over blocks in sdf format text. Accepts normal and gzipped text.

msys.ReadCrdCoordinates(mol, path)[source]

Read coordinates from the given Amber crd file into the given System.

msys.ReadPDBCoordinates(mol, path)[source]

Read coordinates and box from the given pdb file into the given System.

class msys.Residue(_ptr, _id)[source]

Represents a residue (group of Atoms) in a System

__repr__()[source]

Return repr(self).

addAtom()[source]

append a new Atom to this Residue and return it

property atoms

list of Atoms in this Residue

property center

return geometric center of positions of atoms in residue

property chain

parent chain

property insertion

insertion code

property name

residue name

property natoms

number of atoms in this residue

remove()[source]

remove this Residue from the System

property resid

the PDB residue identifier

selectAtom(name=None)[source]

Returns a single Atom from this residue with the given name, or None if no such atom is present. If multiple atoms in the residue have that name, raise an exception.

msys.Save(mol, path, append=False, structure_only=False)[source]

Save the given system to path, using a file format guessed from the path name. Not all formats support both append and structure_only options; see the corresponding SaveXXX functions.

msys.SaveDMS(system, path, structure_only=False, unbuffered=False)[source]

Export the System to a DMS file at the given path.

msys.SaveMAE(system, path, with_forcefield=True, append=False)[source]

Export the System (or list of systems) to an MAE file at the given path.

msys.SaveMol2(system, path, selection='none', append=False, moe=True)[source]

Export the System to a mol2 file at the given path. You should probably call AssignBondOrderAndFormalCharge() before exporting the system.

Parameters
  • system (System) – msys system

  • path (str) – file name to save to

  • selection (str) – msys selection string to restrict to

  • append (bool) – if True, don’t clobber path, just append to it

  • moe (bool) – output guadinium groups with aromatic bonds so that MOE will read correctly

msys.SavePDB(system, path, append=False, reorder=False)[source]

Export the System to a PDB file at the given path.

msys.SerializeMAE(system, with_forcefield=True)[source]

Return the MAE form of the System as bytes.

class msys.SmartsPattern(pattern)[source]

A class representing a compiled SMARTS pattern

The Msys smarts implementation is similar to that of Daylight smarts <http://www.daylight.com/dayhtml/doc/theory/theory.smarts.html>, with support for arbitrarily nested recursive smarts. A few features are not currently supported; warnings will be generated when these constructs are used in a smarts pattern.

  • Directional bonds; e.g. \ and /; these are treated as single bonds (i.e. as a - character).

  • Chiral specification (@, @@, etc); ignored.

  • Implicit hydrogen (h): treated as explicit H.

  • Explicit degree (D): treated as bond count X.

  • Isotopes: ([12C]): ignored.

  • Atom class ([C:6]): ignored.

On the other hand, Msys does support hybridization using the ^ token, as in OpenBabel:

[c^2]       select sp2 aromatic carbon
__init__(pattern)[source]

Initialize with SMARTS pattern

__repr__()[source]

Return repr(self).

__weakref__

list of weak references to the object (if defined)

findMatches(annotated_system, atoms=None)[source]

Return list of lists representing ids of matches of this pattern in this system, optionally requiring that the first atom match belongs to the given set of atoms. An AnnotatedSystem must be used here, which can be constructed from a System after calling AssignBondOrderAndFormalCharge.

match(annotated_system)[source]

Return True if a match is found anywhere; False otherwise.

This is much faster than checking for an empty result from findMatches.

property natoms

Number of atoms in the compiled smarts pattern

property pattern

The pattern used to initialize the object

property warnings

Warnings, if any, emitted during compilation

class msys.SpatialHash(pos, ids=None, box=None)[source]

SpatialHash provides an interface for efficient spatial queries on particle positions.

__init__(pos, ids=None, box=None)[source]

Construct from particle positions. If ids are provided, they should be a numpy array of type uint32 and specify which rows of the Nx3 pos array are to be hashed. If box is provided, it must be a 3x3 array of doubles, and the search will be performed using periodic boundary conditions.

__weakref__

list of weak references to the object (if defined)

findContacts(radius, pos, ids=None, reuse_voxels=False)[source]

Find pairs of particles within radius of each other.

Parameters
  • radius (float) – search radius

  • pos (array[float]) – positions

  • ids (array[uint]) – particle indices

  • reuse_voxels (bool) – assume voxelize(R>=radius) has already been called

  • ignore_excluded (bool) – exclude atom pairs in the exclusion table.

Returns

Mx1 arrays of ids and distances.

Return type

i, j, dists (tuple)

The first array corresponds to ids in the call to findContacts; the second column to the ids passed to the SpatialHash constructor.

IMPORTANT: pairs with the same id in the constructor and the call the findContacts are excluded from the output set. Therefore, the positions passsed to findContacts should correspond to the same atom indices as the positions passed to the SpatialHash constructor.

findNearest(k, pos, ids=None)[source]

Find at most k particles from pos with the smallest minimum distance to some particle in the spatial hash. If ids is not provided, it defaults to arange(len(pos)).

findWithin(radius, pos, ids=None, reuse_voxels=False)[source]

Find particles from pos which are within the given radius of some particle in the spatial hash (i.e. provided in the SpatialHash constructor). By default, voxelization is performed at the same resolution as the query radius, but this can be overridden by calling voxelize() manually, then calling findWithin() with reuse_voxels=True. pos is expected to be an Nx3 array of floats. The ids parameter defaults to arange(len(pos)); supply an array of ids to limit the search to a subset of rows in pos.

voxelize(radius)[source]

Perform voxelization such that findWithin queries with reuse_voxels=True at a radius equal to or less than the given radius can be performed accurately. For queries at radius less than the voxelization, it may be worthwhile to revoxelize at a smaller radius. Note that, by default, findWithin calls voxelize with the query radius as argument, so it is not strictly necessary ever to use this method.

class msys.System(_ptr)[source]

The System class holds all structure and forcefield data for a single chemical system. Create a new System using msys.CreateSystem(), or from a file using msys.LoadDMS or msys.LoadMAE.

A System organizes the information in a DMS file into several different groups:

  • Tables - TermTables are grouped and accessed by name

  • cell - the unit cell vectors for the System, in the form of a 3x3 NumPy array.

  • nonbonded_info - the NonbondedInfo object describing the type of nonbonded interactions.

  • provenance - a list of Provenance objects describing how the input file has been processed.

  • Auxiliary tables: Everything else in the DMS file that does not fit into one of the above categories finds its way into an auxiliary table. Notable denizens of this category include:

    • cmap tables

    • forcefield (annotation for parameters in the DMS file)

__eq__(x)[source]

Return self==value.

__getinitargs__()[source]

Pickle support (requires cPickle.HIGHEST_PROTOCOL)

__hash__()[source]

Return hash(self).

__init__(_ptr)[source]

Construct from SystemPtr. Do not invoke directly; use CreateSystem() instead.

__ne__(x)[source]

Return self!=value.

__repr__()[source]

Return repr(self).

addAtom()[source]

add and return a new Atom in its own residue

addAtomProp(name, type)[source]

add a custom atom property with the given name and type. type should be int, float, or str.

addAuxTable(name, table)[source]

add or replace extra table with the given name.

addBondProp(name, type)[source]

add a custom bond property with the given name and type. type should be int, float, or str.

addChain(ct=4294967295)[source]

add and return a new Chain. If no ct is given, the chain will be added to the first ct, creating one if necessary.

addCt()[source]

add and return a new Ct

addNonbondedFromSchema(funct, rule='')[source]

Add a nonbonded table to the system, and configure the nonbonded info according to funct and rule. funct must be the name of recognized nonbonded type. rule is not checked; at some point in the future we might start requiring that it be one of the valid combining rules for the specified funct. If nonbonded_info’s vdw_funct and vdw_rule are empty, they are overridden by the provided values; otherwise, the corresponding values must agree if funct and rule are not empty. A nonbonded table is returned.

addResidue()[source]

add and return a new Residue in its own chain

addTable(name, natoms, params=None)[source]

add a table with the given name and number of atoms. If a table with the same name already exists, it is returned, otherwise the newly created table is returned. If no ParamTable params is supplied, a new one is created.

addTableFromSchema(type, name=None)[source]

Add a table to the system if it not already present, returning it. If optional name field is provided, the table will be added with the given name; otherwise the name is taken from the table schema.

analyze()[source]

Assign atom and residue types. This needs to be called manually only if you create a system from scratch, using msys.CreateSystem(); in that case, analyze() should be called before performing any atom selections.

append(system)[source]

Appends atoms and forcefield from system to self. Returns a list of of the new created atoms in self. Systems must have identical nonbonded_info.vdw_funct. Overwrites self.global_cell with system.global_cell only when self.global_cell is all zeros.

asCapsule()[source]

Return a capsule wrapper of the internal SystemPtr.

The capsule holds a bare pointer and therefore must not outlive self.

atom(id)[source]

return the atom with the specified id

atomPropType(name)[source]

type of the given atom property

property atom_props

return the list of custom atom properties.

property atoms

return list of all atoms in the system

atomsGroupedBy(prop)[source]

Return dictionary mapping representative values of the given atom property to lists of atoms having that property. If the property does not exist in this system, returns an empty dictionary.

atomsel(sel)[source]

Create and return an atom selection object (Atomsel). :param sel: str atom selection, or list of GIDs (possibly empty). :type sel: object

Note

Even if ids are provided, the ids of the selection are in sorted order.

auxtable(name)[source]

auxiliary table with the given name

property auxtable_names

names of the auxiliary tables

property auxtables

all the auxiliary tables

bond(id)[source]

return the bond with the specified id

bondPropType(name)[source]

type of the given bond property

property bond_props

return the list of custom bond properties.

property bonds

return list of all bonds in the system

property cell

The GlobalCell for this System

property center

return geometric center of positions of all atoms

chain(id)[source]

return the chain with the specified id

property chains

return list of all chains in the system

clone(sel=None, share_params=False, use_index=False, forbid_broken_bonds=False)[source]

Clone the System, returning a new System. If selection is provided, it should be an atom selection string, a list of ids, or a list of Atoms.

If share_params is True, then ParamTables will be shared between the old and new systems. By default, copies of the ParamTables are made, but ParamTables shared _within_ the old system will also be shared in the new system.

If forbid_broken_bonds is True, an exception will be thrown if the selected atoms do not include all atoms connected by bonds.

coalesceTables()[source]

Invoke TermTable.coalesce on each table

ct(id)[source]

return the Ct with the specified id

property cts

return list of all cts in the system

delAtomProp(name)[source]

remove the given custom atom property

delAtoms(atoms)[source]

remove the given Atoms from the System

delAuxTable(name)[source]

remove auxiliary table with the given name.

delBondProp(name)[source]

remove the given custom bond property

delBonds(bonds)[source]

remove the given Bonds from the System

delChains(chains)[source]

remove the given Chains from the System

delResidues(residues)[source]

remove the given Residues from the System

findBond(a1, a2)[source]

return the bond between the specified atoms, or None if not found

findContactIds(cutoff, ids=None, other=None, pos=None, ignore_excluded=False)[source]

Find atoms not bonded to each other which are within cutoff of each other. If ids is not None, consider only atoms with the given ids. If other is not None, consider only atom pairs such that one is in ids and the other is in other. If pos is not None, use pos as positions, which should be natoms x 3 regardless of the size of ids or other. pos may be supplied only when there are no deleted atoms in the structure. If ignore_excluded=True, exclusions from the exclusion table are used. If ignore_excluded=False, bonds in System.bonds are still excluded.

Returns a list of (id 1, id 2, distance) tuples for each contact found.

classmethod fromCapsule(cap)[source]

Construct from a capsule wrapper of a SystemPtr.

getCell()[source]

return copy of unit cell as 3x3 numpy array

getPositions()[source]

get copy of positions as Nx3 array

getTable(name)[source]

Return the TermTable with the given name, or None if not present.

getVelocities()[source]

get copy of velocities as N3x array

guessBonds(replace=True, reanalyze=True, periodic=False)[source]

Guess bond connectivity based on an atomic-number based atom radius.

Replaces any existing bonds, unless replace=False is specified.

Reanalyzes fragids and atom types unless reanalyze=False is specified. In that case, you MUST call updateFragids() manually before making any use of the fragment assignment (fragids will be out of date).

hash(sorted=True)[source]

hash of contents of this system.

The hash is insensitive to provenance and System.name.

property name

The name of the System, taken from the input file name

property natoms

number of atoms

property nbonds

number of bonds

property nchains

number of chains

property ncts

number of Cts

property nonbonded_info

NonbondedInfo for this System

property nresidues

number of residues

property positions

Nx3 list of lists of positions of all atoms

property provenance

return a list of Provenance entries for this system

residue(id)[source]

return the residue with the specified id

property residues

return list of all residues in the system

save(path, structure_only=False)[source]

Write self to path

Parameters
  • path (str) – file path

  • structure_only (bool) – write only atom information, not forcefield

Returns

self

select(seltext)[source]

return a list of Atoms satisfying the given VMD atom selection.

selectArr(seltext)[source]

Return the ids of the Atoms satisfying the given VMD atom selection as a numpy array of type uint32.

selectChain(name=None, segid=None)[source]

Returns a single Chain with the matching name and/or segid, or raises an exception if no single such chain is present.

selectCt(name=None)[source]

Return a single Ct with the matching name, or raises an exception if no single such Ct is present

selectIds(seltext, pos=None, box=None)[source]

Return the ids of the Atoms satisfying the given VMD atom selection. This can be considerably faster than calling select().

if pos is supplied, it should be an Nx3 numpy array of positions, where N=self.natoms.

If box is supplied, it should be a 3x3 numpy array of cell vectors, like System.cell.

setCell(cell)[source]

set unit cell from from 3x3 array

setPositions(pos)[source]

set positions from Nx3 array

setVelocities(vel)[source]

set velocities from Nx3 array

sorted()[source]

Return a clone of the system with atoms reordered based on their order of appearance in a depth-first traversal of the structure hierarchy.

table(name)[source]

Get the TermTable with the given name, raising ValueError if not present.

property table_names

names of the tables in the System

property tables

all the tables in the System

property topology

list of bonded atoms for each atom in the System

translate(xyz)[source]

shift coordinates by given amount

updateFragids()[source]

Find connected sets of atoms, and assign each a 0-based id, stored in the fragment property of the atom. Return a list of fragments as a list of lists of atoms.

class msys.SystemImporter(system)[source]

Maps atoms to residues, chains and cts

__init__(system)[source]

construct from system to be constructed

Parameters

system (System) – msys System

__weakref__

list of weak references to the object (if defined)

addAtom(chain, segid, resnum, resname, aname, insertion='', ct=0)[source]

Add atom to system

Returns

newly added Atom

initialize(atoms)[source]

Process existing atoms in the system

Parameters

atoms (list[msys.Atom]) –

Note

can be called multiple times; each time clears the internal tables so that subsequent atoms do not share residues, chains, or cts with previously added atoms.

property system

input system

msys.TableSchemas()[source]

available schemas for System.addTableFromSchema

class msys.Term(_ptr, _id)[source]

A Term is a handle for an entry in a TermTable.

The properties of a Term can be read and updated using a dictionary like interface. Both “term properties” and properties from the ParamTable are accessed through the same interface. To add or remove properties, use the provided methods in the TermTable or ParamTable instance. If a Term’s param is shared by another Term in any other TermTable, Msys will take care of providing the Term with its own Param containing a copy of the original properties before applying the changes. However, if you a modify a Param through its dictionary interface, you will affect all Terms that happen to share that Param:

# fetch the stretch_harm table
table = mol.table('stretch_harm')
# update the properties of just the first Term
table.term(0)['fc'] = 320
# update the properties of all terms that use this param!
table.term(0).param['fc'] = 320
__getitem__(prop)[source]

get the value of property prop

__repr__()[source]

Return repr(self).

__setitem__(prop, val)[source]

set the value of property prop

property atoms

list of Atoms for this Term

property id

id of this term in its TermTable

keys()[source]

union of table.params.props and table.term_props

property param

The Param corresponding to this Term’s parameters

remove()[source]

remove the given Term from its TermTable

property system

parent System of parent TermTable

property table

parent TermTable

class msys.TermTable(_ptr)[source]

Each TermTable is intended to describe a specific type of interaction, e.g. stretch, angle, Lennard-Jones, constraint_hoh, etc. A TermTable has an arity (given by the natoms property) which specifies how many atoms are involved in each interaction: one for nonbonded terms, two for stretch terms, etc. Each interaction instance is described by a Term. Each Term references the appropriate number of atoms, and exactly one Param, which lives in a ParamTable owned (or possible shared) by the TermTable.

The functional form described by a TermTable is not part of msys; all msys does is represent the forcefield parameters in a generic way.

__eq__(x)[source]

Return self==value.

__hash__()[source]

Return hash(self).

__init__(_ptr)[source]

Construct from TermTablePtr. Do not invoke directly; use System.addTable or System.table instead

__ne__(x)[source]

Return self!=value.

__repr__()[source]

Return repr(self).

addTerm(atoms, param=None)[source]

Add a Term to the table, with given initial param. The atoms list must have natoms elements, and each Atom must belong to the same System as the TermTable. If param is not None, it must belong to the ParamTable held by the TermTable.

addTermProp(name, type)[source]

add a custom Term property of the given type

property category

A string describing what kind of TermTable this is. Possibilities are: bond, constraint, virtual, polar, nonbonded, and exclusion.

coalesce()[source]

Reassign param for each Term in this Table to a member of the distinct set of Params used by those Terms.

count_overrides()[source]

return the number of pairwise nonbonded interactions which are affected by the overrides in the given term table.

delTermProp(name)[source]

remove the custom Term property

delTermsWithAtom(atom)[source]

remove all terms whose atoms list contains the given Atom

findExact(atoms)[source]

return the terms that contain precisely the given atoms in the given order.

findWithAll(atoms)[source]

return the terms that contain all the given atoms in any order

findWithAny(atoms)[source]

return the terms that contain at least one of the given atoms

findWithOnly(atoms)[source]

return the terms that contain only the given atoms

getOverride(pi, pj)[source]

get override for given pair of params, or None if not present.

hasTerm(id)[source]

Does a Term with the given id exist in the table?

property name

name of this table

property natoms

number of atoms in each term

property noverrides

number of parameter overrides

property nterms

number of terms

property override_params

parameter table containing override values

overrides()[source]

return a mapping from pairs of Params in self.params to a Param in self.override_params.

property params

The ParamTable for terms in this table.

property props

Table properties

remove()[source]

Remove this table from its parent system

replaceWithSortedTerms()[source]

Replace table in self.system with a version of self that has terms sorted by atom ids.

setOverride(pi, pj, op)[source]

override the interaction between params pi and pj with the interaction given by op. pi and pj must be Params from self.params; op must be a param from self.override_params, or None to remove the override.

property system

The System whose atoms are referenced by this table.

term(id)[source]

returns the Term in the table with the given id

termPropType(name)[source]

type of the given Term property

property term_props

names of the custom properties

property terms

returns a list of all the Terms in the table

The Atomsel class

class msys.atomsel.Atomsel(ptr, seltext)[source]

Supports alignment of molecular structures

__init__(ptr, seltext)[source]

don’t use directly - use System.atomsel()

__len__()[source]

number of selected atoms

__repr__()[source]

Return repr(self).

__str__()[source]

Return str(self).

alignCoordinates(other)[source]

If other is an Atomsel instance, align the coordinates of other’s System with self. If other is a numpy array, align the array with self, using corresponding indices.

In either case, return the aligned RMSD.

alignedRMSD(other)[source]

Return the aligned rmsd to other.

currentRMSD(other)[source]

compute RMS distance to other object, which may be Atomsel or an array of positions. In either it must be the case that len(other) equals len(self) or len(self.system)

property ids

ids of selected atoms in the parent system

raw_alignment(other)[source]

Compute alignment to other object. Compute and return aligned rmsd, and rotational and translational transformations.

property system

parent system

Dealing with duplicate parameters

After performing various modifications to a TermTable, you may find that the associated ParamTable contains many entries whose values are all identical. The redundant parameters can be removed by first “coalescing” the parameter assignments of each Term to a set of distinct Params, then cloning the System. When a System is cloned, only the Params which are referenced by at least one Term in the TermTable are copied to the new System:

import msys
mol=msys.CreateSystem()
a1=mol.addAtom()
a2=mol.addAtom()
a3=mol.addAtom()
table = mol.addTableFromSchema('stretch_harm')
p1=table.params.addParam()
p1['fc']=320
p1['r0']=1.0
t1=table.addTerm([a1,a2], p1)
t2=table.addTerm([a1,a3], p1)

# At this point we have two terms and one param.  Suppose we ignore the
# fact that t1 and t2 share a Param, and we just update their properties
# to the same value:

t1['r0']=1.2
t2['r0']=1.2

# Now we have two Params, because when we updated t1, we created a second
# Param that was unshared by t2.  When we updated t2, p1 was unshared, so
# no duplicate was made.
assert table.params.nparams==2

# But we could get by with only a single Param.  Let's do that:
mol.coalesceTables()

# At this point t1 and t2 are sharing a Param, and the other one is unused:
assert t1.param==t2.param
assert table.params.nparams==2
assert table.nterms==2

# When we clone, the unused params are not copied to the new system.
mol2=mol.clone()
assert mol2.table('stretch_harm').params.nparams==1
class msys.TermTable(_ptr)[source]

Each TermTable is intended to describe a specific type of interaction, e.g. stretch, angle, Lennard-Jones, constraint_hoh, etc. A TermTable has an arity (given by the natoms property) which specifies how many atoms are involved in each interaction: one for nonbonded terms, two for stretch terms, etc. Each interaction instance is described by a Term. Each Term references the appropriate number of atoms, and exactly one Param, which lives in a ParamTable owned (or possible shared) by the TermTable.

The functional form described by a TermTable is not part of msys; all msys does is represent the forcefield parameters in a generic way.

addTerm(atoms, param=None)[source]

Add a Term to the table, with given initial param. The atoms list must have natoms elements, and each Atom must belong to the same System as the TermTable. If param is not None, it must belong to the ParamTable held by the TermTable.

addTermProp(name, type)[source]

add a custom Term property of the given type

property category

A string describing what kind of TermTable this is. Possibilities are: bond, constraint, virtual, polar, nonbonded, and exclusion.

coalesce()[source]

Reassign param for each Term in this Table to a member of the distinct set of Params used by those Terms.

count_overrides()[source]

return the number of pairwise nonbonded interactions which are affected by the overrides in the given term table.

delTermProp(name)[source]

remove the custom Term property

delTermsWithAtom(atom)[source]

remove all terms whose atoms list contains the given Atom

findExact(atoms)[source]

return the terms that contain precisely the given atoms in the given order.

findWithAll(atoms)[source]

return the terms that contain all the given atoms in any order

findWithAny(atoms)[source]

return the terms that contain at least one of the given atoms

findWithOnly(atoms)[source]

return the terms that contain only the given atoms

getOverride(pi, pj)[source]

get override for given pair of params, or None if not present.

hasTerm(id)[source]

Does a Term with the given id exist in the table?

property name

name of this table

property natoms

number of atoms in each term

property noverrides

number of parameter overrides

property nterms

number of terms

property override_params

parameter table containing override values

overrides()[source]

return a mapping from pairs of Params in self.params to a Param in self.override_params.

property params

The ParamTable for terms in this table.

property props

Table properties

remove()[source]

Remove this table from its parent system

replaceWithSortedTerms()[source]

Replace table in self.system with a version of self that has terms sorted by atom ids.

setOverride(pi, pj, op)[source]

override the interaction between params pi and pj with the interaction given by op. pi and pj must be Params from self.params; op must be a param from self.override_params, or None to remove the override.

property system

The System whose atoms are referenced by this table.

term(id)[source]

returns the Term in the table with the given id

termPropType(name)[source]

type of the given Term property

property term_props

names of the custom properties

property terms

returns a list of all the Terms in the table

Sharing ParamTables

Forcefield developers will (we hope!) appreciate the ability for Msys to parameterize multiple TermTables from potentially different Systems using a single ParamTable instance. Normally, when a System is loaded from an input file, or a TermTable is created using the scripting interface, each TermTable refer to a ParamTable of its very own, and no other TermTable can or will reference it. However, at the time that a TermTable is created, a ParamTable can be provided which will be used to hold the Param entries for the Terms in the TermTable:

# create two independent systems
m1=msys.CreateSystem()
m2=msys.CreateSystem()

# add some atoms
m1.addAtom()
m2.addAtom()
m2.addAtom()

# create a free-standing ParamTable and add some Params
params=msys.CreateParamTable()
p1=params.addParam()
p2=params.addParam()

# create a table in system 1 which uses the free ParamTable
table1=m1.addTable("table", 1, params)

# no other TermTable is using the ParamTable
assert not params.shared

# create a table in system 2 which also uses the free ParamTable
table2=m2.addTable("table", 1, params)

# now the ParamTable is shared
assert params.shared
assert table1.params == table2.params

# Add some terms to each table
t1=table1.addTerm(m1.atoms, p2)
t2=table2.addTerm(m2.atoms[1:], p2)

assert t1.param == t2.param
assert t2.param == p2

# modifications to the the original table and its params are propagated
# to each table
params.addProp("fc", float)
p1['fc']=32
p2['fc']=42
assert t1['fc']==42
assert t2['fc']==42

# p1 is shared by multiple TermTables, but within a TermTable, p1 is not
# shared.  Modifications to t1['fc'] will affect t2!
t1['fc'] = 52
assert t2['fc'] == 52

Pfx

A high level interface for wrapping, centering, and alignment.

This module provides simple, yet performant methods for manipulating trajectories of systems with connected atoms and periodic boundary conditions.

msys.pfx.aligned_rmsd(X, Y, weight=None) → mat, rmsd

Compute the matrix aligning Y onto X, optionally with weights. Return the matrix and the rmsd.

msys.pfx.inverse_3x3(A) → Ainv
msys.pfx.svd_3x3(A) → U, w, V

svd_3x3 computes the singular value decomposition of the 3x3 matrix A. The result is always calculated and returned in double precision.

What pfx does

Pfx can be configured to perform a number of tasks related to postprocessing of trajectories of molecular systems. There are four main issues which Pfx is designed to deal with:

  1. Fixing bonds. If two atoms with a bond between them are found in different periodic images, one of them must be shifted by some integer linear combination of the unit cell vectors so that the distance between them is no greater than half a unit cell vector along each cell vector direction. When there are multiple atoms bonded together, the bond fixing operation must be applied to the bonds composing this connected component in topologically sorted order.

  2. Gluing components. Some molecular systems, such as multimeric ion channels, contain components which are not explicitly bonded to each other, but which do stay together during the simulation and should therefore be kept together during postprocessing. For each set of glued components, Pfx finds the transformations which minimize the square distance between the centers of each component.

  3. Centering and alignment. Pfx can either center a selected set of atoms on the origin, or align a selected set to a reference structure.

  4. Wrapping components. Any of the preceeding operations could place the center of a connected set of atoms outside the unit cell centered at the origin. Pfx shifts each connected component to bring it as close to the origin as possible, which maintaining any glued components. Importantly, if an alignment has been performed in the previous step, then the rotational part of the alignment transformation must be applied to the unit cell before performing the wrapping. Another subtlety is that when alignemnt has been performed, the wrapping should be performed about the center of the reference selection, not necessarily the origin. Otherwise, if the reference structure is far from the origin, wrapping could undo the alignment.

The main work of pfx is does in the apply method. The arguments to apply are a coordinate set and, optionally, a periodic cell and/or a set of velocities. Here’s what happens when you call apply, assuming that both the periodic cell and the velocities have been provided:

  1. Fix bonds.

  2. Glue components.

  3. Translate the entire system to bring the centered or aligned atoms to the origin.

  4. Compute a rotational transformation which aligns the system to the centered reference structure.

  5. Apply the rotation to the input positions, unit cel, and velocities.

  6. Wrap connected and glued components.

  7. Shift the entire system to the center of the reference structure.

Specifying topology

A Pfx instance is constructed from a bond topology. The bond topology indicates both how many atoms are in the molecular system as well as which atoms are bonded together. Pfx analyzes this topology to find the connected components comprising the system. If the fixbonds argument is True, then Pfx also computes and caches a topologically sorted list of bonds from the topology, so that the bond fixing step can be performed efficiently.

Specifying glue

The glue method of Pfx lets you specify atoms which should be kept together even if there is no explicit bond between them. Suppose the ids of all the protein atoms in a multimeric protein are passed as the argument to glue. Pfx first finds the set of connected components which overlap with the selection. In the glue components step, the centers of these components will be brought together. Moreover, in the wrap components step, all the protein atoms will be treated as a single component for the purpose of wrapping.

Suppose now that only one residue from each monomer of a multicomponent protein is included in a glue selection. The same set of connected components will be kept together as before, when the entire protein was glued; however, the centers of the connected components will be computed from just the glued residue in each monomer, rather than from all atoms of each monomer. The wrap components step will be unchanged.

The glue method can be called multiple times on the same Pfx instance. It is perfectly valid for glue selections in different invocations to overlap.

Performing both centering and alignment

When viewing trajectories, chemists often want to specify both “center” and “fit” selections. But what does this mean? If you center on, say, atoms 1-10, and align atoms 10-20, one operation will undo the other. The only sensible approach seems to be to apply the “center” specification to whatever is being used for the reference structure, and then use the “fit” selection to align non-reference frames to that selection.

What about periodicfix?

The algorithms in this module are essentially the same as those in periodicfix. So why a new module? Here are some reasons:

  • Periodicfix isn’t consistent about its use of float and double, and does a lot of interconversion. This makes it slow.

  • Periodicfix doesn’t have both single and double precision versions available from Python. This one does.

  • Periodicfix makes you specify weights to align a subset of atoms. Pfx doesn’t use weights; or, if you like, the weights must be zero or one. In practice it’s been found that that’s all anyone needs. Having to specify weights is cumbersome. If someone really wants to have weight support in pfx we can add it some day.

  • Periodicfix has separate topology and fragment wrapper types, which make the Python interface more cumbersome. Pfx has just one type.

  • Periodicfix has accreted additional functionality which has nothing to do with periodic images or alignment, including contact finding and a hydrogen bond energy function.

  • The svd in periodicfix is greatly inferior to the one here. Yes, it would be easy replace the one in periodicfix with this one.

Molfile

Structure and coordinate file manipulation library.

Reading a structure file:

reader = molfile.mae.read('/path/to/foo.mae')

Iterating through the frames in a file:

for frame in molfile.dtr.read('/path/to/foo.dtr').frames():
    function( frame.pos, frame.vel, frame.time, frame.box )

Random access to frames (only dtr files support this currently):

f27 = molfile.dtr.read('/path/to/foo.dtr').frame(27) # 0-based index

Write a trajectory to a frameset (dtr):

f = msys.molfile.Frame(natoms)
w = msys.molfile.dtr.write('output.dtr', natoms=natoms)
for i, xyz in enumerate(xyzs):
    f.pos[:] = xyz
    f.time = i
    w.frame(f)
w.close()

Convert an mae file to a pdb file:

input=molfile.mae.read('foo.mae')
output=molfile.pdb.write('foo.pdb', atoms=input.atoms)
output.frame(input.frames().next())
output.close()

Write every 10th frame in a dtr to a trr:

input=molfile.dtr.read('big.dtr')
output=molfile.trr.write('out.trr, natoms=input.natoms)
for i in range(0,input.nframes, 10):
    output.frame( input.frame(i) )
output.close()

Write a frame with a specified set of gids:

f = molfile.Frame(natoms, with_gids=True
f.gid[:] = my_gids
f.pos[:] = my_positions
w.frame(f)

Read the raw fields from a frameset (dtr):

dtr = molfile.DtrReader('input.dtr')    # also works for stk
for i in range(dtr.nframes):
    f = dtr.frame(i)
    keyvals = dict()
    frame = dtr.frame(i, keyvals=keyvals)
    ## use data in keyvals

Write raw fields to a frameset (dtr):

dtr = molfile.DtrWriter('output.dtr', natoms=natoms)
keyvals = dict( s = "a string",
                f = positions.flatten(),    # must be 1d arrays
                i = numpy.array([1,2,3]),
                )
dtr.append( time = my_time, keyvals = keyvals )
class msys.molfile.Plugin

Interface to readers and writers

property can_read
property can_write
property filename_extensions
property name
property prettyname
read((Plugin)arg1, (str)path[, (bool)double_precision=False]) → Reader :

Open a file for reading

property version
write((Plugin)arg1, (str)path[, (object)atoms=None[, (object)natoms=None]]) → Writer :

write(path,atoms=None,natoms=None)

class msys.molfile.DtrReader
fileinfo((DtrReader)arg1, (int)arg2) → tuple :

fileinfo(index) -> path, time, offset, framesize, first, last, filesize, dtrpath, dtrsize file contains frames [first, last)

frame((DtrReader)arg1, (int)index[, (object)bytes=None[, (object)keyvals=None]]) → object :

frame(index, bytes=None, keyvals=None) -> Frame Read bytes from disk if bytes are not provided If keyvals is not None, it should be a dict, and raw data from the frame will be provided.

frameset_is_compact((DtrReader)arg1, (int)arg2) → bool
frameset_path((DtrReader)arg1, (int)arg2) → str
frameset_size((DtrReader)arg1, (int)arg2) → int
static fromTimekeys((str)arg1, (list)arg2, (list)arg3) → DtrReader
index_ge((DtrReader)arg1, (float)arg2) → int
index_gt((DtrReader)arg1, (float)arg2) → int
index_le((DtrReader)arg1, (float)arg2) → int
index_lt((DtrReader)arg1, (float)arg2) → int
index_near((DtrReader)arg1, (float)arg2) → int
keyvals((DtrReader)arg1, (int)arg2) → dict :

keyvals(index) -> dict() Read raw fields from frame.

property metadata
property natoms
property nframes
property nframesets
property path
reload((DtrReader)arg1) → int :

reload() -> number of timekeys reloaded – reload frames in the dtr/stk

times((DtrReader)arg1) → object
total_bytes((DtrReader)arg1) → int
class msys.molfile.Frame
property box
property dpos
property dvel
property extended_energy
property fpos
property fvel
property kinetic_energy
moveby((Frame)arg1, (float)x, (float)y, (float)z) → None
property pos
property position
property potential_energy
property pressure
property pressure_tensor
select((Frame)arg1, (object)indices) → Frame
property temperature
property time
property total_energy
property vel
property velocity
property virial_tensor
class msys.molfile.Atom
addbond()

addbond(atom) – add atom to self.bonds and self to atom.bonds

altloc

segment name

anum

atomic number

bfactor

temperature factor

bonds

bonded atoms

chain

chain name

charge
delbond()

delbond(atom) – remove atom from self.bonds and self from atom.bonds

getbondorder()

getbondorder(atom) – bond order for bond with given atom.

insertion

segment name

mass

mass in amu

name

atom name

occupancy
radius

vdw radius

resid

residue id

resname

residue name

segid

segment name

setbondorder()

setbondorder(atom,val) – set bond order for bond with given atom.

type

atom type

class msys.molfile.SeqFile[source]

Read csv-like files with column names in the first row

class Reader(path)[source]
at_time_near(time)[source]
frame(n)[source]
frames()[source]
get_prop(prop)[source]
property natoms
property nframes
filename_extensions = 'seq'
name = 'seq'
classmethod read(path)[source]

Open an eneseq file for reading

class msys.molfile.Grid(data, name='', axis=None, origin=None)[source]
property axis
property data
property name
property origin

Reader

class msys.molfile.Reader

Structure or trajectory open for reading

A Reader is a handle to an open file. Use the atoms member to fetch the atomic structure from the file, assuming it exists. To access frames, there are two methods.

frames()

returns a FrameIter object for iteration over frames. FrameIter has two methods: the usual next() method which returns a Frame, and skip(n=1), which advances the iterator by n frames without (necessarily) reading anything. FrameIter is a very poor iterator: once a frame has been read or skipped, it can’t be loaded again; you have use a brand new Reader.

frame(n)

returns the nth frame (0-based index). Currently only the dtr plugin supports this method.

grid(n)

return the nth grid. For dx and ccp4 files.

at_time_ge((Reader)arg1, (float)time) → Frame
at_time_gt((Reader)arg1, (float)time) → Frame
at_time_le((Reader)arg1, (float)time) → Frame
at_time_lt((Reader)arg1, (float)time) → Frame
at_time_near((Reader)arg1, (float)time) → Frame
property atoms

list of Atoms

property bondorders

list of bond orders

frame((Reader)arg1, (int)arg2) → Frame
grid_data((Reader)arg1, (int)arg2, (object)arg3) → None
grid_meta((Reader)arg1, (int)arg2) → object
property has_velocities

reads velocities

property natoms

number of atoms

next((Reader)arg1) → Frame :

Return the next frame

property nframes

number of frames

property ngrids

number of grids

reopen((Reader)arg1) → Reader :

reopen file for reading

skip((Reader)arg1) → None :

Skip the next frame

property times

all times for frames in trajectory

property topology

bond adjacency graph

Writer

class msys.molfile.Writer

Structure or trajectory open for writing

Writers are initialized with a path and either an array of Atoms or an atom count. If the Writer supports structure writing, Atoms must be provided; if the Writer only writes frames, either one will do.

frame(f)

If the writer supports frame writing, appends frame f to the end of the file.

grid(g)

If the writer supports grid writing, writes Grid g to the file, where g is an instance of molfile.Grid, either returned from reader.grid(n) or created from scratch.

close()

Invoked when the Writer goes out of scope, but it’s not a bad idea to invoke it explicitly.

close((Writer)arg1) → None
frame((Writer)arg1, (Frame)arg2) → Writer
sync((Writer)arg1) → None
truncate((Writer)arg1, (float)arg2) → bool

AnnotatedSystem

class msys.AnnotatedSystem(sys, allow_bad_charges=False)[source]

System that has been annotated with additional chemical information

The AnnotatedSystem class provides chemical annotation useful primarily for evaluating smarts patterns. The system is expected to already have have chemical reasonable bond orders and formal charges, and to have no missing atoms (e.g. hydrogens). If these criteria cannot be met, set allow_bad_charges=True in the constructor to bypass these checks; in that case the AnnotatedSystem can still be used to evaluate smarts patterns, but patterns making use of the electronic state of the system (e.g. aromaticity, hybridization, etc.) will not be correct (the system will appear to be entirely aliphatic). You may also use the AssignBondOrderAndFormalCharge function to assign reasonable bond orders and formal charges, assuming there are no missing atoms.

The AnnotatedSystem defines a model for aromaticity. First, the SSSR (smallest set of smallest rings) is determined. Next, rings which share bonds are detected and grouped into ring systems. Rings are initially marked as nonaromatic. If the ring system taken as a whole is deemed to be aromatic, then all rings within it are aromatic as well; otherwise, individual rings are checked for aromaticity. Rings are checked in this fashion until no new rings are found to be aromatic.

A ring system is deemed to be aromatic if it satisfies Huckel’s 4N+2 rule for the number of electrons in the ring(s). An internal double bond (i.e. a bond between two atoms in the ring) adds 2 to the electron count. An external double bond (a bond between a ring atom and an atom not in that ring) adds 1 to the electron count. An external double bond between a carbon and a nonaromatic carbon makes the ring unconditionally nonaromtic. An atom with a lone pair and no double bonds adds 2 to the electron count.

aromatic(atom_or_bond)[source]

Is atom or bond aromatic

degree(atom)[source]

Number of (non-pseudo) bonds

property errors

List of errors found during system analysis if allow_bad_charges=True

hcount(atom)[source]

Number of bonded hydrogens

hybridization(atom)[source]

Atom hybridization – 1=sp, 2=sp2, 3=sp3, 4=sp3d, etc.

Equal to 0 for hydrogen and atoms with no bonds, otherwise max(1, a.degree() + (a.lone_electrons+1)/2 - 1).

loneelectrons(atom)[source]

Number of lone electrons

ringbondcount(atom)[source]

Number of ring bonds

valence(atom)[source]

Sum of bond orders of all (non-pseudo) bonds

SmartsPattern

class msys.SmartsPattern(pattern)[source]

A class representing a compiled SMARTS pattern

The Msys smarts implementation is similar to that of Daylight smarts <http://www.daylight.com/dayhtml/doc/theory/theory.smarts.html>, with support for arbitrarily nested recursive smarts. A few features are not currently supported; warnings will be generated when these constructs are used in a smarts pattern.

  • Directional bonds; e.g. \ and /; these are treated as single bonds (i.e. as a - character).

  • Chiral specification (@, @@, etc); ignored.

  • Implicit hydrogen (h): treated as explicit H.

  • Explicit degree (D): treated as bond count X.

  • Isotopes: ([12C]): ignored.

  • Atom class ([C:6]): ignored.

On the other hand, Msys does support hybridization using the ^ token, as in OpenBabel:

[c^2]       select sp2 aromatic carbon
findMatches(annotated_system, atoms=None)[source]

Return list of lists representing ids of matches of this pattern in this system, optionally requiring that the first atom match belongs to the given set of atoms. An AnnotatedSystem must be used here, which can be constructed from a System after calling AssignBondOrderAndFormalCharge.

match(annotated_system)[source]

Return True if a match is found anywhere; False otherwise.

This is much faster than checking for an empty result from findMatches.

property natoms

Number of atoms in the compiled smarts pattern

property pattern

The pattern used to initialize the object

property warnings

Warnings, if any, emitted during compilation