properties : Injection of properties in a tree’s node

class idpflex.properties.Asphericity(*args, **kwargs)[source]

Bases: idpflex.properties.ScalarProperty, idpflex.properties.AsphericityMixin

Implementation of a node property to store the asphericity from the gyration radius tensor

\(\frac{(L_1-L_2)^2+(L_1-L_3)^2+(L_2-L_3)^2}{2(L_1+L_2+L_3)^2}\)

where \(L_i\) are the eigenvalues of the gyration tensor. Units are same as units of a_universe.

Reference: https://pubs.acs.org/doi/pdf/10.1021/ja206839u

Does not apply periodic boundary conditions

See ScalarProperty for initialization
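As a worked illustration of the asphericity ratio (sum of squared pairwise eigenvalue differences over twice the squared eigenvalue sum), the sketch below evaluates it for two limiting shapes. The function name asphericity_from_eigenvalues is hypothetical and is not part of idpflex; the eigenvalues are supplied by hand rather than computed from a structure.

```python
def asphericity_from_eigenvalues(l1, l2, l3):
    """Asphericity from the three eigenvalues of the gyration tensor."""
    # Sum of squared pairwise differences between the eigenvalues
    numerator = (l1 - l2) ** 2 + (l1 - l3) ** 2 + (l2 - l3) ** 2
    # Twice the squared sum of the eigenvalues
    denominator = 2.0 * (l1 + l2 + l3) ** 2
    return numerator / denominator


# Isotropic distribution: all eigenvalues equal
sphere = asphericity_from_eigenvalues(1.0, 1.0, 1.0)
# Rod-like distribution: a single nonzero eigenvalue
rod = asphericity_from_eigenvalues(1.0, 0.0, 0.0)
```

With these inputs the ratio ranges from 0.0 (sphere) to 1.0 (rod).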

property asphericity

Property to read and set the asphericity

default_name = 'asphericity'
class idpflex.properties.AsphericityMixin[source]

Bases: object

Mixin class providing a set of methods to calculate the asphericity from the gyration radius tensor

from_pdb(filename, selection=None)[source]

Calculate asphericity from a PDB file

\(\frac{(L_1-L_2)^2+(L_1-L_3)^2+(L_2-L_3)^2}{2(L_1+L_2+L_3)^2}\)

where \(L_i\) are the eigenvalues of the gyration tensor. Units are same as units of a_universe.

Does not apply periodic boundary conditions

Parameters
  • filename (str) – path to the PDB file

  • selection (str) – Atomic selection. All atoms are considered if None is passed. See the selections page for atom selection syntax.

Returns

self – Instantiated Asphericity object

Return type

Asphericity

from_universe(a_universe, selection=None, index=0)[source]

Calculate asphericity from an MDAnalysis universe instance

\(\frac{(L_1-L_2)^2+(L_1-L_3)^2+(L_2-L_3)^2}{2(L_1+L_2+L_3)^2}\)

where \(L_i\) are the eigenvalues of the gyration tensor. Units are same as units of a_universe.

Does not apply periodic boundary conditions

Parameters
  • a_universe (Universe) – Trajectory or single-conformation instance

  • selection (str) – Atomic selection. All atoms considered if None is passed. See the selections page for atom selection syntax.

Returns

self – Instantiated Asphericity object

Return type

Asphericity

class idpflex.properties.EndToEnd(*args, **kwargs)[source]

Bases: idpflex.properties.ScalarProperty, idpflex.properties.EndToEndMixin

Implementation of a node property to store the end-to-end distance

See ScalarProperty for initialization

default_name = 'end_to_end'
property end_to_end

Property to read and set the end-to-end distance

class idpflex.properties.EndToEndMixin[source]

Bases: object

Mixin class providing a set of methods to load and calculate the end-to-end distance for a protein

from_pdb(filename, selection='name CA')[source]

Calculate end-to-end distance from a PDB file

Does not apply periodic boundary conditions

Parameters
  • filename (str) – path to the PDB file

  • selection (str) – Atomic selection. The first and last atoms of the selection are considered for the calculation of the end-to-end distance. See the selections page for atom selection syntax.

Returns

self – Instantiated EndToEnd object

Return type

EndToEnd

from_universe(a_universe, selection='name CA', index=0)[source]

Calculate end-to-end distance from an MDAnalysis Universe instance

Does not apply periodic boundary conditions

Parameters
  • a_universe (Universe) – Trajectory or single-conformation instance

  • selection (str) – Atomic selection. The first and last atoms of the selection are considered for the calculation of the end-to-end distance. See the selections page for atom selection syntax.

Returns

self – Instantiated EndToEnd object

Return type

EndToEnd
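The selection rule described above (first and last atoms of the selection) can be sketched without MDAnalysis as a plain distance between two points. The helper end_to_end_distance and the coordinates are illustrative assumptions, not idpflex code.

```python
import math


def end_to_end_distance(coords):
    """Distance between the first and last points of an ordered selection."""
    return math.dist(coords[0], coords[-1])


# Hypothetical CA coordinates, ordered from the first to the last residue
ca_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.0, 4.0, 0.0)]
distance = end_to_end_distance(ca_coords)
```

Intermediate atoms play no role, and no periodic images are considered, in line with the note that periodic boundary conditions are not applied.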

class idpflex.properties.ProfileProperty(name=None, qvalues=None, profile=None, errors=None)[source]

Bases: object

Implementation of a node property valid for SANS or X-Ray data.

Parameters
  • name (str) – Property name.

  • qvalues (ndarray) – Momentum transfer domain

  • profile (ndarray) – Intensity values

  • errors (ndarray) – Errors in the intensity values

default_name = 'profile'
property e

property errors : (ndarray) intensity errors

property feature_vector

Each qvalue is interpreted as an independent feature, and the related value in profile is a particular “measured” value of that feature.

Returns

Return type

numpy.ndarray

property feature_weights

Weights to be used when calculating the square of the euclidean distance between two feature vectors

Returns

Return type

numpy.ndarray
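A minimal sketch of how a feature vector and its weights might enter a squared Euclidean distance. It assumes each weight multiplies the corresponding squared difference; the documentation does not state the exact convention used by idpflex, and weighted_squared_distance is a hypothetical helper.

```python
def weighted_squared_distance(a, b, weights):
    """Sum of weighted squared differences between two feature vectors."""
    # Assumption: one weight per feature, applied to the squared difference
    return sum(w * (ai - bi) ** 2 for ai, bi, w in zip(a, b, weights))


# Two hypothetical intensity profiles sampled at the same three q-values
profile_a = [10.0, 5.0, 1.0]
profile_b = [9.0, 5.0, 2.0]
weights = [1.0, 1.0, 0.5]
d2 = weighted_squared_distance(profile_a, profile_b, weights)
```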

property name

property name : (str) name of the profile

property x

property qvalues : (ndarray) momentum transfer values

property y

property profile : (ndarray) profile intensities

class idpflex.properties.PropertyDict(properties=None)[source]

Bases: object

A container of properties mimicking some of the behavior of a standard python dictionary, plus methods representing features of the properties when taken as a group.

Parameters

properties (list) – A list of properties to include

feature_vector(names=None)[source]

Feature vector for the specified sequence of names.

The feature vector is a concatenation of the feature vectors for each of the properties and the concatenation follows the order of names.

If names is None, return all features in the property dict in the order of insertion.

Parameters

names (list) – List of property names

Returns

Return type

numpy.ndarray
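The concatenation rule can be pictured with a plain dictionary standing in for the container. This is a sketch under that assumption: concatenate_features is a hypothetical function, and the values are plain lists rather than property objects.

```python
def concatenate_features(properties, names=None):
    """Concatenate per-property feature vectors in the order given by names."""
    if names is None:
        # Dictionaries preserve insertion order
        names = list(properties)
    vector = []
    for name in names:
        vector.extend(properties[name])
    return vector


# Hypothetical per-property feature vectors
props = {"rg": [12.5], "saxs": [10.0, 5.0, 1.0]}
all_features = concatenate_features(props)
reordered = concatenate_features(props, names=["saxs", "rg"])
```

Passing names changes only the order of the blocks, not their contents.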

feature_weights(names=None)[source]

Feature vector weights for the specified sequence of names.

The feature vector weights is a concatenation of the feature vectors weights for each of the properties and the concatenation follows the order of names.

If names is None, return all features in the property dict in the order of insertion.

Parameters

names (list) – List of property names

Returns

Return type

numpy.ndarray

get(name, default=None)[source]

Mimic get method of a dictionary

Parameters
  • name (str) – name of the property

  • default (object) – default value if name is not one of the properties stored

Returns

Return type

Property or default object

items()[source]

Mimic items method of a dictionary

Returns

Return type

dict_items of _properties

keys()[source]

Mimic keys method of a dictionary

Returns

Return type

dict_keys of _properties

values()[source]

Mimic values method of a dictionary

Returns

Return type

dict_values of _properties

class idpflex.properties.RadiusOfGyration(*args, **kwargs)[source]

Bases: idpflex.properties.ScalarProperty, idpflex.properties.RadiusOfGyrationMixin

Implementation of a node property to store the radius of gyration.

See ScalarProperty for initialization

default_name = 'rg'
property rg

Property to read and write the radius of gyration value

class idpflex.properties.RadiusOfGyrationMixin[source]

Bases: object

Mixin class providing a set of methods to load the Radius of Gyration data into a Scalar property

from_pdb(filename, selection=None)[source]

Calculate Rg from a PDB file

Parameters
  • filename (str) – path to the PDB file

  • selection (str) – Atomic selection for calculating Rg. All atoms considered if default None is passed. See the selections page for atom selection syntax.

Returns

self – Instantiated RadiusOfGyration property object

Return type

RadiusOfGyration

from_universe(a_universe, selection=None, index=0)[source]

Calculate radius of gyration from an MDAnalysis Universe instance

Parameters
  • a_universe (Universe) – Trajectory, or single-conformation instance.

  • selection (str) – Atomic selection. All atoms considered if None is passed. See the selections page for atom selection syntax.

Returns

self – Instantiated RadiusOfGyration object

Return type

RadiusOfGyration
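For orientation, a radius of gyration can be computed from coordinates alone as the root-mean-square distance to the centroid. The sketch below assumes equal weights for all atoms; the value returned through MDAnalysis may be mass-weighted, so treat this as an approximation of the idea rather than the library's calculation.

```python
import math


def radius_of_gyration(coords):
    """Unweighted radius of gyration of a set of 3D points."""
    n = len(coords)
    # Geometric center of the points
    center = [sum(point[k] for point in coords) / n for k in range(3)]
    # Mean squared distance to the center
    mean_sq = sum(
        sum((point[k] - center[k]) ** 2 for k in range(3)) for point in coords
    ) / n
    return math.sqrt(mean_sq)


# Four hypothetical atoms, each at distance 1 from the origin
coords = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, -1.0, 0.0)]
rg = radius_of_gyration(coords)
```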

class idpflex.properties.ResidueContactMap(name=None, selection=None, cmap=None, errors=None, cutoff=None)[source]

Bases: object

Contact map between residues of the conformation using different definitions of contact.

Parameters
  • name (str) – Name of the contact map

  • selection (AtomGroup) – Atomic selection for calculation of the contact map, which is then projected to a residue based map. See the selections page for atom selection syntax.

  • cmap (ndarray) – Contact map between residues of the atomic selection

  • errors (ndarray) – Uncertainties for every contact of cmap

  • cutoff (float) – Cut-off distance defining a contact between two atoms
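One possible definition of contact, sketched under stated assumptions: two residues are in contact when any pair of their atoms lies within the cutoff distance. The function residue_contact_map and its input layout are hypothetical, and idpflex may use a different definition or data structure.

```python
import math


def residue_contact_map(atoms, cutoff):
    """Project atom-atom contacts onto a symmetric residue-residue 0/1 matrix.

    atoms is a list of (residue_index, (x, y, z)) pairs.
    """
    n_res = max(res for res, _ in atoms) + 1
    cmap = [[0] * n_res for _ in range(n_res)]
    for i, (res_i, xyz_i) in enumerate(atoms):
        for res_j, xyz_j in atoms[i + 1:]:
            # Atoms of the same residue do not define a contact
            if res_i != res_j and math.dist(xyz_i, xyz_j) <= cutoff:
                cmap[res_i][res_j] = cmap[res_j][res_i] = 1
    return cmap


# Hypothetical atoms: two in residue 0, one in residue 1, one in residue 2
atoms = [
    (0, (0.0, 0.0, 0.0)),
    (0, (1.0, 0.0, 0.0)),
    (1, (4.0, 0.0, 0.0)),
    (2, (20.0, 0.0, 0.0)),
]
cmap = residue_contact_map(atoms, cutoff=3.5)
```

Residues 0 and 1 are in contact through the atom pair separated by 3.0, while residue 2 is isolated.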

default_name = 'cm'
property e

property errors : (ndarray) undeterminacies in the contact map

from_pdb(filename, cutoff, selection=None)[source]

Calculate residue contact map from a PDB file

Parameters
  • filename (str) – Path to the file in PDB format

  • cutoff (float) – Cut-off distance defining a contact between two atoms

  • selection (str) – Atomic selection for calculating interatomic contacts. All atoms are used if None is passed. See the selections page for atom selection syntax.

Returns

self – Instantiated ResidueContactMap object

Return type

ResidueContactMap

from_universe(a_universe, cutoff, selection=None, index=0)[source]

Calculate residue contact map from an MDAnalysis Universe instance

Parameters
  • a_universe (Universe) – Trajectory or single-conformation instance

  • cutoff (float) – Cut-off distance defining a contact between two atoms

  • selection (str) – Atomic selection for calculating interatomic contacts. All atoms are used if None is passed. See the selections page for atom selection syntax.

Returns

self – Instantiated ResidueContactMap object

Return type

ResidueContactMap

property name

property name : (str) name of the contact map

plot()[source]

Plot the residue contact map of the node

property x

property selection : (AtomGroup) atom selection

property y

property cmap : (ndarray) contact map between residues

class idpflex.properties.SaSa(*args, **kwargs)[source]

Bases: idpflex.properties.ScalarProperty, idpflex.properties.SaSaMixin

Implementation of a node property to calculate the Solvent Accessible Surface Area.

See ScalarProperty for initialization

default_name = 'sasa'
property sasa

Property to read and write the SASA value

class idpflex.properties.SaSaMixin[source]

Bases: object

Mixin class providing a set of methods to load and calculate the solvent accessible surface area

from_mdtraj(a_traj, probe_radius=1.4, **kwargs)[source]

Calculate solvent accessible surface for frames in a trajectory

SASA units are Angstroms squared

Parameters
  • a_traj (Trajectory) – mdtraj trajectory instance

  • probe_radius (float) – The radius of the probe, in Angstroms

  • kwargs (dict) – Optional arguments for the underlying mdtraj.shrake_rupley algorithm doing the actual SaSa calculation

Returns

self – Instantiated SaSa property object

Return type

SaSa

from_pdb(filename, selection=None, probe_radius=1.4, **kwargs)[source]

Calculate solvent accessible surface area (SASA) from a PDB file

If the PDB contains more than one structure, calculation is performed only for the first one.

SASA units are Angstroms squared

Parameters
  • filename (str) – Path to the PDB file

  • selection (str) – Atomic selection for calculating SASA. All atoms considered if default None is passed. See the selections page (https://www.mdanalysis.org/docs/documentation_pages/selections.html) for atom selection syntax.

  • probe_radius (float) – The radius of the probe, in Angstroms

  • kwargs (dict) – Optional arguments for the underlying mdtraj.shrake_rupley algorithm doing the actual SaSa calculation

Returns

self – Instantiated SaSa property object

Return type

SaSa

from_universe(a_universe, selection=None, probe_radius=1.4, index=0, **kwargs)[source]

Calculate solvent accessible surface area (SASA) from an MDAnalysis universe instance.

This method is a thin wrapper around method from_pdb()

Parameters
  • a_universe (Universe) – Trajectory or single-conformation instance

  • selection (str) – Atomic selection for calculating SASA. All atoms considered if default None is passed. See the selections page (https://www.mdanalysis.org/docs/documentation_pages/selections.html) for atom selection syntax.

  • probe_radius (float) – The radius of the probe, in Angstroms

  • kwargs (dict) – Optional arguments for underlying mdtraj.shrake_rupley doing the actual SASA calculation.

Returns

self – Instantiated SaSa property object

Return type

SaSa

class idpflex.properties.SansLoaderMixin[source]

Bases: object

Mixin class providing a set of methods to load SANS data into a profile property

from_ascii(file_name)[source]

Load profile from an ascii file.

Expected file format:
Rows have three items separated by a blank space:
- col1 momentum transfer
- col2 profile
- col3 errors of the profile
Parameters

file_name (str) – File path

Returns

self

Return type

SansProperty
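The three-column layout described above can be read with the standard library alone. This sketch uses a hypothetical reader, read_three_columns, and a temporary file with made-up values; it does not reproduce how idpflex parses the file.

```python
import os
import tempfile


def read_three_columns(file_name):
    """Read momentum transfer, profile, and errors from a three-column file."""
    qvalues, profile, errors = [], [], []
    with open(file_name) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) != 3:
                continue  # skip blank or malformed rows
            qvalues.append(float(fields[0]))
            profile.append(float(fields[1]))
            errors.append(float(fields[2]))
    return qvalues, profile, errors


# Write a small example file, then read it back
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "profile.dat")
    with open(path, "w") as handle:
        handle.write("0.01 100.0 2.0\n0.02 80.0 1.5\n")
    qvalues, profile, errors = read_three_columns(path)
```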

from_cryson_fit(file_name)[source]

Load profile from a cryson *.fit file.

Parameters

file_name (str) – File path

Returns

self

Return type

SansProperty

from_cryson_int(file_name)[source]

Load profile from a cryson *.int file

Parameters

file_name (str) – File path

Returns

self

Return type

SansProperty

from_cryson_pdb(file_name, command='cryson', args='-lm 20 -sm 0.6 -ns 500 -un 1 -eh -dro 0.075', silent=True)[source]

Calculate profile with cryson from a PDB file

Parameters
  • file_name (str) – Path to PDB file

  • command (str) – Command to invoke cryson

  • args (str) – Arguments to pass to cryson

  • silent (bool) – Suppress cryson standard output and standard error

Returns

self

Return type

SansProperty

from_sassena(handle, profile_key='fqt', index=0)[source]

Load SANS profile from sassena output.

It is assumed that Q-values are stored under item qvalues and listed under the X column.

Parameters
  • handle (h5py.File) – h5py reading handle to HDF5 file

  • profile_key (str) – item key where profiles are stored in the HDF5 file

  • index (int) – profile index, if data contains more than one profile

Returns

self

Return type

SansProperty

to_ascii(file_name)[source]

Save profile as a three-column ascii file.

Rows have three items separated by a blank space
- col1 momentum transfer
- col2 profile
- col3 errors of the profile
class idpflex.properties.SansProperty(*args, **kwargs)[source]

Bases: idpflex.properties.ProfileProperty, idpflex.properties.SansLoaderMixin

Implementation of a node property for SANS data

default_name = 'sans'
class idpflex.properties.SaxsLoaderMixin[source]

Bases: object

Mixin class providing a set of methods to load X-ray data into a profile property

from_ascii(file_name)[source]

Load profile from an ascii file.

Expected file format:
Rows have three items separated by a blank space:
- col1 momentum transfer
- col2 profile
- col3 errors of the profile
Parameters

file_name (str) – File path

Returns

self

Return type

SaxsProperty

from_crysol_fit(file_name)[source]

Load profile from a crysol *.fit file.

Parameters

file_name (str) – File path

Returns

self

Return type

SaxsProperty

from_crysol_int(file_name)[source]

Load profile from a crysol *.int file

Parameters

file_name (str) – File path

Returns

self

Return type

SaxsProperty

from_crysol_pdb(file_name, command='crysol', args='-lm 20 -sm 0.6 -ns 500 -un 1 -eh -dro 0.075', silent=True)[source]

Calculate profile with crysol from a PDB file

Parameters
  • file_name (str) – Path to PDB file

  • command (str) – Command to invoke crysol

  • args (str) – Arguments to pass to crysol

  • silent (bool) – Suppress crysol standard output and standard error

Returns

self

Return type

SaxsProperty

to_ascii(file_name)[source]

Save profile as a three-column ascii file.

Rows have three items separated by a blank space
- col1 momentum transfer
- col2 profile
- col3 errors of the profile
class idpflex.properties.SaxsProperty(*args, **kwargs)[source]

Bases: idpflex.properties.ProfileProperty, idpflex.properties.SaxsLoaderMixin

Implementation of a node property for SAXS data

default_name = 'saxs'
class idpflex.properties.ScalarProperty(name=None, x=0.0, y=0.0, e=0.0)[source]

Bases: object

Implementation of a node property for a number plus an error.

Instances have name, x, y, and e attributes, so they will follow the property node protocol.

Parameters
  • name (str) – Name associated to this type of property

  • x (float) – Domain of the property

  • y (float) – value of the property

  • e (float) – error of the property’s value

property feature_vector
property feature_weights
histogram(bins=10, errors=False, **kwargs)[source]

Histogram of values for the leaf nodes

Parameters
  • bins (int) – number of histogram bins

  • errors (bool) – estimate error from histogram counts

  • kwargs (dict) – Additional arguments to underlying histogram()

Returns

  • ndarray – histogram bin edges

  • ndarray – histogram values

  • ndarray – Errors for histogram counts, if errors=True. Otherwise None.
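The three return values can be pictured with a small pure-Python histogram. Two assumptions are made here and are not confirmed by the documentation: bins have equal width between the minimum and maximum value, and errors are the square root of the counts. The name histogram_with_errors is hypothetical.

```python
import math


def histogram_with_errors(values, bins=10, errors=False):
    """Return bin edges, counts, and (optionally) sqrt-of-count errors."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins  # assumes values are not all identical
    edges = [lo + i * width for i in range(bins + 1)]
    counts = [0] * bins
    for value in values:
        # A value on the top edge is placed in the last bin
        index = min(int((value - lo) / width), bins - 1)
        counts[index] += 1
    # Assumption: Poisson-like error estimate from the counts
    errs = [math.sqrt(count) for count in counts] if errors else None
    return edges, counts, errs


# Hypothetical scalar values gathered from leaf nodes
leaf_values = [1.0, 1.2, 2.0, 3.0, 3.0, 3.0, 5.0]
edges, counts, errs = histogram_with_errors(leaf_values, bins=2, errors=True)
```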

plot(kind='histogram', errors=False, **kwargs)[source]
Parameters
  • kind (str) – ‘histogram’: Gather Rg for the leafs under the node associated to this property, then make a histogram.

  • errors (bool) – Estimate error from histogram counts

  • kwargs (dict) – Additional arguments to underlying hist()

Returns

Axes object holding the plot

Return type

Axes

set_scalar(y)[source]
class idpflex.properties.SecondaryStructureProperty(name=None, aa=None, profile=None, errors=None)[source]

Bases: object

Node property for secondary structure determined by DSSP

Every residue is assigned a vector of length 8. Indexes correspond to the different secondary structure assignments:

=====  =========  =======  ============================
Index  DSSP code  Color    Structure
=====  =========  =======  ============================
0      H          yellow   Alpha helix (4-12)
1      B          pink     Isolated beta-bridge residue
2      E          red      Strand
3      G          orange   3-10 helix
4      I          green    Pi helix
5      T          magenta  Turn
6      S          cyan     Bend
7      (space)    white    Unstructured (coil)
=====  =========  =======  ============================

We follow here Bio.PDB.DSSP ordering

For a leaf node (single structure), the vector for any given residue will be all zeroes except a value of one for the corresponding assigned secondary structure. For all other nodes, the vector will correspond to a probability distribution among the different DSSP codes.

Parameters
  • name (str) – Property name

  • aa (str) – One-letter amino acid sequence encoded in a single string

  • profile (ndarray) – N x 8 matrix with N number of residues and 8 types of secondary structure

  • errors (ndarray) – N x 8 matrix denoting undeterminacies for each type of assigned secondary residue in every residue

classmethod code2profile(code)[source]

Generate a secondary structure profile vector for a particular DSSP code

Parameters

code (str) – one-letter code denoting secondary structure assignment

Returns

profile vector

Return type

ndarray
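The profile vector for a single code is a one-hot vector indexed by the position of the code in the string 'HBEGITS '. A minimal sketch, with hypothetical function names and plain lists in place of ndarray:

```python
DSSP_CODES = "HBEGITS "  # the last code is a blank space (unstructured)


def code_to_profile(code):
    """One-hot vector of length 8 for a single DSSP code."""
    profile = [0.0] * len(DSSP_CODES)
    profile[DSSP_CODES.index(code)] = 1.0
    return profile


def sequence_to_profiles(codes):
    """One row per residue for a string of DSSP codes."""
    return [code_to_profile(code) for code in codes]


helix = code_to_profile("H")
profiles = sequence_to_profiles("HE ")
```

A string of N codes gives an N x 8 matrix with exactly one nonzero entry per row, which is the leaf-node case described above.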

property collapsed

For every residue, collapse the secondary structure profile onto the component with the highest probability

Returns

List of indexes corresponding to collapsed secondary structure states

Return type

ndarray

colors = ('yellow', 'pink', 'red', 'orange', 'green', 'magenta', 'cyan', 'white')

colors associated with each element of secondary structure

default_name = 'ss'
disparity(other)[source]

Secondary Structure disparity of other profile to self, akin to \(\chi^2\)

\(\frac{1}{N(n-1)} \sum_{i=1}^{N}\sum_{j=1}^{n} \left(\frac{p_{ij}-q_{ij}}{e}\right)^2\)

with \(N\) number of residues and \(n\) number of DSSP codes. Errors \(e\) are those of self, and are set to one if they have not been initialized. We divide by \(n-1\) because a normalized distribution of secondary structure elements is implied for each residue.

Parameters

other (SecondaryStructureProperty) – Secondary structure property to compare to

Returns

disparity measure

Return type

float
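A worked sketch of the disparity sum for nested lists. The function name disparity_sketch is hypothetical, and the sketch assumes that errors, when given, are supplied per residue and per code; when they are omitted they default to one, as stated above.

```python
def disparity_sketch(p, q, e=None):
    """Chi-square-like disparity between two N x n secondary structure profiles."""
    n_res = len(p)
    n_codes = len(p[0])
    total = 0.0
    for i in range(n_res):
        for j in range(n_codes):
            # Errors default to one when not initialized
            err = 1.0 if e is None else e[i][j]
            total += ((p[i][j] - q[i][j]) / err) ** 2
    return total / (n_res * (n_codes - 1))


# One residue assigned as helix (index 0) versus strand (index 2)
p = [[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]
q = [[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]]
d = disparity_sketch(p, q)
```

Two components differ by one each, so the sum is 2 and the result is 2/7 for a single residue with eight codes.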

dssp_codes = 'HBEGITS '

list of single-letter codes for secondary structure. Last code is a blank space denoting no secondary structure (Unstructured)

property e

property errors : (ndarray) assignment undeterminacy

elements = {' ': 'Unstructured', 'B': 'Isolated beta-bridge', 'E': 'Strand', 'G': '3-10 helix', 'H': 'Alpha helix', 'I': 'Pi helix', 'S': 'Bend', 'T': 'Turn'}

Description of single-letter codes for secondary structure

property fractions

Output fraction of each element of secondary structure.

Fractions are computed summing over all residues.

Returns

Elements of the form {single-letter-code: fraction}

Return type

dict
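A sketch of how such a dictionary could be built from an N x 8 profile. It assumes the per-code sums are normalized by the number of residues; the documentation only says that fractions are computed by summing over all residues, and secondary_structure_fractions is a hypothetical name.

```python
def secondary_structure_fractions(profile, codes="HBEGITS "):
    """Map each DSSP code to its fraction over all residues."""
    n_res = len(profile)
    return {
        code: sum(row[j] for row in profile) / n_res
        for j, code in enumerate(codes)
    }


# Four hypothetical residues: helix, helix, strand, unstructured
profile = [
    [1, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1],
]
result = secondary_structure_fractions(profile)
```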

from_dssp(file_name)[source]

Load secondary structure profile from a dssp file

Parameters

file_name (str) – File path

Returns

self

Return type

SecondaryStructureProperty

from_dssp_pdb(file_name, command='mkdssp', silent=True)[source]

Calculate secondary structure with DSSP

Parameters
  • file_name (str) – Path to PDB file

  • command (str) – Command to invoke DSSP. DSSP must be installed on your machine

  • silent (bool) – Suppress DSSP standard output and error

Returns

self

Return type

SecondaryStructureProperty

from_dssp_sequence(codes)[source]

Load secondary structure profile from a single string of DSSP codes

Attributes aa and errors are not modified, only profile.

Parameters

codes (str) – Sequence of one-letter DSSP codes

Returns

self

Return type

SecondaryStructureProperty

n_codes = 8

number of distinctive elements of secondary structure

property name

property name : (str) name of the profile

plot(kind='percents')[source]

Plot the secondary structure of the node holding the property

Parameters

kind (str) – ‘percents’: bar chart with each bar denoting the percent of a particular secondary structure in all the protein; — ‘node’: gray plot of secondary structure element probabilities for each residue; — ‘leafs’: color plot of secondary structure for each leaf under the node. Leafs are sorted by increasing disparity to the secondary structure of the node.

property x

property aa : (str) amino-acid sequence

property y

property profile : (ndarray) secondary structure assignment

idpflex.properties.decorate_as_node_property(nxye)[source]

Decorator that endows a class with the node property protocol

For details, see register_as_node_property()

Parameters

nxye (list) – list of (name, description) pairs denoting the property components

idpflex.properties.propagator_size_weighted_sum(values, tree, *, weights=<function weights_by_size>)
Calculate a property of the node as the sum of its siblings’ property values, weighted by the relative cluster sizes of the siblings.

Parameters
  • values (list) – List of property values (of same type), one item for each leaf node.

  • tree (Tree) – Tree of ClusterNodeX nodes

idpflex.properties.propagator_weighted_sum(values, tree, weights=<function <lambda>>)[source]

Calculate the property of a node as the sum of its two siblings’ property values. Propagation applies only to non-leaf nodes.

Parameters
  • values (list) – List of property values (of same type), one item for each leaf node.

  • tree (Tree) – Tree of ClusterNodeX nodes

  • weights (tuple) – Callable of two arguments (left-node and right-node) returning a tuple of left and right weights. Default callable returns (1.0, 1.0) always.

idpflex.properties.register_as_node_property(cls, nxye)[source]

Endows a class with the node property protocol.

The node property assumes the existence of these attributes
- name name of the property
- x property domain
- y property values
- e errors of the property values

This function will endow class cls with these attributes, implemented through the python property pattern. Names for the corresponding storage attributes must be supplied when registering the class.

Parameters
  • cls (class type) – The class type

  • nxye (tuple (len==4)) – nxye is a four element tuple. Its elements are in this order:

    (property name, ‘stores the name of the property’), (domain_storage_attribute_name, description of the domain), (values_storage_attribute_name, description of the values), (errors_storage_attribute_name, description of the errors)

    Example:

    ((‘name’, ‘stores the name of the property’), (‘qvalues’, ‘momentum transfer values’), (‘profile’, ‘profile intensities’), (‘errors’, ‘intensity errors’))
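The python property pattern mentioned above can be sketched as follows. This is an illustration under assumptions, not the idpflex source: register_sketch, make_property, and the Profile class are hypothetical names, and values are stored in the instance dictionary under the storage attribute name.

```python
def make_property(attr_name, description):
    """Build a property that reads and writes the instance's storage attribute."""
    def getter(instance):
        return vars(instance)[attr_name]

    def setter(instance, value):
        vars(instance)[attr_name] = value

    return property(getter, setter, doc=f"property {attr_name} : {description}")


def register_sketch(cls, nxye):
    """Attach name, x, y, and e properties to cls, aliasing the storage attributes."""
    for alias, (attr_name, description) in zip(("name", "x", "y", "e"), nxye):
        setattr(cls, alias, make_property(attr_name, description))
    return cls


class Profile:
    def __init__(self, name, qvalues, profile, errors):
        self.name = name
        self.qvalues = qvalues
        self.profile = profile
        self.errors = errors


register_sketch(Profile, (
    ("name", "stores the name of the property"),
    ("qvalues", "momentum transfer values"),
    ("profile", "profile intensities"),
    ("errors", "intensity errors"),
))

saxs = Profile("saxs", [0.1, 0.2], [10.0, 8.0], [1.0, 1.0])
saxs.y = [9.0, 7.0]  # writes through to saxs.profile
```

After registration, x, y, and e are aliases: reading saxs.x returns the stored qvalues, and assigning to saxs.y updates profile.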

idpflex.properties.weights_by_size(left_node, right_node)[source]

Calculate the relative size of two nodes

Parameters
Returns

Weights representing the relative populations of two nodes

Return type

tuple
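A sketch of size-based weights and the weighted sum they feed into. The Node class and its count attribute (number of leaves under the node) are assumptions made for illustration; the actual attribute used by ClusterNodeX nodes is not given here.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for a tree node."""
    count: int  # assumed: number of leaves under the node


def relative_sizes(left_node, right_node):
    """Weights proportional to the populations of the two nodes."""
    total = left_node.count + right_node.count
    return left_node.count / total, right_node.count / total


def weighted_sum(left_value, right_value, weights):
    """Combine two property values with the given pair of weights."""
    w_left, w_right = weights
    return w_left * left_value + w_right * right_value


left, right = Node(count=3), Node(count=1)
weights = relative_sizes(left, right)
parent_value = weighted_sum(10.0, 20.0, weights)
```

With weights of (1.0, 1.0) the same function gives a plain sum, which corresponds to the default described for propagator_weighted_sum.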