Decorators¶

class menten_gcn.decorators.BareBonesDecorator[source]¶

This decorator is included in all DataMakers by default. Its goal is to be the starting point upon which everything else is built. It labels focus nodes and labels edges for residues that are polymer bonded to one another.

1 Node Feature
1 Edge Feature

calc_edge_features(wrapped_pose, resid1, resid2, dict_cache=None)[source]¶

This does all of the business logic of calculating the values to be added for each edge.

This function will never be called in the reverse order (with resid1 and resid2 swapped). Instead, we just create both edges at once.

Parameters

wrapped_pose (WrappedPose) – The pose we are currently generating data for
resid1 (int) – The first residue ID we are currently generating data for
resid2 (int) – The second residue ID we are currently generating data for
dict_cache (dict) – The same cache that was populated in “cache_data”. The user might not have created a cache so don’t assume this is not None. See the RosettaHBondDecorator for an example of how to use this

Returns

features (list) – The length of this list will be the same value as self.n_edge_features(). These are the values to represent this decorator’s contribution to E for the edge going from resid1 -> resid2.
inv_features (list) – The length of this list will be the same value as self.n_edge_features(). These are the values to represent this decorator’s contribution to E for the edge going from resid2 -> resid1.

calc_node_features(wrapped_pose, resid, dict_cache=None)[source]¶

This does all of the business logic of calculating the values to be added for each node.

Parameters

wrapped_pose (WrappedPose) – The pose we are currently generating data for
resid (int) – The residue ID we are currently generating data for
dict_cache (dict) – The same cache that was populated in “cache_data”. The user might not have created a cache so don’t assume this is not None. See the RosettaHBondDecorator for an example of how to use this

Returns

features (list) – The length of this list will be the same value as self.n_node_features(). These are the values to represent this decorator’s contribution to X for this resid.

describe_edge_features()[source]¶

Returns descriptions of how each value is computed. Our goal is for these descriptions to be relatively concise but also have enough detail to fully reproduce these calculations.

Returns: features (list) – The length of this list will be the same value as self.n_edge_features(). These are descriptions of the values to represent this decorator’s contribution to E for any arbitrary resid pair.

describe_node_features()[source]¶

Returns descriptions of how each value is computed. Our goal is for these descriptions to be relatively concise but also have enough detail to fully reproduce these calculations.

Returns: features (list) – The length of this list will be the same value as self.n_node_features(). These are descriptions of the values to represent this decorator’s contribution to X for any arbitrary resid.

get_version_name()[source]¶: Get a unique, versioned name of this decorator for maximal reproducibility

n_edge_features()[source]¶: How many features will this decorator add to edge tensors (E)?

n_node_features()[source]¶: How many features will this decorator add to node tensors (X)?
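The methods documented above make up the interface that each decorator exposes. As a rough sketch of how they fit together, here is a plain-Python stand-in for a decorator that adds one edge feature. It deliberately does not import menten_gcn: `ToyPose`, its `ca_distance` helper, and `ToyContactDecorator` are hypothetical names invented for illustration, and a real decorator would presumably derive from the library's own decorator base class instead.

```python
import math


class ToyPose:
    """Hypothetical stand-in for a WrappedPose: maps resid -> C-alpha xyz."""

    def __init__(self, ca_coords):
        self.ca_coords = ca_coords

    def ca_distance(self, resid1, resid2):
        return math.dist(self.ca_coords[resid1], self.ca_coords[resid2])


class ToyContactDecorator:
    """Duck-typed sketch of the decorator interface (not the library's class)."""

    def __init__(self, cutoff=8.0):
        self.cutoff = cutoff

    def get_version_name(self):
        return "ToyContactDecorator_v0"

    def n_node_features(self):
        return 0

    def n_edge_features(self):
        return 1

    def describe_edge_features(self):
        return ["1.0 if the C-alpha atoms are within %.1f Angstroms, 0.0 otherwise" % self.cutoff]

    def calc_edge_features(self, wrapped_pose, resid1, resid2, dict_cache=None):
        in_contact = wrapped_pose.ca_distance(resid1, resid2) <= self.cutoff
        features = [1.0 if in_contact else 0.0]
        # This feature is symmetric, so both directions get the same values.
        return features, list(features)


pose = ToyPose({1: (0.0, 0.0, 0.0), 2: (3.8, 0.0, 0.0), 3: (20.0, 0.0, 0.0)})
dec = ToyContactDecorator()
near, near_inv = dec.calc_edge_features(pose, 1, 2)
far, far_inv = dec.calc_edge_features(pose, 1, 3)
```

Both returned lists have length `n_edge_features()`, and `describe_edge_features()` returns one description per value, which is the contract the method documentation above describes.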

Geometry¶

class menten_gcn.decorators.CACA_dist(use_nm: bool = False)[source]¶

Measures distance between the two C-Alpha atoms of each residue

0 Node Features
1 Edge Feature

Parameters: use_nm (bool) – If False (default), measure distance in Angstroms. If True, use nanometers.

calc_edge_features(wrapped_pose, resid1: int, resid2: int, dict_cache=None)[source]¶

This does all of the business logic of calculating the values to be added for each edge.

This function will never be called in the reverse order (with resid1 and resid2 swapped). Instead, we just create both edges at once.

Parameters

wrapped_pose (WrappedPose) – The pose we are currently generating data for
resid1 (int) – The first residue ID we are currently generating data for
resid2 (int) – The second residue ID we are currently generating data for
dict_cache (dict) – The same cache that was populated in “cache_data”. The user might not have created a cache so don’t assume this is not None. See the RosettaHBondDecorator for an example of how to use this

Returns

features (list) – The length of this list will be the same value as self.n_edge_features(). These are the values to represent this decorator’s contribution to E for the edge going from resid1 -> resid2.
inv_features (list) – The length of this list will be the same value as self.n_edge_features(). These are the values to represent this decorator’s contribution to E for the edge going from resid2 -> resid1.

describe_edge_features()[source]¶

Returns descriptions of how each value is computed. Our goal is for these descriptions to be relatively concise but also have enough detail to fully reproduce these calculations.

Returns: features (list) – The length of this list will be the same value as self.n_edge_features(). These are descriptions of the values to represent this decorator’s contribution to E for any arbitrary resid pair.

get_version_name()[source]¶: Get a unique, versioned name of this decorator for maximal reproducibility

n_edge_features()[source]¶: How many features will this decorator add to edge tensors (E)?
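To make the unit switch concrete, here is a minimal sketch of a C-alpha distance calculation. It assumes, as the parameter name suggests, that `use_nm=True` reports nanometers and that input coordinates are in Angstroms. The function name `ca_ca_distance` is hypothetical and is not part of the library.

```python
import math

ANGSTROMS_PER_NM = 10.0


def ca_ca_distance(ca1, ca2, use_nm=False):
    """Distance between two C-alpha positions given in Angstroms."""
    dist_angstroms = math.dist(ca1, ca2)
    return dist_angstroms / ANGSTROMS_PER_NM if use_nm else dist_angstroms


ca_a = (0.0, 0.0, 0.0)
ca_b = (3.0, 4.0, 0.0)
d_angstrom = ca_ca_distance(ca_a, ca_b)
d_nm = ca_ca_distance(ca_a, ca_b, use_nm=True)
```

A distance is the same in both directions, so a decorator built on this would return the same single value for the resid1 -> resid2 and resid2 -> resid1 edges.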

class menten_gcn.decorators.CBCB_dist(use_nm: bool = False)[source]¶

Measures distance between the two C-Beta atoms of each residue. Note: We will calculate the “ideal ALA” CB location even if this residue has a CB atom. This may sound silly but it is intended to prevent noise from different native amino acid types.

0 Node Features
1 Edge Feature

Parameters: use_nm (bool) – If False (default), measure distance in Angstroms. If True, use nanometers.

calc_edge_features(wrapped_pose, resid1: int, resid2: int, dict_cache=None)[source]¶

This does all of the business logic of calculating the values to be added for each edge.

This function will never be called in the reverse order (with resid1 and resid2 swapped). Instead, we just create both edges at once.

Parameters

wrapped_pose (WrappedPose) – The pose we are currently generating data for
resid1 (int) – The first residue ID we are currently generating data for
resid2 (int) – The second residue ID we are currently generating data for
dict_cache (dict) – The same cache that was populated in “cache_data”. The user might not have created a cache so don’t assume this is not None. See the RosettaHBondDecorator for an example of how to use this

Returns

features (list) – The length of this list will be the same value as self.n_edge_features(). These are the values to represent this decorator’s contribution to E for the edge going from resid1 -> resid2.
inv_features (list) – The length of this list will be the same value as self.n_edge_features(). These are the values to represent this decorator’s contribution to E for the edge going from resid2 -> resid1.

describe_edge_features()[source]¶

Returns descriptions of how each value is computed. Our goal is for these descriptions to be relatively concise but also have enough detail to fully reproduce these calculations.

Returns: features (list) – The length of this list will be the same value as self.n_edge_features(). These are descriptions of the values to represent this decorator’s contribution to E for any arbitrary resid pair.

get_version_name()[source]¶: Get a unique, versioned name of this decorator for maximal reproducibility

n_edge_features()[source]¶: How many features will this decorator add to edge tensors (E)?
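One common way to place an idealized C-beta uses only the backbone N, CA, and C coordinates, which is why the result does not depend on the native side chain. The sketch below uses coefficients that are widely used in structure-prediction code for building a virtual CB. Whether this library uses the same construction or constants is an assumption, and `ideal_cb` is a hypothetical helper name.

```python
import math


def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def ideal_cb(n, ca, c):
    """Place a virtual C-beta from backbone N, CA, C coordinates (Angstroms)."""
    b = sub(ca, n)
    c_vec = sub(c, ca)
    a = cross(b, c_vec)
    return tuple(
        -0.58273431 * a[i] + 0.56802827 * b[i] - 0.54067466 * c_vec[i] + ca[i]
        for i in range(3)
    )


# Roughly ideal backbone geometry for one residue, with CA at the origin.
n_xyz, ca_xyz, c_xyz = (-0.525, 1.363, 0.0), (0.0, 0.0, 0.0), (1.526, 0.0, 0.0)
cb_xyz = ideal_cb(n_xyz, ca_xyz, c_xyz)
ca_cb_length = math.dist(ca_xyz, cb_xyz)
```

For this input the virtual CB lands a little over 1.5 Angstroms from CA, close to a typical CA-CB bond length. The CB-CB distance for an edge would then be the distance between the virtual CB positions of the two residues.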

class menten_gcn.decorators.PhiPsiRadians(sincos: bool = False)[source]¶

Returns the phi and psi values of each residue position.

2-4 Node Features
0 Edge Features

Parameters: sincos (bool) – Return the sine and cosine of phi and psi instead of just the raw values.

calc_node_features(wrapped_pose, resid, dict_cache=None)[source]¶

This does all of the business logic of calculating the values to be added for each node.

Parameters

wrapped_pose (WrappedPose) – The pose we are currently generating data for
resid (int) – The residue ID we are currently generating data for
dict_cache (dict) – The same cache that was populated in “cache_data”. The user might not have created a cache so don’t assume this is not None. See the RosettaHBondDecorator for an example of how to use this

Returns

features (list) – The length of this list will be the same value as self.n_node_features(). These are the values to represent this decorator’s contribution to X for this resid.

describe_node_features()[source]¶

Returns descriptions of how each value is computed. Our goal is for these descriptions to be relatively concise but also have enough detail to fully reproduce these calculations.

Returns: features (list) – The length of this list will be the same value as self.n_node_features(). These are descriptions of the values to represent this decorator’s contribution to X for any arbitrary resid.

get_version_name()[source]¶: Get a unique, versioned name of this decorator for maximal reproducibility

n_node_features()[source]¶: How many features will this decorator add to node tensors (X)?
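The reason to prefer the sine/cosine form is that raw angles are discontinuous at the -pi/pi boundary. The sketch below illustrates this with a hypothetical helper, `phi_psi_features`. The ordering of the four values is an assumption made for illustration, not taken from the library.

```python
import math


def phi_psi_features(phi, psi, sincos=False):
    """phi and psi in radians -> 2 raw values or 4 sine/cosine values."""
    if sincos:
        return [math.sin(phi), math.cos(phi), math.sin(psi), math.cos(psi)]
    return [phi, psi]


raw = phi_psi_features(-math.pi, math.pi)
wrapped_a = phi_psi_features(-math.pi, 0.0, sincos=True)
wrapped_b = phi_psi_features(math.pi, 0.0, sincos=True)
```

Here phi = -pi and phi = pi describe the same conformation. The raw values differ by 2*pi, while the sine/cosine encodings agree to within floating-point error.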

class menten_gcn.decorators.ChiAngleDecorator(chi1: bool = True, chi2: bool = True, chi3: bool = True, chi4: bool = True, sincos: bool = True)[source]¶

Returns the chi values of each residue position. Ranges from -pi to pi or -1 to 1 if sincos=True.

WARNING: This can behave inconsistently for proton chis across modeling frameworks. Rosetta adds hydrogens when they are absent from the input file but MDtraj does not. This results in Rosetta calculating a chi value in some cases that MDtraj skips!

0-8 Node Features
0 Edge Features

Parameters

chi1 (bool) – Include chi1’s value
chi2 (bool) – Include chi2’s value
chi3 (bool) – Include chi3’s value
chi4 (bool) – Include chi4’s value
sincos (bool) – Return the sine and cosine of chi instead of just the raw values

calc_node_features(wrapped_pose, resid, dict_cache=None)[source]¶

This does all of the business logic of calculating the values to be added for each node.

Parameters

wrapped_pose (WrappedPose) – The pose we are currently generating data for
resid (int) – The residue ID we are currently generating data for
dict_cache (dict) – The same cache that was populated in “cache_data”. The user might not have created a cache so don’t assume this is not None. See the RosettaHBondDecorator for an example of how to use this

Returns

features (list) – The length of this list will be the same value as self.n_node_features(). These are the values to represent this decorator’s contribution to X for this resid.

describe_node_features()[source]¶

Returns descriptions of how each value is computed. Our goal is for these descriptions to be relatively concise but also have enough detail to fully reproduce these calculations.

Returns: features (list) – The length of this list will be the same value as self.n_node_features(). These are descriptions of the values to represent this decorator’s contribution to X for any arbitrary resid.

get_version_name()[source]¶: Get a unique, versioned name of this decorator for maximal reproducibility

n_edge_features()[source]¶: How many features will this decorator add to edge tensors (E)?

n_node_features()[source]¶: How many features will this decorator add to node tensors (X)?
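The "0-8 Node Features" range follows from the constructor flags if each enabled chi contributes one raw value, or two values when sincos is set. That reading is an assumption about how the count is derived, and `n_chi_node_features` is a hypothetical helper used only to show the arithmetic.

```python
def n_chi_node_features(chi1=True, chi2=True, chi3=True, chi4=True, sincos=True):
    """Count node features under the assumption of 1 (raw) or 2 (sin, cos) per chi."""
    n_angles = sum([chi1, chi2, chi3, chi4])
    return n_angles * (2 if sincos else 1)


all_on = n_chi_node_features()
raw_two = n_chi_node_features(chi3=False, chi4=False, sincos=False)
none_on = n_chi_node_features(False, False, False, False)
```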

class menten_gcn.decorators.trRosettaEdges(sincos: bool = False, use_nm: bool = False)[source]¶

Use the residue pair geometries used in this paper: https://www.pnas.org/content/117/3/1496/tab-figures-data

0 Node Features
4-7 Edge Features

Parameters

sincos (bool) – Return the sine and cosine of each angle instead of just the raw values.
use_nm (bool) – If True, measure distance in nanometers. Otherwise use Angstroms.

Note: This default value does not match the default of other decorators. This is for the sake of matching the trRosetta paper.

https://www.pnas.org/content/pnas/117/3/1496/F1.large.jpg

class menten_gcn.decorators.SimpleBBGeometry(use_nm=False)[source]¶

Meta-decorator that combines PhiPsiRadians(sincos=False) and CBCB_dist

2 Node Features
1 Edge Feature

Parameters: use_nm (bool) – If True, measure distance in nanometers. Otherwise use Angstroms.

class menten_gcn.decorators.StandardBBGeometry(use_nm=False)[source]¶

Meta-decorator that combines PhiPsiRadians(sincos=True) and trRosettaEdges(sincos=False)

4 Node Features
4 Edge Features

Parameters: use_nm (bool) – If True, measure distance in nanometers. Otherwise use Angstroms.

class menten_gcn.decorators.AdvancedBBGeometry(use_nm=False)[source]¶

Meta-decorator that combines PhiPsiRadians(sincos=True), CACA_dist, and trRosettaEdges(sincos=True)

4 Node Features
8 Edge Features

Parameters: use_nm (bool) – If True, measure all distances in nanometers. Otherwise use Angstroms.
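The feature counts of the three meta-decorators can be checked by adding up the counts of their components. The tally below reads the per-component numbers off the class summaries in this section and assumes that the low end of each range corresponds to sincos=False and the high end to sincos=True.

```python
# (node features, edge features) per component, read off the class summaries.
PHI_PSI_RAW = (2, 0)       # PhiPsiRadians(sincos=False)
PHI_PSI_SINCOS = (4, 0)    # PhiPsiRadians(sincos=True)
CACA = (0, 1)              # CACA_dist
CBCB = (0, 1)              # CBCB_dist
TR_EDGES_RAW = (0, 4)      # trRosettaEdges(sincos=False)
TR_EDGES_SINCOS = (0, 7)   # trRosettaEdges(sincos=True)


def combine(*parts):
    """Sum node and edge feature counts across components."""
    return (sum(p[0] for p in parts), sum(p[1] for p in parts))


simple = combine(PHI_PSI_RAW, CBCB)
standard = combine(PHI_PSI_SINCOS, TR_EDGES_RAW)
advanced = combine(PHI_PSI_SINCOS, CACA, TR_EDGES_SINCOS)
```

The totals come out to 2 node and 1 edge features for SimpleBBGeometry, 4 and 4 for StandardBBGeometry, and 4 and 8 for AdvancedBBGeometry, matching the counts listed for each class.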

Sequence¶

class menten_gcn.decorators.Sequence[source]¶

One-hot encode the canonical amino acid identity on each node.

20 Node Features
0 Edge Features

calc_node_features(wrapped_pose, resid, dict_cache=None)[source]¶

This does all of the business logic of calculating the values to be added for each node.

Parameters

wrapped_pose (WrappedPose) – The pose we are currently generating data for
resid (int) – The residue ID we are currently generating data for
dict_cache (dict) – The same cache that was populated in “cache_data”. The user might not have created a cache so don’t assume this is not None. See the RosettaHBondDecorator for an example of how to use this

Returns

features (list) – The length of this list will be the same value as self.n_node_features(). These are the values to represent this decorator’s contribution to X for this resid.

describe_node_features()[source]¶

Returns descriptions of how each value is computed. Our goal is for these descriptions to be relatively concise but also have enough detail to fully reproduce these calculations.

Returns: features (list) – The length of this list will be the same value as self.n_node_features(). These are descriptions of the values to represent this decorator’s contribution to X for any arbitrary resid.

get_version_name()[source]¶: Get a unique, versioned name of this decorator for maximal reproducibility

n_edge_features()[source]¶: How many features will this decorator add to edge tensors (E)?

n_node_features()[source]¶: How many features will this decorator add to node tensors (X)?
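As an illustration of the encoding, here is a minimal one-hot sketch over the 20 canonical amino acids. The alphabetical one-letter ordering in `CANONICAL_AAS` is an assumption made for this example. The ordering the library actually uses is reported by describe_node_features().

```python
CANONICAL_AAS = "ACDEFGHIKLMNPQRSTVWY"  # assumed ordering, for illustration only


def one_hot_aa(name1):
    """One-hot encode a one-letter amino acid code as 20 floats."""
    features = [0.0] * len(CANONICAL_AAS)
    features[CANONICAL_AAS.index(name1)] = 1.0
    return features


gly = one_hot_aa("G")
```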

class menten_gcn.decorators.DesignableSequence[source]¶

One-hot encode the canonical amino acid identity on each node, with a 21st value for residues that are not yet assigned an amino acid identity.

Note: requires you to call WrappedPose.set_designable_resids first

21 Node Features
0 Edge Features

calc_node_features(wrapped_pose, resid, dict_cache=None)[source]¶

This does all of the business logic of calculating the values to be added for each node.

Parameters

wrapped_pose (WrappedPose) – The pose we are currently generating data for
resid (int) – The residue ID we are currently generating data for
dict_cache (dict) – The same cache that was populated in “cache_data”. The user might not have created a cache so don’t assume this is not None. See the RosettaHBondDecorator for an example of how to use this

Returns

features (list) – The length of this list will be the same value as self.n_node_features(). These are the values to represent this decorator’s contribution to X for this resid.

describe_node_features()[source]¶

Returns descriptions of how each value is computed. Our goal is for these descriptions to be relatively concise but also have enough detail to fully reproduce these calculations.

Returns: features (list) – The length of this list will be the same value as self.n_node_features(). These are descriptions of the values to represent this decorator’s contribution to X for any arbitrary resid.

get_version_name()[source]¶: Get a unique, versioned name of this decorator for maximal reproducibility

n_edge_features()[source]¶: How many features will this decorator add to edge tensors (E)?

n_node_features()[source]¶: How many features will this decorator add to node tensors (X)?
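The 21-value variant can be sketched the same way. This example assumes that a designable residue sets only the extra 21st slot and leaves the 20 identity slots at zero. The helper name, the slot position, and the amino acid ordering are all hypothetical.

```python
CANONICAL_AAS = "ACDEFGHIKLMNPQRSTVWY"  # assumed ordering, for illustration only


def one_hot_designable(name1, is_designable):
    """21 floats: 20 identity slots plus one 'not yet assigned' slot."""
    features = [0.0] * (len(CANONICAL_AAS) + 1)
    if is_designable:
        features[-1] = 1.0
    else:
        features[CANONICAL_AAS.index(name1)] = 1.0
    return features


fixed_ala = one_hot_designable("A", is_designable=False)
open_site = one_hot_designable("A", is_designable=True)
```

In this sketch the native identity of a designable position is hidden from the model, since only the last slot is set.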

class menten_gcn.decorators.SequenceSeparation(ln: bool = True)[source]¶

The sequence distance between the two residues (i.e., number of residues between these two residues in sequence space, plus one). -1.0 if the two residues belong to different chains.

0 Node Features
1 Edge Feature

Parameters: ln (bool) – Report the natural log of the distance instead of the raw count. Does not apply to -1 values

calc_edge_features(wrapped_pose, resid1, resid2, dict_cache=None)[source]¶

This does all of the business logic of calculating the values to be added for each edge.

This function will never be called in the reverse order (with resid1 and resid2 swapped). Instead, we just create both edges at once.

Parameters

wrapped_pose (WrappedPose) – The pose we are currently generating data for
resid1 (int) – The first residue ID we are currently generating data for
resid2 (int) – The second residue ID we are currently generating data for
dict_cache (dict) – The same cache that was populated in “cache_data”. The user might not have created a cache so don’t assume this is not None. See the RosettaHBondDecorator for an example of how to use this

Returns

features (list) – The length of this list will be the same value as self.n_edge_features(). These are the values to represent this decorator’s contribution to E for the edge going from resid1 -> resid2.
inv_features (list) – The length of this list will be the same value as self.n_edge_features(). These are the values to represent this decorator’s contribution to E for the edge going from resid2 -> resid1.

describe_edge_features()[source]¶

Returns descriptions of how each value is computed. Our goal is for these descriptions to be relatively concise but also have enough detail to fully reproduce these calculations.

Returns: features (list) – The length of this list will be the same value as self.n_edge_features(). These are descriptions of the values to represent this decorator’s contribution to E for any arbitrary resid pair.

get_version_name()[source]¶: Get a unique, versioned name of this decorator for maximal reproducibility

n_edge_features()[source]¶: How many features will this decorator add to edge tensors (E)?

n_node_features()[source]¶: How many features will this decorator add to node tensors (X)?
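The definition above works out to the absolute difference of the two residue indices when both residues are on the same chain. A minimal sketch follows, with a hypothetical function name and with chain labels passed in explicitly rather than read from a pose. It assumes resid1 and resid2 are different residues, since the log of zero is undefined.

```python
import math


def sequence_separation(resid1, chain1, resid2, chain2, ln=True):
    """Sequence distance between two distinct residues, or -1.0 across chains."""
    if chain1 != chain2:
        return -1.0  # the log is never applied to this sentinel value
    separation = abs(resid1 - resid2)
    return math.log(separation) if ln else float(separation)


raw_sep = sequence_separation(10, "A", 13, "A", ln=False)
log_sep = sequence_separation(10, "A", 13, "A")
neighbors = sequence_separation(10, "A", 11, "A")
cross_chain = sequence_separation(10, "A", 13, "B")
```

With ln=True, directly bonded neighbors map to 0.0 and the value grows slowly for residues that are far apart in sequence.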

class menten_gcn.decorators.SameChain[source]¶

1 if the two residues are part of the same protein chain. Otherwise 0.

0 Node Features
1 Edge Feature

calc_edge_features(wrapped_pose, resid1, resid2, dict_cache=None)[source]¶

This does all of the business logic of calculating the values to be added for each edge.

This function will never be called in the reverse order (with resid1 and resid2 swapped). Instead, we just create both edges at once.

Parameters

wrapped_pose (WrappedPose) – The pose we are currently generating data for
resid1 (int) – The first residue ID we are currently generating data for
resid2 (int) – The second residue ID we are currently generating data for
dict_cache (dict) – The same cache that was populated in “cache_data”. The user might not have created a cache so don’t assume this is not None. See the RosettaHBondDecorator for an example of how to use this

Returns

features (list) – The length of this list will be the same value as self.n_edge_features(). These are the values to represent this decorator’s contribution to E for the edge going from resid1 -> resid2.
inv_features (list) – The length of this list will be the same value as self.n_edge_features(). These are the values to represent this decorator’s contribution to E for the edge going from resid2 -> resid1.

describe_edge_features()[source]¶

Returns descriptions of how each value is computed. Our goal is for these descriptions to be relatively concise but also have enough detail to fully reproduce these calculations.

Returns: features (list) – The length of this list will be the same value as self.n_edge_features(). These are descriptions of the values to represent this decorator’s contribution to E for any arbitrary resid pair.

get_version_name()[source]¶: Get a unique, versioned name of this decorator for maximal reproducibility

n_edge_features()[source]¶: How many features will this decorator add to edge tensors (E)?

n_node_features()[source]¶: How many features will this decorator add to node tensors (X)?

Rosetta¶

class menten_gcn.decorators.RosettaResidueSelectorDecorator(selector, description: str)[source]¶

Takes a user-provided residue selector and labels each residue with a 1 or 0 accordingly.

1 Node Feature
0 Edge Features

Parameters

selector (ResidueSelector) – This residue selector will be applied to the Rosetta pose
description (str) – This is the string that will label this feature in the final summary. Not technically required but highly recommended

Example:

import menten_gcn as mg
import menten_gcn.decorators as decs
import pyrosetta

pyrosetta.init()

buried = pyrosetta.rosetta.core.select.residue_selector.LayerSelector()
buried.set_layers( True, False, False )
buried_dec = decs.RosettaResidueSelectorDecorator( selector=buried, description='<Layer select_core="true" />' )

data_maker = mg.DataMaker( decorators=[ buried_dec ], edge_distance_cutoff_A=10.0, max_residues=30 )
data_maker.summary()

Gives:

Summary:

2 Node Features:
1 : 1 if the node is a focus residue, 0 otherwise
2 : 1.0 if the residue is selected by the residue selector, 0.0 otherwise. User defined definition of the residue selector and how to reproduce it: <Layer select_core="true" />

1 Edge Features:
1 : 1.0 if the two residues are polymer-bonded, 0.0 otherwise

Note that the additional features are due to the BareBonesDecorator, which is included by default

class menten_gcn.decorators.RosettaResidueSelectorFromXML(xml_str: str, res_sele_name: str)[source]¶

Takes a user-provided residue selector via XML and labels each residue with a 1 or 0 accordingly.

1 Node Feature
0 Edge Features

Parameters

xml_str (str) – XML snippet that defines the selector
res_sele_name (str) – The name of the selector within the snippet

Example:

import menten_gcn as mg
import menten_gcn.decorators as decs
import pyrosetta

pyrosetta.init()
xml = '''
<RESIDUE_SELECTORS>
<Layer name="surface" select_surface="true" />
</RESIDUE_SELECTORS>
'''
surface_dec = decs.RosettaResidueSelectorFromXML( xml, "surface" )

max_res=30
data_maker = mg.DataMaker( decorators=[ surface_dec ], edge_distance_cutoff_A=10.0, max_residues=max_res )
data_maker.summary()

Gives:

Summary:

2 Node Features:
1 : 1 if the node is a focus residue, 0 otherwise
2 : 1.0 if the residue is selected by the residue selector, 0.0 otherwise. User defined definition of the residue selector and how to reproduce it: Took the residue selector named surface from this XML:
<RESIDUE_SELECTORS>
<Layer name="surface" select_surface="true" />
</RESIDUE_SELECTORS>


1 Edge Features:
1 : 1.0 if the two residues are polymer-bonded, 0.0 otherwise

Note that the additional features are due to the BareBonesDecorator, which is included by default

class menten_gcn.decorators.RosettaJumpDecorator(use_nm: bool = False, rottype: str = 'euler')[source]¶

Measures the translational and rotational relationships between all residue pairs. This uses internal coordinate frames so it is agnostic to the global coordinate system. You can move/rotate your protein around and these will stay the same.

0 Node Features
6-12 Edge Features

Parameters

use_nm (bool) – If False (default), measure distance in Angstroms. If True, use nanometers.
rottype (str) – How do you want to represent the rotational degrees of freedom? Options are “euler” (default), “euler_sincos”, “matrix”, “quat”, “rotvec”, and “rotvec_sincos”.

calc_edge_features(wrapped_pose, resid1, resid2, dict_cache=None)[source]¶

This does all of the business logic of calculating the values to be added for each edge.

This function will never be called in the reverse order (with resid1 and resid2 swapped). Instead, we just create both edges at once.

Parameters

wrapped_pose (WrappedPose) – The pose we are currently generating data for
resid1 (int) – The first residue ID we are currently generating data for
resid2 (int) – The second residue ID we are currently generating data for
dict_cache (dict) – The same cache that was populated in “cache_data”. The user might not have created a cache so don’t assume this is not None. See the RosettaHBondDecorator for an example of how to use this

Returns

features (list) – The length of this list will be the same value as self.n_edge_features(). These are the values to represent this decorator’s contribution to E for the edge going from resid1 -> resid2.
inv_features (list) – The length of this list will be the same value as self.n_edge_features(). These are the values to represent this decorator’s contribution to E for the edge going from resid2 -> resid1.

describe_edge_features()[source]¶

Returns descriptions of how each value is computed. Our goal is for these descriptions to be relatively concise but also have enough detail to fully reproduce these calculations.

Returns: features (list) – The length of this list will be the same value as self.n_edge_features(). These are descriptions of the values to represent this decorator’s contribution to E for any arbitrary resid pair.

get_version_name()[source]¶: Get a unique, versioned name of this decorator for maximal reproducibility

n_edge_features()[source]¶: How many features will this decorator add to edge tensors (E)?

n_node_features()[source]¶: How many features will this decorator add to node tensors (X)?
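The "6-12 Edge Features" range is consistent with three translational values plus a rotation representation of varying size. The per-representation sizes below are an assumption based on the usual dimensionality of each representation, with the "_sincos" variants doubling the three angle values. They are not taken from the library source.

```python
TRANSLATION_DOF = 3

# Assumed size of each rotation representation (not read from the library).
ROTATION_SIZES = {
    "euler": 3,
    "euler_sincos": 6,
    "matrix": 9,
    "quat": 4,
    "rotvec": 3,
    "rotvec_sincos": 6,
}


def n_jump_edge_features(rottype="euler"):
    """Translation values plus the assumed size of the rotation encoding."""
    return TRANSLATION_DOF + ROTATION_SIZES[rottype]


counts = {name: n_jump_edge_features(name) for name in ROTATION_SIZES}
```

Under these assumptions "euler" and "rotvec" give the minimum of 6 features and "matrix" gives the maximum of 12. Calling n_edge_features() on a constructed decorator is the reliable way to confirm the count for a given rottype.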

class menten_gcn.decorators.RosettaHBondDecorator(sfxn=None, bb_only: bool = False)[source]¶

Labels each edge with hydrogen-bond features for the residue pair, calculated with the given score function.

0 Node Features
1-5 Edge Features (depending on bb_only)

Parameters

sfxn (ScoreFunction) – Score function used to calculate hbonds. We will use Rosetta’s default if this is None
bb_only (bool) – Only consider backbone-backbone hbonds. Reduces the number of features from 5 down to 1
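Other decorators on this page point to RosettaHBondDecorator as the example of using dict_cache. The general pattern is to do a whole-pose calculation once, store it in the cache, and look results up per edge while still handling a missing cache. The sketch below illustrates that pattern with a toy class. The `cache_data(wrapped_pose, dict_cache)` signature, the cache key, and the use of a plain list of residue pairs in place of a pose are all assumptions made for illustration.

```python
class ToyHBondLikeDecorator:
    """Hypothetical sketch of the caching pattern; not the library's class."""

    CACHE_KEY = "toy_bonded_pairs"

    def n_edge_features(self):
        return 1

    def cache_data(self, wrapped_pose, dict_cache):
        # Do the expensive whole-pose calculation once and store the result.
        # Here wrapped_pose is just an iterable of bonded (resid1, resid2) pairs.
        dict_cache[self.CACHE_KEY] = {frozenset(pair) for pair in wrapped_pose}

    def calc_edge_features(self, wrapped_pose, resid1, resid2, dict_cache=None):
        if dict_cache is None or self.CACHE_KEY not in dict_cache:
            # No cache was provided, so fall back to recomputing on the fly.
            bonded = {frozenset(pair) for pair in wrapped_pose}
        else:
            bonded = dict_cache[self.CACHE_KEY]
        value = 1.0 if frozenset((resid1, resid2)) in bonded else 0.0
        return [value], [value]


toy_pose = [(1, 5), (2, 9)]
dec = ToyHBondLikeDecorator()
cache = {}
dec.cache_data(toy_pose, cache)
with_cache = dec.calc_edge_features(toy_pose, 5, 1, dict_cache=cache)
without_cache = dec.calc_edge_features(toy_pose, 5, 1)
miss = dec.calc_edge_features(toy_pose, 1, 2, dict_cache=cache)
```

The fallback branch is what "don't assume this is not None" asks for: the result is the same with or without a cache, and the cache only saves repeated work.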

class menten_gcn.decorators.Rosetta_Ref2015_OneBodyEneriges(individual: bool = False, score_types=None)[source]¶

Label each node with its Rosetta one-body energy

1 - 20-ish Node Features
0 Edge Features

Parameters

individual (bool) – If true, list the score for each term individually. Otherwise sum them all into one value.
score_types (list of ScoreTypes) – Only use these score types. None (default) includes all default types. Note - this only applies if individual == True

get_version_name()[source]¶: Get a unique, versioned name of this decorator for maximal reproducibility

class menten_gcn.decorators.Rosetta_Ref2015_TwoBodyEneriges(individual: bool = False, score_types=None)[source]¶

Label each edge with its Rosetta two-body energy

0 Node Features
1 - 20-ish Edge Features

Parameters

individual (bool) – If true, list the score for each term individually. Otherwise sum them all into one value.
score_types (list of ScoreTypes) – Only use these score types. None (default) includes all default types. Note - this only applies if individual == True

get_version_name()[source]¶: Get a unique, versioned name of this decorator for maximal reproducibility

class menten_gcn.decorators.Ref2015Decorator(individual: bool = False, score_types=None)[source]¶

Meta-decorator that combines Rosetta_Ref2015_OneBodyEneriges and Rosetta_Ref2015_TwoBodyEneriges

1 - 20-ish Node Features
1 - 20-ish Edge Features

Parameters

individual (bool) – If true, list the score for each term individually. Otherwise sum them all into one value.
score_types (list of ScoreTypes) – Only use these score types. None (default) includes all default types. Note - this only applies if individual == True

get_version_name()[source]¶: Get a unique, versioned name of this decorator for maximal reproducibility
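The interaction between individual and score_types in the three energy decorators can be sketched with a plain dictionary of per-term scores. The helper name, the term values, and the sorted ordering of terms are made up for illustration. Only the summed-versus-itemized behavior is taken from the parameter descriptions above.

```python
def energy_features(term_scores, individual=False, score_types=None):
    """term_scores: dict mapping a score-term name to its value for one node."""
    if not individual:
        # score_types is ignored here, mirroring the note that it only
        # applies when individual is True.
        return [sum(term_scores.values())]
    names = score_types if score_types is not None else sorted(term_scores)
    return [term_scores[name] for name in names]


scores = {"fa_dun": 1.5, "p_aa_pp": -0.25, "ref": 0.75}  # made-up values
summed = energy_features(scores)
per_term = energy_features(scores, individual=True)
subset = energy_features(scores, individual=True, score_types=["ref"])
```

With individual=False the decorator contributes a single value, which is the low end of the "1 - 20-ish" range. With individual=True the feature count equals the number of score terms in use.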