Decorators
class menten_gcn.decorators.BareBonesDecorator
This decorator is included in all DataMakers by default. It is the starting point upon which everything else is built: it labels focus nodes, and it labels edges between residues that are polymer-bonded to one another.
1 Node Feature
1 Edge Feature
calc_edge_features(wrapped_pose, resid1, resid2, dict_cache=None)
This does all of the business logic of calculating the values to be added for each edge.
This function will never be called in the reverse order (with resid1 and resid2 swapped). Instead, we just create both edges at once.
- Parameters
wrapped_pose (WrappedPose) – The pose we are currently generating data for
resid1 (int) – The first residue ID we are currently generating data for
resid2 (int) – The second residue ID we are currently generating data for
dict_cache (dict) – The same cache that was populated in “cache_data”. The user might not have created a cache so don’t assume this is not None. See the RosettaHBondDecorator for an example of how to use this
- Returns
features (list) – The length of this list will be the same value as self.n_edge_features(). These are the values to represent this decorator’s contribution to E for the edge going from resid1 -> resid2.
inv_features (list) – The length of this list will be the same value as self.n_edge_features(). These are the values to represent this decorator’s contribution to E for the edge going from resid2 -> resid1.
calc_node_features(wrapped_pose, resid, dict_cache=None)
This does all of the business logic of calculating the values to be added for each node.
- Parameters
wrapped_pose (WrappedPose) – The pose we are currently generating data for
resid (int) – The residue ID we are currently generating data for
dict_cache (dict) – The same cache that was populated in “cache_data”. The user might not have created a cache so don’t assume this is not None. See the RosettaHBondDecorator for an example of how to use this
- Returns
features (list) – The length of this list will be the same value as self.n_node_features(). These are the values to represent this decorator’s contribution to X for this resid.
describe_edge_features()
Returns descriptions of how each value is computed. Our goal is for these descriptions to be relatively concise but also have enough detail to fully reproduce these calculations.
- Returns
features (list) – The length of this list will be the same value as self.n_edge_features(). These are descriptions of the values to represent this decorator’s contribution to E for any arbitrary resid pair.
describe_node_features()
Returns descriptions of how each value is computed. Our goal is for these descriptions to be relatively concise but also have enough detail to fully reproduce these calculations.
- Returns
features (list) – The length of this list will be the same value as self.n_node_features(). These are descriptions of the values to represent this decorator’s contribution to X for any arbitrary resid.
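The method contracts above can be illustrated with a small stand-in class. This is only a sketch: it does not import menten_gcn, and ToyDecorator, ToyPose, and chain_of are hypothetical names invented for this example. A real decorator would presumably build on the library's own decorator base class and receive a real WrappedPose.

```python
from typing import List, Optional, Tuple


class ToyPose:
    """Hypothetical stand-in for WrappedPose: maps resid -> chain letter."""

    def __init__(self, chains: dict):
        self.chains = chains

    def chain_of(self, resid: int) -> str:
        return self.chains[resid]


class ToyDecorator:
    """Mimics the documented method contracts; not the library's base class."""

    def n_node_features(self) -> int:
        return 1

    def n_edge_features(self) -> int:
        return 1

    def describe_node_features(self) -> List[str]:
        return ["1 if the residue is on chain A, 0 otherwise"]

    def describe_edge_features(self) -> List[str]:
        return ["1 if resid1 is numbered before resid2, 0 otherwise"]

    def calc_node_features(self, wrapped_pose, resid: int,
                           dict_cache: Optional[dict] = None) -> List[float]:
        # dict_cache may be None, so this sketch never dereferences it.
        return [1.0 if wrapped_pose.chain_of(resid) == "A" else 0.0]

    def calc_edge_features(self, wrapped_pose, resid1: int, resid2: int,
                           dict_cache: Optional[dict] = None
                           ) -> Tuple[List[float], List[float]]:
        # Both directions are produced in one call:
        # first resid1 -> resid2, then resid2 -> resid1.
        forward = [1.0 if resid1 < resid2 else 0.0]
        inverse = [1.0 if resid2 < resid1 else 0.0]
        return forward, inverse
```

Each returned list has the same length as the matching n_node_features() or n_edge_features() value, and the describe_* lists line up with those values one to one.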
Geometry
class menten_gcn.decorators.CACA_dist(use_nm: bool = False)
Measures the distance between the two C-Alpha atoms of each residue pair.
0 Node Features
1 Edge Feature
- Parameters
use_nm (bool) – If true, measure distance in nanometers. Otherwise (default) use Angstroms.
calc_edge_features(wrapped_pose, resid1: int, resid2: int, dict_cache=None)
This does all of the business logic of calculating the values to be added for each edge.
This function will never be called in the reverse order (with resid1 and resid2 swapped). Instead, we just create both edges at once.
- Parameters
wrapped_pose (WrappedPose) – The pose we are currently generating data for
resid1 (int) – The first residue ID we are currently generating data for
resid2 (int) – The second residue ID we are currently generating data for
dict_cache (dict) – The same cache that was populated in “cache_data”. The user might not have created a cache so don’t assume this is not None. See the RosettaHBondDecorator for an example of how to use this
- Returns
features (list) – The length of this list will be the same value as self.n_edge_features(). These are the values to represent this decorator’s contribution to E for the edge going from resid1 -> resid2.
inv_features (list) – The length of this list will be the same value as self.n_edge_features(). These are the values to represent this decorator’s contribution to E for the edge going from resid2 -> resid1.
describe_edge_features()
Returns descriptions of how each value is computed. Our goal is for these descriptions to be relatively concise but also have enough detail to fully reproduce these calculations.
- Returns
features (list) – The length of this list will be the same value as self.n_edge_features(). These are descriptions of the values to represent this decorator’s contribution to E for any arbitrary resid pair.
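As a rough illustration of the single edge feature, the sketch below computes a C-Alpha to C-Alpha distance from raw coordinates. It is not the library's implementation: ca_distance is a hypothetical helper, the coordinates are assumed to be given in Angstroms, and it assumes (from the parameter name) that use_nm=True means the result is reported in nanometers.

```python
import math
from typing import Sequence


def ca_distance(ca1: Sequence[float], ca2: Sequence[float],
                use_nm: bool = False) -> float:
    """Distance between two C-Alpha coordinates given in Angstroms."""
    dist_angstrom = math.dist(ca1, ca2)  # Euclidean distance (Python 3.8+)
    # 1 nanometer = 10 Angstroms
    return dist_angstrom / 10.0 if use_nm else dist_angstrom
```

Because the distance is symmetric, the resid1 -> resid2 and resid2 -> resid1 edges would carry the same value.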
class menten_gcn.decorators.CBCB_dist(use_nm: bool = False)
Measures the distance between the two C-Beta atoms of each residue pair. Note: We will calculate the “ideal ALA” CB location even if this residue has a CB atom. This may sound silly but it is intended to prevent noise from different native amino acid types.
0 Node Features
1 Edge Feature
- Parameters
use_nm (bool) – If true, measure distance in nanometers. Otherwise (default) use Angstroms.
calc_edge_features(wrapped_pose, resid1: int, resid2: int, dict_cache=None)
This does all of the business logic of calculating the values to be added for each edge.
This function will never be called in the reverse order (with resid1 and resid2 swapped). Instead, we just create both edges at once.
- Parameters
wrapped_pose (WrappedPose) – The pose we are currently generating data for
resid1 (int) – The first residue ID we are currently generating data for
resid2 (int) – The second residue ID we are currently generating data for
dict_cache (dict) – The same cache that was populated in “cache_data”. The user might not have created a cache so don’t assume this is not None. See the RosettaHBondDecorator for an example of how to use this
- Returns
features (list) – The length of this list will be the same value as self.n_edge_features(). These are the values to represent this decorator’s contribution to E for the edge going from resid1 -> resid2.
inv_features (list) – The length of this list will be the same value as self.n_edge_features(). These are the values to represent this decorator’s contribution to E for the edge going from resid2 -> resid1.
describe_edge_features()
Returns descriptions of how each value is computed. Our goal is for these descriptions to be relatively concise but also have enough detail to fully reproduce these calculations.
- Returns
features (list) – The length of this list will be the same value as self.n_edge_features(). These are descriptions of the values to represent this decorator’s contribution to E for any arbitrary resid pair.
class menten_gcn.decorators.PhiPsiRadians(sincos: bool = False)
Returns the phi and psi values of each residue position.
2-4 Node Features
0 Edge Features
- Parameters
sincos (bool) – Return the sine and cosine of phi and psi instead of just the raw values.
calc_node_features(wrapped_pose, resid, dict_cache=None)
This does all of the business logic of calculating the values to be added for each node.
- Parameters
wrapped_pose (WrappedPose) – The pose we are currently generating data for
resid (int) – The residue ID we are currently generating data for
dict_cache (dict) – The same cache that was populated in “cache_data”. The user might not have created a cache so don’t assume this is not None. See the RosettaHBondDecorator for an example of how to use this
- Returns
features (list) – The length of this list will be the same value as self.n_node_features(). These are the values to represent this decorator’s contribution to X for this resid.
describe_node_features()
Returns descriptions of how each value is computed. Our goal is for these descriptions to be relatively concise but also have enough detail to fully reproduce these calculations.
- Returns
features (list) – The length of this list will be the same value as self.n_node_features(). These are descriptions of the values to represent this decorator’s contribution to X for any arbitrary resid.
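The difference between the 2-feature and 4-feature modes can be sketched as follows. The helper encode_phi_psi is hypothetical, and the ordering of the four values is an assumption made for this example; use describe_node_features() to see the ordering the library actually reports.

```python
import math
from typing import List


def encode_phi_psi(phi: float, psi: float, sincos: bool = False) -> List[float]:
    """Encode backbone torsions (in radians) as 2 raw or 4 sin/cos values."""
    if sincos:
        # sin/cos removes the discontinuity where an angle wraps from pi to -pi
        return [math.sin(phi), math.cos(phi), math.sin(psi), math.cos(psi)]
    return [phi, psi]
```

With raw values, angles just below pi and just above -pi look far apart even though they are nearly identical; the sine/cosine encoding keeps them close, at the cost of two extra features.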
class menten_gcn.decorators.ChiAngleDecorator(chi1: bool = True, chi2: bool = True, chi3: bool = True, chi4: bool = True, sincos: bool = True)
Returns the chi values of each residue position. Values range from -pi to pi, or from -1 to 1 if sincos=True.
WARNING: This can behave inconsistently for proton chis across modeling frameworks. Rosetta adds hydrogens when they are absent from the input file but MDtraj does not. As a result, Rosetta calculates a chi value in some cases that MDtraj skips!
0-8 Node Features
0 Edge Features
- Parameters
chi1 (bool) – Include chi1’s value
chi2 (bool) – Include chi2’s value
chi3 (bool) – Include chi3’s value
chi4 (bool) – Include chi4’s value
sincos (bool) – Return the sine and cosine of chi instead of just the raw values
calc_node_features(wrapped_pose, resid, dict_cache=None)
This does all of the business logic of calculating the values to be added for each node.
- Parameters
wrapped_pose (WrappedPose) – The pose we are currently generating data for
resid (int) – The residue ID we are currently generating data for
dict_cache (dict) – The same cache that was populated in “cache_data”. The user might not have created a cache so don’t assume this is not None. See the RosettaHBondDecorator for an example of how to use this
- Returns
features (list) – The length of this list will be the same value as self.n_node_features(). These are the values to represent this decorator’s contribution to X for this resid.
describe_node_features()
Returns descriptions of how each value is computed. Our goal is for these descriptions to be relatively concise but also have enough detail to fully reproduce these calculations.
- Returns
features (list) – The length of this list will be the same value as self.n_node_features(). These are descriptions of the values to represent this decorator’s contribution to X for any arbitrary resid.
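The documented range of 0-8 node features is consistent with each enabled chi angle contributing one value, or two when sincos=True. The sketch below works through that arithmetic; n_chi_features is a hypothetical helper and the counting rule is an assumption inferred from the documented range, not taken from the library source.

```python
def n_chi_features(chi1: bool = True, chi2: bool = True, chi3: bool = True,
                   chi4: bool = True, sincos: bool = True) -> int:
    """Assumed feature count: one value per enabled chi, doubled for sin/cos."""
    n_angles = sum([chi1, chi2, chi3, chi4])  # True counts as 1
    return n_angles * (2 if sincos else 1)
```

Under this assumption the defaults give 8 features, and disabling every chi gives 0.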
class menten_gcn.decorators.trRosettaEdges(sincos: bool = False, use_nm: bool = False)
Uses the residue-pair geometries described in this paper: https://www.pnas.org/content/117/3/1496/tab-figures-data
0 Node Features
4-7 Edge Features
- Parameters
sincos (bool) – Return the sine and cosine of each angle instead of just the raw values.
use_nm (bool) – If true, measure distance in nanometers. Otherwise use Angstroms.
Note: This default value does not match the default of other decorators. This is for the sake of matching the trRosetta paper.
class menten_gcn.decorators.SimpleBBGeometry(use_nm=False)
Meta-decorator that combines PhiPsiRadians(sincos=False) and CBCB_dist.
2 Node Features
1 Edge Feature
- Parameters
use_nm (bool) – If true, measure distance in nanometers. Otherwise use Angstroms.
Sequence
class menten_gcn.decorators.Sequence
One-hot encode the canonical amino acid identity on each node.
20 Node Features
0 Edge Features
calc_node_features(wrapped_pose, resid, dict_cache=None)
This does all of the business logic of calculating the values to be added for each node.
- Parameters
wrapped_pose (WrappedPose) – The pose we are currently generating data for
resid (int) – The residue ID we are currently generating data for
dict_cache (dict) – The same cache that was populated in “cache_data”. The user might not have created a cache so don’t assume this is not None. See the RosettaHBondDecorator for an example of how to use this
- Returns
features (list) – The length of this list will be the same value as self.n_node_features(). These are the values to represent this decorator’s contribution to X for this resid.
describe_node_features()
Returns descriptions of how each value is computed. Our goal is for these descriptions to be relatively concise but also have enough detail to fully reproduce these calculations.
- Returns
features (list) – The length of this list will be the same value as self.n_node_features(). These are descriptions of the values to represent this decorator’s contribution to X for any arbitrary resid.
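A one-hot encoding over the 20 canonical amino acids can be sketched as below. The helper one_hot_residue and the alphabetical ordering of one-letter codes are assumptions made for this example; the ordering the library actually uses is what describe_node_features() reports.

```python
from typing import List

# Assumed ordering for this sketch: alphabetical by one-letter code.
CANONICAL_AAS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot_residue(one_letter_code: str) -> List[float]:
    """Return a 20-element list with a single 1.0 at the residue's index."""
    if len(one_letter_code) != 1 or one_letter_code not in CANONICAL_AAS:
        raise ValueError(f"not a canonical amino acid: {one_letter_code!r}")
    features = [0.0] * len(CANONICAL_AAS)
    features[CANONICAL_AAS.index(one_letter_code)] = 1.0
    return features
```

A variant with a 21st slot, as DesignableSequence describes, would append one extra value that is set for residues without an assigned identity.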
class menten_gcn.decorators.DesignableSequence
One-hot encode the canonical amino acid identity on each node, with a 21st value for residues that are not yet assigned an amino acid identity.
Note: requires you to call WrappedPose.set_designable_resids first
21 Node Features
0 Edge Features
calc_node_features(wrapped_pose, resid, dict_cache=None)
This does all of the business logic of calculating the values to be added for each node.
- Parameters
wrapped_pose (WrappedPose) – The pose we are currently generating data for
resid (int) – The residue ID we are currently generating data for
dict_cache (dict) – The same cache that was populated in “cache_data”. The user might not have created a cache so don’t assume this is not None. See the RosettaHBondDecorator for an example of how to use this
- Returns
features (list) – The length of this list will be the same value as self.n_node_features(). These are the values to represent this decorator’s contribution to X for this resid.
describe_node_features()
Returns descriptions of how each value is computed. Our goal is for these descriptions to be relatively concise but also have enough detail to fully reproduce these calculations.
- Returns
features (list) – The length of this list will be the same value as self.n_node_features(). These are descriptions of the values to represent this decorator’s contribution to X for any arbitrary resid.
class menten_gcn.decorators.SequenceSeparation(ln: bool = True)
The sequence distance between the two residues (i.e., number of residues between these two residues in sequence space, plus one). -1.0 if the two residues belong to different chains.
0 Node Features
1 Edge Feature
- Parameters
ln (bool) – Report the natural log of the distance instead of the raw count. Does not apply to -1 values
calc_edge_features(wrapped_pose, resid1, resid2, dict_cache=None)
This does all of the business logic of calculating the values to be added for each edge.
This function will never be called in the reverse order (with resid1 and resid2 swapped). Instead, we just create both edges at once.
- Parameters
wrapped_pose (WrappedPose) – The pose we are currently generating data for
resid1 (int) – The first residue ID we are currently generating data for
resid2 (int) – The second residue ID we are currently generating data for
dict_cache (dict) – The same cache that was populated in “cache_data”. The user might not have created a cache so don’t assume this is not None. See the RosettaHBondDecorator for an example of how to use this
- Returns
features (list) – The length of this list will be the same value as self.n_edge_features(). These are the values to represent this decorator’s contribution to E for the edge going from resid1 -> resid2.
inv_features (list) – The length of this list will be the same value as self.n_edge_features(). These are the values to represent this decorator’s contribution to E for the edge going from resid2 -> resid1.
describe_edge_features()
Returns descriptions of how each value is computed. Our goal is for these descriptions to be relatively concise but also have enough detail to fully reproduce these calculations.
- Returns
features (list) – The length of this list will be the same value as self.n_edge_features(). These are descriptions of the values to represent this decorator’s contribution to E for any arbitrary resid pair.
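The rule described for this decorator can be worked through in a few lines. The function sequence_separation is a hypothetical helper written for this example, and it assumes residue IDs are numbered contiguously within a chain, so that the number of residues in between plus one equals the absolute difference of the two IDs.

```python
import math


def sequence_separation(resid1: int, chain1: str, resid2: int, chain2: str,
                        ln: bool = True) -> float:
    """Sequence distance for a residue pair, or -1.0 across chains."""
    if chain1 != chain2:
        return -1.0  # the log is never applied to the -1 sentinel
    separation = abs(resid1 - resid2)
    if separation == 0:
        raise ValueError("an edge needs two distinct residues")
    return math.log(separation) if ln else float(separation)
```

For example, residues 10 and 13 on the same chain have two residues between them, so the raw value is 3 and the ln=True value is the natural log of 3. Adjacent residues give a raw value of 1 and a log value of 0.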
class menten_gcn.decorators.SameChain
1 if the two residues are part of the same protein chain. Otherwise 0.
0 Node Features
1 Edge Feature
calc_edge_features(wrapped_pose, resid1, resid2, dict_cache=None)
This does all of the business logic of calculating the values to be added for each edge.
This function will never be called in the reverse order (with resid1 and resid2 swapped). Instead, we just create both edges at once.
- Parameters
wrapped_pose (WrappedPose) – The pose we are currently generating data for
resid1 (int) – The first residue ID we are currently generating data for
resid2 (int) – The second residue ID we are currently generating data for
dict_cache (dict) – The same cache that was populated in “cache_data”. The user might not have created a cache so don’t assume this is not None. See the RosettaHBondDecorator for an example of how to use this
- Returns
features (list) – The length of this list will be the same value as self.n_edge_features(). These are the values to represent this decorator’s contribution to E for the edge going from resid1 -> resid2.
inv_features (list) – The length of this list will be the same value as self.n_edge_features(). These are the values to represent this decorator’s contribution to E for the edge going from resid2 -> resid1.
describe_edge_features()
Returns descriptions of how each value is computed. Our goal is for these descriptions to be relatively concise but also have enough detail to fully reproduce these calculations.
- Returns
features (list) – The length of this list will be the same value as self.n_edge_features(). These are descriptions of the values to represent this decorator’s contribution to E for any arbitrary resid pair.
Rosetta
class menten_gcn.decorators.RosettaResidueSelectorDecorator(selector, description: str)
Takes a user-provided residue selector and labels each residue with a 1 or 0 accordingly.
1 Node Feature
0 Edge Features
- Parameters
selector (ResidueSelector) – This residue selector will be applied to the Rosetta pose
description (str) – This is the string that will label this feature in the final summary. Not technically required but highly recommended
Example:
import menten_gcn as mg
import menten_gcn.decorators as decs
import pyrosetta

pyrosetta.init()

buried = pyrosetta.rosetta.core.select.residue_selector.LayerSelector()
buried.set_layers( True, False, False )
buried_dec = decs.RosettaResidueSelectorDecorator( selector=buried, description='<Layer select_core="true" />' )

data_maker = mg.DataMaker( decorators=[ buried_dec ], edge_distance_cutoff_A=10.0, max_residues=30 )
data_maker.summary()
Gives:
Summary:

2 Node Features:
1 : 1 if the node is a focus residue, 0 otherwise
2 : 1.0 if the residue is selected by the residue selector, 0.0 otherwise. User defined definition of the residue selector and how to reproduce it: <Layer select_core="true" />

1 Edge Features:
1 : 1.0 if the two residues are polymer-bonded, 0.0 otherwise
Note that the additional features are due to the BareBonesDecorator, which is included by default
class menten_gcn.decorators.RosettaResidueSelectorFromXML(xml_str: str, res_sele_name: str)
Takes a user-provided residue selector via XML and labels each residue with a 1 or 0 accordingly.
1 Node Feature
0 Edge Features
- Parameters
xml_str (str) – XML snippet that defines the selector
res_sele_name (str) – The name of the selector within the snippet
Example:
import menten_gcn as mg
import menten_gcn.decorators as decs
import pyrosetta

pyrosetta.init()

xml = '''
<RESIDUE_SELECTORS>
<Layer name="surface" select_surface="true" />
</RESIDUE_SELECTORS>
'''
surface_dec = decs.RosettaResidueSelectorFromXML( xml, "surface" )

max_res=30
data_maker = mg.DataMaker( decorators=[ surface_dec ], edge_distance_cutoff_A=10.0, max_residues=max_res )
data_maker.summary()
Gives:
Summary:

2 Node Features:
1 : 1 if the node is a focus residue, 0 otherwise
2 : 1.0 if the residue is selected by the residue selector, 0.0 otherwise. User defined definition of the residue selector and how to reproduce it: Took the residue selector named surface from this XML:
<RESIDUE_SELECTORS>
<Layer name="surface" select_surface="true" />
</RESIDUE_SELECTORS>

1 Edge Features:
1 : 1.0 if the two residues are polymer-bonded, 0.0 otherwise
Note that the additional features are due to the BareBonesDecorator, which is included by default
class menten_gcn.decorators.RosettaJumpDecorator(use_nm: bool = False, rottype: str = 'euler')
Measures the translational and rotational relationships between all residue pairs. This uses internal coordinate frames so it is agnostic to the global coordinate system. You can move/rotate your protein around and these will stay the same.
0 Node Features
6-12 Edge Features
- Parameters
use_nm (bool) – If true, measure distance in nanometers. Otherwise (default) use Angstroms.
rottype (str) – How do you want to represent the rotational degrees of freedom? Options are “euler” (default), “euler_sincos”, “matrix”, “quat”, “rotvec”, and “rotvec_sincos”.
calc_edge_features(wrapped_pose, resid1, resid2, dict_cache=None)
This does all of the business logic of calculating the values to be added for each edge.
This function will never be called in the reverse order (with resid1 and resid2 swapped). Instead, we just create both edges at once.
- Parameters
wrapped_pose (WrappedPose) – The pose we are currently generating data for
resid1 (int) – The first residue ID we are currently generating data for
resid2 (int) – The second residue ID we are currently generating data for
dict_cache (dict) – The same cache that was populated in “cache_data”. The user might not have created a cache so don’t assume this is not None. See the RosettaHBondDecorator for an example of how to use this
- Returns
features (list) – The length of this list will be the same value as self.n_edge_features(). These are the values to represent this decorator’s contribution to E for the edge going from resid1 -> resid2.
inv_features (list) – The length of this list will be the same value as self.n_edge_features(). These are the values to represent this decorator’s contribution to E for the edge going from resid2 -> resid1.
describe_edge_features()
Returns descriptions of how each value is computed. Our goal is for these descriptions to be relatively concise but also have enough detail to fully reproduce these calculations.
- Returns
features (list) – The length of this list will be the same value as self.n_edge_features(). These are descriptions of the values to represent this decorator’s contribution to E for any arbitrary resid pair.
class menten_gcn.decorators.RosettaHBondDecorator(sfxn=None, bb_only: bool = False)
Labels each edge with the hydrogen bonds (hbonds) between the two residues, as calculated with the given score function.
0 Node Features
1-5 Edge Features (depending on bb_only)
- Parameters
sfxn (ScoreFunction) – Score function used to calculate hbonds. We will use Rosetta’s default if this is None
bb_only (bool) – Only consider backbone-backbone hbonds. Reduces the number of features from 5 down to 1
class menten_gcn.decorators.Rosetta_Ref2015_OneBodyEneriges(individual: bool = False, score_types=None)
Label each node with its Rosetta one-body energy.
1 - 20-ish Node Features
0 Edge Features
- Parameters
individual (bool) – If true, list the score for each term individually. Otherwise sum them all into one value.
score_types (list of ScoreTypes) – Only use these score types. None (default) includes all default types. Note - this only applies if individual == True
class menten_gcn.decorators.Rosetta_Ref2015_TwoBodyEneriges(individual: bool = False, score_types=None)
Label each edge with its Rosetta two-body energy.
0 Node Features
1 - 20-ish Edge Features
- Parameters
individual (bool) – If true, list the score for each term individually. Otherwise sum them all into one value.
score_types (list of ScoreTypes) – Only use these score types. None (default) includes all default types. Note - this only applies if individual == True
class menten_gcn.decorators.Ref2015Decorator(individual: bool = False, score_types=None)
Meta-decorator that combines Rosetta_Ref2015_OneBodyEneriges and Rosetta_Ref2015_TwoBodyEneriges.
1 - 20-ish Node Features
1 - 20-ish Edge Features
- Parameters
individual (bool) – If true, list the score for each term individually. Otherwise sum them all into one value.
score_types (list of ScoreTypes) – Only use these score types. None (default) includes all default types. Note - this only applies if individual == True
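The interaction between individual and score_types can be sketched with plain Python. This is an illustration only: energy_features is a hypothetical helper, score types are represented as strings rather than Rosetta ScoreType objects, and the numbers in the usage note are made up.

```python
from typing import Dict, List, Optional


def energy_features(term_scores: Dict[str, float],
                    individual: bool = False,
                    score_types: Optional[List[str]] = None) -> List[float]:
    """Return one summed value, or one value per (selected) score term."""
    if not individual:
        # score_types is ignored here: it only applies when individual is True
        return [sum(term_scores.values())]
    selected = list(term_scores) if score_types is None else score_types
    return [term_scores[name] for name in selected]
```

For instance, with made-up per-term values {"fa_dun": 1.5, "rama_prepro": -0.5, "ref": 2.0}, the default call yields a single feature of 3.0, while individual=True yields three features, or fewer if score_types narrows the list.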