@adder asked for user-defined locations of the modification:
> You also have to be able to specify where the variable modification is positioned. E.g. if you have a sequence with 2 serines, you can have a Phospho group on the first but not on the second serine.
unimod.org doesn't support specific locations. Instead they have a position argument that could be: "Anywhere", "Any N-term", "Any C-term", "N-term", "C-term", "Protein N-term", "Protein C-term".
    library("unimod")
    umodName("Acetyl") # or umodId(1)
    # - General:
    #   Modification version   : 0.0.1
    #   Accession number/id    : 1
    #   PSI-MS/Interim Name    : Acetyl
    #   Description            : Acetylation
    #   Composition            : H(2) C(2) O(1)
    #   Delta Average Mass     : 42.0367
    #   Delta Monoisotopic Mass: 42.010565
    #   Approved               : TRUE
    # - Specificity:
    #     site   position       classification      hidden group
    #   1 K      Anywhere       Multiple            FALSE  1
    #   2 N-term Any N-term     Multiple            FALSE  2
    #   3 C      Anywhere       Post-translational  TRUE   3
    #   4 S      Anywhere       Post-translational  TRUE   4
    #   5 N-term Protein N-term Post-translational  FALSE  5
    #   6 T      Anywhere       Post-translational  TRUE   6
    #   7 Y      Anywhere       Chemical derivative TRUE   7
    #   8 H      Anywhere       Chemical derivative TRUE   8
    # - References: use 'references(object)'
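To make the limitation concrete, here is a minimal base R sketch (it does not use the unimod package) of how a site/position pair from the specificity table could be resolved to candidate residue indices in a peptide. The function name `candidateIndices` is hypothetical, and treating peptide termini and protein termini alike is a simplifying assumption.

```r
# Hypothetical helper, not part of unimod: resolve a unimod-style
# site/position pair to the residue indices it could apply to.
candidateIndices <- function(sequence, site, position = "Anywhere") {
  aa <- strsplit(sequence, "", fixed = TRUE)[[1L]]
  n <- length(aa)
  # Terminal sites ("N-term"/"C-term") refer to the terminus itself.
  if (site == "N-term") return(1L)
  if (site == "C-term") return(n)
  idx <- which(aa == site)
  # Restrict by position; peptide and protein termini are treated alike here.
  if (grepl("N-term", position, fixed = TRUE)) {
    idx <- idx[idx == 1L]
  } else if (grepl("C-term", position, fixed = TRUE)) {
    idx <- idx[idx == n]
  }
  idx
}

# "Anywhere" matches both serines, so it cannot express "first serine only".
candidateIndices("ASKS", site = "S")
```

With "Anywhere", every matching residue is a candidate, which is why an extra user-supplied location is needed to single out one of them.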
I am currently not sure how best to fulfill @adder's feature request. The specificity slot is a data.frame that has a column position of type character. We could add another column, e.g. index as numeric, or the user could supply the position number as character (or we cast it).
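Both options might look roughly like the following base R sketch. The data.frame is only an illustrative stand-in for the specificity slot (two rows taken from the Acetyl output above); the `index` column and the `asIndex` helper are hypothetical names, not existing unimod API.

```r
# Illustrative stand-in for the specificity data.frame; the `index`
# column is the hypothetical addition discussed in the text.
specificity <- data.frame(
  site = c("K", "S"),
  position = c("Anywhere", "Anywhere"),
  stringsAsFactors = FALSE
)
# NA means "no user-defined location": fall back to `position`.
specificity$index <- c(NA_integer_, 3L)

# If the user supplies the location as character, cast and validate it.
asIndex <- function(x) {
  i <- suppressWarnings(as.integer(x))
  # A non-missing input that became NA was not a number.
  if (anyNA(i[!is.na(x)])) stop("index is not numeric")
  i
}

asIndex("3")
```

A separate numeric column keeps the unimod position values untouched, while casting would overload the existing character column with two kinds of content.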
I am planning to add a seq2mass(sequence, modifications) or calculateMass(sequence, modifications) or just mass(sequence, modifications) function that would do the following:
Split the amino acid sequence into single-letter characters.
For convenience we could keep the neutralLoss argument (otherwise the sequence would be really destroyed by many _, * for possible neutral loss positions).
While I think that would be a great interface for a user who only wants to calculate fragments for a single protein sequence, it would be nearly impossible to use for batch processing of mzML + mzID files.
In summary, it does not seem suitable for our purposes.
This issue is a follow-up to lgatto/MSnbase#167.
The planned function would do the following:
1. Split the amino acid sequence into single-letter characters.
2. Look up each residue in the aminoacid data.frame (see How to store chemical elements and aminoacids? #1).
3. Check the specificity slot (site and position column) to see whether the mass has to be modified.
This function should replace the mass calculation in MSnbase::calculateFragments.
Any suggestions regarding the user-defined modification positions or something else?
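As a starting point for discussion, the planned steps (split the sequence, look up residue masses, modify the mass where site and location match) could be sketched in base R as below. Everything here is an assumption for illustration: the small `aminoacids` table stands in for the aminoacid data.frame from #1, the `modifications` argument is assumed to be a data.frame with columns `site`, `index` and `delta`, and the result is the monoisotopic mass of the neutral peptide.

```r
# Monoisotopic residue masses (Da) for a few amino acids; a stand-in for
# the aminoacid data.frame discussed in #1.
aminoacids <- data.frame(
  letter = c("G", "A", "S", "K"),
  mass = c(57.02146, 71.03711, 87.03203, 128.09496),
  stringsAsFactors = FALSE
)
water <- 18.010565  # monoisotopic mass of H2O

# Hypothetical signature; `modifications` needs columns site, index, delta.
calculateMass <- function(sequence, modifications = NULL) {
  aa <- strsplit(sequence, "", fixed = TRUE)[[1L]]       # step 1: split
  mass <- aminoacids$mass[match(aa, aminoacids$letter)]  # step 2: look up
  if (anyNA(mass)) stop("unknown amino acid in sequence")
  if (!is.null(modifications)) {                         # step 3: modify
    for (j in seq_len(nrow(modifications))) {
      i <- modifications$index[j]
      if (is.na(i) || i < 1L || i > length(aa) ||
          aa[i] != modifications$site[j]) {
        stop("modification ", j, " does not match the sequence")
      }
      mass[i] <- mass[i] + modifications$delta[j]
    }
  }
  sum(mass) + water
}

# Phospho (monoisotopic delta 79.966331) on the first serine only.
phospho <- data.frame(site = "S", index = 2L, delta = 79.966331,
                      stringsAsFactors = FALSE)
# The difference is the Phospho delta (up to floating point error).
calculateMass("ASKS", phospho) - calculateMass("ASKS")
```

Checking the residue at `index` against `site` catches a location that points at the wrong amino acid, which would otherwise silently produce a wrong mass.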