Read PASEF DDA MS2 precursor information #20
Create accessor methods and implement MS2 fun into SpectraData. Create unit tests for methods and check that all works
Ensure that precursorCharge has the proper format, otherwise it would fail as validObject
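A minimal sketch of the format issue mentioned in the commit above, assuming that `precursorCharge` is expected to be of type integer (as for the core spectra variables in `Spectra`); the charge values are made up for illustration:

```r
# Values extracted from a numeric matrix come back as double
charge_raw <- c(2, 3, NA, 1)
is.integer(charge_raw)

# Coerce before storing, so that a type check expecting integer passes
precursor_charge <- as.integer(charge_raw)
stopifnot(is.integer(precursor_charge))
```

Missing charges stay `NA` after the coercion, so no information is lost.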
Thanks Roger for the PR! I have some comments and change requests. - and sorry for the delay...
R/MsBackendTimsTof-functions.R
}
output <- matrix(NA_real_,
                 nrow = nrow(indices),
                 ncol = length(.TIMSTOF_MS2_COLUMNS))
maybe also add the column names with
ncol = length(.TIMSTOF_MS2_COLUMNS),
dimnames = list(NULL, .TIMSTOF_MS2_COLUMNS))
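As a small illustration of this suggestion, the sketch below pre-allocates the matrix with column names and then fills one column by name. The `.TIMSTOF_MS2_COLUMNS` vector here is a stand-in built from the variables listed in the PR description, and the `indices` values are made up:

```r
# Stand-in for the package constant; the real definition may differ
.TIMSTOF_MS2_COLUMNS <- c("precursorMz", "precursorCharge",
                          "precursorIntensity", "collisionEnergy",
                          "isolationWindowLowerMz",
                          "isolationWindowTargetMz",
                          "isolationWindowUpperMz")

# Toy frame/scan index matrix for three spectra
indices <- matrix(c(1L, 1L, 2L, 10L, 11L, 5L), ncol = 2)

output <- matrix(NA_real_,
                 nrow = nrow(indices),
                 ncol = length(.TIMSTOF_MS2_COLUMNS),
                 dimnames = list(NULL, .TIMSTOF_MS2_COLUMNS))

# Columns can now be addressed by name instead of by position
output[, "precursorMz"] <- c(500.25, 500.25, 622.31)
colnames(output)
```

Named columns make later assignments independent of the column order in the constant.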
R/MsBackendTimsTof-functions.R
row <- tbl[[1]][i, ]
prec <- tbl[[2]][row$Precursor,]
target_rows <- which(indices[,1] == row$Frame &
                     MsCoreUtils::between(indices[,2],
The `MsCoreUtils::` should not be needed here, since you are importing `between` from `MsCoreUtils`.
R/MsBackendTimsTof-functions.R
prec$Charge, ## precursorCharge
prec$Intensity, ## precursorIntensity
row$CollisionEnergy, ## collisionEnergy
row$IsolationMz - row$IsolationWidth, ## isolationWindowLowerMz
Hm, I'm wondering how the isolation width is defined - just to be sure that it should not be isolationMz - isolationWidth/2 ... can you please check?
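To make the question concrete, here is a small worked comparison of the two readings. Whether `IsolationWidth` is the full width of the window or a half width is exactly what needs to be checked, so neither variant is presented as the correct one; the numbers are made up:

```r
isolation_mz <- 622.31   # hypothetical IsolationMz
isolation_width <- 2     # hypothetical IsolationWidth

# Reading 1 (as in the diff): the width is applied in full on each side
lower_full <- isolation_mz - isolation_width
upper_full <- isolation_mz + isolation_width

# Reading 2: the width is the full window, so half goes on each side
lower_half <- isolation_mz - isolation_width / 2
upper_half <- isolation_mz + isolation_width / 2

# Resulting window sizes under each reading
c(full = upper_full - lower_full, half = upper_half - lower_half)
```

Under the first reading the window spans twice `IsolationWidth`; under the second it spans exactly `IsolationWidth`.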
Please give me a ping once you have pushed changes.
Refactored all of `.do_calculate_core_ms2_information` because the old implementation was very slow. Now we use fast dplyr table joins to extract all relevant information from the tables. There's still room for future improvement: we could probably save the tables in the backend object when initializing and also filter them when subsetting the backend. That way, the tables could be kept as small as needed, without having to read the long tables from each file if the backend is only storing a couple of spectra.
Added dependencies on dplyr (mandatory for fast table joins) and utils (to automatically fill the `version` slot when creating a new backend object).
Tentative bump to 0.2.0 due to the addition of methods: precursorMz, precursorCharge, etc.
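The join-based approach described in this commit message could look roughly like the sketch below. It is not the PR's actual code: the toy tables, the column names not visible in the diff (`ScanNumBegin`, `ScanNumEnd`, `Id`, `MonoisotopicMz`) and the choice of m/z column are assumptions, and `join_by()` with `between()` requires dplyr 1.1.0 or later:

```r
library(dplyr)

# Toy stand-ins for the MS/MS info and precursor tables (assumed layout)
msms_info <- data.frame(
  Frame = c(2L, 2L),
  ScanNumBegin = c(10L, 40L),
  ScanNumEnd = c(20L, 50L),
  IsolationMz = c(500.2, 622.3),
  CollisionEnergy = c(30, 35),
  Precursor = c(1L, 2L)
)
precursors <- data.frame(
  Id = c(1L, 2L),
  MonoisotopicMz = c(500.25, 622.31),
  Charge = c(2L, 3L),
  Intensity = c(1e4, 5e3)
)

# Requested spectra, identified by frame and scan
indices <- data.frame(frame = c(2L, 2L, 2L), scan = c(12L, 45L, 30L))

ms2 <- indices %>%
  # Match each spectrum to the window whose scan range contains it
  left_join(msms_info,
            by = join_by(frame == Frame,
                         between(scan, ScanNumBegin, ScanNumEnd))) %>%
  # Attach the precursor record referenced by that window
  left_join(precursors, by = join_by(Precursor == Id)) %>%
  transmute(frame, scan,
            precursorMz = MonoisotopicMz,
            precursorCharge = Charge,
            precursorIntensity = Intensity,
            collisionEnergy = CollisionEnergy,
            isolationWindowTargetMz = IsolationMz)
ms2
```

Spectra that fall outside every isolation window (scan 30 here) keep their row and get `NA` in the precursor columns, which preserves the order and length of the request.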
Hi there @jorainer! It's been more than a year since I last updated this PR, but better late than never 😅
So I've refactored it in a faster (<3s compared to 30s for a regular LC-IM-MS benchmark file I use) and more elegant way (only returning the desired subset of MS/MS columns the user needs). The only downside is that it uses
I've run all unit tests again and everything seems to work nicely.
Related to #18, @chufz, feel free to check it out too and tell me what you think,
Roger
Great @RogerGinBer ! looks all good - with the exception of the version for the object ;)
R/MsBackendTimsTof.R
@@ -137,7 +139,8 @@ setClass("MsBackendTimsTof",
"file"))),
fileNames = integer(),
readonly = TRUE,
version = "0.1"))
version = as.character(
honestly, I'm not totally convinced by this. I know, it would allow us to keep track of the version of the package with which a class was created - but it would not reflect the version of the object. The version of the object should just change if its definition changes, i.e. if slots were added or removed. I would suggest to keep the old version, but am open to discuss.
Now I see what you mean: I thought `version` was meant to be related to the package versioning, not the object structure itself. I see this slot is inherited from the virtual class MsBackend, so it makes sense to keep its definition consistent.
So yes, let's keep the old version then 👍
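The distinction discussed in this thread can be sketched with a toy S4 class. `DemoBackend` is a hypothetical class used only for illustration and is not part of the package:

```r
library(methods)

# The object version is a fixed string tied to the class definition;
# it would only be changed by hand when slots are added or removed
setClass("DemoBackend",
         slots = c(version = "character"),
         prototype = list(version = "0.1"))

obj <- new("DemoBackend")
obj@version

# A package version, by contrast, changes with every release,
# even if the class layout stays the same
as.character(utils::packageVersion("methods"))
```

With the fixed string, two objects with the same slot layout always report the same version, regardless of which package release created them.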
Hi Roger, thanks a lot for this implementation, I can try it out with my data and compare the MS2 spectra to data received in python. Is the current version on your main branch ready for testing?
Hi @chufz, I'd say yes: I designed and ran some tests using some PASEF MS/MS data we acquired a while back and it seems to extract the precursor information correctly. That being said, I haven't had the chance to cross-validate it against another source of data, so please go ahead! I'll now revert the changes I did the object
@chufz, you should be able to install the version with the changes using
@RogerGinBer, it seems there are (now?) conflicts with the main branch. Can you please try to solve them? Otherwise I could also give a hand and provide a PR to your branch. Just let me know how you prefer it.
Related to issue #18, I've implemented a function to retrieve PASEF DDA MS2 precursor variables (specifically: `precursorMz`, `precursorCharge`, `precursorIntensity`, `collisionEnergy`, `isolationWindowLowerMz`, `isolationWindowTargetMz`, and `isolationWindowUpperMz`) from the TDF tables using `opentimsr`.
This implementation generates a matrix with all this information for the desired input scans, parallelizing over the different `fileNames` in the backend. I preferred to create just a single, common function (`.do_calculate_core_ms2_information`) that extracts all variables at once (they are closely related and sit close to each other in the TDF tables) rather than many similar functions for each individual variable: this will make it easy to cache them all at once if we decide to implement caching (#19).
I created methods for all the accessors, as well as unit tests, etc. AFAIK, we are passing all tests 👍
Since this update is pretty significant, I'd suggest we bump the package version to 0.2.0.