Could CytoNorm impede discovery of novel clusters? #10
Comments
@vivek-verma202 it doesn't have to, but it's a bit complicated. In our implementation in Spectre we generate clusters using only stable markers to get the major population groups (e.g. Ly6G, CD19, CD4, CD3 etc. in mouse). When thinking about cluster-specific batch effects, these probably arise at the level of fundamental biological groups: most T cells would likely have similar batch effects, whereas the batch effects on T cells might be very different from those on eosinophils. So for alignment we only try to cluster eosinophils, neutrophils, monocytes, T cells, NK cells, and B cells. If the alignment is done using only these large population groups, then when the actual analysis is done afterwards, you can cluster on all markers and still find novel clusters.
Now the specific issue you raised is a good point: if the healthy controls are used as the reference samples, the alignment of markers might get messed up, as some disease samples might have high levels of some markers that aren't present in the healthy controls. One of the requirements of CytoNorm specified in the document is that the reference control needs to span the full range of the data. The implication is that, in an ideal world, you could use one of the 'disease' samples so that all the activation markers etc. will be present. However, in practice this is difficult to do. The way we have been getting around this is, as above, to use just stable markers to create clusters for alignment, and to keep both the raw and aligned data in our dataset. Then in our analysis proper, we cluster on all the markers where we know the distribution is fairly similar between healthy and diseased (CD11b etc.), and then look within each cluster for novel patterns/bifurcations in the raw data for activation/novel markers (CD80/CD86 etc.).
There are essentially two ways of using clustering: one is to cluster on everything and find new clusters 'appearing' in experimental groups; the other is to cluster on stable markers and then ask how each of those stable clusters has changed between experimental groups. The approach described above is the latter. |
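A toy illustration of the second approach (stable clusters first, then a look at a raw activation marker within each cluster), in base R. The numbers, cluster labels, and the `CD86_raw` column name are made up for illustration; this is only a sketch of the idea, not Spectre's or CytoNorm's code.

```r
# Toy data: each row is a cell that already carries a cluster label obtained
# from clustering on stable markers only. All values are invented.
cells <- data.frame(
  cluster  = c("T cells", "T cells", "T cells", "T cells",
               "B cells", "B cells", "B cells", "B cells"),
  group    = c("healthy", "healthy", "disease", "disease",
               "healthy", "healthy", "disease", "disease"),
  CD86_raw = c(0.1, 0.3, 0.2, 0.4, 0.2, 0.4, 2.1, 2.5)
)

# Median raw (non-aligned) activation marker per stable cluster and group.
summary_tbl <- aggregate(CD86_raw ~ cluster + group, data = cells, FUN = median)

# Helper to pull one median out of the summary table.
median_for <- function(cl, gr) {
  summary_tbl$CD86_raw[summary_tbl$cluster == cl & summary_tbl$group == gr]
}

# A large within-cluster shift between groups flags a population worth
# inspecting further, even though the clusters themselves were built
# without the activation marker.
shift_b <- median_for("B cells", "disease") - median_for("B cells", "healthy")
shift_t <- median_for("T cells", "disease") - median_for("T cells", "healthy")
```

In this made-up table the B-cell cluster shifts far more than the T-cell cluster, which is the kind of pattern the comment above suggests looking for.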
@tomashhurst , thank you for your response!
|
Hello, I have been having similar questions, so thanks @tomashhurst for the input. However, as far as I understand, if only some markers (the "stable" ones) are used for the training, then only those markers will be normalised and the rest will not even appear in the normalised fcs files. So in your analyses, do you just append the rest of the markers (the non-normalised ones) to the normalised fcs files? Thanks again for your time. |
Hi Emma and Vivek,
It is possible to define a different set of markers for the clustering than
for the normalization, by adding colsToUse to the FlowSOM.params. That way,
you can use only the "stable" ones for clustering, but still normalise all
markers.
I hope this helps,
Kind regards,
Sofie
On Tue, 14 Jul 2020 at 12:52, Emma wrote:
Also, if you are suggesting using only some markers for the normalisation,
does that mean that you tried using all of them but the normalisation
didn't work as well?
|
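Sofie's suggestion of separate marker sets for clustering and for normalisation might look roughly like the sketch below. The channel names are hypothetical placeholders, the numeric settings are illustrative rather than recommended, and `train_model` is a hypothetical wrapper added only so the snippet can be sourced without CytoNorm installed. The argument names follow the package README as I recall it, so check them (and whether `colsToUse` should hold channel names or indices) against `?CytoNorm.train` before relying on this.

```r
# Hypothetical channel names; substitute the column names of your own fcs files.
all_channels    <- c("CD3", "CD4", "CD19", "Ly6G", "CD80", "CD86")
stable_channels <- c("CD3", "CD4", "CD19", "Ly6G")

# Clustering settings: colsToUse restricts FlowSOM to the stable markers.
# The remaining values are illustrative only.
flowsom_params <- list(nCells = 6000, xdim = 5, ydim = 5, nClus = 10,
                       scale = FALSE, colsToUse = stable_channels)

# Hypothetical wrapper: defining it does not require CytoNorm, calling it does.
train_model <- function(files, labels, transform_list) {
  CytoNorm::CytoNorm.train(
    files            = files,
    labels           = labels,
    channels         = all_channels,    # every marker gets normalised
    transformList    = transform_list,
    FlowSOM.params   = flowsom_params,  # only stable markers drive the clustering
    normMethod.train = CytoNorm::QuantileNorm.train,
    normParams       = list(nQ = 101, goal = "mean"),
    seed             = 1
  )
}
```

With this split, the activation markers (here the placeholder CD80/CD86) are still passed in `channels` and so are normalised, but they do not influence which cluster a cell is assigned to.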
Hi Sofie, Thank you so much for the quick reply. That's very helpful! Should we also use a different transformList for each step as well then (with the "stable" channels for the prepareFlowSOM and all the channels for CytoNorm.train)? Best, |
You can keep the same transformList, the whole flowFrame will be
transformed but only those channels of interest will actually be used in
the FlowSOM computations. It does not matter that some columns get
transformed while not being used in the computation.
|
Thanks! |
Exactly. We work on the arcsinh-transformed data to compute the clusters
and interpolate the normalisation values, and afterwards reverse the
transformation to have "raw" values in the fcs files again (as fcs files are
in general assumed to contain raw data, e.g. by other software). You can of
course also specify another transformation list (e.g. if you are working on
flow cytometry data instead of mass cytometry data) or work with
pre-transformed files and set the transformList to NULL.
On Tue, 14 Jul 2020 at 13:57, Emma wrote:
Thanks!
I know I'm going a bit off topic here, but could you please elaborate on
what the transformList is actually doing as well? It's not completely
clear to me from the documentation: is it an arcsinh transformation before
you do the normalisation, after which you return the values to their
original (non-transformed) range before you export to fcs?
|
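The transform, normalise, reverse round trip described in this exchange can be sketched in base R. The cofactor of 5 is an assumption (a common choice for mass cytometry data, not something stated in this thread), and the normalisation step itself is left out; the point is only that reversing the arcsinh brings values back to the raw scale.

```r
cofactor <- 5                       # assumption: typical mass cytometry cofactor
raw      <- c(0, 1, 10, 100, 1000)  # made-up raw intensities

# Forward transform: clustering and normalisation would operate on this scale.
transformed <- asinh(raw / cofactor)

# ... normalisation of `transformed` would happen here ...

# Reverse transform: back to the raw scale expected in exported fcs files.
restored <- sinh(transformed) * cofactor

# Without a normalisation step in between, the round trip recovers the input.
round_trip_ok <- isTRUE(all.equal(raw, restored))
```

For files that are already transformed, the thread's advice is to skip this step by passing NULL as the transformation list.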
Hi!
If the training set is from healthy controls and the hypothesis is to discover novel clusters (using Diffcyt) that occur only in cases but not in controls, could CytoNorm pre-processing wash out the signal?
Thanks,
Vivek