Created tutorial for MultiGSEA #5567

Open
wants to merge 12 commits into
base: main

Conversation

tStehling

Added a tutorial for MultiGSEA tool.

@tStehling tStehling requested a review from a team as a code owner November 27, 2024 12:40
@tStehling tStehling closed this Nov 27, 2024
@tStehling tStehling reopened this Nov 27, 2024
@anuprulez
Member

I am unsure if the tutorial should be part of the statistics category. Usually, machine learning tutorials become part of this category in GTN.

@bernt-matthias
Contributor

We are happy to move it to any other category, but none seems to fit yet. Maybe we could create a multiomics category?

@shiltemann
Member

@bernt-matthias the transcriptomics topic has a "multi-omics" subsection, could we add it there for now?

@shiltemann
Member

shiltemann commented Dec 4, 2024

@bernt-matthias we can then make a "synthetic topic" by adding a "multi-omics" tag to all tutorials analyzing multi-omics data, and define the topic similarly to the plants topic: https://github.com/galaxyproject/training-material/blob/main/metadata/plants.yaml

The topic will then be shown on the main page, since people interested in this tutorial may not naturally go to the proteomics topic, while the tutorials themselves can live in multiple topics (as they do now). What do you think?

@bernt-matthias
Contributor

This sounds good to me. @tStehling could you move it?

@shiltemann do you have some links for @tStehling on how to assign tags / assign the tutorial to the sub-topic?

Member

@shiltemann shiltemann left a comment


Thanks a lot for your contribution @tStehling! I've left some comments below, but please let me know if anything is unclear, or if you would like some help doing it :)

@shiltemann
Member

shiltemann commented Dec 4, 2024

> @shiltemann do you have some links for @tStehling on how to assign tags / assign the tutorial to the sub-topic?

absolutely, @tStehling :

In the proteomics topic, the subsection id is `multi-omics`, so you can add a tutorial to this section by adding the following to the metadata at the top of your tutorial file:

subtopic: multi-omics

Then simply add a tag of the same name. I will use this to create the "multiomics" synthetic topic later. To add a tag, add the following to the metadata of your tutorial:

tags: 
  - multi-omics

(and feel free to add more tags in this list as you see fit)

Tags are shown under the tutorial name as follows, and can help users identify interesting tutorials:

image
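
As a rough aid for the metadata change described above, the following minimal sketch checks that a tutorial's front matter sets `subtopic: multi-omics` and carries the `multi-omics` tag. It assumes the front matter is delimited by `---` lines and uses a deliberately naive line-based parser instead of a real YAML library; the function name `parse_front_matter` and the example title are hypothetical, not part of the training material tooling.

```python
def parse_front_matter(text):
    """Naively extract `subtopic` and `tags` from a tutorial's YAML front matter.

    Assumption: the front matter starts on the first line and is delimited
    by `---` lines. This is not a full YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None, []
    subtopic, tags, in_tags = None, [], False
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # end of the front matter
            break
        if in_tags and stripped.startswith("- "):
            tags.append(stripped[2:].strip())  # list item under `tags:`
            continue
        in_tags = stripped == "tags:"
        if stripped.startswith("subtopic:"):
            subtopic = stripped.split(":", 1)[1].strip()
    return subtopic, tags


# Hypothetical tutorial file content, for illustration only.
example = """---
title: Hypothetical tutorial title
subtopic: multi-omics
tags:
  - multi-omics
---
Tutorial body
"""

subtopic, tags = parse_front_matter(example)
print(subtopic, tags)
```

A real check would be better served by a proper YAML parser; the sketch only illustrates which two metadata entries the comment above asks for.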

@tStehling
Author

@shiltemann thank you for the effort.

Member

@bgruening bgruening left a comment


Very good @tStehling! Just a few mini comments, otherwise good to go from my side.


tag_based: true

gitter: galaxy-multi-omics:matrix.org
Member


This is different from the one above, and I think neither of them exists ...
I'm also not sure whether we should really create a separate room for it or reuse one of the other rooms.

CONTRIBUTORS.yaml (resolved)
Member

@shiltemann shiltemann left a comment


Thanks @tStehling! I pushed some formatting changes and left a couple of small comments below.

metadata/multi-omics.yaml (outdated, resolved)
metadata/lang/multiomics.yml (outdated, resolved)
> 3. You can also choose the Gene ID format for every data set. In this tutorial we will use the preset "SYMBOL" for transcriptomics and proteomics. For metabolomics we use HMDB.
> 4. Select in **Supported organisms** the organism of which the data is about. In our case we select `Homo sapiens (Human)`.
> 5. **Pathway databases**: Databases often contain their own format in which pathway definitions are provided. So you can select a relevant database. For the tutorial we choose `KEGG`
> 6. **Combine p-values method**: Choose a method (here `Stouffer` for balanced weighting). To more comprehensively measure a pathway response, multiGSEA provides different approaches to compute an aggregated p value over multiple omics layers. Because no single approach for aggregating p values performs best under all circumstances, Loughin proposed basic recommendations on which method to use depending on structure and expectation of the problem. If small p values should be emphasized, Fisher’s method should be chosen. In cases where p values should be treated equally, Stouffer’s method is preferable. If large p values should be emphasized, the user should select Edgington’s method. Figure 2 indicates the difference between those three methods.
Member


I reformatted the hands-on boxes a bit. The hands-on boxes should be very concise, just telling the user how to configure the tool. I have moved all your (very useful!) explanations to a tip box inside the hands-on box, but you could also put them in normal text before or after the hands-on box, as you prefer.

> 8. Click on `Run Tool`
>
{: .hands_on}

Member


Would it be useful to discuss the output of the tool here? Again, both on a technical level (what is the format, what do the contents mean?) and a biological level (what can we learn from the output?).

topics/proteomics/tutorials/multiGSEA-tutorial/tutorial.md (outdated, resolved)
@bernt-matthias
Contributor

Thanks for the comments. I will discuss with @tStehling tomorrow.

topics/statistics/tutorials/multiGSEA-tutorial/tutorial.md (outdated, resolved)

# Preparing the Data

To perform pathway enrichment with MultiGSEA, you'll need omics datasets in TSV format. Each individual data set contains four columns representing the feature (denoted as Symbol), the log2 fold change (logFC), the p-value (pValue), and the adjusted p-value (adj.pValue). We'll use example data provided on Zenodo.
Contributor


Maybe Sebastian can give us a few cross-references to methods that can produce the needed values for transcriptomics, metabolomics, and proteomics.
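
To make the four-column input layout from the tutorial excerpt above concrete, here is a minimal sketch that reads a TSV file and checks its header. The column names come from the excerpt; the column order, the function name `read_omics_table`, and the sample rows (`GENE_A`, `GENE_B`) are assumptions made for illustration and are not taken from the Zenodo data.

```python
import csv
import io

# Column names as given in the tutorial excerpt; the order is an assumption.
EXPECTED_COLUMNS = ["Symbol", "logFC", "pValue", "adj.pValue"]


def read_omics_table(handle):
    """Read a tab-separated omics table and convert the numeric columns."""
    reader = csv.DictReader(handle, delimiter="\t")
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    rows = []
    for row in reader:
        rows.append({
            "Symbol": row["Symbol"],
            "logFC": float(row["logFC"]),
            "pValue": float(row["pValue"]),
            "adj.pValue": float(row["adj.pValue"]),
        })
    return rows


# Made-up sample data, for illustration only.
sample = (
    "Symbol\tlogFC\tpValue\tadj.pValue\n"
    "GENE_A\t1.5\t0.001\t0.01\n"
    "GENE_B\t-0.7\t0.2\t0.4\n"
)

rows = read_omics_table(io.StringIO(sample))
print(len(rows), rows[0]["Symbol"])
```

For a file on disk, the same function can be called with `open(path, newline="")` in place of the `io.StringIO` object.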

>
> > <tip-title>About the parameters</tip-title>
> > - **Pathway databases**: `KEGG`. Databases often contain their own format in which pathway definitions are provided, so you can select a relevant database. For the tutorial we choose `KEGG`.
> > - **Combine p-values method**: Choose a method (here `Stouffer` for balanced weighting). To more comprehensively measure a pathway response, multiGSEA provides different approaches to compute an aggregated p value over multiple omics layers. Because no single approach for aggregating p values performs best under all circumstances, Loughin proposed basic recommendations on which method to use depending on structure and expectation of the problem. If small p values should be emphasized, Fisher’s method should be chosen. In cases where p values should be treated equally, Stouffer’s method is preferable. If large p values should be emphasized, the user should select Edgington’s method. Figure 2 indicates the difference between those three methods.
Contributor


We need a reference for Loughin?
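
For readers following the discussion of the **Combine p-values method** parameter, the three methods named in the tutorial text can be sketched with the Python standard library. This is an illustrative sketch of the textbook formulas, not the code multiGSEA itself runs; the function names and the example p-values are hypothetical, and all input p-values are assumed to lie strictly between 0 and 1.

```python
import math
from statistics import NormalDist


def fisher(pvalues):
    """Fisher's method: X = -2 * sum(ln p) follows chi-square with 2k df."""
    k = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # this is X / 2
    # Closed-form chi-square survival function for an even number of df (2k).
    return math.exp(-half_stat) * sum(
        half_stat ** j / math.factorial(j) for j in range(k)
    )


def stouffer(pvalues):
    """Stouffer's method: sum of z-scores divided by sqrt(k), unweighted."""
    nd = NormalDist()
    z = sum(nd.inv_cdf(1 - p) for p in pvalues) / math.sqrt(len(pvalues))
    return 1 - nd.cdf(z)


def edgington(pvalues):
    """Edgington's method: CDF of a sum of k uniform(0, 1) variables at sum(p)."""
    k = len(pvalues)
    s = sum(pvalues)
    total = sum(
        (-1) ** j * math.comb(k, j) * (s - j) ** k
        for j in range(math.floor(s) + 1)
    )
    return total / math.factorial(k)


# Hypothetical p-values for one pathway across three omics layers.
layer_pvalues = [0.01, 0.2, 0.6]
for name, method in [("Fisher", fisher), ("Stouffer", stouffer), ("Edgington", edgington)]:
    print(name, method(layer_pvalues))
```

Running the three functions on the same input shows how the choice of method changes the aggregated p-value when the layers disagree.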


5 participants