Created tutorial for MultiGSEA #5567

Open

tStehling wants to merge 12 commits into galaxyproject:main from tStehling:main

+189 −3

tStehling commented Nov 27, 2024

Added a tutorial for MultiGSEA tool.


          Created tutorial for multiGSEA

20b6888

tStehling requested a review from a team as a code owner

November 27, 2024 12:40

github-actions bot added the statistics label

tStehling closed this

tStehling reopened this

Member

anuprulez commented Nov 28, 2024

I am unsure if the tutorial should be part of the statistics category. Usually, machine learning tutorials become part of this category in GTN.

Contributor

bernt-matthias commented Nov 28, 2024

We are fine with moving it to any other category, but none seems to fit yet. We may create a multiomics category?

Member

shiltemann commented Dec 4, 2024

@bernt-matthias the transcriptomics topic has a "multi-omics" subsection, could we add it there for now?

Member

shiltemann commented Dec 4, 2024 (edited)

@bernt-matthias we can then make a "synthetic topic" by adding a "multi-omics" tag to all tutorials analyzing multi-omics data, and define the topic similarly to the plants topic: https://github.com/galaxyproject/training-material/blob/main/metadata/plants.yaml

Then the topic will be shown on the main page, since people interested in this tutorial may not naturally go to the proteomics topic, while the tutorials themselves can live in multiple topics (as they do now). What do you think?

Contributor

bernt-matthias commented Dec 4, 2024

This sounds good to me. @tStehling could you move it?

@shiltemann do you have some links for @tStehling on how to assign tags / assign the tutorial to the sub-topic?

shiltemann reviewed

Member

shiltemann left a comment

Thanks a lot for your contribution @tStehling! I've left some comments below, but please let me know if anything is unclear, or if you would like some help doing it :)

topics/statistics/tutorials/multiGSEA-tutorial/tutorial.md (6 review threads, outdated and resolved)

Member

shiltemann commented Dec 4, 2024 (edited)

> @shiltemann do you have some links for @tStehling on how to assign tags / assign the tutorial to the sub-topic?

absolutely, @tStehling :

In the proteomics topic, the subsection id is `multi-omics`, so you can add a tutorial to this section by adding the following to the metadata at the top of your tutorial file:

subtopic: multi-omics

Then simply add a tag of the same name. I will use this to create the "multiomics" synthetic topic later. To add a tag, add the following to the metadata of your tutorial:

tags: 
  - multi-omics

(and feel free to add more tags in this list as you see fit)

Tags are shown under the tutorial name and can help users identify interesting tutorials.
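The two metadata entries described above can be checked with a small script before pushing. This is only a sketch: it assumes the tutorial metadata is a block delimited by `---` lines at the top of the file, and the helper names `read_front_matter` and `in_multi_omics` are hypothetical, not part of the GTN tooling. The reader handles only the tiny YAML subset used in this comment (`key: value` pairs and `- item` lists), so a real YAML parser would be the better choice for anything more complex.

```python
def read_front_matter(text):
    """Read `key: value` pairs and `- item` lists from a leading `---` block.

    Deliberately minimal: this is not a general YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta, current_list = {}, None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # end of the metadata block
            break
        if not stripped:
            continue
        if stripped.startswith("- ") and current_list is not None:
            # List item belonging to the most recent `key:` line.
            current_list.append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            key, value = key.strip(), value.strip()
            if value:
                meta[key] = value
                current_list = None
            else:
                # A bare `key:` starts a list, as with `tags:`.
                current_list = meta.setdefault(key, [])
    return meta


def in_multi_omics(meta, name="multi-omics"):
    """True when both the subtopic and a tag of the same name are set."""
    tags = meta.get("tags", [])
    return meta.get("subtopic") == name and isinstance(tags, list) and name in tags


# Hypothetical metadata block; the title is a placeholder.
example = """\
---
title: Hypothetical tutorial title
subtopic: multi-omics
tags:
  - multi-omics
---
Tutorial text starts here.
"""

print(in_multi_omics(read_front_matter(example)))
```

Both conditions are checked together because the subtopic places the tutorial in the proteomics subsection, while the tag is what the planned synthetic topic would be built from.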

Author

tStehling commented Dec 7, 2024

@shiltemann thank you for the effort.


          Switched topic to proteomics->multiomics, smaller changes according to comments

3b17448

tStehling requested review from a team, bebatut and hexylena as code owners

December 7, 2024 13:55

github-actions bot added template-and-tools proteomics labels

bgruening and others added 3 commits

December 7, 2024 23:14


          Merge branch 'main' into main


          Updated multiGSEA tutorial

fe20d8f


          Merge branch 'main' of github.com:tStehling/training-material-fork

465f922

bgruening reviewed

Member

bgruening left a comment

Very good @tStehling! Just a few mini comments, otherwise good to go from my side.

metadata/multi-omics.yaml Outdated


		tag_based: true

		gitter: galaxy-multi-omics:matrix.org

Member

bgruening Jan 14, 2025

This is different from the one above, and I think both do not exist ...
I'm also not sure if we should really create a separate room for it or if we should reuse one of the other rooms

CONTRIBUTORS.yaml (resolved)

shiltemann added 3 commits

January 14, 2025 16:17


          remove obsolete files

3376cec


          fix broken boxes and tweak formatting of zenodo links

9554a79


          update formatting to GTN best practices

3eec113

shiltemann reviewed

Member

shiltemann left a comment

Thanks @tStehling! I pushed some formatting changes and left a couple small comments below

metadata/multi-omics.yaml (outdated, resolved)

metadata/lang/multiomics.yml (outdated, resolved)

topics/proteomics/tutorials/multiGSEA-tutorial/tutorial.md Outdated

+              > 3. You can also choose the Gene ID format for every data set. In this tutorial we will use the preset "SYMBOL" for transcriptomics and proteomics. For metabolomics we use HMDB.
+              > 4. Select in **Supported organisms** the organism of which the data is about. In our case we select `Homo sapiens (Human)`.
+              > 5. **Pathway databases**: Databases often contain their own format in which pathway definitions are provided. So you can select a relevant database. For the tutorial we choose `KEGG`
+              > 6. **Combine p-values method**: Choose a method (here `Stouffer` for balanced weighting). To more comprehensively measure a pathway response, multiGSEA provides different approaches to compute an aggregated p value over multiple omics layers. Because no single approach for aggregating p values performs best under all circumstances, Loughin proposed basic recommendations on which method to use depending on structure and expectation of the problem. If small p values should be emphasized, Fisher’s method should be chosen. In cases where p values should be treated equally, Stouffer’s method is preferable. If large p values should be emphasized, the user should select Edgington’s method. Figure 2 indicates the difference between those three methods.

Member

shiltemann Jan 14, 2025

I reformatted the hands-on boxes a bit. The hands-on boxes should be very concise, just telling the user how to configure the tool. I have moved all your (very useful!) explanations to a tip box inside the hands-on box, but you could also just put them in normal text before or after the hands-on box, as you prefer
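For readers following the thread, the three combination methods named in the hunk above (Fisher, Stouffer, Edgington) can be sketched in a few lines. This is an illustration using only the Python standard library under textbook definitions. It is not the multiGSEA implementation, which may differ in details such as weighting. Function names are hypothetical, and all inputs are assumed to lie strictly between 0 and 1.

```python
import math
from statistics import NormalDist


def fisher(pvalues):
    """Fisher: -2 * sum(ln p) is chi-square with 2k degrees of freedom."""
    k = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # statistic divided by 2
    # Closed-form chi-square survival function for even degrees of freedom.
    series = sum(half_stat**i / math.factorial(i) for i in range(k))
    return math.exp(-half_stat) * series


def stouffer(pvalues):
    """Stouffer (unweighted): sum of z-scores divided by sqrt(k)."""
    norm = NormalDist()
    z = sum(norm.inv_cdf(1.0 - p) for p in pvalues) / math.sqrt(len(pvalues))
    return 1.0 - norm.cdf(z)


def edgington(pvalues):
    """Edgington: the sum of p-values is Irwin-Hall distributed under the null."""
    k = len(pvalues)
    s = sum(pvalues)
    total = sum(
        (-1) ** j * math.comb(k, j) * (s - j) ** k
        for j in range(math.floor(s) + 1)
    )
    return total / math.factorial(k)


# Made-up p-values for two omics layers: one very small, one large.
layers = [0.001, 0.8]
for method in (fisher, stouffer, edgington):
    print(method.__name__, method(layers))
```

With one small and one large p-value, Fisher yields the smallest combined value, Edgington the largest, and Stouffer falls in between, which matches the emphasis described in the tutorial text.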

topics/proteomics/tutorials/multiGSEA-tutorial/tutorial.md (resolved)

topics/proteomics/tutorials/multiGSEA-tutorial/tutorial.md

+              > 8. Click on `Run Tool`
+              >
+              {: .hands_on}

Member

shiltemann Jan 14, 2025

Would it be useful to discuss the output of the tool here? Again, both on a technical level (what is the format, what do the contents mean?) and on a biological level (what can we learn from the output?)

topics/proteomics/tutorials/multiGSEA-tutorial/tutorial.md (outdated, resolved)


          Merge branch 'main' into main

4acf5bb

shiltemann reviewed

topics/proteomics/tutorials/multiGSEA-tutorial/tutorial.md (outdated, resolved)

Contributor

bernt-matthias commented Jan 14, 2025

Thanks for the comments. I will discuss with @tStehling tomorrow.

shiltemann and others added 3 commits

January 14, 2025 17:20


          fix linting error


          Merge branch 'galaxyproject:main' into main

a9507e9


          updated authorship and title

ab9b9d4

bernt-matthias reviewed

topics/statistics/tutorials/multiGSEA-tutorial/tutorial.md (outdated, resolved)

topics/proteomics/tutorials/multiGSEA-tutorial/tutorial.md


		# Preparing the Data

		To perform pathway enrichment with MultiGSEA, you'll need omics datasets in TSV format. Each dataset contains four columns: the feature (denoted as Symbol), the log2 fold change (logFC), the p-value (pValue), and the adjusted p-value (adj.pValue). We'll use example data provided on Zenodo.

Contributor

bernt-matthias Jan 15, 2025

Maybe Sebastian can give us a few xrefs to methods that produce the needed values for transcriptomics, metabolomics, and proteomics.
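The four-column layout described in the "Preparing the Data" paragraph can be checked before upload with a short script. This is a sketch only: the header spellings `Symbol`, `logFC`, `pValue`, and `adj.pValue` are taken from that paragraph, the function name `check_omics_table` is hypothetical, and the sample row is made up for illustration.

```python
import csv
import io

# Column names as given in the tutorial paragraph.
REQUIRED_COLUMNS = ["Symbol", "logFC", "pValue", "adj.pValue"]


def check_omics_table(handle):
    """Parse a tab-separated table and verify the four expected columns."""
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    missing = [name for name in REQUIRED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    for row in reader:
        rows.append({
            "Symbol": row["Symbol"],
            # float() raises ValueError on non-numeric cells.
            "logFC": float(row["logFC"]),
            "pValue": float(row["pValue"]),
            "adj.pValue": float(row["adj.pValue"]),
        })
    return rows


# Made-up single-row table; GENE_A is a placeholder feature name.
example = io.StringIO(
    "Symbol\tlogFC\tpValue\tadj.pValue\n"
    "GENE_A\t1.25\t0.001\t0.01\n"
)
rows = check_omics_table(example)
print(rows)
```

To check a real file, the same function can be passed a handle from `open(path, newline="")`.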

topics/proteomics/tutorials/multiGSEA-tutorial/tutorial.md

+              >
+              >    > <tip-title>About the parameters</tip-title>
+              >    > - **Pathway databases**: `KEGG`. Databases often contain their own format in which pathway definitions are provided, so you can select a relevant database. For the tutorial we choose `KEGG`.
+              >    > - **Combine p-values method**: Choose a method (here `Stouffer` for balanced weighting). To more comprehensively measure a pathway response, multiGSEA provides different approaches to compute an aggregated p value over multiple omics layers. Because no single approach for aggregating p values performs best under all circumstances, Loughin proposed basic recommendations on which method to use depending on structure and expectation of the problem. If small p values should be emphasized, Fisher’s method should be chosen. In cases where p values should be treated equally, Stouffer’s method is preferable. If large p values should be emphasized, the user should select Edgington’s method. Figure 2 indicates the difference between those three methods.

Contributor

bernt-matthias Jan 15, 2025

We need a reference for Loughin?

Reviewers

bernt-matthias left review comments

shiltemann left review comments

bgruening left review comments

bebatut: awaiting requested review

hexylena: awaiting requested review

At least 1 approving review is required to merge this pull request.

Labels

proteomics statistics template-and-tools