---
output: rmarkdown::github_document
---
<!-- README.md is generated from README.Rmd. Please edit this file -->
```{r, include=FALSE, echo=FALSE, message=FALSE, warning=FALSE}
knitr::opts_chunk$set(fig.path="./man/figures/", message=FALSE, collapse=FALSE, comment="", warning=FALSE)
# Load CaDrA and CaDrA.shiny
library(devtools)
devtools::load_all("~/Github/CaDrA", recompile=TRUE)
devtools::load_all("~/Github/CaDrA.shiny", recompile=TRUE)
# Load others R packages
library(GSVA)
library(hypeR)
```
<p style="font-size: 40pt; font-weight: bold; padding-bottom: 0px; margin-bottom: 0;">CaDrA.shiny</p>
<hr style="padding-top: 0; margin-top: 0;">
![build](https://github.com/montilab/cadra/workflows/rcmdcheck/badge.svg)
![Docker pulls](https://img.shields.io/docker/pulls/montilab/cadra.shiny)
![Docker image size](https://img.shields.io/docker/image-size/montilab/cadra.shiny) ![Gitter](https://img.shields.io/gitter/room/montilab/cadra.shiny)
![GitHub issues](https://img.shields.io/github/issues/montilab/cadra.shiny)
![GitHub last commit](https://img.shields.io/github/last-commit/montilab/cadra.shiny)
An R Shiny Dashboard for Interacting with the **[CaDrA](https://montilab.github.io/CaDrA/)** Package
### **CaDrA Package: [https://montilab.github.io/CaDrA/](https://montilab.github.io/CaDrA/)**
### **Web Portal: [https://cadra.bu.edu/](https://cadra.bu.edu/)**
## Overview
**CaDrA.shiny** is an interactive R Shiny dashboard that lets users interact directly with the **CaDrA** package. **CaDrA** is an R package that supports a heuristic search framework aimed at identifying candidate drivers of a molecular phenotype of interest (visit [our GitHub repo](https://github.com/montilab/CaDrA) for more details).
CaDrA's Shiny dashboard has two main features:
1. Run **CaDrA** search to identify candidate drivers of a molecular phenotype of interest.
2. Run **GSVA** to estimate aggregate enrichment scores by projecting a (gene) expression dataset onto a given gene set or signature, usually representing a molecular phenotype. Afterward, one can apply **CaDrA** search to look for genetic drivers based on a given binary multi-omics dataset and its derived enrichment score of a signature of interest.
Data visualizations include:
- Meta-feature plot.
- Kolmogorov-Smirnov (KS) enrichment plot.
- Top N candidates overlapping heatmap.
- Permutation plot.
**CaDrA.shiny** is currently containerized using **Docker** and can be deployed on any compatible cloud-based service.
#### **Docker image: [montilab/cadra.shiny](https://hub.docker.com/r/montilab/cadra.shiny)**
#### Useful Guides
- <a href="articles/docker.html" target="_blank">Launch CaDrA's Shiny Dashboard with Docker</a>
- <a href="articles/docker-compose.html" target="_blank">Launch CaDrA's Shiny Dashboard with Compose</a>
### (1) Installation
```r
# Install BiocManager
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
# Install CaDrA
BiocManager::install("CaDrA")
# Install GSVA
BiocManager::install("GSVA")
# Install devtools (needed to install packages from GitHub)
if (!requireNamespace("devtools", quietly = TRUE))
  install.packages("devtools")
# Install hypeR
devtools::install_github("montilab/hypeR")
# Install CaDrA.shiny
devtools::install_github("montilab/CaDrA.shiny")
```
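
Before loading the packages, it can help to confirm that each one is actually installed. The following is a minimal sketch using only base R; the helper name `check_installed` is hypothetical and is not part of CaDrA.shiny.

```r
# Hypothetical helper (not part of CaDrA.shiny):
# returns a named logical vector, TRUE where the package is installed
check_installed <- function(pkgs) {
  vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
}

pkgs <- c("CaDrA", "GSVA", "hypeR", "CaDrA.shiny")
status <- check_installed(pkgs)
print(status)

# Names of any packages that still need to be installed
missing_pkgs <- names(status)[!status]
```

If `missing_pkgs` is not empty, rerun the corresponding installation command above for each listed package.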
### (2) Load packages
```r
library(CaDrA.shiny)
library(CaDrA)
library(GSVA)
library(hypeR)
```
### (3) Run CaDrA with a dataset downloaded from the CaDrA Portal
Here, we show how to run CaDrA on a dataset downloaded from the CaDrA Portal, using input scores derived by applying GSVA to the downloaded gene expression dataset and the [Hallmarks](https://www.gsea-msigdb.org/gsea/msigdb/human/genesets.jsp?collection=H) gene sets. We will run a CaDrA search to look for genetic drivers of the _"Epithelial Mesenchymal Transition"_ (EMT)-derived score.
#### (i) Retrieve a list of descriptors of pre-processed feature sets available on the portal
```{r}
## Get a list of descriptors of feature sets available on CaDrA Portal
fs_list <- CaDrA.shiny::get_feature_set(order_by="asc")
```
```{r}
## Show the description of the first few feature sets
knitr::kable(head(fs_list), row.names = FALSE)
```
#### (ii) Retrieve datasets from the portal
```{r}
## Retrieve the dataset (both genetic feature set and gene expression)
datasets <- CaDrA.shiny::pull_datasets(
  feature_set = "TCGA_HNSC_2016_01_28_GISTIC_MUT_SIG",
  include_gene_expression = TRUE
)
datasets
```
#### (iii) Run GSVA on the downloaded expression dataset
```{r}
## Download MSigDB's Hallmark gene sets
hallmarks <- hypeR::msigdb_gsets("Homo sapiens", "H", clean=TRUE)$genesets # returns 50 gene sets
## Compute the GSVA scores of the 50 hallmark gene sets
## Note: this call uses the classic gsva() interface; newer GSVA releases
## expect a parameter object (e.g. gsvaParam()), so adjust it if your version differs
input_score_matrix <- GSVA::gsva(
  expr = SummarizedExperiment::assay(datasets$gene_expression),
  gset.idx.list = hallmarks,
  method = "gsva",
  mx.diff = TRUE,
  verbose = FALSE
)
```
```{r}
## Show a few entries of the returned hallmark-by-sample matrix
knitr::kable(input_score_matrix[1:5, 1:5])
```
#### (iv) Run candidate search with input scores obtained in (iii)
```{r}
## Samples to keep based on the overlap between the two inputs
sample_overlap <- intersect(colnames(input_score_matrix), colnames(datasets$feature_set))
input_score <- input_score_matrix["Epithelial Mesenchymal Transition", sample_overlap]
FS <- datasets$feature_set[, sample_overlap, drop = FALSE]
## Pre-filter FS based on occurrence frequency
FS_filtered <- CaDrA::prefilter_data(
  FS = FS,
  max_cutoff = 0.6, # max event frequency (60%)
  min_cutoff = round(5/ncol(FS), 2) # min event frequency, roughly 5 samples (rounded to 2 decimals)
)
## Run candidate search
topn_result <- CaDrA::candidate_search(
  FS = FS_filtered,
  input_score = input_score,
  method = "ks_pval", # Use Kolmogorov-Smirnov scoring function
  method_alternative = "less", # Use one-sided hypothesis testing
  weights = NULL, # If weights are provided, perform a weighted-KS (gsea-like) test
  search_method = "both", # Apply both forward and backward search
  top_N = 1, # Perform only one search (starting from top scoring feature)
  max_size = 7, # Maximum number of features to include in the returned meta-feature
  do_plot = FALSE, # Plot after finding the best features
  best_score_only = FALSE # Return all results from the search
)
```
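
To make the sample-alignment and pre-filtering steps above concrete, here is a minimal base R sketch on a toy binary matrix. It only illustrates the idea and is not CaDrA's implementation: the objects `toy_fs` and `toy_score` are made up, a plain matrix stands in for the real feature set object, and whether `CaDrA::prefilter_data()` treats its cutoffs as inclusive is an assumption here.

```r
# Toy binary feature matrix: 4 features (rows) x 10 samples (columns)
toy_fs <- rbind(
  rare     = c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0),
  moderate = c(1, 1, 1, 0, 0, 0, 0, 0, 0, 0),
  frequent = c(1, 1, 1, 1, 1, 1, 1, 1, 0, 0),
  half     = c(1, 0, 1, 0, 1, 0, 1, 0, 1, 0)
)
colnames(toy_fs) <- paste0("sample", 1:10)

# Toy scores, available for only the first 8 samples
toy_score <- setNames(seq(0.9, 0.2, length.out = 8), paste0("sample", 1:8))

# Keep only samples present in both inputs, in the same order
shared <- intersect(names(toy_score), colnames(toy_fs))
toy_fs <- toy_fs[, shared, drop = FALSE]
toy_score <- toy_score[shared]

# Event frequency of each feature across the shared samples
freq <- rowMeans(toy_fs)

# Keep features whose frequency lies between the two cutoffs
# (inclusive bounds are an assumption of this sketch)
keep <- freq >= 0.2 & freq <= 0.6
toy_fs_filtered <- toy_fs[keep, , drop = FALSE]
rownames(toy_fs_filtered)
```

Features that occur in almost no samples or in almost all of them carry little information about the score ranking, which is why both a minimum and a maximum cutoff are applied before the search.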
#### (v) Visualize best results
```{r}
## Fetch the meta-feature yielding the best score over N searches
## .. (in this example, only N=1 search was performed)
topn_best_meta <- CaDrA::topn_best(topn_result)
## Visualize the best results with the meta-feature plot
CaDrA::meta_plot(topn_best_list = topn_best_meta, input_score_label = NULL)
```
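
The meta-feature shown in the plot combines several features into one. As a rough illustration, the sketch below builds a meta-feature as the logical OR of two toy features and scores each with a two-sample KS test from base R. This is a simplified stand-in under stated assumptions, not CaDrA's scoring code: the data are invented, the helper `ks_p` is hypothetical, and the two-sided `stats::ks.test()` is used instead of the package's own one-sided scoring.

```r
# Toy data: 12 samples ordered from highest to lowest score
score <- setNames(12:1, paste0("s", 1:12))

# Two binary features (1 = event present), each hitting a few top-scoring samples
f1 <- c(1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
f2 <- c(0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0)

# Meta-feature: a sample is a hit if any member feature has an event (logical OR)
meta <- as.integer(f1 | f2)

# Hypothetical helper: KS p-value comparing scores of hits against non-hits
ks_p <- function(feature, score) {
  stats::ks.test(score[feature == 1], score[feature == 0])$p.value
}

p_f1 <- ks_p(f1, score)
p_f2 <- ks_p(f2, score)
p_meta <- ks_p(meta, score)
c(f1 = p_f1, f2 = p_f2, meta = p_meta)
```

In this toy example the combined meta-feature covers the four top-scoring samples and gets a smaller p-value than either feature alone, which is the kind of improvement a stepwise search looks for when it adds features.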
#### (vi) Compute permutation-based p-value
```{r}
## Permutation seed (for reproducible results)
set.seed(123)
## Run CaDrA Search 100 times on permuted input scores to estimate the null distribution
perm_res <- CaDrA::CaDrA(
  FS = FS_filtered,
  input_score = input_score,
  method = "ks_pval",
  method_alternative = "less",
  top_N = 1,
  max_size = 7,
  search_method = "both",
  n_perm = 100,
  perm_alternative = "one.sided",
  ncores = 2,
  cache = FALSE
)
## Visualize permutation results
CaDrA::permutation_plot(perm_res = perm_res)
```
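
The idea behind a permutation-based p-value can be sketched in a few lines of base R. The example below is a simplified illustration, not what `CaDrA::CaDrA()` does internally: CaDrA reruns the whole search on each permuted input score, whereas this toy keeps a single feature fixed, uses the KS D statistic as the score, and applies the common "+1" correction, all of which are assumptions of the sketch. The helper `ks_stat` is hypothetical.

```r
set.seed(123)

# Toy data: 30 samples sorted by score, with a feature hitting the top 8
n <- 30
score <- sort(rnorm(n), decreasing = TRUE)
feature <- c(rep(1, 8), rep(0, n - 8))

# Hypothetical helper: KS D statistic between hits and non-hits
ks_stat <- function(feature, score) {
  unname(stats::ks.test(score[feature == 1], score[feature == 0])$statistic)
}

observed <- ks_stat(feature, score)

# Null distribution: recompute the statistic after shuffling the scores
n_perm <- 100
null_stats <- replicate(n_perm, ks_stat(feature, sample(score)))

# Empirical p-value: share of permutations at least as extreme as observed,
# with 1 added to numerator and denominator so the estimate is never zero
perm_pval <- (sum(null_stats >= observed) + 1) / (n_perm + 1)
perm_pval
```

With `n_perm = 100`, the smallest p-value this estimator can report is 1/101, so more permutations are needed to resolve smaller p-values.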
### (4) Launch CaDrA's Shiny dashboard with your pre-processed dataset
Here, we show how to launch a local instance of the CaDrA Portal, which will be populated with a user-selected set of datasets. In the example, a single dataset (ACC) will be uploaded to the portal.
#### (i) Pull pre-processed feature sets using our REST API
```{r}
# Download feature sets from CaDrA portal and return a datalist with appropriate paths to dataset
mydatafile <- CaDrA.shiny::download_feature_sets(
  #feature_set = fs_list$feature_set_name, # this would download all TCGA datasets
  feature_set = "TCGA_ACC_2016_01_28_GISTIC_MUT_SIG",
  include_input_score = TRUE,
  include_gene_expression = TRUE,
  out_dir = file.path(Sys.getenv("HOME"), "Github") # specify your folder of choice here
)
```
```{r}
# Look at the top 6 rows
knitr::kable(head(mydatafile), row.names = FALSE)
```
#### (ii) Launch Shiny dashboard with downloaded dataset
```r
## Launch CaDrA's Shiny app with the datalist downloaded in (i)
app <- CaDrA.shiny::CaDrA_App(id="myapp", datalist=mydatafile)
## Serve the app on port 3838; host '0.0.0.0' listens on all network interfaces
## (use '127.0.0.1' to restrict access to this machine)
shiny::runApp(app, host='0.0.0.0', port=3838)
```
# A Glimpse of CaDrA's Shiny Dashboard
![](man/figures/cadra-shiny-landing-page.png)
There are five tabs on CaDrA's Shiny dashboard. Explore each tab and see what they do.
![](man/figures/tabs.png)
- <a href="https://montilab.github.io/CaDrA.shiny/articles/run-cadra-tab.html" target="_blank">Run CaDrA</a>
- <a href="https://montilab.github.io/CaDrA.shiny/articles/run-gsva-tab.html" target="_blank">Run GSVA</a>
- Download
- Publication
- Contact Us
# Getting Help
To get help with **CaDrA**, visit our [GitHub CaDrA discussions](https://github.com/montilab/CaDrA/discussions) or [GitHub CaDrA issues](https://github.com/montilab/CaDrA/issues).
To get help with **CaDrA.shiny**, visit our [GitHub CaDrA.shiny discussions](https://github.com/montilab/CaDrA.shiny/discussions) or [GitHub CaDrA.shiny issues](https://github.com/montilab/CaDrA.shiny/issues).