MultiAssayExperiment.TCGA is the pipeline package for building and uploading MultiAssayExperiment datasets from the GDAC Firehose pipeline, as obtained from RTCGAToolbox.
There are several steps to rebuild datasets for 33 cancer types.
Generally, users should use the packaged product of the pipeline:
Those who want to rebuild the pipeline should follow these steps:
- Create all required data directories (`dataDirectories`)
- Obtain all clinical and assay data from RTCGAToolbox (`saveRTCGAdata`)
- Introduce additional clinical variables to all clinical datasets
- Download and integrate subtype curation data from Dropbox
- Generate and serialize data maps, providing relationships between samples and patients
- Update metadata and upload to ExperimentHub (`buildMultiAssayExperiments`)
These functions can be found in the `data-raw`, `inst/scripts`, and `R` folders.
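To illustrate the data-map step listed above, here is a minimal sketch in base R. It assumes the standard TCGA barcode layout, in which the first 12 characters identify the patient, and it uses the three column names that a MultiAssayExperiment sample map expects (`assay`, `primary`, `colname`). The barcodes and the assay name `example_assay` are illustrative values, and this is not necessarily how the package itself builds its maps.

```r
# Illustrative sample barcodes (assumed values, not taken from the pipeline)
barcodes <- c(
  "TCGA-02-0001-01C-01D-0182-01",
  "TCGA-02-0001-10A-01D-0182-01",
  "TCGA-02-0003-01A-01D-0182-01"
)

# The patient identifier is the first 12 characters of a TCGA barcode
patients <- substr(barcodes, 1, 12)

# One row per sample: which assay it belongs to, which patient it comes
# from, and the column name under which it appears in the assay data
sample_map <- data.frame(
  assay = "example_assay",   # hypothetical assay name
  primary = patients,
  colname = barcodes,
  stringsAsFactors = FALSE
)

print(sample_map)
```

Two of the three samples map to the same patient, which is the kind of many-to-one relationship the data maps record.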
NOTE. Include AWS CLI authentication credentials in the `~/.Renviron` file.
It should include three key=value pairs, one each for `AWS_SESSION_TOKEN`, `AWS_SECRET_ACCESS_KEY`, and `AWS_ACCESS_KEY_ID`.
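As a quick sanity check before running the upload step, the sketch below reports which of the three variables are not visible to the current R session. The helper `missing_aws_vars` is hypothetical and not part of the package; it relies only on base R's `Sys.getenv`. The comments show the general `NAME=value` layout of an `.Renviron` file with placeholder values.

```r
# Expected layout of ~/.Renviron (placeholders, not real credentials):
#   AWS_ACCESS_KEY_ID=<your-access-key-id>
#   AWS_SECRET_ACCESS_KEY=<your-secret-access-key>
#   AWS_SESSION_TOKEN=<your-session-token>

required_aws_vars <- c(
  "AWS_SESSION_TOKEN",
  "AWS_SECRET_ACCESS_KEY",
  "AWS_ACCESS_KEY_ID"
)

# Hypothetical helper: return the names of variables that are unset or empty
missing_aws_vars <- function(vars = required_aws_vars) {
  values <- Sys.getenv(vars, unset = "")
  vars[!nzchar(values)]
}

missing <- missing_aws_vars()
if (length(missing) > 0) {
  message("Not set in this session: ", paste(missing, collapse = ", "))
} else {
  message("All AWS credential variables are set.")
}
```

R reads `~/.Renviron` at startup, so restart the session after editing the file for changes to take effect.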