This repository has been archived by the owner on Apr 8, 2024. It is now read-only.
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
1 parent 5329472, commit 354d198
Showing 31 changed files with 1,709 additions and 0 deletions.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
--- | ||
id: overview | ||
title: Course Overview | ||
description: Course Overview | ||
sidebar_label: 'Overview' | ||
--- | ||
|
||
# Overview | ||
|
||
| WEEK | TUESDAY | THURSDAY | | ||
| ---: | ---------------------------------------------------------------- | ---------------------------------------- | | ||
| 1 | Introduction/[Setup Environment](./week_01/environment_setup.md) | [Intro to Unix](./week_01/intro_unix.md) | | ||
| 2 | [Reproducible Computing](./week_02/intro.md) | Group Project 1 Introduction Lab | | ||
| 3 | RNA-seq by Example | RNA-seq by Example | | ||
| 4 | RNA-seq by Example | RNA-seq by Example | | ||
| 5 | The Grouchy Grinch | The Grouchy Grinch | | ||
| 6 | RNA-seq Presentations / ChIP-Seq Intro | ChIP-Seq | | ||
| 7 | Nextflow Scripting | Nextflow Scripting | | ||
| 8 | ChIP-seq Pipeline | ChIP-seq Pipeline | | ||
| 8 | ChIP-seq Pipeline | ChIP-seq Pipeline | | ||
| 9 | Project 2 Demo day / Intro to module 3 project | Variant Calling | | ||
| 10 | Intro to Variant Calling | Variant Calling Continued/Xena Browser | | ||
| 11 | Project Work Day | Group Demo Day/Concluding Remarks | | ||
|
||
Issues with Biostars? [Create an issue!](https://github.com/biostars/biostar-handbook/issues/new) | ||
|
||
# Course Alumni | ||
|
||
| Alumni | Semester | GitHub | ag-intro Repo | | ||
| ------------------ | -------- | ------------- | --------------------------------------------------------------------------- | | ||
| Stephanie Yamauchi | 21U | syamauchi2000 | [syamauchi2000/ag-intro](https://github.com/syamauchi2000/ag-intro) | | ||
| Hiba Fatima | 21U | hxf190002 | [hxf190002/ag-intro](https://github.com/hxf190002/ag-intro) | | ||
| Mufeed Kamal | 21U | Mufeedmk4 | [Mufeedmk4/ag-intro](https://github.com/Mufeedmk4/ag-intro) | | ||
| Saleh Karim | 21U | Salehkarim21 | [Salehkarim21/6-1-2021-Repo](https://github.com/Salehkarim21/6-1-2021-Repo) | | ||
| Muneer Yaqub | 21U | muneeryaqub | [muneeryaqub/ag-intro](https://github.com/muneeryaqub/ag-intro) | | ||
| Luke Ballew | 21U | lxb190012 | [lxb190013/ag-intro](https://github.com/lxb190013/ag-intro) | |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
{ | ||
"label": "ChIP-Seq", | ||
"position": 4 | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
--- | ||
id: biostars | ||
title: Biostars ChIP-Seq | ||
description: 'Notes and issues we ran into' | ||
sidebar_label: 'Biostars' | ||
sidebar_position: 1 | ||
--- | ||
|
||
Replace the following installation commands: | ||
|
||
```sh | ||
# Create a namespace for the tool | ||
conda create --name macs python=2.7 | ||
|
||
# Activate the new environment. | ||
source activate macs | ||
|
||
# Install the tools. | ||
conda install numpy | ||
conda install macs2 | ||
``` | ||
|
||
with this single command: | ||
|
||
```sh | ||
conda create -n macs bioconda::macs2=2.2.7.1 | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,111 @@ | ||
--- | ||
id: nextflow | ||
title: Nextflow | ||
description: 'Data-driven computational pipelines' | ||
sidebar_label: 'Nextflow' | ||
sidebar_position: 2 | ||
--- | ||
|
||
## Workflow managers | ||
|
||
The `Makefile` has been getting a little scary. It's great for one-off commands | ||
for a project, but not so much for full-blown data pipelines. There are plenty | ||
of more modern alternatives: | ||
|
||
- [CWL](https://www.commonwl.org/user_guide/index.html) | ||
- [WDL](https://github.com/openwdl/wdl) | ||
- [Snakemake](https://snakemake.readthedocs.io/en/stable/) | ||
- [Nextflow](https://www.nextflow.io/) | ||
|
||
## What is Nextflow? | ||
|
||
Nextflow is an incredibly powerful and flexible workflow language. It's mainly | ||
used for bioinformatics analysis. | ||
|
||
```groovy title="main.nf" | ||
/* | ||
 * Default pipeline parameters. They can be overridden on the command line, e.g. | ||
* given `params.foo` specify on the run command line `--foo some_value`. | ||
*/ | ||
params.reads = "$baseDir/data/ggal/*_{1,2}.fq" | ||
params.transcriptome = "$baseDir/data/ggal/ggal_1_48850000_49020000.Ggal71.500bpflank.fa" | ||
params.outdir = "results" | ||
params.multiqc = "$baseDir/multiqc" | ||
log.info """\ | ||
R N A S E Q - N F P I P E L I N E | ||
=================================== | ||
transcriptome: ${params.transcriptome} | ||
reads : ${params.reads} | ||
outdir : ${params.outdir} | ||
""" | ||
// import modules | ||
include { RNASEQ } from './modules/rnaseq' | ||
include { MULTIQC } from './modules/multiqc' | ||
/* | ||
* main script flow | ||
*/ | ||
workflow { | ||
  read_pairs_ch = channel.fromFilePairs( params.reads, checkIfExists: true ) | ||
  RNASEQ( params.transcriptome, read_pairs_ch ) | ||
  MULTIQC( RNASEQ.out, params.multiqc ) | ||
} | ||
/* | ||
 * completion handler | ||
 */ | ||
workflow.onComplete { | ||
  log.info ( workflow.success ? "\nDone! Open the following report in your browser --> $params.outdir/multiqc_report.html\n" : "Oops .. something went wrong" ) | ||
} | ||
``` | ||
|
||
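What sets Nextflow apart is that it _pushes_ data through the pipeline: a | ||
process runs as soon as its input channels emit data. `make`, by contrast, | ||
_pulls_: it starts from the requested target and works backward through its | ||
dependencies. | ||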
|
||
## Subworkflows | ||
|
||
```groovy title="./modules/rnaseq.nf" | ||
params.outdir = 'results' | ||
include { INDEX } from './index' | ||
include { QUANT } from './quant' | ||
include { FASTQC } from './fastqc' | ||
workflow RNASEQ { | ||
  take: | ||
    transcriptome | ||
    read_pairs_ch | ||
  main: | ||
    INDEX(transcriptome) | ||
    FASTQC(read_pairs_ch) | ||
    QUANT(INDEX.out, read_pairs_ch) | ||
  emit: | ||
    QUANT.out | concat(FASTQC.out) | collect | ||
} | ||
``` | ||
|
||
## Modules | ||
|
||
```groovy title="./modules/index.nf" | ||
process INDEX { | ||
  tag "$transcriptome.simpleName" | ||
  input: | ||
    path transcriptome | ||
  output: | ||
    path 'index' | ||
  script: | ||
    """ | ||
    salmon index --threads $task.cpus -t $transcriptome -i index | ||
    """ | ||
} | ||
``` | ||
|
||
[The full nextflow/rnaseq-nf example repo](https://github.com/nextflow-io/rnaseq-nf) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,144 @@ | ||
--- | ||
id: nf-core | ||
title: nf-core | ||
description: 'A community effort to collect a curated set of analysis pipelines built using Nextflow.' | ||
sidebar_label: 'nf-core' | ||
sidebar_position: 3 | ||
--- | ||
|
||
## nf-core Intro | ||
|
||
<iframe width="560" height="315" src="https://www.youtube.com/embed/gUM9acK25tQ" | ||
title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; | ||
clipboard-write; encrypted-media; gyroscope; picture-in-picture" | ||
allowfullscreen></iframe> | ||
|
||
> A community effort to collect a curated set of analysis pipelines built using | ||
> Nextflow. | ||
We have the genomics core, the imaging core, and other core facilities, and now nf-core! | ||
|
||
Enough talk, let's run it! | ||
|
||
### Testing a pipeline | ||
|
||
[nf-core installation docs](https://nf-co.re/usage/installation) | ||
|
||
1. Move into your chipseq repo | ||
2. Install Nextflow | ||
|
||
```bash | ||
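# Download the Nextflow launcher into the current directory | ||
curl -fsSL get.nextflow.io | bash | ||
# Move it into ~/bin (created if missing; it must be on your PATH) | ||
mkdir -p ~/bin && mv nextflow ~/bin | ||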
``` | ||
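Moving `nextflow` into `~/bin` only helps if `~/bin` is on your `PATH`. The sketch below is one way to check; `path_contains` is a hypothetical helper name, and `~/.bashrc` is an assumption about which startup file your shell reads.

```shell
#!/usr/bin/env bash
# Sketch: check whether a directory is listed in PATH.
# path_contains is a hypothetical helper, not a standard command.
path_contains() {
  local dir="$1"
  # Wrap PATH in colons so the first and last entries match as well.
  case ":${PATH}:" in
    *":${dir}:"*) return 0 ;;
    *) return 1 ;;
  esac
}

if path_contains "$HOME/bin"; then
  echo "$HOME/bin is on PATH"
else
  # Assumption: your shell reads ~/.bashrc at startup.
  echo 'Add this line to ~/.bashrc: export PATH="$HOME/bin:$PATH"'
fi
```

If the check fails, add the suggested line, open a new terminal, and run `nextflow -version` to confirm the launcher is found.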
|
||
3. Load the Singularity module | ||
|
||
```bash | ||
ml load singularity | ||
``` | ||
|
||
4. Run | ||
|
||
```bash | ||
nextflow run nf-core/chipseq -profile test,utd_sysbio -r dev --outdir test-run | ||
``` | ||
|
||
5. Update your `.gitignore` | ||
|
||
```gitignore | ||
.nextflow* | ||
work/ | ||
data/ | ||
results/ | ||
``` | ||
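If you would rather script this step, a minimal sketch is shown below. It appends each pattern only when it is not already present, so it is safe to run more than once. `add_ignore_patterns` is a hypothetical helper name, not part of git or Nextflow.

```shell
#!/usr/bin/env bash
# Sketch: append ignore patterns to a .gitignore without creating duplicates.
# add_ignore_patterns is a hypothetical helper, not a git or Nextflow command.
add_ignore_patterns() {
  local gitignore="$1"
  shift
  touch "$gitignore"
  # If the file is non-empty and does not end in a newline, add one so the
  # first appended pattern starts on its own line.
  if [ -s "$gitignore" ] && [ -n "$(tail -c 1 "$gitignore")" ]; then
    echo >> "$gitignore"
  fi
  local pattern
  for pattern in "$@"; do
    # -x: whole-line match, -F: literal string (so '*' is not a regex), -q: quiet
    if ! grep -qxF -- "$pattern" "$gitignore"; then
      printf '%s\n' "$pattern" >> "$gitignore"
    fi
  done
}

# Run from the root of your chipseq repo.
add_ignore_patterns .gitignore '.nextflow*' 'work/' 'data/' 'results/'
```

The patterns are quoted so the shell passes `.nextflow*` through literally instead of expanding it against files in the current directory.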
|
||
## Running the nf-core pipeline | ||
|
||
[Let's refer to the usage section of the pipeline's docs](https://nf-co.re/chipseq/dev/usage) | ||
|
||
### Using the nf-core launcher | ||
|
||
1. [Open up the nf-core launch utility](https://nf-co.re/launch?) | ||
2. Select the `chipseq` pipeline, choose `dev` for the version, and click Launch | ||
3. Fill out the following command-line flags: | ||
|
||
- profile: `utd_sysbio` | ||
- input: `samplesheet.csv` | ||
- email: `<netid>@utdallas.edu` | ||
- read_length: 50 | ||
- genome: `hg19` | ||
|
||
4. Save the JSON it generates as `nf-params.json`. | ||
|
||
```json title="nf-params.json" | ||
{ | ||
"input": "samplesheet.csv", | ||
"read_length": 50, | ||
"outdir": "ming-results", | ||
"email": "<netid>@utdallas.edu", | ||
"genome": "hg19" | ||
} | ||
``` | ||
|
||
5. We're going to need to create a samplesheet. [Please refer to the usage section of the pipeline's docs](https://nf-co.re/chipseq/dev/usage) | ||
|
||
The data has been predownloaded for you to the class scratch directory | ||
`/scratch/applied-genomics/` under `chipseq/ming/`. | ||
|
||
```csv title="samplesheet.csv" | ||
sample,fastq_1,fastq_2,antibody,control | ||
WT_YAP1,/scratch/applied-genomics/chipseq/ming/SRR1810900.fastq.gz,,YAP1,WT_INPUT | ||
WT_H3K27ac,/scratch/applied-genomics/chipseq/ming/SRR949140.fastq.gz,,H3K27ac,WT_INPUT | ||
WT_INPUT,/scratch/applied-genomics/chipseq/ming/SRR949142.fastq.gz,,, | ||
``` | ||
|
||
:::tip | ||
If you can't get the formatting right, there's a backup samplesheet at `/scratch/applied-genomics/chipseq/ming/samplesheet.csv`. You just need to update the `input` path to point to it. | ||
::: | ||
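Most samplesheet formatting problems are a wrong header or a row with the wrong number of commas. The sketch below checks both before you launch. It assumes the five-column header shown in the sample above and is not the pipeline's own validation; `check_samplesheet` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: sanity-check a ChIP-seq samplesheet before launching the pipeline.
# check_samplesheet is a hypothetical helper; the expected header is an
# assumption taken from the sample samplesheet in this guide.
check_samplesheet() {
  local sheet="$1"
  local expected_header='sample,fastq_1,fastq_2,antibody,control'
  local header
  # Strip a carriage return in case the file was edited on Windows.
  header="$(head -n 1 "$sheet" | tr -d '\r')"
  if [ "$header" != "$expected_header" ]; then
    echo "unexpected header: $header" >&2
    return 1
  fi
  # Each data row needs exactly 5 comma-separated fields, and the sample
  # name and first FASTQ path must not be empty.
  awk -F',' '
    { sub(/\r$/, "") }
    NR == 1 || $0 == "" { next }
    NF != 5 { printf "line %d: expected 5 fields, found %d\n", NR, NF; bad = 1; next }
    $1 == "" || $2 == "" { printf "line %d: sample and fastq_1 must not be empty\n", NR; bad = 1 }
    END { exit (bad ? 1 : 0) }
  ' "$sheet" >&2
}
```

Run it as `check_samplesheet samplesheet.csv && echo "samplesheet looks OK"`. Empty trailing fields still count, which is why the `WT_INPUT` row ends in three commas.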
|
||
6. Start `screen`, a terminal multiplexer that keeps your session running if you disconnect | ||
|
||
```bash | ||
login$ screen | ||
``` | ||
|
||
:::info | ||
Useful screen commands | ||
::: | ||
|
||
```bash | ||
# Start a new screen session: | ||
screen | ||
|
||
# Start a new named screen session: | ||
screen -S session_name | ||
|
||
# Reattach to an open screen: | ||
screen -r session_name | ||
|
||
# Detach from inside a screen (keyboard shortcut, not a typed command): | ||
# Ctrl + A, then D | ||
|
||
# Kill the current screen session (keyboard shortcut): | ||
# Ctrl + A, then K | ||
``` | ||
|
||
7. Launch the pipeline | ||
|
||
```bash | ||
nextflow run nf-core/chipseq -r dev -profile utd_sysbio -params-file nf-params.json | ||
``` | ||
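Because the run takes a while, it can be worth confirming that the input files exist before launching. The following is a minimal sketch, not part of nf-core or Nextflow; `require_files` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: fail early if any required input file is missing or empty.
# require_files is a hypothetical helper, not an nf-core or Nextflow command.
require_files() {
  local status=0
  local file
  for file in "$@"; do
    # -s is true only when the file exists and is not empty.
    if [ ! -s "$file" ]; then
      echo "missing or empty: $file" >&2
      status=1
    fi
  done
  return "$status"
}

# Example usage (commented out so the sketch has no side effects):
# require_files nf-params.json samplesheet.csv &&
#   nextflow run nf-core/chipseq -r dev -profile utd_sysbio -params-file nf-params.json
```

The helper reports every missing file rather than stopping at the first, so you can fix them all in one pass.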
|
||
The pipeline should start up, and email you when it's finished! | ||
|
||
While we're waiting, let's check out the [shell script that would have run all of that](https://www.biostarhandbook.com/ming-tangs-guide-to-chip-seq-analysis.html#shell-script-comes-to-rescue) | ||
|
||
## Download the MultiQC Report | ||
|
||
1. Open up the file explorer, navigate to | ||
   `results/multiqc/multiqc_report.html` (look under `ming-results/` instead if you kept the `outdir` from `nf-params.json`), _right-click_ the HTML | ||
   file, and select Download. | ||
2. Now that the MultiQC report is on your local computer, open it in a web | ||
   browser, preferably next to the [pipeline's output | ||
   docs](https://nf-co.re/chipseq/dev/output). |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,53 @@ | ||
--- | ||
id: code_alternatives | ||
title: VS Code Alternatives | ||
description: '' | ||
sidebar_label: 'VS Code Alternatives' | ||
sidebar_position: 1 | ||
--- | ||
|
||
Due to changes in some of the UT Dallas systems, we're going to cover some extra | ||
ways to log in, just in case. [Refer to Environment setup](../week-1) for | ||
alternatives. | ||
|
||
Windows: | ||
|
||
- [Windows Terminal](https://www.microsoft.com/en-us/p/windows-terminal/9n0dx20hk701?activetab=pivot:overviewtab#) | ||
- [git for Windows](https://gitforwindows.org/) | ||
- [MobaXTerm and VS Code Setup](https://www.youtube.com/watch?v=GmMsTc55gLI) | ||
|
||
macOS: | ||
|
||
- [iTerm2](https://iterm2.com/) | ||
|
||
Once installed, open up a terminal, and try logging in. | ||
|
||
:::danger | ||
When you type your password, no \*'s will appear; the prompt stays blank. This is normal. | ||
::: | ||
|
||
```bash | ||
ssh <netid>@sysbio.utdallas.edu | ||
``` | ||
|
||
## Create SSH Keys | ||
|
||
While we're at it, let's generate SSH keys so we don't have to type in our | ||
password every time, and so we can use them with our git repos as well. Public-key | ||
authentication is generally more secure than password authentication. | ||
|
||
[GitHub Docs for generating a new SSH key](https://docs.github.com/en/github/authenticating-to-github/connecting-to-github-with-ssh/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent) | ||
|
||
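First, add the public key to your GitHub account. Then copy it to the remote machine with the command for your operating system. Note that `scp` overwrites any existing `~/.ssh/authorized_keys` on the remote machine. | ||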
|
||
### Windows | ||
|
||
```bash | ||
scp C:\Users\username\.ssh\id_ed25519.pub <netid>@sysbio.utdallas.edu:~/.ssh/authorized_keys | ||
``` | ||
|
||
### macOS | ||
|
||
```bash | ||
scp ~/.ssh/id_ed25519.pub <netid>@sysbio.utdallas.edu:~/.ssh/authorized_keys | ||
``` |
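If the remote account already has an `authorized_keys` file, copying over it removes the keys that were there. A minimal sketch of the append-instead-of-overwrite idea follows. `append_public_key` is a hypothetical helper name, and it is written against local paths for illustration; on macOS and Linux, the `ssh-copy-id` tool that ships with OpenSSH does the equivalent against a remote host.

```shell
#!/usr/bin/env bash
# Sketch: add a public key to an authorized_keys file without overwriting it.
# append_public_key is a hypothetical helper; it works on local paths only.
append_public_key() {
  local pubkey_file="$1"
  local auth_file="$2"
  local key
  key="$(cat "$pubkey_file")"
  if [ -z "$key" ]; then
    echo "empty public key file: $pubkey_file" >&2
    return 1
  fi
  mkdir -p "$(dirname "$auth_file")"
  touch "$auth_file"
  # sshd ignores authorized_keys files that other users can write to.
  chmod 600 "$auth_file"
  # Append only if this exact key line is not already present.
  if ! grep -qxF -- "$key" "$auth_file"; then
    printf '%s\n' "$key" >> "$auth_file"
  fi
}

# Remote equivalent from your own machine (commented out; needs network access):
# cat ~/.ssh/id_ed25519.pub | ssh <netid>@sysbio.utdallas.edu \
#   'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys'
```

Appending keeps any keys you added earlier from other computers working, which matters once you log in from more than one machine.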