Commit

Update README.md
makirc authored Mar 30, 2020
1 parent 6590e4d commit 1bc4f49
Showing 1 changed file with 4 additions and 4 deletions.
8 changes: 4 additions & 4 deletions README.md
@@ -19,7 +19,7 @@ git clone https://github.com/kircherlab/hemoMIPs
cd hemoMIPs
```

-We will now initiate three Conda environments which we will need for some preparations as well as getting the Snakemake workflow invoked. The first environment (`hemoMIPs`) will contain only snakemake, the second (`ensemblVEP`) contains Ensembl VEP and htslib, the third (`prepTools`) contains some basic tools for preparing annotations (e.g. bedtools, samtools, htslib, bwa, picard):
+We will now initiate three Conda environments, which we will need for some preparations as well as getting the Snakemake workflow invoked. The first environment (`hemoMIPs`) will contain only snakemake, the second (`ensemblVEP`) contains Ensembl VEP and htslib, the third (`prepTools`) contains some basic tools for preparing annotations (e.g. bedtools, samtools, htslib, bwa, picard):

```bash
conda env create -n hemoMIPs --file environment.yaml
@@ -128,7 +128,7 @@ Examples and further information about these files is provided below.

Information about the designed MIP probes and their location in the reference genome is needed as a tab-separated text file for the script `TrimMIParms.py`. The default input file has the following columns: index, score, chr, ext_probe_start, ext_probe_stop, ext_probe_copy, ext_probe_sequence, lig_probe_start, lig_probe_stop, lig_probe_copy, lig_probe_sequence, mip_scan_start_position, mip_scan_stop_position, scan_target_sequence, mip_sequence, feature_start_position, feature_stop_position, probe_strand, failure_flags, gene_name, mip_name. This format is obtained from MIP designs generated by MIPGEN (Boyle et al., 2014), a tool for MIP probe design available on GitHub (https://github.com/shendurelab/MIPGEN). Alternatively, files containing at least the following named columns can be used: chr, ext_probe_start, ext_probe_stop, lig_probe_start, lig_probe_stop, probe_strand, and mip_name. It is critical, that the reported coordinates and chromosome names match the reference genome used in alignment.
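The minimal column set named above can be checked before the workflow is started. The following is a sketch only and not part of the hemoMIPs pipeline: the function name `missing_design_columns` is hypothetical, and it assumes the first line of the design file is a tab-separated header of column names.

```python
import csv
import io

# Minimal named columns listed in the README paragraph above.
REQUIRED_COLUMNS = (
    "chr", "ext_probe_start", "ext_probe_stop",
    "lig_probe_start", "lig_probe_stop", "probe_strand", "mip_name",
)


def missing_design_columns(handle):
    """Return the required column names that are absent from the header line.

    Assumption: `handle` is an open text file whose first line is a
    tab-separated header.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader, [])  # empty file -> every column is missing
    present = {name.strip() for name in header}
    return [name for name in REQUIRED_COLUMNS if name not in present]


# Illustrative in-memory header (made up, not real design data).
example = io.StringIO(
    "chr\text_probe_start\text_probe_stop\tlig_probe_start\tlig_probe_stop\tmip_name\n"
)
print(missing_design_columns(example))
```

An empty list means all seven required columns are present. The check covers column names only; it does not verify that coordinates and chromosome names match the reference genome.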

-We used Y-chromosome specific targets (SRY) to detect the sex of the patient (see chromosome `Y` in `hemomips_design.txt`). Different Y chromosome targets can be designed for sex determination as the workflow simply counts Y-aligned reads. The pipeline also runs without Y-specific MIPs for sex determination, but in this case will output all samples to be female in the final report.
+We used Y-chromosome specific targets (SRY) to detect the sex of the samples (see chromosome `Y` in `hemomips_design.txt`). Different Y chromosome targets can be designed for sex determination as the workflow simply counts Y-aligned reads. The pipeline also runs without Y-specific MIPs for sex determination, but in this case will output all samples to be female in the final report.
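The read-counting idea described above can be illustrated with a short sketch. This is not the workflow's actual code: the function name `infer_sex`, the `min_y_reads` threshold, and the accepted chromosome labels (`Y`, `chrY`) are assumptions made for illustration only.

```python
from collections import Counter


def infer_sex(read_chromosomes, min_y_reads=1):
    """Call 'male' if enough reads align to the Y chromosome, else 'female'.

    `read_chromosomes` is an iterable with one chromosome name per aligned read.
    `min_y_reads` is a hypothetical threshold, not a value taken from hemoMIPs.
    """
    counts = Counter(read_chromosomes)
    y_reads = counts["Y"] + counts["chrY"]  # accept both naming styles
    return "male" if y_reads >= min_y_reads else "female"


print(infer_sex(["1", "X", "Y", "Y"]))  # sample with Y-aligned reads
print(infer_sex(["1", "X", "X"]))       # sample without Y-aligned reads
```

The second call shows why a design without Y-specific MIPs reports every sample as female: with no Y targets, no reads can align to Y, so the count never reaches the threshold.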

### Named target regions in BED format

@@ -233,7 +233,7 @@ Genomic Variant Call Format (GVCF) files for each sample are available in `gatk4
`realign_all_samples.all_sites.vcf.gz` is the combined VCF generated by GATK4 CombineGVCFs. \
The final genotyped VCF is called `realign_all_samples.vcf.gz`. \
MIP performance statistics can be found in `realign_all_samples.MIPstats.tsv`. \
-Variant Effect Pridictions are stored in `realign_all_samples.vep.tsv.gz`.
+Variant Effect Predictions are stored in `realign_all_samples.vep.tsv.gz`.

#### Genotyping using GATK3
`/output/dataset/mapping/gatk3`
@@ -244,7 +244,7 @@ A VCF containing genotypes for all sites generated by GATK3 UnifiedGenotyper: `r
The final VCF with non-homozygote reference alleles: `realign_all_samples.vcf.gz`. \
A filtered list of InDels: `realign_all_samples.indel_check.txt`. \
MIP performance statistics: `realign_all_samples.MIPstats.tsv`. \
-Variant Effect Pridictions: `realign_all_samples.vep.tsv.gz`.
+Variant Effect Predictions: `realign_all_samples.vep.tsv.gz`.

## Report
`/output/dataset/report`