Merge pull request #210 from jeromekelleher/docs-more-docs
More docs infrastructure
jeromekelleher authored May 15, 2024
2 parents 4461ba1 + 81fb76b commit a1ddde0
Showing 5 changed files with 34 additions and 23 deletions.
4 changes: 4 additions & 0 deletions .github/workflows/docs.yml
@@ -31,6 +31,10 @@ jobs:
pip install -r docs/requirements.txt
python3 -m bash_kernel.install
+ - name: Install bcftools
+ run: |
+ sudo apt-get install bcftools
- name: Install package
run: |
python3 -m pip install .
4 changes: 4 additions & 0 deletions docs/_config.yml
@@ -25,3 +25,7 @@ html:
sphinx:
extra_extensions:
- sphinx_click.ext
+ config:
+ # This is needed to make sure that text is output in single block from
+ # bash cells.
+ nb_merge_streams: true
4 changes: 2 additions & 2 deletions docs/build.sh
@@ -11,10 +11,10 @@ RETVAL=$?
if [ $RETVAL -ne 0 ]; then
if [ -e $REPORTDIR ]; then
echo "Error occurred; showing saved reports"
- cat $REPORTDIR/*
+ cat $REPORTDIR/*/*
fi
else
# Clear out any old reports
- rm -f $REPORTDIR/*
+ rm -fR $REPORTDIR/*
fi
exit $RETVAL
4 changes: 2 additions & 2 deletions docs/vcf2zarr/tutorial.md
@@ -22,7 +22,7 @@ need for small, intermediate and large datasets.

<div id="vcf2zarr_convert"></div>
<script>
- AsciinemaPlayer.create('_static/vcf2zarr_convert.cast',
+ AsciinemaPlayer.create('../_static/vcf2zarr_convert.cast',
document.getElementById('vcf2zarr_convert'), {
cols:80,
rows:12
@@ -33,7 +33,7 @@ need for small, intermediate and large datasets.

<div id="vcf2zarr_explode"></div>
<script>
- AsciinemaPlayer.create('_static/vcf2zarr_explode.cast',
+ AsciinemaPlayer.create('../_static/vcf2zarr_explode.cast',
document.getElementById('vcf2zarr_explode'), {
cols:80,
rows:12
41 changes: 22 additions & 19 deletions docs/vcfpartition/overview.md
@@ -1,30 +1,33 @@
+ ---
+ jupytext:
+ formats: md:myst
+ text_representation:
+ extension: .md
+ format_name: myst
+ kernelspec:
+ display_name: Bash
+ language: bash
+ name: bash
+ ---
(sec-vcfpartition)=
# vcfpartition
+ ```{code-cell}
+ :tags: [remove-cell]
+ cp ../../tests/data/vcf/CEUTrio.20.21.gatk3.4.g.bcf* ./
+ ```

## Overview

Partition a given VCF file into (approximately) a given number of regions:

- ```
- vcf_partition 20201028_CCDG_14151_B01_GRM_WGS_2020-08-05_chr20.recalibrated_variants.vcf.gz -n 10
- ```
- gives
- ```
- chr20:1-6799360
- chr20:6799361-14319616
- chr20:14319617-21790720
- chr20:21790721-28770304
- chr20:28770305-31096832
- chr20:31096833-38043648
- chr20:38043649-45580288
- chr20:45580289-52117504
- chr20:52117505-58834944
- chr20:58834945-

+ ```{code-cell}
+ vcfpartition CEUTrio.20.21.gatk3.4.g.bcf -n 3
+ ```

These region strings can then be used to split computation of the VCF
into chunks for parallelisation.

- **TODO give a nice example here using xargs**
+ ```{code-cell}
+ vcfpartition CEUTrio.20.21.gatk3.4.g.bcf -n 3 \
+ | xargs -P 3 -I {} sh -c "bcftools view -Hr {} CEUTrio.20.21.gatk3.4.g.bcf | wc -l"
+ ```

**WARNING: this does not take into account that indels may overlap region boundaries**
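
One way to see whether this caveat matters for a particular file might be to compare the sum of the per-region record counts with the record count for the whole file. The following is a minimal sketch and not part of this commit: `sum_counts` is a hypothetical helper, and the commented pipeline assumes `bcftools` is installed and the BCF file is indexed, as in the example above.

```shell
# Sum integers read one per line from standard input.
# sum_counts is a hypothetical helper; it is not part of vcfpartition or bcftools.
sum_counts() {
    # "total + 0" forces numeric output, so empty input prints 0.
    awk '{ total += $1 } END { print total + 0 }'
}

# Illustrative usage (assumes bcftools is available and the file is indexed):
#
#   vcfpartition CEUTrio.20.21.gatk3.4.g.bcf -n 3 \
#       | xargs -I {} sh -c "bcftools view -Hr {} CEUTrio.20.21.gatk3.4.g.bcf | wc -l" \
#       | sum_counts
#   bcftools view -H CEUTrio.20.21.gatk3.4.g.bcf | wc -l
#
# If the first number is larger than the second, some records were
# reported for more than one region.
```

A mismatch between the two numbers would suggest that some records span a region boundary and should be handled once rather than per region.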
