usecase data was moved
rcannood committed Sep 7, 2024
1 parent 6ac6acb commit 7e787b3
Showing 9 changed files with 17 additions and 15 deletions.
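The move this commit records can be sketched as follows; a minimal illustration in a scratch directory, assuming the old and new layouts shown in the diff (the placeholder file name is hypothetical):

```python
import shutil
import tempfile
from pathlib import Path

# Recreate the old layout in a scratch directory (placeholder file, not real data).
root = Path(tempfile.mkdtemp())
old = root / "usecase_data"
old.mkdir()
(old / "pseudobulk.h5ad").write_text("placeholder")

# Move usecase_data/ to usecase/data/, as this commit does. In the actual
# repository, `git mv usecase_data usecase/data` would additionally record
# the rename in version control.
new = root / "usecase" / "data"
new.parent.mkdir(parents=True)
shutil.move(str(old), str(new))

print(sorted(p.relative_to(root).as_posix() for p in root.rglob("*.h5ad")))
# → ['usecase/data/pseudobulk.h5ad']
```

The remaining eight files in this commit are the path updates that keep ignore rules, docs, and scripts pointing at the new location.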
1 change: 1 addition & 0 deletions .dockerignore
@@ -10,3 +10,4 @@
 *.html
 *.DS_Store
 /usecase_data/
+/usecase/data/
1 change: 1 addition & 0 deletions .gitignore
@@ -6,6 +6,7 @@
 /site_libs/
 *.html
 *.DS_Store
+/usecase/data/
 /usecase_data/
 
 # Created by https://www.toptal.com/developers/gitignore/api/python,r
6 changes: 3 additions & 3 deletions README.md
@@ -118,11 +118,11 @@ pixi run pipeline
 
 ### Docker
 
-To run the pipeline with Docker, use the following command. The image is ~5GB and the pipeline can require a lot of working memory ~20GB, so make sure to increase the RAM allocated to Docker in your settings. Note that the usecase_data/ and scripts/ folders are mounted to the Docker container, so you can edit the scripts and access the data.
+To run the pipeline with Docker, use the following command. The image is ~5GB and the pipeline can require a lot of working memory ~20GB, so make sure to increase the RAM allocated to Docker in your settings. Note that the usecase/data/ and scripts/ folders are mounted to the Docker container, so you can edit the scripts and access the data.
 
 ```bash
 docker pull berombau/polygloty-docker:latest
-docker run -it -v $(pwd)/usecase_data:/app/usecase_data -v $(pwd)/scripts:/app/scripts berombau/polygloty-docker:latest pixi run pipeline
+docker run -it -v $(pwd)/usecase/data:/app/usecase/data -v $(pwd)/scripts:/app/scripts berombau/polygloty-docker:latest pixi run pipeline
 ```
 
 ### Extra: building the Docker image yourself
@@ -131,7 +131,7 @@ To edit and build the Docker image yourself, you can use the following command:
 
 ```bash
 docker build -t polygloty-docker .
-docker run -it -v $(pwd)/usecase_data:/app/usecase_data -v $(pwd)/scripts:/app/scripts polygloty-docker pixi run pipeline
+docker run -it -v $(pwd)/usecase/data:/app/usecase/data -v $(pwd)/scripts:/app/scripts polygloty-docker pixi run pipeline
 ```
 
 To publish it to Docker Hub, use the following command:
4 changes: 2 additions & 2 deletions book/in_memory2.qmd
@@ -16,7 +16,7 @@ Read in the anndata object
 ```{r read_in}
 library(anndata)
-adata_path <- "notebooks/usecase_data/sc_counts_reannotated_with_counts.h5ad"
+adata_path <- "notebooks/usecase/data/sc_counts_reannotated_with_counts.h5ad"
 adata <- anndata::read_h5ad(adata_path)
 ```

@@ -89,5 +89,5 @@ pb_adata = ad.AnnData(
 Store to disk:
 
 ```{python store_pseudobulk}
-pb_adata.write_h5ad("usecase_data/pseudobulk.h5ad")
+pb_adata.write_h5ad("usecase/data/pseudobulk.h5ad")
 ```
4 changes: 2 additions & 2 deletions book/in_memory_interoperability.qmd
@@ -203,7 +203,7 @@ with (robjects.default_converter + pandas2ri.converter).context():
 ```{r read_in}
 library(anndata)
-adata_path <- "notebooks/usecase_data/sc_counts_reannotated_with_counts.h5ad"
+adata_path <- "notebooks/usecase/data/sc_counts_reannotated_with_counts.h5ad"
 adata <- anndata::read_h5ad(adata_path)
 ```

@@ -270,6 +270,6 @@ pb_adata <- anndata::AnnData(
 Store to disk:
 
 ```{r store_pseudobulk}
-write_h5ad(pb_adata, "notebooks/usecase_data/pseudobulk.h5ad")
+write_h5ad(pb_adata, "notebooks/usecase/data/pseudobulk.h5ad")
 ```
4 changes: 2 additions & 2 deletions book/on_disk_interoperability.qmd
@@ -122,9 +122,9 @@ With the Pixi task runner, you can define these tasks in their respective enviro
 pixi run pipeline
 ```
 
-You can create a Docker image with all the `pixi` environments and run the pipeline in one containerized environment. The image is ~5GB and the pipeline can require a lot of working memory ~20GB, so make sure to increase the RAM allocated to Docker in your settings. Note that the `usecase_data/` and `scripts/` folders are mounted to the Docker container, so you can interactively edit the scripts and access the data.
+You can create a Docker image with all the `pixi` environments and run the pipeline in one containerized environment. The image is ~5GB and the pipeline can require a lot of working memory ~20GB, so make sure to increase the RAM allocated to Docker in your settings. Note that the `usecase/data/` and `scripts/` folders are mounted to the Docker container, so you can interactively edit the scripts and access the data.
 
 ```bash
 docker pull berombau/polygloty-docker:latest
-docker run -it -v $(pwd)/usecase_data:/app/usecase_data -v $(pwd)/scripts:/app/scripts berombau/polygloty-docker:latest pixi run pipeline
+docker run -it -v $(pwd)/usecase/data:/app/usecase/data -v $(pwd)/scripts:/app/scripts berombau/polygloty-docker:latest pixi run pipeline
 ```
4 changes: 2 additions & 2 deletions scripts/1_load_data.sh
@@ -1,6 +1,6 @@
-if [[ ! -f usecase_data/sc_counts_reannotated_with_counts.h5ad ]]; then
+if [[ ! -f usecase/data/sc_counts_reannotated_with_counts.h5ad ]]; then
 aws s3 cp \
 --no-sign-request \
 s3://openproblems-bio/public/neurips-2023-competition/sc_counts_reannotated_with_counts.h5ad \
-usecase_data/sc_counts_reannotated_with_counts.h5ad
+usecase/data/sc_counts_reannotated_with_counts.h5ad
 fi
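The guard in `scripts/1_load_data.sh` downloads the file only when it is missing. A minimal Python sketch of the same check (the `ensure_usecase_data` helper and its stubbed `download` callable are hypothetical, not part of the repository):

```python
from pathlib import Path

def ensure_usecase_data(path, download):
    """Fetch `path` via `download` only if it does not exist yet,
    mirroring the `[[ ! -f ... ]]` guard in the shell script."""
    target = Path(path)
    if target.exists():
        return False                 # already present, nothing to do
    target.parent.mkdir(parents=True, exist_ok=True)
    download(target)                 # the real script does an unsigned S3 copy here
    return True

# Demo with a stub instead of the real multi-GB S3 download:
import tempfile
demo = Path(tempfile.mkdtemp()) / "usecase" / "data" / "sc_counts_reannotated_with_counts.h5ad"
first = ensure_usecase_data(demo, lambda t: t.write_text("stub"))
second = ensure_usecase_data(demo, lambda t: t.write_text("stub"))
print(first, second)  # → True False: downloaded once, then skipped
```

The same idempotency is why the pipeline can be rerun without re-downloading the data.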
4 changes: 2 additions & 2 deletions scripts/2_compute_pseudobulk.py
@@ -2,7 +2,7 @@
 import anndata as ad
 
 print("Load data")
-adata = ad.read_h5ad("usecase_data/sc_counts_reannotated_with_counts.h5ad")
+adata = ad.read_h5ad("usecase/data/sc_counts_reannotated_with_counts.h5ad")
 
 sm_name = "Belinostat"
 control_name = "Dimethyl Sulfoxide"

@@ -44,4 +44,4 @@
 )
 
 print("Store to disk")
-pb_adata.write_h5ad("usecase_data/pseudobulk.h5ad")
+pb_adata.write_h5ad("usecase/data/pseudobulk.h5ad")
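`scripts/2_compute_pseudobulk.py` aggregates single-cell counts into pseudobulk profiles before writing `pseudobulk.h5ad`. A toy sketch of the idea (the counts and sample labels below are made up for illustration; the script's actual grouping columns are not shown in this hunk):

```python
import pandas as pd

# Toy single-cell count matrix: rows are cells, columns are genes.
counts = pd.DataFrame(
    [[1, 0], [2, 3], [4, 1], [0, 5]],
    columns=["geneA", "geneB"],
)
# Which sample each cell belongs to.
sample = pd.Series(["s1", "s1", "s2", "s2"])

# Pseudobulk: sum the counts of all cells per sample.
pseudobulk = counts.groupby(sample).sum()
print(pseudobulk)
```

Summing per sample turns thousands of noisy per-cell profiles into one bulk-like profile per sample, which is what the downstream R script consumes.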
4 changes: 2 additions & 2 deletions scripts/3_analysis_de.R
@@ -3,7 +3,7 @@ library(anndata)
 library(dplyr, warn.conflicts = FALSE)
 
 print("Reading data...")
-pb_adata <- read_h5ad("usecase_data/pseudobulk.h5ad")
+pb_adata <- read_h5ad("usecase/data/pseudobulk.h5ad")
 
 # Select small molecule and control:
 sm_name <- "Belinostat"

@@ -34,4 +34,4 @@ res |>
 head(10)
 
 # Write to disk:
-write.csv(res, "usecase_data/de_contrasts.csv")
+write.csv(res, "usecase/data/de_contrasts.csv")
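`scripts/3_analysis_de.R` contrasts the Belinostat pseudobulk samples against the Dimethyl Sulfoxide controls; the model it fits is outside this hunk. As a stand-in for the idea of a per-gene contrast, a two-sample t-test sketch in Python (the numbers and the choice of test are illustrative only, not the repository's method):

```python
import numpy as np
from scipy import stats

# Toy pseudobulk expression: rows are samples, columns are genes.
belinostat = np.array([[5.1, 2.0], [4.9, 2.1], [5.2, 1.9]])
dmso = np.array([[3.0, 2.0], [3.1, 2.1], [2.9, 1.9]])

# Per-gene two-sample t-test between treatment and control.
t, p = stats.ttest_ind(belinostat, dmso, axis=0)
print(p)  # gene 0 shifts between conditions, gene 1 does not
```

The result table, one row per gene with a test statistic and p-value, is what ends up in `de_contrasts.csv`.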
