Update README.md
jkobject authored Jan 9, 2024
1 parent 4a4a22e commit e6cfddb
Showing 1 changed file with 11 additions and 0 deletions.
11 changes: 11 additions & 0 deletions README.md
@@ -23,6 +23,17 @@ It allows you to:
3. create a more complex single cell dataset
4. extend it to your need

## About

The idea is to use it to train models like scGPT / Geneformer (and soon, scPrint ;)). It works by:

1. loading from lamin
2. doing some dataset specific preprocessing if needed
3. creating a dataset object on top of `.mapped()` (which is needed for mapping genes, cell labels, etc.)
4. passing it to a dataloader object that can work with it correctly
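
The four steps above can be sketched in plain Python. This is only an illustration of the flow, not this package's API: `load_cells`, `preprocess`, `CellDataset` and `iter_batches` are hypothetical names, the in-memory list stands in for data loaded from lamin, and the label encoding stands in for what `.mapped()` provides.

```python
# Minimal sketch of the load -> preprocess -> dataset -> dataloader flow.
# All names here are hypothetical and only illustrate the four steps.

def load_cells():
    """Step 1 (stand-in): in practice the cells would be loaded from lamin."""
    return [
        {"x": [0.0, 3.0, 1.0], "cell_type": "T cell"},
        {"x": [2.0, 0.0, 0.0], "cell_type": "B cell"},
        {"x": [1.0, 1.0, 4.0], "cell_type": "T cell"},
    ]


def preprocess(cells):
    """Step 2 (illustrative): scale each cell so its counts sum to 1."""
    out = []
    for cell in cells:
        total = sum(cell["x"]) or 1.0  # avoid dividing by zero for empty cells
        out.append({**cell, "x": [v / total for v in cell["x"]]})
    return out


class CellDataset:
    """Step 3: a map-style dataset that encodes cell labels as integers."""

    def __init__(self, cells):
        self.cells = list(cells)
        labels = sorted({c["cell_type"] for c in self.cells})
        self.label_to_id = {label: i for i, label in enumerate(labels)}

    def __len__(self):
        return len(self.cells)

    def __getitem__(self, index):
        cell = self.cells[index]
        return cell["x"], self.label_to_id[cell["cell_type"]]


def iter_batches(dataset, batch_size):
    """Step 4: a tiny dataloader that yields (expressions, labels) batches."""
    for start in range(0, len(dataset), batch_size):
        stop = min(start + batch_size, len(dataset))
        xs, ys = zip(*(dataset[i] for i in range(start, stop)))
        yield list(xs), list(ys)


dataset = CellDataset(preprocess(load_cells()))
batches = list(iter_batches(dataset, batch_size=2))
```

Because `CellDataset` only exposes `__len__` and `__getitem__`, the same object could in principle be handed to any loader that accepts map-style datasets.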

Currently, one has to use the preprocess function to make the dataset fit different tools like scGPT / Geneformer. The goal is to enable this through different Collators instead. This is still missing and a WIP (please do contribute!)
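
To make the Collator idea concrete, here is a rough sketch of how two collate functions could turn the same batch into tool-specific inputs. Nothing here exists in the package yet: `rank_collate` and `binned_collate` are hypothetical names, and both encodings are deliberately simplified (a rank-based ordering in the spirit of Geneformer, and equal-width binning as a stand-in for the value binning scGPT uses).

```python
# Hypothetical collators: each takes a batch of (expression, label) pairs and
# returns a dict shaped for one family of models. Simplified for illustration.

def rank_collate(batch, max_len=2):
    """Rank-style encoding: gene indices ordered by decreasing expression."""
    gene_ids = []
    for x, _ in batch:
        expressed = [g for g, v in enumerate(x) if v > 0]  # drop unexpressed genes
        ranked = sorted(expressed, key=lambda g: -x[g])    # highest expression first
        gene_ids.append(ranked[:max_len])                  # truncate to max_len genes
    return {"gene_ids": gene_ids, "labels": [y for _, y in batch]}


def binned_collate(batch, n_bins=2):
    """Bin-style encoding: 0 for unexpressed genes, 1..n_bins otherwise."""
    binned = []
    for x, _ in batch:
        top = max(x) if x else 0.0
        row = []
        for v in x:
            if v <= 0 or top <= 0:
                row.append(0)
            else:
                # Equal-width bins relative to the cell's maximum (an assumption).
                row.append(1 + min(n_bins - 1, int(v * n_bins / top)))
        binned.append(row)
    return {"values": binned, "labels": [y for _, y in batch]}


batch = [([0.0, 3.0, 1.0], 1), ([2.0, 0.0, 0.0], 0)]
ranked = rank_collate(batch)
binned = binned_collate(batch)
```

With this design, switching between tools would mean swapping the collate function passed to the dataloader rather than re-running preprocessing.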

## Install it from PyPI

```bash