scRNAseq-analysis-notes

my scRNAseq analysis notes

The reason

Single-cell RNA-seq is becoming more and more popular, and as a technique it might become as common as PCR. I just got some 10x Genomics single-cell RNA-seq data to play with, so it is a good time for me to take notes here. I hope they are useful for other people as well.

readings before doing anything

single cell tutorials

single cell RNA-seq normalization

single cell impute

single cell batch effect

Single cell RNA-seq

Considerable differences are found between the methods in terms of the number and characteristics of the genes that are called differentially expressed. Pre-filtering of lowly expressed genes can have important effects on the results, particularly for some of the methods originally developed for analysis of bulk RNA-seq data. Generally, however, methods developed for bulk RNA-seq analysis do not perform notably worse than those developed specifically for scRNA-seq.
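A minimal Python sketch of what pre-filtering lowly expressed genes can look like before differential expression testing. The dict layout, the function name `filter_low_expressed_genes`, and the thresholds are made up for illustration; they are not taken from the paper quoted above.

```python
def filter_low_expressed_genes(counts, min_cells=2, min_count=1):
    """Keep genes with at least `min_count` counts in at least `min_cells` cells.

    `counts` maps gene name -> list of per-cell counts (hypothetical layout).
    """
    kept = {}
    for gene, per_cell in counts.items():
        n_detected = sum(1 for c in per_cell if c >= min_count)
        if n_detected >= min_cells:
            kept[gene] = per_cell
    return kept


# Toy matrix: 3 genes x 4 cells (made-up numbers).
toy_counts = {
    "GeneA": [5, 0, 3, 7],
    "GeneB": [0, 0, 1, 0],
    "GeneC": [0, 0, 0, 0],
}
filtered = filter_low_expressed_genes(toy_counts, min_cells=2)
print(sorted(filtered))
```

Which threshold is sensible depends on the dataset and on the DE method, so it is worth trying a few values and checking how the results change.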

single cell RNA-seq clustering

dimension reduction and visualization of clusters

See https://t.co/yxCb85ctL1: "MDS best choice for preserving outliers, PCA for variance, & T-SNE for clusters" @mikelove @AndrewLBeam

— Rileen Sinha (@RileenSinha) August 25, 2016

paper: Outlier Preservation by Dimensionality Reduction Techniques

"MDS best choice for preserving outliers, PCA for variance, & T-SNE for clusters"
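To make the "PCA for variance" part concrete, here is a toy, standard-library-only sketch that finds the first principal component by power iteration and reports how much of the total variance it captures. The function name and the data points are made up; for real data a library implementation (for example `sklearn.decomposition.PCA`) is the better choice.

```python
import math


def first_principal_component(rows, n_iter=200):
    """Return (unit direction of PC1, fraction of total variance it explains).

    `rows` is a list of samples, each a list of feature values.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]

    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]

    # Power iteration converges to the eigenvector with the largest eigenvalue,
    # as long as the starting vector is not orthogonal to it.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]

    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                     for i in range(d))
    total_var = sum(cov[i][i] for i in range(d))
    return v, eigenvalue / total_var


# Made-up 2D points lying close to the line y = x.
points = [[1, 1.1], [2, 1.9], [3, 3.2], [4, 3.8], [5, 5.1]]
direction, explained = first_principal_component(points)
print(direction, explained)
```

PC1 points along the direction of largest spread, which is why PCA is a good fit when the goal is to preserve variance rather than local cluster structure.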

interesting papers to read

database

advance of scRNA-seq tech

pseudotemporal modelling

large scale single cell analysis

The field is advancing so fast!!

check this website for the tools being added:
https://www.scrna-tools.org/

paper published:
Exploring the single-cell RNA-seq analysis landscape with the scRNA-tools database

contamination of 10x data

https://twitter.com/constantamateur/status/994832241107849216?s=11

Did you know that droplet based single cell RNA-seq data (like 10X) is contaminated by ambient mRNAs? Good news though, we've written a paper (https://www.biorxiv.org/content/early/2018/04/20/303727 …) and created an R package called SoupX (https://github.com/constantAmateur/SoupX) to fix this problem.

Is this really a problem? It depends on your experiment. Contamination ranges from 2% - 50%. 10% seems common; it's 8% for 10X PBMC data. Solid tissues are typically worse, but there's no way to know in advance. Wouldn't you like to know how contaminated your data are?

These mRNAs come from the single cell suspension fed into the droplet creation system. They mostly get there from lysed cells and so resemble the cells being studied. This means the profile of the contamination is experiment specific and creates a batch effect.
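A rough Python sketch of the general idea of correcting for ambient mRNA: estimate the contamination fraction from genes assumed not to be expressed in a cell, then subtract the expected ambient counts. This is a simplified illustration only, not SoupX's API or its actual algorithm; the function names, the gene choices, and all numbers are assumptions made up for the example.

```python
def estimate_contamination(cell_counts, soup_profile, marker_genes):
    """Estimate the contamination fraction rho for one cell.

    Assumes `marker_genes` are NOT expressed by this cell, so every count
    observed for them comes from ambient mRNA:
        observed marker counts ~= rho * total counts * soup fraction of markers
    """
    total = sum(cell_counts.values())
    observed = sum(cell_counts.get(g, 0) for g in marker_genes)
    soup_fraction = sum(soup_profile.get(g, 0.0) for g in marker_genes)
    if total == 0 or soup_fraction == 0:
        raise ValueError("marker genes must be present in the soup profile")
    return min(1.0, observed / (total * soup_fraction))


def remove_ambient(cell_counts, soup_profile, rho):
    """Subtract the expected ambient counts per gene, clipping at zero."""
    total = sum(cell_counts.values())
    corrected = {}
    for gene, count in cell_counts.items():
        expected_soup = rho * total * soup_profile.get(gene, 0.0)
        corrected[gene] = max(0.0, count - expected_soup)
    return corrected


# Made-up ambient profile (fractions sum to 1) and a made-up T cell.
soup_profile = {"HBB": 0.5, "CD3E": 0.3, "LYZ": 0.2}
t_cell = {"HBB": 5, "CD3E": 90, "LYZ": 5}

# Assumption: a T cell does not express hemoglobin (HBB).
rho = estimate_contamination(t_cell, soup_profile, ["HBB"])
corrected = remove_ambient(t_cell, soup_profile, rho)
print(rho, corrected)
```

In this toy case the estimated contamination is 10%, in line with the "10% seems common" figure quoted above, but the real estimate depends entirely on the experiment.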

cellranger is the toolkit developed by 10x Genomics to process the data.

some tools for 10x

DropletUtils provides a number of utility functions for handling single-cell (RNA-seq) data from droplet technologies such as 10X Genomics. This includes data loading, identification of cells from empty droplets, removal of barcode-swapped pseudo-cells, and downsampling of the count matrix.
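To show what downsampling a count matrix means, here is a small Python sketch using binomial thinning: each individual count is kept with probability `prop`. DropletUtils is an R/Bioconductor package, so this is not its API and not necessarily its exact sampling scheme; the function name and the toy matrix are made up.

```python
import random


def downsample_counts(counts, prop, seed=0):
    """Keep each individual count (read/UMI) with probability `prop`.

    `counts` is a list of rows (genes), each a list of non-negative
    integer counts per cell. A fixed seed makes the result reproducible.
    """
    rng = random.Random(seed)
    return [
        [sum(1 for _ in range(c) if rng.random() < prop) for c in row]
        for row in counts
    ]


# Made-up 2 genes x 3 cells matrix.
matrix = [[10, 0, 4], [3, 8, 1]]
half = downsample_counts(matrix, prop=0.5)
print(half)
```

Downsampling like this is typically used to bring batches sequenced at different depths to a comparable coverage before comparing them.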