large-scale-gxe-methods/gxe-pheno-simulation

Goal: simulate GxE phenotypes for workflow testing.

assemble_summary_statistics.R: Hack around in R to wrangle the Excel supplementary file from Sung 2018 into a straightforward summary statistics table.

download_1000G_vcfs.sh: Download VCFs containing genotype calls for 2504 individuals from the 1000G phase 3 v5 dataset.

process_1000G_vcfs.sh: Filter VCFs first for MAF > 0.05 to generate a more reasonable dataset size, then for top variants from Sung 2018 (smoking -> blood pressure GWIS) and concatenate chromosomes.
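The MAF filtering step can be sketched as follows. This is a minimal illustration, not the contents of process_1000G_vcfs.sh: the function name filter_maf is hypothetical, it assumes each record carries an AF tag in its INFO column, and it drops multi-allelic sites for simplicity. A dedicated VCF tool would normally be a better choice for full-size files.

```shell
# Hypothetical helper (not part of this repo): keep header lines and
# biallelic records whose minor allele frequency exceeds a threshold.
# Assumes the INFO column (field 8) contains an AF=<value> tag.
filter_maf() {
  # $1 = MAF threshold (e.g. 0.05); reads uncompressed VCF text on stdin
  awk -v thr="$1" -F'\t' '
    /^#/ { print; next }                   # pass header lines through
    $5 ~ /,/ { next }                      # skip multi-allelic sites
    {
      n = split($8, info, ";")
      for (i = 1; i <= n; i++) {
        if (info[i] ~ /^AF=/) {
          af = substr(info[i], 4) + 0      # numeric alternate allele frequency
          maf = (af > 0.5) ? 1 - af : af   # fold to the minor allele
          if (maf > thr) print
          break
        }
      }
    }'
}
```

Example use, with a placeholder input file name: `gzip -dc input.vcf.gz | filter_maf 0.05 > filtered.vcf`. Records without an AF tag are dropped by this sketch.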

simulate_phenos.R: Based on input genotypes and summary statistics, simulate phenotypes to contain (sparse) genetic signal.
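For orientation, a generic linear GxE model of the kind such a simulation might use is written out below. This is an assumption for illustration only: the model actually used by simulate_phenos.R is not described here, and all symbols are introduced for this sketch.

```latex
% Illustrative model only; symbols are assumptions, not taken from the script.
% y_i      : simulated phenotype for individual i
% E_i      : exposure (e.g. smoking status) for individual i
% g_{ij}   : genotype dosage of individual i at variant j
% S        : small (sparse) set of variants carrying signal
y_i = \beta_E E_i
      + \sum_{j \in S} \left( \beta_{G,j}\, g_{ij} + \beta_{GE,j}\, g_{ij} E_i \right)
      + \varepsilon_i,
\qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2)
```

Under this reading, the effect sizes for the variants in S would come from the summary statistics table, and every variant outside S would contribute nothing to the phenotype.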

fetch_MIS_results.sh & subset_imputed_vcfs.sh: If the MAF > 0.05 sequenced genotypes from above are imputed using the Michigan Imputation Server (MIS), fetch the results (fetch_MIS_results.sh) and subset them to common variants (subset_imputed_vcfs.sh). These imputed genotypes can then be used for downstream interaction testing.
