diff --git a/vignettes/getting-started.Rmd b/vignettes/getting-started.Rmd
index 9fe5ef8..cb42bdb 100644
--- a/vignettes/getting-started.Rmd
+++ b/vignettes/getting-started.Rmd
@@ -42,7 +42,7 @@ colnames(example_data)
 We're going to set up three datasets. The first is required, it's the counts that the genetic interaction analysis will be used for.
 ```{r}
-# TODO: ## perhaps here we can clarify the flexibility of the workflow / what can be set up. e.g. does this function allow exploration / comparison of specific days and reps? or is that not really needed for the purposes of the vignette?
+## The first data set contains the readcounts from each sample type. Required for analysis is a Day 0 (or plasmid) sample, and at least one further timepoint sample. QC analysis will follow to correlate replicates if inputted. Comparison of early and late timepoints is possible in this function, but not required if early timepoints were not taken.
 example_counts <- example_data %>%
   dplyr::select(c("Day00_RepA", "Day05_RepA", "Day22_RepA", "Day22_RepB", "Day22_RepC")) %>%
@@ -55,15 +55,13 @@ The next two datasets are metadata that describe the dimensions of the count dat
 - The first column of the pg_metadata must be a unique id
 ```{r}
-# pg metadata is the information that describes the paired guide RNA targets
-
-# TODO: ## this is really minor but I think a more accurate description would be closer to: "pg metadata contains a table of paired guide RNAs targeting a paralog pair and their corresponding nucleotide sequences". What's being targeted is a 20bp region of genomic DNA, which is complementary to the gRNA sequence listed
+# pg metadata is the information that describes the paired guide RNA targets. This information contains a table of the paired guide RNA sequences and the corresponding paralog gene that is being targeted for cutting by the gRNA-Cas9 complex.
 example_pg_metadata <- example_data %>%
   dplyr::select(c("id", "seq_1", "seq_2"))
-# sample metadata is the information that describes
-# TODO: ## not sure if the comment above was cut off or not, but more description could help here
+# sample metadata is the information that describes timepoint information and replicate information relating to each sample. In general, two replicates at each timepoint are carried through to analysis, where they are later collapsed.
+
 example_sample_metadata <- data.frame(
   id = 1:5,