Merge pull request #20 from rformassspectrometry/jomain
docs: update documentation and improve validity check
jorainer authored Mar 28, 2024
2 parents 0c1a9c8 + 9a2002a commit 613de9f
Showing 7 changed files with 134 additions and 128 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -1,6 +1,6 @@
Package: MsBackendSql
Title: SQL-based Mass Spectrometry Data Backend
-Version: 1.3.4
+Version: 1.3.5
Authors@R:
c(person(given = "Johannes", family = "Rainer",
email = "[email protected]",
6 changes: 6 additions & 0 deletions NEWS.md
@@ -1,5 +1,11 @@
# MsBackendSql 1.3

+## Changes in 1.3.5
+
+- Improve input argument check and error message for `backendInitialize()` for
+`MsBackendOfflineSql`.
+- Update documentation adding `()` to all function names.
+
## Changes in 1.3.4

- Ensure primary keys from the database are in the correct order for
6 changes: 3 additions & 3 deletions R/MsBackendOfflineSql.R
@@ -17,7 +17,7 @@
#'
#' An empty instance of an `MsBackendOfflineSql` class can be created using the
#' `MsBackendOfflineSql()` function. An existing *MsBackendSql* SQL database
-#' can be loaded with the `backendInitialize` function. This function takes
+#' can be loaded with the `backendInitialize()` function. This function takes
#' parameters `drv`, `dbname`, `user`, `password`, `host` and `port`, all
#' parameters that are passed to the `dbConnect()` function to connect to
#' the (**existing**) SQL database.
@@ -27,7 +27,7 @@
#'
#' @param object A `MsBackendOfflineSql` object.
#'
-#' @param data For `backendInitialize`: optional `DataFrame` with the full
+#' @param data For `backendInitialize()`: optional `DataFrame` with the full
#' spectra data that should be inserted into a (new) `MsBackendSql`
#' database. If provided, it is assumed that the provided database
#' connection information is for a (writeable) empty database into which
@@ -128,7 +128,7 @@ setMethod("backendInitialize", "MsBackendOfflineSql",
function(object, drv = NULL, dbname = character(),
user = character(), password = character(),
host = character(), port = NA_integer_, data, ...) {
-if (is.null(drv))
+if (is.null(drv) || !inherits(drv, "DBIDriver"))
stop("Parameter 'drv' must be specified and needs to be ",
"an instance of 'DBIDriver' such as returned e.g. ",
"by 'SQLite()'")
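
The effect of the stricter condition in this hunk can be illustrated in isolation. The following is a minimal sketch, not the package's code: `check_drv()` and `fails()` are hypothetical helpers, and `mock_drv` is a stand-in object that merely carries the class `"DBIDriver"` so that the example runs without a database driver installed.

```r
## Hypothetical helper mirroring the condition used in the hunk above;
## `check_drv()` is not part of MsBackendSql.
check_drv <- function(drv = NULL) {
    if (is.null(drv) || !inherits(drv, "DBIDriver"))
        stop("Parameter 'drv' must be specified and needs to be ",
             "an instance of 'DBIDriver' such as returned e.g. ",
             "by 'SQLite()'")
    invisible(TRUE)
}

## Mock object standing in for a real driver such as the one returned by
## RSQLite's SQLite(); only its class attribute matters for the check.
mock_drv <- structure(list(), class = "DBIDriver")

## Returns TRUE if calling check_drv() on `x` raises an error.
fails <- function(x)
    inherits(try(check_drv(x), silent = TRUE), "try-error")

res <- c(missing = fails(NULL),      # rejected by the old and the new check
         string = fails("SQLite"),   # passed the old is.null() test only
         driver = fails(mock_drv))   # accepted
res
```

With only `is.null(drv)`, a value such as the character string `"SQLite"` would pass this test and presumably fail later, when the connection is opened; the added `inherits()` call reports the problem up front with the same error message.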
102 changes: 51 additions & 51 deletions R/MsBackendSql.R

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions man/MsBackendOfflineSql.Rd

Some generated files are not rendered by default.

102 changes: 51 additions & 51 deletions man/MsBackendSql.Rd

Large diffs are not rendered by default.

40 changes: 20 additions & 20 deletions vignettes/MsBackendSql.Rmd
@@ -53,15 +53,15 @@ The package can be installed with the `BiocManager` package. To install
# Creating and using `MsBackendSql` SQL databases

`MsBackendSql` SQL databases can be created either by importing (raw) MS data
-from MS data files using the `createMsBackendSqlDatabase` or using the
-`backendInitialize` function by providing in addition to the database connection
-also the full MS data to import as a `DataFrame`. In the first example we use
-the `createMsBackendSqlDatabase` function which takes a connection to an (empty)
-database and the names of the files from which the data should be imported as
-input parameters creates all necessary database tables and stores the full data
-into the database. Below we create an empty SQLite database (in a temporary
-file) and fill that with MS data from two mzML files (from the `r
-Biocpkg("msdata")` package).
+from MS data files using the `createMsBackendSqlDatabase()` or using the
+`backendInitialize()` function by providing in addition to the database
+connection also the full MS data to import as a `DataFrame`. In the first
+example we use the `createMsBackendSqlDatabase()` function which takes a
+connection to an (empty) database and the names of the files from which the data
+should be imported as input parameters creates all necessary database tables and
+stores the full data into the database. Below we create an empty SQLite database
+(in a temporary file) and fill that with MS data from two mzML files (from the
+`r Biocpkg("msdata")` package).

```{r, message = FALSE, results = "hide"}
library(RSQLite)
@@ -107,13 +107,13 @@ As an alternative, the `MsBackendOfflineSql` backend could also be used to
interface with MS data in a SQL database. In contrast to the `MsBackendSql`, the
`MsBackendOfflineSql` does not contain an active (open) connection to the
database and hence supports serializing (saving) the object to disk using
-e.g. the `save` function, or parallel processing (if supported by the database
+e.g. the `save()` function, or parallel processing (if supported by the database
system). Thus, for most use cases the `MsBackendOfflineSql` should be used
instead of the `MsBackendSql`. See further below for more information on the
`MsBackendOfflineSql`.

`Spectra` objects allow also to change the backend to any other backend
-(extending `MsBackend`) using the `setBackend` function. Below we use this
+(extending `MsBackend`) using the `setBackend()` function. Below we use this
function to first load all data into memory by changing from the `MsBackendSql`
to a `MsBackendMemory`.

@@ -127,7 +127,7 @@ With this function it is also possible to change from any backend to a
originating backend is stored in this database. To change the backend to an
`MsBackendOfflineSql` we need to provide the connection information to the SQL
database as additional parameters. These parameters are the same that need to
-be passed to a `dbConnect` call to establish the connection to the
+be passed to a `dbConnect()` call to establish the connection to the
database. These parameters include the database driver (parameter `drv`), the
database name and eventually the user name, host etc (see `?dbConnect` for more
information). In the simple example below we store the data into a SQLite
@@ -142,15 +142,15 @@ sps2
```

Similar to any other `Spectra` object we can retrieve the available *spectra
-variables* using the `spectraVariables` function.
+variables* using the `spectraVariables()` function.

```{r}
spectraVariables(sps)
```

-The MS peak data can be accessed using either the `mz`, `intensity` or
-`peaksData` functions. Below we extract the peaks matrix of the 5th spectrum and
-display the first 6 rows.
+The MS peak data can be accessed using either the `mz()`, `intensity()` or
+`peaksData()` functions. Below we extract the peaks matrix of the 5th spectrum
+and display the first 6 rows.

```{r}
peaksData(sps)[[5]] |>
@@ -193,7 +193,7 @@ sps$msLevel <- msLevel(sps)
system.time(msLevel(sps))
```

-We can also use the `reset` function to *reset* the data to its original state
+We can also use the `reset()` function to *reset* the data to its original state
(this will cause any local spectra variables to be deleted and the backend to be
initialized with the original data in the database).

@@ -306,7 +306,7 @@ access to the m/z and intensity values.
Performance can be improved for the `MsBackendMzR` using parallel
processing. Note that the `MsBackendSql` does **not support** parallel
processing and thus parallel processing is (silently) disabled in functions such
-as `peaksData`.
+as `peaksData()`.

```{r}
m2 <- MulticoreParam(2)
@@ -398,9 +398,9 @@ parallel processing setup was passed along with the `BPPARAM` method.
Some functions on `Spectra` objects require to load the MS peak data (i.e., m/z
and intensity values) into memory. For very large data sets (or computers with
limited hardware resources) such function calls can cause out-of-memory
-errors. One example is the `lengths` function that determines the number of
+errors. One example is the `lengths()` function that determines the number of
peaks per spectrum by loading the peak matrix first into memory. Such functions
-should ideally be called using the `peaksapply` function with parameter
+should ideally be called using the `peaksapply()` function with parameter
`chunkSize` (e.g., `peaksapply(sps, lengths, chunkSize = 5000L)`). Instead of
processing the full data set, the data will be first split into chunks of size
`chunkSize` that are stepwise processed. Hence, only data from `chunkSize`
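
The chunk-wise processing described in this hunk can be sketched in base R. This only illustrates the splitting idea and is not the implementation used by `peaksapply()`: `chunk_apply()` is a hypothetical helper, and the mock `peaks` list stands in for the peak matrices that a `Spectra` object would load from the database one chunk at a time (here everything is already in memory).

```r
## Mock peak data: a list of two-column matrices (m/z, intensity), one per
## spectrum. Real data would be fetched from the backend chunk by chunk.
set.seed(1)
peaks <- lapply(sample(1:20, 12, replace = TRUE), function(n)
    cbind(mz = sort(runif(n, 50, 500)), intensity = runif(n, 0, 1e4)))

## Hypothetical helper: apply FUN to `x` in chunks of at most `chunkSize`
## elements, so that FUN only ever sees one chunk at a time.
chunk_apply <- function(x, FUN, chunkSize = 5L) {
    idx <- seq_along(x)
    chunks <- split(idx, ceiling(idx / chunkSize))
    unlist(lapply(chunks, function(i) FUN(x[i])), use.names = FALSE)
}

## Number of peaks per spectrum, computed chunk by chunk.
n_chunked <- chunk_apply(peaks, function(z) vapply(z, nrow, integer(1)),
                         chunkSize = 5L)
n_chunked
```

The result is the same as processing all spectra at once; what changes is the peak memory use, which is bounded by the largest chunk rather than by the full data set.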
