I don't know what you guys think, but I'd need functions that download data from NMD in my work. The most natural place for me would be to place these functions in the RstoxData package. Below is my attempt to download landings data.
Function to download the landings (requires the FDIRcodes data frame):
#' @title Download landings data for a species from the IMR database
#' @description The function downloads landings ("sluttseddel") data from the IMR database. Requires access to the intranet.
#' @param species Any species identification name in \code{FDIRcodes$speciesCodes} as a character string. Only one species at a time is allowed.
#' @param years An integer vector of years to download. If \code{NULL} (default), all years are downloaded. Note that this option can take a very long time and lead to huge datasets.
#' @author Mikko Vihtakari
#' @importFrom RstoxData readXmlFile
#' @examples \dontrun{
#' downloadLandings("brugde") # Basking shark, all years
#' downloadLandings("kveite", years = 2000:2001) # Halibut, 2000-2001
#' }
#' @export
# species <- "blåkveite"; years <- c(1900:2020)
downloadLandings <- function(species, years = NULL) {

  ## Set up variables
  splist <- as.data.frame(FDIRcodes$speciesCodes)
  dest <- tempfile(fileext = ".xml")
  APIpath <- "http://tomcat7.imr.no:8080/apis/nmdapi/landing/v2?version=2.0&type=search"

  ## Find the species code: exact, case-insensitive match in any column of the species code list
  speciesPattern <- paste0("^", species, "$")
  tmp <- sapply(colnames(splist), function(x) grep(speciesPattern, splist[, x], ignore.case = TRUE))

  if (all(sapply(tmp, length) == 0)) {
    stop(paste(species, "not found in FDIRcodes$speciesCodes"))
  }

  matchedCols <- sapply(tmp, length) == 1

  if (sum(matchedCols) > 1) {
    stop(
      paste0(
        species, " was matched to ",
        paste(names(tmp)[matchedCols], collapse = ", "),
        ". Cannot extract information from multiple columns."
      )
    )
  }

  # spCol <- names(tmp)[matchedCols]
  spRow <- unlist(unname(tmp[matchedCols]))

  ## idNS codes are stored as numeric; restore the leading zero for three-digit codes
  spCode <- splist[spRow, "idNS"]
  spCode <- ifelse(nchar(spCode) == 3, paste0(0, spCode), spCode)

  ## Set up the download path
  if (is.null(years)) {
    DownloadPath <- paste0(APIpath, "&Art_kode=", spCode)
  } else {
    DownloadPath <- paste0(APIpath, "&Art_kode=", spCode, "&Fangstar=", paste(years, collapse = ","))
  }

  ## Download the data from the database
  status <- suppressMessages(suppressWarnings(try(utils::download.file(DownloadPath, dest), silent = TRUE)))

  if (inherits(status, "try-error")) {
    ## Stop processing if the query failed
    stop(paste("Species code", spCode, "with years", paste(years, collapse = ","), "not found in the database."))
  } else {
    ## Read the data
    RstoxData::readXmlFile(dest)
  }
}
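For reference, a minimal usage sketch (not run; it requires access to the IMR intranet, and the element names of the returned list depend on the landings XML format, so treat the inspection steps as assumptions):

## Download halibut landings for a single year and inspect the result.
## RstoxData::readXmlFile() returns the parsed XML as a named list of tables.
landings <- downloadLandings("kveite", years = 2019)
names(landings)
head(landings[[1]])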
And a function to prepare FDIR codes (requires their messy Excel sheet):
#' @title Retrieve the Norwegian Directorate of Fisheries codes from a code list
#' @description This function retrieves codes used in the electronic logbook data from an Excel sheet published by the Directorate of Fisheries. This list is already supplied in the package and the function is only required to update the codes.
#' @param path Character string specifying the path to the Excel file downloaded from the \href{https://www.fiskeridir.no/Yrkesfiske/Rapportering-ved-landing/Kodeliste}{Directorate of Fisheries webpage}.
#' @param speciesSheet Character string specifying the name of the tab containing species codes.
#' @param speciesStartRow Integer specifying the \code{skip} argument for \code{\link[readxl]{read_xlsx}} in the species code tab.
#' @param speciesHeaderRow Integer specifying row number of header in the species code tab.
#' @param gearSheet Character string specifying the name of the tab containing gear codes.
#' @param gearStartRow Integer specifying the \code{skip} argument for \code{\link[readxl]{read_xlsx}} in the gear code tab.
#' @details The function has been written for \href{https://www.fiskeridir.no/Yrkesfiske/Rapportering-ved-landing/Kodeliste}{the code list Excel sheet} published on 2020-10-30. You may have to adjust the function depending on changes in newer versions of the file.
#' @import readxl
#' @author Mikko Vihtakari
#' @export
# path = "~/Desktop/Kodeliste-landing-171219.xlsx"; speciesSheet = "B-Fiskeslag"; speciesStartRow = 19; speciesHeaderRow = 17; gearSheet = "A7-Redskap"; gearStartRow = 8
readFdirCodes <- function(path,
                          speciesSheet = "B-Fiskeslag",
                          speciesStartRow = 19,
                          speciesHeaderRow = 17,
                          gearSheet = "A7-Redskap",
                          gearStartRow = 8
                          ) {

  ## Species codes

  dt <- suppressMessages(readxl::read_xlsx(path = path, sheet = speciesSheet, skip = speciesStartRow, col_names = FALSE))

  ## Read the header row separately and use it as column names
  header <- suppressMessages(readxl::read_xlsx(path = path, sheet = speciesSheet, col_names = FALSE, range = paste0("A", speciesHeaderRow, ":", LETTERS[ncol(dt)], speciesHeaderRow)))
  colnames(dt) <- as.character(header[1, ])

  dt <- dt[c("Tall", "FAO", "Norsk navn", "Engelsk navn", "Latinsk navn")]
  dt <- dt[rowSums(is.na(dt)) != ncol(dt), ]
  colnames(dt) <- c("idNS", "idFAO", "norwegian", "english", "latin")

  ## Keep rows with a valid numeric Norwegian Standard code and drop duplicates
  dt$idNS <- suppressWarnings(as.numeric(dt$idNS))
  dt <- dt[!is.na(dt$idNS), ]
  dt <- dt[!duplicated(dt$idNS), ]

  ## Strip asterisks and surrounding whitespace from the species names
  dt$norwegian <- trimws(gsub("\\*", "", dt$norwegian))
  dt$english <- trimws(gsub("\\*", "", dt$english))

  speciesCodes <- dt

  ## Gear codes

  dt <- suppressMessages(readxl::read_xlsx(path = path, sheet = gearSheet, skip = gearStartRow, col_names = FALSE))
  dt <- dt[, 1:2]
  colnames(dt) <- c("idGear", "gearName")
  dt <- dt[!is.na(dt$idGear), ]

  ## Assign gear categories based on the tens digit of the gear code
  dt$gearCategory <- cut(dt$idGear, seq(10, 100, 10), right = FALSE, labels = c("Noter", "Garn", "Kroker", "Ruser", "Traal", "Noter", "Skytevaapen", "Annet", "Annet"))

  gearCodes <- dt

  ## Return
  list(speciesCodes = speciesCodes, gearCodes = gearCodes)
}
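For context, a rough sketch of how the returned list could be used to refresh the FDIRcodes object bundled with the package; the file path is only an example, and the final step assumes the code list is stored as exported package data via usethis:

## Rebuild the code lists from a freshly downloaded Excel file
## (the path below is hypothetical; point it to the downloaded code list)
FDIRcodes <- readFdirCodes("~/Downloads/Kodeliste-landing.xlsx")
str(FDIRcodes, max.level = 1)

## To update the data object shipped with the package
## (assumes usethis; run from the package root):
# usethis::use_data(FDIRcodes, overwrite = TRUE)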
As NMD provides an API for landings as well, it would be natural to include download services in the package @arnejohannesholmin presents in the comments to #136. If it is going to be a pure download package, the code for parsing the Fdir code lists is probably best included in RstoxData along with the code for parsing the landings.
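To make that split concrete, here is a sketch of how the download step could be kept separate from the parsing; the helper name below is made up for illustration and is not part of any existing package:

## Hypothetical download-only helper: fetches the XML and returns the file path,
## leaving parsing to RstoxData::readXmlFile() (or whatever the parsing package provides).
downloadNMDFile <- function(url, dest = tempfile(fileext = ".xml")) {
  utils::download.file(url, dest, quiet = TRUE)
  dest
}

## Parsing would then stay in RstoxData:
# landings <- RstoxData::readXmlFile(downloadNMDFile(someLandingUrl))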
Source: https://github.com/MikkoVihtakari/RstoxUtils