Download functions - downloadLandings #135

MikkoVihtakari · 2021-03-30T07:11:40Z

I don't know what you guys think, but I'd need functions that download data from NMD in my work. The most natural place for me would be to place these functions in the RstoxData package. Under is my attempt to download landings data.

FDIR species codes:

tmp.file <- tempfile()
download.file("https://github.com/MikkoVihtakari/RstoxUtils/raw/master/data/FDIRcodes.rda", tmp.file)
load(tmp.file)

Function to download the landings (requires FDIRcodes data frame):

#' @title Download landings data for a species from the IMR database
#' @description The function downloads landings ("sluttseddel") data from IMR database. Requires access to the intranet. 
#' @param species any species identification name in \code{FDIRcodes$speciesCodes} as character. Only one species at the time allowed.
#' @param years an integer vector of years to download. If \code{NULL} (default), all years are downloaded. Please note that this option can take very long time and lead to huge datasets.
#' @author Mikko Vihtakari
#' @importFrom RstoxData readXmlFile
#' @examples \dontrun{
#' downloadLandings("brugde") # Basking shark, all years
#' downloadLandings("kveite", years = 2000:2001) # halibut, 2000-01
#' }
#' @export

# species  <- "blåkveite"; years <- c(1900:2020)
downloadLandings <- function(species, years = NULL) {
  
  ## Set up variables
  
  splist <- as.data.frame(FDIRcodes$speciesCodes)
  dest <- tempfile(fileext = ".xml")
  APIpath <- "http://tomcat7.imr.no:8080/apis/nmdapi/landing/v2?version=2.0&type=search"
  
  ## Find the species code
  
  species <- paste0("^", species, "$")
  tmp <- sapply(colnames(splist), function(x) grep(species, splist[,x], ignore.case = TRUE))
  
  if(all(sapply(tmp, function(k) length(k) == 0))) stop(paste(species, "not found from FDIRcodes$speciesCodes"))
  
  if(sum(sapply(tmp, function(k) length(k) == 1)) > 1) {
    stop(
      paste(species, "was matched to", 
            paste(names(sapply(tmp, function(k) length(k) == 1)[sapply(tmp, function(k) length(k) == 1)]), collapse = ", "),
            ". Cannot extract information from multiple columns."
      )
    )
  }
  
  # spCol <- names(tmp[sapply(tmp, function(k) length(k) == 1)])
  spRow <- unlist(unname(tmp[sapply(tmp, function(k) length(k) == 1)]))
  spCode <- splist[spRow, "idNS"]
  
  spCode <- ifelse(nchar(spCode) == 3, paste0(0, spCode), spCode)
  
  ## Set up the download path
  
  if(is.null(years)) {
    DownloadPath <- paste0(APIpath, "&Art_kode=", spCode)
  } else {
    DownloadPath <- paste0(APIpath, "&Art_kode=", spCode, "&Fangstar=", paste(years, collapse = ","))
  }
  
  ## Download the data from the database 
  
  status <- suppressMessages(suppressWarnings(try(utils::download.file(DownloadPath, dest), silent = TRUE)))
  
  if(class(status) == "try-error") {
    
    ## Stop processing if not found
    
    stop(paste("Species code", spCode, "with years", paste(years, collapse = ","), "not found from the database."))
    
  } else {
    
    ## Read the data
    
    RstoxData::readXmlFile(dest)
    
  }
}

And a function to prepare FDIR codes (requires their messy Excel sheet):


#' @title Retrieve the Norwegian Directorate of Fisheries codes from a code list
#' @description This function retrieves codes used in the electronic logbook data from an Excel sheet published by the Directorate of Fisheries. This list is already supplied in the package and the function is only required to update the codes.
#' @param path Character string specifying the path to the Excel file downloaded from the \href{https://www.fiskeridir.no/Yrkesfiske/Rapportering-ved-landing/Kodeliste}{Directorate of Fisheries webpage}.
#' @param speciesSheet Character string specifying the name of the tab containing species codes.
#' @param speciesStartRow Integer specifying the \code{skip} argument for \code{\link[readxl]{read_xlsx}} in the species code tab.
#' @param speciesHeaderRow Integer specifying row number of header in the species code tab.
#' @param gearSheet Character string specifying the name of the tab containing species codes.
#' @param gearStartRow Integer specifying the \code{skip} argument for \code{\link[readxl]{read_xlsx}} in the gear code tab.
#' @details The function has been written for \href{https://www.fiskeridir.no/Yrkesfiske/Rapportering-ved-landing/Kodeliste}{the code list Excel sheet} published on 2020-10-30. You may have to adjust the function depending on changes in newer versions of the file.
#' @import readxl
#' @author Mikko Vihtakari
#' @export

# path = "~/Desktop/Kodeliste-landing-171219.xlsx"; speciesSheet = "B-Fiskeslag"; speciesStartRow = 19; speciesHeaderRow = 17; gearSheet = "A7-Redskap"; gearStartRow = 8
readFdirCodes <- function(path,
                          speciesSheet = "B-Fiskeslag",
                          speciesStartRow = 19,
                          speciesHeaderRow = 17,
                          gearSheet = "A7-Redskap",
                          gearStartRow = 8
) {
  
  ## Species codes
  
  dt <- suppressMessages(readxl::read_xlsx(path = path, sheet = speciesSheet, skip = speciesStartRow, col_names = FALSE))
  header <- suppressMessages(readxl::read_xlsx(path = path, sheet = speciesSheet, col_names = FALSE, range = paste0("A", speciesHeaderRow, ":", LETTERS[ncol(dt)], speciesHeaderRow)))
  
  colnames(dt) <- as.character(header[1,])
  
  dt <- dt[c("Tall", "FAO", "Norsk navn", "Engelsk navn", "Latinsk navn")]
  dt <- dt[rowSums(is.na(dt)) != ncol(dt), ]
  colnames(dt) <- c("idNS", "idFAO", "norwegian", "english", "latin")
  
  dt$idNS <- suppressWarnings(as.numeric(dt$idNS))
  dt <- dt[!is.na(dt$idNS),]
  dt <- dt[!duplicated(dt$idNS),]
  dt$norwegian <- trimws(gsub("\\*", "", dt$norwegian))
  dt$english <- trimws(gsub("\\*", "", dt$english))
  
  speciesCodes <- dt
  
  ## Gear codes
  
  dt <- suppressMessages(readxl::read_xlsx(path = path, sheet = gearSheet, skip = gearStartRow, col_names = FALSE))
  dt <- dt[,1:2]
  colnames(dt) <- c("idGear", "gearName")
  dt <- dt[!is.na(dt$idGear),]
  dt$gearCategory <- cut(dt$idGear, seq(10, 100, 10), right = FALSE, labels = c("Noter", "Garn", "Kroker", "Ruser", "Traal", "Noter", "Skytevaapen", "Annet", "Annet"))
  
  gearCodes <- dt
  
  ## Return
  
  list(speciesCodes = speciesCodes, gearCodes = gearCodes)
  
}

Source: https://github.com/MikkoVihtakari/RstoxUtils

The text was updated successfully, but these errors were encountered:

edvinf · 2021-09-20T08:42:08Z

As NMD provide an API for for landings as well, it would be natural to include download services in the package @arnejohannesholmin presents in comments to #136. If it is going to be a pure download package, the code for parsing the Fdir codelists is probably best included in RstoxData along with the code for parsing the landings.

Changed ImputeSuperIndividuals() to not add rows from EDSUs with no a…

MikkoVihtakari added the enhancement New feature or request label Mar 30, 2021

edvinf self-assigned this Sep 20, 2021

arnejohannesholmin added a commit that referenced this issue Apr 20, 2023

Merge pull request #135 from StoXProject/develop

d5a2dff

Changed ImputeSuperIndividuals() to not add rows from EDSUs with no a…

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Download functions - downloadLandings #135

Download functions - downloadLandings #135

MikkoVihtakari commented Mar 30, 2021 •

edited

Loading

edvinf commented Sep 20, 2021

Download functions - downloadLandings #135

Download functions - downloadLandings #135

Comments

MikkoVihtakari commented Mar 30, 2021 • edited Loading

edvinf commented Sep 20, 2021

MikkoVihtakari commented Mar 30, 2021 •

edited

Loading