Semantics

Alex Robin edited this page Jul 10, 2022 · 1 revision

Introduction

All SWE data models (SWE Common, SensorML, O&M) and service models (SOS, SPS, STA) provide ways of linking to semantic concepts via URIs. These concepts are typically defined elsewhere on the web (e.g. in ontologies) and can be resolved to better understand their exact meaning.

The OpenSensorHub team has created cross-domain ontologies that can be used directly with these standards, based on the work done by the W3C SOSA/SSN and QUDT working groups. The main objective of the OSH ontologies is to serve as a cross-domain basis onto which domain-specific ontologies can be mapped.

Ontology Design

We try to follow these design guidelines for our ontologies:

  • Have a textual definition for each term with good reference(s) to more complete write ups about the concept
  • Have meaningful IRIs (no URIs with opaque IDs)
  • Ensure entry IRIs are self-resolvable by using URLs (don't use URL fragments with #)
  • Have internal semantic relationships (skos)
  • Maintain mappings to other well established ontologies

Scope of the Ontologies

The scope of these ontologies is the following:

Observable Properties

  • Based on QUDT Quantity Kind ontology
  • Start from fundamental physical quantities, and add more specific properties
  • Only define properties that are used across several domains and avoid including things that are very specific (e.g. never add something as specific as some CF terms like "sea_surface_wave_directional_variance_spectral_density" but consider the generic concept of "spectral_density" as in scope).

System Properties

  • Sensor response model characteristics (resolution, sensitivity, bias, non-linear characteristic curve, spectral response, etc.)
  • Sampling characteristics (sampling frequency, sampling geometry, directional response, etc.)
  • Actuator response characteristics (command latency, etc.)
  • Mechanical characteristics
  • Electrical characteristics
  • Operating ranges and conditions

These are typically instances of the "SystemCapability" and "SystemProperty" classes defined in SSN.

Assigned Properties

These are properties that are assigned by the equipment manufacturer, owner or operator and not measured.

  • Identifiers (serial number, model number, etc.)
  • Classifiers (sensor type, platform type, application, etc.)
  • Contact roles (manufacturer, operator, etc.)

Data Properties

  • Data fields (array sizes and indexes, flags, sequence numbers, counters, etc.)
  • Data structures (trajectory, image, grid, coverage, profile, etc.)

Property Qualifiers

  • Uncertainty (std error, etc.)
  • Statistical operators (mean, median, min/max, stddev, variance, temporal and spatial variants, etc.)
  • Frequency bands

Object Types

  • System types
  • Platform types
  • Sensor types
  • Actuator types
  • Maintenance event types

Reference Ontologies and Vocabularies

Definition URIs Lookup Process

When developing a new driver or a new SensorML document:

  1. List all properties that are either observed or controlled by the system, as well as all system characteristics and capabilities.

  2. Look for exact matches in existing ontologies referenced above:

    • QUDT for basic physical quantities
    • CF standard names for weather/climate properties
    • W3C SSN for general system capabilities and characteristics
    • sensorml.com for more specific geopositioning and other properties

  3. For concepts that are not already in one of these ontologies (or another well-known ontology we decide to use), do one of the following:

    • If a new concept needs to be added to one of the ontologies hosted at https://github.com/opensensorhub/swe-ontologies, create an issue on that repo (issue template to be added). If the new concept fits within the scope of a more general ontology like QUDT, we will also submit the new entry to them.
    • If the driver is based on a domain standard (e.g. CSM, MISB, ISA...), a new ontology can be created to define terms that are specific to that standard (after sorting out the ones that can be covered by existing ontologies).

  4. While waiting for a new concept/property to be properly registered, always use the temporary namespace TBD.
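
The lookup order described above can be sketched as a simple priority search. This is only an illustration: the `VOCABULARIES` tables and the `find_definition` helper are hypothetical stand-ins for browsing or querying the real vocabularies, and each table holds at most one hand-picked entry.

```python
from __future__ import annotations

from typing import Optional

# Hypothetical, hand-filled term tables standing in for real vocabulary lookups.
# The order follows the lookup process: QUDT, CF standard names, W3C SSN, sensorml.com.
VOCABULARIES: list[tuple[str, dict[str, str]]] = [
    ("QUDT", {"temperature": "https://qudt.org/vocab/quantitykind/Temperature"}),
    ("CF", {"air_pressure_at_cloud_base":
            "http://mmisw.org/ont/cf/parameter/air_pressure_at_cloud_base"}),
    ("SSN", {}),  # left empty in this sketch
    ("sensorml.com", {"location": "http://sensorml.com/ont/swe/property/Location"}),
]


def find_definition(term: str) -> Optional[tuple[str, str]]:
    """Return (vocabulary name, URI) for the first exact match, or None."""
    key = term.strip().lower().replace(" ", "_")
    for name, terms in VOCABULARIES:
        if key in terms:
            return name, terms[key]
    return None  # no exact match: request a new concept and use the temporary namespace
```

A `None` result corresponds to the case where a new concept has to be requested and the temporary namespace is used in the meantime.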

Current Issues

Property Variants

It is sometimes hard to know what definition URI to use to query the API because there can be many different variants. For example, we have different URIs representing location:

  • http://sensorml.com/ont/swe/property/Location
  • http://www.opengis.net/def/property/OGC/0/SensorLocation
  • http://www.opengis.net/def/property/OGC/0/PlatformLocation
  • http://sensorml.com/ont/swe/property/TargetLocation

Different solutions are possible:

  1. Use composed properties, i.e. more generic definition URIs combined with qualifiers at the instance level.
  2. Call the ontology (e.g. using SPARQL) before calling the API to figure things out.
  3. Make the API implementation smarter.

These solutions are discussed in more detail below.

Composed Properties

Many properties can be defined by combining a base quantity and one or more qualifiers, as in the following examples:

  • sea surface temperature is a combination of temperature and a feature of interest sea surface
  • average temperature is a combination of temperature and a statistical operator temporal average
  • speed accuracy is a combination of speed and an uncertainty qualifier accuracy
  • etc.

Some ontologies combine several of these concepts to form new URIs. In theory, we could first query the ontology to discover all flavors of a particular base property and then query OSH for these. However, the number of combinations is so large that I think such ontologies would be hard to create and maintain anyway. On the other hand, having only general concepts such as temperature is insufficient to define a property precisely.

An alternative would be to do the composition at the instance level, using the SWE Common language. Each SWE Common component would thus be able to combine:

  • the base property (e.g. physical quantity)
  • the medium
  • the feature of interest?
  • the statistical operator or uncertainty measure applied to it
  • the frequency band (acoustic or electromagnetic)
  • chemical or biological species (applies to counts, concentrations, etc.)
  • other qualifiers?

We would also need to come up with a syntax to search for such combinations.

For example, instead of searching for: http://mmisw.org/ont/cf/parameter/air_pressure_at_cloud_base

One could search for: https://qudt.org/vocab/quantitykind/Pressure+http://sweetontology.net/phenAtmoCloud/Cloud or https://qudt.org/vocab/quantitykind/Pressure+http://sweetontology.net/realm/Atmosphere

Since the different qualifiers are provided at the instance level, the API does not need to resolve the URIs with the ontology to understand the associations between them.
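
A minimal sketch of how such a combined search could be matched against instance-level qualifiers, assuming a "+"-separated syntax like the one in the example. The `ComposedProperty` class and the `parse_query` and `matches` functions are hypothetical names, not part of SWE Common or OSH, and the URIs are treated as opaque strings. Note that a literal "+" would need to be percent-encoded if this syntax were carried in an HTTP query string.

```python
from __future__ import annotations

from dataclasses import dataclass, field

TEMPERATURE = "https://qudt.org/vocab/quantitykind/Temperature"
CLOUD = "http://sweetontology.net/phenAtmoCloud/Cloud"
ATMOSPHERE = "http://sweetontology.net/realm/Atmosphere"


@dataclass(frozen=True)
class ComposedProperty:
    """A base property plus an unordered set of qualifier URIs (hypothetical model)."""
    base: str
    qualifiers: frozenset[str] = field(default_factory=frozenset)


def parse_query(query: str) -> ComposedProperty:
    """Split 'base+qualifier+qualifier...' into its parts (assumed syntax)."""
    base, *qualifiers = query.split("+")
    return ComposedProperty(base, frozenset(qualifiers))


def matches(query: ComposedProperty, candidate: ComposedProperty) -> bool:
    """A candidate matches if it has the same base and at least the queried qualifiers."""
    return query.base == candidate.base and query.qualifiers <= candidate.qualifiers


# A datastream component described at the instance level.
component = ComposedProperty(TEMPERATURE, frozenset({CLOUD, ATMOSPHERE}))
found = matches(parse_query(TEMPERATURE + "+" + CLOUD), component)
```

With subset matching, a query for the base property alone also returns the component, while a query with a qualifier the component lacks does not. No ontology lookup is involved at any point.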

Smarter API

The "Connected Systems API" implementation could resolve the different URI variants automatically if it has knowledge of the relationships between properties.

In its simplest form, the idea would be to retrieve results that match the specified property or any of its narrower concepts. For example, querying for Location would retrieve datastreams that have SensorLocation, PlatformLocation and TargetLocation, because a "skos:broader/narrower" relationship exists between these concepts.
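
As a rough sketch of this expansion, the narrower relationships can be held in a plain mapping and walked transitively. The `NARROWER` table below is a hand-written assumption that mirrors the Location example; it is not read from an actual ontology, and `expand` is a hypothetical helper.

```python
from __future__ import annotations

LOCATION = "http://sensorml.com/ont/swe/property/Location"

# Assumed skos:narrower links, written by hand for illustration only.
NARROWER: dict[str, list[str]] = {
    LOCATION: [
        "http://www.opengis.net/def/property/OGC/0/SensorLocation",
        "http://www.opengis.net/def/property/OGC/0/PlatformLocation",
        "http://sensorml.com/ont/swe/property/TargetLocation",
    ],
}


def expand(uri: str, narrower: dict[str, list[str]] = NARROWER) -> set[str]:
    """Return the URI itself plus every concept reachable through narrower links."""
    seen: set[str] = set()
    stack = [uri]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guards against cycles in the relationship graph
        seen.add(current)
        stack.extend(narrower.get(current, []))
    return seen
```

The API would then filter datastreams on the whole expanded set rather than on the single URI supplied by the client.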

There are several ways to implement this:

  1. Resolve things dynamically by calling the ontology(ies) every time to determine which URIs are subclasses of a property.
  2. Fetch the relevant ontology entries when a concept is first used by one of the registered systems/datastreams, building a local cache of the ones used in the sensorhub, so that further queries can be run locally and much more efficiently. The cache could also be refreshed regularly or on demand.
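
The caching approach could look roughly like the sketch below. `NarrowerCache` is a hypothetical class, and the `fetch` callable stands in for whatever mechanism actually queries the ontology (for example a SPARQL request). Entries are fetched on first use, reused afterwards, and dropped either when they get too old or on demand.

```python
from __future__ import annotations

import time
from typing import Callable, Iterable, Optional


class NarrowerCache:
    """Local cache of narrower-concept URIs, filled lazily (illustrative sketch)."""

    def __init__(self, fetch: Callable[[str], Iterable[str]],
                 max_age_s: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch          # assumed: returns narrower URIs for a concept
        self._max_age_s = max_age_s  # entries older than this are fetched again
        self._clock = clock
        self._entries: dict[str, tuple[float, frozenset[str]]] = {}

    def narrower(self, uri: str) -> frozenset[str]:
        """Return cached narrower URIs, calling the ontology only when needed."""
        now = self._clock()
        entry = self._entries.get(uri)
        if entry is None or now - entry[0] > self._max_age_s:
            entry = (now, frozenset(self._fetch(uri)))
            self._entries[uri] = entry
        return entry[1]

    def refresh(self, uri: Optional[str] = None) -> None:
        """Drop one entry, or all entries, so the next lookup fetches again."""
        if uri is None:
            self._entries.clear()
        else:
            self._entries.pop(uri, None)
```

Compared with resolving dynamically on every request, this trades some staleness (bounded by `max_age_s`) for queries that run locally once a concept has been seen.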