Semantics
All SWE data models (SWE Common, SensorML, O&M) and service models (SOS, SPS, STA) provide ways of linking to semantic concepts via URIs. These concepts are typically defined elsewhere on the web (e.g. in ontologies) and can be resolved to better understand their exact meaning.
The OpenSensorHub team has created cross-domain ontologies that can be used directly with these standards, based on the work done by the W3C SOSA/SSN and QUDT working groups. The main objective of the OSH ontologies is to serve as a cross-domain basis onto which we can map domain-specific ontologies.
We try to follow these design guidelines for our ontologies:
- Have a textual definition for each term, with good reference(s) to more complete write-ups about the concept
- Have meaningful IRIs (no URIs with opaque IDs)
- Ensure entry IRIs are self-resolvable by using URLs (don't use URL fragments with #)
- Have internal semantic relationships (SKOS)
- Maintain mappings to other well established ontologies
The scope of these ontologies is the following:
- Based on QUDT Quantity Kind ontology
- Start from fundamental physical quantities, and add more specific properties
- Only define properties that are used across several domains and avoid including things that are very specific (e.g. never add something as specific as some CF terms like "sea_surface_wave_directional_variance_spectral_density" but consider the generic concept of "spectral_density" as in scope).
- Sensor response model characteristics (resolution, sensitivity, bias, non-linear characteristic curve, spectral response, etc.)
- Sampling characteristics (sampling frequency, sampling geometry, directional response, etc.)
- Actuator response characteristics (command latency, etc.)
- Mechanical characteristics
- Electrical characteristics
- Operating ranges and conditions
These are typically instances of the "SystemCapability" and "SystemProperty" classes defined in SSN.
These are properties that are assigned by the equipment manufacturer, owner or operator and not measured.
- Identifiers (serial number, model number, etc.)
- Classifiers (sensor type, platform type, application, etc.)
- Contact roles (manufacturer, operator, etc.)
- Data fields (array sizes and indexes, flags, sequence numbers, counters, etc.)
- Data structures (trajectory, image, grid, coverage, profile, etc.)
- Uncertainty (std error, etc.)
- Statistical operators (mean, median, min/max, stddev, variance, temporal and spatial variants, etc.)
- Frequency bands
- System types
- Platform types
- Sensor types
- Actuator types
- Maintenance event types
- OGC Definition Server - SWE Definition Vocabulary
- W3C Semantic Sensor Network (SSN)
- QUDT Quantity Kinds
- CF Parameters hosted on MMI server
- We tried to use DBpedia, but its term URIs are somewhat unstable
When developing a new driver or a new SensorML document:
- List all properties that are either observed or controlled by the system, as well as all system characteristics and capabilities.
- Look for exact matches in the existing ontologies referenced above:
  - QUDT for basic physical quantities
  - CF standard names for weather/climate properties
  - W3C SSN for general system capabilities and characteristics
  - sensorml.com for more specific geopositioning properties and others
- For concepts that are not already in one of these ontologies (or another well-known ontology we decide to use), we should do one of the following:
  - If a new concept needs to be added to one of the ontologies hosted at https://github.com/opensensorhub/swe-ontologies, create an issue on that repo (issue template to be added). If the new concept fits within the scope of a more general ontology like QUDT, we will also submit the new entry to them.
  - If the driver is based on a domain standard (e.g. CSM, MISB, ISA...), a new ontology can be created to define terms that are specific to that standard (after sorting out the ones that can be covered by existing ontologies).
  - While waiting for a new concept/property to be properly registered, always use the temporary namespace (TBD).
It is sometimes hard to know what definition URI to use to query the API because there can be many different variants. For example, we have different URIs representing location:
http://sensorml.com/ont/swe/property/Location
http://www.opengis.net/def/property/OGC/0/SensorLocation
http://www.opengis.net/def/property/OGC/0/PlatformLocation
http://sensorml.com/ont/swe/property/TargetLocation
Different solutions are possible:
- Use Composed Properties using more generic definition URIs with qualifiers, at the instance level.
- Query the ontology (e.g. using SPARQL) before calling the API to figure out which URIs to use
- Make the API implementation smarter.
These solutions are discussed in more detail below.
Many properties can be defined by combining a base quantity and one or more qualifiers, as in the following examples:
- "sea surface temperature" is a combination of "temperature" and a feature of interest, "sea surface"
- "average temperature" is a combination of "temperature" and a statistical operator, "temporal average"
- "speed accuracy" is a combination of "speed" and an uncertainty qualifier, "accuracy"
- etc.
Some ontologies combine several of these concepts to form new URIs. In theory, we could first query the ontology to discover all flavors of a particular base property and then query OSH for these. However, the number of combinations is so large that I think such ontologies would be hard to create and maintain anyway. On the other hand, having only general concepts such as "temperature" is insufficient to define a property precisely.
An alternative would be to do the composition at the instance level, using the SWE Common language. Each SWE Common component would thus be able to combine:
- the base property (e.g. physical quantity)
- the medium
- the feature of interest?
- the statistical operator or uncertainty measure applied to it
- the frequency band (acoustic or electromagnetic)
- chemical or biological species (applies to counts, concentrations, etc.)
- other qualifiers?
We would also need to come up with a syntax to search for such combinations.
For example, instead of searching for:
http://mmisw.org/ont/cf/parameter/air_pressure_at_cloud_base
One could search for:
https://qudt.org/vocab/quantitykind/Temperature+http://sweetontology.net/phenAtmoCloud/Cloud
or
https://qudt.org/vocab/quantitykind/Temperature+http://sweetontology.net/realm/Atmosphere
Since the different qualifiers are provided at the instance level, the API does not need to resolve the URIs with the ontology to understand the associations between them.
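As a rough illustration of instance-level composition, the sketch below parses a "+"-separated query like the ones above and matches it against the base property and qualifiers attached to a component. The "+" separator comes from the example URIs; the `ComposedProperty` class, the `parse_query` and `matches` functions, and the matching rule (same base, query qualifiers are a subset of the component's qualifiers) are assumptions made for this sketch, not an existing OSH API.

```python
import re
from dataclasses import dataclass, field
from typing import FrozenSet


@dataclass(frozen=True)
class ComposedProperty:
    """Hypothetical instance-level description: a base property plus qualifier URIs."""
    base: str
    qualifiers: FrozenSet[str] = field(default_factory=frozenset)


def parse_query(query: str) -> ComposedProperty:
    # Split only on a "+" that is immediately followed by a new http(s) URI,
    # so a "+" appearing inside a URI is left alone.
    parts = re.split(r"\+(?=https?://)", query)
    return ComposedProperty(base=parts[0], qualifiers=frozenset(parts[1:]))


def matches(component: ComposedProperty, query: ComposedProperty) -> bool:
    # Assumed rule: bases must be identical and every qualifier in the query
    # must be present on the component. No ontology lookup is involved.
    return component.base == query.base and query.qualifiers <= component.qualifiers


TEMPERATURE = "https://qudt.org/vocab/quantitykind/Temperature"
ATMOSPHERE = "http://sweetontology.net/realm/Atmosphere"
CLOUD = "http://sweetontology.net/phenAtmoCloud/Cloud"

query = parse_query(TEMPERATURE + "+" + ATMOSPHERE)
cloud_temperature = ComposedProperty(TEMPERATURE, frozenset({ATMOSPHERE, CLOUD}))
plain_temperature = ComposedProperty(TEMPERATURE)
```

With these definitions, `matches(cloud_temperature, query)` is true and `matches(plain_temperature, query)` is false, since the second component carries no atmosphere qualifier.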
The "Connected Systems API" implementation could resolve the different URI variants automatically if it has knowledge of the relationships between properties.
In its simplest form, the idea would be to retrieve results that match subclasses of the specified property. For example, querying for Location would retrieve datastreams that have SensorLocation, PlatformLocation and TargetLocation, because there exists a "skos:broader/narrower" relationship between these concepts.
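The expansion described above can be sketched with a small in-memory map of "narrower" relationships. The map is hard-coded here purely for illustration, using the location URIs listed earlier; in practice it would be derived from the ontology. The function names and the shape of the `datastreams` dictionary are hypothetical.

```python
from typing import Dict, List, Set

LOCATION = "http://sensorml.com/ont/swe/property/Location"

# Illustrative skos:narrower relationships (assumed, not loaded from an ontology).
NARROWER: Dict[str, List[str]] = {
    LOCATION: [
        "http://www.opengis.net/def/property/OGC/0/SensorLocation",
        "http://www.opengis.net/def/property/OGC/0/PlatformLocation",
        "http://sensorml.com/ont/swe/property/TargetLocation",
    ],
}


def expand_property(uri: str, narrower: Dict[str, List[str]]) -> Set[str]:
    """Return the URI itself plus all of its transitively narrower concepts."""
    result: Set[str] = set()
    stack = [uri]
    while stack:
        current = stack.pop()
        if current in result:
            continue  # already visited; also guards against cycles
        result.add(current)
        stack.extend(narrower.get(current, []))
    return result


def filter_datastreams(datastreams: Dict[str, str], query_uri: str,
                       narrower: Dict[str, List[str]]) -> List[str]:
    """Keep the datastreams whose property is the queried one or a narrower one."""
    accepted = expand_property(query_uri, narrower)
    return [name for name, prop in datastreams.items() if prop in accepted]


# Hypothetical datastreams, keyed by name, with their observed property URI.
datastreams = {
    "gps": "http://www.opengis.net/def/property/OGC/0/SensorLocation",
    "thermometer": "https://qudt.org/vocab/quantitykind/Temperature",
}
location_streams = filter_datastreams(datastreams, LOCATION, NARROWER)
```

Here `location_streams` contains only `"gps"`: its property is narrower than Location, while the temperature datastream is filtered out.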
There are several ways to implement this:
- Resolve things dynamically by calling the ontology (or ontologies) on every query to determine which URIs are subclasses of a property.
- Fetch the relevant ontology entries when a concept is first used by one of the registered systems/datastreams, building a local cache of the concepts used in the sensor hub so that further queries can be run locally and much more efficiently. The cache could also be refreshed regularly or on demand.
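A minimal sketch of the second (caching) option follows. It assumes a caller-supplied `fetch_narrower` function that returns the narrower concepts of a URI, for instance by querying a SPARQL endpoint; that function, the `NarrowerCache` class and its method names are hypothetical and only meant to show the lookup-once, reuse-afterwards behavior.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Optional


class NarrowerCache:
    """Caches, per property URI, the set made of the URI and its narrower concepts."""

    def __init__(self, fetch_narrower: Callable[[str], Iterable[str]]):
        # fetch_narrower is an assumed hook that talks to the ontology.
        self._fetch = fetch_narrower
        self._cache: Dict[str, FrozenSet[str]] = {}

    def expand(self, uri: str) -> FrozenSet[str]:
        # Call the ontology only the first time a concept is used.
        if uri not in self._cache:
            self._cache[uri] = frozenset(self._fetch(uri)) | {uri}
        return self._cache[uri]

    def refresh(self, uri: Optional[str] = None) -> None:
        # On-demand refresh: drop one entry, or the whole cache if no URI is given.
        if uri is None:
            self._cache.clear()
        else:
            self._cache.pop(uri, None)


# Stand-in for a real ontology call, counting how often it is invoked.
calls = []

def fake_fetch(uri: str) -> Iterable[str]:
    calls.append(uri)
    if uri == "http://sensorml.com/ont/swe/property/Location":
        return ["http://sensorml.com/ont/swe/property/TargetLocation"]
    return []


cache = NarrowerCache(fake_fetch)
first = cache.expand("http://sensorml.com/ont/swe/property/Location")
second = cache.expand("http://sensorml.com/ont/swe/property/Location")
```

The second `expand` call is served from the local cache, so `fake_fetch` is invoked only once until `refresh` is called.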