DAS-2166: Improve HyBIG documentation and notebooks #12

Merged: 9 commits, May 15, 2024
198 changes: 167 additions & 31 deletions README.md
@@ -1,53 +1,189 @@
# Harmony Browse Image Generator (HyBIG) backend service.
# Harmony Browse Image Generator (HyBIG).

This Harmony backend service produces browse imagery. By default, the imagery
is compatible with the NASA Global Imagery Browse Services
([GIBS](https://www.earthdata.nasa.gov/eosdis/science-system-description/eosdis-components/gibs)).

This means that defaults for images are selected to match the visualization
generation requirements and recommendations put forth in the GIBS Interface
Control Document (ICD), which can be found on [Earthdata
Wiki](https://wiki.earthdata.nasa.gov/display/GITC/Ingest+Delivery+Methods)
along with [additional GIBS
documentation](https://nasa-gibs.github.io/gibs-api-docs/).

HyBIG creates paletted PNG images and associated metadata from GeoTIFF input
images. Scientific parameter raster data as well as RGB[A] raster images can
be converted to browse PNGs. These browse images undergo transformation by
reprojection, tiling and coloring to seamlessly integrate with GIBS.

### Reprojection

GIBS expects to receive images in one of three Coordinate Reference System (CRS) projections.

| Region | Code | Name |
|-------------|-----------|-----------------------------------------------------------|
| north polar | EPSG:3413 | WGS 84 / NSIDC Sea Ice Polar Stereographic North |
| south polar | EPSG:3031 | WGS 84 / Antarctic Polar Stereographic |
| global | EPSG:4326 | WGS 84 -- WGS84 - World Geodetic System 1984, used in GPS |

HyBIG processing will attempt to choose a GIBS-suitable target CRS from the
input image or read it from the inputs. Reprojection is done by resampling via
nearest neighbor. Note that HyBIG outputs are browse imagery, not scientific
data, and should not be used for scientific analysis.


### Tiling

Large output images are divided into smaller, more manageable tiles for
efficient handling and processing, as per agreement with GIBS. The maximum
untiled image size generated by HyBIG is 67,108,864 cells (8,192 x 8,192). If
the output image exceeds this threshold, HyBIG automatically tiles the output
into multiple 4,096 x 4,096 cell images.

Tiled images are labeled with the zero-based column and row numbers inserted
into the output filename before its
extension. For example, `VCF5KYR_1991001_001_2018224205008.r01c02.png` represents the
second row and third column of the output tiles. The tiles at the edges are
truncated to fit the overall image dimensions. Currently, you cannot override
this behavior.
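
The tiling rule and filename convention above can be sketched as follows. This is a minimal illustration, not HyBIG's implementation: the function name `tile_filenames` is hypothetical, and the two-digit zero padding of row and column numbers is an assumption inferred from the example filename.

```python
import math
from pathlib import Path

MAX_UNTILED_CELLS = 8192 * 8192  # 67,108,864 cells
TILE_SIZE = 4096  # cells per tile edge


def tile_filenames(filename: str, width: int, height: int) -> list[tuple[str, int, int]]:
    """Return (tile filename, tile width, tile height) for each output tile.

    Hypothetical helper: images at or below the threshold are left untiled.
    """
    if width * height <= MAX_UNTILED_CELLS:
        return [(filename, width, height)]

    path = Path(filename)
    tiles = []
    for row in range(math.ceil(height / TILE_SIZE)):
        for col in range(math.ceil(width / TILE_SIZE)):
            # Edge tiles are truncated to fit the overall image dimensions.
            tile_width = min(TILE_SIZE, width - col * TILE_SIZE)
            tile_height = min(TILE_SIZE, height - row * TILE_SIZE)
            # Zero-based row/column label inserted before the extension
            # (two-digit padding is an assumption).
            name = f"{path.stem}.r{row:02d}c{col:02d}{path.suffix}"
            tiles.append((str(path.with_name(name)), tile_width, tile_height))
    return tiles


tiles = tile_filenames("VCF5KYR_1991001_001_2018224205008.png", 10000, 9000)
```

For a 10,000 x 9,000 cell image this yields a 3 x 3 grid of tiles, with the right-hand column and bottom row narrower than 4,096 cells.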

### Coloring

HyBIG images are colored in several ways. A palette can be included in the
input [STAC
Item](https://github.com/radiantearth/stac-spec/blob/master/item-spec/item-spec.md). If
an Item's asset contains a value with the role of `palette`, it is assumed to
be a reference to a remote color table, which is fetched from the asset's
`href` and parsed as a GDAL color table.

If the STAC Item lacks color information, the Harmony message source is
searched for a related URL with a "content type" of `VisualizationURL` and a
"type" of `Color Map`. If found, it is presumed to be a remote color table and
fetched from that location.

In the absence of remote color information, the input image itself is searched
for a color map, which is used if present.

If no color information can be found, grayscale is used.
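
The fallback order described above can be summarised in a short sketch. The function and argument names here are hypothetical and only model the order of precedence; they are not HyBIG's API, and fetching or parsing the color table is not shown.

```python
from typing import Optional


def select_color_source(
    stac_palette_href: Optional[str],
    message_color_map_url: Optional[str],
    image_has_color_map: bool,
) -> tuple[str, Optional[str]]:
    """Return (source name, remote location) following the fallback order."""
    # 1. A STAC Item asset with the role `palette`.
    if stac_palette_href is not None:
        return ("stac_item_palette", stac_palette_href)
    # 2. A related URL in the Harmony message source (Color Map).
    if message_color_map_url is not None:
        return ("harmony_message_color_map", message_color_map_url)
    # 3. A color map embedded in the input image itself.
    if image_has_color_map:
        return ("input_image_color_map", None)
    # 4. No color information found anywhere.
    return ("grayscale", None)
```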

### Defaults

HyBIG tries to provide GIBS-appropriate default values for the browse image
outputs. When a user does not provide target values for the output, HyBIG
will try to pick an appropriate default.

#### Coordinate Reference System (CRS)

HyBIG selects a default CRS from the list of GIBS preferred projections. The
steps followed are simple but effective:

1. If the `proj` is `lonlat` use global (`EPSG:4326`)
1. If the projection latitude of origin is above 80° N use northern (`EPSG:3413`)
1. If the projection latitude of origin is below 80° S use southern (`EPSG:3031`)
1. Otherwise use global (`EPSG:4326`)
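
The steps above can be sketched as a small function. This is an illustration only: it assumes the input projection is available as a dictionary of PROJ parameters, where `lat_0` is the latitude of origin, and the function name `choose_default_crs` is hypothetical.

```python
def choose_default_crs(proj_params: dict) -> str:
    """Pick a GIBS-preferred CRS code from PROJ-style parameters."""
    if proj_params.get("proj") == "lonlat":
        return "EPSG:4326"  # global
    latitude_of_origin = proj_params.get("lat_0", 0.0)
    if latitude_of_origin > 80:
        return "EPSG:3413"  # north polar
    if latitude_of_origin < -80:
        return "EPSG:3031"  # south polar
    return "EPSG:4326"  # global fallback
```

For example, a polar stereographic input with `lat_0` of 90 would map to `EPSG:3413`, while a mid-latitude projection falls through to `EPSG:4326`.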

#### Scale Extent (Image Bounds)

The default scale extent for an output image is computed by reprojecting the
input data boundary into the target CRS. The edges are densified by adding 21
points ([rasterio's
default](https://rasterio.readthedocs.io/en/latest/api/rasterio.warp.html#rasterio.warp.transform_bounds))
to each edge before reprojection. This accounts for non-linear edges produced
by the transformation and ensures that all data is included in the output image.
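
To see why densifying matters, the sketch below reimplements the idea in plain Python with a toy transform. It is not what HyBIG runs (the text above points to rasterio's `transform_bounds` for that); `densified_bounds` and `toy_polar` are hypothetical names, and `toy_polar` is a made-up polar-style mapping chosen only because it bends straight edges.

```python
import math


def densified_bounds(left, bottom, right, top, transform, densify_pts=21):
    """Transform a bounding box, sampling extra points along each edge."""
    steps = densify_pts + 1  # densify_pts interior points plus the corners
    edge_points = []
    for i in range(steps + 1):
        fraction = i / steps
        x = left + fraction * (right - left)
        y = bottom + fraction * (top - bottom)
        # One sample on each of the four edges.
        edge_points.extend([(x, bottom), (x, top), (left, y), (right, y)])
    xs, ys = zip(*(transform(x, y) for x, y in edge_points))
    return min(xs), min(ys), max(xs), max(ys)


def toy_polar(lon, lat):
    """Toy north-polar mapping (assumption, for illustration only)."""
    radius = 90.0 - lat
    angle = math.radians(lon)
    return radius * math.sin(angle), -radius * math.cos(angle)


corners_only = densified_bounds(-45, 60, 45, 80, toy_polar, densify_pts=0)
densified = densified_bounds(-45, 60, 45, 80, toy_polar)
```

With corners only, the southern edge of the box is underestimated because it bulges outward after the transform; sampling along the edges captures the bulge.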

#### Dimensions / Scale Sizes

Output image dimensions can be explicitly included as `width` and `height` in
the Harmony message or computed based on the scale extent and scale size
(resolution).

The dimensions are computed from the scale extent and scale size as follows:
```
height = round((scale_extent['ymax'] - scale_extent['ymin']) / scale_size.y)
width = round((scale_extent['xmax'] - scale_extent['xmin']) / scale_size.x)
```
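
As a worked example of the computation above, the sketch below wraps the two formulas in a function and applies them to a global extent at 0.25° resolution. The `ScaleSize` dataclass and `compute_dimensions` function are hypothetical stand-ins for whatever objects the service uses.

```python
from dataclasses import dataclass


@dataclass
class ScaleSize:
    """Resolution in target CRS units per cell (hypothetical container)."""

    x: float
    y: float


def compute_dimensions(scale_extent: dict, scale_size: ScaleSize) -> tuple[int, int]:
    """Return (width, height) in cells for the given extent and resolution."""
    height = round((scale_extent["ymax"] - scale_extent["ymin"]) / scale_size.y)
    width = round((scale_extent["xmax"] - scale_extent["xmin"]) / scale_size.x)
    return width, height


global_extent = {"xmin": -180.0, "ymin": -90.0, "xmax": 180.0, "ymax": 90.0}
print(compute_dimensions(global_extent, ScaleSize(x=0.25, y=0.25)))  # (1440, 720)
```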

When a Harmony message contains neither `dimensions` nor `scaleSizes`, a default
set of dimensions is computed.

For coarse input data, the resolution (scale size) is used with the scale
extent to compute the output dimensions. For high-resolution data, finer than
2km per gridcell, the input resolution is used to look up the closest GIBS
preferred resolution (Table 4.1.8-1 and -2 from the ICD), and the preferred
resolution along with the scale extent is used to compute the output image
dimensions.

### Customizations

Users can request customizations to the output images such as `crs`,
`scale_extents`, or `scale_sizes` and dimensions (`height` & `width`) in the
Harmony request. However, the generated outputs may not be compatible with
GIBS.

When a user customizes `scale_extent` or `scale_size`, they must also include a
`crs` in the request. The units of the customized values must match the target
CRS. For example, specifying a bounding box in degrees requires a target CRS
also with units of degrees.


## Repository structure:

```
|- .pre-commit-config.yaml
|- 📂 bin
|- 📂 docker
|- 📂 docs
|- 📂 harmony_browse_image_generator
|- 📂 tests
|- CHANGELOG.md
|- CONTRIBUTING.md
|- LICENSE
|- README.md
|- bin
|- conda_requirements.txt
|- dev-requirements.txt
|- docker
|- docs
|- harmony_browse_image_generator
|- legacy-CHANGELOG.md
|- pip_requirements.txt
|- tests
```

Review discussion on the `LICENSE` entry:

> **Member:** Thanks for adding these missing items. (Although, now it's making me realise we're potentially missing the description of the license file in the list below, maybe the file is self-explanatory, but might be worth mentioning it's required for NASA open-source release)
>
> **flamingbear (Member, Author), May 15, 2024:** Yes, I looked at all the files and was trying to decide which ones should be described. With the license it would be:
>
> `|- license`
>
> `license - license file for repository`
>
> IDK. No strong opinions, but also don't think it's necessary.
>
> **Member:** I think if we were to capture anything it would be that the license is in compliance with NASA open-source software requirements. Potential text:
>
> * LICENSE - Required for distribution under NASA open-source approval. Details conditions for use, reproduction and distribution.
>
> Probably not perfect text, but I think that's the gist of anything I would add.

* `.pre-commit-config` - a pre-commit configuration file describing functions to
be run on every git commit.
* `CHANGELOG.md` - This file contains a record of changes applied to each new
release of a service Docker image. Any release of a new service version
should have a record of what was changed in this file.
* `CONTRIBUTING.md` - This file contains guidance for making contributions to
HyBIG, including recommended git best practices.
* `README.md` - This file, containing guidance on developing the service.
* `bin` - A directory containing utility scripts to build the service and test
images. A script to extract the release notes for the most recent service
version, as contained in `CHANGELOG.md` is also in this directory.
* `conda_requirements.txt` - A list of service dependencies, such as GDAL, that
cannot be installed via Pip.
* `dev-requirements.txt` - list of packages required for service development.

* `docker` - A directory containing the Dockerfiles for the service and test
images. It also contains `service_version.txt`, which contains the semantic
version number of the service image. Any time an update is made that should
have an accompanying service image release, this file should be updated.
* `docs` - directory with example usage notebooks.
* `harmony_browse_image_generator` - The directory containing Python source code

* `docs` - A directory with example usage notebooks.

* `harmony_browse_image_generator` - A directory containing Python source code
for HyBIG. `adapter.py` contains the `BrowseImageGeneratorAdapter`
class that is invoked by calls to the service.

* `tests` - A directory containing the service unit test suite.

* `CHANGELOG.md` - This file contains a record of changes applied to each new
release of a service Docker image. Any release of a new service version
should have a record of what was changed in this file.

* `CONTRIBUTING.md` - This file contains guidance for making contributions to
HyBIG, including recommended git best practices.

* `LICENSE` - Required for distribution under NASA open-source
approval. Details conditions for use, reproduction and distribution.

* `README.md` - This file, containing guidance on developing the service.

* `conda_requirements.txt` - A list of service dependencies, such as GDAL, that
cannot be installed via Pip.

* `dev-requirements.txt` - A list of packages required for service development.

* `legacy-CHANGELOG.md` - Notes for each version that was previously released
internally to EOSDIS, prior to open-source publication of the code and Docker
image.

* `pip_requirements.txt` - A list of service Python package dependencies.
* `tests` - A directory containing the service unit test suite.


## Local development:

@@ -74,7 +210,7 @@ service within that environment via conda and pip then install the pre-commit ho

This service utilises the Python `unittest` package to perform unit tests on
classes and functions in the service. After local development is complete, and
test have been updated, they can be run via:
tests have been updated, they can be run in Docker via:

```bash
$ ./bin/build-image
@@ -106,8 +242,8 @@ major.minor.patch.
When publishing a new Docker image for the service, two files need to be
updated:

* CHANGELOG.md - Notes should be added to capture the changes to the service.
* docker/service_version.txt - The semantic version number should be updated.
* `CHANGELOG.md` - Notes should be added to capture the changes to the service.
* `docker/service_version.txt` - The semantic version number should be updated.

## CI/CD:

@@ -130,14 +266,15 @@ The `publish_docker_image.yml` workflow will:
* Extract the release notes for the most recent version from `CHANGELOG.md`.
* Create a GitHub release that will also tag the related git commit with the
semantic version number.
* Build and deploy this service's Docker image to `ghcr.io`.

Before triggering a release, ensure both the `docker/service_version.txt` and
`CHANGELOG.md` files are updated. The `CHANGELOG.md` file requires a specific
format for a new release, as it looks for the following string to define the
newest relate of the code (starting at the top of the file).
newest release of the code (starting at the top of the file).

```
## vX.Y.Z
## vX.Y.Z - YYYY-MM-DD
```
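
A heading check like the one described above could look like the following sketch. It is not the repository's release-notes script (that lives in `bin`); the regular expression and the `latest_release` function are assumptions that only illustrate matching the dated `## vX.Y.Z - YYYY-MM-DD` heading from the top of the file.

```python
import re
from typing import Optional

# Matches a heading such as "## v1.2.3 - 2024-05-15" on its own line.
VERSION_HEADING = re.compile(
    r"^## (v\d+\.\d+\.\d+) - (\d{4}-\d{2}-\d{2})\s*$", re.MULTILINE
)


def latest_release(changelog_text: str) -> Optional[tuple[str, str]]:
    """Return (version, date) for the first matching heading, if any."""
    match = VERSION_HEADING.search(changelog_text)
    return (match.group(1), match.group(2)) if match else None


example = "# Changelog\n\n## v1.2.0 - 2024-05-15\n\n### Changed\n\n## v1.1.0 - 2024-04-01\n"
print(latest_release(example))  # ('v1.2.0', '2024-05-15')
```

Because the search starts at the top of the text, the newest release must be the first versioned heading in the file.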

### pre-commit hooks:
@@ -152,7 +289,7 @@ checking the repository for some coding standard best practices. These include:
* [black](https://black.readthedocs.io/en/stable/index.html) Python code
formatting checks.

To enable these checks:
To enable these checks locally:

```bash
# Install pre-commit Python package as part of test requirements:
@@ -179,10 +316,9 @@ automatically run for every pull request.
## Releasing a new version of the service:

Once a new Docker image has been published with a new semantic version tag,
that service version can be released to a Harmony environment by updating the
main Harmony Bamboo deployment project. Find the environment you wish to
release the service version to and update the associated environment variable
to update the semantic version tag at the end of the full Docker image name.
that service version can be released to a Harmony environment by following the
directions in the [Harmony Managing Existing Services
Guide](https://github.com/nasa/harmony/blob/main/docs/guides/managing-existing-services.md).

## Get in touch:
