Proposal for version 2 #24

Merged 8 commits on Oct 18, 2024
23 changes: 22 additions & 1 deletion CHANGELOG.md
@@ -16,9 +16,30 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Fixed

## [v2.0.0] - 2024-08-30

### Added

- `storage:schemes`, `storage:refs` and Storage Scheme Object
- Support the storage extension in Links
- Support for the Alternate Assets Extension
- Support for other storage providers, including custom S3 hosts

### Changed

- The extension is a framework for storage providers; it doesn't strictly define the individual providers.
- The storage providers are grouped in `storage:schemes` and located in the Item Properties, Collections or Catalog metadata
- Assets and Links reference the storage schemes by key in `storage:refs`

### Removed

- `storage:platform`, `storage:region`, `storage:requester_pays` and `storage:tier`
(moved to Storage Scheme Object, except for the tier)

## [v1.0.0] - 2021-06-23

Initial release

[Unreleased]: <https://github.com/stac-extensions/storage/compare/v1.0.0...HEAD>
[Unreleased]: <https://github.com/stac-extensions/storage/compare/v2.0.0...HEAD>
[v2.0.0]: <https://github.com/stac-extensions/storage/compare/v1.0.0...v2.0.0>
[v1.0.0]: <https://github.com/stac-extensions/storage/tree/v1.0.0>
43 changes: 43 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,43 @@
# Contributing

All contributions are subject to the
[STAC Specification Code of Conduct](https://github.com/radiantearth/stac-spec/blob/master/CODE_OF_CONDUCT.md).
For contributions, please follow the
[STAC specification contributing guide](https://github.com/radiantearth/stac-spec/blob/master/CONTRIBUTING.md). Instructions
for running tests are copied here for convenience.

## Running tests

The same checks that run on PRs are part of the repository and can be run locally to verify that changes are valid.
To run tests locally, you'll need `npm`, which is a standard part of any [node.js installation](https://nodejs.org/en/download/).

First you'll need to install everything with npm once. Just navigate to the root of this repository and on
your command line run:

```bash
npm install
```

Then to check markdown formatting and test the examples against the JSON schema, you can run:

```bash
npm test
```

This prints the same messages that you see online, so you can then fix your Markdown or examples.

If the tests reveal formatting problems with the examples, you can fix them with:

```bash
npm run format-examples
```

## Adding a new provider

1. Add documentation in a Markdown file to the folder `platforms`
2. Add the provider to the table in the `README.md`, see the section "type"
3. Add a JSON Schema to the folder `json-schema/platforms`
4. Add the schema to the extension schema in file `json-schema/schema.json` (search for `allOf` below the definition of `storage:schemes`)
5. Add the newly created schema to the `validator-config.json`

Use the same file names (excluding the extension) for documentation and schema.
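
The naming rule above can be checked with a few lines of Python. This is only a sketch and not part of the repository's test suite: the function name `unmatched_platform_files` is hypothetical, and it assumes the schema files use a `.json` extension while the documentation files use `.md`.

```python
from pathlib import PurePosixPath


def unmatched_platform_files(doc_files, schema_files):
    """Return base names that exist only as documentation or only as schema."""
    # Compare file names without folder and without extension.
    doc_stems = {PurePosixPath(name).stem for name in doc_files}
    schema_stems = {PurePosixPath(name).stem for name in schema_files}
    # The symmetric difference holds every name missing on one side.
    return sorted(doc_stems ^ schema_stems)
```

An empty list means every provider document in `platforms` has a schema of the same name in `json-schema/platforms`, and vice versa.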
134 changes: 74 additions & 60 deletions README.md
@@ -1,96 +1,110 @@
# Storage Extension Specification

- **Title:** Storage
- **Identifier:** <https://stac-extensions.github.io/storage/v1.0.0/schema.json>
- **Identifier:** <https://stac-extensions.github.io/storage/v2.0.0/schema.json>
- **Field Name Prefix:** storage
- **Scope:** Item, Collection
- **Scope:** Item, Catalog, Collection
- **Extension [Maturity Classification](https://github.com/radiantearth/stac-spec/tree/master/extensions/README.md#extension-maturity):** Pilot
- **Owner**: @davidraleigh @matthewhanson
- **Owner**: @matthewhanson @m-mohr

This document explains the Storage Extension to the [SpatioTemporal Asset Catalog](https://github.com/radiantearth/stac-spec) (STAC) specification.
It allows adding details related to cloud storage access and costs to be associated with STAC Assets.
It allows details about cloud object storage access and costs to be associated with STAC Assets.
This extension does not cover NFS solutions provided by PaaS cloud companies.

- Examples:
- [Item example 1](examples/item-naip.json): Shows the basic usage of the extension in a STAC Item.
- [Item example 2](examples/item-nsl.json): Another example of basic usage.
- [NAIP Item with Alternate Assets](examples/item-naip.json): Shows a mixture of storage providers, including custom S3 hosts
and the [alternate assets extension](https://github.com/stac-extensions/alternate-assets).
- [Catalog with Link](examples/catalog-link.json): Shows the usage of the extension on a link in a STAC Catalog.
- [Collection with Auth](examples/collection.json): Shows the usage of the extension in a STAC Collection in combination with the
[authentication extension](https://github.com/stac-extensions/authentication).
- [JSON Schema](json-schema/schema.json)
- [Changelog](./CHANGELOG.md)

## Fields

The fields in the table below can be used in these parts of STAC documents:

- [x] Catalogs
- [x] Collections
- [x] Item Properties (incl. Summaries in Collections)
- [ ] Assets (for both Collections and Items, incl. Item Asset Definitions in Collections)
- [ ] Links

| Field Name | Type | Description |
| ----------------- | ------------------------------------------------------------ | ----------- |
| `storage:schemes` | Map<string, [Storage Scheme Object](#storage-scheme-object)> | **REQUIRED.** A property that contains all of the storage schemes used by Assets and Links in the STAC Item, Catalog or Collection. |

---

The fields in the table below can be used in these parts of STAC documents:

- [ ] Catalogs
- [ ] Collections
- [x] Item Properties (incl. Summaries in Collections)
- [ ] Item Properties (incl. Summaries in Collections)
- [x] Assets (for both Collections and Items, incl. Item Asset Definitions in Collections)
- [ ] Links
- [x] Links
- [x] [Alternate Assets Object](https://github.com/stac-extensions/alternate-assets?tab=readme-ov-file#alternate-asset-object)

| Field Name | Type | Description |
| ---------------------- | --------- | ----------- |
| storage:platform | string | The [cloud provider](#providers) where data is stored |
| storage:region | string | The region where the data is stored. Relevant to speed of access and inter region egress costs (as defined by PaaS provider) |
| storage:requester_pays | boolean | Is the data requester pays or is it data manager/cloud provider pays. *Defaults to false* |
| storage:tier | string | The title for the tier type (as defined by PaaS provider) |
| Field Name | Type | Description |
| -------------- | ---------- | ----------- |
| `storage:refs` | \[string\] | A property that specifies which schemes in `storage:schemes` may be used to access an Asset or Link. Each value must be one of the keys defined in `storage:schemes`. |
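
The rule that each `storage:refs` value must be a key in `storage:schemes` can be checked mechanically. The following is a minimal sketch, not part of the extension: the function name `find_unknown_refs` is hypothetical, and it only inspects Assets and Links (Alternate Assets are left out for brevity).

```python
def find_unknown_refs(stac_object: dict) -> list[str]:
    """Return storage:refs values that are not keys in storage:schemes."""
    # Items carry the schemes in "properties"; Catalogs and Collections
    # carry them at the top level.
    container = stac_object.get("properties", stac_object)
    schemes = container.get("storage:schemes", {})

    # Assets (a map) and Links (a list) may both carry storage:refs.
    carriers = list(stac_object.get("assets", {}).values())
    carriers += stac_object.get("links", [])

    unknown = []
    for carrier in carriers:
        for ref in carrier.get("storage:refs", []):
            if ref not in schemes:
                unknown.append(ref)
    return unknown
```

An empty result means every reference resolves to a defined scheme.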

While these are all valid properties on an Item, they will typically be defined per-asset. If a field applies equally
to all assets (e.g., storage:platform=AWS if all assets are on AWS), then it should be specified in Item properties.
### Storage Scheme Object

### Additional Field Information
| Field Name | Type | Description |
| -------------- | ------- | ----------- |
| type | string | **REQUIRED.** Type identifier for the platform, see below. |
| platform | string | **REQUIRED.** The cloud provider where data is stored as URI or URI template to the API. |
| region | string | The region where the data is stored. Relevant to speed of access and inter region egress costs (as defined by PaaS provider). |
| requester_pays | boolean | Is the data "requester pays" (`true`) or is it "data manager/cloud provider pays" (`false`). Defaults to `false`. |
| ... | ... | Additional properties as defined in the URL template or in the platform specific documents. |

#### Providers
Currently this document is arranged to support object storage users of the following PaaS solutions:
The properties `title` and `description` as defined in Common Metadata should be used as well.

- Alibaba Cloud (Aliyun): `ALIBABA`
- Amazon AWS: `AWS`
- Microsoft Azure: `AZURE`
- Google Cloud Platform: `GCP`
- IBM Cloud: `IBM`
- Oracle Cloud: `ORACLE`
- All other PaaS solutions: `OTHER`
#### platform

The upper-cased values are meant to be used for `storage:platform`.
The `platform` field identifies the cloud provider where the data is stored, as a URI or URI template to the API of the service.

#### Cloud Provider Storage Tiers
If a URI template is provided, all variables must be defined in the Storage Scheme Object as a property with the same name.
For example, the URI template `https://{bucket}.{region}.example.com` must have at least the properties
`bucket` and `region` defined:

| Minimum Duration | [Google Cloud Platform](https://cloud.google.com/storage/docs/storage-classes) | [Amazon AWS](https://aws.amazon.com/s3/storage-classes/) | [Microsoft Azure](https://docs.microsoft.com/en-us/azure/storage/blobs/storage-blob-storage-tiers) | [IBM Cloud](https://cloud.ibm.com/objectstorage/create#pricing) | [Oracle Cloud](https://www.oracle.com/cloud/storage/pricing.html) | [Alibaba Cloud](https://www.alibabacloud.com/product/oss/pricing) |
| ------------- | --------- | ------------------------ | ------- |---------- | ----------------- | ----------------- |
| 0 (Auto-Tier) | | Intelligent-Tiering | | Smart Tier |
| 0 days | STANDARD | Standard | hot | Standard | Standard | Standard |
| 30 days | NEARLINE | Standard-IA, One Zone-IA | cool | Vault | Infrequent Access | Infrequent Access |
| 60 days | | | | | | Archive |
| 90 days | COLDLINE | Glacier | | Cold Vault | Archive | |
| 180 days | | Glacier Deep Archive | archive | | | Cold Archive |
| 365 days | ARCHIVE | | | | | |
```json
{
  "type": "example",
  "platform": "https://{bucket}.{region}.example.com",
  "region": "eu-fr",
  "bucket": "john-doe-stac",
  "requester_pays": true
}
```
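
A client could expand such a template with the properties of the same Storage Scheme Object. The sketch below is an illustration only: `expand_platform` is a hypothetical helper, and it handles only simple `{name}` variables rather than the full URI Template syntax of RFC 6570.

```python
import re


def expand_platform(scheme: dict) -> str:
    """Fill {variable} placeholders in `platform` from the scheme's own properties."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        # Every template variable must be defined in the Storage Scheme Object.
        if name not in scheme:
            raise KeyError(f"URI template variable '{name}' is not defined")
        return str(scheme[name])

    return re.sub(r"\{([^{}]+)\}", substitute, scheme["platform"])


scheme = {
    "type": "example",
    "platform": "https://{bucket}.{region}.example.com",
    "region": "eu-fr",
    "bucket": "john-doe-stac",
    "requester_pays": True,
}
print(expand_platform(scheme))  # https://john-doe-stac.eu-fr.example.com
```

Raising an error for an undefined variable mirrors the requirement that all template variables are present as properties.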

## Contributing
In case an `href` contains a non-HTTP URL that is not directly resolvable,
the `platform` property must identify the host so that the URL can be resolved without further information.
For example, this is especially useful to provide the endpoint URL for custom S3 providers.
In this case the `platform` could effectively provide the endpoint URL.
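
As an illustration of how a client might use the endpoint, the sketch below turns an `s3://` href into an HTTP URL. It is not defined by this extension: `to_http_url` is a hypothetical helper, and it assumes that the already expanded `platform` value contains the bucket in its host name (virtual-hosted style), so only the object key is appended. Providers that use path-style addressing would need different handling.

```python
from urllib.parse import urlsplit


def to_http_url(href: str, endpoint: str) -> str:
    """Map a non-HTTP href onto the endpoint given by the expanded `platform`."""
    parts = urlsplit(href)
    if parts.scheme in ("http", "https"):
        # Already directly resolvable, nothing to do.
        return href
    # Assumption: the bucket is encoded in the endpoint host already,
    # so only the object key (the path of the href) is appended.
    return endpoint.rstrip("/") + parts.path
```

With the endpoint `https://mybucket.s3.us-west-2.amazonaws.com`, the href `s3://mybucket/project/documentation.pdf` maps to `https://mybucket.s3.us-west-2.amazonaws.com/project/documentation.pdf`.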

All contributions are subject to the
[STAC Specification Code of Conduct](https://github.com/radiantearth/stac-spec/blob/master/CODE_OF_CONDUCT.md).
For contributions, please follow the
[STAC specification contributing guide](https://github.com/radiantearth/stac-spec/blob/master/CONTRIBUTING.md) Instructions
for running tests are copied here for convenience.
#### type

### Running tests
We try to collect pre-defined templates and best practices for as many providers as possible
in this repository, but be aware that these are not part of the official extension releases.
This extension just provides the framework; the provider best practices
may change at any time without a new version of this extension being released.

The same checks that run as checks on PR's are part of the repository and can be run locally to verify that changes are valid.
To run tests locally, you'll need `npm`, which is a standard part of any [node.js installation](https://nodejs.org/en/download/).
The following providers have defined best practices at this point:

First you'll need to install everything with npm once. Just navigate to the root of this repository and on
your command line run:
```bash
npm install
```
| `type` | Provider and Documentation |
| ----------- | -------------------------- |
| `aws-s3` | [AWS S3](platforms/aws-s3.md) |
| `custom-s3` | [Generic S3 (non-AWS)](platforms/custom-s3.md) |
| `ms-azure` | [Microsoft Azure](platforms/ms-azure.md) |

Then to check markdown formatting and test the examples against the JSON schema, you can run:
```bash
npm test
```
Feel encouraged to submit additional platform specifications via Pull Requests.

This will spit out the same texts that you see online, and you can then go and fix your markdown or examples.
The `type` fields can be any value chosen by the implementor,
but the types defined in the table above should be used as defined in the best practices.
This ensures proper schema validation.

If the tests reveal formatting problems with the examples, you can fix them with:
```bash
npm run format-examples
```
## Contributing

See the [Contributor documentation](CONTRIBUTING.md) for details.
34 changes: 34 additions & 0 deletions examples/catalog-link.json
@@ -0,0 +1,34 @@
{
  "stac_version": "1.0.0",
  "stac_extensions": [
    "https://stac-extensions.github.io/storage/v2.0.0/schema.json"
  ],
  "type": "Catalog",
  "id": "20190822T183518Z_746_POM1_ST2_P",
  "title": "Example Catalog",
  "description": "An example catalog with a link to documentation on object storage.",
  "storage:schemes": {
    "aws": {
      "type": "aws-s3",
      "platform": "https://{bucket}.s3.{region}.amazonaws.com",
      "bucket": "mybucket",
      "region": "us-west-2",
      "requester_pays": true
    }
  },
  "links": [
    {
      "href": "https://example.com/examples/catalog-link.json",
      "rel": "self"
    },
    {
      "title": "Documentation",
      "href": "s3://mybucket/project/documentation.pdf",
      "type": "application/pdf",
      "rel": "about",
      "storage:refs": [
        "aws"
      ]
    }
  ]
}
78 changes: 78 additions & 0 deletions examples/collection.json
@@ -0,0 +1,78 @@
{
  "stac_version": "1.0.0",
  "stac_extensions": [
    "https://stac-extensions.github.io/storage/v2.0.0/schema.json",
    "https://stac-extensions.github.io/authentication/v1.1.0/schema.json"
  ],
  "type": "Collection",
  "id": "20190822T183518Z_746_POM1_ST2_P",
  "title": "Example Collection",
  "description": "An example catalog with a link to documentation on object storage.",
  "license": "CC-0",
  "storage:schemes": {
    "aws": {
      "type": "aws-s3",
      "platform": "https://{bucket}.s3.{region}.amazonaws.com",
      "bucket": "mybucket",
      "region": "us-west-2",
      "requester_pays": true,
      "tier": "Standard"
    }
  },
  "auth:schemes": {
    "aws": {
      "type": "s3"
    }
  },
  "assets": {
    "stac-items": {
      "title": "STAC Items as GeoParquet",
      "href": "s3://mybucket/project/items.parquet",
      "type": "application/vnd.apache.parquet",
      "storage:refs": [
        "aws"
      ],
      "auth:refs": [
        "aws"
      ]
    }
  },
  "links": [
    {
      "href": "https://example.com/examples/catalog-link.json",
      "rel": "self"
    },
    {
      "title": "Documentation",
      "href": "s3://mybucket/project/documentation.pdf",
      "type": "application/pdf",
      "rel": "about",
      "storage:refs": [
        "aws"
      ],
      "auth:refs": [
        "aws"
      ]
    }
  ],
  "extent": {
    "spatial": {
      "bbox": [
        [
          -180,
          -56,
          180,
          83
        ]
      ]
    },
    "temporal": {
      "interval": [
        [
          "2015-06-23T00:00:00Z",
          null
        ]
      ]
    }
  }
}