diff --git a/book/_config.yml b/book/_config.yml index ba9b049..d00246d 100644 --- a/book/_config.yml +++ b/book/_config.yml @@ -34,6 +34,7 @@ execute: execute_notebooks: 'force' exclude_patterns: - "**/aviris-ng-data.ipynb" + - "**/earthaccess_icesat2.ipynb" allow_errors: false # Per-cell notebook execution limit (seconds) timeout: 300 diff --git a/book/_toc.yml b/book/_toc.yml index 02fb036..0b4e7d2 100644 --- a/book/_toc.yml +++ b/book/_toc.yml @@ -20,6 +20,14 @@ parts: - file: tutorials/index sections: - file: tutorials/example/tutorial-notebook + - file: tutorials/Data_access/index.md + title: Data Access + sections: + - file: tutorials/Data_access/overview.md + - file: tutorials/Data_access/NSIDC_resources.md + - file: tutorials/Data_access/earthdata_search.md + - file: tutorials/Data_access/earthaccess_snowex.ipynb + - file: tutorials/Data_access/earthaccess_icesat2.ipynb - file: tutorials/albedo/index title: Albedo sections: diff --git a/book/tutorials/Data_access/NSIDC_resources.md b/book/tutorials/Data_access/NSIDC_resources.md new file mode 100644 index 0000000..7e9464b --- /dev/null +++ b/book/tutorials/Data_access/NSIDC_resources.md @@ -0,0 +1,67 @@ +# Exploring NSIDC DAAC resources + +## Learning Objectives + +Explore various resources for learning about and accessing ICESat-2, SnowEx, and other NASA Earthdata. + +Credits: Mikala Beig, Gail Reckase, and the NSIDC DAAC Data Use and Education Team +___ + +Reach out to us with data discovery and access questions! Real people read the emails sent to nsidc@nsidc.org. We are here to help make sure you get the data you need for your analysis. +___ + +## NASA National Snow and Ice Data Center Distributed Active Archive Center (NSIDC DAAC) + +[The National Snow and Ice Data Center](https://nsidc.org) provides over 1100 data sets covering the Earth's cryosphere and more, all of which are available to the public free of charge. 
Beyond providing these data, NSIDC creates tools for data access, supports data users, performs scientific research, and educates the public about the cryosphere. + +![Map with NASA DAACs](./images/DAAC_map_with_ECS.jpg) + +## Data set documentation, tools, and services at nsidc.org: + +* [The NSIDC ICESat-2 home page](https://nsidc.org/data/icesat-2) provides an overview of the data products and available user resources. + * Associated access, visualization, and data customization tools and services are provided on the [ICESat-2 Tools page](https://nsidc.org/data/icesat-2/tools). +* [The NSIDC SnowEx home page](https://nsidc.org/data/snowex) provides an overview of the data products and available user resources. + * Associated access, visualization, and data customization tools and services are provided on the [SnowEx Tools page](https://nsidc.org/data/snowex/tools). +* Landing pages: Each data set has an associated landing page with citation information, a curated user guide, and support documentation. + * [ATL06 landing page](https://nsidc.org/data/atl06) + * [SNEX23_MAR22_SD landing page](https://nsidc.org/data/snex23_mar22_sd) + +![ATL06 landing page](./images/atl06_landing_page.png) + +## Data Exploration in Earthdata Search + +https://search.earthdata.nasa.gov/search + +Earthdata Search provides a graphical user interface for discovery of NASA data, and ordering and downloading data from its various archive locations. Earthdata Search leverages NASA's [Common Metadata Repository]( https://cmr.earthdata.nasa.gov/search/site/docs/search/api.html) (CMR), a high-performance, high-quality, continuously evolving metadata system that catalogs Earth Science data and associated service metadata records. These metadata records are registered, modified, discovered, and accessed through programmatic interfaces leveraging standard protocols and APIs. + +Key Functions of Earthdata Search: +1. 
Web mapping interface for discovering and visualizing NASA Earthdata using spatial and temporal filters. +2. Customization services, including spatial subsetting, reformatting, and reprojection for *some* datasets. +3. Data ordering and downloading. + +![SnowEx23 data set in Earthdata Search](./images/EDSC_snowex23.png) + +## ICESat-2 Data Exploration in OpenAltimetry + +https://openaltimetry.earthdatacloud.nasa.gov/data/ + +OpenAltimetry is a cyberinfrastructure platform for discovery, access, and visualization of data from NASA’s ICESat and ICESat-2 missions. The unique data from these missions require a new paradigm for data access and discovery. OpenAltimetry addresses the needs of a diverse scientific community and increases the accessibility and utility of these data for new users. OpenAltimetry is a NASA-funded collaborative project between the Scripps Institution of Oceanography, San Diego Supercomputer Center, National Snow and Ice Data Center, and UNAVCO. + +Key Functions of OpenAltimetry: +1. Ground track filtering and visualization +2. On-the-fly plotting of segment elevations and photon clouds based on date and region of interest +3. Access data in CSV or subsetted HDF5 format +4. Plot and analyze photon data from your area of interest using a Jupyter Notebook + +OpenAltimetry tutorials: + +[Presentation on ICESat-2 and OpenAltimetry](https://www.youtube.com/watch?v=gfOGz8kk4VI) by the NASA science education team. + +[OpenAltimetry Tutorial](https://www.youtube.com/watch?v=ZanKXh1oQYc) by Walt Meier, NSIDC DAAC Scientist. + +## Data Exploration using the SnowEx SQL Database + +GitHub repository: https://github.com/SnowEx/snowexsql + +## Hackweek Project: SnowEx Data Set Mapping Tool +SnowEx project led by Jesslyn Di Fiori, NSIDC. This project aims to build a map-based web tool for cross-referencing SnowEx data sets. 
diff --git a/book/tutorials/Data_access/earthaccess_icesat2.ipynb b/book/tutorials/Data_access/earthaccess_icesat2.ipynb new file mode 100644 index 0000000..3f74b2a --- /dev/null +++ b/book/tutorials/Data_access/earthaccess_icesat2.ipynb @@ -0,0 +1,413 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "76391091-054f-48d6-bc92-2e1c5f3f5024", + "metadata": { + "tags": [], + "user_expressions": [] + }, + "source": [ + "# Using `earthaccess` to Search for, Access and Open ICESat-2 Data in the Cloud\n", + " \n", + "## Tutorial Overview\n", + "\n", + "This notebook demonstrates how to search for, directly access, and work with cloud-hosted ICESat-2 Land Ice Height (ATL06) granules from an Amazon Elastic Compute Cloud (EC2) instance using the `earthaccess` package. Data in the \"NASA Earthdata Cloud\" are stored in Amazon Web Services (AWS) Simple Storage Service (S3) Buckets. **Direct Access** is an efficient way to work with data stored in an S3 Bucket when you are working in the cloud. Cloud-hosted granules can be opened and loaded into memory without the need to download them first. This allows you to take advantage of the scalability and power of cloud computing. \n", + "\n", + "As an example data collection, we use ICESat-2 Land Ice Height (ATL06) over the Juneau Icefield, AK, for March and April 2020. ICESat-2 data granules, including ATL06, are stored in HDF5 format. We demonstrate how to open an HDF5 granule and access data variables using `xarray`. Land Ice Heights are then plotted using `hvplot`. \n", + "\n", + "![ATL06 land ice height](./images/atl06_example_plot.png)\n", + "\n", + "We use `earthaccess`, a package developed by Luis Lopez (NSIDC developer) and a community of contributors, to allow easy search of the NASA Common Metadata Repository (CMR) and download of NASA data collections. It can be used for programmatic search and access for both _DAAC-hosted_ and _cloud-hosted_ data. 
It handles authentication with Earthdata Login credentials, which are then used to obtain the S3 tokens needed for S3 direct access. `earthaccess` can be used to find and access both DAAC-hosted and cloud-hosted data in just **three** lines of code. See [https://github.com/nsidc/earthaccess](https://github.com/nsidc/earthaccess).\n", + "\n", + "## Learning Objectives\n", + "\n", + "In this tutorial you will learn: \n", + "1. how to use `earthaccess` to search for ICESat-2 data using spatial and temporal filters and explore the search results; \n", + "2. how to open data granules using direct access to the ICESat-2 S3 bucket; \n", + "3. how to load an HDF5 group into an `xarray.Dataset`; \n", + "4. how to visualize the land ice heights using `hvplot`. \n", + "\n", + "## Prerequisites\n", + "\n", + "The workflow described in this tutorial forms the initial steps of an _Analysis in Place_ workflow that would be run on an AWS cloud compute resource. You will need:\n", + "\n", + "1. a JupyterHub, such as CryoHub, or an AWS EC2 instance in the us-west-2 region.\n", + "2. a NASA Earthdata Login. If you need to register for an Earthdata Login, see the [Getting an Earthdata Login](https://icesat-2-2023.hackweek.io/preliminary/checklist/earthdata.html#getting-an-earthdata-login) section of the ICESat-2 Hackweek 2023 Jupyter Book.\n", + "3. a `.netrc` file in your home directory that contains your Earthdata Login credentials. 
See [Configure Programmatic Access to NASA Servers](https://icesat-2-2023.hackweek.io/preliminary/checklist/earthdata.html#configure-programmatic-access-to-nasa-servers) to create a `.netrc` file.\n", + "\n", + "## Highly Recommended Viewing\n", + "\n", + "[earthaccess NASA Tech Spotlight video recording](https://www.youtube.com/watch?v=EIr3j1_wDc0)\n", + "\n", + "Watch a coding demonstration and learn about the history of earthaccess and the community that supports it.\n", + "\n", + "## Credits\n", + "\n", + "This notebook is based on an [NSIDC Data Tutorial](https://github.com/nsidc/NSIDC-Data-Tutorials) originally created by Luis Lopez and Mikala Beig, NSIDC, modified by Andy Barrett, NSIDC, and updated by Jennifer Roebuck, NSIDC." + ] + }, + { + "cell_type": "markdown", + "id": "b139d27d-fa64-47d2-9863-f5542897915b", + "metadata": { + "user_expressions": [] + }, + "source": [ + "## Computing Environment\n", + "\n", + "The tutorial uses `python` and requires the following packages:\n", + "- `earthaccess`, which handles Earthdata Login authentication, retrieves AWS credentials, and enables collection and granule searches and S3 access;\n", + "- `xarray`, used to load data;\n", + "- `hvplot`, used to visualize land ice height data.\n", + "\n", + "We are going to import the whole `earthaccess` package.\n", + "\n", + "We will also import the whole `xarray` package but use a standard short name `xr`, using the `import ... as ...` syntax. 
We could use anything for a short name but `xr` is an accepted standard that most `xarray` users are familiar with.\n", + "\n", + "We only need the `xarray` module from `hvplot` so we import that using the `import package.module` syntax.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3a3659f7-b59c-421e-bc66-d422ce320c32", + "metadata": { + "tags": [] + }, + "outputs": [], + "source": [ + "# For searching and accessing NASA data\n", + "import earthaccess\n", + "\n", + "# For reading data, analysis and plotting\n", + "import xarray as xr\n", + "import hvplot.xarray\n", + "\n", + "import pprint # For nice printing of python objects" + ] + }, + { + "cell_type": "markdown", + "id": "15ae2994", + "metadata": {}, + "source": [ + "## Authenticate\n", + "\n", + "The first step is to get the correct authentication to access _cloud-hosted_ ICESat-2 data. This is all done through Earthdata Login. The `login` method also gets the correct AWS credentials.\n", + "\n", + "Login requires your Earthdata Login username and password. The `login` method will automatically search for these credentials as environment variables or in a `.netrc` file, and if those aren't available it will prompt you to enter your username and password. We use a `.netrc` strategy here. A `.netrc` file is a text file located in your home directory that contains login information for remote machines. If you don't have a `.netrc` file, `login` can create one for you:\n", + "\n", + "```\n", + "earthaccess.login(strategy='interactive', persist=True)\n", + "```" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "37d6a667", + "metadata": { + "tags": [] + }, + "outputs": [], + "source": [ + "auth = earthaccess.login()" + ] + }, + { + "cell_type": "markdown", + "id": "28d7b582", + "metadata": {}, + "source": [ + "## Search for ICESat-2 Collections\n", + "\n", + "`earthaccess` leverages the Common Metadata Repository (CMR) API to search for collections and granules. 
[Earthdata Search](https://search.earthdata.nasa.gov/search) also uses the CMR API.\n", + "\n", + "We can use the `search_datasets` method to search for ICESat-2 collections by setting `keyword=\"ICESat-2\"`. The argument passed to `keyword` can be any string and can include wildcard characters `?` or `*`.\n", + "\n", + "```{note}\n", + "To see a full list of search parameters you can type `earthaccess.search_datasets?`. Using `?` after a Python object displays the `docstring` for that object.\n", + "```\n", + "\n", + "The number of data collections (datasets) found is reported." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4b6131f7-0f3c-4227-9301-618f364dcec6", + "metadata": { + "tags": [] + }, + "outputs": [], + "source": [ + "query = earthaccess.search_datasets(\n", + " keyword=\"ICESat-2\",\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "d3957627", + "metadata": {}, + "source": [ + "In this case, there are 69 datasets that have the keyword ICESat-2. \n", + "\n", + "`search_datasets` returns a Python list of `DataCollection` objects. We can view metadata for each collection in long form by passing a `DataCollection` object to `print`, or as a summary using the object's `summary` method. Here, we use the `pprint` function to _Pretty Print_ each summary." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f54b13d9", + "metadata": { + "tags": [] + }, + "outputs": [], + "source": [ + "for collection in query[:10]:\n", + " pprint.pprint(collection.summary(), sort_dicts=True, indent=4)\n", + " print('') # Add a space between collections for readability" + ] + }, + { + "cell_type": "markdown", + "id": "ea86f3e8", + "metadata": {}, + "source": [ + "For each collection, `summary` returns a subset of fields from the collection metadata and Unified Metadata Model (UMM) entry.\n", + "\n", + "- `concept-id` is a unique identifier for the collection that is composed of an alphanumeric code and the provider-id for the DAAC.\n", + "- `short-name` is the name of the dataset that appears on the dataset landing page. For ICESat-2, `ShortNames` are generally how different products are referred to.\n", + "- `version` is the version of each collection.\n", + "- `file-type` gives information about the file format of the collection files.\n", + "- `get-data` is a collection of URLs that can be used to access data, dataset landing pages, and tools. \n", + "\n", + "For _cloud-hosted_ data, there is additional information about the location of the S3 bucket that holds the data and where to get credentials to access the S3 buckets. In general, you don't need to worry about this information because `earthaccess` handles S3 credentials for you. Nevertheless, it may be useful for troubleshooting. \n", + "\n", + "```{note}\n", + "In Python, all data are represented by _objects_. These _objects_ contain both data and methods (think functions) that operate on the data. `earthaccess` includes `DataCollection` and `DataGranule` objects that contain data about collections and granules returned by `search_datasets` and `search_data` respectively. If you are familiar with Python, you will see that the data in each `DataCollection` object is organized as a hierarchy of Python dictionaries, lists and strings. 
So if you know the dictionary key for the metadata entry you want, you can get that metadata using standard dictionary methods. For example, to get the concept ID of a collection, you could just use `collection['meta']['concept-id']`. However, the `concept_id` method of the `DataCollection` object returns the same information. Take a look at https://github.com/nsidc/earthaccess/blob/main/earthaccess/results.py#L80 to see how this is done.\n", + "```" + ] + }, + { + "cell_type": "markdown", + "id": "b88357e5", + "metadata": {}, + "source": [ + "For the ICESat-2 search results the provider-id is either `NSIDC_ECS` or `NSIDC_CPRD`. `NSIDC_ECS` is for collections archived at the NSIDC DAAC and `NSIDC_CPRD` is for the _cloud-hosted_ collections. \n", + "\n", + "For ICESat-2, `short-name` refers to the following products. \n", + "\n", + "| ShortName | Product Description |\n", + "|:-----------:|:---------------------|\n", + "| ATL03 | ATLAS/ICESat-2 L2A Global Geolocated Photon Data |\n", + "| ATL06 | ATLAS/ICESat-2 L3A Land Ice Height |\n", + "| ATL07 | ATLAS/ICESat-2 L3A Sea Ice Height |\n", + "| ATL08 | ATLAS/ICESat-2 L3A Land and Vegetation Height |\n", + "| ATL09 | ATLAS/ICESat-2 L3A Calibrated Backscatter Profiles and Atmospheric Layer Characteristics |\n", + "| ATL10 | ATLAS/ICESat-2 L3A Sea Ice Freeboard |\n", + "| ATL11 | ATLAS/ICESat-2 L3B Slope-Corrected Land Ice Height Time Series |\n", + "| ATL12 | ATLAS/ICESat-2 L3A Ocean Surface Height |\n", + "| ATL13 | ATLAS/ICESat-2 L3A Along Track Inland Surface Water Data |" + ] + }, + { + "cell_type": "markdown", + "id": "fc62d6f6", + "metadata": {}, + "source": [ + "### Search for cloud-hosted data\n", + "\n", + "If you only want to search for data in the cloud, you can set `cloud_hosted=True`. 
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "322d78c3", + "metadata": { + "tags": [] + }, + "outputs": [], + "source": [ + "query = earthaccess.search_datasets(\n", + " keyword = 'ICESat-2',\n", + " cloud_hosted = True,\n", + ")\n" + ] + }, + { + "cell_type": "markdown", + "id": "8df10797", + "metadata": {}, + "source": [ + "## Search a data set using spatial and temporal filters \n", + "\n", + "Once you have identified the dataset you want to work with, you can use the `search_data` method to search a data set with spatial and temporal filters. As an example, we'll search for ATL06 granules over the Juneau Icefield, AK, for March and April 2020.\n", + "\n", + "Either `concept-id` or `short-name` can be used to search for granules from a particular dataset. If you use `short-name` you also need to set `version`. If you use `concept-id`, this is all that is required because `concept-id` is unique. \n", + "\n", + "The temporal range is given as a pair of standard date strings, and the bounding box is given by the longitude and latitude of its lower-left and upper-right corners. Polygons, points, and shapefiles can also be specified.\n", + "\n", + "This will display the number of granules that match our search. 
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5fba5c34", + "metadata": {}, + "outputs": [], + "source": [ + "results = earthaccess.search_data(\n", + " short_name = 'ATL06',\n", + " version = '006',\n", + " cloud_hosted = True,\n", + " bounding_box = (-134.7,58.9,-133.9,59.2),\n", + " temporal = ('2020-03-01','2020-04-30'),\n", + " count = 100\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "df899e61-77cd-461f-8b38-4a9e81688164", + "metadata": {}, + "outputs": [], + "source": [ + "results = earthaccess.search_data(\n", + " concept_id = 'C2564427300-NSIDC_ECS',\n", + " cloud_hosted = True,\n", + " bounding_box = (-134.7,58.9,-133.9,59.2),\n", + " temporal = ('2020-03-01','2020-04-30'),\n", + " count = 100\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "a7bc1b37", + "metadata": {}, + "source": [ + "We'll get metadata for these 4 granules and display it. The rendered metadata shows a download link, granule size and two images of the data.\n", + "\n", + "The download link is an `https` URL and can be used to download the granule to your local machine. This is similar to downloading _DAAC-hosted_ data but in this case the data are coming from the Earthdata Cloud. For NASA data in the Earthdata Cloud, there is no charge to the user for egress from AWS Cloud servers. This is not necessarily the case for other data in the cloud." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a04370d3", + "metadata": {}, + "outputs": [], + "source": [ + "[display(r) for r in results]" + ] + }, + { + "cell_type": "markdown", + "id": "810da59e", + "metadata": { + "tags": [] + }, + "source": [ + "## Use Direct-Access to open, load and display data stored on S3\n", + "\n", + "Direct access to data from an S3 bucket is a two-step process. First, the files are opened using the `open` method. This first step creates a Python file-like object that is used to load the data in the second step. 
\n", + "\n", + "Authentication is required for this step. The `auth` object created at the start of the notebook provides Earthdata Login authentication and AWS credentials \"_behind-the-scenes_\". These credentials expire after one hour, so `earthaccess.login()` must have been run within the hour before these next steps. \n", + "\n", + "```{note}\n", + "The `open` step to create a file-like object is required because AWS S3, and other cloud storage systems, use object storage but most HDF5 libraries work with POSIX-compliant file systems. POSIX stands for Portable Operating System Interface for Unix and is a set of guidelines that include how to interact with files and file systems. Linux, Unix, and macOS (which is Unix-like) are POSIX-compliant or largely so. Critically, POSIX-compliant systems allow blocks of bytes, or individual bytes, to be read from a file. With object storage the whole file has to be read. To get around this limitation, an intermediary is used, in this case `s3fs`. This intermediary creates a local POSIX-compliant virtual file system. S3 objects are loaded into this virtual file system so they can be accessed using POSIX-style file functions.\n", + "```\n", + "\n", + "In this example, data are loaded into an `xarray.Dataset`. Data could be read into `numpy` arrays or a `pandas.DataFrame`. However, each granule would have to be read using a package that reads HDF5 granules such as `h5py`. 
`xarray` does this all _under-the-hood_ in a single line but only for a single group in the HDF5 granule, in this case land ice heights for the gt1l beam*.\n", + "\n", + "*ICESat-2 measures photon returns from 3 beam pairs, numbered 1, 2 and 3, that each consist of a left and a right beam." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "11205bbb", + "metadata": {}, + "outputs": [], + "source": [ + "%%time\n", + "files = earthaccess.open(results)\n", + "ds = xr.open_dataset(files[1], group='/gt1l/land_ice_segments')" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "75881751", + "metadata": {}, + "outputs": [], + "source": [ + "ds" + ] + }, + { + "cell_type": "markdown", + "id": "1282ce34", + "metadata": {}, + "source": [ + "`hvplot` is an interactive plotting tool that is useful for exploring data." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "be7386c3", + "metadata": {}, + "outputs": [], + "source": [ + "ds['h_li'].hvplot(kind='scatter', s=2)" + ] + }, + { + "cell_type": "markdown", + "id": "000faab1-147a-435c-aa58-e11331cbd434", + "metadata": { + "user_expressions": [] + }, + "source": [ + "## Additional resources\n", + "\n", + "For general information about NSIDC DAAC data in the Earthdata Cloud: \n", + "\n", + "[FAQs About NSIDC DAAC's Earthdata Cloud Migration](https://nsidc.org/data/user-resources/help-center/faqs-about-nsidc-daacs-earthdata-cloud-migration)\n", + "\n", + "\n", + "Additional tutorials and How Tos:\n", + "\n", + "[NASA Earthdata Cloud Cookbook](https://nasa-openscapes.github.io/earthdata-cloud-cookbook/)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + 
"version": "3.11.9" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/book/tutorials/Data_access/earthaccess_snowex.ipynb b/book/tutorials/Data_access/earthaccess_snowex.ipynb new file mode 100644 index 0000000..69eb0ff --- /dev/null +++ b/book/tutorials/Data_access/earthaccess_snowex.ipynb @@ -0,0 +1,281 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Using `earthaccess` to Search for and Download SnowEx Data from the NSIDC DAAC Server\n", + " \n", + "## Tutorial Overview\n", + "\n", + "This notebook demonstrates how to search for and download NSIDC DAAC server hosted SnowEx 2023 data using the `earthaccess` package. SnowEx mission data have not yet migrated to the cloud and continue to be hosted at the NSIDC DAAC server. \n", + "\n", + "As an example data collection, we use SnowEx23 Mar23 IOP Snow Depth Measurements, Version 1 (Snex23_MAR23_SD) over the Alaska field sites. The data are stored in csv format with a metadata-rich header. \n", + "\n", + "We use `earthaccess`, an open source package developed by Luis Lopez (NSIDC developer) and a community of contributors, to allow easy search of the NASA Common Metadata Repository (CMR) and download of NASA data collections. It can be used for programmatic search and access for both _DAAC-hosted_ and _cloud-hosted_ data. It manages authenticating using Earthdata Login credentials. `earthaccess` can be used to find and access both DAAC-hosted and cloud-hosted data in just **three** lines of code. See [https://github.com/nsidc/earthaccess](https://github.com/nsidc/earthaccess).\n", + "\n", + "## Learning Objectives\n", + "\n", + "In this tutorial you will learn: \n", + "1. how to use `earthaccess` to search for SnowEx data using a spatial filter and explore the search results; \n", + "2. how to download data granules to your hub space or local machine. 
\n", + "\n", + "## Prerequisites\n", + "\n", + "The workflow described in this tutorial forms the initial steps of a _Download Model_ workflow that could be run on your local machine or on an AWS cloud compute resource. You will need:\n", + "\n", + "1. a NASA Earthdata Login. If you need to register for an Earthdata Login, see the [Getting an Earthdata Login](https://icesat-2-2023.hackweek.io/preliminary/checklist/earthdata.html#getting-an-earthdata-login) section of the ICESat-2 Hackweek 2023 Jupyter Book.\n", + "\n", + "## Highly Recommended Viewing\n", + "\n", + "[earthaccess NASA Tech Spotlight video recording](https://www.youtube.com/watch?v=EIr3j1_wDc0)\n", + "\n", + "Watch a coding demonstration and learn about the history of earthaccess and the community that supports it.\n", + " \n", + "## Credits\n", + "\n", + "This notebook is based on an [NSIDC Data Tutorial](https://github.com/nsidc/NSIDC-Data-Tutorials) originally created by Luis Lopez and Mikala Beig, NSIDC, modified by Andy Barrett, NSIDC, and updated by Jennifer Roebuck and Gail Reckase, NSIDC." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Programmatic data access via earthaccess\n", + "\n", + "[earthaccess](https://earthaccess.readthedocs.io/en/latest) is a software library that provides an easy-to-use Python wrapper around the CMR and NASA DAAC APIs. Strengths of `earthaccess` include: \n", + "1. Cross-DAAC data access (it's not specific to NSIDC). \n", + "2. Access to DAAC server and cloud-hosted data. \n", + "3. 
Easy authentication handling for Earthdata Login and AWS access keys and tokens.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# For searching NASA data\n", + "import earthaccess\n", + "\n", + "# For nice printing of Python objects\n", + "import pprint" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Step 1: Authenticate\n", + "\n", + "The first step is to get the correct authentication that will allow us to access NASA Earthdata. This is all done through Earthdata Login. For data that have been migrated to NASA's Earthdata Cloud environment, the `login` method also retrieves and handles the correct AWS credentials.\n", + "\n", + "Login requires our Earthdata Login username and password. The `login` method will automatically search for these credentials as environment variables or in a `.netrc` file, and if those aren't available it will prompt us to enter our username and password. We use a `.netrc` strategy. A `.netrc` file is a text file located in our home directory that contains login information for remote machines. If we don't have a `.netrc` file, `login` can create one for us:\n", + "\n", + "```\n", + "earthaccess.login(strategy='interactive', persist=True)\n", + "```" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "auth = earthaccess.login()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Step 2: Search for Collections (Data sets)\n", + "\n", + "`earthaccess` leverages the Common Metadata Repository (CMR) API to search for collections and granules. 
[Earthdata Search](https://search.earthdata.nasa.gov/search) also uses the CMR API.\n", + "\n", + "We can use the `search_datasets` method to search for SnowEx collections by setting `keyword='SnowEx'`. Setting `provider='NSIDC_ECS'` restricts the search to collections hosted on the NSIDC DAAC server.\n", + "\n", + "This will display the number of data collections (data sets) that meet these search criteria." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "Query = earthaccess.search_datasets(keyword = 'SnowEx', provider = 'NSIDC_ECS')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The `search_datasets` method returns a Python list of `DataCollection` objects. We can view the metadata for each collection in long form by passing a `DataCollection` object to `print`, or as a summary using the `summary` method. We can also use the `pprint` function to Pretty Print each object.\n", + "\n", + "We will do this for the first 10 results (objects)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "for collection in Query[:10]:\n", + " pprint.pprint(collection.summary(), sort_dicts=True, indent=4)\n", + " print('')\n", + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "For each collection, `summary` returns a subset of fields from the collection metadata and the Unified Metadata Model (UMM):\n", + "- `concept-id` is a unique id for the collection. It consists of an alphanumeric code and the provider-id specific to the DAAC (Distributed Active Archive Center). You can use the `concept_id` to search for data granules.\n", + "- `short-name` is a quick way of referring to a collection (instead of using the full title). It can be found on the collection landing page underneath the collection title after 'DATA SET ID'. 
See the table below for a list of the short names for SnowEx collections.\n", + "- `version` is the version of each collection.\n", + "- `file-type` gives information about the file format of the collection granules.\n", + "- `get-data` is a collection of URLs that can be used to access the data, collection landing pages and data tools. \n", + "- `cloud-info` is for cloud-hosted data and provides additional information about the location of the S3 bucket that holds the data and where to get temporary AWS S3 credentials to access the S3 buckets. `earthaccess` handles these credentials and the links to the S3 buckets, so in general you won't need to worry about this information. \n", + "\n", + "Notice that the `concept-id` contains information about the location of the data: `NSIDC_ECS` or `NSIDC_CPRD`. `NSIDC_ECS` refers to the NSIDC local server and `NSIDC_CPRD` refers to NSIDC's _Earthdata Cloud-hosted_ collections, which you will see for the ICESat-2 collections currently in the cloud. \n", + "\n", + "A `concept-id` is a unique identifier for a collection. 
Alternatively, you can specify a `short_name` (and a `version` if there are multiple public versions of a data set).\n", + "\n", + "SnowEx products are generally referred to by their short names.\n", + "\n", + "| ShortName | Product Description |\n", + "|:--------------|:---------------------|\n", + "| SNEX21_TS_SP | SnowEx21 Time Series Snow Pits |\n", + "| SNEX23_SSA_SO | SnowEx23 Laser Snow Microstructure Specific Surface Area Snow-off Data |\n", + "| SNEX23_Lidar | SnowEx23 Airborne Lidar-Derived 0.25M Snow Depth and Canopy Height |\n", + "| SNEX23_Lidar_Raw | SnowEx23 Airborne Lidar Scans Raw |\n", + "| SNEX23_BCEF_TLS | SnowEx23 Bonanza Creek Experimental Forest Terrestrial Lidar Scans |\n", + "| SNEX23_BCEF_TLS_Raw | SnowEx23 Bonanza Creek Experimental Forest Terrestrial Lidar Scans Raw |\n", + "| SNEX23_SWE | SnowEx23 Snow Water Equivalent |\n", + "| SNEX23_MAR23_SD | SnowEx23 Mar23 IOP Community Snow Depth Measurements |\n", + "| SNEX23_SSA | SnowEx23 Laser Snow Microstructure Specific Surface Area Data |\n", + "| SNEX23_UW_GPR | SnowEx23 University of Wyoming Ground Penetrating Radar |" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Narrow Search using a Spatial Filter\n", + "\n", + "Here we are going to use a bounding box, given as (west, south, east, north) in decimal degrees, for the Alaska study areas to find SnowEx collections." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "Query = earthaccess.search_datasets( \n", + " keyword='SnowEx',\n", + " bounding_box=(-149.597,64.699,-147.49,70.085),\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Eleven datasets were found within this bounding box.
As we did above, we will print a summary of each dataset found within this bounding box." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "for collection in Query[:11]:\n", + " pprint.pprint(collection.summary(), sort_dicts=True, indent=4)\n", + " print('')\n", + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Step 3: Search for data granules (files)\n", + "First we will search for granules in the SnowEx23 Mar23 IOP Community Snow Depth Measurements collection using its short name: `SNEX23_MAR23_SD`. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "results = earthaccess.search_data(\n", + " short_name='SNEX23_MAR23_SD',\n", + " bounding_box=(-149.597,64.699,-147.49,70.085)\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "results" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Step 4: Download Data Locally\n", + "In this section, we will download the granule found above from the SnowEx23 Mar23 IOP Community Snow Depth Measurements collection to the local machine.
\n", + "\n", + "We'll download the file into the `/tmp` directory; `earthaccess` will create the download directory if it doesn't already exist." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "downloaded_files = earthaccess.download(\n", + " results,\n", + " local_path='/tmp',\n", + ")" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.9" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/book/tutorials/Data_access/earthdata_search.md b/book/tutorials/Data_access/earthdata_search.md new file mode 100644 index 0000000..f89ee72 --- /dev/null +++ b/book/tutorials/Data_access/earthdata_search.md @@ -0,0 +1,56 @@ +# Using NASA Earthdata Search to Discover Data + +## Learning Objective + +In this tutorial you will learn how to: + +- discover and download datasets hosted on DAAC servers using NASA Earthdata Search +- discover cloud-hosted datasets using NASA Earthdata Search +- get AWS S3 credentials so you can access data in Earthdata Cloud +- get the S3 links to data granules + +## Prerequisites + +- an Earthdata Login + +## Overview + +NASA Earthdata Search is a web-based tool to discover, filter, visualize and access all of NASA's Earth science data, both in Earthdata Cloud and archived at the NASA DAACs. It is a useful first step in data discovery, especially if you are not sure what data is available for your research problem. 
+ +## Searching for and downloading NASA DAAC server hosted data using Earthdata Search + +Follow the instructions in this NSIDC help article: [Search, Order, and Customize NSIDC DAAC Data Using NASA Earthdata Search](https://nsidc.org/data/user-resources/help-center/search-order-and-customize-nsidc-daac-data-nasa-earthdata-search). + +## Searching for cloud-hosted data using Earthdata Search + +### Search for Data + +Step 1. Go to https://search.earthdata.nasa.gov and log in using your Earthdata Login credentials by clicking on the Earthdata Login button in the top-right corner. + +Step 2. Check the **Available in Earthdata Cloud** box in the **Filter Collections** sidebar on the left of the page (Box 1 in the screenshot below). The Matching Collections will appear in the results box. All datasets in Earthdata Cloud have a badge showing a cloud symbol and "Earthdata Cloud" next to them. To narrow the search to datasets supported by NSIDC, type NSIDC in the search box (Box 2 in the screenshot below). You can narrow the search further using spatial and temporal filters, or any of the other filters in the Filter Collections sidebar. + +![Search for Cloud Datasets in Earthdata Search](./images/Screenshot_EDSC_Searching_Cloud_Datasets.png) + +Step 3. You can now select the dataset you want by clicking on that dataset. The Search Results box now contains granules that match your search. The location of these granules is shown on the map. The search can be refined using spatial and temporal filters, or you can select individual granules using the "+" symbol on each granule search result. Once you have the data you want, click the **Download All** button (Box 1 in the screenshot below). In the sidebar that appears, select **Direct Download** (Box 2 in the screenshot below). Then select **Download Data**. 
+ +![Getting S3 links](./images/Screenshot_EDSC_getting_s3_links_workflow.png) + + +### Getting https download urls, S3 links, and AWS S3 Credentials + +Step 4. A Download Status window similar to the one shown below will appear (this may take a short time). NASA will continue to support free download of data via https links. Click an https link to download a single file, click the "Download Files" button to download the entire list, or go to the Download Script tab and use the provided download script. + +![Getting https download urls](./images/EDSC_https_download_urls.png) + +Step 5. For direct S3 access, you will need the S3 object links. You will see a tab for **AWS S3 Access** (Box 1 in the screenshot below). Select this tab. A list of S3 links starting with `s3://` will be in the box below. You can save them to a text file or copy them to your clipboard using the **Save** and **Copy** buttons (Box 2 in the screenshot below). Or you can copy each link separately by hovering over a link and clicking the clipboard icon (Box 3). + +Step 6. For direct S3 access to data in Earthdata Cloud, you need three AWS S3 credentials: “accessKeyId”, “secretAccessKey”, and “sessionToken”. These are temporary credentials that last for one hour. To get them, click on the **Get AWS S3 Credentials** button (Box 4 in the screenshot below). This will open a new page that contains the three credentials. + +![S3 links and AWS credentials](./images/Screenshot_EDSC_S3_links_credentials.png) + +You now have the information you need to access data in NASA Earthdata Cloud. + +## Next up! Using the earthaccess Python library to access data. 
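For comparison, here is a minimal sketch of what handling the S3 links from Step 5 and the temporary credentials from Step 6 by hand could look like. It is an illustration under stated assumptions, not part of Earthdata Search: the helper names `split_s3_link` and `to_s3fs_kwargs`, the bucket and object names, and the credential values are all hypothetical placeholders. The `key`, `secret` and `token` names follow the keyword arguments of the third-party `s3fs` package, which is assumed here and appears only in a comment.

```python
from urllib.parse import urlparse


def split_s3_link(link):
    """Split an s3:// link into its bucket name and object key."""
    parsed = urlparse(link)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 link: {link!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def to_s3fs_kwargs(credentials):
    """Map the three temporary credentials to s3fs-style keyword names."""
    return {
        "key": credentials["accessKeyId"],
        "secret": credentials["secretAccessKey"],
        "token": credentials["sessionToken"],
    }


# Hypothetical placeholders: paste a real link and real credentials here.
example_link = "s3://example-bucket/path/to/granule.h5"
example_credentials = {
    "accessKeyId": "PLACEHOLDER_ACCESS_KEY_ID",
    "secretAccessKey": "PLACEHOLDER_SECRET_ACCESS_KEY",
    "sessionToken": "PLACEHOLDER_SESSION_TOKEN",
}

bucket, key = split_s3_link(example_link)
storage_options = to_s3fs_kwargs(example_credentials)

# Assumption: with s3fs installed and code running in AWS us-west-2,
# something like the following would open the object directly:
#   import s3fs
#   fs = s3fs.S3FileSystem(**storage_options)
#   with fs.open(example_link, "rb") as f:
#       header = f.read(8)
```

Because the credentials expire after one hour, a long-running workflow would have to request new ones and rebuild these options.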
+earthaccess can access data from NASA DAAC servers, as well as NASA Earthdata Cloud, and *you don't need to worry about S3 links or credentials - it's all handled for you.* + + diff --git a/book/tutorials/Data_access/images/DAAC_map_with_ECS.jpg b/book/tutorials/Data_access/images/DAAC_map_with_ECS.jpg new file mode 100644 index 0000000..3a0e5c8 Binary files /dev/null and b/book/tutorials/Data_access/images/DAAC_map_with_ECS.jpg differ diff --git a/book/tutorials/Data_access/images/EDSC_https_download_urls.png b/book/tutorials/Data_access/images/EDSC_https_download_urls.png new file mode 100644 index 0000000..91d2861 Binary files /dev/null and b/book/tutorials/Data_access/images/EDSC_https_download_urls.png differ diff --git a/book/tutorials/Data_access/images/EDSC_snowex23.png b/book/tutorials/Data_access/images/EDSC_snowex23.png new file mode 100644 index 0000000..3d41a96 Binary files /dev/null and b/book/tutorials/Data_access/images/EDSC_snowex23.png differ diff --git a/book/tutorials/Data_access/images/Screenshot_EDSC_S3_links_credentials.png b/book/tutorials/Data_access/images/Screenshot_EDSC_S3_links_credentials.png new file mode 100644 index 0000000..cc9bef2 Binary files /dev/null and b/book/tutorials/Data_access/images/Screenshot_EDSC_S3_links_credentials.png differ diff --git a/book/tutorials/Data_access/images/Screenshot_EDSC_Searching_Cloud_Datasets.png b/book/tutorials/Data_access/images/Screenshot_EDSC_Searching_Cloud_Datasets.png new file mode 100644 index 0000000..832d41e Binary files /dev/null and b/book/tutorials/Data_access/images/Screenshot_EDSC_Searching_Cloud_Datasets.png differ diff --git a/book/tutorials/Data_access/images/Screenshot_EDSC_getting_s3_links_workflow.png b/book/tutorials/Data_access/images/Screenshot_EDSC_getting_s3_links_workflow.png new file mode 100644 index 0000000..5eb78ee Binary files /dev/null and b/book/tutorials/Data_access/images/Screenshot_EDSC_getting_s3_links_workflow.png differ diff --git 
a/book/tutorials/Data_access/images/atl06_example_plot.png b/book/tutorials/Data_access/images/atl06_example_plot.png new file mode 100644 index 0000000..87e6764 Binary files /dev/null and b/book/tutorials/Data_access/images/atl06_example_plot.png differ diff --git a/book/tutorials/Data_access/images/atl06_landing_page.png b/book/tutorials/Data_access/images/atl06_landing_page.png new file mode 100644 index 0000000..ab907f5 Binary files /dev/null and b/book/tutorials/Data_access/images/atl06_landing_page.png differ diff --git a/book/tutorials/Data_access/images/data_provider_cheat_sheet.png b/book/tutorials/Data_access/images/data_provider_cheat_sheet.png new file mode 100644 index 0000000..49077ca Binary files /dev/null and b/book/tutorials/Data_access/images/data_provider_cheat_sheet.png differ diff --git a/book/tutorials/Data_access/images/discovery_and_access_methods.png b/book/tutorials/Data_access/images/discovery_and_access_methods.png new file mode 100644 index 0000000..82783d3 Binary files /dev/null and b/book/tutorials/Data_access/images/discovery_and_access_methods.png differ diff --git a/book/tutorials/Data_access/images/nsidc_logo.png b/book/tutorials/Data_access/images/nsidc_logo.png new file mode 100644 index 0000000..35a06e0 Binary files /dev/null and b/book/tutorials/Data_access/images/nsidc_logo.png differ diff --git a/book/tutorials/Data_access/index.md b/book/tutorials/Data_access/index.md new file mode 100644 index 0000000..96ac0e5 --- /dev/null +++ b/book/tutorials/Data_access/index.md @@ -0,0 +1,11 @@ +# Data Access and Formats + +There is a host of different ways to access NASA Earth observing data. This section provides a map to a selection of data access methods and user resources. At the hackweek, the focus is on accessing data from a cloud compute instance in AWS us-west-2. However, some of the access methods and tools can be used from a local machine as well. 
+ +The tutorials are organized as follows: + +- [Overview](overview.md) provides an introduction to data access tools and offers some guidance on the capabilities and applicability of the different tools. +- [NSIDC DAAC and NASA Resources](NSIDC_resources.md) explores various resources for learning about and accessing ICESat-2, SnowEx, and other NASA Earthdata. +- [Using NASA Earthdata Search to Discover Cloud-Hosted Data](earthdata_search.md) describes how to use the Earthdata Search graphical interface to search for NASA data. This is probably the simplest way to search for ICESat-2, SnowEx and other NASA data. +- [Using `earthaccess` to Search for, Access and Download SnowEx Data in the Cloud](earthaccess_snowex.ipynb) presents a Python package to search for NASA datasets and granules, and to download those granules. +- [Using `earthaccess` to Search for, Access and Open ICESat-2 Data in the Cloud](earthaccess_icesat2.ipynb) presents a Python package to search for NASA datasets and granules, and to download those granules or, if you are working on a cloud compute instance in AWS `us-west-2`, open data files directly from cloud storage. diff --git a/book/tutorials/Data_access/overview.md b/book/tutorials/Data_access/overview.md new file mode 100644 index 0000000..8c21897 --- /dev/null +++ b/book/tutorials/Data_access/overview.md @@ -0,0 +1,48 @@ +# Data Discovery and Access: Overview + +## Learning Outcomes + +The purpose of this overview is to introduce some of the data search and access options for ICESat-2 and other NASA data. + +## Prerequisites + +None +## Credits + +Andy Barrett, NSIDC DAAC + +## Modes of Data Access + +In the past, most of our scientific data analysis workflows have started with searching for data and then downloading that data to a local machine, whether that is the hard drive of your laptop or workstation, or some shared storage device hosted by your institution or research group. 
This can be a time-consuming process if the volume of data is large, even with fast internet. It also requires that you have sufficient disk space. If you want to work with data from different geoscience domains, you may have to download data from several data centers. I'll call this mode the **download model** of data access. + +However, change is afoot. New modes of data access are becoming available. Driven by the growth in the volume of data from future satellite missions, the archiving and distribution of NASA data is in a [state of transition](https://www.earthdata.nasa.gov/eosdis/cloud-evolution). Over the next few years, all NASA data will be migrated to the NASA Earthdata Cloud, a cloud-hosted data store that will have all NASA datasets in one place. This not only offers new modes of accessing NASA data but also offers new ways of working with this data. As with Google Docs or Sheets, the data is not just stored in the cloud: compute resources offered by cloud providers allow you to process and analyze it there. When you edit your Google Doc or Sheet, you are working in the cloud, not on your computer. All you need is a web browser; you can work with these files on your laptop, tablet or even your phone. If you choose to share these documents with others, they can actively collaborate with you on the same document, also in the cloud. For large geoscience datasets, this means you can _skip the download_ and take your _analysis to the data_. I'll call this data access mode **analysis in place**. + +A third mode of access can be considered a hybrid of the **download model** and **analysis in place**. I'll call this **data as a service**. Often, we only need a subset of the data in a file: data for a selected spatial region or time period, or only one or two variables. 
Web services like SlideRule have subsetting built into most of their APIs, and protocols such as OPeNDAP have allowed subsetting for a long time. By using cloud compute resources, we allow software to be run as a service to access and process cloud-hosted data, and then serve the processed data to a user as a subsetted and aggregated file. + +During this transition period, data will be available both from the NASA DAACs (Distributed Active Archive Centers) that have archived and distributed data for over 20 years, and from cloud-hosted storage known as the Earthdata Cloud as data sets are migrated. ICESat-2 data sets were some of the first data to be migrated to the cloud. All Level-2 (e.g. ATL03 and beyond) ICESat-2 datasets are available in Earthdata Cloud. + +"The Cloud" is a somewhat nebulous term (pun intended). In general, the cloud is a network of remote servers that run software and services that are accessed over the internet. There is a growing number of commercial cloud providers (Google Cloud Services, Amazon Web Services, Microsoft Azure). NASA has contracted with Amazon Web Services (AWS) to host data using the AWS Simple Storage Service (S3). AWS offers a large number of services in addition to S3 storage. A key service is Amazon Elastic Compute Cloud (Amazon EC2). This is the service that is _under-the-hood_ of the CryoCloud JupyterHub you are using during this Hackweek. When you start a JupyterHub, an EC2 _instance_ is started. You can think of an EC2 _instance_ as a remote computer. + +AWS has the concept of a region, which is a cluster of data centers. These data centers house the servers that run S3 and EC2 instances. NASA Earthdata Cloud is hosted in the us-west-2 region. This is important because if your EC2 instance is in the same region as the Earthdata Cloud S3 storage, you can access data in S3 directly in a way that is analogous to accessing a file on your laptop's or workstation's hard drive. 
This is one of the key advantages of working in the cloud; you can do analysis where the data is stored without having to download the data to a local machine. + + +```{table} Data Access Method and Tools +:name: data-access-overview-table + +| | `icepyx` | `earthaccess` | Sliderule | OpenAltimetry | NASA Earthdata Search | NSIDC data product pages | +|:--- |:---:|:---:|:---:|:---:|:---:|:---:| +| Filter Spatially using: | | | | | | | +| Interactive map widget | | | x | x | x | x | +| Bounding Box | x | x | x | x | x | x | +| Polygon | x | x | x | | x | x | +| GeoJSON or Shapefile | x | | x | | x | x | +| Filter by time and date | x | x | x | x | x | x | +| Preview data | x | x | | x | x | x | +| Download data from DAAC | x | x | | x | x | x | +| Access cloud-hosted data | x | x | x | | x | | +| All ICESat-2 data | x | x | | | x | x | +| Subset (spatially, temporally, by variable) | x | | x | x | _x_ | | +| Load data by direct-access | x | x | x | | | | +| Process and analyze data | | | x | | | | +| Plot data with built-in methods | x | | x | x | | | +``` diff --git a/conda/README.md b/conda/README.md index 12e37e0..935d71b 100644 --- a/conda/README.md +++ b/conda/README.md @@ -1,6 +1,6 @@ # Conda environment management -**The only file you should need to edit in this folder is `conda/environment.yml`. This file defines the set of conda-packages needed to render the full website.** +**The only file you should need to manually edit in this folder is `conda/environment.yml`. This file defines the set of conda-packages needed to render the full website.** Although we refer to "conda" environments, we recommend using [mamba](https://github.com/mamba-org/mamba) as a drop in replacement for the `conda` package manager. Mamba performs operations in parallel, which we've found to be important for creating complex hackweek environments involving many packages!