For a given float WMO, a new class to easily open netcdf dataset for local and remote GDAC #429

Open
wants to merge 12 commits into
base: master

gmaze
Member

@gmaze gmaze commented Jan 8, 2025

This PR provides a new float store that delegates to argopy all access/read methods for Argo netCDF files, for both local and remote GDAC content.

This new data store, ArgoFloat, aims to give third-party libraries, as well as operator and expert workflows, methods to access local or remote Argo GDAC netCDF files without the burden of handling transfer protocols and GDAC paths.

One-line API:

from argopy import ArgoFloat
ds = ArgoFloat(6903091).open_dataset('prof')

More detailed expected API:

from argopy import ArgoFloat
WMO = 6903091

af = ArgoFloat(WMO)  # Use argopy 'gdac' option by default
af = ArgoFloat(WMO, host='/home/ref-argo/gdac')  # Use your local GDAC copy
af = ArgoFloat(WMO, host='https')  # Shortcut for https://data-argo.ifremer.fr
af = ArgoFloat(WMO, host='ftp')    # Shortcut for ftp://ftp.ifremer.fr/ifremer/argo
af = ArgoFloat(WMO, host='s3')     # Shortcut for s3://argo-gdac-sandbox/pub

# Load any netcdf files from this float:
ds = af.open_dataset('meta') # load <WMO>_meta.nc
ds = af.open_dataset('prof') # load <WMO>_prof.nc
ds = af.open_dataset('tech') # load <WMO>_tech.nc
ds = af.open_dataset('Rtraj') # load <WMO>_Rtraj.nc

# List all available datasets for this float:
af.list_dataset()
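
To make the host shortcuts and file naming above concrete, here is a minimal sketch of how a keyword could be expanded to a GDAC root and how a dataset short name maps to a file name. The names HOST_SHORTCUTS, resolve_host and dataset_filename are hypothetical and only mirror the comments in the example above; this is not how argopy implements it.

```python
# Hypothetical helpers for illustration only; not part of the argopy API.
# The URLs are copied from the comments in the expected API above.
HOST_SHORTCUTS = {
    "https": "https://data-argo.ifremer.fr",
    "ftp": "ftp://ftp.ifremer.fr/ifremer/argo",
    "s3": "s3://argo-gdac-sandbox/pub",
}


def resolve_host(host: str) -> str:
    """Expand a shortcut keyword to a GDAC root; pass any other value through."""
    return HOST_SHORTCUTS.get(host, host)


def dataset_filename(wmo: int, name: str) -> str:
    """Build the <WMO>_<name>.nc file name used for float datasets."""
    return f"{wmo}_{name}.nc"


if __name__ == "__main__":
    print(resolve_host("https"))
    print(resolve_host("/home/ref-argo/gdac"))  # local paths are left untouched
    print(dataset_filename(6903091, "prof"))
```

A plain dictionary lookup with a pass-through default keeps local paths and full URLs usable as host values without any special casing.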

Once merged, #385 and #423 will provide lazy methods for this class.

This new class will provide many more attributes and methods. The API design is laid out here: https://gist.github.com/gmaze/6924fc26405f42aa36fd5a22a64657ea

Online documentation can be found here:
https://argopy--429.org.readthedocs.build/en/429/generated/argopy.ArgoFloat.html

@gmaze gmaze added the enhancement New feature or request label Jan 8, 2025
@gmaze gmaze self-assigned this Jan 9, 2025
@gmaze gmaze requested review from quai20 and CKermabon January 9, 2025 14:40
@gmaze gmaze marked this pull request as ready for review January 9, 2025 16:18
@CKermabon

I have started testing (with Python 3.10.13).
I have a directory that contains only WMO sub-directories.
In that case, the command 'af = ArgoFloat(WMO, host='/Users/chemon/ARGO_NEW/NEW_LOCODOX/data_test')'
raises this error:
"GdacPathError: Index file does not exist: /Users/chemon/ARGO_NEW/NEW_LOCODOX/data_test/ar_index_global_prof.txt"

After discussing with Kevin, this is probably because my local directory is not a GDAC-compatible directory.
Indeed, if I point the host to the Ifremer GDAC directory instead of my local directory, it works.
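
A quick pre-check along these lines could tell whether a local folder will be accepted as a host. It only tests for the index file named in the error message above; the helper name looks_like_gdac is hypothetical, and argopy may check more than this.

```python
from pathlib import Path

# File name taken from the GdacPathError message above.
INDEX_FILE = "ar_index_global_prof.txt"


def looks_like_gdac(root: str) -> bool:
    """Return True if `root` holds the profile index file at its top level."""
    return (Path(root) / INDEX_FILE).is_file()


if __name__ == "__main__":
    # Example path from the report above; replace with your own folder.
    print(looks_like_gdac("/Users/chemon/ARGO_NEW/NEW_LOCODOX/data_test"))
```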

Apart from that, ds = af.open_dataset('Sprof') raises an error:
ValueError: found the following matches with the input file in xarray's IO backends: ['h5netcdf']. But their dependencies may not be installed, see:
https://docs.xarray.dev/en/stable/user-guide/io.html
https://docs.xarray.dev/en/stable/getting-started-guide/installing.html
I get this message only for the Sprof file.
Note that if I open the same file with xr.open_dataset and the option engine='argo', it works.
For the other files (tech/meta/prof/Rtraj), whether I open them with ArgoFloat or with xr.open_dataset(engine='argo'), the resulting datasets are identical.
I tested 4 methods (all except s3; I would need to install s3fs).
For the tech/meta/prof/Rtraj files, everything is OK.
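
Since the ValueError names h5netcdf as the matching xarray backend, one thing worth checking is whether that package can be imported in the test environment. The sketch below uses only the standard library; the helper name missing_backends is hypothetical, and a missing package is only a possible cause, not a confirmed one.

```python
from importlib.util import find_spec


def missing_backends(candidates=("h5netcdf", "netCDF4")):
    """Return the names in `candidates` that cannot be imported here."""
    return [name for name in candidates if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_backends()
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All candidate backends are importable.")
```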
