Feedback from MAST #54

astrojimig · 2024-05-31T14:36:24Z

Hello,

First of all, thank you for your continuing efforts curating this resource! I recently hosted a discussion with the MAST team here at STScI about this document, and have consolidated the feedback to present to you here. The following comments contain the feedback and thoughts from a number of scientists and developers at MAST, and are not solely my own. I can put you in touch with the relevant person if you have follow-up questions about any particular point.

Again, thank you so much for putting this together. We greatly appreciate the guidance, and look forward to a future of FAIR astronomical data!

Contact info:
Julie Imig ([email protected])
Susan Mullally (PI of MAST: [email protected])

Feedback organized by section:

Introduction:

Suggestion to include a ~1 sentence history of the FAIR principles, emphasizing that they pre-date SPD-41a and are not NASA-specific. There was some initial confusion about what the FAIR principles are and how they relate to SPD-41a, so it would be good to clarify that with more context.
The statement "Unless noted, nothing in this document is a formal requirement" seems to contradict the directives in SPD-41a, and needs more clarification. It would be good to add detail here and/or links to other resources clarifying exactly what is and isn't covered by the formal requirements of SPD-41a in this document. This is particularly needed for the points that might require more resources/funding to implement.
FAIRness is also a spectrum - data can meet a majority of the principles, but still not be completely FAIR. Is the goal to make all data 100% FAIR, or is it good enough to make data "as fair as possible" within the context of what resources are available and discipline-specific standards?

General Resources for Assessing FAIRness:

We echo the interest in seeing the development of NASA-specific and division-specific metrics for assessing FAIRness! It might be good to note that many of the resources for assessing FAIRness listed here, while a good starting point, seem too general to give an accurate assessment in the appropriate context (example: FITS files are a widely used standard in astronomy, but the F-UJI tool does not recognize that).
After the statement "we recognized that we would need to develop a NASA specific metric...has formed a task force to develop this metric", it would be great if you could add more detail, including what such tools might look like (will it be a document, a codebase, etc?) and an anticipated timeline for when such tools might be available.

Findable:

F1: Suggest adding more detail about what constitutes a "globally unique and persistent identifier". Not every dataset at MAST is born with a DOI; we do have unique URIs for every dataset, but it's not clear if those are sufficient to meet this standard. Would there be a benefit to creating a large high-level DOI for the entire Hubble Mission Archive or for the JWST Mission Archive? Is it recommended to create DOIs for the resources and documentation that contain the data, or only for the data itself?
F4: If the primary audience for this document is the data repositories, then the relevant "searchable resource" would, in many cases, be the services we provide. How do we ensure that our search interfaces comply with this standard?
F4: In "First Steps", suggestion to encourage cross-archive coordination and communication within divisions for the goal of "...aggregate metadata from multiple repositories"
Registering DOIs - the recommendation to "Register data DOIs with DataCite" would require resources/money for membership, as would some of the others. Suggestion to clarify what is required here under SPD-41a.

Accessible:

This is outside the scope of the FAIR principles, which are more machine-focused, but as part of open science it would be good to emphasize striving for human accessibility as well. There is a difference between making data available and making it accessible, and there may be additional steps beyond what is listed by the FAIR principles to ensure data is usable by a wide audience. For example, this document doesn't appear to include any details about making data documentation and webpages more accessible; additional resources and best practices on that topic would be good to include. One example could be making pages friendlier to screen readers.

Interoperable:

Where it says "When a discipline does not have an established vocabulary, effort should be made to add the needed terms to a currently existing vocabulary rather than creating a new one.", perhaps that might be an opportunity to also tack on something like "If existing vocabularies do not work for your use case, effort should be made to use the tools of your chosen data model to map concepts in new vocabularies to analogous concepts in existing vocabularies." MAST is putting significant effort into tagging data and mapping between our internal vocabulary and vocabularies such as the IVOA vocabularies (https://www.ivoa.net/rdf/), and the Unified Astronomical Thesaurus (https://astrothesaurus.org/).
Tentatively, as a non-expert: It's not totally clear why DCAT is given such prominence in "At a base level, repositories should use the W3C Data Catalog Vocabulary (DCAT) to describe their holdings." I might suggest either softening that statement / generalizing it to emphasize the importance of using a common published standard, or making a case to convince readers that DCAT is truly the best standard for interoperability. DCAT seems like an interesting semantic model that may be helpful in some situations, but to our knowledge it isn't widely used. (In astronomy, I suspect that the controlling factor may be whether the IVOA's investigations into DCAT yield action. But that linked presentation was from several months ago, so maybe they already have!)
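
To make the vocabulary-mapping suggestion above more concrete, here is a minimal sketch of what "mapping concepts in new vocabularies to analogous concepts in existing vocabularies" could look like. It is not MAST's actual implementation. It assumes SKOS mapping properties as the mapping mechanism, and every internal term, base URI, and `example.org` concept URI below is a hypothetical placeholder rather than a real UAT or IVOA identifier.

```python
from dataclasses import dataclass

# SKOS mapping properties from the W3C SKOS vocabulary.
SKOS = "http://www.w3.org/2004/02/skos/core#"
EXACT_MATCH = SKOS + "exactMatch"
BROAD_MATCH = SKOS + "broadMatch"


@dataclass(frozen=True)
class Mapping:
    internal_term: str  # hypothetical internal keyword
    relation: str       # SKOS mapping property URI
    external_uri: str   # placeholder URI, NOT a real UAT/IVOA concept


# Hypothetical mappings for illustration only.
MAPPINGS = [
    Mapping("exoplanet_transit", EXACT_MATCH,
            "https://example.org/vocab/transit-photometry"),
    Mapping("deep_field", BROAD_MATCH,
            "https://example.org/vocab/sky-surveys"),
]


def external_terms(internal_term, mappings=MAPPINGS):
    """Return the external concept URIs mapped to an internal term."""
    return [m.external_uri for m in mappings
            if m.internal_term == internal_term]


def to_ntriples(mappings, base="https://example.org/internal/"):
    """Serialize the mappings as N-Triples lines (base URI is assumed)."""
    return [f"<{base}{m.internal_term}> <{m.relation}> <{m.external_uri}> ."
            for m in mappings]


if __name__ == "__main__":
    for line in to_ntriples(MAPPINGS):
        print(line)
```

Publishing the mappings as triples, rather than keeping them in an internal lookup table, is one way to let other archives and aggregators reuse them.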

Reusable:

R1.3: The focus on the target community is important. There almost needs to be another "I" in FAIR, standing for "Interpretable". Metadata only gets you so far, and complete documentation and rich webpages are vital for interpreting that metadata and making it useful for users. We suggest supplying additional resources or best practices on that topic.
