WIP/RFC: Add detailed reporting for debugging generation #6
This PR adds preliminary support for more detailed reporting of `plausible` testing success, via a tool called Tyche. Tyche is available as a VSCode extension or as a standalone application in the browser, and it allows developers to visualize the distribution of data used to test their code. Currently Tyche is supported in Haskell's QuickCheck, Python's Hypothesis, and Rocq's QuickChick, among others. You can read our paper about Tyche for more information.

You can try out this PR by downloading the Tyche extension in VSCode, adding
"tyche.observationGlobs": ["**/.lean/observations/*.jsonl"]
to your VSCode configuration, and then running the `plausible` test in the `test/Tyche.lean` file. You should see a new interface pop up in your sidebar, giving visual feedback about duplicate and given-up tests.

Currently this PR adds the bare minimum, but I'd love some comments from more experienced Lean developers on how to improve the integration and add advanced features. In particular, I have a few changes I'd like to make:
- `plausible` runs produce a really large amount of data. Ideally I'd like users to be able to generate the report as needed. Any thoughts on how to make that possible?

Let me know what you think! We've gotten really good feedback about how useful Tyche is for helping developers understand how confident they should be in their tests, and I think it'd be a great addition to `plausible`.
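
For anyone who wants to sanity-check the generated files without opening VSCode, here is a rough Python sketch that tallies the observations matched by the glob above. It is not part of this PR. The field names `type`, `status`, and `representation` are assumptions borrowed from the observation format Tyche reads from Hypothesis; the files written by this PR may use different keys, and `summarize_observations` is a hypothetical helper name.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_observations(root: str = ".") -> dict:
    """Count test-case statuses and repeated inputs across observation files.

    Assumes one JSON object per line, with "type", "status", and
    "representation" keys (an assumption, not confirmed by this PR).
    """
    statuses = Counter()
    representations = Counter()

    # Same pattern as the "tyche.observationGlobs" setting, relative to `root`.
    for path in sorted(Path(root).glob("**/.lean/observations/*.jsonl")):
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if not line:
                    continue  # tolerate blank lines
                record = json.loads(line)
                if record.get("type") != "test_case":
                    continue  # skip any non-test-case records
                statuses[record.get("status", "unknown")] += 1
                representations[record.get("representation")] += 1

    # A test case is a duplicate if its representation was already seen.
    duplicates = sum(count - 1 for count in representations.values())
    return {
        "total": sum(statuses.values()),
        "statuses": dict(statuses),
        "duplicates": duplicates,
    }


if __name__ == "__main__":
    print(summarize_observations("."))
```

Running this from the repository root after the `plausible` test should give a quick count of passed, failed, and given-up cases, which can be compared against what the Tyche sidebar shows.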