2024.04.19
Released by haesleinhuepf on 19 Apr, 15:14
This version of the benchmark was submitted as a preprint. A link will be added to the readme once it is out.
Most important changes
- Document details of our modifications to the HumanEval framework by @haesleinhuepf in #19
- copy example images to tempdir by @haesleinhuepf in #16
- add documentation how to add requirements by @haesleinhuepf in #41
- add dependencies which made some tests fail by @haesleinhuepf in #43
- add notebook for detecting missing requirements by @haesleinhuepf in #42
- add notebook to summarize common failure reasons by @haesleinhuepf in #51
- add notebook that summarizes which libraries were used in generated code by @haesleinhuepf in #54
- Rerun benchmark by @haesleinhuepf in #56
Changes to list of tested models
- adding gpt-4-turbo-2024-04-09 to tested models by @haesleinhuepf in #18
- The mistral models tested via the blablador infrastructure were temporarily removed from the list of tested models due to technical difficulties. See #55 for details.
New test-cases
- Add read_zarr test, add zarr dependency, add zarr example data by @tischi in #33
- Add test for radial intensity profile by @tischi in #40
- Add fit_circle test by @tischi in #38
- add test-case for tiled image processing by @haesleinhuepf in #27
- add test-case binary_skeleton by @haesleinhuepf in #28
- add simple image masking test case by @nscherf in #20
- added test-case combine-columns by @haesleinhuepf in #30
- add bland-altman test case by @haesleinhuepf in #31
- add test to load a nifti image by @nscherf in #50
- added test-case for using aicsimageio, example data, requirements by @haesleinhuepf in #48
Other changes
- Sample canonical by @haesleinhuepf in #13
- Better data visualization by @haesleinhuepf in #25
- fix typos in test-case names by @haesleinhuepf in #29
- rename read-... test case to open-... so that it fits better to others by @haesleinhuepf in #49
- Tex paper by @haesleinhuepf in #46
- add seaborn plots by @nscherf in #57
- revised main text by @haesleinhuepf in #58
New Contributors
Full Changelog: 2024.04.07...2024.04.19