Request for Release of Enhanced Theorem-Proving Dataset #2

PrithwishJana · 2024-08-21T03:05:56Z

Hi,

The paper looks impressive! Is there a plan to release the training dataset? I noticed that you used an enhanced theorem-proving dataset with 9,645k sequences, derived from DeepSeek-Prover-V1. Will the new dataset, including the natural language descriptions and intermediate tactic state information, be made available?

Additionally, I would greatly appreciate it if you could share the smaller dataset of 4.5k carefully selected instances used for reinforcement learning.

fzyzcjy · 2024-08-28T07:07:39Z

+1. Thanks, DeepSeek, for the great work! I would appreciate it if the dataset could be open-sourced.

aldopareja · 2024-10-14T17:15:23Z

Would also love to see this, or a way to synthetically generate the data. Thank you, and great work indeed!
