
I am happy to announce that the ENIGMA system (1,2,3,4,5,12,18,25) of the E (8,9,13) lineage, helped by its Deepire (22,23,24) Vampiric (26,27) cousin, has today (July 26, 2021) reached the landmark of 75% of the Mizar top-level problems proved automatically.

This is measured in a setting in which the premises for the proof can be selected from the library either by a human or by a machine.

ENIGMA was initially trained in several iterations from scratch (zero initial proofs) on the 57880 Mizar/MPTP/Mizar40 (10,11,14) top-level problems with human premise selection, gradually solving 65.65% of them by June 2020. We then also used E's auto-schedule, many E strategies invented by BliStr/Tune (15,16,17), MaLARea/ATPBoost-style (6,7) premise selection (using lgb, gnn, xgb, rnn and transformer models 6,19,20,21) for some runs, and heuristic premise minimization based on the structure of subproblems.
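The iterative prove-then-train loop above can be sketched as a toy simulation. Everything here is an illustrative assumption: the `prove` stand-in, the `strength` counter, and all names are made up for exposition and are not the actual E/ENIGMA interface (a real iteration invokes the E prover with a trained guidance model and retrains on the proofs found).

```python
def prove(difficulty, strength):
    """Stand-in prover: 'finds a proof' when the guidance strength covers
    the problem's difficulty. Purely illustrative."""
    return f"proof@{difficulty}" if difficulty <= strength else None

def learning_loop(difficulties, iterations=4):
    """Prove what we can, 'retrain' on all proofs found so far, repeat."""
    solved = {}
    strength = 1                      # iteration 0: unguided, zero initial proofs
    for _ in range(iterations):
        for d in difficulties:
            if d not in solved:
                proof = prove(d, strength)
                if proof:
                    solved[d] = proof
        # "training": guidance gets stronger with every proof collected so far
        strength = 1 + len(solved)
    return solved

print(len(learning_loop([1, 2, 3, 5, 8, 13])))  # solves 3 of the 6 toy problems
```

The point of the simulation is the feedback structure: each round's proofs become the next round's training data, so coverage grows over the iterations rather than in one shot.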

The last push was provided by the 2-phase and 3-phase versions of ENIGMA, combining faster, slower and parental clause evaluation methods (18), further combined with the use of leapfrogging (25).
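The 2-phase idea can be sketched minimally: a cheap first-phase filter prunes given-clause candidates, and a costlier second-phase scorer ranks only the survivors. Both scoring functions below are made-up stand-ins, not ENIGMA's learned models.

```python
def fast_score(clause):
    """Phase 1: a cheap syntactic heuristic (here: prefer shorter clauses)."""
    return -len(clause)

def slow_score(clause):
    """Phase 2: stand-in for an expensive learned evaluator (e.g. a GNN)."""
    return sum(ord(c) for c in clause) % 7

def select_given_clause(candidates, keep=3):
    """Filter candidates with the fast model, then let the slow model pick."""
    survivors = sorted(candidates, key=fast_score, reverse=True)[:keep]
    return max(survivors, key=slow_score)

print(select_given_clause(["p(X)", "q(a,b)", "r"], keep=2))  # prints "r"
```

The design point is cost layering: the expensive evaluator only ever sees the small set of candidates that the cheap one lets through.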

Many of the most recent ENIGMA proofs go over 300 inference steps (some reaching over 500), and some start with over 1000 axioms (28). The premise selection finds (as usual (32,14,33)) many alternative proofs (e.g., 29), but ENIGMA can now also replay original Mizar proofs with hundreds of lines (30).

Many of the proofs show that ENIGMA has autonomously (without any human-programmed decision procedures, tactics, or dataset preparation/tuning) learned to routinely perform common mathematical algorithmic tasks, such as numeric calculation, matrix manipulation, boolean algebra, integration and differentiation, and sequences of standard rewriting and normalization operations in various algebraic theories, and to combine them with the other reasoning tasks needed to complete the fully formal proofs (25).

This suggests that learned guidance combined with efficient search may in the near future lead to new, fully declarative (guided Prolog-style) problem-solving computing/reasoning architectures applicable to arbitrary computing/reasoning problems without any human engineering.

Combined with autoformalization (34,35,36,37,38,39), this could lead to the large-scale deployment of reasoning machines in science and everyday life, following Leibniz's Calculemus dream.

More details about the proofs and progress are available at (31).

Josef Urban
(for the AI4REASON team - http://ai4reason.org/members.html )