Intuition machine
After writing On Math Platform, I’ve run a number of experiments while participating in the Equational Theories project (Terence Tao, Pietro Monticone, and Shreyas Srinivas).
Playing with ATPs like Mace4 and Vampire (or tools like Duper), experimenting with LeanDojo, running Lean as an executable (without the soundness guarantees), playing with Knuckledragger (thanks to Philip Zucker for recommending it), making antiseed-driven tools for finding contradictions (no results), reading some fascinating ATP-assisted proofs, playing with new types of operations (the less immediately and locally rational side of math insight generation), SAT in Std, egg, doing some dimensionality-reduction visualizations based on ET implication data, building a magma CLI tool for experimentation, testing the classifier from Adam Topaz, playing with networks and graphs, learning about relaxed weak central groupoids, Gröbner bases, and so on, learning about Asterix & Obelix, the 845, the 1020, and so on (including small TS scripts for tweaking the adjacency matrix), and learning a lot from such a great community — that’s simply an insane amount of fun.
The intuition that I wrote about in On The Math Platform, Deep Reasoning Through Flexible Universal Language, On Implication Flow, and Flower algorithm has been further refined.
So I first listed 20 ideas that could help take mathematics to the next level, and then made the list smaller. Here are the ideas:
Organize competitions (like ARC-AGI) with the goal of proving theorems with AI + ATP. AI can’t solve deep problems alone (yet), but a mathematician-coder working with AIs could reach superhuman level.
Create a unified math platform. Maybe by adding math AI agents, ATPs, and colab to Zulip?
Math-insight-driven projects. One example would be a project that deals with those emerging sporadic triplets.
Counterexample machine. Problems in, counterexamples out (see the brute-force sketch right after this list).
Something like W&B (Weights & Biases) for idea 1, the theorem-proving competitions.
Next-gen gradient descent for higher-order concepts (multi-layer). And storage of equivalent statements. And agents refining all that.
A fine-tune of a foundational language model using datasets generated with the project (some models will require chat-formatted data).
Deeper AI integrations for proof generation.
Math to Lean code converter.
ATP in the cloud.
A diffusion model that generates new formalized mathematical knowledge.
Transformers for proof generation.
Next-generation equation search with RL.
Interface / UX for theorem proving.
A proof optimization and eval tool.
Quantum experiments with Q#: https://github.com/riccitensor/quantummath
Mathematical intuition model.
Next-gen mathematical dataset.
Conjecture generation machine.
A Python math pipeline library.
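To make the counterexample machine concrete, here is a minimal brute-force sketch: enumerate small finite magmas and check whether one satisfies a given law while refuting another. The two laws below are the ones quoted in Digression 1 (854: x = x ◇ ((y ◇ z) ◇ (x ◇ z)); 378: x ◇ y = (x ◇ y) ◇ y); everything else is a toy assumption, nothing like what a real pipeline would use.

from itertools import product

def magmas(n):
    # Enumerate every n-element magma as a multiplication table: table[x][y].
    for flat in product(range(n), repeat=n * n):
        yield [list(flat[i * n:(i + 1) * n]) for i in range(n)]

def holds(table, law, arity):
    # Does `law` hold for every assignment of `arity` variables?
    n = len(table)
    op = lambda x, y: table[x][y]
    return all(law(op, *args) for args in product(range(n), repeat=arity))

# The two laws from Digression 1, written against an abstract operation `op`.
law854 = lambda op, x, y, z: x == op(x, op(op(y, z), op(x, z)))
law378 = lambda op, x, y: op(x, y) == op(op(x, y), y)

# Problems in, counterexamples out: look for a finite magma satisfying 854 but
# refuting 378. Nothing turns up (854 implies 378); swap the two laws and a
# two-element refutation appears immediately.
hit = next(((n, t) for n in range(1, 4) for t in magmas(n)
            if holds(t, law854, 3) and not holds(t, law378, 2)), None)
print(hit or "no counterexample up to order 3")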
I guess the core idea is to dream up formalized math: a diffusion model that generates new formalized mathematical knowledge.
The depth and expressiveness of insight will still lead — AIs, at least for the time being, can’t solve deep enough problems (though that is changing quickly). Humans can still participate: pick and define the initial problems, write software, use tools, and then continuously refine them (together with agents) by adjusting the ruleset (context) information. Software engineers who have deep mathematical insight, as well as mathematicians who are great coders, can have an edge. Compute, as always, will matter a lot.
Let me start with two digressions.
Digression 1.
One of the proofs (Mario Carneiro recommended that I have a look at it) was generated by Vampire and then converted into Lean. It looked like this:
theorem Equation854_implies_Equation378 (G : Type*) [Magma G] (h : Equation854 G) : Equation378 G := by
  by_contra nh
  simp only [not_forall] at nh
  obtain ⟨sK0, sK1, nh⟩ := nh
  have eq9 (X0 X1 X2 : G) : (X0 ◇ ((X1 ◇ X2) ◇ (X0 ◇ X2))) = X0 := mod_symm (h ..)
  have eq10 : (sK0 ◇ sK1) ≠ ((sK0 ◇ sK1) ◇ sK1) := mod_symm nh
  have eq11 (X0 X1 X2 X3 : G) : (X3 ◇ (X0 ◇ (X3 ◇ ((X1 ◇ X2) ◇ (X0 ◇ X2))))) = X3 := superpose eq9 eq9
  have eq13 (X0 X1 X2 X3 X4 : G) : (X3 ◇ (X4 ◇ (X3 ◇ (X0 ◇ (X4 ◇ ((X1 ◇ X2) ◇ (X0 ◇ X2))))))) = X3 := superpose eq9 eq11
  have eq54 (X0 X1 X2 X3 : G) : (X1 ◇ (X0 ◇ ((X2 ◇ X3) ◇ (X0 ◇ X3)))) = ((X1 ◇ (X0 ◇ ((X2 ◇ X3) ◇ (X0 ◇ X3)))) ◇ X0) := superpose eq9 eq13
  have eq62 (X0 X1 : G) : (X1 ◇ X0) = ((X1 ◇ X0) ◇ X0) := superpose eq9 eq54
  have eq137 : (sK0 ◇ sK1) ≠ (sK0 ◇ sK1) := superpose eq62 eq10 -- superposition 10,62, 62 into 10, unify on (0).2 in 62 and (0).2 in 10
  subsumption eq137 rfl
Then Mario turned it into this:
def G := FreeMagmaWithLaws ℕ {Law854}

instance : Magma G := inferInstanceAs (Magma (FreeMagmaWithLaws ..))

theorem law : ∀ (x y z : G), x = x ◇ ((y ◇ z) ◇ (x ◇ z)) :=
  Law854.models_iff.1 <| FreeMagmaWithLaws.isModel _ _ _ rfl

theorem l378 (x y : G) : x ◇ y = (x ◇ y) ◇ y := by
  have yy := (law y y ((y ◇ y) ◇ (y ◇ y))).symm; rw [← law y y y] at yy
  exact (law _ y (y ◇ y)).trans (by rw [yy, ← law])
From my perspective (correct me if I am wrong), it’s essentially about creating a higher-order concept (using FreeMagmaWithLaws in this case) that obeys a particular law (think of rulesets; in this case, Law854), and then simplifying the formula by using that law. Given that those Vampire → Lean generated proofs (kudos to Pietro Monticone) mostly use mod_symm and superpose (thank you, Daniel Weber, for pointing out the scripts), perhaps there’s a chance here to simplify proofs at scale (Mario Carneiro mentioned I should look at the Vampire scripts, since I started from the easier proofs).
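If simplification at scale is the goal, a first (admittedly naive) step could be to measure which tactics dominate the generated proofs. A rough Python sketch; the directory name is a made-up placeholder, and the tactic list is just what is visible in the proof above:

import re
from pathlib import Path

# Count tactic occurrences across a directory of generated Lean proofs to see
# where automated simplification would pay off the most. "generated_proofs" is
# a hypothetical path, not a real location in any repository.
TACTICS = ["mod_symm", "superpose", "subsumption", "by_contra", "simp only"]

def tactic_histogram(proof_dir):
    counts = dict.fromkeys(TACTICS, 0)
    for path in Path(proof_dir).glob("**/*.lean"):
        text = path.read_text(encoding="utf-8")
        for tactic in TACTICS:
            counts[tactic] += len(re.findall(re.escape(tactic), text))
    return counts

print(tactic_histogram("generated_proofs"))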
Digression 2.
Terence Tao wrote a very interesting post on 1515 → 255. Starting with a set of really strong axioms and then working with the commutativity of squaring and cubing, he eventually showed that the axioms were inconsistent. He then fixed the collisions by using the cubing operator and adding modified (new) axioms.
It felt like a dynamic play of resilience and imagination. Very diffusionese.
Continuously re-imagining the ruleset while starting from rather tight axioms, and fixing the collisions along the way, was the way to go. (Even then, we’d still need the ATPs.) Similar rules apply to software and writing — one edits and refines things, and looks for deep insights.
I think, for an agent, continuously re-imagining the ruleset by following the diffusionese math (noising, denoising, variational inference, and so on) in order to create formalized math is the way to go.
Axioms ← Initialize(HumanContext, ATPPipeline)
// Continuously re-imagine the ruleset and check the code
while not done do
    Ruleset ← UpdateRuleset(FixCollisions(Axioms))
    // Diffuse
    NoisedRuleset ← AddNoise(Ruleset)
    DenoisedRuleset ← Denoise(NoisedRuleset)
    // New ruleset
    NewRuleset ← VariationalInference(DenoisedRuleset)
    // Update axioms based on the improved ruleset
    Axioms ← UpdateAxioms(NewRuleset)
    // Validate & check if green
    FormalizedMath ← Validate(Axioms, NewRuleset, ATPPipeline)
end while
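Here is the same loop as a minimal, runnable Python skeleton. Every component is passed in as a callable because none of them exists yet; this pins down the control flow only, not the math, and the names are my own placeholders.

from dataclasses import dataclass
from typing import Callable

@dataclass
class Components:
    # Each field is a placeholder for a subsystem that does not exist yet.
    fix_collisions: Callable
    update_ruleset: Callable
    add_noise: Callable
    denoise: Callable
    variational_inference: Callable
    update_axioms: Callable
    validate: Callable  # returns formalized math if everything is green, else None

def intuition_machine(axioms, c: Components, max_rounds: int = 100):
    # Continuously re-imagine the ruleset: fix collisions, diffuse, denoise,
    # infer a new ruleset, update the axioms, and let the ATP pipeline check it.
    for _ in range(max_rounds):
        ruleset = c.update_ruleset(c.fix_collisions(axioms))
        denoised = c.denoise(c.add_noise(ruleset))
        new_ruleset = c.variational_inference(denoised)
        axioms = c.update_axioms(new_ruleset)
        formalized = c.validate(axioms, new_ruleset)
        if formalized is not None:
            return formalized
    return None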
Gradual concept creation and formalization could be the key to the long-term development of mathematics, perhaps combined with optimized transformers for proof generation.
Eventually, language models (and intelligent hardware) would profit from those datasets and embrace the more mathematical side of language — i.e. it’d be possible to explicitly integrate implications between abstract concepts (think of the relaxed weak central groupoids I mentioned before), and reach a more universal (not as specialized as it currently is) understanding. It’d also be possible to run simulations in order to find new insights — but that, as always, would require compute and those abstract concept graphs.
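As a toy illustration of what integrating implications between abstract concepts could look like mechanically: given a handful of known implications as directed edges, transitive closure already derives new ones. Only the 854 → 378 edge is real (Digression 1); EqA and EqB are hypothetical placeholders.

from collections import defaultdict, deque

# Known implications between equations, as directed edges.
implications = [("EqA", "Eq854"), ("Eq854", "Eq378"), ("Eq378", "EqB")]

def implication_closure(edges):
    # Derive every implication reachable by transitivity (BFS from each node).
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
    derived = set()
    for start in list(graph):
        queue, seen = deque(graph[start]), set()
        while queue:
            node = queue.popleft()
            if node in seen:
                continue
            seen.add(node)
            queue.extend(graph[node])
        derived |= {(start, b) for b in seen}
    return derived

# Print only the implications that were not in the input.
print(sorted(implication_closure(implications) - set(implications)))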
Formalization would be key. Having soundness is critical.