Spring 2026 Math AI Lab Projects

Project teams meet Mondays and Wednesdays from 4–5:30 pm in Odegaard 136.

The goal of projects is a publication or a significant open-source contribution, e.g. to mathlib4. Students are expected to commit at least 5 hours per week to their project outside of meeting times.

Autoformalization Projects

The goal of autoformalization projects is to use Lean-specific AI assistants such as Harmonic’s Aristotle to formalize large chunks of mathematics, and contribute to mathlib4.

OpenMath: autoformalizing undergraduate textbooks

Project Leader: Vasily Ilin
Description: Most undergraduate mathematics is present in mathlib, but several big gaps remain: differential geometry, partial differential equations, numerical analysis. We will autoformalize undergraduate textbooks on numerical analysis of differential equations, starting with Iserles’s A First Course in the Numerical Analysis of Differential Equations and Butcher’s Numerical Methods for Ordinary Differential Equations. Since manual formalization is time-consuming, we will build a semi-autonomous AI agent along the way, automating more of the process as the project progresses. Goal: workshop paper at the AI for Math workshop at ICML 2026 (deadline May 25).
Code: GitHub repo
Members: Vasily Ilin, Theo Meek, Siyuan Ge, Saumi Joshi, Di Qiu Xiang
Prerequisites: Lean, undergraduate analysis

Geometric Invariant Theory (GIT)

Project Leaders: Arkamouli Debnath, Giovanni Inchiostro, Leo Mayer
Description: Geometric Invariant Theory (GIT) is the theory of taking quotients in algebraic geometry. At present, there is almost no formalization of GIT in Lean on mathlib, and this presents the unique opportunity to contribute in two different directions as follows:
1. Commutative algebra: Our main goal is to formalize the definition of a linearly reductive group and prove the theorem: "If a linearly reductive group G acts locally finitely on a finitely generated k-algebra R, then the invariant ring R^G is a finitely generated k-algebra." The fact that a finite group is linearly reductive (Maschke's theorem) is already in mathlib, and that proof uses the existence of a "Reynolds operator". As part of this formalization project, we define a Reynolds operator and prove its existence (in Lean) for any linearly reductive group.
2. Geometry: The main content of GIT is to define a "good quotient". We aim to formalize the definition.
Code: GitHub repo
Members: Xuanyu Yang, Emily Meng, Bohan Zhao, Jacob Boyce
Prerequisites: Lean, commutative algebra or algebraic geometry

JAX in Lean

Project Leader: Samuel Ainsworth
Description: The goal is to build a verified API for JAX in Lean.
Code: GitHub repo
Members: Jeremy Ma
Prerequisites: Lean, JAX or similar autodiff framework

Geometric Measure Theory

Project Leader: Ignacio Tejeda
Formalization target: Theorem 4.2 in Falconer’s Geometry of Fractal Sets.
Code: GitHub repo
Mathlib PRs: 1, 2
Members: Theo Meek, Nathan Pao, Annie Cao, Josh
Prerequisites: Lean

Commutative Algebra: Auslander–Buchsbaum

Project Leaders: Haoming Ning and Leo Mayer
Formalization target: The theorem that a regular local ring is a UFD, referred by some as the Auslander–Buchsbaum theorem. See Stacks Project 0AG0.
Code: GitHub repo
Members: Nailin Guan, Dora Kassabova
Prerequisites: Lean

Algebraic Geometry

Project Leaders: Bianca Viray and Bryan Boehnke
Formalization target: Monogenic extensions of regular local rings following arXiv:2503.07846, lemmas 3.1 and 3.2.
Members: George Peykanu, Grant Yang
Prerequisites: Lean

Formalization: zero-knowledge proofs

Project Leaders: Eric Klavins and Alexandra Aiello
Description: We would outline a framework for verifying mathematical theorems while keeping the respective proofs secret by leveraging dependent combinatory logic as a host language for Zero-Knowledge (ZK) proof circuits. Specifically, we would interpret the axioms of the dependent SK combinator calculus as a universal ZK circuit capable of checking mathematical proofs encodeable in the calculus. We would target ZK-STARKs as our ideal ZK scheme, enabling quantum resistance of the proofs without trusted setup [Ben+18]. This would be a first-of-its kind result, with wide applications. Goal: publication
Prerequisites: Lean, type theory

Metaprogramming: Provable Computation in Lean

Project Leader: Dhruv Bhatia
Description: While Lean has seen extensive use as a theorem-proving assistant, its capabilities as a computational programming language have been underutilized. The goal of this project is to begin filling that gap. Along the way, we will learn the basics of functional programming, monads, and Lean’s metaprogramming framework to implement algorithms that can both be run efficiently and be reasoned about. Our main goal is to implement basic algorithms with applications to linear algebra while also proving (in Lean) correctness of said algorithms.
Members: Joseph Qian, Junye Ji, Veer Shukla, Alan Chang
Code: GitHub repo

AI Projects

The goal of AI projects is to use AI and ML to advance mathematical research. For example, by helping mathematicians find relevant theorems, or by training models to learn mathematical functions or by creating math-specific datasets or models. AI projects are expected to result in a publication in a submission to a major ML conference such as ICML, ICLR, EMNLP or Neurips.

LeanGCD: stabilizing Lean generation

Project Leaders: Dean Light and Michael Theologitis
Description: LeanGCD aims to improve Lean proof generation by frontier large language models using inference-time stabilization methods rather than retraining. The project adapts grammar-constrained decoding techniques with Lean’s syntax extensions. Planned contributions: a Lean-specific GCD implementation that plugs into HuggingFace/vLLM inference pipelines, and empirical evaluation on MiniF2F, ProverBench, and PutnamBench. Goal: ICML 2026 (October) or 2027 (January).
Code: GitHub repo
Members: Evan Wang, Ben Bioren, Naren Prabhu, Nhan Pham
Prerequisites: Python, ML / LLM inference experience

Mathematician's copilot: Semantic Theorem Search

Project Leaders: Giovanni Inchiostro and Vasily Ilin
Description: We help research mathematicians find relevant theorems quickly. We collected the dataset of all theorems on ArXiv, Stacks Project and other sources, and built vectorized search over it. Goal: submit to ICML 2026 in January.
Demo: HF link
Members: Eric Leonen, Sophie Szeto, Artemii Remizov, Luke Alexander
Code: GitHub repo

Mathematician's copilot: Math2Vec

Project Leader: Henry Kvinge (PNNL)
Members: Saharsh Bhargava, Cecilia, Jiahe Lu, Michael, Kedar Chintalapati, Rachit Jaiswal, Samarth Rao, Jared Darlington, Leo Carlin
Description: Train a text embedder that understands math, LaTeX and Lean. This will improve search over Lean, and natural language theorem search. It can be used for RAG as well. Use arXiv, MathOverflow, mathlib, Lean Reservoir, and possibly other sources. Create an evaluation benchmark as well. Goal: submit to EMNLP 2026 in May.

CayleyPy: search on massive combinatorial graphs

Project Leaders: TBD
Description: Optimize CayleyPy, make a CLI, and use it on various combinatorial problems, such as estimating diameters of symmetric groups. Goal: submission to ICLR 2027 in September. See project proposal document.
Members: Danny Zhang, Gaurang Pendharkar, Merav Frank, Sambhu Ganesan, Xiaoxing Zhang
Prerequisites: solid Python, some group theory or combinatorics.

Reinforcement Learning for Polynomials

Project Leader: Michael Zeng
Description: Use RL to find efficient arithmetic circuits for polynomials.
Members: Kyle Zhang, Rohan Pandey, Naomi Morato
Code: GitHub repo

AI for Quantum Code Compilation

Project Leader: Andres Paz
Description: Quantum error correction (QEC) codes are traditionally described using stabilizers, which define the subspace preserved by the code. However, implementing these codes requires translating stabilizers into fault-tolerant quantum circuits—an inherently nontrivial task that depends on the constraints and capabilities of the underlying hardware architecture.

This project aims to develop an AI agent capable of synthesizing such circuits in a way that:
1. In the ideal (noiseless) setting, the resulting circuits implement the intended stabilizer structure of the code on the target architecture.
2. In the noisy setting, the agent searches over circuit variations to optimize fidelity, taking into account realistic noise models and architectural constraints.
The outcome would be a toolchain bridging the gap between abstract QEC code design and concrete, high-performance circuit implementations, enabling better exploration of architecture-specific tradeoffs in fault-tolerant quantum computing.
Members: Christian Tarta, Sylvie Lausier, Mayee Sun, Sarju Patel

How good are LLMs at Lean?

Project Leader: Tyson Klingner
Description: Design an evaluation procedure for various Lean tasks such as next step generation or entire proof generation. Run our algorithms on frontier models to see which LLMs perform the best, providing guidance to the Lean community on which LLMs to use. Use a scalable architecture so that our algorithms can be rerun when new models are released.
Code: GitHub repo
Members: Escher Crawford, Drew Bladek