LeanDojo is a Python library for learning–based theorem provers in Lean, supporting both Lean 3 and Lean 4. It provides two main features:
- Extracting data (proof states, tactics, premises, etc.) from Lean repos.
- Interacting with Lean programmatically.
- Linux or macOS
- Git >= 2.25
- Python >= 3.9
- Docker strongly recommended
Announcement: If you encounter errors related to ray
and pydantic
, see this temporary workaround: ray-project/ray#37019. The error will be fixed when Ray 2.6 is released.
LeanDojo is available on PyPI and can be installed via pip:
pip install lean-dojo
It can also be installed locally from the Git repo:
pip install .
- For general questions and discussions, please use GitHub Discussions.
- To report a potential bug, please open an issue.
- LeanDojo Website: The official website of LeanDojo.
- LeanDojo Benchmark : The dataset used in our paper, consisting of 96,962 theorems and proofs extracted from mathlib by generate-benchmark-lean3.ipynb.
- LeanDojo Benchmark 4 : The Lean 4 version of LeanDojo Benchmark, consisting of 91,766 theorems and proofs extracted from mathlib4 by generate-benchmark-lean4.ipynb.
- ReProver: The ReProver (Retrieval-Augmented Prover) model in our paper.
- LeanDojo ChatGPT Plugin
LeanDojo: Theorem Proving with Retrieval-Augmented Language Models
Under review, NeurIPS (Datasets and Benchmarks Track), 2023
Kaiyu Yang, Aidan Swope, Alex Gu, Rahul Chalamala,
Peiyang Song, Shixing Yu, Saad Godil, Ryan Prenger, Anima Anandkumar
@article{yang2023leandojo,
title={{LeanDojo}: Theorem Proving with Retrieval-Augmented Language Models},
author={Yang, Kaiyu and Swope, Aidan and Gu, Alex and Chalamala, Rahul and Song, Peiyang and Yu, Shixing and Godil, Saad and Prenger, Ryan and Anandkumar, Anima},
journal={arXiv preprint arXiv:2306.15626},
year={2023}
}