Code Monkey home page Code Monkey logo

fimo's Introduction

FIMO

FIMO is a formal mathematics dataset comprising formal mathematical problem statements sourced from the International Mathematical Olympiad (IMO) Shortlisted Problems from 2006 to 2021. FIMO is currently tailored for the Lean formal language and is designed to facilitate advanced automated theorem proving at the IMO level. It comprises 149 formal problem statements, accompanied by both informal problem descriptions and their corresponding LATEX-based informal proofs.

The dataset aims to facilitate IMO-level AI mathematical reasoning research and provide a challenging benchmark to evaluate the reasoning ability of automated theorem systems. The dataset is currently targeting Lean formal system.

The dataset is a work in progress. Each problem is auto-formalized with feedback and verified by human experts. However, there is still a small probability that the generated formal statement is semantically incorrect. Discussions or P&R are welcome.

Citation

The construction process of the dataset (i.e., auto-formalization with feedback) is described in detail in the following paper. If you find our work helpful, please consider to cite this paper.

@misc{liu2023fimo,
      title={FIMO: A Challenge Formal Dataset for Automated Theorem Proving}, 
      author={Chengwu Liu and Jianhao Shen and Huajian Xin and Zhengying Liu and Ye Yuan and Haiming Wang and Wei Ju and Chuanyang Zheng and Yichun Yin and Lin Li and Ming Zhang and Qun Liu},
      year={2023},
      eprint={2309.04295},
      archivePrefix={arXiv},
      primaryClass={cs.AI}
}

Statistics

Category Total Quantity Success Count Success Rate
Algebra 124 89 71.8%
Number Theory 121 60 49.6%
Total 245 149 60.8%

Total Quantity denotes all the problems from the International Mathematical Olympiad Shortlisted Problems from 2006 to 2021. Success Count denotes the problems that are formalized through our auto-formalization with feedback process and, therefore, being a part of this dataset.

Example Problem Statement

  • Informal Statement

Let $a, b, c$ be the sides of a triangle. Prove that

$$ \frac{\sqrt{b+c-a}}{\sqrt{b}+\sqrt{c}-\sqrt{a}}+\frac{\sqrt{c+a-b}}{\sqrt{c}+\sqrt{a}-\sqrt{b}}+\frac{\sqrt{a+b-c}}{\sqrt{a}+\sqrt{b}-\sqrt{c}} \leq 3 $$

  • Informal Proof

Note first that the denominators are all positive, e.g. $\sqrt{a}+\sqrt{b}>\sqrt{a+b}>\sqrt{c}$. Let $x=\sqrt{b}+\sqrt{c}-\sqrt{a}, y=\sqrt{c}+\sqrt{a}-\sqrt{b}$ and $z=\sqrt{a}+\sqrt{b}-\sqrt{c}$. Then

$b+c-a=\left(\frac{z+x}{2}\right)^{2}+\left(\frac{x+y}{2}\right)^{2}-\left(\frac{y+z}{2}\right)^{2}=\frac{x^{2}+x y+x z-y z}{2}=x^{2}-\frac{1}{2}(x-y)(x-z)$

and

$$ \frac{\sqrt{b+c-a}}{\sqrt{b}+\sqrt{c}-\sqrt{a}}=\sqrt{1-\frac{(x-y)(x-z)}{2 x^{2}}} \leq 1-\frac{(x-y)(x-z)}{4 x^{2}} $$

applying $\sqrt{1+2 u} \leq 1+u$ in the last step. Similarly we obtain

$$ \frac{\sqrt{c+a-b}}{\sqrt{c}+\sqrt{a}-\sqrt{b}} \leq 1-\frac{(z-x)(z-y)}{4 z^{2}} \quad \text { and } \quad \frac{\sqrt{a+b-c}}{\sqrt{a}+\sqrt{b}-\sqrt{c}} \leq 1-\frac{(y-z)(y-x)}{4 y^{2}} $$

Substituting these quantities into the statement, it is sufficient to prove that

$$ \frac{(x-y)(x-z)}{x^{2}}+\frac{(y-z)(y-x)}{y^{2}}+\frac{(z-x)(z-y)}{z^{2}} \geq 0 . $$

By symmetry we can assume $x \leq y \leq z$. Then

$$ \begin{gathered} \frac{(x-y)(x-z)}{x^{2}}=\frac{(y-x)(z-x)}{x^{2}} \geq \frac{(y-x)(z-y)}{y^{2}}=-\frac{(y-z)(y-x)}{y^{2}}, \\ \frac{(z-x)(z-y)}{z^{2}} \geq 0 \end{gathered} $$

and (1) follows.

  • Formal Statement (Lean)
theorem fimo_2006_algebra_p5
  (a b c : ℝ)
  (h₀ : a > 0 ∧ b > 0 ∧ c > 0)
  (h₁ : a + b > c ∧ a + c > b ∧ b + c > a) :
  (sqrt (b + c - a) / (sqrt b + sqrt c - sqrt a)) +
  (sqrt (c + a - b) / (sqrt c + sqrt a - sqrt b)) +
  (sqrt (a + b - c) / (sqrt a + sqrt b - sqrt c)) ≤ 3 :=
begin
  sorry
end

Structure

When organizing our project structure, we followed openai/miniF2F, hopefully making it easier for researchers to utilize our work. Each problem is nominated as fimo_{year}_{category}_p{problem_count}, where fimo denotes the Formalized IMO-level problems.

  • lean/fimo.lean: Formalized IMO shortlist problems in LEAN language.
  • prompts/algebra_prompts.txt: Prompts for algebra problems.
  • prompts/number_theory_prompts.txt: Prompts for number theory problems.
  • informal: Problems in JSON format. Each problem has a problem_name, an informal_statement, and an informal_proof.

Lean

To install the project make sure you have elan installed, then in the directory where you want the project installed run:

leanpkg configure
leanpkg build

Since having one file per statement causes slowness in Lean parsing stage, all Lean statements are exceptionally aggregated in one file (fimo.lean). These files contain a list of the problem statements defined as theorems.

fimo's People

Contributors

liuchengwucn avatar pitmonticone avatar

Stargazers

Brando Miranda avatar  avatar fzyzcjy avatar William DeMeo avatar Abhishek Anand avatar Rahul Vishwakarma avatar Utensil avatar Huajian Xin avatar  avatar loopchen avatar Junyan Xu avatar Muchen Li avatar Lê Ngọc Đức avatar Wang Peng avatar Zheng Yuan avatar  avatar Akshay Vegesna avatar ZHU QIHAO avatar Peiyang Song avatar Kaiyu Yang avatar  avatar jhshen95 avatar

Watchers

Junyan Xu avatar jhshen95 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.