README.md · kaiyuy/leandojo-lean3-retriever-tacgen-byt5-small at main

metadata

license: mit
inference:
  parameters:
    max_length: 1024
widget:
  - text: |-
      a b : ℕ
      ⊢ a + b = b + a
    example_title: Example

LeanDojo: Theorem Proving with Retrieval-Augmented Language Models
Under review, NeurIPS (Datasets and Benchmarks Track), 2023
Kaiyu Yang, Aidan Swope, Alex Gu, Rahul Chalamala,
Peiyang Song, Shixing Yu, Saad Godil, Ryan Prenger, Anima Anandkumar

@inproceedings{yang2023leandojo,
    title={{LeanDojo}: Theorem Proving with Retrieval-Augmented Language Models},
    author={Yang, Kaiyu and Swope, Aidan and Gu, Alex and Chalamala, Rahul and Song, Peiyang and Yu, Shixing and Godil, Saad and Prenger, Ryan and Anandkumar, Anima},
    booktitle={Neural Information Processing Systems (NeurIPS)},
    year={2023}
}

Please visit LeanDojo Website for details.

Input Format

The model's input consists of retrieved premises concatenated with the current proof state and truncated to 2300 UTF-8 bytes. The proof state is formatted by Lean's pretty printer, the same as the input format of our model w/o retrieval. Retrieved premises are in the form of Lean code, except that proofs are removed and fully qualified names (marked by <a>...</a>) are used. Please see the example on the right.