Qingxiang Wang, Chad Brown, Cezary Kaliszyk, Josef Urban
International Conference on Certified Programs and Proofs (CPP 2020), ACM, pp. 85–98, 2020.

pdf  doi:10.1145/3372885.3373827

 
Abstract

In this paper we share several experiments on automatically translating informal mathematics into formal mathematics. In our context, informal mathematics refers to human-written mathematical sentences in LaTeX format, and formal mathematics refers to statements in the Mizar language. We conducted our experiments with three established neural network-based machine translation models that are known to deliver competitive results on translation between natural languages. To train these models we also prepared four informal-to-formal datasets. We compare and analyze our results according to whether the model is supervised or unsupervised. To augment the data available for autoformalization and improve the results, we developed a custom type-elaboration mechanism and integrated it into the supervised translation.
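As an illustration of the translation task, an informal-formal pair in the style of this setting might look as follows. This example is a hypothetical sketch of ours, not drawn from the paper's datasets:

Informal (LaTeX): If $x$ divides $y$ and $y$ divides $z$, then $x$ divides $z$.
Formal (Mizar):   for x, y, z being Nat st x divides y & y divides z holds x divides z;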

 

BibTeX

@inproceedings{qwcbckju-cpp20,
  author    = {Qingxiang Wang and
               Chad E. Brown and
               Cezary Kaliszyk and
               Josef Urban},
  editor    = {Jasmin Blanchette and
               Catalin Hritcu},
  title     = {Exploration of neural machine translation in autoformalization of
               mathematics in {M}izar},
  booktitle = {Proceedings of the 9th {ACM} {SIGPLAN} International Conference on
               Certified Programs and Proofs, {CPP} 2020, New Orleans, LA, USA,
               January 20-21, 2020},
  pages     = {85--98},
  publisher = {{ACM}},
  year      = {2020},
  url       = {https://doi.org/10.1145/3372885.3373827},
  doi       = {10.1145/3372885.3373827},
}