Cezary Kaliszyk, Josef Urban, Jiří Vyskočil
8th International Conference on Interactive Theorem Proving, Lecture Notes in Computer Science 10499, pp. 12–27, 2017.
 doi:10.1007/978-3-319-66107-0_2
Standard Springer LNCS Copyright
Abstract
We discuss the progress in our project, which aims to automate formalization by combining natural language processing with deep semantic understanding of mathematical expressions. We introduce the overall motivation and ideas behind this project, and then propose a context-based parsing approach that combines efficient statistical learning of deep parse trees with their semantic pruning by type checking and large-theory automated theorem proving. We show that our learning method allows efficient use of large amounts of contextual information, which in turn significantly boosts the precision of the statistical parsing and also makes it more efficient. This leads to a large improvement over our first results in parsing theorems from the Flyspeck corpus.
BibTeX
@inproceedings{ckjujv-itp17,
  author    = {Cezary Kaliszyk and Josef Urban and Ji\v{r}\'{\i} Vysko\v{c}il},
  title     = {Automating Formalization by Statistical and Semantic Parsing of Mathematics},
  booktitle = {8th International Conference on Interactive Theorem Proving (ITP 2017)},
  pages     = {12--27},
  year      = {2017},
  url       = {https://doi.org/10.1007/978-3-319-66107-0_2},
  doi       = {10.1007/978-3-319-66107-0_2},
  editor    = {Mauricio Ayala{-}Rinc{\'{o}}n and C{\'{e}}sar A. Mu{\~{n}}oz},
  series    = {Lecture Notes in Computer Science},
  volume    = {10499},
  publisher = {Springer},
}