Optical character recognition for typeset mathematics

Benjamin P. Berman, Richard J. Fateman

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

19 Scopus citations

Abstract

There is a wealth of mathematical knowledge that could be potentially very useful in many computational applications, but is not available in electronic form. This knowledge comes in the form of mechanically typeset books and journals going back more than a hundred years. Besides these older sources, there are a great many current publications, filled with useful mathematical information, which are difficult if not impossible to obtain in electronic form. What we would like to do is extract character information from these documents, which could then be passed to higher-level parsing routines for further extraction of mathematical content (or any other useful 2-dimensional semantic content). Unfortunately, current commercial OCR (optical character recognition) software packages are quite unable to handle mathematical formulas, since their algorithms at all levels use heuristics developed for other document styles. We are concerned with the development of OCR methods that are able to handle this specialized task of mathematical expression recognition.

Original language: English
Title of host publication: Proceedings of the International Symposium on Symbolic and Algebraic Computation, ISSAC 1994
Publisher: Association for Computing Machinery
Pages: 348-353
Number of pages: 6
ISBN (Electronic): 0897916387
State: Published - 1 Aug 1994
Externally published: Yes
Event: 1994 International Symposium on Symbolic and Algebraic Computation, ISSAC 1994 - Oxford, United Kingdom
Duration: 20 Jul 1994 to 22 Jul 1994

Publication series

Name: Proceedings of the International Symposium on Symbolic and Algebraic Computation, ISSAC
Volume: Part F129423

Conference

Conference: 1994 International Symposium on Symbolic and Algebraic Computation, ISSAC 1994
Country/Territory: United Kingdom
City: Oxford
Period: 20/07/94 to 22/07/94

Bibliographical note

Publisher Copyright:
© 1994 ACM.
