
History Ocr

Optical character recognition technologies were first developed in the late 19th century to help visually impaired people read printed text. From the 1950s through the 1980s, OCR systems grew more capable and were applied to tasks such as processing credit cards and passports. In the late 20th century, commercial OCR companies emerged and it became possible to recognize text in scanned documents. Recent decades have brought further gains in OCR accuracy, along with OCR on smartphones and online. Research into non-Latin scripts such as Urdu followed research into Latin scripts, and it faces additional challenges because of complex writing systems and the lack of large handwritten datasets.

Uploaded by

Ammara
Copyright
© All Rights Reserved

History of OCR

The earliest optical recognition technologies were developed to help visually impaired people.
Tauschek’s reading machine and Fournier d'Albe's Optophone are two of the earliest devices,
developed between 1870 and 1931 to help the blind read [3].
Later, in 1950, the Gismo was invented; it could translate printed text into machine code.
These devices were sold by the IMR (Intelligent Machines Research) Corporation. After 1950, David H.
Shepard developed an OCR system for processing credit cards for an oil company in California.
Between 1954 and 1974, the Optacon reached the market and was noted for its portability. In the 1980s,
OCR systems progressed rapidly, and price-tag and passport scanners were built.

Caere Corporation, Kurzweil Computer Products Inc. and ABBYY are some of today's well-known
companies that developed in the late 1980s and early 1990s.

Over the past 19 years (2000 to 2019), OCR technology has improved greatly: online OCR web services
have appeared, along with offline applications and real-time translation tools that run on
smartphones.

Research on Urdu OCR began decades after research on Latin scripts.
The writing style targeted by Urdu OCR is called “Nastalique” (also spelled Nastaliq or Nasta’liq). It
emerged when Persian was the official language of the Mughal Empire in South Asia.
Nastalique was widely used in 1971 in different regions of South Asia and is still widely used in
India and Pakistan; it is the standard calligraphic style in Pakistan [2]. Naskh is the most common
writing style for the Arabic, Persian and Pashto scripts.

The Arabic, Persian, Urdu and Pashto alphabets are largely the same; they differ mainly in
their total number of characters.

The Urdu alphabet has 38 characters in total [12]. In Urdu, text lines are read from top to
bottom, while the characters within a line are read from right to left. The characters can be
grouped into classes according to the similarity of their base forms; characters in the same class
differ only in their dots or retroflex mark.
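
The grouping described above can be illustrated with a small sketch. It is not taken from the cited work: the table below is a partial, hand-written example, and the class labels ("be", "jeem", "dal", "re") and the function name are hypothetical names chosen only for illustration.

```python
from collections import defaultdict

# Partial, illustrative table mapping a character to the label of its base form.
# The labels are hypothetical; a real system would cover the full alphabet.
BASE_FORM = {
    "ب": "be", "پ": "be", "ت": "be", "ٹ": "be", "ث": "be",
    "ج": "jeem", "چ": "jeem", "ح": "jeem", "خ": "jeem",
    "د": "dal", "ڈ": "dal", "ذ": "dal",
    "ر": "re", "ڑ": "re", "ز": "re", "ژ": "re",
}


def cluster_by_base_form(chars):
    """Group characters that share a base form, preserving input order."""
    classes = defaultdict(list)
    for ch in chars:
        # A character missing from the table forms its own singleton class.
        classes[BASE_FORM.get(ch, ch)].append(ch)
    return dict(classes)


if __name__ == "__main__":
    for label, members in cluster_by_base_form("بپتٹثجچحخدڈذ").items():
        print(label, members)
```

Within each class the members differ only in their dots or in the retroflex mark (for example ٹ and ڈ), so a recognizer could in principle identify the base form first and the marks second.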

No handwritten Nastaliq dataset is publicly available to the research community, although the
character set is almost the same for both scripts (i.e., Naskh and Nasta’liq). Efforts are being made to
standardize Urdu datasets so that the available state-of-the-art techniques can be compared.
One such effort, by CENPARMI (Centre for Pattern Recognition and Machine Intelligence) [24], is
developing a handwritten database from different sources. The Image Understanding and Pattern
Recognition Group at the Technical University of Kaiserslautern, Germany, has also reported
efforts to generate synthetic Urdu data, with content taken from Jang, a leading Urdu newspaper
of Pakistan [25].

25. A. Ul-Hasan, S. S. Bukhari, S. F. Rashid, F. Shafait, and T. M. Breuel, “Semi-automated OCR
database generation for Nabataean scripts,” ICPR, p. 1667, 2012.

12. S. B. Ahmed, S. Naz, M. I. Razzak, S. F. Rashid, M. Z. Afzal, and T. M. Breuel, “Evaluation of
cursive and non-cursive scripts using recurrent neural networks,” Neural Comput. Appl., vol. 27, no.
3, pp. 603–613, 2016.

2. S. Naz, K. Hayat, M. I. Razzak, M. W. Anwar, S. A. Madani, and S. U. Khan, “The optical character
recognition of Urdu-like cursive scripts,” Pattern Recognit., vol. 47, no. 3, pp. 1229–1248, 2014.

3. H. F. Schantz, History of OCR, Optical Character Recognition. Manchester, VT, USA: Recognition
Technologies Users Association, 1982.

4. W. J. Bijleveld and A. J. Van De Toorn, “Process and apparatus for producing and reading Arabic
numbers on a record sheet,” U.S. Patent 3 527 927, Sep. 8, 1970.
