Optical Character Recognition Research: Index

This document discusses optical character recognition technology and potential solutions for an OCR project. It reviews popular OCR engines like Tesseract, SimpleOCR, and ABBYY. Tesseract used with OpenCV library is proposed as a feasible option, combining Tesseract's accuracy with OpenCV's image processing capabilities. Previous attempts using just the Tesseract API did not work well for business card text detection. The document provides accuracy results and processing architectures for the different technologies.

Uploaded by

Phi Thiện Hồ

Optical Character Recognition Research

Index
A. Issues Addressed
1. Current Technology
1.1. Tesseract
1.2. SimpleOCR
1.3. ABBYY
2. Text detection algorithms in images
3. Solution
3.1. Using Tesseract API
3.2. Using Tesseract API and OpenCV Library
3.3. Using MATLAB API
4. Feasible option
5. References

A. Issues Addressed
Current technology for developing optical character recognition (OCR).
Popular algorithms for detecting text in images.
Solutions for the project requirements.
A feasible option for the project requirements.
References used for this report.
1. Current Technology
Many optical character recognition applications exist for various operating systems.
Most of them are built on Google libraries (such as Tesseract, see 1.1), the SimpleOCR SDK,
or the ABBYY SDK.
1.1. Tesseract
Tesseract is an optical character recognition engine for various operating systems,
including mobile ones. It is free software, released under the Apache License. Tesseract
is considered one of the most accurate open source OCR engines currently available. It was
among the top three OCR engines in terms of character accuracy in 1995. It is available
for Android, Windows, Ubuntu, and Mac OS X.

[Figure: Tesseract OCR architecture]

[Figure: Accuracy results in 1995]
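
The accuracy figures quoted in this report do not say how they were measured. One common definition of character accuracy, used here only as an assumption and not necessarily the one behind the 1995 results, is one minus the edit distance between the OCR output and the reference text, divided by the reference length. A minimal Python sketch of that definition follows; the function names are illustrative.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: fewest insertions, deletions and substitutions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # delete a reference character
                current[j - 1] + 1,      # insert a hypothesis character
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


def character_accuracy(reference: str, hypothesis: str) -> float:
    """Fraction of reference characters recognized correctly (assumed metric)."""
    if not reference:
        raise ValueError("reference text must not be empty")
    errors = edit_distance(reference, hypothesis)
    # Clamp at zero: very long wrong output can produce more errors than characters.
    return max(0.0, 1.0 - errors / len(reference))
```

For example, comparing the reference "OpenCV" with an OCR output of "0penCV" gives one substitution in six characters, so the accuracy under this definition is 5/6.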

1.2. SimpleOCR
SimpleOCR is a proprietary optical character recognition application, originally developed
by Cyril Cambien of France under the title WOCAR. Version 3.1 was reviewed in PC Magazine
in 2004.
Accuracy results: 99% (as reported at http://www.simpleocr.com/Info.asp)
1.3. ABBYY
ABBYY is an international software company that provides optical character recognition
products, such as FineReader. In January 2007, the FineReader Engine (an OCR SDK) was
selected for use in Ricoh's DocumentMall document management system.

Accuracy Results: 99.8%

[Figure: Processing architecture]

2. Text detection algorithms in images

3. Solution
3.1. Using Tesseract API
Tesseract 3.0 can handle any Unicode characters. Tesseract needs to know about different
shapes of the same character by having different fonts separated explicitly. This used to be
limited to 32 fonts, but the limit has been raised to 64. The architecture and accuracy
results are shown in section 1.1 above.
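
As a rough illustration of calling Tesseract directly, the Python sketch below runs the `tesseract` command-line program and captures the recognized text. This is an assumption made for illustration: the demo described in this report used the Tesseract API on Android, not the command line. The file name `card.png` and the function names are hypothetical, and the `tesseract` executable must be installed separately.

```python
import shutil
import subprocess
from typing import List


def build_tesseract_command(image_path: str, languages: str = "eng") -> List[str]:
    """Build the argument list for the tesseract CLI.

    Passing "stdout" as the output base makes the CLI print the recognized
    text instead of writing a .txt file. Several languages can be joined
    with "+", for example "eng+vie".
    """
    return ["tesseract", image_path, "stdout", "-l", languages]


def recognize_text(image_path: str, languages: str = "eng") -> str:
    """Run Tesseract on one image and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("the tesseract executable was not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, languages),
        capture_output=True,
        encoding="utf-8",  # Tesseract emits UTF-8 text
        check=True,        # raise CalledProcessError if recognition fails
    )
    return result.stdout.strip()
```

A call such as `recognize_text("card.png", "eng+vie")` would return the text of a scanned card, provided the matching language data files are installed.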

3.2. Using Tesseract API and OpenCV Library
OpenCV is written in C++ and its primary interface is in C++, but it still retains a less
comprehensive though extensive older C interface. The API for these interfaces can be found
in the online documentation. In the system, the application can take as little as 1.1 MB, and
the reported accuracy is 90% for OCR of handwritten digits and 93.22% for OCR of English
alphabets. OpenCV runs on Android, iOS, and Windows, and it is free.
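
This report does not list which OpenCV operations would run before Tesseract. One common preprocessing step, assumed here purely for illustration, is binarization with Otsu's method, which OpenCV offers through `cv2.threshold` with the `cv2.THRESH_OTSU` flag. The pure-Python sketch below shows the idea on a small grayscale grid so that it runs without OpenCV installed; the function names are illustrative.

```python
from typing import List, Sequence


def otsu_threshold(pixels: Sequence[int]) -> int:
    """Return the gray level t (0..255) that maximizes between-class variance.

    Pixels <= t form the dark class, pixels > t form the bright class.
    """
    histogram = [0] * 256
    for value in pixels:
        histogram[value] += 1

    total = len(pixels)
    total_sum = sum(level * count for level, count in enumerate(histogram))

    best_threshold, best_variance = 0, -1.0
    weight_low, sum_low = 0, 0
    for level in range(256):
        weight_low += histogram[level]
        sum_low += level * histogram[level]
        if weight_low == 0:
            continue  # no dark pixels yet
        weight_high = total - weight_low
        if weight_high == 0:
            break     # every pixel is already in the dark class
        mean_low = sum_low / weight_low
        mean_high = (total_sum - sum_low) / weight_high
        variance = weight_low * weight_high * (mean_low - mean_high) ** 2
        if variance > best_variance:
            best_variance, best_threshold = variance, level
    return best_threshold


def binarize(image: List[List[int]]) -> List[List[int]]:
    """Map each pixel to 0 (dark) or 255 (bright) using Otsu's threshold."""
    flat = [value for row in image for value in row]
    threshold = otsu_threshold(flat)
    return [[255 if value > threshold else 0 for value in row] for row in image]
```

Separating dark text from a bright background in this way tends to give the OCR engine cleaner input, which is one reason to combine OpenCV with Tesseract.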

[Figure: Architecture using OpenCV and the Tesseract API]

3.3. Using MATLAB API

MATLAB allows matrix manipulations, plotting of functions and data, implementation of
algorithms, creation of user interfaces, and interfacing with programs written in other
languages, including C, C++, Java, and Fortran.
We could generate C code or build a library and add it to our Android project. However,
MATLAB Coder alone is not enough to do this, so we would need to buy the Tier 1 package
Embedded Coder. This option therefore does not meet the project requirements.
4. Feasible option
From 24/3/2014 to 28/3/2014, I built a demo Android application with the Tesseract API,
but some of the results for detecting text on business cards were not good.
After researching optical character recognition in articles, tutorials, and reports, I
think we can use OpenCV together with the Tesseract API to build the Android application.
5. References
http://www.mathworks.com/help/vision/examples/automatically-detect-and-recognize-text-in-natural-images.html#zmw57dd0e728
http://antoniogarrote.wordpress.com/2011/01/30/ocr-with-clojure-tesseract-and-opencv/
http://tesseract-ocr.googlecode.com/files/TesseractOSCON.pdf
http://www.abbyy.com/mobileocr/android/
