
Acknowledging Innovations in Optical Character Recognition

Optical Character Recognition (OCR) is a technology that has become more popular in recent years because of its ability to automatically scan and extract data from images or scanned documents.

Some people think that this is a new technology, but that’s not true. Optical Character Recognition has been around for a long time, and its history can be traced back to the early 20th century.

Since then, it has seen continuous innovation that has made it more useful. In this blog post, we are going to acknowledge those innovations by walking through the history of OCR.

What is OCR Technology?

Before heading to the main topic, it helps to first understand what kind of technology Optical Character Recognition is.

It is basically a pattern-matching technology that analyzes the text contained in pictures or scanned documents. Once the text has been analyzed, it is extracted in an editable form without altering its content.

In simple words, OCR has turned the time-consuming process of manual data extraction into an automated one.

History of OCR – A Comprehensive Overview

To understand the innovations in OCR, we need to take a detailed look at its history, which is covered below.

Founder of Optical Character Recognition (OCR):

As we already mentioned in the introduction, the history of Optical Character Recognition can be traced back to the early 20th century.

In 1914, a scientist named Emanuel Goldberg introduced a device that read the letters or characters of a given image and converted them into telegraph code. This invention is widely regarded as an “early form of OCR.”

An electronic version of OCR:

Emanuel Goldberg did not stop there. Later, in the 1920s, he introduced an “electronic version of OCR” that could scan and recognize text in images and documents.

To create this electronic version, Goldberg used already-existing technologies and equipment: a movie projector passed over the input image or document, and a photoelectric cell recognized the text patterns in it. However, this version of OCR could only read and extract text in one font.

After this invention, the well-known technology company IBM acquired the patent for the device, which was known as the “Statistical Machine.”

Ability to convert images into printed text:

Once Emanuel Goldberg’s invention became popular, many other scientists became interested and started working on the problem.

A scientist named Gustav Tauschek invented a device known as “The Reading Machine” in 1929. This device could scan and extract data and then print it on physical paper.

The machine worked by template matching. It used a photodetector, a printing drum, and a disk with holes cut in the shapes of letters and numbers.

Whenever a picture that had some text was passed into this machine, the letters or characters in the image that matched up with the holes would trigger the device. The reading machine would then use a printing drum to print those matched characters or numbers on the paper.
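The idea of template matching described above can be illustrated with a small software analogy. This is only a minimal sketch, not a model of how the Reading Machine was actually built: the tiny 3x5 bitmaps, the `TEMPLATES` table, and the function names are all hypothetical and invented for this example. Each input glyph is compared with every stored template, and the template that agrees on the most pixels wins.

```python
# Hypothetical 3x5 bitmaps: '#' is ink, '.' is blank paper.
# These stand in for the character-shaped holes in the disk.
TEMPLATES = {
    "0": ("###", "#.#", "#.#", "#.#", "###"),
    "1": (".#.", "##.", ".#.", ".#.", "###"),
    "7": ("###", "..#", ".#.", ".#.", ".#."),
}


def match_score(glyph, template):
    """Count the pixels where the glyph and the template agree."""
    return sum(
        g == t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )


def recognize(glyph):
    """Return the template character that agrees with the glyph on the most pixels."""
    return max(TEMPLATES, key=lambda ch: match_score(glyph, TEMPLATES[ch]))


def read_line(glyphs):
    """Recognize a sequence of glyphs and join the results into a string."""
    return "".join(recognize(g) for g in glyphs)


if __name__ == "__main__":
    # A "7" with one smudged pixel in the bottom row.
    noisy_seven = ("###", "..#", ".#.", ".#.", "##.")
    print(read_line([TEMPLATES["1"], TEMPLATES["0"], noisy_seven]))
```

Because every glyph must closely resemble a stored template, a matcher like this only works for the fonts it has templates for, which is the limitation that later omni-font systems set out to remove.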

Omni-font OCR

After Gustav Tauschek, many more improvements were made to Optical Character Recognition technology in the decades that followed. Then the game-changing moment arrived. In the 1970s, a scientist named Ray Kurzweil came up with a new form of OCR called “omni-font OCR.”

What made this version special was that it could scan and extract text from images and documents in almost any font and style. The OCR in use today is essentially this version, with some advancements (of course).

The Current OCR

Fast forward to the start of the 21st century, and mobile and desktop applications and tools have taken the world by storm. Many of these are built on cloud-based services, which means users from every part of the world can access them.

Optical Character Recognition technology has also been packaged into algorithms and APIs so that it can be easily integrated with these applications and tools for public use. In recent years, Artificial Intelligence has advanced considerably and gained a lot of popularity.

As a result, OCR tools have also adopted advanced AI technologies. The partnership with AI has not only made them quicker and more efficient but also taken their abilities to new heights.

Modern OCR-based tools can extract not only plain text from images but also special symbols, characters, mathematical equations, and more. Moreover, they can now extract data in almost any language and font style.

In the future, we can expect many more innovations in Optical Character Recognition technology.

Wrapping Up

Most of us who use OCR for data extraction might think that it is a new technology. But the truth is that this technology is very old and has seen a lot of innovations since it was created. In this blog, we tried our best to acknowledge those innovations by taking a detailed look at the history of Optical Character Recognition.
