Digital Image Processing And Pattern Recognition Computer Science Essay

In electrical engineering and computer science, image processing is any form of signal processing for which the input is an image, such as a photograph or video frame. The output of image processing may be either an image or a set of features or parameters related to the image. Most image-processing techniques involve treating the image as a two-dimensional signal and applying standard signal-processing techniques to it. Digital image processing is the use of computer algorithms to perform image processing on digital images. As a subcategory or field of digital signal processing, digital image processing has many advantages over analog image processing. It allows a much wider range of algorithms to be applied to the input data and can avoid problems such as the build-up of noise and signal distortion during processing. Since images are defined over two dimensions (possibly more), digital image processing may be modeled in the form of multidimensional systems.

Pattern recognition:

It is the process of examining a pattern and assigning a class using a classifier (e.g., a rule based on the location of a graphical representation of the given sample with respect to other samples of the known class). Pattern recognition is used in diverse applications: handwriting recognition, financial analysis, gene expression, biometrics, and so on. Pattern recognition aims to classify data based either on a priori knowledge or on statistical information extracted from the patterns. The patterns to be classified are usually groups of measurements or observations, defining points in an appropriate multidimensional space. This is in contrast to template matching, where the pattern is rigidly specified.

Optical character recognition (OCR):

Optical character recognition (OCR) is a system used to convert scanned printed/handwritten image files into a machine-readable/editable format such as a text document. OCR software receives its input as an image, processes it, and compares its characters with a set of OCR fonts stored in its database. Character recognition, which is one of the applications of pattern recognition, is of great importance these days. Character recognition systems can be used in:

Financial business applications: for sorting bank cheques, since the number of cheques per day has become far too large for manual sorting.

Commercial data processing: for entering data into commercial data-processing files (e.g., for entering the names and addresses of mail-order customers into a database). In addition, it can be used as a worksheet reader for payroll accounting.

In the postal sector: for postal address reading and sorting, and as a reader for handwritten and printed postal codes.

In the newspaper industry: high-quality typescript may be read by recognition equipment into a computer typesetting system, avoiding the typing mistakes that would be introduced by re-keying the text on computer peripheral equipment.

Use by the blind: as a reading aid using a photosensor and tactile stimulators, and as a sensory aid with audio output. In addition, it can be used for reading text sheets and for the reproduction of Braille masters.

In facsimile transmission: which involves the transmission of pictorial data over communication channels. In practice, the pictorial data is mainly text. Instead of transmitting characters in their pictorial representation, a character recognition system could be used to recognize each character and then transmit its text codes. Finally, it is worth saying that the biggest potential application for character recognition is as a general data-entry device for automating the work of an ordinary office typist.

OCR can be of two types: (1) online character recognition; and (2) off-line character recognition. The off-line OCR system deals with printed and handwritten texts, while the online OCR system deals with handwritten texts only; see Figure 1.


Fig. 1 The pattern recognition and the character recognition system

If the OCR system has the ability to track the points generated by moving a special pen on a special screen, then the system belongs to the online type, while it belongs to the off-line type when it accepts only pre-scanned text images to perform the recognition process.

Background of Study:

History and Characteristics of the Sorani alphabet:

Off-line OCR system:

The off-line OCR system can be divided into predefined processes which yield a recognized text. Figure 2 illustrates the four main standard processes of the off-line OCR system, which apply equally to any off-line OCR system: (1) preprocessing; (2) segmentation; (3) feature extraction; and (4) recognition. The number of processes is standard even if it differs in some off-line OCR systems.


Fig. 2 The standard off-line OCR system

Preprocessing stage:

The preprocessing step is the most important because it directly affects the reliability and efficiency of the output quality. It involves many operations on the digitized image of a raw image, used to minimize noise and increase the capability of extracting features by thinning and cleaning the image. Those operations are, namely: binarization, smoothing, thinning, alignment, normalization, and baseline detection (Figure 3).


Fig. 3 Preprocessing stage

Binarization: converts a grayscale image into a bi-level image. A reliable binarization method is to compute the histogram of the gray values of the image and then find a cutoff point.
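The text above names only a histogram with a cutoff point; one common concrete choice for that cutoff is Otsu's method, which picks the threshold maximizing the between-class variance of the histogram. The following is a minimal pure-Python sketch of that idea (the function names and the flat-list image representation are illustrative assumptions, not part of the original system):

```python
def otsu_threshold(gray):
    """Pick a cutoff from the gray-level histogram (Otsu's method):
    choose the threshold that maximizes between-class variance."""
    hist = [0] * 256
    for v in gray:
        hist[v] += 1
    total = len(gray)
    sum_all = sum(i * h for i, h in enumerate(hist))
    sum_bg = 0.0
    w_bg = 0
    best_t, best_var = 0, -1.0
    for t in range(256):
        w_bg += hist[t]
        if w_bg == 0:
            continue
        w_fg = total - w_bg
        if w_fg == 0:
            break
        sum_bg += t * hist[t]
        m_bg = sum_bg / w_bg
        m_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (m_bg - m_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

def binarize(gray, t):
    # Pixels above the cutoff become white (1), the rest black (0).
    return [1 if v > t else 0 for v in gray]
```

A strongly bimodal image (dark ink on light paper) yields a cutoff between the two histogram peaks, which is exactly the behaviour the OCR pipeline relies on.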

Filtering and Smoothing: filtering and smoothing are conditioning steps that remove unwanted variations in the input image.

Skeletonization (Thinning): a very important preprocessing step for the analysis and recognition of the Sorani OCR, because it is the process of simplifying the character shape in an image from many pixels wide to just one pixel, reducing the amount of data to be handled. Thinning algorithms can be classified into two types: sequential algorithms and parallel algorithms (Figure 4). The main difference between these two types is that a sequential algorithm operates on one pixel at a time, with each operation depending on previously processed results, while a parallel algorithm operates on all the pixels simultaneously.


Fig. 4 Skeletonization (thinning)
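The section does not name a specific thinning algorithm; the Zhang-Suen algorithm is a classic example of the parallel type described above (all boundary pixels are tested against the same rules before any are removed). A pure-Python sketch, assuming a binary image stored as a list of rows with foreground value 1 and a one-pixel zero border:

```python
def zhang_suen_thin(img):
    """Parallel thinning (Zhang & Suen): in each sub-iteration every
    boundary pixel is tested simultaneously, then all marked pixels
    are removed at once; repeat until nothing changes."""
    img = [row[:] for row in img]
    h, w = len(img), len(img[0])

    def neighbours(y, x):
        # P2..P9, clockwise from the pixel directly above (north).
        return [img[y-1][x], img[y-1][x+1], img[y][x+1], img[y+1][x+1],
                img[y+1][x], img[y+1][x-1], img[y][x-1], img[y-1][x-1]]

    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            marked = []
            for y in range(1, h - 1):
                for x in range(1, w - 1):
                    if img[y][x] != 1:
                        continue
                    p = neighbours(y, x)
                    b = sum(p)  # number of foreground neighbours
                    # a = number of 0 -> 1 transitions around the pixel
                    a = sum(p[i] == 0 and p[(i + 1) % 8] == 1 for i in range(8))
                    if 2 <= b <= 6 and a == 1:
                        if step == 0 and p[0]*p[2]*p[4] == 0 and p[2]*p[4]*p[6] == 0:
                            marked.append((y, x))
                        if step == 1 and p[0]*p[2]*p[6] == 0 and p[0]*p[4]*p[6] == 0:
                            marked.append((y, x))
            for y, x in marked:
                img[y][x] = 0
            if marked:
                changed = True
    return img
```

Because the transition count must equal 1, a stroke that is already one pixel wide is left untouched, which is what makes the result a usable skeleton.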

Normalization: Sorani character sizes vary enormously. Therefore, a normalization method should be followed to scale characters to a fixed size and to center each character before recognition.

Slant Correction: one of the most obvious measurable factors of different handwriting styles is the angle between the longer strokes in a word and the vertical direction, referred to as the slant angle. The aim of this stage is to detect any slanted strokes. It can be achieved in two steps: slant detection and slant correction.

Baseline Detection and Skew Detection: the baseline is defined as the line on which letters lie. It contains useful information about the orientation of the character. The horizontal projection histogram is considered one of the methods for fixing the baseline.


Segmentation:

After the preprocessing stage, most OCR systems isolate the individual characters or strokes before recognizing them. Segmenting a page of text can be divided into two levels: page decomposition and word segmentation. Page decomposition is used to separate the different page elements, producing text blocks, lines, and sub-words when the page contains different object types such as graphics, headers, and text blocks. Word segmentation, on the other hand, is used to separate the characters of words and sub-words. The performance of the system depends on how accurately it isolates the characters. While this statement is generally true for cursive text recognition, it is especially pertinent to Arabic and other similar alphabets, where characters connect within a word.

Feature Extraction:

The next step after the segmentation process is feature extraction, in which the output produced in the segmentation step is used to extract features which are in turn passed to the next stage, the classifier. Features can be categorized into: global transformations, structural features, statistical features, and template matching and correlation. The features can be manipulated in two ways:

Interleaved control, in which an optical character recognition system alternates between feature extraction and classification: it extracts a set of features from a pattern, passes them on to the classifier, then extracts another feature, and so on.

One-step control, in which an optical character recognition system extracts all the needed features from a primitive and then performs the classification.


Recognition:

It is also named the classification step. Classification is the main decision-making stage, in which the extracted features of a test set are compared to those of the model set. Based on the features extracted from a pattern, classification attempts to identify the pattern as a member of a certain class. When classifying a pattern, classification often produces a set of hypothesized solutions instead of a unique solution. Classification follows three main models: syntactic (or structural), statistical (or decision-theoretic), and neural network classification. Generally, there are five main paradigms for performing pattern recognition: (1) template matching; (2) geometrical classification; (3) statistical classification; (4) syntactic or structural matching; and (5) artificial neural networks.

Problem statement of the research:

The OCR system is very important because it improves the interactivity between humans and computers, and it has many practical applications that are independent of the processed language. So far there is no convenient OCR system available for the modern Turkish alphabet. For this reason, this research focuses on the problem of the modern Turkish alphabet features to produce a successful off-line OCR system. This system consists of several stages, starting with preparing the database of the modern Turkish alphabet, inputting the database to the computer by scanner, reading the input (which is an image file), processing it, and then converting it to an editable format. This OCR system can then be integrated into devices such as mobile phones to convert any image file (captured by a camera/mobile phone or scanned by a scanner) to a machine-readable/editable format.


The aim of this project is to develop a simple and easy-to-use OCR system for the off-line modern Turkish alphabet. To achieve this aim, the following objectives are set:

To present and highlight the features of the modern Turkish alphabet.

To provide the Sorani alphabet database.

To convert any image file into a readable/editable format.

To improve the interactivity between humans and computers.

To investigate the available pattern recognition and image processing approaches and find a suitable one for OCR.

The central idea behind this project is to develop a simple and handy OCR system that can be integrated into devices such as mobile phones and laptops. The developed OCR is used to convert input image files consisting of modern Turkish text into an editable format.

Scope of research:

This research falls under Computer Vision and Pattern Recognition. An OCR algorithm will be developed to convert the scanned text image into an editable text document. The system algorithm is programmed in MATLAB® as it provides special features such as efficient matrix and vector computations, application development including graphical user interface building, multithreaded processing, etc. The template set involved in the recognition process was prepared with Paint and imported into the OCR algorithm. Users can import their text images using scanners or a digital camera, or they can create them with Paint. The latter was used during the implementation and testing phase of the project. The output text document can be printed out or viewed on the computer screen. The initial setup of the project, with the use of a scanner or smartphone for image digitization, a personal computer for image processing, and a printer for output, is shown in Figure 5.


Fig. 5 Required tools and equipment.

Research methodology:

The proposed method will be implemented using MATLAB®, which has powerful features as mentioned earlier. Template matching is utilized as the OCR approach. Unlike the neural network approach, template matching takes a shorter time and does not require sample training. The main OCR steps are depicted in Figure 6.


Fig. 6 Main OCR project steps.

First, the template is prepared and preprocessed. The preprocessing involves digitization, binarization, and noise removal. Next, the image is processed by identifying the lines and then the characters using a template-matching scheme. Finally, upon a successful implementation of the OCR, the recognized patterns are displayed in a text document.

Design Assumptions:

The following conditions are assumed during the implementation of the proposed OCR:

The font family that will be used is Arial (as it is widely used), black, bold, and of size 12 points. The input image resolution will range from 196 dpi fine-mode fax quality up to 400 dpi, and the template size will be 24×42.

The image will be in black and white, clear of noise or with little noise.

The input image consists of text only, which will be divided into lines with one word per line.

Characters to be recognized are modern Turkish alphabet letters, in uppercase only.

Template Preparation:

The templates to be processed with the OCR will be prepared using Paint and MATLAB®. The templates will be drawn using the "Text" tool in Paint and then saved in the MATLAB® current directory. Thereafter, a small piece of code will be used to crop the image and resize it to the desired size. This code also ensures that all the templates will be binarized and pixel-inverted, so that the template background will be black and the letter will be in white.
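The project itself does this in MATLAB; as a language-neutral illustration of the crop/resize/binarize/invert sequence, here is a pure-Python sketch. The 24×42 target dimensions come from the design assumptions above (the width/height orientation, the `prepare_template` name, and the gray cutoff of 128 are illustrative assumptions):

```python
def prepare_template(gray, width=24, height=42, cutoff=128):
    """Crop a grayscale letter image to its ink bounding box, resize it
    with nearest-neighbour sampling, then binarize and invert so the
    background is black (0) and the letter is white (1)."""
    h, w = len(gray), len(gray[0])
    # Dark pixels (below the cutoff) are treated as ink.
    ink = [(y, x) for y in range(h) for x in range(w) if gray[y][x] < cutoff]
    ys = [y for y, _ in ink]
    xs = [x for _, x in ink]
    y0, y1, x0, x1 = min(ys), max(ys), min(xs), max(xs)
    crop_h, crop_w = y1 - y0 + 1, x1 - x0 + 1
    out = []
    for ty in range(height):
        sy = y0 + ty * crop_h // height   # nearest-neighbour source row
        row = []
        for tx in range(width):
            sx = x0 + tx * crop_w // width
            # Invert while binarizing: ink -> 1 (white), paper -> 0.
            row.append(1 if gray[sy][sx] < cutoff else 0)
        out.append(row)
    return out
```

Every template then has the same fixed shape, which is what allows the later pixel-by-pixel correlation against the extracted characters.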


Before implementing the OCR, some preprocessing is required to convert the image into a valid format, ready for recognition. As shown in Figure 4, the preprocessing includes digitization, binarization, noise removal, and skew correction.


Digitization:

As the input will be in a physical paper format, it should be converted into a digital format so that the system can manipulate it. This conversion from a printed page to a digital image involves specialized hardware such as an optical scanner or digital camera. The converted digital image is then saved for further processing. In this project, the input images used for testing will be created using the Paint program: the text typed in Paint is saved in a specific format (e.g., JPEG) and then processed using MATLAB.


Binarization:

A binary image is a digital image that has only two possible values for each pixel. Typically the two colors used for a binary image are black (0) and white (1), though any other pair of colors can be used. In a binary image, the color used for the object(s) is the foreground color, while the color of the rest of the image is the background color. As OCR often deals with text, which is usually black and white, input images should be converted to a binary format. Images in binary representation need very little space to store, but usually suffer information loss. After binarization, the pixels of the image will be inverted to give a black background and white foreground. This color inversion simplifies the computations, especially during the line identification process, in which many computations are performed on the image background. Therefore, by making the background value zero (i.e., black), the computations become simpler.

Noise removal:

During the scanning process, differences between the digital image and the original input (beyond those due to quantization when stored on the computer) can occur. Hardware or software defects, dust particles on the scanning surface, improper scanner use, etc., can alter the expected pixel values. Such unwanted marks and inaccurate pixel values constitute noise that can potentially reduce character recognition accuracy. There are usually two types of noise. The first type is additive noise, where background pixels are assigned a foreground value instead of a background value. This type of unwanted noise can be reduced or eliminated by removing any groups of foreground pixels that are smaller than some threshold, but without removing small parts of characters such as 'i' and 'j' or some punctuation marks. For example, in the OCR code that will be used in this project, the following MATLAB function will be used to remove all objects containing fewer than 10 pixels:

imagen = bwareaopen(imagen, 10);          (1)

The second type of unwanted noise is when pixels are assigned a noisy background value instead of the foreground value they should have been given. A linear or non-linear filter can be used to smooth the noisy images. These filters must be used carefully, because if excess smoothing is applied, the filtering can cause problems such as discontinuous character edges becoming joined or multiple characters merging together.

Segmentation and Clustering:

In this stage, the clean digital page will be segmented so that the individual characters of interest can be extracted and subsequently recognized. The approach used in this stage is a top-down approach. First, the lines in the page under processing can be identified using horizontal projection, by which the page is scanned horizontally to locate the first and last lines and divide/segment the page into lines. Thereafter, individual characters of each line can be recognized using vertical projection, by which each line is scanned vertically to find groups/clusters of connected pixels, where each group represents one character. Finally, the characters composing the lines will be compared with templates to obtain the best match.

Line identification and word extraction:

In general, any image consists of rows and columns of pixels; a group of rows and columns constructs one line or, specifically, one word. Therefore, to identify the lines of an image, the number of rows is our main concern. First, the borders (i.e., the first and last rows of the file) have to be identified. This can be done by scanning the image from top to bottom to remove the extra rows whose columns are all zero. The scanning is paused when a row is found with a column labeled 1, and this row is recorded as the first row. The scanning is then resumed until another cluster of rows whose pixels are all zeros is detected and deleted. The row just before this group of rows is called the last row. Second, the rows between the first and last rows of the file are processed. The horizontal projection is used for line identification, as stated earlier. The goal of the horizontal projection is to group the horizontally connected pixels and give them the row number as a label. If the rows are connected, their labels should be consecutive (this is only applicable for uppercase alphabets). The total number of rows constituting one word is exactly 12 because of the font size (i.e., 12-pt font size). Lines are then extracted and stored in an array, and so on (Figure 7).

Fig. 7 Line identification
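The scan described above can be sketched in a few lines of Python: sum each row of the binary image (the horizontal projection) and treat each maximal run of non-empty rows as one text line. The `find_lines` name and (first_row, last_row) output format are illustrative assumptions:

```python
def find_lines(img):
    """Horizontal projection: sum each row of the binary image and
    treat runs of non-empty rows as text lines. Returns a list of
    (first_row, last_row) pairs, one per line."""
    profile = [sum(row) for row in img]   # foreground pixels per row
    lines, start = [], None
    for r, count in enumerate(profile):
        if count > 0 and start is None:
            start = r                     # first row of a new line
        elif count == 0 and start is not None:
            lines.append((start, r - 1))  # a blank row ends the line
            start = None
    if start is not None:
        lines.append((start, len(img) - 1))
    return lines
```

Because the background was inverted to zero during binarization, an all-background row sums to exactly 0, which is the property that makes this row-sum test work.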

Connected components analysis and character extraction:

With this approach, vertical projection will be performed and connected pixels will be grouped together to create a set of connected components. There are two schemes: (1) the 8-connected scheme; and (2) the 4-connected scheme, as illustrated in Figure 8. In this OCR project, the 8-connected scheme is used because it is more accurate, as it considers all the surrounding pixels.

Fig. 8 A single pixel with its 4-connected neighborhood (left) and 8-connected neighborhood (right).
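The practical difference between the two schemes can be made concrete by counting components: two diagonally touching pixels form one component under 8-connectivity but two under 4-connectivity. A small pure-Python sketch (the `count_components` name is an illustrative assumption):

```python
def count_components(img, connectivity=8):
    """Count foreground components under 4- or 8-connectivity.
    Under 8-connectivity diagonal neighbours touch; under
    4-connectivity they do not."""
    if connectivity == 8:
        offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
                   (0, 1), (1, -1), (1, 0), (1, 1)]
    else:
        offsets = [(-1, 0), (0, -1), (0, 1), (1, 0)]
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if img[y][x] == 1 and not seen[y][x]:
                count += 1
                stack = [(y, x)]
                seen[y][x] = True
                while stack:
                    cy, cx = stack.pop()
                    for dy, dx in offsets:
                        ny, nx = cy + dy, cx + dx
                        if 0 <= ny < h and 0 <= nx < w \
                                and img[ny][nx] == 1 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
    return count

# Two diagonal pixels: one component under 8-connectivity, two under 4.
diag = [[1, 0],
        [0, 1]]
```

This is why the 8-connected scheme is the safer choice for character extraction: diagonal strokes (as in 'N' or 'Z') stay in one component instead of being split.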

OCR template matching:

Resized images from the previous stage will be compared with the templates stored in the database to obtain the best match. For this purpose, the MATLAB function in Equation 2 can be used to compute the correlation coefficient between A and B, where both are matrices or vectors of the same size. Matching characters will therefore be based on the correlation output. Some problems arise in the correlation output for letters with similar shapes (such as R and P, or O and Q), as well as for two combined letters in cases such as AZ, AA, TT, etc. These problems will be overcome by considering the number of pixels in conjunction with the correlation function output, so that the best matching character can be found. The output, which will be a series of characters, can be printed in a text document format.

r = corr2(A, B);          (2)

The correlation algorithm used in Equation 2 is as shown in Equation 3:

r = Σm Σn (Amn − Ā)(Bmn − B̄) / sqrt( [Σm Σn (Amn − Ā)²] [Σm Σn (Bmn − B̄)²] )          (3)

where Ā = mean2(A) and B̄ = mean2(B). The total number of pixels of each image is 1008, consisting of black and white pixels. Using this feature, the characters can be compared and the problems addressed earlier can be overcome. For example, with the extraction mechanism used, the two letters "AA" are considered one character because some of their pixels are connected at the bottom. The correlation output would then find as the best match the letter "M". However, this problem can be solved by applying the pixel-count feature: the number of foreground pixels of "AA" (> 500) is greater than that of "M" (< 500).
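Equation 3 is the standard Pearson correlation coefficient over all pixels, which is what MATLAB's corr2 computes. The following pure-Python sketch implements it, together with a minimal best-match lookup over a template dictionary (the `best_match` helper and the dict-based template store are illustrative assumptions; the real system also applies the pixel-count tiebreak described above):

```python
import math

def corr2(a, b):
    """Pearson correlation coefficient between two same-size matrices,
    i.e. the formula of Equation 3 behind MATLAB's corr2."""
    n = len(a) * len(a[0])
    mean_a = sum(map(sum, a)) / n
    mean_b = sum(map(sum, b)) / n
    num = den_a = den_b = 0.0
    for ra, rb in zip(a, b):
        for va, vb in zip(ra, rb):
            da, db = va - mean_a, vb - mean_b
            num += da * db
            den_a += da * da
            den_b += db * db
    return num / math.sqrt(den_a * den_b)

def best_match(char, templates):
    """Pick the template name with the highest correlation against the
    extracted character; templates maps name -> matrix of equal size."""
    return max(templates, key=lambda name: corr2(char, templates[name]))
```

A perfect match yields r = 1 and a perfectly inverted pattern yields r = −1; near-identical shapes such as O and Q produce values close to 1 for both, which is exactly where the foreground pixel count is needed as a second criterion.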