Simple Digit Recognition OCR in OpenCV-Python

Question

I am trying to implement a "Digit Recognition OCR" in OpenCV-Python (cv2). It is just for learning purposes. I would like to learn both the KNearest and SVM features in OpenCV.

I have 100 samples (i.e. images) of each digit. I would like to train with them.

There is a sample, letter_recog.py, that comes with the OpenCV samples, but I still couldn't figure out how to use it. I don't understand what the samples, responses, etc. are. Also, it loads a text file at first, which I didn't understand at first.

Later, after searching a little, I found a letter_recognition.data file in the cpp samples. I used it and wrote code for cv2.KNearest modeled on letter_recog.py (just for testing):

    import numpy as np
    import cv2

    fn = 'letter-recognition.data'
    a = np.loadtxt(fn, np.float32, delimiter=',', converters={ 0 : lambda ch : ord(ch)-ord('A') })
    samples, responses = a[:,1:], a[:,0]

    model = cv2.KNearest()
    retval = model.train(samples,responses)
    retval, results, neigh_resp, dists = model.find_nearest(samples, k = 10)
    print results.ravel()

It gave me an array of size 20000. I don't understand what it is.

Questions:

1. What is the letter_recognition.data file? How can I build that file from my own data set?
2. What does results.ravel() denote?
3. How can we write a simple digit recognition tool using the letter_recognition.data file (with either KNearest or SVM)?

User · Answer

OCR, which stands for Optical Character Recognition, is a computer vision technique for identifying characters in images; here it is used to recognize handwritten digits. To perform OCR in OpenCV we will use the KNN algorithm, which finds the k nearest neighbors of a data point and then classifies that point based on the classes of those k neighbors.

Data used:

This data contains 5000 handwritten digits, 500 for every digit. Each digit is 20x20 pixels. We will split the data so that, for every class, 250 digits are used for training and 250 for testing.

Below is the implementation.

    import numpy as np
    import cv2

    # Read the image
    image = cv2.imread('digits.png')

    # Grayscale conversion
    gray_img = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # Divide the image into 5000 cells of 20x20 pixels
    divisions = list(np.hsplit(i, 100) for i in np.vsplit(gray_img, 50))

    # Convert into a NumPy array of shape (50, 100, 20, 20)
    NP_array = np.array(divisions)

    # Prepare train_data and test_data.
    # Shape will be (2500, 400): 2500 samples of 20x20 pixels each
    train_data = NP_array[:, :50].reshape(-1, 400).astype(np.float32)

    # Shape will be (2500, 400)
    test_data = NP_array[:, 50:100].reshape(-1, 400).astype(np.float32)

    # Create the 10 labels, one for each digit, repeated 250 times each
    k = np.arange(10)
    train_labels = np.repeat(k, 250)[:, np.newaxis]
    test_labels = np.repeat(k, 250)[:, np.newaxis]

    # Initiate the kNN classifier
    knn = cv2.ml.KNearest_create()

    # Train on the training data (one sample per row)
    knn.train(train_data, cv2.ml.ROW_SAMPLE, train_labels)

    # Classify the test data, specifying the number of neighbors
    ret, output, neighbours, distance = knn.findNearest(test_data, k=3)

    # Compare the output with test_labels to count the correct predictions
    matched = output == test_labels
    correct_OP = np.count_nonzero(matched)

    # Calculate the accuracy as a percentage
    accuracy = (correct_OP * 100.0) / (output.size)

    # Display the accuracy
    print(accuracy)

Output:

    91.64

Well, I decided to work through my own question to solve the above problem. What I wanted was to implement a simple OCR using the KNearest or SVM features in OpenCV. Below is what I did and how (it is just for learning how to use KNearest for simple OCR purposes).

1. My first question was about the letter_recognition.data file that comes with the OpenCV samples. I wanted to know what is inside that file.

It contains a letter, along with 16 features of that letter.

And this Stack Overflow post helped me find it. These 16 features are explained in the paper "Letter Recognition Using Holland-Style Adaptive Classifiers". (Although I didn't understand some of the features at the end.)

2. I knew that, without understanding all those features, it would be difficult to use that method. I tried some other papers, but all were a little difficult for a beginner.

So I just decided to take all the pixel values as my features. (I was not worried about accuracy or performance; I just wanted it to work, even with the lowest accuracy.)

I took the image below as my training data:

[image]

(I know the amount of training data is small. But since all the letters are of the same font and size, I decided to try it.)

To prepare the data for training, I wrote a small program in OpenCV. It does the following things:

- It loads the image.
- It selects the digits (by finding contours and applying constraints on the area and height of the letters to avoid false detections).
- It draws the bounding rectangle around one letter and waits for a key press. At this point we press the digit key corresponding to the letter in the box ourselves.
- Once the corresponding digit key is pressed, it resizes this box to 10x10 and saves all 100 pixel values in an array (here, samples) and the manually entered digit in another array (here, responses).
- It then saves both arrays in separate text files.

At the end of the manual classification, all the digits in the training data (train.png) have been labeled by hand, and the image will look like the one below:

[image]

Below is the code I used for the above purpose (of course, not so clean):

    import sys

    import numpy as np
    import cv2

    im = cv2.imread('pitrain.png')
    im3 = im.copy()

    gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
    blur = cv2.GaussianBlur(gray, (5, 5), 0)
    # 1, 1 = ADAPTIVE_THRESH_GAUSSIAN_C, THRESH_BINARY_INV
    thresh = cv2.adaptiveThreshold(blur, 255, 1, 1, 11, 2)

    #################      Now finding Contours         ###################

    # Two return values in OpenCV 2.x and 4.x (3.x returns three)
    contours, hierarchy = cv2.findContours(thresh, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)

    samples = np.empty((0, 100))
    responses = []
    keys = [i for i in range(48, 58)]  # key codes for '0'..'9'

    for cnt in contours:
        if cv2.contourArea(cnt) > 50:
            [x, y, w, h] = cv2.boundingRect(cnt)

            if h > 28:
                cv2.rectangle(im, (x, y), (x+w, y+h), (0, 0, 255), 2)
                roi = thresh[y:y+h, x:x+w]
                roismall = cv2.resize(roi, (10, 10))
                cv2.imshow('norm', im)
                key = cv2.waitKey(0)

                if key == 27:  # (escape to quit)
                    sys.exit()
                elif key in keys:
                    responses.append(int(chr(key)))
                    sample = roismall.reshape((1, 100))
                    samples = np.append(samples, sample, 0)

    responses = np.array(responses, np.float32)
    responses = responses.reshape((responses.size, 1))
    print("training complete")

    np.savetxt('generalsamples.data', samples)
    np.savetxt('generalresponses.data', responses)

Now we move on to the training and testing part.

For the testing part, I used the image below, which has the same type of letters I used for the training phase.

[image]

For training we do as follows:

- Load the text files we saved earlier.
- Create an instance of the classifier we are using (KNearest in this case).
- Use the KNearest.train function to train on the data.

For testing, we do as follows:

- Load the image used for testing.
- Process the image as before and extract each digit using contour methods.
- Draw a bounding box for it, then resize it to 10x10 and store its pixel values in an array as before.
- Use the KNearest.find_nearest() function to find the nearest item to the one we gave. (If lucky, it recognizes the correct digit.)

I included the last two steps (training and testing) in the single program below:

    import cv2
    import numpy as np

    #######   training part    ###############
    samples = np.loadtxt('generalsamples.data', np.float32)
    responses = np.loadtxt('generalresponses.data', np.float32)
    responses = responses.reshape((responses.size, 1))

    # OpenCV 2.4 API; OpenCV 3+ moved this to cv2.ml.KNearest_create(),
    # with train(samples, cv2.ml.ROW_SAMPLE, responses) and findNearest()
    model = cv2.KNearest()
    model.train(samples, responses)

    ############################# testing part  #########################

    im = cv2.imread('pi.png')
    out = np.zeros(im.shape, np.uint8)
    gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
    thresh = cv2.adaptiveThreshold(gray, 255, 1, 1, 11, 2)

    contours, hierarchy = cv2.findContours(thresh, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)

    for cnt in contours:
        if cv2.contourArea(cnt) > 50:
            [x, y, w, h] = cv2.boundingRect(cnt)
            if h > 28:
                cv2.rectangle(im, (x, y), (x+w, y+h), (0, 255, 0), 2)
                roi = thresh[y:y+h, x:x+w]
                roismall = cv2.resize(roi, (10, 10))
                roismall = roismall.reshape((1, 100))
                roismall = np.float32(roismall)
                retval, results, neigh_resp, dists = model.find_nearest(roismall, k = 1)
                string = str(int((results[0][0])))
                cv2.putText(out, string, (x, y+h), 0, 1, (0, 255, 0))

    cv2.imshow('im', im)
    cv2.imshow('out', out)
    cv2.waitKey(0)

And it worked. Below is the result I got:

[image]

Here it worked with 100% accuracy. I assume this is because all the digits are of the same kind and the same size.

But anyway, this is a good starting point for beginners (I hope so).
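To make the results, neigh_resp and dists values returned by find_nearest in the code above less opaque, here is a minimal NumPy-only sketch of what a brute-force k-nearest-neighbour lookup computes. It is not OpenCV's implementation: the helper name knn_predict, the use of squared Euclidean distance, the tie-breaking rule and the random toy data (standing in for flattened 10x10 digit crops) are all assumptions made for illustration.

```python
import numpy as np


def knn_predict(train_samples, train_labels, queries, k=1):
    """Brute-force k-NN (hypothetical helper, not an OpenCV function).

    Returns (results, neighbor_labels, dists), each with one row per query.
    """
    # Squared Euclidean distance between every query and every training row.
    diff = queries[:, np.newaxis, :] - train_samples[np.newaxis, :, :]
    sq_dists = np.sum(diff ** 2, axis=2)

    # Indices of the k closest training rows for each query, nearest first.
    nearest = np.argsort(sq_dists, axis=1, kind="stable")[:, :k]
    neighbor_labels = train_labels[nearest]
    dists = np.take_along_axis(sq_dists, nearest, axis=1)

    # Majority vote among the k neighbour labels (ties go to the smaller label).
    results = np.array([np.bincount(row).argmax() for row in neighbor_labels])
    return results, neighbor_labels, dists


# Toy data: three random "digit" prototypes of 100 pixel values each,
# with five slightly noisy training copies per class.
rng = np.random.default_rng(0)
prototypes = rng.integers(0, 256, size=(3, 100)).astype(np.float32)
train_labels = np.repeat(np.arange(3), 5)
train_noise = rng.normal(0, 5, size=(15, 100)).astype(np.float32)
train_samples = prototypes[train_labels] + train_noise

# One noisy query per class, classified with k = 3 neighbours.
queries = prototypes + rng.normal(0, 5, size=(3, 100)).astype(np.float32)
results, neighbor_labels, dists = knn_predict(
    train_samples, train_labels, queries, k=3
)
print(results.ravel())
```

With this toy data the script prints [0 1 2], one predicted label per query row. The same reading applies to the question's results.ravel(): it is the flattened array of predicted labels, one per input sample, which is why predicting on all 20000 rows of the letter data gave an array of size 20000.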
