Week 4 (9 Dec – 16 Dec) – Handwriting Algorithm Explanation

From our understanding, the code recognizes handwriting using three components: a convolutional neural network (CNN), a recurrent neural network (RNN) and a Connectionist Temporal Classification (CTC) layer. The model recognizes words character by character, so it can also identify words that are not in the word data set, as long as they are written neatly. First, the image is fed into the CNN, which extracts the relevant features from it. The output of this step is a downsized feature map of the image. This output is then fed into the RNN, which propagates the relevant information through the sequence of features. A Long Short-Term Memory (LSTM) implementation of the RNN is used because it can carry information over longer distances, leading to more accurate results. The RNN outputs a matrix of character scores. During training, the CTC uses this matrix together with the ground truth text to compute the loss value; during recognition, the CTC decodes the final text from the matrix alone.

In the context of the code implementation:

  1. A grey-scale image of 128 x 32 is used as input (the image may need to be rescaled beforehand)
  2. The image is copied into a white target image of 128 x 32 so that it fits and is scaled properly
  3. The CNN generates the feature map from the input image
  4. The RNN outputs a matrix of scores that reflects how strongly the features at each position correlate with each character
  5. These scores are compared against the character list of the data set to pick out the characters
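
To make steps 1 and 2 more concrete, here is a minimal sketch of how an image could be scaled and copied into a white 128 x 32 target. It is not the code from the source article: the function name `fit_to_target`, the nearest-neighbour resize written in NumPy, and the choice to place the image in the top-left corner are all our own assumptions for illustration. The real implementation may use an image library for resizing instead.

```python
import numpy as np

def fit_to_target(img, target_w=128, target_h=32):
    """Scale a grey-scale image to fit a white target_w x target_h canvas.

    Hypothetical helper for illustration only (not from the source code).
    img is a 2-D uint8 array of shape (height, width).
    """
    h, w = img.shape
    # One scale factor for both axes, so the aspect ratio is preserved.
    scale = min(target_w / w, target_h / h)
    new_w = max(1, min(target_w, int(w * scale)))
    new_h = max(1, min(target_h, int(h * scale)))
    # Nearest-neighbour resize: choose a source row/column for each output pixel.
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    resized = img[rows[:, None], cols]
    # White canvas (255); the resized image goes in the top-left corner (assumption).
    target = np.full((target_h, target_w), 255, dtype=np.uint8)
    target[:new_h, :new_w] = resized
    return target

# Usage: a black 100 x 64 image is shrunk to 50 x 32 and padded with white.
example = fit_to_target(np.zeros((64, 100), dtype=np.uint8))
```

Using a single scale factor avoids stretching the handwriting, and the white padding fills whatever part of the 128 x 32 target the image does not cover.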

Where a character is unclear, the model fills in the blank with a prediction: the character most likely to follow the previous characters.
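
As a rough illustration of how text can be decoded from the RNN output matrix, the sketch below uses best path (greedy) decoding, which is one common way to decode CTC output. We are not certain this is the decoder the code actually uses. The function name `ctc_best_path`, the tiny two-character list, and the convention that the CTC blank is the last column are assumptions made for the example.

```python
def ctc_best_path(matrix, chars):
    """Greedy (best path) CTC decoding. Illustrative sketch only.

    matrix: one row of scores per time-step, one column per character,
            with the CTC blank as the LAST column (assumption).
    chars:  the character list, without the blank.
    """
    blank = len(chars)
    # 1. Take the highest-scoring label at every time-step.
    best = [max(range(len(row)), key=lambda i: row[i]) for row in matrix]
    # 2. Collapse repeated labels and 3. drop blanks.
    text = []
    prev = None
    for label in best:
        if label != prev and label != blank:
            text.append(chars[label])
        prev = label
    return "".join(text)

# Made-up scores for the character list "ab" (columns: a, b, blank).
scores = [
    [0.8, 0.1, 0.1],  # a
    [0.7, 0.2, 0.1],  # a (repeat, collapsed)
    [0.1, 0.1, 0.8],  # blank (separates the two a's)
    [0.6, 0.3, 0.1],  # a
    [0.1, 0.8, 0.1],  # b
]
print(ctc_best_path(scores, "ab"))  # prints "aab"
```

The blank label is what lets the output contain a doubled letter: without the blank between them, the repeated "a" time-steps would be collapsed into one.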

However,

Source: https://towardsdatascience.com/build-a-handwritten-text-recognition-system-using-tensorflow-2326a3487cd5
