
Handwritten Character Digit Classification using Neural Network



Once upon a time, when we had simpler questions like “What is the problem?”, structured datasets were enough to report numbers. Fast forward a few decades, and we have more complex questions to answer, like “Why is this problem happening?”, and with complex problems come complex, unstructured datasets. To rescue us from all this complexity come neural networks, which let machines learn and resolve those complexities for us, scaling with each level of complexity.

One such complex problem is handwriting recognition. Imagine writing by hand on paper or a tablet and having it translated into typed text, with no more redoing! Imagine not racking your brains to decipher a doctor’s handwriting. Imagine a child with dysgraphia, a condition that results in poor handwriting, not struggling in the classroom.

All this can happen with a handwriting recognition tool, which classifies text from an image. This tool has a graphical user interface with a canvas on which a user can write any English word freehand, and the model sitting at the backend recognizes the word. For this tool, a Multi-Layer Perceptron (MLP) classifier with the Adam solver and a sigmoid activation function was used, reaching 95.6% accuracy on the test set.

Dataset

The dataset used was created by the National Institute of Standards and Technology (NIST). The NIST Special Database 19 consists of roughly 0.7 million sample PNG images. The current model has been trained only on uppercase letters (A-Z). The following table lists the number of observations per character:

Table 1: Number of Observations Per Character

A: 7,010      Q: 2,566      g: 3,839      w: 2,699
B: 4,091      R: 4,536      h: 9,713      x: 2,820
C: 2,792      S: 23,827     i: 2,788      y: 5,088
D: 4,945      T: 10,927     j: 1,920      z: 2,726
E: 5,420      U: 14,146     k: 2,562      0: 34,803
F: 10,203     V: 4,951      l: 16,937     1: 38,049
G: 2,575      W: 5,026      m: 2,634      2: 34,184
H: 3,271      X: 2,731      n: 12,856     3: 35,293
I: 13,179     Y: 2,359      o: 2,761      4: 33,432
J: 3,962      Z: 2,698      p: 2,401      5: 31,067
K: 2,473      a: 11,196     q: 3,115      6: 34,037
L: 5,390      b: 5,551      r: 15,934     7: 35,796
M: 10,027     c: 11,315     s: 2,698      8: 33,884
N: 9,149      d: 11,421     t: 20,793     9: 33,720
O: 28,680     e: 28,299     u: 2,837
P: 9,277      f: 2,493      v: 2,854

Preprocessing

Each character in the original dataset occupies a 128 x 128-pixel raster (Fig 1 a). To avoid heavy computation, each image was reduced to 56 x 56 pixels (Fig 1 b). The canvas was then reduced further to 28 x 28 pixels by removing the padding (Fig 1 c), which yields 784 features (28 x 28) per image. Each character was labelled sequentially from “A” to “Z”.

Figure 1: (a) Original 128 x 128-pixel raster as obtained from the NIST Database, (b) 56 x 56-pixel raster upon resizing the image, (c) 28 x 28-pixel raster upon removing the padding from the resized image
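
The last preprocessing step can be sketched in a few lines of Python. This is only an illustration, not the author's script: it assumes the padding is removed with a symmetric centre crop, and the helper names `center_crop` and `flatten` are hypothetical. The resize from 128 x 128 to 56 x 56 is not shown because it needs an image library.

```python
def center_crop(image, size):
    """Return the central size x size window of a 2-D list of pixels.

    Assumption: the padding is symmetric, so a centre crop removes it.
    """
    height, width = len(image), len(image[0])
    top = (height - size) // 2
    left = (width - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]


def flatten(image):
    """Flatten a 2-D list of pixels row by row into one feature vector."""
    return [pixel for row in image for pixel in row]


# Stand-in for a resized 56 x 56 image: blank except for one inked pixel.
resized = [[0] * 56 for _ in range(56)]
resized[28][28] = 255

cropped = center_crop(resized, 28)   # 28 x 28 canvas
features = flatten(cropped)          # 784 values per character
print(len(cropped), len(cropped[0]), len(features))
```

Each flattened vector then becomes one row of the training matrix, with the letter as its label.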

The ‘tkinter’ package was used to create the canvas-like user interface. Once a user writes a word on the canvas, the tool converts the image into a 2-D NumPy array and traverses it column-wise, looking for a filled pixel that marks the beginning of a letter. For words, it continues traversing and looks for a run of columns with significant relative blank space, which marks the end of the current character and the beginning of the next. This lets the tool differentiate a break within a letter from the beginning of a new letter.
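
A minimal sketch of this column-wise scan is shown below. It is not the tool's actual code: the function name `segment_columns` and the fixed `min_gap` threshold are assumptions made for illustration (the tool is described as using a relative blank-space criterion), and the image is a plain list of rows so the example runs without NumPy. A NumPy array could be passed in via `.tolist()`.

```python
def segment_columns(image, min_gap=3):
    """Split a binary image (rows of 0/1) into per-character column spans.

    Returns (start, end) pairs with `end` exclusive. A run of at least
    `min_gap` blank columns closes the current character; shorter blank
    runs are treated as a break inside the same letter.
    """
    width = len(image[0])
    # A column counts as "filled" if any row has ink in it.
    filled = [any(row[col] for row in image) for col in range(width)]

    spans = []
    start = None      # first column of the character being scanned
    blank_run = 0     # consecutive blank columns seen since the last ink
    for col, has_ink in enumerate(filled):
        if has_ink:
            if start is None:
                start = col
            blank_run = 0
        elif start is not None:
            blank_run += 1
            if blank_run >= min_gap:
                spans.append((start, col - blank_run + 1))
                start = None
                blank_run = 0
    if start is not None:
        # The last character runs to the final inked column.
        spans.append((start, width - blank_run))
    return spans


# Two "letters": columns 1-4 (with a one-column break at 3) and columns 8-9.
word = [
    [0, 1, 1, 0, 1, 0, 0, 0, 1, 1],
    [0, 1, 0, 0, 0, 0, 0, 0, 0, 1],
]
spans = segment_columns(word, min_gap=3)
print(spans)  # [(1, 5), (8, 10)]
```

The single blank column at index 3 stays inside the first span, while the three-column gap at indices 5 to 7 separates the two characters.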

Figure 2: User-Input Interpretation by the Model

Figure 3: User-Interface

Challenges

Several challenges came up while designing and building this tool:

#1: Hyperparameter tuning on the original dataset demanded heavy computational power. To work within the limits of the personal computer used to build this model, the dataset was split into batches of five characters (e.g. letters A-E and F-I, etc.), and the model was trained and tested using these batches.

#2: An imbalance in the number of observations among the characters in the dataset caused complications during the training and testing phases (for example, 10k and 2.5k images for J and K respectively). To overcome this, a script was created that split the data into train and test sets for each character individually and later merged them to ensure a well-balanced dataset.
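
The per-character split can be sketched as follows. This is an illustration under stated assumptions, not the author's script: the function name `split_per_character`, the 20% test fraction, and the seed are all hypothetical, and the samples are stand-in integers rather than images.

```python
import random


def split_per_character(samples_by_label, test_fraction=0.2, seed=0):
    """Split each character's samples separately, then merge the results.

    Every label contributes the same fraction of its samples to the test
    set, so a rare letter is not crowded out by a frequent one.
    """
    rng = random.Random(seed)
    train, test = [], []
    for label, samples in samples_by_label.items():
        shuffled = list(samples)
        rng.shuffle(shuffled)
        n_test = round(len(shuffled) * test_fraction)
        test += [(sample, label) for sample in shuffled[:n_test]]
        train += [(sample, label) for sample in shuffled[n_test:]]
    return train, test


# Toy data: one frequent letter and one rare letter.
data = {"J": list(range(40)), "K": list(range(10))}
train, test = split_per_character(data, test_fraction=0.2)

test_counts = {label: sum(1 for _, l in test if l == label) for label in data}
print(len(train), len(test), test_counts)
```

With these toy sizes, both letters send exactly one fifth of their samples to the test set, which is the property a single random split over the merged data does not guarantee.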

#3: The English alphabet contains letters that look very similar to each other (e.g. B and P, D and O). While training and testing, the model struggled to classify these letters accurately. To minimize misclassification for these letters, the model was trained and its hyperparameters tuned for these letters separately with a larger dataset.

Neural Network Model Configuration

For this tool, a Multi-Layer Perceptron (MLP) classifier was trained using backpropagation. Below is the configuration of the neural network:

  • Hidden Layer Size: (100,100,100) i.e., 3 hidden layers with 100 neurons in each
  • Activation Function: logistic sigmoid, returns f(x) = 1 / (1 + exp(-x))
  • Solver for weight optimization: stochastic gradient-based optimizer (“Adam”)
  • Early Stopping (to avoid overfitting): True
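
To make the configuration above concrete, here is a small sketch of the activation function and of the network's size. The layer widths come from the list above plus two assumptions: 784 inputs (one per pixel of a 28 x 28 image) and 26 outputs (one per uppercase letter). The settings also look like they map onto scikit-learn's `MLPClassifier` arguments (`hidden_layer_sizes`, `activation="logistic"`, `solver="adam"`, `early_stopping=True`), though the article does not name the library.

```python
import math


def sigmoid(x):
    """Logistic sigmoid f(x) = 1 / (1 + exp(-x)), written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)          # safe for large negative x
    return e / (1.0 + e)


def count_parameters(sizes):
    """Weights plus biases for a fully connected network with these layer sizes."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


# Assumed shape: 784 inputs, three hidden layers of 100, 26 output classes.
layer_sizes = [784, 100, 100, 100, 26]

print(sigmoid(0.0))                   # 0.5
print(count_parameters(layer_sizes))  # 101326
```

Most of the roughly 100k parameters sit in the first layer (784 x 100 weights), which is one reason shrinking the input image to 28 x 28 pixels keeps training affordable.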


Figure 4: Model Architecture

Results

Table 2: Results Summary

Number of Samples (Test Set): 74,491
Correctly Classified: 71,227
Accuracy: 95.6%

Character  Attempts  Correctly Classified  Accuracy  Misclassified With
A          2,774     2,696                 97.2%     ‘B’, ‘H’, ‘K’, ‘N’, ‘R’, ‘X’
B          1,734     1,572                 90.7%     ‘A’, ‘D’, ‘E’, ‘G’, ‘H’, ‘R’, ‘S’
C          4,682     4,544                 97.1%     ‘E’, ‘G’, ‘L’, ‘O’
D          2,027     1,776                 87.6%     ‘B’, ‘O’, ‘P’, ‘Q’
E          2,288     2,115                 92.4%     ‘B’, ‘C’, ‘F’, ‘G’, ‘K’, ‘S’
F          233       210                   90.4%     ‘E’, ‘P’, ‘T’
G          1,152     1,050                 91.1%     ‘B’, ‘C’, ‘E’, ‘O’, ‘Q’
H          1,444     1,278                 88.5%     ‘A’, ‘B’, ‘K’, ‘N’, ‘R’
I          224       196                   87.4%     ‘J’, ‘L’, ‘T’, ‘Z’
J          1,699     1,589                 93.5%     ‘I’, ‘T’, ‘Z’
K          1,121     1,008                 90.0%     ‘A’, ‘E’, ‘H’, ‘M’, ‘N’, ‘R’, ‘X’, ‘Y’
L          2,317     2,255                 97.3%     ‘C’, ‘I’
M          2,467     2,337                 94.7%     ‘K’, ‘N’, ‘W’
N          3,802     3,606                 94.9%     ‘A’, ‘H’, ‘K’, ‘M’, ‘R’
O          11,565    11,338                98.0%     ‘C’, ‘D’, ‘G’, ‘Q’
P          3,868     3,778                 97.7%     ‘D’, ‘F’, ‘R’
Q          1,162     1,009                 86.8%     ‘D’, ‘G’, ‘O’
R          2,313     2,144                 92.7%     ‘A’, ‘B’, ‘H’, ‘K’, ‘N’, ‘P’
S          9,684     9,481                 97.9%     ‘B’, ‘E’
T          4,499     4,430                 98.5%     ‘F’, ‘I’, ‘J’
U          5,802     5,658                 97.5%     ‘V’, ‘W’
V          836       810                   96.8%     ‘U’, ‘W’, ‘Y’
W          2,157     2,017                 93.5%     ‘M’, ‘U’, ‘V’
X          1,254     1,163                 92.7%     ‘A’, ‘K’, ‘Y’
Y          2,172     2,055                 94.6%     ‘K’, ‘X’
Z          1,215     1,162                 95.6%     ‘I’, ‘J’
Figure 5: Accuracy per character
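
The accuracy figures are simply correct classifications divided by attempts. The short sketch below reproduces the overall figure from the summary and the figure for the first row of the per-character table; the helper name `accuracy_percent` is hypothetical.

```python
def accuracy_percent(correct, attempts):
    """Accuracy as a percentage, rounded to one decimal place."""
    return round(100.0 * correct / attempts, 1)


overall = accuracy_percent(71_227, 74_491)   # summary figures
first_row = accuracy_percent(2_696, 2_774)   # first per-character row
print(overall, first_row)  # 95.6 97.2
```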

Future Expansion 

While this tool serves as a base model for bridging the communication gap, more work needs to be done. Currently, the model can decipher letters and words, but with proper expansion it could also process phrases and paragraphs. Additionally, the UI/UX can be developed further to serve a wider audience.


Utkarsh Nigam

An aspiring Data Scientist currently pursuing MS in Data Science at The George Washington University, who also holds a B. Tech. in Computer Science, and an MBA in Marketing. He has work experience in Machine Learning, Statistical Modeling, Automation and Data Analytics, and wants to specialize in the field of Deep Learning.