Handwritten Character Recognition with Neural Network

Solanki Yash
5 min read · Mar 18, 2021

In this machine learning project, we will recognize handwritten characters, i.e., the English letters A-Z. We will do this by building a neural network and training it on a dataset of images of handwritten letters.

Architecture

The dataset contains 26 folders (A-Z) of handwritten images of size 28×28 pixels; each letter is centered in a 20×20 pixel box.

Each image is stored in grayscale.

First of all, we do all the necessary imports. We will see what each import is for as we use it.

Read the data:

Now we read the dataset using pd.read_csv() and print the first 10 rows using data.head(10). Each row holds one image.

Split data into images and their labels:

We split the data into the images and their corresponding labels. The ‘0’ column contains the labels, so we drop it from the data dataframe to get the pixel values and use it as y to form the labels.
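As a minimal sketch of this step, the snippet below builds a tiny synthetic CSV in memory (a stand-in for the real file, whose name is not given here) and separates the ‘0’ label column from the 784 pixel columns. The variable names X and y and the cast to float32 are assumptions made for illustration.

```python
import io

import numpy as np
import pandas as pd

# Synthetic stand-in for the dataset: column '0' is the label,
# columns '1'..'784' are pixel values. Six rows are enough to show the idea.
rng = np.random.default_rng(0)
n_rows = 6
labels = rng.integers(0, 26, size=n_rows)
pixels = rng.integers(0, 256, size=(n_rows, 784))
table = np.column_stack([labels, pixels])

header = ",".join(str(i) for i in range(785))
rows = "\n".join(",".join(map(str, row)) for row in table)
csv_text = header + "\n" + rows

# Read the CSV exactly as one would read a file on disk.
data = pd.read_csv(io.StringIO(csv_text)).astype("float32")

X = data.drop("0", axis=1)  # the 784 pixel columns
y = data["0"]               # the label column

print(X.shape, y.shape)
```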

Reshaping the data in the csv file so that it can be displayed as an image:

In this step, we split the data into training and testing datasets using train_test_split().

We also reshape the train and test image data so that they can be displayed as images. In the CSV file each image is stored as 784 columns of pixel data, so we convert each row to 28×28 pixels.
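The split and reshape could look roughly like the sketch below. It runs on random stand-in data, and the 20% test share and the random_state value are assumptions, not figures taken from this article.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Random stand-in data: 10 flattened images with 784 pixels each.
rng = np.random.default_rng(0)
X = rng.integers(0, 256, size=(10, 784)).astype("float32")
y = rng.integers(0, 26, size=10)

# Hold out 20% of the rows for testing (assumed ratio).
train_x, test_x, train_y, test_y = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Turn each row of 784 values back into a 28x28 image.
train_x = train_x.reshape(train_x.shape[0], 28, 28)
test_x = test_x.reshape(test_x.shape[0], 28, 28)

print(train_x.shape, test_x.shape)
```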

Making a dictionary to identify which letter corresponds to the predicted value:

All the labels are stored as floating point values, which we convert to integers. We then create a dictionary, word_dict, to map the integer values to the characters.
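One compact way to build such a mapping is sketched below. It assumes label 0 stands for A and label 25 for Z, which matches the A-Z ordering of the dataset but is not spelled out in the text.

```python
import string

# Map 0 -> 'A', 1 -> 'B', ..., 25 -> 'Z' (assumed label order).
word_dict = dict(enumerate(string.ascii_uppercase))

# Labels arrive as floats, so convert before looking them up.
label = int(7.0)
print(word_dict[label])  # H
```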

Plotting the number of images per letter in the dataset:

I convert the labels into integer values and increment the matching entry of the count list for each label. The count list then holds the number of images in the dataset belonging to each letter.

Next I create a list, alphabets, containing all the characters, using the values() function of the dictionary.

Finally, using the count and alphabets lists, we draw a horizontal bar plot.
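The counting step can be sketched as follows, using a handful of toy labels in place of the real label column. The plotting call is left out so the snippet stays dependency-light.

```python
import string

import numpy as np

word_dict = dict(enumerate(string.ascii_uppercase))

# Toy labels standing in for the real label column (floats, as in the CSV).
y = np.array([0.0, 0.0, 2.0, 25.0, 2.0, 2.0])

# One counter per letter; bump the counter that matches each label.
count = np.zeros(26, dtype="int")
for label in y.astype(int):
    count[label] += 1

# Letters in the same order as the counters.
alphabets = list(word_dict.values())

print(list(zip(alphabets, count))[:3])
```

Passing alphabets and count to matplotlib's plt.barh() would then draw the horizontal bar chart.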

Shuffling the data:

Now I shuffle some of the images from the train set. The shuffling is done with the shuffle() function so that we can display a random selection of images.

We then create 9 plots in a 3×3 grid and display the thresholded images of 9 letters.
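A rough sketch of the shuffle-and-threshold step is given below. It swaps in NumPy's permutation for the shuffle() call and a plain comparison for the thresholding, and the cutoff of 30 is a hypothetical value chosen only for illustration.

```python
import numpy as np

# Random stand-in for the training images.
rng = np.random.default_rng(0)
train_x = rng.integers(0, 256, size=(100, 28, 28)).astype("uint8")

# Shuffle by indexing with a random permutation (stand-in for shuffle()).
shuffled = train_x[rng.permutation(len(train_x))]

# Take 9 images and binarize them: pixels above the cutoff become 255, others 0.
CUTOFF = 30  # hypothetical threshold
sample = shuffled[:9]
thresholded = np.where(sample > CUTOFF, 255, 0).astype("uint8")

print(thresholded.shape)
```

Each of the 9 arrays can then be shown in one cell of a 3×3 grid, for example with matplotlib's plt.subplots(3, 3) and imshow().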

Data Reshaping:

Now we reshape the train and test image arrays so that they can be fed to the model. Convolution layers expect a channel dimension, so each 28×28 image becomes 28×28×1.
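In NumPy terms, this reshape adds a trailing axis of length 1. The sketch below uses zero-filled stand-in arrays, and the names train_X and test_X are assumptions.

```python
import numpy as np

# Stand-in image arrays of shape (number of images, 28, 28).
train_x = np.zeros((8, 28, 28), dtype="float32")
test_x = np.zeros((2, 28, 28), dtype="float32")

# Add a single grayscale channel: (n, 28, 28) -> (n, 28, 28, 1).
train_X = train_x.reshape(train_x.shape[0], 28, 28, 1)
test_X = test_x.reshape(test_x.shape[0], 28, 28, 1)

print(train_X.shape, test_X.shape)
```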

CNN:

CNN stands for Convolutional Neural Network, a type of network that extracts features from images using several layers of filters.

The convolution layers are generally followed by maxpool layers, which reduce the number of features extracted. Finally, the output of the convolution and maxpool layers is flattened into a one-dimensional vector and given as input to the Dense layer (the fully connected network).
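To make the size reduction concrete, here is a small arithmetic sketch of how one convolution and one maxpool layer change the spatial size of a 28×28 input before flattening. The 3×3 kernel, 2×2 pool, and 32 filters are hypothetical values, not the ones used in this project.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output width/height of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, pool, stride=None):
    """Output width/height of a max-pooling layer along one dimension."""
    stride = stride or pool
    return (size - pool) // stride + 1


size = 28
size = conv_out(size, kernel=3)  # 28 -> 26 with no padding
size = pool_out(size, pool=2)    # 26 -> 13

filters = 32                     # hypothetical number of filters
flat = size * size * filters     # length of the flattened vector

print(size, flat)  # 13 5408
```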

The model created is as follows:

This is the CNN model that we designed for training on the training dataset.

Compiling & Fitting Model:

Here I compile the model, defining the optimizing function and the loss function to be used for fitting. The optimizing function used is Adam.
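Since the exact layers are not listed in the text, the following is only an illustrative sketch of a small Keras CNN compiled with Adam. The layer sizes, the learning rate, and the sparse categorical cross-entropy loss (which expects integer labels) are all assumptions, and build_model is a hypothetical helper name.

```python
def build_model(num_classes=26):
    """Return a small compiled CNN; every layer size here is illustrative."""
    # Imported inside the function so TensorFlow is only loaded when needed.
    from tensorflow.keras import layers, models, optimizers

    model = models.Sequential([
        layers.Input(shape=(28, 28, 1)),
        layers.Conv2D(32, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),  # one score per letter
    ])

    # Adam is the optimizer named in the article; the loss is an assumption.
    model.compile(
        optimizer=optimizers.Adam(learning_rate=0.001),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

With a model built this way, a call such as model.fit(train_X, train_y, epochs=1, validation_data=(test_X, test_y)) returns a History object whose history dictionary holds the training and validation accuracy and loss per epoch.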

Getting the train and validation accuracies and Losses:

Doing Some Predictions on Test Data:

Here I create 9 subplots in a (3,3) grid and visualize some of the test dataset letters along with their predictions, which are made using the model.predict() function.
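Turning the output of model.predict() into letters comes down to taking the index of the highest score in each row. The sketch below uses a hand-made score array in place of real predictions and assumes the 0 to A, 25 to Z mapping of word_dict.

```python
import string

import numpy as np

word_dict = dict(enumerate(string.ascii_uppercase))

# Stand-in for model.predict(...): one row of 26 class scores per image.
probs = np.zeros((3, 26))
probs[0, 0] = 0.9
probs[1, 12] = 0.8
probs[2, 25] = 0.7

# The predicted class is the position of the largest score in each row.
predicted = [word_dict[int(i)] for i in np.argmax(probs, axis=1)]

print(predicted)  # ['A', 'M', 'Z']
```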

Conclusion:

I have successfully developed handwritten character recognition with Python, TensorFlow, and machine learning libraries.

Handwritten characters have been recognized with more than 98% test accuracy. This approach can be further extended to identify handwritten characters of other languages.
