Table of Contents
- Introduction
- Prerequisites
- Setup
- Creating a Dataset
- Preprocessing the Images
- Building a Convolutional Neural Network
- Training the Model
- Testing the Model
- Conclusion
Introduction
In this tutorial, we will explore how to build a handwriting recognition system using Python. Handwriting recognition, which is closely related to Optical Character Recognition (OCR), is the ability of a computer to recognize handwritten characters and convert them into digital text. By the end of this tutorial, you will have a Python program that classifies images of individual uppercase letters (A to Z) using a Convolutional Neural Network.
Prerequisites
To follow along with this tutorial, you should have a basic understanding of the Python programming language. Familiarity with machine learning concepts, particularly Convolutional Neural Networks (CNNs), is helpful but not strictly required.
Setup
Before we begin, let’s set up our Python environment and install the necessary libraries. Open your terminal or command prompt and create a new Python virtual environment:
python -m venv handwriting-recognition
Activate the virtual environment:
- For Windows:
handwriting-recognition\Scripts\activate
- For macOS and Linux:
source handwriting-recognition/bin/activate
Next, install the required libraries by running the following command:
pip install tensorflow keras numpy matplotlib opencv-python scikit-learn
We are now ready to build our handwriting recognition system.
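Before moving on, it can help to confirm that the packages are importable from the active environment. The following is a minimal sketch that uses only the standard library; the helper name `missing_packages` and the `REQUIRED_MODULES` list are hypothetical and not part of the original scripts. Note that OpenCV is imported as `cv2` and scikit-learn as `sklearn`, which differ from their pip package names.

```python
import importlib.util

# Import names used by the scripts in this tutorial (assumed list; adjust as needed).
REQUIRED_MODULES = ["tensorflow", "keras", "numpy", "matplotlib", "cv2", "sklearn"]


def missing_packages(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```

If anything is reported as missing, check that the virtual environment is activated and rerun the pip command.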
Creating a Dataset
To train our model, we first need a dataset of labeled character images. In general, a larger and more varied dataset gives the model more to learn from. For simplicity, we will generate a small synthetic dataset of 1000 samples by rendering letters with an OpenCV font rather than collecting real handwriting; each sample consists of a character image and its corresponding label.
- Create a new Python file named create_dataset.py.
- Import the necessary libraries:
import numpy as np
import cv2
import os
import random
- Define the function to create the dataset:
def create_dataset(num_samples, directory):
    characters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    dataset = []
    for _ in range(num_samples):
        label = random.choice(characters)
        # Draw the character in white on a black 32x32 image
        image = np.zeros((32, 32, 3), dtype=np.uint8)
        cv2.putText(image, label, (5, 25), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2)
        dataset.append((image, label))
    os.makedirs(directory, exist_ok=True)
    for i, (image, label) in enumerate(dataset):
        # Keep the label in the file name (for example A_0.png) so it can be recovered at training time
        filename = os.path.join(directory, f"{label}_{i}.png")
        cv2.imwrite(filename, image)
    print(f"Dataset created successfully at {directory}")
- Set the number of samples and the output directory:
num_samples = 1000
output_directory = "dataset"
- Call the create_dataset function:
create_dataset(num_samples, output_directory)
Run the script to create the dataset. Ensure that the dataset directory is created and contains 1000 PNG images representing handwritten characters.
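Because the classifier needs a label for every image, it is worth deciding how the label travels with each file. One possible convention, assumed here purely for illustration, is to name files `<label>_<index>.png`. The helpers `make_filename`, `parse_label`, and `count_labels` below are hypothetical names, not part of any library.

```python
import os
from collections import Counter

CHARACTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def make_filename(label, index):
    """Build a file name such as 'A_17.png' that keeps the label with the image."""
    return f"{label}_{index}.png"


def parse_label(filename):
    """Recover the label from a name produced by make_filename."""
    label = os.path.basename(filename).split("_")[0]
    if len(label) != 1 or label not in CHARACTERS:
        raise ValueError(f"Cannot read a label from {filename!r}")
    return label


def count_labels(filenames):
    """Count how many samples each character has, to spot class imbalance."""
    return Counter(parse_label(name) for name in filenames)


names = [make_filename(label, i) for i, label in enumerate("ABCA")]
print(names)
print(count_labels(names))
```

With randomly chosen labels and only 1000 samples, the per-class counts will not be perfectly even, so a quick count like this shows whether any letter is badly under-represented.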
Preprocessing the Images
Now that we have our dataset, we need to preprocess the images before feeding them into our model. Preprocessing involves performing operations such as resizing, normalization, and converting the images to grayscale.
- Create a new Python file named preprocess_images.py.
- Import the necessary libraries:
import cv2
import os
import numpy as np
- Define the function to preprocess the images:
def preprocess_images(input_directory, output_directory):
    os.makedirs(output_directory, exist_ok=True)
    for filename in os.listdir(input_directory):
        image = cv2.imread(os.path.join(input_directory, filename))
        if image is None:
            # Skip files that OpenCV cannot read as images
            continue
        # Convert image to grayscale
        grayscale_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        # Resize image to 28x28 pixels
        resized_image = cv2.resize(grayscale_image, (28, 28))
        # Save the 8-bit image. PNG cannot store floats in [0, 1], so scale
        # pixel values to [0, 1] when loading the images for training
        cv2.imwrite(os.path.join(output_directory, filename), resized_image)
    print(f"Images preprocessed successfully at {output_directory}")
- Set the input and output directories:
input_directory = "dataset"
output_directory = "preprocessed_images"
- Call the preprocess_images function:
preprocess_images(input_directory, output_directory)
Run the script to preprocess the images. Ensure that the preprocessed_images directory is created and contains 1000 preprocessed images.
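The normalization step deserves a closer look. The sketch below uses NumPy only, with a made-up 2x2 array standing in for a real image. It shows how 8-bit pixels are scaled to [0, 1], and why the scaled values are best kept in memory rather than stored as 8-bit integers, where almost all of the information would be lost.

```python
import numpy as np

# A stand-in for a grayscale image with 8-bit pixel values (illustrative data).
pixels = np.array([[0, 64], [128, 255]], dtype=np.uint8)

# Scale to floating-point values in [0, 1], the range the network is fed with.
normalized = pixels.astype(np.float32) / 255.0

# Casting the scaled values back to 8-bit integers discards almost everything:
# every pixel becomes 0 except full white, which becomes 1.
collapsed = normalized.astype(np.uint8)

# Multiplying by 255 before the cast recovers the original 8-bit image.
restored = np.rint(normalized * 255.0).astype(np.uint8)

print(normalized.min(), normalized.max())
print(collapsed.tolist())
print(restored.tolist())
```

A practical consequence is to store images on disk with their original 0 to 255 values and apply the division by 255 at the point where arrays are handed to the model.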
Building a Convolutional Neural Network
Now that our dataset is prepared, we can proceed to build a Convolutional Neural Network (CNN) that will learn and recognize the patterns in the handwritten characters.
- Create a new Python file named cnn_model.py.
- Import the necessary libraries:
import keras
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
- Define the function to create the CNN model:
def create_model():
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dense(26, activation='softmax'))
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    return model
- Call the create_model function:
model = create_model()
We have successfully created a CNN model for our handwriting recognition system.
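To see what this architecture does to a 28x28 input, the layer output sizes and parameter counts can be worked out by hand. The following sketch is plain Python arithmetic, assuming the Keras defaults for these layers (unpadded "valid" convolutions with stride 1, and 2x2 pooling with stride 2); the helper names are hypothetical.

```python
def conv_output(size, kernel=3):
    """Side length after an unpadded ('valid') convolution with stride 1."""
    return size - kernel + 1


def pool_output(size, pool=2):
    """Side length after non-overlapping max pooling (floor division)."""
    return size // pool


def conv_params(kernel, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters


def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return inputs * units + units


side = 28
side = pool_output(conv_output(side))   # Conv2D(32) then MaxPooling2D: 28 -> 26 -> 13
side = pool_output(conv_output(side))   # Conv2D(64) then MaxPooling2D: 13 -> 11 -> 5
flattened = side * side * 64            # Flatten over 64 feature maps

total = (
    conv_params(3, 1, 32)
    + conv_params(3, 32, 64)
    + dense_params(flattened, 128)
    + dense_params(128, 26)
)
print(side, flattened, total)
```

Under these assumptions the final feature maps are 5x5, the flattened vector has 1600 values, and the model has 227,098 trainable parameters, most of them in the first Dense layer. These figures can be compared against the output of model.summary().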
Training the Model
With our model ready, we can now train it using the preprocessed images as input.
- Create a new Python file named train_model.py.
- Import the necessary libraries:
import os
import numpy as np
from keras.utils import to_categorical
from keras.preprocessing.image import load_img, img_to_array
from sklearn.model_selection import train_test_split
- Define the function to load and preprocess the images:
def load_images(directory):
    characters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    images = []
    labels = []
    for filename in sorted(os.listdir(directory)):
        image = load_img(os.path.join(directory, filename), color_mode="grayscale", target_size=(28, 28))
        # Scale 8-bit pixel values to [0, 1]
        images.append(img_to_array(image) / 255.0)
        # Assumes the file name starts with its label, for example A_0.png;
        # the class index is the letter's position in the alphabet (A=0 ... Z=25)
        labels.append(characters.index(filename[0]))
    images = np.array(images)
    labels = to_categorical(np.array(labels), num_classes=26)
    return images, labels
- Set the input directory:
input_directory = "preprocessed_images"
- Load and preprocess the images:
images, labels = load_images(input_directory)
- Split the dataset into training and testing sets:
train_images, test_images, train_labels, test_labels = train_test_split(images, labels, test_size=0.2, random_state=42)
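- Create and train the model, then save it so that the test script can load it (the file name below is only an example):
from cnn_model import create_model

model = create_model()
model.fit(train_images, train_labels, epochs=10, batch_size=32, validation_data=(test_images, test_labels))
model.save("handwriting_model.keras")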
We have now trained our model on the handwritten character dataset.
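The labels passed to the model must be one-hot vectors of length 26, which is what `to_categorical` produces from integer class indices. The following is a NumPy-only sketch of that conversion, assuming the labels are uppercase letters; the function name `letters_to_one_hot` is hypothetical.

```python
import numpy as np

CHARACTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def letters_to_one_hot(letters):
    """Map letters to class indices (A=0 ... Z=25), then to one-hot rows."""
    indices = np.array([CHARACTERS.index(letter) for letter in letters])
    one_hot = np.zeros((len(letters), len(CHARACTERS)), dtype=np.float32)
    # Set exactly one entry per row: the column of that sample's class.
    one_hot[np.arange(len(letters)), indices] = 1.0
    return indices, one_hot


indices, one_hot = letters_to_one_hot(["A", "C", "Z"])
print(indices.tolist())
print(one_hot.shape)
```

Class indices must stay within 0 to 25 for a 26-way softmax output, so values such as a running file counter cannot serve as labels.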
Testing the Model
To evaluate the performance of our handwriting recognition system, we need to test it on unseen data.
- Create a new Python file named test_model.py.
- Import the necessary libraries:
import os
import numpy as np
from keras.preprocessing.image import load_img, img_to_array
- Define the function to load and preprocess a single image:
def load_image(file_path):
    image = load_img(file_path, color_mode="grayscale", target_size=(28, 28))
    # Scale pixel values to [0, 1], then add a batch dimension
    image = img_to_array(image) / 255.0
    image = np.expand_dims(image, axis=0)
    return image
- Set the path to the test image:
test_image_path = "test_image.png"
- Load and preprocess the test image:
test_image = load_image(test_image_path)
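- Load the trained model and make a prediction. This assumes train_model.py saved the model with model.save; the file name below is only an example:
from keras.models import load_model

model = load_model("handwriting_model.keras")
prediction = np.argmax(model.predict(test_image), axis=-1)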
- Print the predicted character:
characters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
predicted_character = characters[prediction[0]]
print(f"Predicted character: {predicted_character}")
Run the script to test the model on a sample image. Ensure that the predicted character matches the actual character in the image.
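The prediction step turns a vector of 26 softmax probabilities into a letter by taking the index of the largest value. The sketch below uses a hand-made probability vector in place of real `model.predict` output, so no trained model is needed. `top_k_characters` is a hypothetical helper that also reports the runners-up, which can help when a prediction looks wrong.

```python
import numpy as np

CHARACTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def top_k_characters(probabilities, k=3):
    """Return the k most likely (character, probability) pairs, best first."""
    order = np.argsort(probabilities)[::-1][:k]
    return [(CHARACTERS[i], float(probabilities[i])) for i in order]


# Illustrative probabilities: most mass on 'O', some on 'Q' and 'C'.
probabilities = np.full(26, 0.01)
probabilities[CHARACTERS.index("O")] = 0.60
probabilities[CHARACTERS.index("Q")] = 0.12
probabilities[CHARACTERS.index("C")] = 0.05
probabilities /= probabilities.sum()

best = CHARACTERS[int(np.argmax(probabilities))]
print(best, top_k_characters(probabilities))
```

Visually similar letters such as O and Q often appear together near the top, and a small gap between the first and second probability is a sign that the prediction is uncertain.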
Conclusion
Congratulations! You have successfully built a handwriting recognition system using Python. In this tutorial, we learned how to create a dataset of handwritten characters, preprocess the images, build a Convolutional Neural Network model, train it on the dataset, and test the model’s performance.
With further improvements and a larger dataset, you can enhance the accuracy of the handwriting recognition system. This technology has various practical applications, such as digitizing handwritten documents, processing handwritten forms, and enabling text input via handwriting on devices.
Feel free to explore and experiment with different architectures, optimization algorithms, and augmentation techniques to improve the accuracy of your handwriting recognition system.
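As one example of augmentation, an image can be shifted by a few pixels so that the model sees each character in slightly different positions. The following is a NumPy-only sketch that assumes a 2-D grayscale array with a black background; `shift_image` is a hypothetical helper, and libraries such as Keras offer their own augmentation tools.

```python
import numpy as np


def shift_image(image, dx, dy):
    """Shift a 2-D image by (dx, dy) pixels, filling exposed areas with zeros."""
    height, width = image.shape
    shifted = np.zeros_like(image)
    if abs(dx) >= width or abs(dy) >= height:
        # The whole image moves out of frame, so nothing remains to copy.
        return shifted
    # Source and destination windows that remain inside the image bounds.
    src_x = slice(max(0, -dx), min(width, width - dx))
    dst_x = slice(max(0, dx), min(width, width + dx))
    src_y = slice(max(0, -dy), min(height, height - dy))
    dst_y = slice(max(0, dy), min(height, height + dy))
    shifted[dst_y, dst_x] = image[src_y, src_x]
    return shifted


image = np.zeros((5, 5), dtype=np.uint8)
image[2, 2] = 255                        # a single bright pixel in the centre
moved = shift_image(image, dx=1, dy=-1)  # one pixel right, one pixel up
print(moved)
```

Applying small random shifts to training images only, while leaving the test images untouched, is a common way to use a helper like this.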
When you have finished, deactivate the virtual environment:
deactivate
Happy coding!