The sudoku game is something almost everyone plays either on a daily basis or at least once in a while. The game consists of a 9×9 board with numbers and blanks on it. The goal is to fill the blank spaces with suitable numbers. These numbers can be filled keeping in mind some rules. The rule for filling these empty spaces is that the number should not appear in the same row, same column or in the same 3×3 grid. Wouldn’t it be interesting to use the concepts of OpenCV, deep learning and backtracking to solve a game of sudoku?
In this article, we will build an automatic sudoku solver using deep learning, OpenCV image processing and backtracking.
Using CNN and MNIST dataset
Now, since this game involves numbers let us make use of the simple MNIST dataset and build a neural network on it. We will use Keras to build this neural network.
AIM Daily XO
Join our editors every weekday evening as they steer you through the most significant news of the day, introduce you to fresh perspectives, and provide unexpected moments of joy
Import libraries and load the data
from sklearn.model_selection import train_test_split from tensorflow.keras import backend as K from tensorflow.keras.models import Sequential from tensorflow.keras.layers import Convolution2D, Lambda, MaxPooling2D from keras.utils.np_utils import to_categorical from keras.datasets import mnist (trainx,trainy),(testx,testy)=mnist.load_data() plt.imshow(trainx, cmap=plt.get_cmap('gray')) plt.show()
Let us pre-process the data by reshaping them and normalizing the data. I have also converted the target column into categorical values.
trainx = trainx.reshape(60000, 28, 28, 1) testx = testx.reshape(10000, 28, 28, 1) trainx=trainx/255 testx = testx / 255 trainy=np_utils.to_categorical(trainy) testy = np_utils.to_categorical(testy) num_classes=testy.shape
Build a sequential CNN and train
Here we will build a convolutional neural network with only three layers since we are using MNIST data and it is easy to converge.
Download our Mobile App
model = Sequential() model.add(Convolution2D(32, kernel_size=3, activation='relu', input_shape=(28, 28, 1))) model.add(Convolution2D(64, kernel_size=3, activation='relu') model.add(Flatten()) model.add(Dense(10, activation='softmax')) model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy']) model.fit(trainx,trainy, validation_data=(testx, testy), epochs=5)
Now that the model has been trained, we can save the weights of the model and use it for the sudoku game.
Using OpenCV for image processing
After we have built the model and saved it, let us now move on to the image processing part of the puzzle. For this, I have selected a random puzzle from the internet.
You can download the above image with this link. Once you have this image, let us use OpenCV and read this image.
import cv2 import numpy as np import matplotlib.pyplot as plt puzzle= cv2.imread(‘sudoku.png’) gray_puzzle= cv2.cvtColor(puzzle, cv2.COLOR_BGR2GRAY) plt.imshow(gray_puzzle,cmap='gray')
The next step here is to reduce the background noise in the image and set all the blanks to 0 value since the machine needs to understand where the numbers have to be filled.
To do this, we will use the OpenCV gaussian blur and the inverse binary threshold. Using the inverse binary threshold, the destination pixels will be set to zero if the condition that the source pixel is greater than zero is satisfied.
destination = cv2.GaussianBlur(gray_puzzle,(1,1),cv2.BORDER_DEFAULT) source_px,threshold_val = cv2.threshold(gray_puzzle, 180, 255,cv2.THRESH_BINARY_INV) plt.imshow(threshold_val,cmap='gray')
As the image shows, all the destination pixels have been converted to grayscale.
Now, we have the source and destination pixels but the image grids are not being read. This is important because the machine needs to understand that there are 9 3X3 grids present within the main grid. To do this, we will use the probabilistic hough transform and warping methods of Opencv.
grid = cv2.HoughgridP(threshold_val,1,np.pi/180,100,set_min=100,set_max=10) for lines in grid: hor_1,ver_1,hor_2,ver_2 = lines cv2.line(image,(hor_1,ver_1),(hor_2,ver_2),(0,255,0),2, cv2.LINE_AA) cv2.imwrite('tranform.jpg,image) hough_image = cv2.imread(transform.jpg',0) final = cv2.imread(‘transform.jpg') plt.imshow(final, cmap='gray')
The final step of image processing is to warp the image so that the complete top-down view can be obtained. This is an important step so that the machine understands how to read the image and where it should start reading the image from.
To do this, we need to contour the image and then find the width and height of the image as follows.
cont,value = cv2.findContours(hough_img,cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE) area = cont max_area = cv2.contourArea(area) for cont in cont: if cv2.contourArea(cont) > max_area: area = cont max_area = cv2.contourArea(cont) eps = 0.01*cv2.arcLength(area,True) final_cont = cv2.approxPolyDP(area, eps, True) def take_point(point): grid_rect = np.zeros((4, 2), dtype = "float32") summation = point.sum(axis = 1) grid_rect = point grid_rect = point diff = np.diff(point, axis = 1) grid_rect = point grid_rect = point return grid_rect def warping(image, point): grid_rect = take_point(point) (len1, len2, breadth1, breadth2) = grid_rect width_1 = np.sqrt(((breadth1 - breadth2) ** 2) + ((breadth1 - breadth2) ** 2)) width_2 = np.sqrt(((len2 - len1) ** 2) + ((len2 - len1) ** 2)) tot_width = max(int(width_1), int(width_2)) height_1 = np.sqrt(((len2 - breadth1) ** 2) + ((len2 - breadth1) ** 2)) height_2 = np.sqrt(((len1 - breadth2) ** 2) + ((len1 - breadth2) ** 2)) tot_height = max(int(height_1), int(height_2)) destination = np.array([ [0, 0], [0, tot_height - 1], [tot_width - 1, tot_height - 1], [tot_width - 1, 0]], dtype = "float32") M = cv2.getPerspectivelen2ansform(grid_rect, destination) warped = cv2.warpPerspective(image, M, (tot_width, tot_height)) return warped
Once the image is warped we can save this and move on to the predictions.
warped_img = wapring(threshold_val,final_cont)
Once we have the image and model ready, we can start loading them and making the predictions on the image.
Now, the first step is to write a function that will predict all the existing numbers on the grid.
def existining_pred(output_img): classes = sudokumodel.predict_classes(output_img) if classes == []: return 0 elif classes == []: return 1 elif classes == []: return 2 elif classes == []: return 3 elif classes == []: return 4 elif classes == []: return 5 elif classes == []: return 6 elif classes == []: return 7 elif classes == []: return 8 elif classes == []: return 9
Next, we need to extract the positions of the individual cells. This can be done as follows:
def getcells(warped_img): warped_img = cv2.resize(img,(252,252)) cells =  wt = warped_img.shape ht = warped_img.shape cell_wt = wt//9 cell_height = ht//9 hor_1,hor_2,ver_1,ver_2 = 0,0,0,0 for i in range(9): ver_2 = ver_1 + cell_height hor_1 = 0 for j in range(9): hor_2 = hor_1 + cell_wt current_cell = [hor_1,hor_2,ver_1,ver_2] cells.append(current_cell) hor_1 = hor_2 ver_1 = ver_2 return cells
Now, we will predict the numbers by writing the logic to understand the rules of the game as shown below.
def predictnums(cell,img): position =  img = cv2.resize(img,(252,252)) img = img[cell+2:cell-3,cell+2:cell-3] cont,value = cv2.findContours(img.copy(), cv2.RElen2_len2EE, cv2.CHAIN_APPROX_SIMPLE) if len(cont) != 0: for c in cont: x,y,w,h = cv2.boundinggrid_rect(c) if (w < 15 and x > 2) and (h < 25 and y > 2): position.append((x,y,x+w,y+h)) break if position == : result = 0 if position: img1 = img[(position):(position),(position):(position) img1 = cv2.resize(img,(28,28)) img1 = img1.reshape(1,28,28,1) result = existining_pred(img1) return result
Now, we will use the transformed image that was created earlier and extract the sudoku digits and print the board as per the extracted values.
def sudoku_dig(warped_img): cell_digits,num = ,0 cells = getcells(warped_img) for cell in range(len(cells)): num = predictnums(cells[cell],warped_img) cell_digits.append(num) n = 9 cell_digits = [cell_digits[i:i+n] for i in range(0, len(cell_digits), n)] return cell_digits puzzle = sudoku_dig(warped_img) def get_board(board_img): for i in range(len(board_img)): if i % 3 == 0 and i != 0: print("- - - - - - - - - - - - - ") for j in range(len(board_img)): if j % 3 == 0 and j != 0: print(" | ", end="") if j == 8: print(board_img[i][j]) else: print(str(board_img[i][j]) + " ", end="") get_board(puzzle)
Backtracking for making predictions
For solving this format of the board, we need to first let the machine find empty spaces and check for the valid numbers that can be put in the right spaces. Empty spaces are those that have 0 as the value.
def check_empty(board_img): for i in range(len(board_img)): for j in range(len(board_img)): if board_img[i][j] == 0: return (i, j) return None
After identification of this, let us now use backtracking to solve the sudoku puzzle. Backtracking is basically a recusrive algorithm that tests all possible paths that can be taken to give the best possible solution.
def backtrack(board_img, digits, position): for i in range(len(board_img)): if board_img[position][i] == digits and position != i: return False for i in range(len(board_img)): if board_img[i][position] == digits and position != i: return False board_img_x = position // 3 board_img_y = position // 3 for i in range(board_img_y*3, board_img_y*3 + 3): for j in range(board_img_x * 3, board_img_x*3 + 3): if board_img[i][j] == digits and (i,j) != position: return False def get_solution(board_img): check = check_empty(board_img) if not check: return True else: x, y = check for i in range(1,10): if backtrack(board_img, i, (x, y)): board_img[x][y] = i if get_solution(board_img): return True board_img[x][y] = 0 return False get_solution(puzzle) print_board(puzzle)
As you can see above, the final solution has the numbers in the grid that were obtained with backtracking.
In this article, we learned how to solve a game of sudoku using simple concepts of deep learning, OpenCV and backtracking. This model can be used on harder puzzles as well as sufficient training.