Real-Time Sudoku Solver

Aarushi Dua
6 min read · Jun 19, 2021

Whether you are learning to play Sudoku and picking up the tips, techniques, and strategies that lead to the right solution, or you consider yourself a puzzle expert who scans rows and columns and solves a grid almost as quickly as a computer, the puzzle demands careful thinking either way. With the advances of artificial intelligence in computer vision, we can build an automatic Sudoku solver that works from a webcam.

Figure 1: Unsolved Sudoku grid (9*9)

The goal of Sudoku is to fill a 9x9 grid so that each row, each column, and each of the nine 3x3 subgrids contains every digit from 1 to 9 exactly once. In this article, we build a real-time Sudoku solver that recognizes the digits of a Sudoku puzzle and provides a digital solution using computer vision.

The Sudoku solver is built from a collection of basic image processing techniques. A good place to start is the OpenCV library, which can be compiled on almost all platforms. We also need Python-tesseract, imported as pytesseract, an optical character recognition (OCR) tool for Python that wraps the Tesseract engine. That is, it will recognize and “read” the text embedded in images.

We carried out the implementation in three steps:

1. Capture the Sudoku grid from the webcam image.

2. Extract and detect the digits.

3. Solve the puzzle and print the solution.

In the first step, we capture video frames continuously, using cv2.VideoCapture() to get a video capture object for the camera. Our main goal in this step is to find the largest grid in the frame. Contour detection is a good approach for this, and contours work best on binary images, so we start by converting the video frame to a binary frame.

We take the input frame, convert it to grayscale (cv2.cvtColor), blur it with Gaussian blurring (cv2.GaussianBlur), and then apply adaptive thresholding (cv2.adaptiveThreshold) to obtain a binary frame.

The left clip shows the grayscale and blurred frame and the right clip shows the binary frame after applying thresholding

Now that we have a binary frame, we can extract the biggest grid with the help of contours.

A contour is a curve joining all the continuous points along a boundary that have the same color or intensity.

OpenCV's cv2.findContours() function extracts the contours from the image, and cv2.contourArea() helps find the one with the biggest area. For the main grid, however, we cannot simply fix an area and extract it, because Sudoku grids can appear at different sizes. To overcome this, we detect the shape by approximating the contours. Our grid is a four-sided polygon, which cv2.approxPolyDP() can detect quite accurately. We then locate the 4 corners of the contour for accurate extraction and determine which is the upper-left, upper-right, bottom-left, and bottom-right corner.

After getting the boundary of the grid from the contours, we need a top-down view of it. We calculate the perspective transform matrix with cv2.getPerspectiveTransform and then warp the perspective (for example with cv2.warpPerspective) to grab the grid.

Warp image is obtained from the biggest contour

In the second step, we extract and detect the digits. We apply the same preprocessing to convert the warped image to binary, since high-contrast black-and-white images are easier to recognize. Before proceeding, we initialise an empty 9x9 grid to hold the Sudoku board digits as they are predicted. We then loop through each block and chop the Sudoku image into 81 cell images (9 rows by 9 columns) using regions of interest (ROI). The height and width of each block are the height and width of the warped image divided by 9. We also trim a margin from every side, as we don't want to include the boundaries (we assume 10 units as the thickness of each boundary).
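The cell arithmetic can be sketched in plain Python. The helper name cell_bounds and the 450x450 image size are assumptions for the example; the 10-unit margin is the one stated above.

```python
def cell_bounds(height, width, margin=10, n=9):
    """Return an n x n table of (top, bottom, left, right) pixel bounds."""
    cell_h, cell_w = height // n, width // n
    bounds = []
    for row in range(n):
        bounds_row = []
        for col in range(n):
            top = row * cell_h + margin
            bottom = (row + 1) * cell_h - margin
            left = col * cell_w + margin
            right = (col + 1) * cell_w - margin
            bounds_row.append((top, bottom, left, right))
        bounds.append(bounds_row)
    return bounds


grid = [[0] * 9 for _ in range(9)]   # empty board; 0 marks a cell not yet recognised
bounds = cell_bounds(450, 450)
print(bounds[0][0])   # (10, 40, 10, 40)
print(bounds[8][8])   # (410, 440, 410, 440)
```

For a NumPy image named warp, the ROI of a cell would then be warp[top:bottom, left:right].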

Chopped numerical images only

The cropped images are noisy, so we clean them first: a) Remove black lines near the 4 edges (if any). b) Keep the largest connected component (cv2.connectedComponentsWithStats), which should correspond to the digit, and turn the rest into white pixels. c) Check for blank cells: in a blank image most pixels are white and only a few are black, so we store 0 in the grid for that cell position.

We are left with only the digit images, which are recognized with pytesseract's image_to_string function.

Unsolved predicted grid (here 0 means a blank cell)
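A possible shape for the recognition call is sketched below. The helper names read_digit and parse_digit are hypothetical, and the Tesseract options (single-character page segmentation mode and a digit whitelist) are assumptions about a reasonable configuration, not the settings the project necessarily used. Running read_digit requires both pytesseract and the Tesseract binary to be installed.

```python
def parse_digit(text):
    """Return the first digit 1-9 found in OCR output, or 0 if there is none."""
    for ch in text:
        if ch in "123456789":
            return int(ch)
    return 0


def read_digit(cell_image):
    """Run Tesseract on one cleaned cell image and return a digit (0 if unreadable)."""
    import pytesseract  # imported lazily; needs the Tesseract binary installed

    # Assumption: --psm 10 asks Tesseract to treat the image as a single character,
    # and the whitelist restricts candidates to the digits 1-9.
    config = "--psm 10 -c tessedit_char_whitelist=123456789"
    return parse_digit(pytesseract.image_to_string(cell_image, config=config))


print(parse_digit("7\n\x0c"))   # 7
print(parse_digit(""))          # 0
```

Separating the parsing from the OCR call keeps the stray whitespace that OCR output often carries from reaching the grid.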

In the third step, we solve the Sudoku puzzle and print the solution back on the warped image. The grid obtained in the previous step is passed into our algorithm, which follows these rules:
1. First, check whether the given grid violates the Sudoku rules.
2. In the loop, start with the box that has the fewest possible choices and fill it with one of the remaining options.
3. Before assigning a number, check whether it is safe to assign: the same number must not be present in the current row, current column, or current 3x3 subgrid. After checking for safety, assign the number and recursively check whether this assignment leads to a solution. If it doesn't, try the next number for the current empty cell. Loop until every cell in the grid is non-zero.
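The three rules above can be sketched as a small backtracking solver. This is one possible reading of the rules, not the project's own code; the function names candidates, is_valid, and solve are assumptions, and the sample puzzle is a well-known example grid rather than the one in the figures.

```python
def candidates(grid, r, c):
    """Digits that can go in (r, c) without clashing with its row, column, or box."""
    used = set(grid[r]) | {grid[i][c] for i in range(9)}
    br, bc = 3 * (r // 3), 3 * (c // 3)
    used |= {grid[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return [d for d in range(1, 10) if d not in used]


def is_valid(grid):
    """Rule 1: True if no filled cell repeats a digit in its row, column, or box."""
    for r in range(9):
        for c in range(9):
            d = grid[r][c]
            if d == 0:
                continue
            grid[r][c] = 0                      # temporarily lift the digit
            ok = d in candidates(grid, r, c)
            grid[r][c] = d
            if not ok:
                return False
    return True


def solve(grid):
    """Rules 2 and 3: fill `grid` in place; return True if a solution exists."""
    empties = [(r, c) for r in range(9) for c in range(9) if grid[r][c] == 0]
    if not empties:
        return True
    # Rule 2: pick the empty cell with the fewest legal digits.
    r, c = min(empties, key=lambda rc: len(candidates(grid, *rc)))
    for d in candidates(grid, r, c):            # rule 3: only safe digits are tried
        grid[r][c] = d
        if solve(grid):
            return True
        grid[r][c] = 0                          # undo and try the next digit
    return False


puzzle = [
    [5, 3, 0, 0, 7, 0, 0, 0, 0],
    [6, 0, 0, 1, 9, 5, 0, 0, 0],
    [0, 9, 8, 0, 0, 0, 0, 6, 0],
    [8, 0, 0, 0, 6, 0, 0, 0, 3],
    [4, 0, 0, 8, 0, 3, 0, 0, 1],
    [7, 0, 0, 0, 2, 0, 0, 0, 6],
    [0, 6, 0, 0, 0, 0, 2, 8, 0],
    [0, 0, 0, 4, 1, 9, 0, 0, 5],
    [0, 0, 0, 0, 8, 0, 0, 7, 9],
]
solved = is_valid(puzzle) and solve(puzzle)
print(puzzle[0])   # [5, 3, 4, 6, 7, 8, 9, 1, 2]
```

Checking validity first matters in this pipeline because a misread digit from the OCR step can produce a grid that breaks the rules before solving even starts.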

Yeah! Our grid is now filled. The one last step is to map these predicted digits onto their positions in the warped image using cv2.putText().

The final output of the Solved Sudoku

Conclusion and Future Scope:

This real-time Sudoku solver solves a puzzle instantly and gives people the answers, helping them learn and check their own Sudoku solving. We built it with the help of computer vision. It solves the Sudoku puzzle held in front of the camera, detecting it with image filtering techniques: converting the image to greyscale, then applying Gaussian blurring and thresholding to get a binary image. We then find all the contours and pick the one with the biggest area, which is the Sudoku grid. The main task after that is to extract and detect the digits, which we do with pytesseract, a tool that recognizes and reads the text in images. After extracting the digits, we apply the solving algorithm to get the solution of the Sudoku problem and then print it on the warped image.

For future scope, we can develop an application that handles much more complex Sudoku puzzles, packaged as a user-friendly tool that provides a better user experience.

For the code of this project, check the GitHub link given below 👇
