Neural Network Training Process for Digit Recognition

Train a neural network model by feeding it data, allowing it to learn patterns and improve accuracy through repeated practice.

Understand how neural networks learn to recognize handwriting by analyzing patterns across thousands of images. Learn how these models progressively refine their accuracy through self-testing and pixel-level adjustments.

Key Insights

  • Neural network models learn handwriting recognition by analyzing patterns across 60,000 training images, each consisting of 28 by 28 pixel arrays, allowing them to generalize understanding to new images.
  • The model progressively improves accuracy through iterative self-testing processes, adjusting internal parameters such as neuron weights to emphasize important pixel relationships.
  • Upon testing with 10,000 new images, neural networks assign confidence scores to predictions, clearly indicating their certainty level, even when images closely resemble multiple digits.
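Concretely, the training data described above can be pictured as a stack of arrays. Here's a sketch using zero-filled placeholders (real pixel values would come from a dataset loader such as Keras's MNIST helper; the names below are illustrative):

```python
import numpy as np

# Placeholder for the 60,000 training images, each a 28 x 28 grid of pixels.
x_train = np.zeros((60_000, 28, 28), dtype=np.uint8)
y_train = np.zeros(60_000, dtype=np.uint8)   # one label (0-9) per image

n_images, height, width = x_train.shape
print(n_images, "images of", height * width, "pixels each")
# 60000 images of 784 pixels each
```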

Note: These materials offer prospective students a preview of how our classes are structured. Students enrolled in this course will receive access to the full set of materials, including video lectures, project-based assignments, and instructor feedback.

Before we start normalizing data, getting it all set up, and training our model, let's talk about that process a little bit more. We're going to feed the model our 60,000 28 × 28 arrays—our X train, our training data—and the labels—the answers—our Y train. And if we're talking to the model, we might say, “Hey, model, look at this.”

It's a handwritten eight. I want you to memorize all 784 numbers that make up this eight. Hey, here's another one.

This one's a five. There are 60,000 in total—so good luck. Read them all.

Pay close attention to patterns. Later, we're going to ask you to identify ones you've never seen before, and from studying these 60,000 and learning their patterns, you should be able to look at the next 10,000 and identify each one with roughly 98% accuracy. It’s going to do this through repetition.

We're going to see this in action, which is one of the neatest things about the neural network model system. It’ll detect patterns. It’ll start off doing okay, but it will keep improving. It will train itself.
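That self-improvement loop can be sketched in a few lines of numpy. This is a minimal stand-in, not the course's actual model: a single softmax layer trained on synthetic data (the real course presumably uses a full neural network library), but the rhythm is the same, predict, check, adjust, repeat:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the training set: 100 "images" of 784 pixels
# whose labels follow a hidden linear rule the model can discover.
X = rng.normal(size=(100, 784))
hidden_rule = rng.normal(size=(784, 10))
y = np.argmax(X @ hidden_rule, axis=1)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# The model's "knobs and dials": one weight per (pixel, digit) pair.
W = np.zeros((784, 10))
lr = 0.1
onehot = np.eye(10)[y]

for epoch in range(300):
    probs = softmax(X @ W)                     # quiz itself: predict
    grad = X.T @ (probs - onehot) / len(X)     # which way to turn each knob
    W -= lr * grad                             # adjust the weights
    acc = (probs.argmax(axis=1) == y).mean()   # check: still improving?

print(f"training accuracy after 300 rounds: {acc:.2f}")
```

Run it and the accuracy climbs from chance (about 10%) toward nearly perfect, which is exactly the "it will train itself" behavior described above.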


We’ll watch it train itself. It’ll quiz itself: “Okay, I’ve got this model. How accurate am I?” And it’ll answer, “Eh, not accurate enough. Let me see if I can improve it.” And it’ll run as many times as we want, trying to improve itself.

It will adjust its knobs and dials. It’ll apply different weights to the hidden-layer neurons. It’ll decide, “Okay, maybe this pixel is a little less important. This pixel’s a little more important. This aspect of this pixel, and its relationship to surrounding pixels, might matter more.”
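One way to picture “this pixel matters more”: in a single-layer model, each pixel gets one weight per digit, and the size of those weights encodes its influence. A hypothetical sketch (the weight matrix here is random, just to show the bookkeeping):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical trained weights: one row per pixel (784), one column per digit (10).
W = rng.normal(size=(784, 10))

# A pixel with a large-magnitude weight for some digit sways the prediction;
# one whose weights are near zero for every digit barely matters at all.
importance = np.abs(W).max(axis=1)
print("most influential pixel:", importance.argmax())
print("least influential pixel:", importance.argmin())
```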

It’ll keep looking at more and more numbers, quizzing itself to make sure it’s still improving. And when it’s done, we’ll be able to test it—show it the 10,000 testing images—and see how it performs. One of the interesting things about neural networks is they hedge their bets.

They assign a confidence score to each of the 10 digits. Most of the time, it’ll say something like, “I’m 99% sure that’s a nine.” And in fact, it’s often rounding to get to 99%—it might be 99.99999999.

For example, when a zero kind of looks like a six, the model might say, “I’m 53% sure it’s a zero and 47% sure it’s a six.” But hey, it leans slightly more toward zero, so it’ll say zero. It will also tell us how confident it was in that prediction.
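Those confidence scores come from a softmax over the model's ten raw scores. Here's a sketch for the ambiguous zero-or-six case, with made-up scores chosen to reproduce the 53/47 split:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())   # subtract max for numerical stability
    return e / e.sum()

# Hypothetical raw scores (logits) for one ambiguous image:
# index 0 is the digit 0, index 6 is the digit 6, the rest are clear "no"s.
logits = np.array([2.1, -10, -10, -10, -10, -10, 1.98, -10, -10, -10])
probs = softmax(logits)

print("prediction:", probs.argmax())   # 0 -- it leans slightly toward zero
print(f"P(0) = {probs[0]:.0%}, P(6) = {probs[6]:.0%}")   # P(0) = 53%, P(6) = 47%
```

The ten probabilities always sum to 1, so "53% zero, 47% six" is the model hedging its bet exactly as described.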

Now, our testing data is going to be very similar, right? We’re going to give it 28 × 28 arrays—10,000 of them—and say, “Okay, you’ve seen 60,000 of them. You’ve learned how to identify these digits. Hope you got the hang of it—now here’s your 10,000-point quiz.”

Good luck. But it’s going to do great. You’ll probably be surprised at how well it performs.
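Grading that 10,000-point quiz is just comparing predictions to the answer key. A sketch with fabricated results, planting exactly 200 mistakes to mirror the roughly 98% figure mentioned earlier:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical answer key and model predictions for a 10,000-image test set.
y_test = rng.integers(0, 10, size=10_000)
y_pred = y_test.copy()

# Plant 200 wrong answers (about what a 98%-accurate model would make).
wrong = rng.choice(10_000, size=200, replace=False)
y_pred[wrong] = (y_pred[wrong] + 1) % 10   # shift the label so it's always wrong

accuracy = (y_pred == y_test).mean()
print(f"test accuracy: {accuracy:.1%}")   # test accuracy: 98.0%
```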

Okay, next we’ll normalize the data and start training our model.
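The normalization step mentioned here is typically just rescaling pixel intensities from the 0-255 integer range into 0-1 floats, which keeps the numbers the network juggles small and uniform. A minimal sketch:

```python
import numpy as np

# A toy 1 x 3 "image" with raw pixel intensities (0-255 integers).
x_train = np.array([[0, 128, 255]], dtype=np.uint8)

# Scale to floats in [0, 1] before training.
x_norm = x_train.astype("float32") / 255.0
print(x_norm.min(), x_norm.max())   # 0.0 1.0
```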

Colin Jaffe

Colin Jaffe is a programmer, writer, and teacher with a passion for creative code, customizable computing environments, and simple puns. He loves teaching code, from the fundamentals of algorithmic thinking to the business logic and user flow of application building—he particularly enjoys teaching JavaScript, Python, API design, and front-end frameworks.

Colin has taught code to a diverse group of students since learning to code himself, including young men of color at All-Star Code, elementary school kids at The Coding Space, and marginalized groups at Pursuit. He also works as an instructor for Noble Desktop, where he teaches classes in the Full-Stack Web Development Certificate and the Data Science & AI Certificate.

Colin lives in Brooklyn with his wife, two kids, and many intricate board games.
