Evaluate machine learning predictions effectively by interpreting accuracy scores and detailed classification reports. Understand precisely how precision, recall, and F1 scores reveal your model's strengths and weaknesses.
Key Insights
- Using the KNN model, accuracy came out to 97%, corresponding to a single incorrect prediction out of 30 test cases.
- A detailed classification report from sklearn.metrics showed perfect precision and recall for the Setosa category, but revealed some confusion between Versicolor and Virginica.
- The classification report provided critical evaluation metrics, including precision, recall, and F1 score, helping to better understand the model's predictive performance.
Note: These materials offer prospective students a preview of how our classes are structured. Students enrolled in this course will receive access to the full set of materials, including video lectures, project-based assignments, and instructor feedback.
Let's check our score a couple of different ways. First, accuracy: out of all the predictions we made, how many were correct? We can get that by calling knn_model.score. And it looks like we need to give it some data to score.
We're missing two required positional arguments, X and y. To score the model, we need to hand it the testing data, so here's the X_test data.
Make your predictions based on that, and then here are the answers; tell me how many we got right. And that's pretty good: 97%.
That means we only missed about 3% of the test set, and with 30 test cases, getting exactly one wrong works out to roughly 97%. So we got one wrong out of 30.
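Here is a minimal sketch of that scoring step, assuming the same kind of KNN classifier and train/test split described in the lesson (the variable names and split settings below are illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load the iris data and hold out 30 samples as a test set (split settings are illustrative).
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=30, random_state=42
)

# Fit a KNN classifier on the training data.
knn_model = KNeighborsClassifier()
knn_model.fit(X_train, y_train)

# score() predicts on X_test internally, compares against y_test,
# and returns the fraction of correct predictions (accuracy).
accuracy = knn_model.score(X_test, y_test)
print(accuracy)  # one miss out of 30 would print roughly 0.9667, i.e. about 97%
```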
We could sit here and eyeball the predictions to try to figure out which one we missed. It's tempting, but we definitely got one of them wrong, and we can see what happened much more clearly with a classification report, which will tell us what we missed.
If you remember, we talked about precision and recall. Precision asks: of all the times we predicted a given category, how often were we right? Recall asks: of all the times a sample actually belonged to that category, how often did we identify it correctly? We can get both of those, plus the F1 score, which is the harmonic mean of precision and recall, from the classification report.
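To make those definitions concrete, here is a small illustrative sketch using sklearn's per-metric functions on made-up labels (these labels are hypothetical, not the lesson's actual test data):

```python
from sklearn.metrics import precision_score, recall_score, f1_score

# Hypothetical true labels and predictions for one binary category.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]

# Precision: of the samples we predicted as class 1, how many really were class 1?
precision = precision_score(y_true, y_pred)  # 3 of 4 predicted -> 0.75

# Recall: of the samples that really were class 1, how many did we catch?
recall = recall_score(y_true, y_pred)        # 3 of 4 actual -> 0.75

# F1: the harmonic mean of precision and recall.
f1 = f1_score(y_true, y_pred)                # 2 * (0.75 * 0.75) / (0.75 + 0.75) = 0.75

print(precision, recall, f1)
```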
The classification report is a function given to us by sklearn.metrics, so let's make a report with it.
We pass it the actual answers and our model's predictions, and, just to make the output easier to read, we also give it the iris data's target names. Then we print the report.
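As a sketch of that call, continuing with the illustrative variables from the earlier snippet (the lesson's actual variable names may differ):

```python
from sklearn.metrics import classification_report

# Predict on the test set, then build the report with human-readable class names.
predictions = knn_model.predict(X_test)
report = classification_report(
    y_test,                          # the actual answers
    predictions,                     # our model's predictions
    target_names=iris.target_names,  # "setosa", "versicolor", "virginica"
)
print(report)
```

The report prints per-class precision, recall, F1 score, and support, which is what lets us see where the mistake landed.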
And here it is. We can see that we had perfect precision and recall on setosas, but we got a little bit wrong in the versicolor and virginica. We'll dive into that more in the next video.