Evaluate machine learning model accuracy by testing predictions against unseen data. Learn how to effectively compare model outputs to actual results using Python.
Key Insights
- Evaluate model accuracy by comparing predictions generated with model.predict() against unseen test data, allowing for assessment of how well the model generalizes.
- Convert the test labels from a Pandas series to a list to facilitate a clear side-by-side comparison between predicted values and actual test outcomes.
- Examine prediction accuracy visually for small datasets (around 31 rows), noting that predictions often approximate actual values closely, though occasional discrepancies occur.
Note: These materials offer prospective students a preview of how our classes are structured. Students enrolled in this course will receive access to the full set of materials, including video lectures, project-based assignments, and instructor feedback.
Okay, let's test our model now properly. And again, we withheld some of our data, our test data. We can now see what it thinks.
By giving it this test, we're giving it a quiz, like, okay, you learned your math, now what is 10 minus 6? All right, as if we were trying to teach it subtraction. Or you learned cats and dogs, now, is this a cat? You haven't seen this one before, but based on what you learned, is this a cat or a dog? And we'll see how accurate it was. All right, so our test data is small enough, it's only, I think, 31 different rows, that I think we can just take a look at it.
We'll say, okay, let's make a model predictions variable. And say it's what calling model.predict evaluates to. Predict is just a method that our model has on it now.
And what we pass it this time is not the X and y, because we don't want it to have the answer. Instead, we just say, hey, look at the X test data and give me your set of predictions. Run that block and then let's print it out.
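A minimal sketch of what that block might look like, assuming a scikit-learn-style regression model already fit earlier in the lesson (the variable names model, X_test, and model_predictions are assumptions, not necessarily the exact names from the notebook):

```python
# Ask the trained model for predictions on the held-out features only.
# We deliberately do NOT pass y_test -- the model shouldn't see the answers.
model_predictions = model.predict(X_test)

# Print the raw predictions so we can eyeball them.
print(model_predictions)
```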
And those are certainly some predictions. Are they good? Well, we actually have the answers. We could test it against y test.
We can say, okay, print out y test. Actually, we want the list version of y test, because the model predictions print like a plain list of numbers and y test is not a list, it's a Pandas series. Converting it to a list will make the two look pretty similar side by side.
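One way that conversion and side-by-side check might look, continuing the assumed variable names from the sketch above:

```python
# y_test is a Pandas Series; list() gives a plain Python list,
# which prints in a format similar to the predictions above.
actual_values = list(y_test)
print(actual_values)

# Optional: pair each prediction with its actual value for easier eyeballing.
for predicted, actual in zip(model_predictions, actual_values):
    print(f"predicted {predicted:.2f}  actual {actual:.2f}")
```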
All right, so some of these are accurate and some of them are gonna be a little off. 26.6 compared to 31.39, that's reasonably close.
This one's also reasonably close. It guessed 16.6, it was actually 19. This one's a little more off.
14.69 compared to 22. That's like 50% off. This one, the fourth one, is also super off.
Some of them are gonna be on, some of them are gonna be off. But they're all reasonably close. And some of them are gonna be really, really close.
All right, looking for an example here, this 39 against, I think, the 46, that's pretty close. And this 19.39 is super close to the 19.58, if I'm counting right. I'm not positive I am.
Great news is we're eyeballing it and seeing that it's pretty close. But we also have a way to directly measure how close these answers are.
Let's take a look at that next.