Evaluate your classification model's effectiveness by examining accuracy and error types. Understand the nuances of true positives, true negatives, false positives, and false negatives to enhance predictive performance.
Key Insights
- The overall accuracy of the evaluated classification model is 77%, indicating a relatively strong predictive capability on approximately 3,000 test samples.
- Sample assessments revealed variation in prediction accuracy, ranging from 90% correct (2 incorrect out of 20 samples) down to 75% (5 incorrect out of 20 samples).
- The article distinguishes clearly between the types of prediction errors: false negatives (predicted to stay but left) and false positives (predicted to leave but stayed), highlighting the importance of analyzing errors for improving model performance.
Note: These materials offer prospective students a preview of how our classes are structured. Students enrolled in this course will receive access to the full set of materials, including video lectures, project-based assignments, and instructor feedback.
Let's run the same evaluation we ran on the linear regression and see how we did. First, let's take a look at our predictions: we call the model's predict method on the test data. I'll save that as `predictions`. Then we'll print a list version of it, but there are going to be about 3,000 values here, and we don't want to print them all out. So we make a list of `y_test` and take the first 20, and then do the same thing for our predictions: just the first 20.
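In code, that step might look roughly like the sketch below. This is a minimal sketch, not the course's actual notebook: the variable names (`model`, `X_test`, `y_test`), the choice of `LogisticRegression`, and the synthetic stand-in data are all assumptions.

```python
# Minimal sketch, not the course's notebook: LogisticRegression on synthetic
# stand-in data (1 = left, 0 = stayed); names like model/X_test/y_test are assumed.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=10_000, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Predict on the held-out test data and peek at the first 20.
predictions = model.predict(X_test)
print(list(y_test)[:20])       # actual answers
print(list(predictions)[:20])  # model's predictions
```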
And they're not perfectly matched. In this case, we can see we got almost all of them right. All these zeros indicate employees who stayed. The top list is the actual answers. The third one actually left, but we did not predict that correctly. And the first one stayed, but we instead predicted that they left.
That's two wrong out of 20—90%. That's pretty good. How about we take a look at predictions from 20 to 40?
Let's look at the next 20. Here, we didn't do quite as well. We got one wrong here, another one wrong there, and there were three more employees who left that we didn't catch at all.
That's five wrong out of 20 in that case. That's only 75%. But these are tiny samples—20 predictions out of 3,000.
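Counting the mistakes in a window like this by hand is error-prone; continuing the sketch above, a couple of lines can do it. The slice indices simply mirror the window discussed here.

```python
# Look at the next window of 20 and count how many predictions disagree
# with the actual answers (variables from the sketch above).
print(list(y_test)[20:40])
print(list(predictions)[20:40])

wrong = sum(actual != predicted
            for actual, predicted in zip(y_test[20:40], predictions[20:40]))
print(f"{wrong} wrong out of 20")
```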
Let's get an actual score. For classification, the accuracy score is not about guessing the mean or getting numbers closer, the way it was for regression. It's just: how many predictions did you get correct out of how many predictions you made? So again, that last window would be 75% because we missed five out of 20, and the previous one was 90% because we only missed two out of 20.
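That definition, correct predictions divided by total predictions, is simple arithmetic:

```python
# Accuracy = correct predictions / total predictions.
print((20 - 2) / 20)  # first window: 2 wrong out of 20, so 0.90
print((20 - 5) / 20)  # second window: 5 wrong out of 20, so 0.75
```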
Let's take a look and have the computer check it. I may have gotten that arithmetic wrong in my head, but that's why we have computers.
Okay, so we're going to say: give me `model.score`. What does this evaluate to? We pass it the test data and the corresponding answers, and we get 77% overall. That's not bad.
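With scikit-learn, `.score()` on a classifier reports exactly this accuracy. Continuing the sketch above; note that the 77% figure comes from the course's data, so the synthetic stand-in here will give a different number.

```python
# For a classifier, .score() predicts on X_test internally and returns accuracy.
print(model.score(X_test, y_test))

# Same number, computed by hand: fraction of predictions that match the answers.
print((model.predict(X_test) == y_test).mean())
```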
That's pretty good. Next, we're going to analyze exactly what we got wrong, because although that's a good score overall, we're going to see that there's some real variance in what we got right and what we got wrong.
So let's take a look. One way to predict correctly is to predict that they stayed, and they did. That's ones like this: the third one, we predicted they stayed, and that was right. The other way is to predict they left, and they did. There's actually none of those in this sample. If I undo this and run it again, we might find one.
I think there were some, but maybe not. Let's try predictions from 40 to 60 and see if we find examples there. Counting along: one, two, three, four, five, six... no, that one was wrong. Let me count again: one, two, three, four, five, six. Yeah, this one is right: we predicted they left, and they did.
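Rather than counting positions by eye, we could ask for the spots in this window where we predicted "left" and the employee actually left. Continuing the sketch above; the window indices are assumptions matching the discussion here.

```python
# Positions in the 40-60 window where we predicted "left" (1) and the
# employee actually left (1), i.e. the cases we are hunting for by eye.
hits = [i for i in range(40, 60) if predictions[i] == 1 and y_test[i] == 1]
print(hits)
```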
When they match, that's a correct prediction. Predicting zero when the answer was zero is called a true negative. Predicting one when the answer was one is a true positive; and we didn't just guess, we predicted it. Now there are two different kinds of errors, and this will be important. The first: we predicted they stayed, but they left.
We predicted zero, but it was one. Here's an example where we predicted they stayed, but they actually left.
That's a false negative. We said, nope, they didn't leave. But actually, they did leave.
We predicted negative, but that was false. Then there is the opposite situation. We predicted they left, but they actually stayed.
That's a false positive. Counting along again, one, two, three, four, five, six: our sixth one here, we predicted they left, but they actually stayed. That's a false positive. We'll be analyzing this further because, although our overall score is good, we made some errors.
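All four outcomes can be counted at once. One way, continuing the sketch above, is scikit-learn's `confusion_matrix`; the course may tally these differently, so treat this as an illustrative sketch.

```python
from sklearn.metrics import confusion_matrix

# For labels 0 and 1, sklearn lays the matrix out as [[TN, FP], [FN, TP]].
tn, fp, fn, tp = confusion_matrix(y_test, predictions).ravel()
print("true negatives: ", tn)  # predicted stayed (0), actually stayed (0)
print("false positives:", fp)  # predicted left (1), actually stayed (0)
print("false negatives:", fn)  # predicted stayed (0), actually left (1)
print("true positives: ", tp)  # predicted left (1), actually left (1)
```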
Overall, it's a good score, but we have some errors. Which ones did we get wrong in general, and which ones did we get right? Let's take a look at those more advanced evaluations in a moment.