Enhance your data analysis capabilities by efficiently adding predictive insights to attendance data. Learn to leverage Pandas' powerful mapping tools for streamlined linear regression predictions.
Key Insights
- Implement predictive analytics by adding a new column, "predicted concessions," to the concessions data frame using Pandas' .map() method.
- Easily apply a pre-defined function, predict concessions, directly to attendance data to quickly generate predicted concession values.
- Assess linear regression predictions against actual data, noting generally accurate results but acknowledging the presence of distinct outliers.
Note: These materials offer prospective students a preview of how our classes are structured. Students enrolled in this course will receive access to the full set of materials, including video lectures, project-based assignments, and instructor feedback.
Our last bit of time in this notebook will be short. It won’t be a lot of code. We’re going to add a prediction for each day we have attendance numbers.
That way, somebody could input their own attendance numbers, rerun this code, and get predicted concessions based on this formula. All right, so we’re going to work with our data frame. We’re going to take the concessions data frame and assign to a new column, predicted concessions.
And here’s what we’ll set it to: predicted concessions should be our attendance column, mapped over. Now, a map operation means that we give it a function to run on every single attendance value, and it gives us back a new series that we’ll put into predicted concessions. So, .map is great.
You could pass it a lambda. You could pass it any type of function you want. This, again, is a built-in method on Pandas Series.
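To make the map idea concrete, here is a minimal sketch of .map with a lambda. The attendance numbers and the doubling function are invented purely for illustration; they are not the values or the formula from the notebook.

```python
import pandas as pd

# Hypothetical attendance figures, made up for this example only.
attendance = pd.Series([1200, 1850, 2400], name="attendance")

# Series.map calls the function once per element and returns a new Series.
# The original Series is left unchanged.
doubled = attendance.map(lambda x: x * 2)

print(doubled.tolist())  # [2400, 3700, 4800]
```

Any callable that takes one value and returns one value can go where the lambda is.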
And it’ll give us back a new series. We can say, yeah, the concessions data frame’s predicted concessions column is that series. Fortunately, we don’t even have to write a lambda.
We already named our function. We can pass in predict concessions, the function itself. And Pandas, again, will run this function on every single value in attendance to give us back a new column.
And it will be this one. Let’s take a look at the concessions data frame at that point.
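The step just described might look roughly like the sketch below. The names concessions_df, predict_concessions, and the column labels are assumed spellings, and the slope, intercept, and data rows are placeholders; in the notebook the function comes from the regression fitted earlier.

```python
import pandas as pd

# Placeholder coefficients (assumptions); the real ones come from the fitted regression.
SLOPE = 2.5
INTERCEPT = 100.0

def predict_concessions(attendance):
    """Return the predicted concessions for a single attendance value."""
    return SLOPE * attendance + INTERCEPT

# Made-up rows standing in for the real concessions data.
concessions_df = pd.DataFrame({
    "attendance": [1000, 2000, 3000],
    "concessions": [2700.0, 5000.0, 7400.0],
})

# Pass the function itself (no parentheses); pandas runs it on every attendance value.
concessions_df["predicted_concessions"] = concessions_df["attendance"].map(predict_concessions)

print(concessions_df)
```

Passing the named function keeps the prediction formula in one place, so rerunning the cell with new attendance rows reuses it as-is.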
You can see these values are generally pretty close, with the exception of that one super outlier. Where is it? I don’t see it. But it’s a fairly major outlier.
Maybe it’s this one. I think it’s somewhere around here. Here it is.
That’s a pretty big miss on the prediction. That’s a number that’s way off the line. So again, there are outliers.
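Rather than scanning the table by eye, one way to locate the biggest miss is to compute the gap between actual and predicted values. This is only a sketch with invented numbers and assumed column names, not the notebook's data.

```python
import pandas as pd

# Invented rows; the third one is deliberately far from its prediction.
df = pd.DataFrame({
    "attendance": [1000, 2000, 3000, 4000],
    "concessions": [2650.0, 5050.0, 3000.0, 10150.0],
    "predicted_concessions": [2600.0, 5100.0, 7600.0, 10100.0],
})

# Residual = actual minus predicted; a large absolute value means a big miss.
df["residual"] = df["concessions"] - df["predicted_concessions"]

# Index label of the row with the largest absolute residual.
worst = df["residual"].abs().idxmax()

print(df.loc[worst])
```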
The linear regression does a pretty good job—a linear regression through the mean. And we’re going to be comparing things to that linear regression as we go.
All right, we’re going to move on to the next notebook, 1.5. And I’ll see you folks there.