Predicting Revenue Using Attendance Data and Pandas

Enhance your data analysis capabilities by efficiently adding predictive insights to attendance data. Learn to leverage Pandas' powerful mapping tools for streamlined linear regression predictions.

Key Insights

Implement predictive analytics by adding a new column, "predicted concessions," to the concessions data frame using Pandas' .map() method.
Easily apply a pre-defined function, predict concessions, directly to attendance data to quickly generate predicted concession values.
Assess linear regression predictions against actual data, noting generally accurate results but acknowledging the presence of distinct outliers.

Note: These materials offer prospective students a preview of how our classes are structured. Students enrolled in this course will receive access to the full set of materials, including video lectures, project-based assignments, and instructor feedback.

Our final stretch in this notebook will be short, and it won’t take a lot of code. We’re going to add a prediction for each day we have attendance numbers.

That way, somebody could input their own attendance numbers, rerun this code, and get new predicted concessions based on this formula. All right, so we’re going to work with our data frame. We’re going to take the concessions data frame and add a new column, predicted concessions.

And we’ll set that value for each row. Predicted concessions should be our attendance column with a function mapped over it. A map operation means that we give it a function to run on every single attendance value, and it gives us back a new series that we’ll put into predicted concessions. So .map is great.

You could pass it a lambda. You could pass it any type of function you want. This, again, is a built-in method on Pandas Series.
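As a rough sketch of what mapping a lambda over a series looks like, here is a small example. The attendance figures, the slope, and the intercept are all made up for illustration; they are not the values fitted in the notebook.

```python
import pandas as pd

# Hypothetical attendance figures, for illustration only.
attendance = pd.Series([1000, 2500, 4000], name="attendance")

# Made-up coefficients; the notebook's fitted values are not shown here.
SLOPE = 2.0
INTERCEPT = 500.0

# .map() calls the lambda once per value and returns a new Series
# that keeps the index of the original.
predicted = attendance.map(lambda n: SLOPE * n + INTERCEPT)

print(predicted.tolist())
```

The original series is left unchanged; the result is a separate series that can be assigned wherever it is needed.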

It runs on each value and gives us back a new series. We can then say that the concessions data frame’s predicted concessions column is that series. Fortunately, we don’t even have to write a lambda.


We already named our function. We can pass in predict concessions, the function itself. And Pandas, again, will run this function on every single value in attendance to give us back a new column.
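Here is a minimal sketch of that step. The identifiers predict_concessions and concessions_df, the column names, the coefficients, and the data are all assumptions made for this example; the notebook’s own names and fitted values may differ.

```python
import pandas as pd

# Hypothetical stand-in for the function defined earlier in the notebook;
# the slope and intercept here are made up.
def predict_concessions(attendance):
    return 3.0 * attendance + 100.0

# Hypothetical data; the column names are assumed.
concessions_df = pd.DataFrame({
    "attendance": [200, 350, 500],
    "concessions": [720.0, 1130.0, 1640.0],
})

# Pass the function itself (no parentheses); pandas calls it once per value.
concessions_df["predicted_concessions"] = (
    concessions_df["attendance"].map(predict_concessions)
)

print(concessions_df)
```

Passing the function by name is equivalent to wrapping it in a lambda that takes one value and calls predict_concessions on it; the named version is just shorter.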

And it will be this one. Let’s take a look at the concessions data frame at that point.

You can see these values are generally pretty close, with the exception of that one super outlier. Where is it? I don’t see it. But it’s a fairly major outlier.

Maybe it’s this one. I think it’s somewhere around here. Here it is.

That’s a pretty big miss on the prediction. That’s a number that’s way off the line. So again, there are outliers.
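Rather than scanning the table by eye, one way to find the biggest miss is to compute the difference between actual and predicted values and ask pandas for the row with the largest gap. This is a sketch on made-up data with assumed column names, not the notebook’s actual numbers; the last row is deliberately far off.

```python
import pandas as pd

# Hypothetical data; the fourth row is a deliberately large miss.
concessions_df = pd.DataFrame({
    "attendance": [200, 350, 500, 650],
    "concessions": [720.0, 1130.0, 1640.0, 3900.0],
    "predicted_concessions": [700.0, 1150.0, 1600.0, 2050.0],
})

# Residual: actual value minus predicted value.
concessions_df["residual"] = (
    concessions_df["concessions"] - concessions_df["predicted_concessions"]
)

# Index label of the row with the largest absolute residual.
worst = concessions_df["residual"].abs().idxmax()

print(concessions_df.loc[worst])
```

Sorting by the absolute residual instead of taking a single maximum would list several of the largest misses at once.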

The linear regression does a pretty good job—a linear regression through the mean. And we’re going to be comparing things to that linear regression as we go.

All right, we’re going to move on to the next notebook, 1.5. And I’ll see you folks there.

Colin Jaffe

Colin Jaffe is a programmer, writer, and teacher with a passion for creative code, customizable computing environments, and simple puns. He loves teaching code, from the fundamentals of algorithmic thinking to the business logic and user flow of application building—he particularly enjoys teaching JavaScript, Python, API design, and front-end frameworks.

Colin has taught code to a diverse group of students since learning to code himself, including young men of color at All-Star Code, elementary school kids at The Coding Space, and marginalized groups at Pursuit. He also works as an instructor for Noble Desktop, where he teaches classes in the Full-Stack Web Development Certificate and the Data Science & AI Certificate.

Colin lives in Brooklyn with his wife, two kids, and many intricate board games.
