Discover how to effectively manipulate DataFrames by adding new items and filtering based on specific conditions. Learn practical techniques for handling multi-word data entries and calorie-based filtering using pandas.
Key Insights
- Demonstrates adding a new item ("hot dog") into a DataFrame, highlighting potential indexing pitfalls and effective troubleshooting by adjusting the insertion index (length of DataFrame plus two).
- Illustrates filtering a DataFrame to create a subset ("max 650 calorie df") containing only items with 650 calories or fewer, emphasizing conditional filtering using pandas.
- Shows how to filter items with names containing multiple words ("multi-word df") by leveraging string methods in pandas, specifically using the str.contains() method to match a space.
And here's the challenge: put hot dog back at the end of the food df. So pause and do the same thing you just did with blt, except for hot dog.
For hot dog you can just make up the values. Okay, hot dog into the df. We're going to say food df dot loc, square bracket.
We want the length of the food df. That's where the new item is going in. And that is going to be set equal to the new values. Call it hot dog; that's the name.
The price will be 450. The calories will be 350. Vegan is false, and the bread will be hot dog bun.
Oh, it went in and replaced a row. That's interesting. Why did it go in and replace? Oh, right, right.
It replaced falafel because the length is off: two items are missing. We want hot dog to come in at the end. But since two items are missing, instead of going in after blt, it goes in before blt.
So what we'll do is run this again and put the falafel back. Hot dog overwrote falafel.
So put it back. And the reason it overwrote it is we said we wanted to go in at the length number. And the length number is not the max index number.
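The overwrite described here can be reproduced in a small sketch. The frame below is only a hypothetical stand-in for the lesson's food df: the column names (`item`, `price`, `cals`, `vegan`, `bread`), every value other than the hot dog row, and the index labels are assumptions chosen so that two labels are missing.

```python
import pandas as pd

# Hypothetical stand-in for the lesson's food df (names and values are assumptions).
# Labels 1 and 2 are missing, as if two rows had been removed earlier.
food_df = pd.DataFrame(
    {
        "item": ["pizza", "reuben", "falafel", "blt"],
        "price": [500, 650, 400, 550],
        "cals": [700, 800, 500, 600],
        "vegan": [False, False, True, False],
        "bread": ["crust", "rye", "pita", "white"],
    },
    index=[0, 3, 4, 5],
)

print(len(food_df))         # 4 rows...
print(food_df.index.max())  # ...but the largest label is 5

# .loc works on labels, not positions. Label 4 already exists (falafel),
# so this assignment replaces that row instead of adding a new one.
food_df.loc[len(food_df)] = ["hot dog", 450, 350, False, "hot dog bun"]

print(food_df)  # still 4 rows; falafel is gone
```

The row count and the largest index label only line up when the index has no gaps, which is why the length stopped being a safe "next label" here.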
We really want to go in now at length plus two, right? How many items are there?
10, 11, 12, 13, 14. We want to go in at length plus two, at 16. We'll say food df dot loc, len of food df plus two.
This being loc, it would be square brackets. Okay. This being the location that we want it to go in at.
There you go. Falafel is back in. All righty.
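As a rough sketch of the fix, the same idea can be shown on a small hypothetical frame. The column names, the sample values, and the index labels are assumptions; only the hot dog values and the "length plus two" label come from the lesson. The `index.max() + 1` line is an alternative that is not from the lesson, included because it does not depend on knowing how many labels are missing.

```python
import pandas as pd

# Hypothetical stand-in for the lesson's food df (names and values are assumptions).
# Labels 1 and 2 are missing, so len(food_df) is 4 while the last label is 5.
food_df = pd.DataFrame(
    {
        "item": ["pizza", "reuben", "falafel", "blt"],
        "price": [500, 650, 400, 550],
        "cals": [700, 800, 500, 600],
        "vegan": [False, False, True, False],
        "bread": ["crust", "rye", "pita", "white"],
    },
    index=[0, 3, 4, 5],
)

# The lesson's fix: two labels are missing, so the next free label is len + 2.
lesson_label = len(food_df) + 2

# Alternative (an assumption, not from the lesson): one past the largest label.
safe_label = food_df.index.max() + 1

print(lesson_label, safe_label)  # both are 6 for this frame

# Label 6 does not exist yet, so .loc adds a new row at the end.
food_df.loc[lesson_label] = ["hot dog", 450, 350, False, "hot dog bun"]

print(food_df)  # 5 rows; falafel and blt are untouched
```

The "plus two" only works because exactly two labels are missing; if the number of gaps changes, the hard-coded offset has to change with it.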
Moving on. Make a new df of just the items with a max of 650 calories. And then do another one; this is a double challenge.
Make a new df called multi-word df that contains only those items of two or more words. So no falafel, no pizza, no blt, no Reuben, but tuna salad sandwich, turkey sandwich, anything that's more than one word is what you want in your multi-word df. So pause, make two dfs, different filter challenges.
Okay. Here we go with the solution. We're going to say max 650 cals df equals food df.
And we're going to filter inside the food df on cals. And we want less than or equal to 650, right? The max is 650. There you go.
Nothing more than 650. That is your condition. Only those rows where the calories value is less than or equal to 650 will go into the result.
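A minimal sketch of that filter, assuming the calories column is named `cals` and the item column `item`. Both names, the sample rows, and the variable name `max_650_cals_df` are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data; column names and values are assumptions.
food_df = pd.DataFrame(
    {
        "item": ["falafel", "pizza", "blt", "turkey sandwich", "tuna salad sandwich"],
        "cals": [500, 700, 600, 650, 800],
    }
)

# The comparison produces a boolean Series, one True/False per row.
is_max_650 = food_df["cals"] <= 650

# Indexing with that Series keeps only the rows marked True.
max_650_cals_df = food_df[is_max_650]

print(max_650_cals_df)  # falafel, blt, turkey sandwich
```

Because the condition is "less than or equal to", the 650-calorie row stays in the result.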
All right. Challenge. Make a new df called multi-word df.
Hint: think back to salads df and burgers df. That's a reminder that you want to use that string contains.
Contains a space, right? So multi-word df is the food df. And we're going to filter on food df item, right? The name of the food, dot str dot contains, and it needs to contain a space. A space has no upper or lower case, right? So we don't need to do that case equals False thing.
There you go. And all you have now in your results are the multi-word items, because they are the ones with a string containing a space. And the one-word foods do not have a space in the name.
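A minimal sketch of the multi-word filter, assuming the name column is called `item`. The column name, the sample rows, and the variable name `multi_word_df` are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data; the column name and values are assumptions.
food_df = pd.DataFrame(
    {
        "item": [
            "falafel",
            "pizza",
            "blt",
            "reuben",
            "turkey sandwich",
            "tuna salad sandwich",
            "hot dog",
        ]
    }
)

# str.contains(" ") is True for every name that has at least one space.
# No case=False is needed, since a space has no upper or lower case.
has_space = food_df["item"].str.contains(" ")

multi_word_df = food_df[has_space]

print(multi_word_df)  # turkey sandwich, tuna salad sandwich, hot dog
```

By default `str.contains` treats the pattern as a regular expression; a plain space matches itself either way, so the default is fine here.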