Sorting, Renaming, and Reordering Columns

Perform vector operations, sorting, renaming columns, and repositioning columns in a DataFrame.

Master key Pandas DataFrame operations such as vector calculations, sorting, and column manipulation. Learn practical techniques like using "in place" operations to streamline data processing.

Key Insights

  • Perform vector operations efficiently by multiplying corresponding column values row by row—for example, calculating a "sales" column as the product of "price" and "quantity."
  • Sort DataFrame values with sort_values(), specifying ascending or descending order by setting the parameter ascending=False, and use inplace=True to update the original DataFrame directly without assigning it to itself.
  • Manipulate DataFrame structure by renaming columns with df.rename() using a dictionary of old-to-new names, and reorder columns effectively using df.pop() and df.insert(), such as moving the "price" column next to related numeric columns.

All right, vector operations. A vector operation is when you do math across all the corresponding values in entire columns, element by element. So you have one column and another column, and you do math between the columns to get the values for yet another column.

So what we wanna do is make a sales column, the value of which is gonna be the price of the item times the quantity that we just added. It's gonna multiply row by row. We'll say food_df["sales"] is gonna equal food_df["price"] times food_df["qty"].

There, a math vector operation, right? Vectors are linear, and we're basically going through the DataFrame, taking the values in the corresponding columns in each row, multiplying them together, and getting the new value for the new column. Straight-up math across columns: corresponding price and quantity multiplied together, row by row by row. All right, now we could sort. We haven't done that yet.
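As a rough sketch of that column math: the DataFrame name `food_df`, the column names, and the three sample rows below are stand-ins made up for illustration, not the lesson's actual data.

```python
import pandas as pd

# Hypothetical stand-in for the lesson's DataFrame; the rows are invented.
food_df = pd.DataFrame({
    "item": ["bagel", "coffee", "muffin"],
    "price": [2.50, 3.00, 4.25],
    "qty": [4, 2, 1],
})

# Element-wise multiplication: each row's price times that same row's qty.
# No explicit loop is written; pandas lines the two columns up by row.
food_df["sales"] = food_df["price"] * food_df["qty"]

print(food_df)
```

The new column comes out the same length as the two columns it was built from, one product per row.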

I mean, we filtered, but we haven't sorted. So let's sort: let's say food_df.sort_values(). That takes an argument, by, which is whatever column we want to sort by.

And run. We have to set it equal to itself, though, so we store the return value. There you go, they're sorted by price.
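A minimal sketch of that sort, assuming a small made-up `food_df` with `item` and `price` columns (the data is hypothetical):

```python
import pandas as pd

# Hypothetical sample data, deliberately out of price order.
food_df = pd.DataFrame({
    "item": ["muffin", "bagel", "coffee"],
    "price": [4.25, 2.50, 3.00],
})

# sort_values() returns a new, sorted DataFrame, so we store the
# return value back into the same variable to keep the result.
food_df = food_df.sort_values(by="price")

print(food_df)
```

The original row labels travel with the rows, so the index on the left will look shuffled after sorting.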

But the cheapest one is first. What if we want descending order? By default, you get ascending order: cheapest to most expensive, smallest to largest numeric value, or A-to-Z if you're sorting strings. If we want descending order, we have to pass in an extra argument, ascending=False, to override the default of ascending=True.
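Here is what the descending version might look like, again on a made-up `food_df` (names and values are assumptions for illustration):

```python
import pandas as pd

# Hypothetical sample data.
food_df = pd.DataFrame({
    "item": ["muffin", "bagel", "coffee"],
    "price": [4.25, 2.50, 3.00],
})

# ascending defaults to True; passing False flips the order.
food_df = food_df.sort_values(by="price", ascending=False)

print(food_df)
```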

So then you'll get the most expensive item at the top. Next, inplace=True. Notice that time and time again, as we've made changes to our DataFrames, we've been setting the DataFrame equal to itself.

Boom, equal to itself. Why? Because we're catching a return value. The method doesn't directly change the DataFrame; it returns a modified copy. But we can override that and say we want inplace=True.

If you do that, you don't have to save the operation equal to itself. So what we're gonna do is say, all right, we'll sort by name alphabetically. So food_df.sort_values(by="item"), and we won't say ascending=False, because we'll let it default to ascending=True, since we want A-to-Z. But what we will pass in now is the argument inplace=True, so that we don't have to set it equal to itself.

It'll basically stamp the change right on the item in place. That's what in place means. It stamps the existing instance with the change, makes the change stick without you having to catch a return value.

It changes the existing df without returning a new df. Therefore, you do not set the operation equal to itself to make the change stick. That's inplace=True.
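A small sketch of the in-place sort, assuming a hypothetical `food_df` with an `item` column. Capturing the return value in `result` is only there to show that nothing comes back:

```python
import pandas as pd

# Hypothetical sample data, not in alphabetical order.
food_df = pd.DataFrame({
    "item": ["muffin", "bagel", "coffee"],
    "price": [4.25, 2.50, 3.00],
})

# With inplace=True the existing DataFrame is modified directly
# and the method returns None instead of a new DataFrame.
result = food_df.sort_values(by="item", inplace=True)

print(result)   # None
print(food_df)  # rows now ordered A-to-Z by item
```

One consequence worth keeping in mind: writing `food_df = food_df.sort_values(by="item", inplace=True)` would replace the DataFrame with `None`, so it is one approach or the other, not both.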

inplace=True can very generally be used in lieu of setting the operation equal to itself. All right, we can also rename a column. Let's say we wanna rename the sales column to revenue.

We've got a sales column, but it's not obvious that it means money. Maybe that's units sold or something.

We just wanna change the name. So the syntax for that is DataFrame.rename() with the parameter columns. You set it equal to a dictionary of key-value pairs, where the old name is the key and the new name is the value.

And then we'll do inplace=True so that we don't have to set that operation equal to itself. So let's say food_df.rename(). The parameter is columns, plural, but we can rename just one column. If you wanna rename more than one column, you just pass multiple key-value pairs into that dictionary.

The old name, the existing name that we wanna change, is sales. And the new name we wanna change it to is revenue. And we wanna do it in place.
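A sketch of the rename, assuming a made-up `food_df` that already has a `sales` column (data and variable name are hypothetical):

```python
import pandas as pd

# Hypothetical sample data with a "sales" column already computed.
food_df = pd.DataFrame({
    "item": ["bagel", "coffee", "muffin"],
    "price": [2.50, 3.00, 4.25],
    "qty": [4, 2, 1],
    "sales": [10.0, 6.0, 4.25],
})

# The dictionary maps old name -> new name. Adding more key-value
# pairs to it would rename more columns in the same call.
food_df.rename(columns={"sales": "revenue"}, inplace=True)

print(food_df.columns.tolist())
```

Column order and the data itself are untouched; only the label changes.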

There it is. It's revenue now. Next, we're gonna pop the price column by name.

So let's say you wanna move a column. You've got price, quantity, and revenue, but they should really be next to each other, right? That price column should move over and be like right after bread. So it should be price, quantity, revenue.

We'll have those three numeric columns next to each other, especially since math is being done with them. So to do that, we're gonna pop: we're gonna say df.pop() to remove the column that we wanna move, which would be price.

And we'll store it in a variable. pop() removes the column and returns it as a Series, and we'll save that. And then we'll do df.insert(), which takes three arguments. The first is the index of the new column.

So that's wherever we need to move it to. If we pop price, then we'll wanna move it over after bread. Counting zero, one, two, three, we'll wanna put it at index four.

And then, after the column name, you pass in the data. Okay, what data do you wanna move there? That'll be the popped column, as a Series, that we held onto when we popped it. So we're gonna pop the price column by name and save it to a variable, which will be a Series.

So let's say popped_price_col equals food_df.pop(), popping the price column. And let's print. We'll print its shape and type.

The shape should be... popped_price_col is not defined. Yeah, you gotta run the cell first, okay. Yeah, it's (14,), 14 comma nothing, because it's a vector of 14 items extracted out, right? All the prices.

And there's all the prices as a Series. The data type is Series. If you now run food_df again, you'll see there's no price column, but we're gonna put it back in.
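A sketch of the pop step on a three-row, made-up `food_df`. The lesson's DataFrame has 14 rows, so its shape prints as `(14,)`, while this sample gives `(3,)`:

```python
import pandas as pd

# Hypothetical sample data; only three rows for brevity.
food_df = pd.DataFrame({
    "item": ["bagel", "coffee", "muffin"],
    "price": [2.50, 3.00, 4.25],
    "qty": [4, 2, 1],
})

# pop() removes the column from the DataFrame and returns it as a Series.
popped_price_col = food_df.pop("price")

print(popped_price_col.shape)    # one-dimensional: (number_of_rows,)
print(type(popped_price_col))    # a pandas Series
print(food_df.columns.tolist())  # "price" is gone from the DataFrame
```

Note that pop() always works in place; there is no return-a-copy version of it to choose from.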

We're gonna insert the popped price column right before the quantity column at index four, right? Because the columns before it are at index zero, one, two, three. We wanna go in at four, right before quantity. So that will be the insert method, which takes the index of the new column, four; the new column name, which we'll keep as price; and the data, being the popped price Series.

So, insert popped_price_col. We'll say food_df.insert(4, "price", popped_price_col).
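Putting pop and insert together, as a sketch on a made-up five-column `food_df`. The lesson hard-codes index 4 because of its own column layout; this sample has fewer columns, so it looks up the position of `qty` with `columns.get_loc()` instead, which is a swapped-in convenience rather than what the lesson types:

```python
import pandas as pd

# Hypothetical sample data; "price" starts out away from "qty".
food_df = pd.DataFrame({
    "item": ["bagel", "coffee", "muffin"],
    "price": [2.50, 3.00, 4.25],
    "category": ["bakery", "drink", "bakery"],
    "qty": [4, 2, 1],
    "revenue": [10.0, 6.0, 4.25],
})

# Step 1: remove the column and hold onto it as a Series.
popped_price_col = food_df.pop("price")

# Step 2: find where "qty" sits now that "price" is gone.
qty_position = food_df.columns.get_loc("qty")

# Step 3: insert(position, column_name, data) puts the column back
# at that position, pushing "qty" and everything after it to the right.
food_df.insert(qty_position, "price", popped_price_col)

print(food_df.columns.tolist())
```

Like pop(), insert() modifies the DataFrame directly, so there is nothing to assign back.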

That should get it right back in there in the new position. There you go. Your price column has been moved down.

Now it looks more logical, and you can kind of tell at a glance that the math is working as well. So that's how to pop and insert, which is basically how to move a column.

Brian McClain

Brian is an experienced instructor, curriculum developer, and professional web developer, who in recent years has served as Director for a coding bootcamp in New York. Brian joined Noble Desktop in 2022 and is a lead instructor for HTML & CSS, JavaScript, and Python for Data Science. He also developed Noble's cutting-edge Python for AI course. Prior to that, he taught Python Data Science and Machine Learning as an Adjunct Professor of Computer Science at Westchester County College.
