Sorting, Renaming, and Reordering Columns

Perform vector operations, sorting, renaming columns, and repositioning columns in a DataFrame.

Master key Pandas DataFrame operations such as vector calculations, sorting, and column manipulation. Learn practical techniques like using "in place" operations to streamline data processing.

Key Insights

  • Perform vector operations efficiently by multiplying corresponding column values row by row—for example, calculating a "sales" column as the product of "price" and "quantity."
  • Sort DataFrame values with sort_values(), which sorts in ascending order by default; pass ascending=False for descending order, and use inplace=True to update the original DataFrame directly without assigning the result back to itself.
  • Manipulate DataFrame structure by renaming columns with df.rename() using a dictionary of old-to-new names, and reorder columns effectively using df.pop() and df.insert(), such as moving the "price" column next to related numeric columns.


All right, vector operations. A vector operation is when you do math across all the corresponding values in entire columns at once, element by element. So you have one column and another column, and you do math between the columns to get the values for yet another column.

So what we want to do is make a `sales` column, the value of which will be the price of the item times the quantity that we just added. It will multiply row by row for us. We'll say `foodDF["sales"]` is going to equal `foodDF["price"]` times `foodDF["qty"]`.

There, a math vector operation, right? We're basically going down the DataFrame, taking the values in each row from the corresponding columns, and multiplying them together to get the new value for the new column, without writing a loop ourselves. Straight-up math across columns: corresponding price and quantity multiplied together, row by row by row. All right, now we can sort. We haven't done that yet.
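As a minimal sketch of that vector operation: the `foodDF` below is a made-up three-row stand-in for the lesson's DataFrame, and the column names `Item`, `price`, and `qty` are assumptions for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the lesson's foodDF; rows and values are made up.
foodDF = pd.DataFrame({
    "Item": ["Bagel", "Muffin", "Salad"],
    "price": [2.50, 3.00, 8.25],
    "qty": [4, 2, 1],
})

# Element-wise multiplication: each row's price times that same row's qty.
foodDF["sales"] = foodDF["price"] * foodDF["qty"]

print(foodDF)
```

No explicit loop is needed; pandas pairs up the values by row and returns a new column of the same length.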

I mean, we filtered, but we haven't sorted. So let's sort. Let's say `foodDF.sort_values`, which takes the argument `by`, and we'll sort by whatever column we want, in this case `price`.

And run. We have to set it equal to itself, though, so that we store the return value. There you go, they're sorted by price.


But the cheapest one is first. What if we want descending order? By default, you get ascending order—in other words, cheapest to most expensive, or smallest to largest numeric value, or A-to-Z if you're alphabetizing strings. But if we want descending order, we have to pass in an extra argument: `ascending=False` to override the default of `ascending=True`.
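Here is a small sketch of both sort directions. The data is invented, and the result is stored under two new names (`cheapest_first`, `priciest_first`, both hypothetical) instead of being assigned back to `foodDF`, just so the two orders can be compared side by side.

```python
import pandas as pd

# Hypothetical stand-in for the lesson's foodDF; rows and values are made up.
foodDF = pd.DataFrame({
    "Item": ["Muffin", "Salad", "Bagel"],
    "price": [3.00, 8.25, 2.50],
})

# Default is ascending=True: cheapest first. sort_values returns a new DataFrame.
cheapest_first = foodDF.sort_values(by="price")

# ascending=False overrides the default: most expensive first.
priciest_first = foodDF.sort_values(by="price", ascending=False)

print(cheapest_first)
print(priciest_first)
```

In the lesson's style you would write `foodDF = foodDF.sort_values(by="price", ascending=False)` to keep the sorted result under the original name.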

So then you'll get the most expensive item at the top. Next up is `inplace=True`. You may have noticed that time and time again, as we've changed our DataFrames, we've been setting the DataFrame equal to itself.

Boom—equal to itself. Boom—why? Because we're catching a return value. Because it’s not directly changing the DataFrame; it's making a copy. But we can override that by saying we want `inplace=True`.

If you do that, you don’t have to save the operation equal to itself. So what we're going to do is say, all right, we'll sort by name alphabetically. So `foodDF.sort_values(by="Item")`, and we won’t say `ascending=False`, because we’ll let it be `ascending=True` since we want A-to-Z. But what we will pass in now is the argument `inplace=True`, so that we don’t have to set it equal to itself.

It'll basically stamp the change right on the DataFrame in place. That's what in place means. It stamps the existing instance with the change and makes the change stick without you having to catch a return value.

It makes the change to the existing DataFrame without returning a new DataFrame. Therefore, you do not set the operation equal to itself to make the change stick. In fact, you shouldn't, because with `inplace=True` the method returns `None`.
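A short sketch of the in-place sort, again on invented data. The variable `result` is only there to show what the call returns; it is not part of the lesson.

```python
import pandas as pd

# Hypothetical stand-in for the lesson's foodDF; rows and values are made up.
foodDF = pd.DataFrame({
    "Item": ["Muffin", "Salad", "Bagel"],
    "price": [3.00, 8.25, 2.50],
})

# inplace=True changes foodDF itself; the call returns None,
# so there is nothing useful to assign back.
result = foodDF.sort_values(by="Item", inplace=True)

print(result)   # None
print(foodDF)   # rows now in A-to-Z order by Item
```

This is also why writing `foodDF = foodDF.sort_values(..., inplace=True)` is a mistake: it would replace the DataFrame with `None`.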

On methods that accept it, `inplace=True` can generally be used in lieu of setting the operation equal to itself. All right, we can also rename a column. Let's say we want to rename the column `sales` to `revenue`.

We've got a `sales` column, but it's unclear whether that means money. Maybe that's units sold or something.

We just want to change the name. So the syntax for that is `dataframe.rename` with the parameter `columns`. You set it equal to a dictionary consisting of a key-value pair where the old name is the key and the new name is the value.

And then we’ll do `inplace=True` so that we don’t have to set that operation equal to itself. So let’s say `foodDF.rename`. We want to rename columns, but we can just do one column. If you want to do more than one column, you just pass in multiple key-value pairs into that dictionary.

The old name, the existing name, we want to change is `sales`. And the new name we want to change it to is `revenue`. And we want to do it in place.
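As a sketch, the rename might look like this. The data and the `Item`, `price`, and `qty` column names are assumptions; only the `sales` to `revenue` mapping comes from the lesson.

```python
import pandas as pd

# Hypothetical stand-in for the lesson's foodDF; rows and values are made up.
foodDF = pd.DataFrame({
    "Item": ["Bagel", "Muffin"],
    "price": [2.50, 3.00],
    "qty": [4, 2],
})
foodDF["sales"] = foodDF["price"] * foodDF["qty"]

# Dictionary keys are the existing names, values are the new names.
# Add more key-value pairs to rename several columns in one call.
foodDF.rename(columns={"sales": "revenue"}, inplace=True)

print(foodDF.columns.tolist())
```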

There it is. It's `revenue` now.

So let’s say you want to move a column. You’ve got `price`, `quantity`, and `revenue`, but they should really be next to each other, right? That `price` column should move over and be right after `bread`. So it should be `price`, `quantity`, `revenue`.

We’ll have those three numeric columns next to each other, especially since math is being done with them. So to do that, we’re going to pop it using `df.pop` to remove the column that we want to move, which would be `price`.

And we’ll store it in a variable. It’ll remove it as a Series, and we’ll save it. And then we’ll do `df.insert`, which takes three arguments: the index of the new column, the new column name, and the data.

We decide where to move it. If we pop `price`, then we want to move it over after `bread`, at index 4.

And then you pass in the data. What data do you want to move there? That will be the popped column as a Series that we held onto when we popped the column. We’re going to pop the `price` column by name and save it to a variable, which will be a Series.

So let's say `pop_price_col = foodDF.pop("price")`. And let's print its shape and type.

If you get an error saying `pop_price_col` is not defined, you have to run the pop line first. Okay, it's a vector with 14 items extracted out, right? All the prices.

And here are all the prices as a Series. The data type is Series. If you now run the `foodDF` again, you’ll see there’s no `price` column—but we’re going to put it back in.
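A minimal sketch of the pop step on a made-up three-row frame (the lesson's DataFrame has 14 rows, so its shape would be `(14,)` instead of `(3,)`). Column names other than `price` are assumptions.

```python
import pandas as pd

# Hypothetical stand-in for the lesson's foodDF; rows and values are made up.
foodDF = pd.DataFrame({
    "Item": ["Bagel", "Muffin", "Salad"],
    "price": [2.50, 3.00, 8.25],
    "qty": [4, 2, 1],
})

# pop removes the column from foodDF and returns it as a Series.
pop_price_col = foodDF.pop("price")

print(pop_price_col.shape)      # one entry per row
print(type(pop_price_col))      # pandas Series
print(foodDF.columns.tolist())  # "price" is no longer listed
```

Note that `pop` always modifies the DataFrame directly, so there is no `inplace` argument to pass.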

We're going to insert the `pop_price_col` right before the `quantity` column at index 4, right? Because after the pop, the columns before it are at index 0, 1, 2, 3, so we want to go in at 4, right before `quantity`. So that will be the `insert` method, which takes the index of the new column (4), the new column name (`price`), and the data, which is the `pop_price_col` Series.

That should get it right back in there in the new position. There you go. Your `price` column has been moved.
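The whole move can be sketched as below. The frame is a smaller, invented stand-in, so the target position is 2 here instead of the lesson's 4; the `bread` values and the `qty` column name are assumptions. Looking the position up with `columns.get_loc` is an alternative to hard-coding the index that the lesson does not use.

```python
import pandas as pd

# Hypothetical stand-in for the lesson's foodDF; rows and values are made up.
foodDF = pd.DataFrame({
    "Item": ["Bagel", "Muffin"],
    "price": [2.50, 3.00],
    "bread": ["plain", "bran"],
    "qty": [4, 2],
    "revenue": [10.0, 6.0],
})

# Step 1: remove the column and keep it as a Series.
pop_price_col = foodDF.pop("price")

# Step 2: find where "qty" sits now that "price" is gone.
target = int(foodDF.columns.get_loc("qty"))

# Step 3: insert(position, column name, data) puts "price" right before "qty".
foodDF.insert(target, "price", pop_price_col)

print(foodDF.columns.tolist())
```

Like `pop`, `insert` changes the DataFrame directly and returns nothing, so the result is not assigned back.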

Now it looks more logical, and you can kind of tell at a glance that the math is working as well. So that’s how to pop and insert—how to move a column, basically.

Brian McClain

Brian is an experienced instructor, curriculum developer, and professional web developer, who in recent years has served as Director for a coding bootcamp in New York. Brian joined Noble Desktop in 2022 and is a lead instructor for HTML & CSS, JavaScript, and Python for Data Science. He also developed Noble's cutting-edge Python for AI course. Prior to that, he taught Python Data Science and Machine Learning as an Adjunct Professor of Computer Science at Westchester County College.
