Explore the transition from basic Python programming into powerful data science applications with NumPy. Learn how NumPy enhances Python lists, making data manipulation and analysis easier and more efficient.
Key Insights
- Introduces NumPy (Numerical Python) as a Python module that significantly enhances list functionality, enabling reshaping into multi-dimensional arrays such as two-dimensional spreadsheets.
- Covers techniques for auto-generating non-repeating random numbers using Python's built-in random module, simplifying the creation of large datasets.
- Demonstrates practical methods for slicing nested lists and extracting subsets of data, and highlights how these foundational skills set the stage for advanced data manipulation using NumPy.
Note: These materials offer prospective students a preview of how our classes are structured. Students enrolled in this course will receive access to the full set of materials, including video lectures, project-based assignments, and instructor feedback.
Hi, welcome back to this course on Python programming for data science. We are now getting into the data science part. My name is Brian McLean.
Thanks for rejoining me. All right, the first five lessons are done. That's the core programming.
We did variables and data types, if else logic, modules, loops, and dictionaries. That'll get us a foundation in core programming so that we're in a good position to move forward to what Python is more famous for perhaps, and that is data science. Data science being the loading, manipulating, aggregating, cleaning, interpreting, gaining insight from, and visualizing data, including large amounts of data.
So I'm going to copy file six. We've got 06 NumPy. And what, pray tell, is NumPy? NumPy is short for numerical Python.
The NumPy module adds functionality to lists. It's kind of like having lists all souped up with superpowers. It enables lists to be reshaped into two- and three-dimensional shapes. A spreadsheet of rows and columns, for example, is two-dimensional.
All the lists that we've been working with are just one-dimensional vectors, as they're called. And this two-dimensional format, this matrix, is the underlying structure of a spreadsheet. In Python, the equivalent of a spreadsheet is a pandas DataFrame.
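As a preview of where this is heading, here is a minimal sketch of reshaping a one-dimensional list into a two-dimensional grid. The values and the names `flat` and `grid` are placeholders chosen for illustration, not taken from the lesson file.

```python
import numpy as np

# Placeholder data: a plain one-dimensional list of 12 values
flat = list(range(1, 13))

# Convert the list to a NumPy array, then reshape it
# into 3 rows by 4 columns (3 * 4 must equal 12)
grid = np.array(flat).reshape(3, 4)

print(grid.shape)  # (3, 4)
print(grid.ndim)   # 2
print(grid)
```

The reshape only works when the row and column counts multiply out to the number of items, which is why a 12-item list fits a 3 by 4 or a 4 by 3 grid.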
So let's begin by importing NumPy. And it's conventional to alias it as np. And we're going to import random again.
And import pprint. All right. So what we're going to do is begin by declaring a list of numbers.
We've got some numbers here. We'll say nums and print it, and print the type, which we know to be a list, of course. There we go.
What else can we find out about this list? Remember, we can get the length of it. And we could also get the sum of the list. So there's 12 items, and they add up to 427.
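The checks just described might look like the sketch below. The transcript does not show the actual numbers, so the 12 values here are placeholders and their total differs from the 427 mentioned in the recording.

```python
# Placeholder values (the list used in the recording is not shown)
nums = [23, 8, 41, 15, 67, 4, 92, 36, 50, 11, 78, 29]

print(nums)
print(type(nums))  # <class 'list'>
print(len(nums))   # 12
print(sum(nums))   # 454 for these placeholder values
```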
We know this kind of stuff, right? We could also print the last three items, from negative 3 to the end, and so on. Print every other item, if you recall. We could print every other.
We could print the items backwards. That's negative 1. We haven't looked at that. If your step is negative 1, it actually runs backwards.
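Those three slices can be sketched as follows, again with placeholder values standing in for the list from the recording.

```python
# Placeholder values (the list used in the recording is not shown)
nums = [23, 8, 41, 15, 67, 4, 92, 36, 50, 11, 78, 29]

# Last three items: start at index -3 and run to the end
print(nums[-3:])   # [11, 78, 29]

# Every other item: a step of 2
print(nums[::2])   # [23, 41, 67, 92, 50, 78]

# Backwards: a step of -1 walks the list from the end to the start
print(nums[::-1])  # [29, 78, 11, 50, 36, 92, 4, 67, 15, 41, 8, 23]
```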
So this is just a little quick recap of lists. Now, we could also auto-generate. What if we didn't have the numbers? We just wanted to auto-generate some non-repeating numbers.
Well, we could use random sample. Remember, we use random sample to get five unique lottery tickets. We could say nums equals.
Instead of hard coding the numbers or just having them happen to be lying around, we'll say random sample. And we'll do range. We want range 1 to 100, which gives us the numbers from 1 to 99.
The end of the range is exclusive. We'd like 12 numbers, just like our original nums. And let's print all that, see if it works.
There you go. That adds up to 566. If you run it again, it's going to change every time.
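A minimal sketch of that step, using `random.sample` with `range(1, 100)`. No expected output is shown because the numbers change on every run.

```python
import random

# Draw 12 unique numbers from 1 to 99
# (range stops before 100, so 100 itself is never picked)
nums = random.sample(range(1, 100), 12)

print(nums)
print(len(nums))  # always 12
print(sum(nums))  # varies from run to run
```

Because `random.sample` picks without replacement, no number appears twice in the result.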
So random sample is giving us a dozen unique numbers in our 1 to 99 range, as opposed to sitting there typing them. Now, let's make another 12-pack of numbers. But this time, let's bundle them into child lists.
And that's a very laborious process. We've got them. We'll call it nested nums.
There's 12 of them still, but they're in little packs of three. So nested nums actually has four items. Nested nums, right.
Nested nums is a list. The length should be four. I don't know if it's going to be able to do the sum.
No, it cannot do the sum. There we go. So it is a list, right? It can't drill in to do the sum.
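Here is a sketch of what that looks like. Only 45, 51, 24, 12, and 39 are mentioned later in the transcript; the remaining values are placeholders filled in for illustration.

```python
# Four child lists of three numbers each
# (values other than 45, 51, 24, 12, 39 are placeholders)
nested_nums = [[7, 83, 19], [45, 51, 24], [12, 39, 60], [5, 72, 33]]

print(type(nested_nums))  # <class 'list'>
print(len(nested_nums))   # 4, because each child list counts as one item

# sum() starts at 0 and tries to add a whole child list to it,
# which raises a TypeError
try:
    print(sum(nested_nums))
except TypeError as err:
    print("sum failed:", err)
```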
So the nested list now has a length of four, because this little three-pack is considered just one item, of course. Now, we could also keep the same numbers in the same order, right, and just bundle them into groups of four instead of groups of three. And it's really all the same numbers.
It's just the cutoff. Instead of four groups of three, it's three groups of four. Length would be three.
And you could do that as well. All right. So let's call that nested nums two.
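The two groupings side by side might look like this. The name `nested_nums2` is an assumed spelling of "nested nums two", and most of the values are placeholders.

```python
# Four groups of three
# (values other than 45, 51, 24, 12, 39 are placeholders)
nested_nums = [[7, 83, 19], [45, 51, 24], [12, 39, 60], [5, 72, 33]]

# The same 12 numbers in the same order, cut into three groups of four
nested_nums2 = [[7, 83, 19, 45], [51, 24, 12, 39], [60, 5, 72, 33]]

print(len(nested_nums))   # 4
print(len(nested_nums2))  # 3
```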
All righty. Coming back to… Yeah. Okay.
Let's stick with this one. Nested nums. Let's come back to… These examples are… Let's move that down.
Okay. So here's what we want to do. We're going to do a little… We're going to select items here.
If you want to print all, you just boom, like so. Now, what about this little challenge? Pause. Try this.
Try to get what you see next to the print statement. So try to get the 45, 51, 24 little inner list and so on. Okay.
Here we are. So that would be the second item in nested nums. We would say nested nums at index one.
Let's print a little break here. Okay. So there's that.
We got that. Now the next one, 51 and 24, it's the same one, except we only want the last two items or the second and third item. We could still say give me index one.
And then now that we're in that list, we'll say go from negative two to the end. There we go. Right.
Because that's how you get the last two items, negative two to the end. And the next one, 12 and 39, that would be the third little child list, first two items. So that would be the items at index two, right? The index two gives you all of them, but we just want the 12 and the 39.
We'll say zero, two. And lastly… No, that's it. Oh, just the 12.
How do we find just the 12? Okay. That would be nested nums at index two, item zero. Go to nested nums index two and then get item zero.
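The four selections from the challenge can be sketched as below. The list is reconstructed from the values the transcript names (45, 51, 24, 12, 39), with placeholders everywhere else.

```python
# Values other than 45, 51, 24, 12, 39 are placeholders
nested_nums = [[7, 83, 19], [45, 51, 24], [12, 39, 60], [5, 72, 33]]

# Second child list: index 1
print(nested_nums[1])       # [45, 51, 24]

# Last two items of that child list: index 1, then -2 to the end
print(nested_nums[1][-2:])  # [51, 24]

# First two items of the third child list: index 2, then 0 up to 2
print(nested_nums[2][0:2])  # [12, 39]

# Just the first item of the third child list
print(nested_nums[2][0])    # 12
```

The first bracket always picks the child list, and the second bracket then indexes or slices inside it.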
Little recap. Now, that's just lists. This is not… Lists here are just a stepping stone to what we really want, a segue into the topic of the lesson, which is the NumPy array.