Exercise- Inner Join in SQL - Free Video Tutorial

Explore how SQL joins can link data from different tables, helping you build comprehensive data sets effortlessly. Learn practical tips to simplify your SQL queries and enhance readability.

Key Insights

Understand that SQL joins connect tables by matching common column values, such as joining the employees table to the departments table by matching their identical department IDs.
Learn to use table aliases, such as "E" for employees and "D" for departments, to write more concise SQL join statements.
Recognize that each SQL join creates a new combined data set, letting you easily filter and query multiple tables—for example, finding all users who placed orders and exactly what products they ordered.

Note: These materials offer prospective students a preview of how our classes are structured. Students enrolled in this course will receive access to the full set of materials, including video lectures, project-based assignments, and instructor feedback.

This is a lesson preview only. For the full lesson, purchase the course here.

So let's see how all of this works behind the scenes and see the live coding. So I'm going to open up our last code file here for the day, inner_joins. I'm going to choose my database up here.

Now if I were just looking at the employees table, and we haven't seen these tables before, so I want to show you these by themselves. This is what the employees table looks like. We have a list of their names, their emails, and the departments that they work in.

Department 1,2, and 3. Separately, I want to show you the departments table because I want you to see these both independently by themselves. These are going to be two different queries. My departments are a list of departments.

Notice that I have five departments, but in my employees list, I only have the first three. No wonder this company is going down the drains. Nobody works in support or administration.

This company is about to go bankrupt. No, just kidding. Trying to have some fun with our data here.

SQL Bootcamp: Live & Hands-on, In NYC or Online, Learn From Experts, Free Retake, Small Class Sizes, 1-on-1 Bonus Training. Named a Top Bootcamp by Forbes, Fortune, & Time Out. Noble Desktop. Learn More.

But that is a problem that we only have people working in three out of the five departments. That's a problem. Now I want to know what's the name of the department these people work in.

The names of the departments are in my departments table. My list of employees is in my employees table. I need a way to join these together, and I only want to see a list of people who work in a department.

I don't care about the people who don't work in the departments. Those are the upper management people. They're not assigned a specific department because they're the VP and the president or CEO.

Bruce, he's either the CEO or the laziest person because he has no manager and he has no department, but in actuality, he's the CEO, so he doesn't need a department and he doesn't need a manager. And Chad is like the VP underneath Bruce. So I just want to see a list that like Amy works in department one.

What's the name of department one? I know I can get that from the departments table. Department one is sales. So how do I join them together? Well, I need to say, let's start with the employees table.

And I want to join in the departments. So I'm taking these two tables and I'm joining them. But SQL is going to say, well, how am I joining them? Where am I looking for like data? There needs to be a place that there's some common data.

And if we look in that ER entity relationship diagram, we can see that employees relate to departments. And how do they relate? The department_id, because employees have a department_id and departments have a department_id. So essentially when those two things are equal, we have a match.

If they are equal, we're going to join on the equality of those two columns. So employees has a department_id. Departments has a department_id.

Now, do those column names have to be the same name? No, they could be different names. Whoever makes the database, they can decide if they want to name them the same, but they could have called it, for example, they could have called it emp_department_id and department_id. They could have called them different names.

In our case, they have the same name. And because they have the same name, when I say to join on those two columns, essentially I'm saying that the department_id equals the department_id. The problem is, is that that is ambiguous because it doesn't know which table they come from because the names are the same.

This would give me an error. If they were two different unique names that weren't repeated, this would be perfectly fine. But since they have the same name, I have to say, hey, one of them comes from employees, dot.

The other one comes from the other table. So you're saying the table that contains it and the column in that table, that those two columns are equal. And when they're equal, do the join.

Now we have just joined. Notice that all the people who don't work in a department have now disappeared. And we're only getting people who do work in a department.

See department one, matched with department one. Department two, matched with department two. Department three, see how it made matches between those.

And see how these columns over here are from departments. And all these columns over here, those are from employees. Because whichever column, sorry, whichever table you put first, that's going to start on the left of the result set.

Whichever one you put second, that will be on the right. If I reordered this, it would still work fine. It would just reorder my result set.

It would put the department columns first. And then after that, it would put employees. Because we read from left to right.

So whichever one you put first, that will be on the left. And whichever one you put second, that will be on the right. I care more about the employees' data.

So I'm going to put them on the left. Because I'd rather start with their data. Because I want them on the left side of my result set.

Now instead of having to write out the word employees, I would like to save some code. So to save some code, I will assign E to employees. And this should be known as D. Now instead of having to write it out, I can simply say D. Instead of having to write out that, I can simply say E. That is way more efficient from a code writing perspective. It's just to write less code.

You don't have to use aliases, but they're convenient. And it still works just fine. Just less code that I have to write.

You can say as, but you don't need to. And that is now explaining why when we choose employees, that's why it adds the E. Because it's assuming we're going to do a join, so it makes that alias for you. If I was then going to do a join for departments, see how it adds the alias for you? So the default assumption is so many times you're going to do a join, and so they automatically add the alias, figuring that if you don't want the then you having to type it out yourself.

They're trying to save you time. They figure it's so common to do joins, and they're kind of right. It's very common to do joins.

Now, right now I'm including all of the columns, right? Asterisk. So for example, here, let me do these on different lines here. I'm including all columns.

What if I don't want all the columns? What if I just want employee name? I could do emp_name. And what if I also want the department name? I could also include department_name. And now I know the name of the person and the name of the department they work in.

Those were in two different tables before. I could not have gotten them into one place without joining them together, right? Two different tables. Now they're in one place.

It's not confusing to SQL because emp_name only appears once. Department_name only appears once. However, if I use department_id, notice how department_id exists in both? That's ambiguous because if I try to say also show the department_id column, it's going to say that reference is ambiguous, meaning I don't know which table to pull it from.

Now, technically, both columns are equal, so both columns are the same. I don't really care which one you pull it from, but you got to tell it either the employees or the departments. So you do have to use that little alias there to say which table it comes from just to make SQL happy.

My thought is if it's departments, I may as well pull it from the departments table. If it was employees, I would pull it from the employees table. But it doesn't really matter because right now they're equal because we're doing an inner join, which, by the way, we could say inner join, but inner is not necessary because that's what a join is.

You don't need to say the inner part. It's the most common join, so inner is completely optional. I never write the word inner because why write it when you don't have to write it? Why type a word just for the sake of writing a word? No need to say that.

You do not see table aliases in the result set. Table aliases are just so that you write less code in your code. The only aliases that you see in your result set would be column aliases.

I have not defined any column aliases. So the point of table aliases is not to see them. It's just to write joins with less code so that I don't have to spell out employees here.

I can say E. I don't have to write out departments. I can say D. That's the only point of doing table aliases is to write less code in a join. So you don't use table aliases unless you're doing a join, and you won't see them in the result set.

It's just for writing less code. That's it. Not to make things pretty, just so you write less code.

If you don't want to use them, you can spell out your whole table name, but every time you have to spell out your whole table name, so that just ends up being a lot of extra writing. We use these letters to write less code. That's the only point of them.

Okay. So that takes care of that. Now, once we've gotten that join done, you could do whatever filtering or anything you want, because you've got essentially kind of like a new table that you can then continue working with.

If you then want to add a where, you can now refer to any of these columns. For example, you might want to see where the department_name is sales to find everybody who works in sales. So that is where department_name equals single quotes for a string.

So these are all the people who work in sales. I don't even need that department_id up there. Right? How would I have otherwise found the names of employees who work in sales? Because if I just look in the employees table in here, it doesn't tell me the name of sales.

If I just look in departments, I see sales, but I don't know who works in sales. Right? Now I know you could say, well, can't I just filter this for one? Yeah. But then you have to remember that one is sales.

You have to remember that. The way I did it, I don't have to remember that. Right? I let it make the connection.

I can see the results. And this is much more intuitive for me to write, to say look for sales, not look for department_id one. Like I'm going to remember that one is sales.

Not going to remember that. So let's say you want to join, and you could join as many tables as you want. So this is going to be our last thing here.

Then you can start to do some challenges here. If I say for my users, which I'm going to call you, because I'm going to do a join. Here's all my users.

But I want to see only people who placed an order. And what did they order? Now these are in three different tables. Users place orders, which are made up of line items.

If I want to connect users to their orders, users place orders, and we connect them on the user ID. If I'm joining users to orders, what am I joining on? Well, what I just showed. User ID equals user ID.

They're the same name. So from the users table, it's called user ID, which equals in the orders table, it's also called user ID. So I have a column in each table, and I'm saying when they're equal, there's a match.

Now I have on the left, I have user data. So these are all my user datas. And then as I scroll over to the right, I now have all the order data over here.

Notice that sometimes people are duplicated. See how Eleanor is repeated multiple times? Why? Because those are two different orders. If I highlight the first order here, and I scroll over to the order data, see how this is order number six.

See how this is order number seven? Two different orders by the same person, right? They were placed on different dates. And if I want to know what did they order, well, that's not in these two tables. What they ordered, the products, those are in the line items table.

How do I get to line items? So here, if I look at the order ID, I can match it up with line items. So I need to join in that table. So I add those columns.

Every time you join in another table, you're adding more columns of data. But it's not just about columns. See, I was misunderstanding when I first started learning joins.

I just thought we add more columns. And we do add more columns. But it's also matching the correct row to the correct row in those.

So it's a matter of adding columns. But when it affects the number of rows, you're trying to match a row in one table and a row in another table and saying, let's connect those rows. And then, yes, we get all the columns from one table, all the columns from the other table.

But we go row by row, matching rows, like this user to that order, this user to this order, this user to another order, to another order, and matching those up. So I then want to do another join. And you just join again.

Now I want to join into line items. Line items, we'll call li as their short abbreviation. And what are we joining on? Once again, we need to find two columns that we can match up to make a connection.

And luckily, DBeaver makes it very easy. It lights them up in green. It says, well, in orders, you've got order ID.

And that equals in line items, it equals order ID. So you just start on one side. Orders has order ID.

Orders has order ID. And what does that equal to in the other table? In line items, it's also order ID. So I say in line items, it's called order ID.

I'm just saying those two columns are equivalent. And when you find a match, you know that that line item goes with that order, because you're matching. Now I got a lot of data.

I got a lot of columns. I have the user data over here on the left. I've got all the columns from the users table.

And as I scroll over, I then see all of the order data, because that comes next. As I keep scrolling, I'm now going to also see all of the line item data, because that comes third. And if I were to sort by, to order by, let's say the user ID, user ID.

Now, if I just say user ID, it's going to say ambiguous, because it's saying, hey, I got a couple of those columns. Which one do you want? I want it from the users table. We'll grab it from there.

Notice how many times user number one shows up. See all the duplicates? Why? Because they have placed many orders, probably. And if I scroll over, I'll see that.

See all of these different orders that they placed? Same person, multiple orders, all on different dates and times. As I scroll over more, notice that for a particular order, I might see the order repeated potentially. Oh, here, for example, same order, but different products.

So one order for one user, but they purchased multiple things on a singular order. Different products with different prices. So every row is really just one item for one order for one person.

The next row is another item potentially on the same order or different order for somebody. But I always know who placed the order and what was on that order. So I now have what I call the mega table.

I have a big table, and yes, there's duplicates. But this is why we don't store it like this. This is why we just do it for the sake of our query.

Imagine a database with this much duplication. That would be a waste of storage space. So every join you do, you just join another table, just join another table, and you're just looking for what columns are in common.

This column equals that. And when there's a match, you know that this user placed that order for those line items. And we're not seeing anybody who didn't place an order.

We're just finding people who placed an order and what those line items were on that order. Alright, we do have a couple of minutes left to go through.

There's kind of one multi-step challenge where I walk you through, hey, let's start with this first step, take you through the second step, and you're going to build up a query as a challenge. But let's get started on this and see how much progress we make.

Key Insights

Dan Rodney

How to Learn SQL