Joins- Combining Data from Multiple Tables in SQL

Discover how SQL inner joins efficiently connect related data across tables, streamlining database queries. Learn the practical use of table aliases to simplify your SQL statements and minimize coding effort.

Key Insights

Understand that an inner join matches columns from two tables (such as users and orders or employees and departments) to combine rows based on common data, effectively excluding records without corresponding matches.
Recognize the role of primary and foreign keys in joins, where tables share a common column (e.g., user ID or department ID) to establish relationships, enabling data from separate tables to be viewed in a unified query result.
Implement table aliases to shorten SQL code, reducing verbosity by replacing complete table names (employees, departments) with concise abbreviations (E, D); this expedites writing queries involving joins.

Note: These materials offer prospective students a preview of how our classes are structured. Students enrolled in this course will receive access to the full set of materials, including video lectures, project-based assignments, and instructor feedback.

This is a lesson preview only. For the full lesson, purchase the course here.

Understanding those relationships, we can then join these two tables together when we want to have all of the data in one single query. There are two types of joins, inner joins and outer joins, but inner joins are the most common joins and we simply call it a join. So inner joins are the most common joins and you're looking for common things.

You're saying I have two tables and I have a common column for each of those, as we were just seeing those connecting columns. Often the primary and foreign key, although that's not a requirement. And what a join will do is it's gonna try to look for matches and say, well, we've got users and we've got orders and I wanna find where there's a match.

So could we have people who are registered on our website that have never placed an order? Could be, right? They created an account and they never placed an order. If I were to look through my users table, will I always find an order that matches? Not necessarily. Now, if I go through my orders, I will always have a user who placed that order because we always have to have a user for each order.

When I'm trying to connect, I'm trying to connect users to orders. So both of those tables have to have something referring to each other. So that is where the user ID is in both users and in orders.

It's in both tables. So essentially, I have a bit of duplicate data. You have to have some duplicate data to point between them like user number one, user number one, user number two, user number two.

SQL Bootcamp: Live & Hands-on, In NYC or Online, Learn From Experts, Free Retake, Small Class Sizes, 1-on-1 Bonus Training. Named a Top Bootcamp by Forbes, Fortune, & Time Out. Noble Desktop. Learn More.

There is going to be a little duplicate data but it's just short little numbers generally. So it's the least duplication that I can have. You're looking for these two columns because you're saying if this is user number one and this order is for user number one, they match.

That order goes with that user. Now, when you're looking through matches, could you, let's say you get to user number two and it looks through and it never finds a user number two. There's no match.

Do we want to include user number two in our result set? No, because we're just looking for people who have placed an order. So you could skip some rows and say, no, I don't want to include them in my result set because I only want to see people who placed an order. So what it does is it goes through all the rows comparing the two rows, right? So it says, let's look for user number one.

Let's go through every single order looking for all the user number one orders. And every time there's a match, we know that this user data and this order data go together and we'll form a new row in the result set. And it goes row by row for each order.

If that user number one didn't place that order, it's not a match, so it doesn't do anything. If they never have a match, it doesn't do anything with that user. It's gonna leave them off, which means you might get fewer rows than what you started with because you could have a hundred users, but if you only have 50 orders, you're not going to get a hundred rows because you're only going to get 50 rows for all your orders.

So, you might get fewer rows than what you started with. We're just looking for matches. So for example, we're going to use a different table here.

We will use a list of employees and their departments. These are two different tables. The employees table is on the left.

The department's table is on the right. For the sake of illustration only, I used Ds for the departments and Es for the employees. In your real-life table, they will just be numbers, but I wanted to help illustrate the connection here.

If you get all of the columns from your employees table, that'll give you your whole table, but we want to join it with the department's table, this other table, but how is it going to make connections? What is it looking for equality in? We need to join on one column matching another column. In the employees table, I've got a list of departments. We know which employee works in which department, but over here in the department's table, we see the name of the department with the unique identifier.

So notice that Amy works in D1. What department is that? Sales. You just mentally joined those two rows.

This is an unrelated row to this, except you found a match. That department, number one, is sales. We know that Amy works in sales.

What does Riley work in? Department three, D2. Is that sales? No.

Is that HR? No. Is that IT? Yes. So that's what a join does, it looks through and tries to find a match.

When it finds a match for D3, it will include that and say, oh, Riley works in IT. See how we're joining one row of one table to another row of another table? We're just making connections here, but there has to be some like data in a column of each table. And we're saying when the employee's table's department ID, so this is the table.column. So this column lives in this table.

When that column is equal to the department's table department ID column, so notice there's also this column here. When we say when this equals this, those are a match, connect those rows. That's what joins are all about, is trying to find a match and when it finds it, join those two rows.

So what it does is it goes through one by one and says, oh, that's a match for that. Let me take this row and this row and join them into one big row. We get all the columns from one table and all the columns from the other table joining together and you get a big row with all the data from both.

What about someone who doesn't work in a department? Can you match no department with something over here? No. So since they don't work in a department, there's no match to be made. They get left off of the result set saying, well, hey, Bruce doesn't work in a department.

So I can't match them up with anything. Let's leave them off. Tomorrow we'll see that you can include them with what are called outer joins, but right now we're just looking for matches with inner joins.

Inner joins are saying, I only want to find people who did place an order. I only want to find people who do work in a department. I'm just looking for matches.

I'm not looking for people who didn't place an order or people who didn't work in a department. I'm just looking for people who do. Just a match, that's all.

Now, when you're using aliases, there are two types. We've seen one of them. We've seen column aliases.

A column alias is a nice name for a column when you want to rename it or give it a name if it doesn't have one. Now we're going to see new table aliases. Table aliases are only used when doing joins.

And table aliases are simply to let you write less code. If you've noticed in DBeaver, you notice how it's been adding a letter after our table names? You're wondering, what is this annoying letter? That's for doing joins. They assume you're almost always doing joins, so they add a table alias afterward.

Now the word as, which we use to define an alias, the word as is always optional. If you define a column known as another name, you can just leave off the word as because you don't have to write the word as. Same thing whether you're doing column aliases or table aliases, as is optional.

It's okay if you have it. It's also, you can just get rid of it. The goal of a table alias is to give a short name because if you look at the first code here, see how I had to write out employees? See how I had to write out departments? That's a lot of typing to write out the entire name.

What if you could just say E for employees? What if you could say D for departments? That would be a lot less writing, wouldn't it? That's the point of creating a table alias. A table alias says, instead of having to spell out employees, let's just know it as E. Instead of having to say departments, let's just know it as D for the sake of doing the join. So instead of having to write out the whole table name, you can use the alias instead.

It's simply to write less code. And if you're writing less code, why even bother putting the as because as is optional? So you can simply say employees E, departments D, and then you can use the E for employees and the D for departments. So table aliases are not meant to be pretty.

They're meant to be short and simple, so you don't write a lot of code. That's their only purpose. And if you're not doing a join, you don't need the alias.

That's why we've been able to delete those letters and they haven't done anything because we haven't been using them because we haven't been doing joins. Now, as long as you're including all of the columns and you're saying show all the columns from both tables, you don't have to worry about using aliases in your select. But let's say you just want to show some of the columns, some of the columns from this or some of the columns from this.

If you think about the employee name right here, is there an employee name in this table? No, it's not confusing to say emp name because there's only one of those columns. Department name, it's only in this table. Notice there's no department name over here.

So this is not confusing. But if you say department ID, notice how there's a department ID here and also here. So this is what we would call an ambiguous name.

It's in two places. If I were to say department ID, SQL would return an error: ambiguous name. You must specify: do you want this one or that one? Do you want the one from the employees table or do you want the one from the departments table? Now, it knows that they match, but still requires you to specify which one you want.

And in the case of just a regular inner join, it doesn't really matter which one you use. You could use it from either table, but you do have to specify which table it comes from. And in this case, we could say, it's the department ID.

Let's bring it from the departments table. We'll use D because that is the alias from the departments table. You only need to specify the table it comes from if that name is used in both.

If it's a unique name like empName and there's no ambiguity, you do not need to say which table it comes from. Some people like to specify which table it comes from, so that is optional. You could say E.empName to specify that it comes from the employees table, but that is optional in that case.

But it's required if the name is ambiguous, if it could be coming from two different places. So again, to reiterate: table aliases are short, normally just one or two letters. The point of those is for joins.

You only use them for joins. If you're not doing a join, you don't need them. Unlike column aliases, where column aliases were meant to be nice names, longer names, pretty, nice things that are going to be in your result set.

And with either of them, you can leave off the as. As is always optional. It's okay to have it.

It's also okay to leave it off. In a column alias, if you're putting just a single name with no space, you don't need quotes. If you are putting a space, you need the double quotes.

That's never going to happen with a table alias because table aliases are meant to be letters, just short. It's to save you typing. So the point of a table alias is to be as few letters as possible.

So you're never going to have things like spaces in a table alias, because it's just one or two or maybe three letters, and that's about it. They're meant to be short and sweet.

Key Insights

Dan Rodney

How to Learn SQL