Subqueries in SQL

Explain how subqueries function within SQL statements, emphasizing their execution order, use of parentheses for nesting, and their ability to pass results to an outer query.

Master the power of SQL subqueries to streamline complex database queries. Understand how nesting SELECT statements can simplify retrieving precise data across multiple tables.

Key Insights

  • Use parentheses to nest subqueries within SELECT statements, allowing the inner query to execute first and pass results to the outer query, such as finding line items corresponding to the most expensive product.
  • Understand that subqueries can handle multiple tables without joins, but joins typically perform faster and offer clearer code when matching data across tables.
  • Recognize that subqueries can produce single values, lists of values, or entire tables, which can then be utilized in WHERE conditions, IN clauses, or JOIN operations depending on their output format.

Note: These materials offer prospective students a preview of how our classes are structured. Students enrolled in this course will receive access to the full set of materials, including video lectures, project-based assignments, and instructor feedback.

This is a lesson preview only. For the full lesson, purchase the course here.

So let's talk about subqueries because we've been kind of leading up here to subqueries So this is now our level three class files here So a query is a select statement When you put a select statement inside of another select statement, we call that a subquery. It's a query inside a query Uh, or some people say inner select versus outer select because you're nesting them in You can nest them infinitely Put queries inside of queries inside of queries And the way we do it is with parentheses Now in math when you put parentheses, how does that change the operation of your math? The stuff that's in the parentheses happens first So that's how a subquery works. We simply put parentheses And by putting parentheses we change the order of operation And we will now execute the inner select or inner query the nested query It'll do that first and then pass its results to the parent query So think about what this inner query does we just look at this This will go into my products and say look in the price column and find the most expensive product price Then look at what the parent query does It goes into my line items table, which happens to be a different table And it says let's find line items and only show rows where the price is equal to the most expensive product price But how would I not know what the most expensive product was unless I had done a query to find that out? See how that works Now it must do the inner query first Because for the outer query to get a result it needs to have first done this code to know what that gives you The output of this query is a single number Which is going to be passed in so where the price is equal to, you know, let's say it's 50 dollars and 21 cents It'll then find line items where the price is equal to 50 dollars and 21 cents So it'll always run that nested query first Ah That's how we can put multiple queries together when one query alone is not enough Always the inner query is executed And then it's like it's running that and just taking the result and copying and pasting it into the parent query And then running the parent query So because these are independent separate queries They can refer to the same table.

They can refer to different tables And this is the first time we can deal with different tables without having to join them Uh Now a lot of times we do joins when we want to have all the columns and we want to affect how many rows we have But this is the only time Aside from joins where we can actually deal with multiple tables All in kind of one query, but it's not really all in one query because they're separate queries So the point of subqueries is each subquery Each query is an independent thing And you could talk to the same table or different tables But if you want in a single Query, if you want all the columns and the appropriate rows Joins are still how you do that. So subqueries in no way replace joins. They're just a different feature Depending on what we want to use We could use them separately.

We could use them together. They're just different features I just bring it up because most people think they assume Two tables I must join That's not always the case I'm going to say that's probably the most normal case But sometimes subqueries are the way you go Also in certain cases you can choose to write it in a subquery Or a join because they can deal with multiple tables In most of those situations, I would say it's better to just do the join Subqueries are going to run one query and then run another query So in terms of speed, to run one query and then run another query takes a little bit more time than to just run a single query so most of the cases If you could choose a subquery or a join joins are going to be faster, clearer in code. I'm not saying that every subquery can be written like a join But sometimes they can And I would choose joins So, let's see how these things work All right So this is going to be in our level three folder 1.0 subqueries We're going back to our company data.

So I'm going to choose my server It defaults into company data and we're back in company data where we spend most of our time So I'm going to close up game shows And just reopen the company data We can see our table So let's say I'm in line items We got lots of prices If I want to see products or I should say line items Line items where the price is equal to a certain amount I could just copy and paste that up there and find all the line items for that price But what if I want to find it for the most expensive products Some people would say well I'm going to separately write a query down here And go into my products table And I'm going to look in that price column and out of all those values I'm going to find the maximum price And I'm going to figure that out And see oh, it's 89.58 I'm just going to manually copy and paste that up into this query And run You could do that, but that's a very manual process. Now every time you want to do this again What if there's new products? Now this value might change And so you'd have to keep copying and pasting doesn't seem very efficient, right? But what you could do instead is just to say hey instead of this hard-coded value Let me just replace it with parentheses That allow me to nest in another whole query that does that job And so this will run this nested query first to calculate that amount Pass it into the parent query. When you run the whole thing It finds things for that price. So you don't have to be copying and pasting I show you that because behind the scenes that's kind of what it does for you, right? And when you're trying to understand someone else's subquery Always run the nested query first to see what does that produce? And then think about how that gets passed to the parent query and how that parent query uses that value that it creates Because that's what parentheses do they say run this code first We can insert using parentheses.

We can insert a nested query. So it'll run that first And then send it to the parent query Now we call this a single value because it produces one single value and that's what price wants Price wants in this case price equals a single value I couldn't give a list of values here when I'm doing equals, right? I couldn't give like a whole table to this for example, but it just wants a singular value just one value But let's say we do something a little different if I go into my Users table I want to find my gmail users So I find I'm going to do I like I know I have like here. That's just because this works in SQL Server Whenever I'm doing PostgreSQL, I always use most typically I like most situations Most of the time I don't need to be in case that's it Every once in a while I do but most of the time I don't need to be So this gives me all the gmail people Now if I want to somehow connect that to orders Because in orders I don't have people's names people's emails those kinds of things, right? The only user data I have in orders is user ID Just user ID.

SQL Bootcamp: Live & Hands-on, In NYC or Online, Learn From Experts, Free Retake, Small Class Sizes,  1-on-1 Bonus Training. Named a Top Bootcamp by Forbes, Fortune, & Time Out. Noble Desktop. Learn More.

That's it Now in this query here, oh actually before I do that, let's say I'm trying to find one user ID If I said let me find What if the user ID is user number two? Like show me all the orders from user number two or user number five All right That would be one particular user. That would be a single value What if I want to find multiple people? I could say in Is it in this list of users Is it two or five or ten? And it'll find all the orders for all of those users who are in that list Is this a list of values or a single value It's a list of values. It's multiple values, right? We're not saying it equals one person.

We're saying it equals many people So what goes into these parentheses is a list of multiple values Now down here This is all my gmail people and over here are their user IDs If I only show that user ID column What do I have here? I have a list of user IDs, right? 3 7 14 and so on Up here.

I already have parentheses. Oh, I already have parentheses This query produces a list Up here.

I want a list This produces a list of user IDs up here. I want a list of user IDs If I take this query and I nest it Up here And by the way, you can use tab to indent Shift tab to outdent And it's not required that you indent But I like to do it so that I can see the nested query inside I just think it makes it more obvious what I've nested in Also be careful with nested queries. You cannot put a semicolon In your nested queries because it'll think your whole query ends there You can only ever have one semicolon at the end of the entire query So just make sure you do not use semicolons for your nested queries that will break it Now it passes this list of user IDs Into the end and then it will find those Because in expects a list Of multiple values and this query produces a list of multiple values We always need to make sure that where we're putting the subquery Is in a place that we match the expectation Does it expect a single number or a single string or whatever it is Or does it expect a list of things, multiple values? They need to match This needs to produce What the parent query needs? Now, of course, I also could have just done this with a join And this is one of those things where something like this up here I couldn't do this with a join like I couldn't calculate the max price to then search Using a join But also down here this technically I could have just done a join and added the where to a join Like just to show if I wanted to do this with a join all I want is the order data and the Line item data, right? Well, we could join those We could join them on The order ID, right If I was joining on the order ID I could just say hey, let's take the orders table And join I just want matches like orders that had the line items I'm not looking for outliers here uh Remember join is the most common join you'll ever do It's the default one.

You should always look for normally in most situations God bless you And I'd be joining on the orders order ID So each table has an order ID So I've got all the order data on the left On the right I've got the line item data And then I could just do my where oh, sorry, uh No users, why am I going to line item? Sorry, that's my mistake. I wanted to go to users. Sorry I don't need line items.

I need users Sorry, my bad users How am I joining users? User ID, sorry user ID so this is user This is users There we go. Yeah, because I want order data and I want User data because I need that email And I can still use my where And this way I'm seeing orders From people who have a gmail address So in a case like this Could I do it through subquery? Yes Would I do it? The join is less code Simpler and only requires one query instead of it having to do the nested query first Then do the outer query I just want to point that out that I'm not saying every subquery can be written as a join But if it can Just do it as a join Don't do it as a separate the first example. We could only do as a subquery; you can't do that as a join So sometimes people when they learn later things they're like, oh I'm going to start using this for everything It's just another feature that we use sometimes don't think that later things you learn replace earlier things These are just another feature that we use sometimes subqueries We try to avoid them whenever possible.

We only use it when we have to Because they do complicate things you got a nested query versus the outer query so when we need them we use them But if there's another way to do it Probably rather just do it another way. It'll keep things simpler Now We're now going to put the two together Because the issue in the last exercise at the end is When we're playing with the jeopardy table, which means I need to switch over to the jeopardy uh the Game shows database. So I'm going to click on here Select game shows and then double click on public Now I'm in the game shows database And I'm back to deal with jeopardy over here This is the first multi-line comment by the way You can do a double dash on every single line But if you want to comment out a lot of code This can be the start of a comment and then it'll ignore all the lines until the end of the comment Notice how that alternates So this is the start of a multi-line comment and this is the end If I uncomment that Then all of this code just by deleting two parts becomes uncommented so now I can run this code if I want to So this is what we did at the end of the previous exercise And what we're doing here is we're saying I want to see a list of all the air dates I'm grouping by the air And for each air day, I want to see what's the highest point value And the kind of natural question that sometimes people have is But I want to know the question With that highest point value.

I want to know more And we need to do the group by here Because we needed to figure out what this max point value was for air date We need to do this But it's not enough to do it in the group by because the group by I can't add any more columns here So what I produced here is a table And if I want to get the question where do my questions live? In the jeopardy table So I produced the table here And I need the values that are here But then I also have that jeopardy table Uh, and so I'm going to take this here and I'm gonna bring it down here Okay so This produces valuable information because I needed to figure out For each air date, what was the highest point value? But then separately I have the jeopardy table And in there I have the questions And so when you look, you know, see how there might be multiple questions that have a certain point value So I want to find some way to connect two tables together Oh a way to connect two tables together is through a join, right If I'm looking for some common thing that ties them together now This is something I can't do through the ER diagram Because my subquery is not a real I quote real table Does this live in my database? No, it's just something that's created by the query But it's still a table that I want to join to this and I need to find some common Tie that binds these together The value Uh, so the value is definitely a way to join it. Yep Because I want to see like okay, so I don't want to join a 200 To a 1300, right? So I do want to Have the same point value But it's just joining on point value enough It needs to also consider the air date, right So I need to actually join on two columns I never said you could only join on one column That's all the examples we've seen so far is that we've said hey, these two columns have equal values I never said we were only limited to that We can actually join when this set of columns are equal and this set of columns are equal to each other Yeah question Joins are so okay, so we'll see how the join works. But remember when joins decide to insert a row If both of these things are true It's going to find all of the rows where those two things are true Because joins go through and they compare all the rows And it will output all of the rows We're gonna do a join on these Right Okay, so we're gonna select everything from jeopardy and because we're doing a join we got we're gonna use an alias here.

Okay Now what am I joining it to I'm going to join it to this table that's created by this subquery So I need to put in a parentheses Instead of the actual table name because anytime you insert parentheses you can nest in a subquery Okay So I'm going to put in my subquery here I'm going to paste that in here So I am joining a real table To a table that is produced by query and I need to find something to join on Okay um And also if this is j How am I going to refer to this table because does this table even have a name? No If you think about it this way if you say jeopardy is going to be known as j This table should be known as and then I can name it whatever I want I'll name it f as in found You could name it s as in subquery I don't care what letter you come up with just don't name it j because you've already used j for jeopardy You're just naming this whatever you want But if j is going to be for jeopardy, we can say f as in found or s as in subquery or whatever you want Figured out Now I've got these two tables and I need to see how am I going to join them? So in jeopardy I've got value And in this subquery if I just run this subquery here So, oh bless you Here I've named this I've said that this is my found max value. This is the one that I found to be the max value So I want to compare the value in jeopardy to the value here so Oh now value Is not the same as found max value as far as the name it's not the same name same content inside But for the first time I can just say value equals that Because the names are not exactly equal to each other.

It's not ambiguous I don't have to say which table they come from because it's obvious There is no found max value over here And in this table, I don't have any value. So it's obvious which one they come from SQL will knock away But it's not just enough that I say 1300 equals 1300 I also need to match the air dates So I'm only joining on when this is true and I need to see does this air dates match that air dates? Now those names are the same So one will be in the jeopardy called air dates One will be in my found table called air dates So I'm joining only when those things are both true So the idea here is that we're getting all of jeopardy getting all the columns from jeopardy And now we also have these things over here as well Because we got those two columns we join them together And it only says there's a match when both of those things match when this air dates Matches this air dates And when this value Matches this value when both of those things are true Then it says those are a match And it only includes rows When they're a match Because keep in mind that when you do a join Okay the Default result set is empty And it then goes row by row and decides if there's a match included in the result set And it's only including things in the result set when there's a match If there were never any matches you'd get nothing in your result set So it just says hey This thing that we figured out This air date and that max value when those match the full jeopardy table Include those questions And so now we can see all of this stuff Not just the question, but we can see everything we can see the show number the category the point value the question and the answer And if I were to sort to order by I'll say order by air date Now it gets confused because I have two columns It doesn't really matter which one I pull from I'll pull it from the jeopardy table And order by why is that not working? Oh stupid me Semicolon But it still wouldn't work without the j They would have said it's ambiguous because I've got two air dates It doesn't really matter which one I pull from technically because they both would be a match But just so we can see the air date here Notice that for any given air dates All of these are a thousand So notice how many questions have that highest point value And then if I go to the next like here there was only one question with that highest point value Here there was only one But some shows had multiple Some shows have a lot Right, that's why we can't just add a column in a group by And this is why your queries can start to get longer Because you're piling everything So this kind of puts together almost everything we've learned so far all into one big query I'm not saying every query is as complicated as this but they can get complicated like this So this is an example of a subquery And in this case we're using as a table So you always have to look at what your subquery produces And say how can I use that in another query if I need to combine my queries? And if it's a table, you got to use it where you'd use a date That could be in a from it could be in a join So think of the results as it results as a single value A list of multiple values or an entire table Does this produce an entire table? Like I could use that in another query This also kind of drives home this idea of you're creating a result set You know, you're literally using something that you created from your database And you can join it to something else This is not a quote real table in the sense that it lives in my database I don't have to store it in my database to use it It'll run this generate this table and then join it to the other one It's not stored anywhere, but it works for my Particular query Mind blown a little bit I know it starts to build up This takes a little bit of processing power Don't worry lunch is coming up soon Can recharge everything Okay All right. So, uh, we got a couple challenges here.

Uh, let's go through the 1.0 subqueries And this is in the level 3 Let's go through those challenges

photo of Dan Rodney

Dan Rodney

Dan Rodney has been a designer and web developer for over 20 years. He creates coursework for Noble Desktop and teaches classes. In his spare time Dan also writes scripts for InDesign (Make Book JacketProper Fraction Pro, and more). Dan teaches just about anything web, video, or print related: HTML, CSS, JavaScript, Figma, Adobe XD, After Effects, Premiere Pro, Photoshop, Illustrator, InDesign, and more.

More articles by Dan Rodney

How to Learn SQL

Master SQL with Hands-on Training. SQL is One of the Most In-demand Programming Languages and is Used Across a Variety of Professions.

Yelp Facebook LinkedIn YouTube Twitter Instagram