Solution- Aggregate Function and ROUND in SQL

Approach SQL queries incrementally to ensure accuracy and clarity.

Learn to efficiently use SQL to count and analyze data without loading extensive datasets. Understand the critical difference between accuracy and speed when querying databases.

Key Insights

  • Use SQL's COUNT function to quickly determine the total number of rows in a database table instead of loading and scrolling through large datasets manually, enabling efficient management of datasets containing hundreds or even thousands of rows.
  • Select the correct table based on your data analysis objectives, distinguishing between tables like "products" (listing all available items) and "line items" (tracking sold items), as each provides subtly different insights for average pricing or sales analysis.
  • Prioritize accuracy by carefully constructing query filters and testing wildcards explicitly before applying negative filters (e.g., using LIKE to verify inclusion before using NOT LIKE), thus avoiding false positives and ensuring reliable results.

Note: These materials offer prospective students a preview of how our classes are structured. Students enrolled in this course will receive access to the full set of materials, including video lectures, project-based assignments, and instructor feedback.

This is a lesson preview only. For the full lesson, purchase the course here.

So let's talk through these challenges here. If we want to know how many orders. First of all, if we just say I just want to view the entire orders table.

One of the issues is I can't just look down here and see all the accounts because it only counts 200 rows at a time. Because if we have a million users or a million orders or a million whatever, it doesn't want to return and download all that stuff. If we want to know how many orders there are.

What I don't want to do is scroll through them all, letting them load in increments of 200. That would take way too long. So I just need to perform a count.

A count of the total number of orders. Every row in the orders table is an order. If I count, I'm counting the total number of rows.

I get 500. I have 500 orders. If I switch that to users, I have a hundred users.

SQL Bootcamp: Live & Hands-on, In NYC or Online, Learn From Experts, Free Retake, Small Class Sizes,  1-on-1 Bonus Training. Named a Top Bootcamp by Forbes, Fortune, & Time Out. Noble Desktop. Learn More.

Whatever table you're counting, you're knowing how many of that thing you have. Count employees, count departments, count products. You can count anything.

It's just counting whatever that name is of that table. Count anything at all. Finding the average product price.

Now, are there different places where we can find the product price? Like we could find it in products, right? Is there any other place we can find a price? Line items. There's also a price there. So the question is, which table should we use? And this is where if somebody asks you a question, you want to know what table you're looking for.

You might have to go back and ask them further clarification of what they're looking for. Because if I'm looking for the average price of something that's in the products table or the average price of something that's in line items, what's the difference? Products may or may not have been sold. I don't know how many of them I sold.

Line items refer to the products I've sold. If I look at the average price from something in line items, I know the average price of products I've actually sold. If I'm looking in the products table, that's the average price of products I have available for sale, which I've not necessarily sold.

See how that's subtly different. And it depends on what the person asking the question wants to know. And if they're not clear, what do you have to do? Ask, just ask them.

What you don't want to do is just make an assumption, give them an answer, and then not bring up the issue. Now what you could do, let's say you don't have time to ask him a question. You could just calculate both of them.

Like it doesn't take long to calculate both of them. So I could just say, I want the average price from line items. And that could be one answer.

And then I could take the same thing and do it for products. They're almost the same in this case, but let's just say they were even different. You know, one's 53 and change.

The other one's 54 and change. But I could just say the average price of products available for sale is 53.15. The average price of products that we've sold is 54.20. I could just respond to them with both of those things. If I don't know what they want, because I might not want to wait for that response.

It takes a long time, and to be honest, it's faster for me to calculate it than it is to ask them a question. I could just give them both things, but I want you to start thinking of, make sure you understand the question.

Don't just rush into an answer because I don't care how fast you are. If you give me the wrong information, what's most important always is accuracy over speed. It's great if you're fast, but your ultimate goal must always be accuracy.

Don't sacrifice accuracy just because you're fast. You're like, wow, you're really good at being wrong. That's not, that's not good.

We need accuracy. Okay. The maximum product price that's not a lamp.

Now I want to go through the approach of this here when we're looking in the product table, because we need to get rid of lamps out of this. And I want to make sure that we do in fact, get rid of lamps because, you know, maybe lamps are going out of style, whatever. And people are not going to be using lamps anymore.

They're going to be using built-in lighting. We want to get rid of the lamps, whatever it is. I don't want to show lamps.

And if I say, show me all the rows where the title is not like a lamp. If I mess this up, I won't exclude anything at all. If I mess this up.

So for example, I've got 52 rows. If I just say, Bob, show me everything. It's not Bob.

I don't have any bobs. So what am I hiding? Nothing. Did I change how many rows I have? No, nothing happened.

I don't get an error because it's like, oh, hide all the bobs. Oh, sorry. I don't have any, but we need to make sure that this works.

So don't put the not in the beginning. Do the like. Look for lamp.

Now I know this is wrong. And we're going to get an empty result set, which says, hey, you're not finding any lamps because lamp could be anywhere. Could it end in lamp? Okay.

In this case, it ends in lamp, but I didn't know that. It could have started with lamp and then had some characters after lamp, right? In this case, I found that there were characters before the lamp. Could lamp be somewhere in the middle? In theory, but I don't have any in this case, but I didn't know until I searched for it.

And if you're saying not right from the very beginning, you don't know if your filter worked. See the filter work in a positive sense. Make sure that your I like worked.

And it's really important when you're using wildcards because you might be like, I'm sure I'm finding lamps, but are you sure you're not accidentally also including other things? Because these things can sometimes be way more powerful than you think. And you might accidentally include false positives that you didn't mean to include. Always look at your I like, see the results, make sure it's good.

Then you can flip the not. And you're excluding those lamps now. Now I'm down to 51, right? So always see that first.

Then out of all the prices, because I'm just interested in this list of prices, I want the max price out of all of those. So your answer should be 88.94. But you see how I added each piece of that. So don't just pay attention to in the end, what the query is.

Also pay attention to the way in which I build up the queries. I know when I first started learning SQL, you know, I'd watch some masters, it would just write from start to finish. And they just hit execute and be like, see, there's the answer.

I'm like, wow, you're really good. But I don't know what all those pieces do. So as I was learning SQL, and I started realizing that that's not how you always think from start to finish.

You start with all your data and you wean yourself down so you understand what each piece of code does. If you can think from start to finish, by all means, write that way. But I'm just saying that you don't have to write it that way.

You can start with a lot and filter it down and you can do things piece by piece by piece. You don't have to write it start to finish in perfection. So for example, finding the number of users.

If I'm in the users table, if I counted the number of users, I would know how many users I have. Some of them are Gmail users. So while I want to do accounts, the problem with doing accounts right away is that I don't know what went into that count.

I don't know what emails they have because I don't see it anymore. Once you do an aggregate function, you see the result. You didn't see what went into creating that result.

So I wait to do aggregate functions until later. I'm going to first filter and say, I only want to see rows where the email is like, and I'm using I like because I want to be insensitive to case. I don't care about case, upper or lower case, don't care.

And I know that it's going to end in at gmail.com. So that means there's going to be some number of characters beforehand. I don't know how many characters. I don't know what they are.

They're just something beforehand. So I like lets me use this wildcard of a percentage side. And that percentage sign says zero, one or more characters could be there, but we know it'll end in dot at gmail.com. I verify that this works, that I didn't accidentally include false positives.

And then I know that I can add my count. Once I've seen that that works because I saw what went into creating that count. So I think it's important that we kind of do it in that order.

And what took me a little while to figure out, but once I did, it helped me a lot, at least. And hopefully it'll help you is when you think about the written order in the end, you have to write your queries in this order. Like select has to come first, distinct has to come second.

And just to point this out in all of our levels in level one, level two, level three, I put the same exact file in all three levels, just to give you easy access to this file. This is a two page PDF with the written order on one page and the execution order on the second page. We have to write our queries in this order, ignoring SQL Server because we don't do that.

While you have to write it in that order as a final query, it doesn't mean you need to add the features in that order. What I have found is that the execution order is a much better way in terms of doing the features. I worry about the from first, and I start joining in additional tables if I need to.

Then I do the where, and these we'll learn later. Then I worry about things like select, like doing counts and things. But the idea here is that all of these things, as you go down, they filter your data.

When you start with one table, the only thing they can add is a join. And then we add up and we start with all of our data. Every step from here on down limits our data.

Where says let's reduce some rows. Select says don't show all the columns or do aggregation functions. Distinct gets rid of duplicates.

Order doesn't get rid of data, but they sort the data. And then limit limits how many rows we see. Each step further reduces how much stuff we see for the most part.

And so think of it like a funnel. And so I do the from, then I do joins, then I add where's, then I come back and do the select. And I found that that way I can see my data as long as possible.

Because I see it all and then I filter it down and filter it down and filter it down. That way I get to see what's happening as long as possible. At least that helped me to kind of think of it in that approach.

Yes, in the end, it needs to be written in this order. But it doesn't mean we need to add the features in that order. I can come back and do the select.

I could put an asterisk temporarily and then come back and fill it out later on. So in line items here, we've got returns. Returned is a status.

I'm interested in how much money is locked up in returns. How much money has been returned? So I only want to see where the status is equal to returned. I don't need wildcards statuses.

There's only four statuses. People would choose it from a menu. We're not having people type in a status because they make silly little typos and stuff.

These are all my returns. And now price times quantity would give me the value of those line items. Now I can come up here and if I just say price times quantity.

Row by row, it does price times quantity. Now I know the value of each row. Now I want to take that whole column and I want to add all those numbers up.

So see how when I do a sum. It will then take all those numbers and add them all up into one value. But you see how each step got less and less and filtered down to the point that I got what I wanted.

If you look at the products table. Notice we have product ID, title, right? So that's the title right there. So until you get familiar with your data, you simply have to look at it.

Either look at it over here where we can see that there's a title. And even looking over here, you can probably figure out what title is. But if you don't know what it is, just show yourself the table.

Just look at it. You have to look at your data to even know what's there. The first step in always doing anything before you even write any code is just to look through your data, understand what's there.

And you probably good idea to just select everything from a table and just show yourself the table so you can see what's there. And as you get familiar with it, then you'll know where these things are. So it's just like the town that you grew up in when you were young.

If you spent your whole life in a town, you know, like, you know, all the places to go. You know, all the restaurants, you know, all the buildings, you know it really well. But then you move to a new city.

You don't know anything. The first time you get into a database, you don't know anything. You don't know what's there.

You don't know what information you have. You got to look through. You got to start to do some basic queries before you do anything advanced because you're learning the names.

Eventually you'll get used to this. Um, some databases will be small. Some will be very large.

I had a woman in my class once. She was from Ketchikan, Alaska. And the guy that had been self-taught in SQL that worked on the Ketchikan, Alaska database for like the town of Ketchikan, Alaska, um, he had been like, I don't know how many years he was working there.

They gave her one year to replace this guy that was going to be retiring. They have 1500 tables in their database. All sorts of like public utilities, taxes, all sorts of information for like the whole town.

Evidently they have one giant database of 1500 tables. Now that's not your average size database. So don't be scared by that.

But, but they gave her one year to learn all of the tables, all of the stuff that they, that that person was doing. And so she came to learn SQL here so she can go back to her town and start to do all the stuff that she needs to do in SQL. But she's got a lot of time to go through all of those tables to see like, how does she even know what's there? She's got to look through it.

It's like when I first came to New York city, it seemed daunting. Every single street corner looked the same to me. But now you put, you pop me in a particular street and I'll be like, oh, I've seen this before.

I know this. The first time that I made a subway, I went down into a subway, went to another stop and came out and I mentally connected above ground to below ground. New York city shrank so much smaller.

I'm like, I can mentally understand the city. The same thing as like a database. Once you get used to it, you get used to the names, you know, the columns, you know, the tables, because you spent so much time in that particular database, you'll eventually get used to it.

In the beginning, you just got to look through it until you get that familiarity to know those things. But you can just look at it, just do a query and you see what's there. And then you know where to look.

Find the average price of all products containing the word 'hat'. So let's just say we're doing this from the products table. Technically we could, if we're looking for hats that sold, we could be in the line items table.

But let's just do this in the products table. We're just going to look for products and find the average selling price of products based on their currently listed sale price. Okay.

So I just want to look at hats. So what hats do we have? So I need to see only rows where the names are here in title. Hat is somewhere in there.

So I'm going to say, I don't care about case. So I like in case it's insensitive to case. Hat will be somewhere in there.

Now I'm curious, do I start with a hat and then I have characters afterwards? Nope. Do I end in hat? Yes. Is hat somewhere in the middle? No.

The rustic concrete hat feels so good on your head. The rubber hat's a little better. Funny, funny product names here.

So I got two hats here. Calculating the average isn't too bad, but imagine, you know, that could be, that could be 20,000 hats or something, depending on how many products you have. Now I want to find the average price.

So I want to look in that price column. Those are my prices. And I want the average price, which is right in between those two prices.

Step by step, weaning things down to what you want.

photo of Dan Rodney

Dan Rodney

Dan Rodney has been a designer and web developer for over 20 years. He creates coursework for Noble Desktop and teaches classes. In his spare time Dan also writes scripts for InDesign (Make Book JacketProper Fraction Pro, and more). Dan teaches just about anything web, video, or print related: HTML, CSS, JavaScript, Figma, Adobe XD, After Effects, Premiere Pro, Photoshop, Illustrator, InDesign, and more.

More articles by Dan Rodney

How to Learn SQL

Master SQL with Hands-on Training. SQL is One of the Most In-demand Programming Languages and is Used Across a Variety of Professions.

Yelp Facebook LinkedIn YouTube Twitter Instagram