Conditionals in SQL - Free Video Tutorial

Unlock the power of SQL's CASE statements to streamline your data queries and enhance readability. Learn how to categorize and correct data dynamically within SQL queries without altering your database structure.

Key Insights

Discover how SQL's CASE statements enable conditional logic without traditional if-else syntax, allowing for dynamic data categorization and customization directly within query results.
Understand the importance of condition order within CASE statements, as SQL evaluates conditions sequentially and executes only the first true condition to optimize query efficiency.
Learn practical applications of CASE statements, including correcting inconsistent data entries (such as standardizing spelling errors) and creating meaningful labels or categories, facilitating improved grouping and aggregation within result sets.

Note: These materials offer prospective students a preview of how our classes are structured. Students enrolled in this course will receive access to the full set of materials, including video lectures, project-based assignments, and instructor feedback.

This is a lesson preview only. For the full lesson, purchase the course here.

Conditionals. Most programming languages have if-else logic. And SQL doesn't exactly have it.

SQL kind of has it, but they don't do it in if-else language, like most programming languages do. They use something called Case. This is SQL Server, so we're just going to ignore that.

So what Case does, it still creates a situation where you can list a set of conditions to do things to say, well, if it's this, do something. Else, if it's this, do something else. But we do it through a Case.

So you get different cases. There's two kinds of flavors. If you just want to perform an equality check to say, does this equal this? We call that a simple Case.

If you want to do anything else, then you do a search Case. So let's say you're doing a simple Case. Imagine you're creating a new column in your database.

SQL Bootcamp: Live & Hands-on, In NYC or Online, Learn From Experts, Free Retake, Small Class Sizes, 1-on-1 Bonus Training. Named a Top Bootcamp by Forbes, Fortune, & Time Out. Noble Desktop. Learn More.

Well, not in your database, but in your result set. So let's say you have a column already called Bike Type. And let's say in your database, Bike Type has 'R' or 'M'. And they do that for just minimizing storage because they know that 'R' is equal to Road, and 'M' is for Mountain.

But they don't want to take up a lot of storage space in the database. So they just put these codes in the database. If you're showing somebody a nice result set, do you just want to see R and M in the result set? That might look pretty ugly, right? So instead of using the values that are in the database, could we maybe use those to create a new column that has nice-to-read values? We could do something like a Case for that.

We could say in the Case that the Bike Type column contains an 'R', then in this new column, output the string Road. Notice the same quotes. That's for strings.

That's for text. In the Case that the Bike Type column equals 'M', because with simple Case, it's all about equals. If Bike Type is equal to 'M', then it's Mountain.

Otherwise, in every other situation, just say it's Unclassified. And then we end the Case. We start and end the Case.

And we could put this whole piece of code that would create a column. The way that cases work is we do different cases, right? Different situations. The first one will be evaluated first.

And it'll say, "Hey, is that true? Does Bike Type equal 'R'?" If that's true, you'll get the result. So it'll be Road. If that's true, it just goes to the end and it repeats going to the next row.

So it goes row by row and evaluates this. "Hey, does this row contain 'R'? If so, make it Road." Oh, if it's false, go on to the next expression.

Does Bike Type equal 'M'? If so, make it Mountain. If none of those are true, only then will it get to Else. It does not evaluate all of them to see which one is most true.

It just says, "Give me the first result that's true." So, in other words, if it's 'R', it'll never evaluate 'M' or Unclassified. As soon as it hits 'R', it's done, and it moves on to the next row and starts looking through the cases.

That's to minimize the work. Why compare it to all the values when SQL can just find the first one that's true? That's all SQL is trying to do. Now, in this Case, notice the column name is up here.

I'm saying, in the Case that Bike Type contains 'R', this is always equality in this simple Case. With Search Case, you can do more stuff. And notice I'm putting this here in the SELECT, where we're actually creating a new column.

And this Case will create a new column known as Pricing because this column we're creating doesn't live in the database. I'm creating a new one.

You've got to give it an alias. We create it from the Products table.

We can output the Title column, create a new column known as Pricing. And instead of putting the name of the column here, I'm putting it down here and saying, when the price is less than $20, then in this column, output the string Cheap. When the price is less than $50, output the string Moderate.

If none of those is true, in every other situation, output the string Expensive. So I'm going to get one of these three strings filling up any given row, right? So it says, "For the first row, what's the price?" Now, some people will say, "Well, Dan, what if it's $10? Isn't $10 less than $20 and less than $50?" Remember what I said before? The first one that's true will then go to the end. It will not make it to the next one if the first one is true.

So yes, $10 is less than both of those. But if $10 is less than $20, it's Cheap, and it's done. It will never get to the second line because the first one is true.

Now, if you reordered these, you'd have a problem because anything less than $50 would be Moderate, and it would never get to making anything Cheap. So the order of these things does matter. If it's less than $20, it'll be Cheap.

So you know that by the time you get here, the only thing that can get here is if the price is $20 or more. That's the only way you can get to this because it's just looking for the first thing that's true. It doesn't evaluate them all. It's just looking for the first true condition.

That's to minimize work. So let's see how this works here. I'll open up my query using Case, and I'm going to choose my normal Company Data database.

All right. So let's say I'm outputting data from my Products table. Let's say I put in Title and Price.

So here I see all the Titles and Prices of my products. And I want to create a category. I don't have categories like Cheap, Moderate, and Expensive.

So I want to create a new category, and I want to choose my own labels for these with my own criteria of what gets put into these categories. If you think about it, remember when we did Group By, we had to use groups already created in our database. But what if you want to define your own groups, either as labels or for grouping? We can define any categories we want at any time, directly in the query.

If I say, let's create a whole new column as a Case. Notice how it was added in there—I did type it in.

Although it did add an extra space, which we don't want. In the Case, we say what we want to detect.

So I'm going to say when the Price column is less than $20, then say it's Cheap. Notice, in this Case, no pun intended, everything's null unless it's less than $20, and then it's Cheap.

I didn't say what to do if it's not. So it just leaves it empty. For all the other situations that aren't Cheap, I could say Else, if it's not Cheap, then put in Expensive.

Now it's either Cheap or Expensive. I just have two groups, but I can put in as many groups as I want. I could say when the Price is less than $50, then it's Moderate.

Again, let's say it's $30 or $18. $18 is less than $20, so it's Cheap and done. It'll never get to here because it goes automatically to the end.

If it's $30, this isn't true, so it gets here. If it's $88, this isn't true either.

This isn't true, but this one is. Now, if you reorder these and you put this one first, you're going to have a problem. Nothing will be Cheap because $20 is always going to be less than $50.

So it'll think it's Moderate. Only $50 or more would get through here. Nothing will ever make it to Cheap because we reordered it incorrectly.

You have to pay attention to your order when you're doing this. While I suppose you could say things like, "If the Price is between $20 and $50, " it makes your query needlessly complex. With something like this, if I change it to $30, I only have to change it in one place. It's very easy for me to change those.

Why put it in twice when I don't have to worry about that? Because I know that if it's true here, it's never going to get here. We don't have to make our cases more complex than they need to be. I could label it as a Case, right? I just want a label, right? And I could say as a Category.

You can call it whatever you want, right? Call it Pricing, Category, whatever. That's up to you. And I might just want it to be a label there.

Maybe I'm exporting this into Excel or Power BI or whatever. And I want those labels in that other application. But what if I want to do something here? I could do more things with this here.

There's the alias. Also, if you want to, you can Order By that. For example, grouping all Cheap together and all Expensive together.

Yes, it's just like any column—you can Order By it if you want. But I think where this could be interesting is if we use it with a Group By.

Because if I choose to group by this category—now, remember how a while ago I said, and you might've forgotten, but Group By in PostgreSQL only supports aliases in Group By—I'm very happy that Group By supports aliases, even though HAVING doesn't, which is weird.

In theory, Group By shouldn't work, but it does. They did like a look-ahead to see that it will be created. Thankfully, we can use the alias; otherwise, we'd have to repeat the entire Case statement.

That's not nice, but luckily we can use the alias. Now, anytime you're doing a Group By, you typically only want to show the grouped data, so I can't show Title and Price.

I'm just showing the group, which I know is like the biggest definition of a column ever, because there's a lot of code just to create one column. But then if I put a comma, I can put my aggregate function after that. That's just one big column, right? And then what's my aggregate function?

Maybe I want to count how many products I have in each category. If I were to count, this would tell me how many products I have in each of those categories—categories that I defined myself.

If I change them, I'm moving things between categories very easily because I define what the categories are, and I don't have to store that in my database. I could define any categories I want at any time, directly in the query. So Group Bys are not just limited to what's in your database.

You can make up your own labels or categories using Group By. That becomes really useful. That's why I said that aggregate functions are so useful.

You can do so many things with them. Anything with SQL Server, just ignore the SQL Server stuff. SQL Server—just ignore the SQL Server.

And that's all SQL Server stuff, so just ignore all that. One last thing: we didn't cover a simplified Case. So simplified Case—we're going to do something a little different here. In the Products table, I've got a list of tags.

I want to show you those tags. Now, if I were to group by those tags, I want to count how many items, how many products have those tags. To simplify, I'll type '1' to sort by the first column. So gray—gray is spelled two different ways.

Gray with an 'a', gray with an 'e'. This is a bit of a problem because it's throwing off my group counts. Because somebody typed it incorrectly. I don't care which way we go, if we like the 'e' or if we like the 'a', but I don't want there to be two different spellings of this.

That's a bit of a problem because I want it to be merged into one group of either name with a five next to it. I want to merge them together, but I can't change my database. Now, of course, if you can ask somebody to go and change the database and fix it, by all means, do that. But let's pretend for a moment that we can't.

Is there a way that I could modify it just for this query, to change something? Yes. So what I'm going to do here is I want to have a new column for my fixed tags.

So let's say before I do my whole thing, let me go back. So here's my tags column. And I want to do a new column, which is going to be my fixed tags.

And my fixed tags are basically going to be all the current tags, except I want to replace gray with an 'e' to gray with an 'a', for example. So I need a Case to create this new column. And in the Case—and this is a simple Case—I'm just looking for equalities.

So I'm always going to be in the tags column. And when that equals—let's say I don't want the 'e'. Maybe I want to go from the 'e' to the 'a'. When it's gray with an 'e', change it to gray with an 'a'. This will create a new column. We have all the regular tags.

And gray with an 'a'—we're not changing that one—but notice gray with an 'e' gets changed to gray with an 'a'. Now, the problem is if I'm going to use this new column, which I'm going to let be known as fixed tags, if I'm going to use these fixed tags, don't I need all these other ones to just move over as they are? I don't want to have to say when Lemon equals Lemon, when Aaron equals Aaron; I don't want to repeat all those because that's a whole lot of tags.

So in every other situation, just move the tags over as they are. Don't change anything else in the current tags column.

So essentially you're saying, when the tag is gray with an 'e', change it to an 'a'. Otherwise, just move it over exactly as it is right now. Don't make any change at all. So for everybody else, they just move over.

And it's only when these exceptions are there, that there's an actual change. So this is like your exceptions list. So for example, just to show you one other thing—not that I need to do this—but let's say 'poos' is supposed to be 'prus'.

'poos' becomes 'prus'. So you can list all the errors or things you want to change. I only want to do gray.

But now when I do my Group By, instead of using the built-in tags, I'm going to use my fixed tags instead. So I can still use my Group By, but I'm going to use my fixed tags. And the only two columns I'm going to show are the fixed tags and the counts that I'm going to do.

And I'll order by the first column. We get the gray together, and notice the gray with an 'e' is eradicated. And that's been fixed.

And then just ignore the SQL Server because it's a bit different. So just ignore that. And this is also SQL Server.

So just ignore that.

Key Insights

Dan Rodney

How to Learn SQL