Discover practical SQL techniques for extracting valuable user account insights, from identifying the oldest accounts to pinpointing weekend sign-ups. Uncover multiple approaches to easily analyze date-based data and improve your query writing skills.
Key Insights
- Utilize SQL's
DATE_PART
function to isolate specific date components, such as year or month, enabling efficient calculation of account ages and identification of unique months when accounts were created. - Apply conditional filtering techniques (such as using
WHERE
clauses withBETWEEN
,IN
, and comparison operators) to efficiently segment data, for example, finding exactly 30 out of 100 accounts created on weekends. - Recognize multiple equally valid SQL solutions—like using either
MAX
or an ordered limit—to retrieve similar data results, promoting flexibility and clarity in query writing.
Note: These materials offer prospective students a preview of how our classes are structured. Students enrolled in this course will receive access to the full set of materials, including video lectures, project-based assignments, and instructor feedback.
This is a lesson preview only. For the full lesson, purchase the course here.
Let's talk through these solutions here. One thing that also helps when you're trying to think of an answer and you're trying to envision a result is try to think about what do I want to see in the end? Like imagine the final results and think how do I create that result? That might help you to think through how to write your code. So how many years old is the oldest user account? So in the end, I want to see a number.
I want to see a single number, you know, three years, 20 years, you know, like credit cards will do this. You know, like, you know, thank you for being a member for 22 years or something. They'll kind of calculate how old your account is.
So out of all of the user accounts, I want to know how many years old is the oldest one? So I want to have a number. I want to how many different, how many years different is it between today's year and the year that that account was created? So, because I need to know how many years old each account is to then decide what's the oldest one out of all of those. And sometimes there's different ways you can go about doing this.
So for example, here, these are all the users and I have the created_at information. And I want to see for the year part of that. So let's say I look at the date_part of this, this a little bigger for you.
I want the year from that, because that's all I care about. How many years old is this from created_at? So here I have all the years. If I take today's year, 2025.
Now you could just say 2025, type it in, but what if you're going to use this multiple times? We could get the year from right now. So to show more SQL features, I can simply say from instead of created_at, which is in the database, I can say now, and now we'll give us the current year. And I can say, take that current year, which is 2025 minus the one from the database.
So 2025 minus whatever that year is. Now out of here, we're gonna get all sorts of numbers. Some are gonna be bigger, some are gonna be smaller.
Out of all of those, I want the biggest amounts. What is the oldest? Right? How many years, like 20 years old, six years old. If I want the biggest number, I want the max number.
And that will give me the highest number in that column. Here's another way to do it. Although it would be more code to do it this way.
If I were to order by that column, and I only have one column. So instead of writing out the name, I'm just gonna say order by the first column, which is from one at the top going down. So I'd need to reverse it with descending.
I could then say, just limit it to one row because the biggest number would be at the top. Now that is more code to write, but it would function the same way. It would give me the biggest amount at the top.
I think max is more efficient, but this would still give me the same answer in the edge. So there's not always just one solution with SQL code to do things. Sometimes there are different ways you can approach it.
During which years were user accounts created? So in other words, I want to see a list of all the years, you know, 2019,2020,2021, like what were the years that people created their accounts? Just show me a list of all those years. If I go in and say from my users table, I want to see the year, cause that's what I'm interested in. So once again, I'll do the date part, and I want the year from created_at.
I've got a list of years. Every single user has a year. What do I do next? I got a lot of copies here.
A lot of duplicates here. Yeah, I had distinct, because distinct eliminates duplicates. And now I have a nice little list of just the years without duplicates.
Also, we can sort this. Now notice how they call it date_part. That's like the default name.
If we don't give an alias, if I want to order by, I could just say order by the first column in my result set cause I only have one column. Or I could say date_part because that's the default name that they gave it. But if I just want to do it quickly, I could just say the first column in my result set.
I can use either a number or a name to refer to a column. I only have one column. So this is a time when I think it's perfectly fine to just throw in a number because it's short and sweet.
And I only have one column anyway. I'm not going to reorder columns in this case. So while normally I like to use names, this is one time when I think it's fine to use the number instead.
How many people created their account on a weekend? If I'm looking at my users, these are all the users. And if I put the count on right now, I can't see what goes into that count, but I would see that I have a hundred users. Okay.
Now, if I'm trying to filter things down, and I don't want to see all of my users, I only want to see users who were created on a weekend. I'd want to filter out some of these rows because some of these rows are users who created their account on a weekday. So I only want to see people where their date_part of created_at is that the day of the week is Sunday or Saturday.
So I would say date_part, day of the week from created_at, and that's going to give me anywhere from zero through six. Zero is Sunday, six is Saturday. If it's found to be equal to either zero or six, we know that Sunday or Saturday.
And now we're going to get a lot less people. Now, as far as spot checking yourself, doing that little sniff test, does this smell right? You might want to do a little spot check just to make sure that you did this correctly. Now you could always say what day of the week was, and then just spot check yourself.
Saturday. Good. Right? So you might want to spot check a couple of days just to make sure that you're on the right track.
And then we want to count just these people. We count, I would see that 30 of them were created on a weekend. And if we fill up a not in there, obviously the other 70 were created on a weekday instead of a weekend.
So, alright. Everybody loves this question. I say, ironically, it's confusing to people sometimes, but when anybody, and when anyone asks you a question, sometimes they can be confusing in the questions they ask.
Don't be afraid to go back and ask clarifying questions and say, I didn't quite understand that. What do you mean by this? Or one way to not offend people is to say, I think I understand it this way. Am I correct? Or if you really don't understand it, just say, can you restate that in a different way? I'm not quite sure.
I'm not understanding this part. So during which months in the first third of the year? If a year is 12 months, what's the first third of the year? 12 divided by three is four. So which of those first four months, January, February, March, and April, out of which of those first four months were people creating their accounts? Were people creating them in January, February, March, and April? Maybe, maybe not.
Maybe people were only creating them in one of those months or two of those months. I don't know until I look, but I'm curious. So I want to see months.
I'm interested in the month part of our user accounts. And right now they created_at, it's the whole thing. So I could just look at the date_part, which is the month from created_at.
Now, these are all the months. This could be anywhere from one to 12 because they can create their account in any month. I'm only interested in the first four months.
How could I filter this down to just be the first four months? Where this is what? It could be between one and four. That's one way to do it. Is there any other way I could do this? Less than or equal to four, yes.
Could say less than or equal to four. That would also work. Not that I would do it this way, but you could also say that it's in this list of one, two, three, or four.
I would totally not do it that way, but you could, because if it's in that list, you'd find it. I think this is more clear or this is more clear, but that would work. Any of these would perform equally well.
So I'm going to do the between one and four. And so we look here, we got some twos, we got some threes, we got some fours, but we're seeing duplicates again. So that tells me I need to distinct.
And there you go. So in this case, they've created their accounts in all those months, but I didn't know until I looked, maybe one of those months was missing and I would know they didn't create their account in those months. Now, for anybody that likes names for things instead, remember there was the two car? And they flipped it.
I wish they wouldn't have flipped it because then you can't just switch from date_part to two car. If they would have kept the same order, it would have been easier to switch. That would have been nice, right? Wouldn't that have been nice? And actually, yeah, let me just run this first part here.
See, these are names now. Now if I was looking for one of these, and actually I would do a capital M for month. Oh, and then I need to do that there as well.
I can't be looking at it from one to four anymore, can I? Because I need to be saying it's found in a list of January, February, March, or April, because these are just words. I can't say that a month is between these numbers if these are not numbers, right? Or actually, let's think about this. Could I still filter by this, but show people something different? What you filter by is whatever you want, right? Like if I go back here, let me cut this.
Let me go back. So here, if you're filtering and saying only show me users who are in months one to four, right? I still filtered out the ones that I don't want, where filters out rows. Now, what I show people, that's up to me.
And I could say, well, let's show people the created_at as a month and get rid of the copies. So I still have January, February, March, and April. The problem here is organization.
Like, how do I order this? Because the only order by that you have, if you order by this first column, is you have ascending or descending alphabetical, right? So A-to-Z, there's no like month organization. Like they don't know that January is the first month. February is the second month.
A, B, C, D, E, F, right? And then we go through J and then M. This is alphabetical. If we reverse it, that's Z to A, there is no month sort. If you wanted something like that, what I would recommend doing, if you really want to have the names, is I would have two columns, one for the name and one for the number.
Because then you can sort. And personally, I would probably put the number first because it'll be on the left of your result set. So you can have it like this.
And if you're sorting by that column, you basically just have it to sort. And that way I can have the name next to the number so it's sorted properly. But that way you can see the names and not just see the numbers, if you want to see the names.
How many user accounts were created between two dates? If we have our users, right now it's all the users. Don't want to see all the users, just want to see some of them. When you want to filter out rows, you use WHERE.
I only want to see rows where they're created_at was between these two dates, somewhere in that range. And so I want to say WHERE created_at is between, and when we see 2020, so we start with 2020, then we do nine for September, and then 21, we go from big to small because that's the format that our dates use. It's between that and 2020.
Zero key is not always working there. December is 12, and then 20, so December 20th. Now, I know that a user account was created on December 20th, because I've seen this dataset before.
But yet when I run this, I don't get it. Anybody remember that? Remember why? Because of the time. I don't want it to factor in time, so I need to cast this as just a date, so it looks at it as only a date, and it forgets about the time.
Because otherwise it will start at the beginning of this date, but it will only go up to the beginning of this day, and it won't go through the end of that day. So just forget about time. Just say, deal with it as a date, and that's it.
That is the simplest solution to it. Now, I did say, how many accounts? If I then count, the answer is eight. We should have eight as the total for that.
That's how many user accounts were created. And the nice thing about something like this is that in the future, if anybody wants to come back, they could just put in whatever two dates they want, and they can see how many users were created between two date ranges, or how many orders there were, if you were in the orders table. So it's a very common thing where we want to find some things between two ranges of dates.