In this article, we'll explore the differences between SQL and Excel in performing a text-based grouping exercise. Our hands-on exercise will take you inside the mind of a digital marketer and show how to work through a problem using both Excel and SQL.
Excel vs. SQL: CASE WHEN
If you’ve ever tried to group values based on text criteria in Excel, you know how difficult that can be. In this article, we’ll discuss how to do this in Excel, as well as in SQL using the CASE WHEN function. For this article, we’ll use the example of a credit card company analyzing search data.
Put yourself in the shoes of this credit card digital marketer and imagine the variety of keywords that one could search about credit cards. Some people will be looking for the best credit cards of 2020, the best credit cards for cash back, best travel rewards, no fees, or many other perks and features of credit cards.
As the digital marketer here, we’re going to go ahead and download our search query report from Google Ads that will give us a list of all the keywords where our ads showed up with the number of impressions, clicks, conversions, and cost. We’ll probably look at this list for a little bit and not be able to make much sense of the hundreds of rows and multiple columns of data. Then we’ll get this brilliant idea that we can group our keywords into various categories to better understand our performance. We’ll want a group for “best cards,” “no fees,” “travel rewards,” “cash back,” and maybe a couple more. Here’s where we run into our Excel vs. SQL question.
In Excel, we’d have to take a couple of steps to get our data in that digestible form. First, we’d need to tag each keyword into one of our groupings. We can make a new column called “keyword grouping” and then we’ll use the IF, ISNUMBER, and SEARCH functions to do the grouping. The SEARCH function will look for text within another set of text and return the position of the text or string of text you are looking for. The ISNUMBER function serves as a logical statement (T/F) for whether the value is a number or not, and the IF function serves as the final logic statement that tells us what to do if the value is TRUE or FALSE.
We’ll start our formula with “=IF(ISNUMBER(SEARCH(...” and then input each of the different strings we are trying to match and then assign the group in the IF statement if it’s true, and then if it’s false, we’ll want to write another IF statement with the conditions for the next group.
This can be a bit of a tedious process and it can be difficult within the small cells of Excel. Once we have finished the lengthy logic statement that gives us all the grouping of keywords, we can then create a pivot table with the keyword grouping as the rows and the SUM of values for impressions, clicks, cost, and conversions. Now we have a smaller table grouped by categories to see what the sizing of each group is and how we are performing in each of those groups.
Now, let’s go through the same exercise in SQL. This assumes that we have our search report in a database, and if not, we can upload the Excel file to a database. SQL is a programming or querying language so it is much better suited to handle these types of tasks. The IF(ISNUMBER(SEARCH part above will be replaced by SQL’s CASE WHEN and the Pivot Tabling will be replaced by the SUM and GROUP BY functions. We’ll start with the SELECT and then go right into our CASE WHEN. It will look something like this:
SELECT CASE WHEN query LIKE “%best%” THEN ‘Best” …
The ‘%’ before and after the phrase indicates that we don’t care what comes before or after the word “best.” Use the “%” and other REGEX symbols based on your desired outcome. We’ll create a separate line for each grouping we want to make and can use the AND or OR operators to have multiple conditions for a grouping. Once we finish our list of CASE WHEN’s we can add our SUM’s, FROM, and GROUP BY to get our final table in a clean format.
Advantages of SQL
Both processes require a bit of manual work but SQL has some advantage over the Excel route for this type of problem. Using a programming language like SQL allows for more flexibility and replicability in your work. Once you create that CASE WHEN statement for how you want your keywords grouped, you can very easily share and edit that text based on your needs and run the query again. In Excel, to make changes you would have to go into the cell and try to find the correct nested statement you are looking to change and then update your Pivot Table by re-creating or refreshing it to get your final answer. You can very easily share your grouping logic in a text file or in Github to centralize your process.Both Excel and SQL are viable options for this type of work, but SQL can provide some additional scale and speed.