Since we discussed the mean and median, the last most common central tendency statistic is the mode. The mode is the value that occurs the most frequently in the data set. A good trick to remember the definition of mode is it sounds very similar to “most”. This is probably the least useful out of three statistics but still has many real-world applications.
Let’s pretend we are consultants for Chipotle, and we are supposed to give the company some insight into their customer’s order preference and order size. So now let’s think about which statistical measurement we will use in this scenario. For order size, the mean and median are both contenders, but I would choose the median since there might be some outliers such as expensive corporate catering orders that probably comprise a small percentage of their in-store orders.
Using the median, we would be able to tell Chipotle the “average” order price. However, the mean and median would not be a good statistic to show the most popular item on the menu. This is where we would use the mode to find the most frequently ordered item on the menu. This is very insightful information for Chipotle as they can use what they learn from this data to improve their menu or offer new items that are similar to the most popular item.
Now that we understand that mode has significant application, let’s learn how to calculate this statistic. Well, first off, I apologize in advance and wish I had a better way to do it by hand. However, the only way to find the mode is to line up the data (I recommended from least to greatest), and count each point and see which data point is the most common value.
Luckily, this article’s goal is not to teach students how to do math by hand but how to do it in Python. I only introduce the motivations and calculations because it is important for a programmer to understand what their code is doing.
For anyone who has limited to no experience programming in Python, please stop reading here and enroll in a Python Course, this one is given live online or in-person in NYC. Finding the mode without a library is painful but is very useful to learn. In a couple of blog posts, we will be showing how Python’s libraries truly make all of these calculations much easier.
*I want to repeat that this code is very complex for such a simple problem and I am only showing it to all of you as a teaching moment. Since this code is complex, programmers will always favor open-source libraries but if you do not know how to program mode without using a library, you can run into many issues when using a library.
Step 1: Create a function called mode that takes in one argument
Step 2: Create an empty dictionary variable
Step 3: Create a for-loop that iterates between the argument variable
Step 4: Use an if-not loop and else combo as a counter
Step 5: Return a list comprehension that loops through the dictionary and returns the value that appears the most.
Step 6: Call the function on a list of numbers and it will print the mode of that set of numbers.