The Hidden Hero of Data Analysis: The Mode (Part 3) Mode-Based Customer Behavior Analysis (Ice Cream Sales Dataset Practical Analysis)

Ice Cream Preference : Dominance of Strawberry

(If you missed it, please check out below first)




 

1. Data Introduction and Mode Analysis Plan

This article utilizes a hypothetical ice cream sales transaction dataset (Total N=20) to identify the structural truth of customer behavior—the most common choice (the Mode)—which can be easily overlooked by focusing only on the mean.

1.1. Dataset Structure and Description

This data consists of 20 hypothetical customer transaction records and includes five key fields.

This data consists of 20 hypothetical customer transaction records and includes five key fields.

1.2 Mode Analysis Plan Using Example Data


To grasp the 'reality of the majority of customers' hidden behind the mean, we plan to focus the Mode analysis on three key fields of this sample dataset:

A. Categorical Data Analysis: Preferred Flavor
  • Goal: Identify the most overwhelmingly preferred flavor (the Mode) among the 20 customers. We will measure its Dominance Ratio to determine the priority for inventory and marketing resource allocation.
  • Visualization: Use a Bar Chart to intuitively compare the frequency of each flavor.

B. Discrete Data Analysis: Item Count
  • Goal: Determine the most frequent number of items customers purchase in a single transaction (the Mode). This result provides the basis for deciding whether single-item purchase patterns are dominant or if bundled products would be effective.
  • Visualization: Use a Bar Chart to check the rapid decline trend in frequency by purchase count.

C. Continuous Data Analysis: Purchase Amount (USD)
  • Goal: Instead of relying on the Mean, which can be distorted by a few high-value purchasers, we will identify the price range most customers spent (the Mode Interval) to establish realistic pricing and promotion strategies.
  • Visualization: Use a Histogram to visually confirm the shape of the data distribution (right-skewed) and the Mode Interval.


 

2. Power BI DAX Measure Implementation


We will analyze the sample data (Preferred Flavor, Item Count, Purchase Amount) and implement measures, particularly using RANKX and CONCATENATEX, to handle potential tie situations.

2.1 Categorical Data Mode – Preferred Flavor (Most Preferred Flavor)

This measure returns the name of the most frequently selected ice cream flavor (Mode) as a text string within the current filter context. This code is specifically designed to capture and display all top-ranked items, even if the Mode is a tie.

The measure is broadly composed of three steps: 
1) Creating the FlavorRank temporary table, 2) Filtering for TopFlavors, and 3) Returning the final text output.

1) Creating the FlavorRank temporary table, 2) Filtering for TopFlavors, and 3) Returning the final text output

A. Step 1: FlavorRank Variable (Creating the Rank Table)

The measure first defines a variable called FlavorRank to create a temporary rank table. This process calculates the transaction frequency for all flavors and determines who is ranked 1st.

FlavorRank to create a temporary rank table

1) DAX Element Function Analysis
  • Setting the Competition Scope: ALLSELECTED ensures that the ranking competition occurs only within the currently selected filters in the report (e.g., specific region, specific year).
  • Ranking Calculation: The rank is assigned in descending order (DESC) based on the transaction count for each flavor.
  • Tie Handling: If transaction counts are the same (e.g., Chocolate (150 times) and Vanilla (150 times)), the DENSE option assigns them all a joint 1st place.
  • No Rank Skipping: DENSE ensures that if two items are tied for 3rd place, the next rank is 4, not 5.
DAX Element Function Analysis

2) DAX Code Execution Result

The FlavorRank variable generates a table containing the list of all unique flavors within the current filter context and their rank based on transaction count.

DAX Code Execution Result

B. Step 2: TopFlavors Variable (Selecting the Winner(s))

The TopFlavors variable filters the previously generated rank table to select only the winner(s) who achieved 1st place.

TopFlavors Variable (Selecting the Winner

1) DAX Element Function Analysis
  • The FILTER function is used to keep only the rows with a rank of 1 in the rank table.
  • If there are tied winners, all of them remain in this table, preparing for the final output to include all results when the Mode is not singular.

2) DAX Code Execution Result

The result of the table stored in the TopFlavors variable shows only the rows where the rank equals 1.

DAX Code Execution Result

C. Step 3: RETURN - Final Result Output Stage

The DAX code uses the CONCATENATEX function to compress the list of top-ranked flavors into a single string for final output.

RETURN - Final Result Output Stage

1) DAX Element Function Analysis

DAX Element Function Analysis2

2) Code Execution Result (RETURN Value)

The final result is a text string where the names of the best-selling flavors are separated by ", " (comma and space).

Code Execution Result (RETURN Value)

2.2 Discrete Data Mode – Item Count (Purchase Quantity)


The Item Count measure finds and returns the most frequently purchased number of items (Mode) in a single transaction. The DAX logic structure is identical to the Preferred Flavor measure; only the target field changes from 'Ice Cream'[Preferred Flavor] to 'Ice Cream'[Item Count].
The core logic (1) Ranking, 2) Filtering 1st place, 3) Text combination) is exactly the same, with only the target field being altered.

A. Step 1: ItemRank Variable - Creating the Rank Table (Key Change)

  • Goal: List all unique purchase counts (Item Count) within the current filter context and create the ItemRank table by assigning a rank based on the transaction frequency for each count.
ItemRank Variable - Creating the Rank Table (Key Change)

1) DAX Element Function Analysis

DAX Element Function Analysis3

2) DAX Code Execution Result

DAX Code Execution Result
B. Steps 2 & 3 (Filtering and Output)

Steps 2 (filtering 1st place items via the TopItems variable) and 3 (combining the result into a string via CONCATENATEX) operate exactly the same as the Preferred Flavor measure.

Steps 2 & 3 (Filtering and Output)
RETURN: Uses CONCATENATEX to output the selected 1st place purchase count, such as "1" or "1, 2" if tied.

RETURN: Uses CONCATENATEX to output the selected 1st place purchase count, such as "1" or "1, 2" if tied.

This Item Count analysis is used to determine whether a single-item purchase pattern (1 item) is dominant or if a bundled purchase pattern (2 or more items) is the mainstream. This result forms the core basis for planning bundled products or discount policies.

2.3 Continuous Data Mode – Purchase Amount (Purchase Amount)

For continuous data like Purchase Amount, finding the Mode Interval is analytically more realistic than finding a single Mode. This requires combining a calculated column and a measure. To visualize with a histogram or bar chart, a Bin (Column) is needed for each transaction row, and the Mode Interval is calculated based on this.

1) Creating a Bin (Column) per Transaction – Purchase Bin

As a foundational step for histogram visualization and Mode Interval calculation, a calculated column is created in the original data table ('Ice Cream') to assign a $5 interval label to the amount of each transaction.

Select Ice cream table from Data view tab in Power BI Desktop and input the code after click the New Column button.

Creating a Bin (Column) per Transaction – Purchase Bin

2) Purchase Bin DAX Code and Step-by-Step Logic Explanation


Purchase Bin DAX Code and Step-by-Step Logic Explanation

Code Execution Result: The continuous Purchase Amount data is categorized into standardized $5 interval bins, returned as a string (Text) interval label.

3) Mode Interval Measure – Mode Purchase Amount

This measure aggregates the transaction count per interval (Bin) based on the transaction-level Bin information and returns the interval with the highest frequency (the Mode).

Mode Interval Measure – Mode Purchase Amount

A. Step 1: BinRank Variable (Creating the Rank Table)

This step creates a temporary rank table by aggregating the transaction frequency per Purchase Amount interval and assigning a rank based on that frequency.

BinRank Variable (Creating the Rank Table)

Result from DAX code

Result from DAX code

B. Step 2: TopBins Variable (Filtering the Mode Interval)

This step selects only the highest-frequency interval(s) (the Mode Interval) from the rank table created in Step 1 and passes them to the next stage. If a tie occurs, multiple intervals remain, allowing CONCATENATEX to output all of them.

TopBins Variable (Filtering the Mode Interval)

Result from DAX code

Result from DAX code

C. Step 3: RETURN (Final Output)

This step uses CONCATENATEX to combine the list of 1st place intervals into a single, readable text string as the measure's result. This ensures that if multiple mode intervals exist (a tie), they are presented clearly (e.g., "6~10, 11~15").

RETURN (Final Output)

Result from DAX code

Result from DAX code

Analysis Interpretation and Business Implication (Reinforced):
  • Mode Interval: Clearly shows the most frequently spent amount, representing the customer's typical purchasing habit. While the Mean can be distorted by a few high-value transactions (outliers), the Mode is less affected and better represents actual purchasing behavior.
  • Business Implication: This analysis result supports core business decisions:
    • Pricing Policy: Sales can be maximized by discounting or focusing promotions on products/services within the Mode Interval (e.g., $6 ~ $10).
    • Product Composition: It provides a strategic basis for planning bundled products or strengthening the product lineup within the price range most customers purchase.


 

3. Statistical Interpretation and Ratio Analysis of the Mode


We apply the DAX measures to the sample data (N=20) and interpret the structure of the data distribution by comparing the Mode with other measures of central tendency (Mean, Median).

3.1. Preferred Flavor Analysis: Mode of Categorical Data


This analysis answers the question: "What is the flavor customers like most?"

Preferred Flavor Analysis: Mode of Categorical Data

Since Preferred Flavor is nominal data, the Mode is the highest priority indicator for marketing and inventory strategy. Strawberry is not just the 1st place, but the Core Pillar sustaining the business.

3.2. Item Count Analysis ("How Many Do They Buy at Once?"): Mode of Discrete Data

This analysis answers the question: "What is the most frequent number of items customers purchase in a single transaction?"
How Many Do They Buy at Once?

Mode & Mean & Distribution Shape

The average purchase quantity is about 1.65 units, but the actual customer standard is 1 unit. Creating a "1.6-item set" based on the mean will likely fail. The core strategy is to successfully encourage 1-item purchasers (65%) to buy 2 items.

3.3. Purchase Amount Analysis ("How Much Do They Most Commonly Spend?"): Interpretation of Skewness in Continuous Data

This analysis answers the question: "What is the most frequently spent price range?" For continuous data, it is crucial to compare the three measures of central tendency (Mean, Median, Mode) to grasp the degree of Skewness.
How Much Do They Most Commonly Spend?

Mode ($7.94) < Median ($7.90) < Mean ($10.24)

Statistical Significance: The relationship Mode < Mean clearly shows that the distribution is Right-Skewed. This means a few high-value purchasers have pulled the average upward, overstating the typical spending. The Mean ($10.07) is about 27% higher than the typical customer spending (around $7.9). If the pricing policy is set based on the Mean ($10), it could lead to the perception that the product is "expensive" for the actual mainstream customers (50%).



 

4. Intuitive Understanding of the Mode through Data Visualization


Visualization is the most powerful tool for intuitively checking the Mode's position in the data distribution.

4.1. Categorical Data: Ice Cream Preference Distribution

The graph below shows the frequency of ice cream flavors selected by 20 customers.

Ice Cream Preference : Dominance of Strawberry

Frequency Distribution by Preferred Flavor

Observation: As seen in the graph, Strawberry is not just 1st place; it is the dominant 'Mainstream' accounting for 65% of the total. While the Mean or Median might point to a non-existent flavor "somewhere between Strawberry and Chocolate," the Mode clearly instructs: "Focus on Strawberry."

Target field & Characteristic & Analysis Purpose

4.2. Continuous Data: Sales Amount Distribution (Histogram)

This analysis answers: "What is the most frequently spent amount?" We look at the Mode's position in data that is skewed, like income or purchase amounts.

Continuous Data: Sales Amount Distribution (Histogram)

Observation: The peak (the longest bar) of the Histogram is located in the $6 ~ $10 interval, and the frequency rapidly decreases for high-value purchases ($16 or more).

Insight: The Histogram visually confirms that the distribution is Right-Skewed. The Mean ($10.24) is located to the right of this peak, meaning the average is overstating the typical purchase amount.

4.3. Discrete Data: Purchase Count Distribution

This bar chart shows the distribution of Item Count, where the highest bar is the Mode, representing the most common purchase quantity.

Discrete Data: Purchase Count Distribution

Observation: The bar for purchasing 1 item is the highest at 13 transactions, and the frequency drops sharply for 2 items, 3 items, and so on.

Insight: This shows that customer purchasing habits are highly concentrated on single-item purchases. The average of 1.65 items has a discrepancy with the structural truth of customer behavior, which is 1 item.
Target field & Characteristic & Analysis Purpose



  

5. Mode-Based Business Strategy Formulation


Specific business strategies that can be executed based on the Mode analysis results. The intensity of the strategy varies, especially according to the Mode's Ratio.

Based on the analysis result [Mode: Strawberry, Ratio: 65%], strategies are formulated to optimize resource input.

5.1. Inventory and Production Strategy: "Zero Out-of-Stock (OOS)"

  • Strategy Goal: Minimize the risk of Out-of-Stock for the Mode flavor, Strawberry.
  • Execution Plan: Set the safety stock coefficient for Strawberry 1.5 to 2 times higher than the general coefficient to ensure absolute inventory stability. OOS for Strawberry means 65% of sales vanish, making it the most critical risk.
  • Display Management: Allocate more than 60% of the shelf space (Share of Shelf) to Strawberry to maximize visibility.

5.2. Marketing Strategy: "Leveraging the Bandwagon Effect"

  • Strategy Goal: Use the overwhelming figure of 65% as Social Proof to induce purchasing by potential customers.
  • Execution Plan:
    • Copywriting: "The flavor chosen by 6 out of 10 people—the never-fail choice, Strawberry!"
    • Budget Allocation: Concentrate the majority of the marketing budget (at least 60%) on Strawberry-related campaigns to increase the overall Conversion Rate.
    • Promotion: Focus on increasing the satisfaction of the mainstream customer base by offering benefits upon purchasing Strawberry, rather than indiscriminate discounts on secondary flavors (Chocolate, Vanilla).

5.3. CRM and Product Planning: "Inducing Associated Purchases"

  • Strategy Goal: Configure a product mix that encourages additional purchases based on Strawberry buyers (65%).
  • Execution Plan:
    • Bundling: Plan a "Best Combo Pack" that bundles the Mode (Strawberry) and the runner-up (Chocolate). This is highly likely to draw the highest demand customers (65%) into purchasing Chocolate.
    • New Product Development: Attempt vertical expansion within the Mode category with items like "Premium Strawberry" or "Strawberry Cheesecake." Avoid expanding low-demand lines like Mint Chocolate (5%).


 

6. Warpping up


We have analyzed all cases using DAX code, from the Mode of discrete data like Preferred Flavor and Item Count to the Mode Interval of continuous data like Purchase Amount. Through this analysis, we confirmed that the Mode is not just a simple statistical figure but a core indicator representing the 'most powerful customer preference.'

In the next part, we will delve deeply into how to use the DAX measure system built upon this Mode to monitor core preferences and their market dominance in real-time, enabling immediate, data-driven decision-making.


<Other posts on the blog>


Comments

Popular posts from this blog

DAX CALENDAR Function Deep Dive and Practical Usage Guide

Standard Deviation The Complete Guide to the Core of Business Data Analysis

How to load Text or CSV files into Power BI