Customer Segmentation & RFM Analysis

A customer segmentation system that analyzes 2M+ customers across 140+ retail stores and improved retention by 35% through targeted marketing campaigns.

2M+ Customers Analyzed
140+ Retail Stores
35% Retention Improvement

Project Overview

Developed a comprehensive customer segmentation system using RFM (Recency, Frequency, Monetary) analysis to categorize customers across Western International Group's extensive retail network. The system processes millions of transactions to identify customer segments and enable personalized marketing strategies.

🎯 Key Objectives

  • Segment 2M+ customers based on purchase behavior and value
  • Improve customer retention through targeted campaigns
  • Optimize marketing spend by focusing on high-value segments
  • Enable data-driven customer relationship management

Technical Implementation

92% Model Accuracy

Technology Stack

Python, PySpark, RFM Analysis, Clustering Algorithms, Power BI, SQL, Pandas, Scikit-learn

Methodology & Process

1. Data Collection & Preprocessing

Aggregated transaction data from 140+ stores, cleaned and standardized customer records, and prepared RFM metrics for analysis.
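
As a rough illustration of how RFM metrics can be derived from raw transactions, the sketch below aggregates a handful of made-up records into recency (days since last purchase), frequency (number of purchases), and monetary (total spend) per customer. The record layout, the `compute_rfm` helper, and the snapshot date are all assumptions for this example. It uses plain Python to stay self-contained, whereas the project itself lists PySpark for this scale.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction records: (customer_id, purchase_date, amount).
transactions = [
    ("C1", date(2024, 1, 5), 120.0),
    ("C1", date(2024, 3, 20), 80.0),
    ("C2", date(2023, 11, 2), 40.0),
    ("C3", date(2024, 3, 28), 15.0),
    ("C3", date(2024, 3, 30), 25.0),
    ("C3", date(2024, 2, 14), 10.0),
]


def compute_rfm(rows, snapshot_date):
    """Return {customer_id: (recency_days, frequency, monetary)}."""
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer_id, purchase_date, amount in rows:
        # Track the most recent purchase date per customer.
        if customer_id not in last_purchase or purchase_date > last_purchase[customer_id]:
            last_purchase[customer_id] = purchase_date
        frequency[customer_id] += 1
        monetary[customer_id] += amount
    return {
        cid: ((snapshot_date - last_purchase[cid]).days, frequency[cid], monetary[cid])
        for cid in last_purchase
    }


# The snapshot date is the reference point for recency (assumed here).
rfm = compute_rfm(transactions, snapshot_date=date(2024, 4, 1))
for cid, metrics in sorted(rfm.items()):
    print(cid, metrics)
```

Fixing the snapshot date, rather than using the current date, keeps recency values reproducible between runs.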

2. RFM Scoring

Calculated Recency, Frequency, and Monetary scores for each customer, creating a comprehensive customer value matrix.
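
One common way to turn raw RFM metrics into scores is rank-based binning into 1 to 5, where 5 is best. The sketch below assumes that approach. The source does not state which scoring scheme the project used, and the `quantile_scores` helper and sample values are illustrative only.

```python
def quantile_scores(values, n_bins=5, higher_is_better=True):
    """Map each customer's value to a score in 1..n_bins based on rank.

    Ties are broken by input order, which is a simplification.
    """
    # Sort so that the least desirable value comes first and receives score 1.
    ordered = sorted(values, key=values.get, reverse=not higher_is_better)
    n = len(ordered)
    return {cid: (rank * n_bins) // n + 1 for rank, cid in enumerate(ordered)}


# Hypothetical metrics: customer_id -> (recency_days, frequency, monetary).
rfm = {
    "A": (3, 12, 900.0),
    "B": (40, 2, 60.0),
    "C": (200, 1, 500.0),
    "D": (15, 6, 300.0),
    "E": (90, 3, 45.0),
}

recency = {cid: metrics[0] for cid, metrics in rfm.items()}
frequency = {cid: metrics[1] for cid, metrics in rfm.items()}
monetary = {cid: metrics[2] for cid, metrics in rfm.items()}

# Lower recency (a more recent purchase) is better, so the direction is flipped.
r = quantile_scores(recency, higher_is_better=False)
f = quantile_scores(frequency)
m = quantile_scores(monetary)

scores = {cid: (r[cid], f[cid], m[cid]) for cid in rfm}
for cid, triple in sorted(scores.items()):
    print(cid, triple)
```

Because the scores depend on rank rather than magnitude, a single extreme spender cannot compress everyone else into the lowest bin.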

3. Segmentation Logic

Implemented custom logic-based segmentation instead of K-means, which is sensitive to outliers in the data. The rules define six segments: Champions, Irregular Champions, High Potential Hibernators, Probable Newbies, Loyal Low Spenders, and At Risk.
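
A logic-based segmenter of this kind can be sketched as an ordered list of rules over the R, F, and M scores. The thresholds below are hypothetical and chosen only to match the segment descriptions on this page. The project's actual rules are not given in the source, and the `Unassigned` fallback is a placeholder added for this example.

```python
def assign_segment(r, f, m):
    """Assign a segment from 1-5 RFM scores using ordered, hypothetical rules."""
    if r >= 4 and f >= 4 and m >= 4:
        return "Champions"  # recent, frequent, high value
    if m >= 4 and r >= 3:
        return "Irregular Champions"  # high value, but not consistently frequent
    if m >= 4:
        return "High Potential Hibernators"  # high value, currently inactive
    if r >= 4 and f <= 2:
        return "Probable Newbies"  # recent, with few purchases so far
    if f >= 4:
        return "Loyal Low Spenders"  # frequent, lower spend
    if r <= 2:
        return "At Risk"  # no recent purchases
    return "Unassigned"  # placeholder for score combinations not covered above


examples = [(5, 5, 5), (4, 2, 5), (1, 1, 4), (5, 1, 2), (4, 4, 3), (2, 3, 1)]
for triple in examples:
    print(triple, "->", assign_segment(*triple))
```

Rule order matters here: the first matching rule wins, so the most specific segments are checked first.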

4. Dashboard Integration

Integrated segmentation results into Power BI dashboards for business users, enabling real-time customer insights and campaign management.
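
One simple way to hand segmentation results to a BI tool is a per-segment summary table exported as CSV, a format Power BI can import. The sketch below assumes that hand-off. The source does not describe the actual integration path, and the column names and sample rows are illustrative.

```python
import csv
import io
from collections import defaultdict

# Hypothetical segmented customers: (customer_id, segment, total_spend).
customers = [
    ("C1", "Champions", 200.0),
    ("C2", "At Risk", 40.0),
    ("C3", "Champions", 350.0),
    ("C4", "Probable Newbies", 25.0),
]

FIELDS = ["segment", "customers", "total_spend", "avg_spend"]


def summarize_segments(rows):
    """Aggregate customer count and spend per segment, sorted by segment name."""
    totals = defaultdict(lambda: [0, 0.0])
    for _cid, segment, spend in rows:
        totals[segment][0] += 1
        totals[segment][1] += spend
    return [
        {"segment": seg, "customers": count, "total_spend": spend, "avg_spend": spend / count}
        for seg, (count, spend) in sorted(totals.items())
    ]


def to_csv(summary):
    """Serialize the summary rows to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(summary)
    return buffer.getvalue()


summary = summarize_segments(customers)
print(to_csv(summary))
```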

Customer Segments

🏆

Champions

High-value customers with recent purchases and high frequency

Irregular Champions

High-value customers with irregular purchase patterns

🚀

High Potential Hibernators

Inactive customers whose purchase history indicates high potential value

🆕

Probable Newbies

New customers with promising initial engagement

💎

Loyal Low Spenders

Consistent buyers with moderate spending patterns

⚠️

At Risk

Previously active customers showing declining engagement

Business Impact & Results

📈 Key Achievements

📊

35% Retention Improvement

Enhanced customer retention through targeted campaigns

🎯

Personalized Marketing

Enabled segment-specific marketing strategies

💰

Optimized Spend

Reduced marketing costs by focusing on high-value segments

📱

Real-time Insights

Power BI dashboards for immediate decision-making

Technical Challenges & Solutions

🔧 Challenges Overcome

  • Data Scale: Processed 2M+ customer records using PySpark for scalability
  • Outlier Handling: Switched from K-means to custom logic-based segmentation
  • Real-time Updates: Implemented automated data pipelines for fresh insights
  • Business Adoption: Created intuitive Power BI dashboards for non-technical users
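
To illustrate why outliers can make K-means a poor fit for spend data, the toy example below runs a minimal one-dimensional K-means (k=2) on made-up spend values, with and without a single extreme customer. It is a sketch of the general effect, not the project's data or analysis. The `kmeans_1d` helper and its min/max initialization are assumptions made to keep the run deterministic.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Minimal 1-D K-means (Lloyd's algorithm) with deterministic initialization."""
    # Spread the initial centroids evenly between the min and max (requires k >= 2).
    lo, hi = min(values), max(values)
    centroids = [lo + i * (hi - lo) / (k - 1) for i in range(k)]
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            # Assign each value to its nearest centroid.
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Recompute centroids; keep the old centroid if a cluster is empty.
        new_centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return clusters


spend = [20, 25, 30, 400, 420, 450]
spend_with_outlier = spend + [50000]

print(kmeans_1d(spend))               # low and high spenders separate cleanly
print(kmeans_1d(spend_with_outlier))  # the outlier takes a cluster for itself
```

With the outlier present, one cluster contains only the extreme value and the low and high spenders are merged, which hides the distinction a marketing team cares about. Explicit rules over rank-based scores avoid this failure mode.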