Step by Step Process Preparing Data and Developing RFM

  • Writer: OracioRosas
  • Mar 21, 2023
  • 2 min read

Updated: Mar 21, 2023

When implementing RFM (recency, frequency, monetary) analysis in Python, much of the effort goes into data preparation and cleaning. Ideally, the dataset is already well structured, with the necessary columns and correct data types in place. The chart below shows an ideal dataset for RFM analysis.

[Image: example of an ideal dataset for RFM analysis]

However, in reality, the dataset is likely to be fragmented, with orders broken down at the line item level, unnecessary columns, and incorrect data formats, especially for dates. Therefore, before delving into the analysis itself, it is crucial to devote time and resources to cleaning and restructuring the dataset to ensure accurate and meaningful results.
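
As a minimal sketch of fixing a date column, assuming pandas is used and the dates arrive as text: the sample rows and the column names (`InvoiceNo`, `InvoiceDate`, `Quantity`, `UnitPrice`) are illustrative assumptions, not taken from the original dataset.

```python
import pandas as pd

# Hypothetical line-item data; column names and values are assumptions.
raw = pd.DataFrame({
    "InvoiceNo": ["536365", "536365", "536366"],
    "InvoiceDate": ["2010-12-01 08:26", "2010-12-01 08:26", "2010-12-02 09:15"],
    "Quantity": [6, 2, 3],
    "UnitPrice": [2.55, 3.39, 1.85],
})

# Dates often arrive as plain strings; convert them to real datetimes
# so that date arithmetic (such as days since purchase) works later on.
raw["InvoiceDate"] = pd.to_datetime(raw["InvoiceDate"], format="%Y-%m-%d %H:%M")

# Inspect the resulting column types.
print(raw.dtypes)
```

If the real date strings use a different layout, the `format` argument would need to match it.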


Once the data has been thoroughly reviewed and cleaned, it is essential to handle any missing values that may still exist in the dataset. In this step, we typically remove any observations that have incomplete information, as these missing values can distort the analysis and lead to inaccurate results. Therefore, dropping missing values is a crucial step in ensuring the accuracy and reliability of the RFM analysis.
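
One way to drop incomplete observations with pandas is sketched below. The sample values and column names are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical orders with a few missing values (assumed column names).
orders = pd.DataFrame({
    "InvoiceNo": ["536365", "536366", "536367", "536368"],
    "CustomerID": [17850.0, np.nan, 13047.0, 13047.0],
    "UnitPrice": [2.55, 1.85, np.nan, 4.25],
})

# Count the missing values per column before removing anything.
print(orders.isna().sum())

# Drop every row that has at least one missing value.
cleaned = orders.dropna().reset_index(drop=True)
print(cleaned)
```

Here `dropna()` removes a row if any column is missing. Passing `subset=[...]` would restrict the check to the columns that matter for RFM.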


We then sum the line-item prices into a single total price per order, rather than keeping one row per line item.
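
A possible way to build the order total is to compute a line total and then group by order. The column names (`InvoiceNo`, `CustomerID`, `Quantity`, `UnitPrice`, `TotalPrice`) and the sample rows are assumptions for this sketch.

```python
import pandas as pd

# Hypothetical line items: order 536365 is split across two rows.
lines = pd.DataFrame({
    "InvoiceNo": ["536365", "536365", "536366"],
    "CustomerID": [17850, 17850, 13047],
    "Quantity": [6, 2, 3],
    "UnitPrice": [2.50, 3.00, 1.50],
})

# Price of each line item.
lines["TotalPrice"] = lines["Quantity"] * lines["UnitPrice"]

# One row per order, with the line totals summed.
order_totals = (
    lines.groupby(["InvoiceNo", "CustomerID"], as_index=False)["TotalPrice"].sum()
)
print(order_totals)
```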


We plot the data to see what it looks like. The United Kingdom has the largest number of customers, so we will focus on this segment.
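
The counts behind such a plot could be computed roughly as follows. The data and the column names `CustomerID` and `Country` are assumed for illustration.

```python
import pandas as pd

# Hypothetical orders; customer 1 appears twice but should be counted once.
orders = pd.DataFrame({
    "CustomerID": [1, 1, 2, 3, 4, 5],
    "Country": [
        "United Kingdom", "United Kingdom", "United Kingdom",
        "France", "United Kingdom", "Germany",
    ],
})

# Number of distinct customers per country, largest first.
customers_per_country = (
    orders.groupby("Country")["CustomerID"].nunique().sort_values(ascending=False)
)
top_country = customers_per_country.index[0]
print(customers_per_country)
```

With matplotlib installed, `customers_per_country.plot(kind="bar")` would draw the same counts as a bar chart.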

Create a subset containing only the UK orders.
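
A sketch of the subset step using a boolean filter, assuming a `Country` column that holds the value "United Kingdom":

```python
import pandas as pd

# Hypothetical orders from several countries (assumed column names).
orders = pd.DataFrame({
    "InvoiceNo": ["536365", "536370", "536371"],
    "Country": ["United Kingdom", "France", "United Kingdom"],
    "TotalPrice": [21.0, 40.0, 12.5],
})

# Keep only UK rows; .copy() avoids modifying a view of the original frame.
uk_orders = orders[orders["Country"] == "United Kingdom"].copy()
print(uk_orders)
```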

Then drop the Country column, as it is not needed for RFM.

Drop the Stock Code, Description, Quantity, and Unit Price columns.
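
The column drops described above might look like this in pandas. The exact column spellings (`StockCode`, `UnitPrice`, and so on) are assumptions, since the real names depend on the source file.

```python
import pandas as pd

# Hypothetical UK subset that still carries the unneeded columns.
uk_orders = pd.DataFrame({
    "InvoiceNo": ["536365", "536371"],
    "StockCode": ["85123A", "22086"],
    "Description": ["HANGING HEART HOLDER", "PAPER CHAIN KIT"],
    "Quantity": [6, 5],
    "InvoiceDate": pd.to_datetime(["2010-12-01", "2010-12-02"]),
    "UnitPrice": [2.50, 2.50],
    "CustomerID": [17850, 13748],
    "Country": ["United Kingdom", "United Kingdom"],
    "TotalPrice": [15.0, 12.5],
})

# Columns that RFM does not use once the order total exists.
unused = ["Country", "StockCode", "Description", "Quantity", "UnitPrice"]
trimmed = uk_orders.drop(columns=unused)
print(trimmed.columns.tolist())
```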

Our dataframe is starting to look better. Each row now holds a single order number and the revenue total for that order. A Customer ID is assigned to each order number, and the number of days since the purchase is in the last column.
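
The days column could be derived roughly as below. The reference date is an assumption: this sketch measures from one day after the latest invoice in the data, and the column names are illustrative.

```python
import pandas as pd

# Hypothetical order-level data (assumed column names).
orders = pd.DataFrame({
    "InvoiceNo": ["A1", "A2", "A3"],
    "CustomerID": [1, 1, 2],
    "InvoiceDate": pd.to_datetime(["2011-11-01", "2011-12-01", "2011-12-09"]),
    "TotalPrice": [20.0, 30.0, 15.0],
})

# Assumed reference point: the day after the most recent order in the data.
snapshot_date = orders["InvoiceDate"].max() + pd.Timedelta(days=1)

# Whole days between each order and the reference date.
orders["Days"] = (snapshot_date - orders["InvoiceDate"]).dt.days
print(orders)
```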


We create another dataframe that aggregates the orders per customer, taking the minimum of Days and the sum of Total Price.
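
A sketch of this aggregation with pandas named aggregation follows. The minimum of `Days` and the sum of `TotalPrice` come from the step above. Counting distinct orders for frequency is an assumption added here, as are the output column names.

```python
import pandas as pd

# Hypothetical order-level data (assumed column names).
orders = pd.DataFrame({
    "InvoiceNo": ["A1", "A2", "A3", "A4"],
    "CustomerID": [1, 1, 2, 3],
    "TotalPrice": [20.0, 30.0, 15.0, 100.0],
    "Days": [40, 5, 90, 10],
})

# One row per customer.
rfm = (
    orders.groupby("CustomerID")
    .agg(
        Recency=("Days", "min"),           # days since the most recent order
        Frequency=("InvoiceNo", "nunique"),  # assumed: number of distinct orders
        Monetary=("TotalPrice", "sum"),    # total revenue from the customer
    )
    .reset_index()
)
print(rfm)
```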

We assign RFM scores.
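
One common way to assign scores is to split each metric into quartiles and score them from 1 to 4. Whether the original analysis used quartiles is not stated, so the scoring scheme, the helper `quartile_score`, and the sample data below are all assumptions.

```python
import pandas as pd

# Hypothetical per-customer RFM table (assumed column names).
rfm = pd.DataFrame({
    "CustomerID": [1, 2, 3, 4, 5, 6, 7, 8],
    "Recency": [5, 90, 10, 200, 30, 60, 2, 365],
    "Frequency": [12, 1, 8, 1, 4, 2, 20, 1],
    "Monetary": [900.0, 40.0, 600.0, 25.0, 300.0, 120.0, 1500.0, 10.0],
})

def quartile_score(values, higher_is_better=True):
    """Score a metric from 1 (worst) to 4 (best) by quartile."""
    labels = [1, 2, 3, 4] if higher_is_better else [4, 3, 2, 1]
    # Ranking first breaks ties so qcut always gets unique bin edges.
    ranked = values.rank(method="first")
    return pd.qcut(ranked, q=4, labels=labels).astype(int)

# Fewer days since the last order is better, so recency is reversed.
rfm["R"] = quartile_score(rfm["Recency"], higher_is_better=False)
rfm["F"] = quartile_score(rfm["Frequency"])
rfm["M"] = quartile_score(rfm["Monetary"])

# Combined score such as "444" for the best customers.
rfm["RFM_Score"] = rfm["R"].astype(str) + rfm["F"].astype(str) + rfm["M"].astype(str)
print(rfm)
```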

Create labels for customers based on their RFM scores.
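
Labels could be assigned with a small rule function like the one below. Only the 'Almost Lost Customers' label comes from this analysis. The other label names, the thresholds, and the helper `label_customer` are hypothetical and would need tuning to the real score distribution.

```python
import pandas as pd

# Hypothetical scored customers, each score on a 1-4 scale (assumed).
scored = pd.DataFrame({
    "CustomerID": [1, 2, 3, 4],
    "R": [4, 1, 2, 3],
    "F": [4, 1, 1, 2],
    "M": [4, 1, 2, 3],
})

def label_customer(row):
    """Map the summed R, F, and M scores to a segment (assumed thresholds)."""
    total = row["R"] + row["F"] + row["M"]
    if total >= 10:
        return "Best Customers"
    if total >= 7:
        return "Regular Customers"
    if total >= 5:
        return "Almost Lost Customers"
    return "Lost Customers"

scored["Segment"] = scored.apply(label_customer, axis=1)

# Size of each segment.
print(scored["Segment"].value_counts())
```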

Based on our customer segmentation analysis, we have identified a significant number of customers in the 'Almost Lost Customers' segment. The RFM scores of this segment indicate customers who have not purchased recently, spend less than other segments, and buy less often. By targeting this segment with personalized PPC and email campaigns, we can re-engage these customers and potentially convert them into higher-value customers. Strategies such as personalized discounts, product recommendations based on past purchases, and exclusive promotions can help entice these customers to make another purchase. Focusing on this segment can not only increase sales but also improve customer retention and loyalty.

© 2023 by Oracio Rosas
