# RFM Analysis For Successful Customer Segmentation using Python


RFM (Recency, Frequency, Monetary value) is a method used for analyzing customer value.

It groups customers based on their transaction history:

• Recency — How recently did the customer purchase?
• Frequency — How often do they purchase?
• Monetary Value — How much do they spend?

These three values are combined to group customers into segments for easy recall and campaign targeting. This is useful for understanding how responsive your customers are and for segmentation-driven database marketing.

The resulting segments can be ordered from most valuable (most recent purchases, highest frequency and value) to least valuable (least recent purchases, lowest frequency and value). Note that identifying the most valuable RFM segments this way can capitalize on chance relationships in the data used for the analysis, so it is worth validating the segments on another set of data.

Let's look at a practical implementation of RFM segmentation.

# import libraries
import numpy as np  # linear algebra
import pandas as pd  # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt
import pandas_profiling as pp
import seaborn as sns
import datetime as dt

# Glimpse of the data (`data` is the sales DataFrame, loaded beforehand, e.g. with pd.read_csv)
data.info()  # info() prints its summary itself and returns None
print(data.describe())  # validate the min and max values of each column
Before moving on to the RFM score calculations, we need some basic preprocessing steps:

• Clean the data, e.g. delete all rows with a negative Quantity or Price
• Delete rows with a missing (NA) customer ID
• Handle duplicates and null values
• Remove unnecessary columns

After preprocessing, we can proceed to the RFM score calculations.
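
The preprocessing steps listed above could be sketched roughly as follows. This is only an illustration: the helper `clean_transactions`, the column names `CUSTOMERID`, `QUANTITY`, `PRICE` and `NOTES`, and the toy data are all assumptions, so substitute the columns of your own dataset.

```python
import pandas as pd

def clean_transactions(df, quantity_col, price_col, customer_col, drop_cols=()):
    """Hypothetical helper: apply the basic cleaning steps listed above."""
    out = df.copy()
    # Keep only rows whose quantity and price are not negative.
    out = out[(out[quantity_col] >= 0) & (out[price_col] >= 0)]
    # Drop rows without a customer identifier.
    out = out.dropna(subset=[customer_col])
    # Drop exact duplicate rows.
    out = out.drop_duplicates()
    # Remove columns that are not needed for the analysis.
    out = out.drop(columns=list(drop_cols))
    return out.reset_index(drop=True)

# Toy data (made up): one duplicate row, one negative quantity,
# one missing customer and one negative price.
raw = pd.DataFrame({
    "CUSTOMERID": ["A", "A", "B", None, "C", "C"],
    "QUANTITY": [2, 2, -1, 5, 3, 3],
    "PRICE": [10.0, 10.0, 20.0, 5.0, -4.0, 7.5],
    "NOTES": ["", "", "", "", "", ""],
})

clean = clean_transactions(raw, "QUANTITY", "PRICE", "CUSTOMERID", drop_cols=["NOTES"])
print(clean)
```

Only one row for customer A and one for customer C should survive the cleaning in this toy example.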

For RFM analysis, we need a few details about each customer:

• Customer ID / Name / Company etc — to identify them
• Recency (R) as days since last purchase: How many days ago was their last purchase? Subtract the most recent purchase date from today's date to get the recency value. 1 day ago? 14 days ago? 500 days ago?
• Frequency (F) as the total number of transactions: How many times has the customer purchased from our store? For example, if someone placed 10 orders over a period of time, their frequency is 10.
• Monetary (M) as total money spent: How much money (in whatever currency you use) has this customer spent? Simply total up the money from all transactions to get the M value.
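
To make the three definitions concrete, here is a minimal sketch for a single customer using only the standard library. The order history and the reference date `today` are made-up values for illustration.

```python
import datetime as dt

# Hypothetical purchase history for one customer:
# (order number, order date, amount spent).
orders = [
    (101, dt.date(2005, 1, 10), 250.0),
    (102, dt.date(2005, 3, 5), 120.5),
    (103, dt.date(2005, 5, 17), 80.0),
]
today = dt.date(2005, 5, 31)  # assumed reference date

# Recency: days between the reference date and the latest purchase.
recency = (today - max(date for _, date, _ in orders)).days
# Frequency: number of distinct orders.
frequency = len({number for number, _, _ in orders})
# Monetary: total amount spent across all orders.
monetary = sum(amount for _, _, amount in orders)

print(recency, frequency, monetary)  # 14 3 450.5
```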

To extract these values, we only need the following columns from the dataset:

‘CUSTOMERNAME’, ‘ORDERNUMBER’, ‘ORDERDATE’ and ‘SALES’

temp = ['CUSTOMERNAME', 'ORDERNUMBER', 'ORDERDATE', 'SALES']
RFM_data = data[temp].copy()  # copy so later column assignments don't act on a view of `data`
RFM_data.shape

Create the RFM Table

• In the dataset, the last order date is May 31, 2005. We use this date as the NOW date to calculate recency.

NOW = dt.datetime(2005, 5, 31)

# Convert ORDERDATE to datetime format.
RFM_data['ORDERDATE'] = pd.to_datetime(RFM_data['ORDERDATE'])

# RFM Table
RFM_table = RFM_data.groupby('CUSTOMERNAME').agg({
    'ORDERDATE': lambda x: (NOW - x.max()).days,  # Recency
    'ORDERNUMBER': lambda x: len(x.unique()),  # Frequency
    'SALES': lambda x: x.sum()})  # Monetary

RFM_table['ORDERDATE'] = RFM_table['ORDERDATE'].astype(int)

RFM_table.rename(columns={'ORDERDATE': 'recency',
                          'ORDERNUMBER': 'frequency',
                          'SALES': 'monetary_value'}, inplace=True)

Now we have the RFM values for each customer.

Let's work on the RFM score. We use quartiles, which split the available values into four equal parts, to calculate the RFM score.

quantiles = RFM_table.quantile(q=[0.25,0.5,0.75])
quantiles
# Converting quantiles to a dictionary, easier to use.
quantiles = quantiles.to_dict()

## RFM Segmentation ----
RFM_Segment = RFM_table.copy()

# Arguments (x = value, p = recency, monetary_value, frequency, d = quartiles dict)
# Fewer days since the last purchase is better, so low recency gets the high score.
def R_Class(x, p, d):
    if x <= d[p][0.25]:
        return 4
    elif x <= d[p][0.50]:
        return 3
    elif x <= d[p][0.75]:
        return 2
    else:
        return 1

# Arguments (x = value, p = recency, monetary_value, frequency, d = quartiles dict)
# Higher frequency and monetary value are better, so high values get the high score.
def FM_Class(x, p, d):
    if x <= d[p][0.25]:
        return 1
    elif x <= d[p][0.50]:
        return 2
    elif x <= d[p][0.75]:
        return 3
    else:
        return 4

RFM_Segment['R_Quartile'] = RFM_Segment['recency'].apply(R_Class, args=('recency', quantiles,))
RFM_Segment['F_Quartile'] = RFM_Segment['frequency'].apply(FM_Class, args=('frequency', quantiles,))
RFM_Segment['M_Quartile'] = RFM_Segment['monetary_value'].apply(FM_Class, args=('monetary_value', quantiles,))
RFM_Segment['RFMClass'] = RFM_Segment.R_Quartile.map(str) \
    + RFM_Segment.F_Quartile.map(str) \
    + RFM_Segment.M_Quartile.map(str)
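
As an alternative to the hand-written scoring functions, the same quartile scores could probably be produced with `pd.qcut`. This is a different technique from the one used in this article, shown only as a sketch; the toy table `rfm` and its values are made up for illustration.

```python
import pandas as pd

# Toy RFM table (made-up values) with the same column names as above.
rfm = pd.DataFrame(
    {
        "recency": [5, 20, 40, 60, 90, 120, 200, 365],
        "frequency": [12, 9, 8, 6, 5, 3, 2, 1],
        "monetary_value": [9000.0, 7000.0, 6500.0, 4000.0,
                           3000.0, 1500.0, 800.0, 100.0],
    },
    index=list("ABCDEFGH"),
)

# Recency: fewer days is better, so the lowest quartile gets score 4.
rfm["R_Quartile"] = pd.qcut(rfm["recency"], q=4, labels=[4, 3, 2, 1]).astype(int)
# Frequency and monetary value: larger is better, so the top quartile gets 4.
rfm["F_Quartile"] = pd.qcut(rfm["frequency"], q=4, labels=[1, 2, 3, 4]).astype(int)
rfm["M_Quartile"] = pd.qcut(rfm["monetary_value"], q=4, labels=[1, 2, 3, 4]).astype(int)

# Concatenate the three scores into a single class label such as "444".
rfm["RFMClass"] = (
    rfm["R_Quartile"].astype(str)
    + rfm["F_Quartile"].astype(str)
    + rfm["M_Quartile"].astype(str)
)
print(rfm)
```

One caveat: when many customers share the same value (for example, most customers have a frequency of 1), `pd.qcut` can fail because the bin edges are not unique. The explicit functions above do not have that limitation.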

With an RFM class assigned to every customer, we can answer questions such as:

• Who are my best customers?
• Which customers are on the verge of churning?
• Who are the lost customers that you don't need to pay much attention to?
• Who are your loyal customers?
• Which customers must you retain?
• Who has the potential to be converted into more profitable customers?
• Which group of customers is most likely to respond to your current campaign?

Some of them are:

Q. Who are my best customers?

Q. Which customers are on the verge of churning?

Customers whose recency score (R_Quartile) is low.

Q. Who are the lost customers?

Customers whose recency, frequency and monetary scores are all low.

Q. Who are loyal customers?

Customers with a high frequency score (F_Quartile).
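
These questions can be turned into simple filters on the scored table. The sketch below uses a small made-up table with the same column names as `RFM_Segment`. The thresholds for "low" recency (`<= 2`) and "high" frequency (`>= 3`) are assumptions chosen for illustration, not values taken from the analysis above.

```python
import pandas as pd

# Toy scored table (made-up values) with the columns created above.
RFM_Segment = pd.DataFrame(
    {
        "R_Quartile": [4, 4, 1, 2, 1, 3],
        "F_Quartile": [4, 2, 1, 4, 3, 3],
        "M_Quartile": [4, 3, 1, 4, 2, 3],
    },
    index=["A", "B", "C", "D", "E", "F"],
)
RFM_Segment["RFMClass"] = (
    RFM_Segment["R_Quartile"].astype(str)
    + RFM_Segment["F_Quartile"].astype(str)
    + RFM_Segment["M_Quartile"].astype(str)
)

# Best customers: top quartile on all three scores.
best = RFM_Segment[RFM_Segment["RFMClass"] == "444"]

# On the verge of churning: low recency score (threshold is an assumption).
churn_risk = RFM_Segment[RFM_Segment["R_Quartile"] <= 2]

# Lost customers: lowest quartile on all three scores.
lost = RFM_Segment[RFM_Segment["RFMClass"] == "111"]

# Loyal customers: high frequency score (threshold is an assumption).
loyal = RFM_Segment[RFM_Segment["F_Quartile"] >= 3]

print(list(best.index), list(churn_risk.index), list(lost.index), list(loyal.index))
```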