July 7, 2017
Following a seven-step procedure, companies can deploy predictive analytics to identify potential ‘churners’ and then take steps with short-term marketing campaigns to re-engage with them.
Simply put, non-subscription churn happens when users or customers, who can end their relationship with your business at any time, leave your website or sales funnel. These customers may gradually reduce their purchase frequency over time, or they may suddenly stop buying altogether. The gradual decline in purchase frequency is hard to identify and, therefore, difficult to address.
Companies in nearly every industry have to address this type of churn because it can stall the growth of any business, even one that is gaining customers quickly. The most successful companies address it by building predictive models that identify likely churners, then act on those predictions with targeted marketing campaigns or product changes designed to prevent churn.
Data science company Dataiku has published a free white paper detailing how organizations can use predictive analytics to combat non-subscription churn using seven proven steps:
1. Understand the Expected Time Between Purchases
It’s also a good idea to do basic descriptive statistical analysis upfront (unsupervised/clustering) to decide which users should even be considered in the churn analysis. For example, if someone used the product or service only one time, are they considered a churner after that? Or is there some minimum threshold after which a user should be considered and included in churn analysis?
How will your specific business define churn? This step is crucial — defining a churn period that is too long risks creating predictive models with artificially low churn rates, not capturing enough people and defeating the purpose of predictive modeling. But defining a churn period that is too short makes it difficult for marketing teams to evaluate churn prevention campaigns because they ultimately can’t distinguish between organic actions (users or customers who would have come back anyway without intervention) and effective campaigns.
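One way to ground the churn definition in data is to measure how long customers typically wait between purchases. The sketch below is illustrative only: the purchase log is made up, and the rule of setting the churn window to twice the median gap is an assumption, not a recommendation from the white paper.

```python
from datetime import date
from statistics import median

# Hypothetical purchase log: (customer_id, purchase_date)
purchases = [
    ("a", date(2017, 1, 1)), ("a", date(2017, 1, 11)), ("a", date(2017, 1, 31)),
    ("b", date(2017, 1, 5)), ("b", date(2017, 2, 4)),
    ("c", date(2017, 3, 1)),  # a single purchase contributes no gap
]

def purchase_gaps(rows):
    """Return the days between consecutive purchases, pooled across customers."""
    by_customer = {}
    for customer_id, day in rows:
        by_customer.setdefault(customer_id, []).append(day)
    gaps = []
    for days in by_customer.values():
        days.sort()
        gaps.extend((later - earlier).days for earlier, later in zip(days, days[1:]))
    return gaps

gaps = purchase_gaps(purchases)

# Assumed rule of thumb: treat silence longer than twice the median gap as churn.
churn_window_days = 2 * median(gaps)
```

In practice the multiplier (or a percentile of the gap distribution) would be tuned with the marketing team, since it controls the trade-off between long and short churn periods described above.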
2. Get Your Data
The minimum data required to predict churn is simply some form of customer identification and a date/time of that customer’s first and last interaction. This data, though not incredibly detailed, would allow you to build models to analyze and predict churn at a basic level.
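As a rough illustration of how little is needed, the sketch below holds only a customer ID plus first and last interaction dates and flags customers whose inactivity exceeds a churn window. The `CustomerActivity` class, the field names, and the 60-day window are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class CustomerActivity:
    customer_id: str
    first_seen: date   # date of first interaction
    last_seen: date    # date of most recent interaction

def is_churned(customer, as_of, churn_window_days):
    """True if the customer has been inactive longer than the churn window."""
    return (as_of - customer.last_seen).days > churn_window_days

customers = [
    CustomerActivity("c1", date(2017, 1, 3), date(2017, 6, 20)),
    CustomerActivity("c2", date(2016, 11, 9), date(2017, 2, 1)),
]

as_of = date(2017, 7, 1)
churned_ids = [c.customer_id for c in customers
               if is_churned(c, as_of, churn_window_days=60)]
```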
3. Explore and Prepare Data
Remember that this step of the process can account for up to 80 percent of the total time spent on the project, so don’t be discouraged as you get your data into a usable format. Take time to understand what each variable in your data means before you move on to cleaning: reconciling different spellings of the same value and handling missing data so that everything is homogeneous. Thorough exploration and cleaning will save time in subsequent steps, particularly when it comes time for prediction.
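A minimal cleaning sketch, assuming rows arrive as dictionaries with `customer_id` and `country` fields (both hypothetical): it trims whitespace, maps spelling variants to one canonical value, keeps missing values explicit, and drops rows that have no customer ID.

```python
# Hypothetical alias table: spelling variants mapped to one canonical code.
COUNTRY_ALIASES = {"usa": "US", "u.s.": "US", "united states": "US", "us": "US"}

def clean_record(raw):
    """Normalize one raw row; return None when the customer ID is missing."""
    customer_id = (raw.get("customer_id") or "").strip()
    if not customer_id:
        return None  # without an ID the row cannot be tied to a customer
    country = (raw.get("country") or "").strip().lower()
    if not country:
        normalized = None  # keep missing values explicit rather than guessing
    else:
        normalized = COUNTRY_ALIASES.get(country, country.upper())
    return {"customer_id": customer_id, "country": normalized}

raw_rows = [
    {"customer_id": " c1 ", "country": "USA"},
    {"customer_id": "c2", "country": "United States "},
    {"customer_id": "c3", "country": None},
    {"customer_id": "", "country": "US"},
]

cleaned = [row for row in map(clean_record, raw_rows) if row is not None]
```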
4. Enrich Your Data
If you’re working with a more advanced data set than simply customer identification and date/time of last interaction (which is, as mentioned, highly recommended for better prediction), this is the time to enrich that data and join it to get down to the essentials. For example, if you have one data set with customer identification and date/time of last interaction and another with customer identification and demographic information, you’ll want to join these into one set of data.
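The join described above might look like the following sketch. The two data sets, the `age_band` field, and the choice of a left join (keep every customer with activity, even when demographics are missing) are assumptions made for illustration.

```python
# Hypothetical inputs, both keyed by customer ID.
last_interaction = {"c1": "2017-06-30", "c2": "2017-03-12"}
demographics = {"c1": {"age_band": "25-34"}, "c3": {"age_band": "45-54"}}

def left_join(activity, demo, demo_fields=("age_band",)):
    """Keep every customer in `activity`; fill absent demographics with None."""
    joined = []
    for customer_id, last_seen in activity.items():
        row = {"customer_id": customer_id, "last_seen": last_seen}
        extra = demo.get(customer_id, {})
        for field in demo_fields:
            row[field] = extra.get(field)
        joined.append(row)
    return joined

customer_table = left_join(last_interaction, demographics)
```

Customers that appear only in the demographic data (here `c3`) are dropped, because without interaction dates they cannot be scored for churn.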
5. Unleash a Machine Learning Algorithm
When building a predictive model, you have to be careful that it will actually learn what you want. For instance, one of the common pitfalls for a churn modeling project is to train your model on both past and future events, so that its inputs include information that would not yet exist at prediction time. To avoid this mistake, put yourself in the position you’ll be in when your model is deployed into production: What data will be available to you? What period should the prediction cover: next week, next month?
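One common way to avoid mixing past and future is to pick a cutoff date, compute features only from events on or before it, and derive the label only from the window after it. The sketch below assumes that setup; the function name, the two features, and the horizon values are hypothetical.

```python
from datetime import date, timedelta

def build_example(events, cutoff, horizon_days):
    """Build (features, churned) for one customer from a list of purchase dates.

    Features use only events on or before `cutoff`; the label looks only at
    the `horizon_days` after it, mirroring what is known at prediction time.
    """
    past = [d for d in events if d <= cutoff]
    if not past:
        return None  # customer did not exist yet at the cutoff
    horizon_end = cutoff + timedelta(days=horizon_days)
    future = [d for d in events if cutoff < d <= horizon_end]
    features = {
        "n_purchases": len(past),
        "days_since_last": (cutoff - max(past)).days,
    }
    churned = len(future) == 0  # no purchase inside the prediction window
    return features, churned

events = [date(2017, 1, 5), date(2017, 2, 10), date(2017, 5, 1)]
next_month = build_example(events, cutoff=date(2017, 3, 1), horizon_days=30)
next_quarter = build_example(events, cutoff=date(2017, 3, 1), horizon_days=90)
```

The same customer is labeled a churner for a 30-day horizon but not for a 90-day one, which is why the prediction horizon has to be fixed before training.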
An important part of the predictive process is the interaction and iteration between predictive modeling and feature engineering. In step 4, you enriched your data and generated features. Now it’s time to see if the features you’ve added are actually valuable to your model. Try keeping the feature set relatively small at first and then run your model(s) to evaluate performance. Little by little, continue to add features and evaluate their effect on the accuracy of the model.
6. Visualize Your Data
Now that you have explored, cleaned, and enriched your data, it’s time to visualize it. Visualization is an important step in the process because it lets end users (in the case of churn, the marketing team and/or the product team) consume the data quickly and easily.
7. Iterate and Deploy
This is where the interplay between data science and business is strongest: work together to determine whether the model is actually effective. In particular, ensure models are sufficiently generic, which means using training, validation, and testing sets that are not specific to a certain time period or to a certain type of customer. For example, you would not want to train or test on a data set from a period when a pricing change or some other factor caused churn rates to differ from the norm.
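A small sketch of the last point, assuming each training example carries an observation date and that the business can name the atypical periods. The pricing-change dates below are invented for illustration.

```python
from datetime import date

# Hypothetical period during which churn was atypical (e.g., a pricing change).
EXCLUDED_PERIODS = [(date(2017, 2, 1), date(2017, 2, 28))]

def usable_for_training(observation_date, excluded_periods=EXCLUDED_PERIODS):
    """True if the observation falls outside every excluded period."""
    return not any(start <= observation_date <= end
                   for start, end in excluded_periods)

observations = [date(2017, 1, 15), date(2017, 2, 10), date(2017, 3, 5)]
training_dates = [d for d in observations if usable_for_training(d)]
```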
Once you have a good churn prediction model in place, the job is only half complete. The final (and perhaps most important) step is to take action based on your predictions. Many businesses make the mistake of targeting only those who scored the highest (i.e., are most likely to churn), but often it’s the customers with lower scores who can be saved from churning out. Short-term marketing campaigns, particularly those offering special deals or discounts, are often the most effective means of re-engaging predicted churners.
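The targeting idea can be sketched as selecting a band of scores rather than the top of the list. The scores and the 0.4 to 0.8 band below are hypothetical; the right thresholds would come from testing campaigns against a control group.

```python
def select_campaign_targets(scores, low=0.4, high=0.8):
    """Return customer IDs whose churn score falls in [low, high).

    Assumption: customers above `high` are treated as likely lost already,
    and those below `low` as unlikely to need an offer.
    """
    return sorted(cid for cid, score in scores.items() if low <= score < high)

# Hypothetical model output: customer ID -> predicted churn probability.
churn_scores = {"c1": 0.95, "c2": 0.65, "c3": 0.45, "c4": 0.10}

targets = select_campaign_targets(churn_scores)
```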
Dr. Ken Sanford is the U.S. lead analytics architect for Dataiku. He is a reformed academic economist who likes to empower customers to solve problems with data. In addition, Dr. Sanford teaches courses in Applied Forecasting, Stress Testing and Big Data Tools for Economists at Boston College. He has a Ph.D. in Economics from the University of Kentucky in Lexington and his work on price optimization has been published in peer-reviewed journals.