Analyzing the Promoters and Detractors and Net Promoter Score


Contributed by: Venkatesh Suswaram

I’m Venkatesh, and I currently live in Bangalore. I graduated from BITS Pilani in 2013. My hobbies include playing the piano, working out, and reading books. Professionally, I work as a Senior Business Analyst on the Data Science and Advanced Analytics team at VMware, and I have 8+ years of experience in data analytics across management consulting and product-based firms. Before joining PGP-DSBA at Great Learning, I worked on a couple of machine learning projects. One of them was to predict customer sentiment (Net Promoter Score) using customers’ transactional data.

Businesses constantly try to improve customer satisfaction, and the Net Promoter survey is a crucial way of capturing customer feedback. In this survey, customers are asked to rate their experience of doing business with the organization on a scale of 1 to 10 (1 being the worst and 10 the best). Customers who score 1 to 6 are classified as ‘Detractors’ (the customers most likely to churn), 7 or 8 as ‘Neutral’, and 9 or 10 as ‘Promoters’ (loyal customers). Because the response rate for this survey is very low, the business wants to predict customers’ sentiment from transactional data alone (bookings, sales leads, service requests, etc.), without any survey input.
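The rating bands above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample ratings are made up, and the score formula used is the standard NPS definition (percentage of promoters minus percentage of detractors), which the article itself does not state.

```python
from collections import Counter


def classify_rating(rating: int) -> str:
    """Map a 1-10 survey rating to the segment described in the article."""
    if not 1 <= rating <= 10:
        raise ValueError(f"rating must be between 1 and 10, got {rating}")
    if rating <= 6:
        return "Detractor"
    if rating <= 8:
        return "Neutral"
    return "Promoter"


def net_promoter_score(ratings: list[int]) -> float:
    """Standard NPS: percentage of promoters minus percentage of detractors."""
    if not ratings:
        raise ValueError("need at least one rating")
    counts = Counter(classify_rating(r) for r in ratings)
    return 100.0 * (counts["Promoter"] - counts["Detractor"]) / len(ratings)


# Made-up survey responses, for illustration only.
sample_ratings = [10, 9, 8, 7, 6, 3, 10, 9, 9, 2]
print([classify_rating(r) for r in sample_ratings])
print(net_promoter_score(sample_ratings))
```

Neutral responses count toward the denominator but not toward either side of the difference, which is why a large neutral group pulls the score toward zero.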

The key challenges were that the data is imbalanced and that data for customers with low ratings is sparse. Because the data comes from multiple sources, cleansing and transformation are required. Customer identifiers differ across sources, so the data has to be merged on a common unique identifier. The NPS survey is conducted at the email level, so promoter and detractor feedback can both come from the same company.
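To make the last point concrete, here is a minimal sketch of how email-level responses could be rolled up to a company identifier to find companies with conflicting feedback. The record layout, the `company_id` key, and the sample rows are hypothetical. The article does not say how such companies were handled, so the sketch only flags them.

```python
from collections import defaultdict

# Hypothetical email-level survey responses: (email, company_id, rating).
responses = [
    ("a@acme.example", "ACME", 9),
    ("b@acme.example", "ACME", 4),
    ("c@globex.example", "GLOBEX", 10),
    ("d@initech.example", "INITECH", 5),
]


def segment(rating):
    """Rating bands from the article: 1-6, 7-8, 9-10."""
    if rating <= 6:
        return "Detractor"
    if rating <= 8:
        return "Neutral"
    return "Promoter"


def mixed_sentiment_companies(rows):
    """Return sorted company ids that have both promoter and detractor responses."""
    segments = defaultdict(set)
    for _email, company_id, rating in rows:
        segments[company_id].add(segment(rating))
    return sorted(
        company
        for company, seen in segments.items()
        if {"Promoter", "Detractor"} <= seen
    )


print(mixed_sentiment_companies(responses))
```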

EDA Steps

Data Cleansing: Removed customers with no bookings in the past 5 years, removed “Default” customers, and removed highly correlated variables (correlation threshold = 70%).
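One common way to apply a correlation threshold like this is a greedy filter: keep a variable only if its correlation with every variable already kept is at or below the threshold. The sketch below assumes Pearson correlation compared by absolute value and uses made-up column names and data. The article does not say which correlation measure was used or which variable of a correlated pair was dropped.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (no constant columns)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def drop_correlated(features, threshold=0.70):
    """Keep a column only if |r| with every already-kept column is <= threshold."""
    kept = []
    for name in features:
        if all(
            abs(pearson(features[name], features[other])) <= threshold
            for other in kept
        ):
            kept.append(name)
    return kept


# Hypothetical per-customer features, for illustration only.
features = {
    "bookings": [10, 20, 30, 40, 50],
    "bookings_usd_k": [11, 19, 32, 41, 48],  # nearly duplicates "bookings"
    "service_requests": [5, 1, 4, 2, 3],
}
print(drop_correlated(features))
```

Because the filter is greedy, the result depends on column order: the first variable of a correlated pair is the one that survives.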

Data Manipulation: Capped outliers (values above the z = 0.99 threshold) at the z = 0.99 value, and normalized the input variables to the range -1 to 1.
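A minimal sketch of this step follows. The article does not spell out what the 0.99 threshold refers to; the code assumes it means the 99th percentile of each variable, computed here with a simple nearest-rank rule, and it assumes min-max scaling for the -1 to 1 normalization. Function names and data are illustrative.

```python
import math


def cap_at_quantile(values, q=0.99):
    """Replace values above the q-th quantile (nearest-rank) with that quantile."""
    ordered = sorted(values)
    # Nearest-rank quantile: the value at position ceil(q * n), counting from 1.
    rank = max(1, math.ceil(q * len(ordered)))
    cap = ordered[rank - 1]
    return [min(v, cap) for v in values]


def scale_to_minus_one_one(values):
    """Min-max scale values linearly so the minimum maps to -1 and the maximum to 1."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]  # constant column: no spread to scale
    return [2 * (v - low) / (high - low) - 1 for v in values]


# Made-up variable with one extreme value.
bookings = list(range(1, 100)) + [10_000]
capped = cap_at_quantile(bookings)
scaled = scale_to_minus_one_one(capped)
print(max(capped), min(scaled), max(scaled))
```

Capping before scaling matters: without it, the single extreme value would compress every other customer into a narrow band near -1.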

Exclusions:

1. Excluded customers who have had no bookings in the past 5 years.

2. Excluded the “Default” account name, mainly because of its large number of service requests (SRs).

Models Applied

We tried several models on this data, including logistic regression and random forest. The random forest model gave decent accuracy and an F1 score of 76%.
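Since the data is imbalanced, accuracy alone can be misleading, which is one reason to report an F1 score alongside it. The sketch below shows the arithmetic on made-up labels. Treating ‘Detractor’ as the positive class is an assumption, as the article does not say which class its F1 score refers to.

```python
def precision_recall_f1(y_true, y_pred, positive="Detractor"):
    """Precision, recall, and F1 for one class, computed from label pairs."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up example: 2 detractors among 10 customers, and a model that
# never predicts the minority class.
y_true = ["Detractor"] * 2 + ["Non-Detractor"] * 8
y_pred = ["Non-Detractor"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(accuracy, f1)
```

In this example the model looks acceptable on accuracy while finding no detractors at all, which the F1 score exposes.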

Impact

  • Non-Detractors have more SRs, more campaign responses, and quicker SR resolution times.
  • Detractors have higher average bookings per survey wave, while opportunity win and loss rates are similar for Detractors and Non-Detractors. This suggests starting segmentation-based (differentiated) customer service based on bookings.
  • Refine the survey contact list based on purchases, SRs, campaign responses, etc.

With the topics and models I have learned in this course, I hope we can use more advanced techniques next time to get better predictions.
