Modern customer segmentation strategies/techniques

  • Personalization in Recommendation
  • Using patterns of activity to define segments
  • Tying multiple traditional methodologies together for better customer segmentation
  • Building a customer persona
  • Online customer segmentation

Data sources

Limitations of traditional data sources:

  • Survey-based: 
    • Limited Data
    • Limited contextual understanding
  • Siloed Data

Newer data sources:

  • Internal Data Warehouses
  • Social Media Platforms
    • Social Media Advertising
    • Social Media Analytics
  • External Web Tracking Tools (Pendo and similar tools)
  • Service/Product Usage (Internal database/warehouses)
  • Web Scraping (Post-Segmentation – e.g. Spotify)
  • Freely available data (government-provided, etc.):
    • Demographics
    • Profession
    • Industry Type 
    • (Helpful in a company's early phase for understanding the market)

Clustering:

  • When we talk about segmentation, the very first thing that comes to our minds is AI-based clustering
  • AI models cluster customers by similar traits/characteristics, with softer boundaries than rule-based models
  • Clustering forms the first step to personalization 
  • Ecommerce example:
    • Most commonly used technique:
      • RFM clustering (segments based on recency, frequency, and monetary value)
    • Limitations of RFM:
      • The RFM technique is used mostly on transactional data
      • Customer attributes such as age, gender, and region are not tied to the transactional data
      • Customer-level signals such as usage, trends, seasonality, calendar effects, and impulse buys are not taken into consideration
    • Churn/Engagement scores as an additional input to the clustering algorithm
      • Identify accounts with a high propensity of churning (score them)
      • Behavior (usage) – score each customer's engagement with the platform
      • These propensities/probabilities can be used as features for segmentation
      • Usage data source examples: Google Analytics (GA), Pendo, internal data warehouses
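
The RFM features described above can be sketched in a few lines. This is a minimal illustration, not a production pipeline: the transaction log, the customer IDs, the `as_of` reference date, and the `churn_score` values are all hypothetical, and the churn scores stand in for the output of a separate propensity model.

```python
from datetime import date

# Hypothetical transaction log: (customer_id, purchase_date, amount).
transactions = [
    ("c1", date(2024, 1, 5), 40.0),
    ("c1", date(2024, 3, 1), 60.0),
    ("c2", date(2023, 11, 20), 15.0),
    ("c3", date(2024, 2, 10), 200.0),
    ("c3", date(2024, 2, 25), 120.0),
    ("c3", date(2024, 3, 10), 80.0),
]


def rfm_table(rows, as_of):
    """Return {customer_id: (recency_days, frequency, monetary)}."""
    table = {}
    for customer, day, amount in rows:
        # Track the latest purchase date, the purchase count, and total spend.
        last, count, total = table.get(customer, (day, 0, 0.0))
        table[customer] = (max(last, day), count + 1, total + amount)
    return {
        customer: ((as_of - last).days, count, total)
        for customer, (last, count, total) in table.items()
    }


rfm = rfm_table(transactions, as_of=date(2024, 3, 15))

# Hypothetical churn propensities (assumed to come from a separate model),
# appended to the RFM values as an extra clustering feature.
churn_score = {"c1": 0.20, "c2": 0.85, "c3": 0.05}
features = {c: rfm[c] + (churn_score[c],) for c in rfm}
```

Each entry of `features` is one row that could be passed to a clustering algorithm; in practice the columns would usually be scaled first, because recency, frequency, and monetary value sit on very different ranges.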

ML Techniques used for clustering

Unsupervised learning is generally used for clustering

  • K-means 
    • Define the number of clusters (K)
    • Assigns each data point to its nearest centroid, which forms the clusters
    • Iteratively repositions the centroids to minimize the sum of squared distances between each data point and its assigned centroid
    • Computationally less expensive and easier to build
  • DBSCAN
    • Density-based clustering
    • No need to define the number of clusters (unlike K-means)
    • Handles outliers quite well
    • Can be computationally expensive and slow to run on large datasets
    • If the density varies across the data, the clusters may not form properly
    • Struggles with high-dimensional data
  • GMM (Gaussian mixture model)
    • Each cluster is a Gaussian component described by:
      • Mean – defines the centre
      • Covariance – defines the width
      • Mixing probability – defines how big or small the cluster will be
    • Can form clusters of different shapes (e.g. elongated ellipses), not just circular ones like K-means
  • SOM (self-organizing maps) / PCA (principal component analysis)
    • Techniques used for dimensionality reduction
    • The output from these techniques can be subsequently used as input to the clustering algorithms
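
The K-means steps listed above can be sketched as follows. This is a minimal pure-Python version of the standard assign-then-update loop, written for illustration only; the sample `points`, the `seed`, and the function names are assumptions, and a real project would more likely use a library implementation.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=100, seed=0):
    """Return (centroids, assignment, inertia) for the given points."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    assignment = None
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    # Inertia: the sum of squared distances that K-means tries to minimize.
    inertia = sum(
        squared_distance(p, centroids[a]) for p, a in zip(points, assignment)
    )
    return centroids, assignment, inertia


# Hypothetical 2-D customer features forming two well-separated groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, assignment, inertia = kmeans(points, k=2)
```

Here `assignment` holds one cluster index per point and `inertia` is the quantity being minimized. Running the function for several values of `k` and comparing the inertia is one common way to choose the number of clusters.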

Use Cases for clustering

  • Once clusters are created, you can devise growth/marketing strategies around each cluster based on their common characteristics 
  • Types: Retention / Growth / Upsell / Cross-sell (Xsell) / Ad targeting
    • Retention – High-value customers
    • Growth – Low-engagement customers and prospects; push them to the other side of the fence, e.g. through ad targeting
    • Upsell/Xsell – Medium-engagement customers with scope for increased revenue, reached through better recommendations (Recommendation System)

Recommendation Systems – Various techniques/approaches

  • Collaborative filtering (cross-referencing: pick favourites from one user and recommend them to another user in the same cluster)
    • Some of the useful data points for collaborative filtering for a platform such as Spotify:
      • Genre
      • Artist
      • Preferred Language
  • Content-based filtering using NLP, for a platform like Quora
    • Word embeddings – convert words to vectors
    • Similarity between words (distance between the vectors)
    • Doc2vec (Finding similarity between documents)
    • Topic modelling on the document
    • Contextual similarity – similarity of the content falling within a certain topic
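
The idea of turning text into vectors and comparing them can be sketched as below. As a loudly stated simplification, this example uses plain word-count vectors rather than learned word embeddings or Doc2vec, so it only captures shared vocabulary; the `documents` and their IDs are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Turn a text into a sparse word-count vector (a stand-in for embeddings)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(doc_id, docs):
    """Return the ID of the document closest to docs[doc_id]."""
    target = bag_of_words(docs[doc_id])
    scores = {
        other: cosine_similarity(target, bag_of_words(text))
        for other, text in docs.items()
        if other != doc_id
    }
    return max(scores, key=scores.get)


# Hypothetical question texts, in the spirit of a Q&A platform.
documents = {
    "q1": "how do neural networks learn",
    "q2": "how do deep neural networks learn features",
    "q3": "best pasta recipes for dinner",
}
recommended = most_similar("q1", documents)
```

A reader of `q1` would be recommended `q2`, since the two share most of their words while `q3` shares none. Swapping `bag_of_words` for an embedding model would let the same similarity function match texts that use different words for the same topic.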

Case Study: Spotify B2C
