5 min read

Identifying Involuntary Churn

In this analysis we will try to identify subscriptions that are cancelled due to failed payments. The cancellations should not have been initiated by the customers but by Stripe. This subset of churn is often called “involuntary churn”.

We’ll also look at when customers add payment methods in relation to their trial end times.

Summary

  • Between 800 and 1200 subscriptions cancellations are preceded by a failed payment and unpaid invoice each month. These subscriptions are assumed to have been cancelled automatically.

  • Around 37-40% of users that added payment methods since July 2020 did so after their trials ended.

Data Collection

Our approach to identifying subscriptions that were cancelled automatically will center around the subscriptions’ latest invoices with an subtotal greater than 0. If the latest invoice of a cancelled subscription is not paid and has at least one failed payment attempt, then we will assume that the subscription was cancelled automatically.

The following SQL query will be used to retrieve cancelled subscriptions and their latest invoices.

# sql query
sql <- "
  with last_invoice as (
    select 
      i.id as invoice_id
      , i.subscription_id
      , i.customer_id
      , i.paid
      , i.forgiven
      , i.attempted
      , i.attempt_count
      , i.closed
      , i.total
      , i.subtotal
    from dbt_buffer.stripe_invoices i
    inner join (
      select 
        subscription_id
        , max(created_at) as last_invoice_at 
      from dbt_buffer.stripe_invoices 
      where total > 0
      and subtotal > 0
      group by 1) as recent
      on i.subscription_id = recent.subscription_id
    where i.created_at = recent.last_invoice_at
    )
    select distinct
      s.id as subscription_id
      , s.plan_id
      , s.created_at
      , s.canceled_at
      , s.ended_at
      , i.invoice_id
      , i.customer_id
      , i.paid
      , i.forgiven
      , i.attempted
      , i.attempt_count
      , i.closed
      , i.total
      , i.subtotal
    from dbt_buffer.stripe_paid_subscriptions s
    left join last_invoice i
      on s.id = i.subscription_id
    where s.ended_at >= '2019-01-01'
    and s.status = 'canceled'
"

Cancelled Subscriptions

Let’s explore the data and plot the number of churned subscriptions per month since January 2019.

It looks like there are usually between 3500 and 4000 subscriptions ending each month. Next we’ll attemp to identify subscriptions that were cancelled autmatically due to failed payments.

Automatic Cancellations

Again, we will identify these subscriptions by looking at their latest non-zero invoice and checking if it was paid and if there were multiple paymenat attempts.

# determine if churn is involuntary
subs <- subs %>% 
  mutate(involuntary = !paid & !forgiven & attempted & attempt_count > 1)

Now let’s count the number of cancellations that are due to our definition of involuntary churn:

# count subscriptions 
subs %>% 
  group_by(involuntary) %>% 
  summarise(subscriptions = n_distinct(subscription_id)) %>% 
  mutate(prop = percent(subscriptions / sum(subscriptions)))
## # A tibble: 2 x 3
##   involuntary subscriptions prop 
##   <lgl>               <int> <chr>
## 1 FALSE               62026 73%  
## 2 TRUE                22806 27%

Around 27% of all subscription cancellations occurred when the last invoice was not paid and multiple payment attempts were made. That is a surprisingly high percentage.

Now let’s plot this number over time.

The data suggests that around 800-1200 subscriptions are cancelled automatically each month. What percentage of subscriptions does this represent?

The data suggests that around 25-30% of cancellations occur after an unpaid invoice and failed payments. This is consistent with our earlier finding.

We can view individual subscriptions in the Stripe Dashboard to double check that the subscriptions were cancelled after a series of failed payments.

# preview involuntary churn subscriptions
subs %>% 
  filter(involuntary) %>% 
  select(customer_id, subscription_id, plan_id) %>% 
  sample_n(20)
## # A tibble: 20 x 3
##    customer_id        subscription_id    plan_id                     
##    <chr>              <chr>              <chr>                       
##  1 cus_8rpDNV6vDk5tAC sub_8rpDs80zPz9Yep pro-annual                  
##  2 cus_5a43aBstaZM3B5 sub_FAI1qo3vUW55QJ business_v2_business_monthly
##  3 cus_BsW0pne28qGbWZ sub_DkceJiGAwxGXuy business_v2_small_monthly   
##  4 cus_E5vDSXSmRcYBzv sub_E5vDe6pFCoKjG1 pro_v1_yearly               
##  5 cus_9CXW4GbAiPG0ap sub_EKVmVkGTOSmANj pro_v1_monthly              
##  6 cus_F2MCvt9xqUS2HB sub_GT0iqwkLwLyAba pro_v1_monthly              
##  7 cus_CRdEdsk1WVcaQ7 sub_CRdEwaiLSSrlB4 pro-annual                  
##  8 cus_8lGCIBejBB0eQH sub_A9r5qePkhey7ON pro_v1_monthly              
##  9 cus_9InMLyQEfKKE6O sub_9InMp0VOXRNJFR pro-monthly                 
## 10 cus_846cGXNOcYrxBg sub_846crkA4OTnY3Q pro-annual                  
## 11 cus_993rpVAvzH8jmp sub_993rBUxX5SIO2Z pro-annual                  
## 12 cus_EhE2eavtLzyaeZ sub_EoiQ39QfyLzpdF business_v2_small_monthly   
## 13 cus_DJ7LGIX5ihNE5M sub_DJ7LUa9Nfe30o7 business_v2_business_monthly
## 14 cus_A8LyiH181CydI6 sub_GXxY7y3NKfFIfu pro_v1_monthly              
## 15 cus_CcoRShN7VcVJpA sub_Cx3bjjDqSP3wLX pro_v1_monthly              
## 16 cus_336aUukJ9snsrQ sub_5F8U1sWJYtYYcz pro-monthly                 
## 17 cus_BhW2r5trJBKJRQ sub_BhW2ekNf4Z0etA pro-monthly                 
## 18 cus_8vcTIpNBsvJdp9 sub_8vcTQCy6zpsahc pro_v1_yearly               
## 19 cus_FD74HDhtSSVPkp sub_FM5SnD8aX3jzXM pro_v1_yearly               
## 20 cus_6KX37FWLa0iZsz sub_7UA015IqpZ9Zcl pro_v1_monthly

When Payment Methods are Added

Next we will look at when payment methods are added in relation to trials. The goal is to estimate the number of customers that add payment methods after their trials end. We’ll use the following SQL query to gather the data:

# sql query
sql <- "
  select distinct
    p.user_id
    , p.id as event_id
    , p.event
    , p.stripe_event_id
    , p.timestamp
    , t.id as trial_end_id
    , t.subscription_id
    , t.timestamp as trial_end_at
  from segment_publish_server.payment_method_added p
  left join dbt_buffer.segment_trial_ends t
    on p.user_id = t.user_id
"

Now let’s count the number of users that added a payment method after their trial ended.

# count users that added payment after trial ended
cards_added %>% 
  group_by(after = !is.na(trial_end_at) & timestamp > trial_end_at) %>% 
  summarise(users = n_distinct(user_id)) %>% 
  mutate(prop = percent(users / sum(users)))
## # A tibble: 2 x 3
##   after users prop 
##   <lgl> <int> <chr>
## 1 FALSE 39627 63%  
## 2 TRUE  23243 37%

Since July 2020, around 37% of users added their payment methods after their trial ended. Let’s see if this rate changed over time.

The proportion of customers that added their payment methods after their trial ended has tended to increase over time for whatever reason. The proportion is hovering around 40% in recent months.

How Long it Takes to Add Payment Methods

Let’s take a look at how long it takes people to add payment methods after their trials expire.

# calculate days to add payment method
cards_added <- cards_added %>% 
  mutate(trial_end_date = as.Date(trial_end_at, format = "%Y-%m-%d"),
         payment_add_date = as.Date(timestamp, format = "%Y-%m-%d"),
         days_to_add = as.numeric(payment_add_date - trial_end_date))

We can visualize the distribution of the number of days between the payment method added date and the date on which the trials ended.

Another way to visualize this distribution is to look at the cumulative distribution function (CDF).

Around 67% of users that added a payment method after their trial expired did so within 30 days of the trial expiring. Around 79% added a payment method within 60 days of the trial expiring.