In this analysis we’ll look at video updates sent with Buffer in 2019 and break them down by whether or not they were created by paying customers. The idea is to get a better sense of where our costs come from.
If a high percentage of video updates are sent by free users, resulting in a disproportionately high proportion of processing costs, it might make sense to only allow paying customers to create video updates.
We’ll collect video updates sent in 2019 from BigQuery.
# connect to BigQuery con <- dbConnect( bigrquery::bigquery(), project = "buffer-data" ) # define sql query sql <- " select timestamp_trunc(up.created_at, month) as created_month , u.billing_plan , u.billing_plan_number , count(distinct up.id) as video_updates from dbt_buffer.publish_updates up inner join dbt_buffer.publish_profiles p on up.profile_id = p.id inner join dbt_buffer.publish_users u on p.user_id = u.id where json_extract(up.media, '$.video') is not null and up.synced_at >= '2019-01-01' and up.created_at between '2019-01-01' and '2020-01-01' group by 1,2,3 " # query bigquery videos <- dbGetQuery(con, sql)
Now let’s plot the proportion of video updates that were sent by paying and free users across the year. It should be noted that we determine if a user is a paying customer by looking at their current plan.
That means that a user that sent a video update while on a paying plan in 2019 and later churned would not be considered a paying customer. Conversely, a user that sent a video update while on the Free plan in 2019 who later upgraded to a paid plan would be considered a paying customer in this plot.
The proportion of video updates that were created by paying customers increased throughout 2019. This may be in part because we launched direct scheduling for Instagram videos. It may also be because we deemphasized the Free plan throughout the year.
If we calculate the proportion of all video updates created in 2019 sent by users that are currently paying for Publish, it comes out to around 51%.
# get proportion of video updates for all of 2019 videos %>% group_by(paying) %>% summarise(videos = sum(video_updates)) %>% mutate(prop = percent(videos / sum(videos)))
## # A tibble: 2 x 3 ## paying videos prop ## <lgl> <int> <chr> ## 1 FALSE 2212121 48.9% ## 2 TRUE 2313238 51.1%