4 min read

What Proportion of Active Users Are Tagged?

In this analysis we’ll calculate the proportion of Buffer’s user base that is labeled with Segment Identify Traits. We’ll look at the percentage of active customers that have been labeled by the target customer survey as well as the proportion of paying customers that are labeled by the traits isPayingPublishCustomer and isPayingAnalyzeCustomer.

I gathered the “active” user counts by collecting users that took the most common action in each product. For Publish that was “Post Created”, for Analyze it was “Report Module Added”, and for Reply it was “Conversation Closed”.

TLDR

  • Around 44% of active Publish users have been tagged by the target customer survey.
  • Around 48% of active Analyze users have been tagged by the target customer survey.
  • Around 48% of active Reply users have been tagged by the target customer survey.

Paying Customer Identify Traits

Let’s start with Publish users. It should be noted that there are millions of inactive free accounts, so we don’t expect many of those to be tagged with a currentPublishPlan trait. Therefore we’ll only look at paying Publish customers.

# connect to bigquery
con <- dbConnect(
  bigrquery::bigquery(),
  project = "buffer-data"
)

# define sql query
sql <- "
  select distinct
    u.id as user_id
    , u.billing_plan
    , i.current_publish_plan
    , i.is_paying_publish_customer
  from dbt_buffer.publish_users u
  left join segment_authentication_service.identifies i
    on i.user_id = u.account_id
  where u.billing_plan != 'individual' 
  and u.billing_plan != 'free'
  and u.billing_plan is not null
"
  
# query BQ
users <- dbGetQuery(con, sql)

Now let’s calculate the proportion of all paying users that are tagged.

# calculate proportion that are tagged
users %>% 
  count(is_paying_publish_customer, name = "users") %>% 
  mutate(prop = percent(users / sum(users)))
## # A tibble: 3 x 3
##   is_paying_publish_customer users prop 
##   <lgl>                      <int> <chr>
## 1 FALSE                       1729 2%   
## 2 TRUE                       38981 56%  
## 3 NA                         29278 42%

Around 80% of paying Publish customers are not tagged by IsPayingPublishCustomer. In addition, around 1.6% of paying Publish customers are incorrectly tagged (i.e. they are given a value of FALSE instead of TRUE).

Now let’s look at Analyze customers.

# connect to bigquery
con <- dbConnect(
  bigrquery::bigquery(),
  project = "buffer-data"
)

# define sql query
sql <- "
  select distinct
    u.id as user_id
    , u.current_plan
    , i.current_analyze_plan
    , i.is_paying_analyze_customer
  from dbt_buffer.analyze_users u
  left join segment_authentication_service.identifies i
    on i.user_id = u.account_id
  where u.current_plan != 'free'
  and u.current_plan is not null
"
  
# query BQ
analyze_users <- dbGetQuery(con, sql)
# calculate proportion that are tagged
analyze_users %>% 
  count(is_paying_analyze_customer, name = "users") %>% 
  mutate(prop = percent(users / sum(users)))
## # A tibble: 2 x 3
##   is_paying_analyze_customer users prop 
##   <lgl>                      <int> <chr>
## 1 FALSE                         31 6%   
## 2 TRUE                         526 94%

All paying Analyze customers have a value in isPayingAnalyzeCustomer, but 31 (6%) are incorrectly labeled FALSE.

Target Customer Survey

We’ll use the SQL query below to retrieve data related to the target customer survey from BigQuery.

# connect to bigquery
con <- dbConnect(
  bigrquery::bigquery(),
  project = "buffer-data"
)

# define sql query
sql <- "
  select
    i.company_type
    , event
    , count(distinct t.user_id) as users
  from dbt_buffer.segment_tracks t
  left join dbt_buffer.segment_identifies i
    on t.user_id = i.user_id
  where t.original_timestamp > timestamp_add(current_timestamp() , interval -60 day)
  and event in ('post_created', 'report_module_added', 'conversation_closed')
  group by 1, 2
  order by 3 desc
"
  
# query BQ
tags <- dbGetQuery(con, sql)

Let’s see the overall proportion of active users that have responded to the target customer survey.

company_type users percent
NA 153,079 56.416%
no-answer 27,947 10.300%
agency 20,861 7.688%
publisher 14,349 5.288%
online-store 14,119 5.203%
personal 13,630 5.023%
physical-store 10,447 3.850%
b2b 8,008 2.951%
none 4,975 1.834%
saas 3,923 1.446%

We can see that around 56% of users that have triggered a Segment event have not been tagged by the target customer survey.

Publish

Now let’s only look at Publish customers.

company_type users percent
NA 150,316 56.491%
no-answer 27,361 10.283%
agency 20,157 7.575%
publisher 14,166 5.324%
online-store 13,755 5.169%
personal 13,507 5.076%
physical-store 10,212 3.838%
b2b 7,889 2.965%
none 4,939 1.856%
saas 3,787 1.423%

Around 56% of active Publish users don’t have a company type associated with them.

Analyze

Around 52% of active Analyze users do not have a company type associated with them.

company_type users percent
NA 1,525.0 52.586%
agency 543.0 18.724%
no-answer 240.0 8.276%
online-store 142.0 4.897%
physical-store 134.0 4.621%
publisher 109.0 3.759%
personal 83.0 2.862%
b2b 66.0 2.276%
saas 30.0 1.034%
none 28.0 0.966%

Reply

Around 52% of active Reply users do not have a company type associated with them.

company_type users percent
NA 1,238.0 52.70%
no-answer 346.0 14.73%
online-store 222.0 9.45%
agency 161.0 6.85%
saas 106.0 4.51%
physical-store 101.0 4.30%
publisher 74.0 3.15%
b2b 53.0 2.26%
personal 40.0 1.70%
none 8.0 0.34%