In this analysis we’ll explore the results of the Publish and Analyze bundle A/B test, also known as experiment EID19. The experiment was run as an A/B test via our A/Bert framework, which split visitors randomly 50/50 between the control and variation groups. Visitors were enrolled in the experiment once they landed on a page with a CTA promoting the experiment landing page (i.e., a visitor did not have to land on the experiment landing page itself to be enrolled, as the variation treatment included menu items and CTAs prompting users to visit the experiment landing page).
The experiment hypothesis was:
- If we offer an easier, more convenient way for new-to-Buffer users to create a Buffer account and trial Publish & Analyze all at the same time (i.e., create a Buffer account, sign up for Analyze, sign up for Publish, start an Analyze trial, and start a Publish trial all in one workflow), then we will see an increase in the number of users paying for both products.
Given this hypothesis, our success metric was:
- Number of users paying for both products
It is also important to note that we are interested in comparing the overall impact between the two groups, not just the specific acquisition flow within the experiment landing page.
TLDR
The result of the experiment is that there is insufficient evidence to confirm the hypothesis. Though it appears the variation treatment led to more Analyze trials (which had the downstream effect of increasing the MRR value per paying account), the landing page underperformed compared to trials started from other locations, and the number of new users who ended up on a paid subscription to both products showed no statistically significant difference between the two groups. Given that only 3 users who started a trial from the experiment landing page converted to paid subscriptions in both products, it appears the positive results from the variation group were related to increasing the exposure of Analyze during user acquisition.
We recommend pursuing an additional iteration of this experiment, examining both ways to increase click-throughs on the experiment CTAs to the solutions landing page, and ways to improve the performance of the solutions landing page itself via changes to messaging and to which plans are paired for the solution offered.
Data Collection
To analyze the results of this experiment, we will use the following query to retrieve data about users enrolled in the experiment.
# connect to bigquery (DBI provides dbConnect; bigrquery supplies the driver)
library(DBI)
library(bigrquery)

con <- dbConnect(
  bigrquery::bigquery(),
  project = "buffer-data"
)
# define sql query to get experiment enrolled visitors
sql <- "
with enrolled_users as (
select
anonymous_id
, experiment_group
, first_value(timestamp) over (
partition by anonymous_id order by timestamp asc
rows between unbounded preceding and unbounded following) as enrolled_at
from segment_marketing.experiment_viewed
where
first_viewed
and experiment_id = 'eid19_publish_analyze_bundle_ab'
)
select
e.anonymous_id
, e.experiment_group
, e.enrolled_at
, i.user_id as account_id
, c.email
, c.publish_user_id
, c.analyze_user_id
, a.timestamp as account_created_at
, t.product as trial_product
, t.timestamp as trial_started_at
, t.subscription_id as trial_subscription_id
, t.stripe_event_id as stripe_trial_event_id
, t.plan_id as trial_plan_id
, t.cycle as trial_billing_cycle
, t.cta as trial_started_cta
, t.multi_product_bundle_name as trial_multi_product_bundle_name
, s.product as subscription_product
, s.timestamp as subscription_started_at
, s.subscription_id as subscription_id
, s.stripe_event_id as stripe_subscription_event_id
, s.plan_id as subscription_plan_id
, s.cycle as subscritpion_billing_cycle
, s.revenue as subscription_revenue
, s.amount as subscritpion_amount
, s.cta as subscription_started_cta
, s.multi_product_bundle_name as subscription_multi_product_bundle_name
from enrolled_users e
left join segment_login_server.identifies i
on e.anonymous_id = i.anonymous_id
left join dbt_buffer.core_accounts c
on i.user_id = c.id
left join segment_login_server.account_created a
on i.user_id = a.user_id
left join segment_publish_server.trial_started t
on i.user_id = t.user_id
and t.timestamp > e.enrolled_at
left join segment_publish_server.subscription_started s
on i.user_id = s.user_id
and s.timestamp > e.enrolled_at
group by 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26
"
# query BQ
# users <- dbGetQuery(con, sql)
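Note that the dbGetQuery call itself is commented out above. For reproducibility, a minimal pattern one could use to pull the data once and cache it locally (the RDS file name is just an illustration) is:
# run the query once, then cache the result locally so the
# notebook can be re-knit without querying BigQuery again
if (!file.exists("eid19_users.rds")) {
  users <- dbGetQuery(con, sql)
  saveRDS(users, "eid19_users.rds")
}
users <- readRDS("eid19_users.rds")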
Exploratory Analysis
Let’s start by reviewing a few summary statistics from our data.
skim(users)
Name | users |
---|---|
Number of rows | 284326 |
Number of columns | 26 |
Column type frequency: | |
character | 20 |
numeric | 2 |
POSIXct | 4 |
Group variables | None |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
---|---|---|---|---|---|---|---|
anonymous_id | 0 | 1.00 | 36 | 36 | 0 | 282262 | 0 |
experiment_group | 0 | 1.00 | 7 | 9 | 0 | 2 | 0 |
account_id | 259040 | 0.09 | 24 | 24 | 0 | 23855 | 0 |
email | 259123 | 0.09 | 8 | 50 | 0 | 23673 | 0 |
publish_user_id | 259218 | 0.09 | 24 | 24 | 0 | 23677 | 0 |
analyze_user_id | 281767 | 0.01 | 24 | 24 | 0 | 1411 | 0 |
trial_product | 268322 | 0.06 | 7 | 7 | 0 | 2 | 0 |
trial_subscription_id | 268322 | 0.06 | 18 | 18 | 0 | 15843 | 0 |
stripe_trial_event_id | 268322 | 0.06 | 18 | 18 | 0 | 15843 | 0 |
trial_plan_id | 268322 | 0.06 | 13 | 44 | 0 | 14 | 0 |
trial_billing_cycle | 268322 | 0.06 | 4 | 5 | 0 | 2 | 0 |
trial_started_cta | 268440 | 0.06 | 28 | 70 | 0 | 52 | 0 |
trial_multi_product_bundle_name | 284123 | 0.00 | 22 | 22 | 0 | 1 | 0 |
subscription_product | 282644 | 0.01 | 7 | 7 | 0 | 2 | 0 |
subscription_id | 282644 | 0.01 | 18 | 18 | 0 | 1397 | 0 |
stripe_subscription_event_id | 282644 | 0.01 | 18 | 18 | 0 | 1398 | 0 |
subscription_plan_id | 282644 | 0.01 | 13 | 44 | 0 | 14 | 0 |
subscription_billing_cycle | 282644 | 0.01 | 4 | 5 | 0 | 2 | 0 |
subscription_started_cta | 282802 | 0.01 | 30 | 70 | 0 | 40 | 0 |
subscription_multi_product_bundle_name | 284314 | 0.00 | 22 | 22 | 0 | 1 | 0 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
---|---|---|---|---|---|---|---|---|---|---|
subscription_revenue | 283212 | 0.00 | 35.19 | 33.49 | 12 | 15 | 15 | 50 | 399 | ▇▁▁▁▁ |
subscription_amount | 282644 | 0.01 | 83.31 | 142.14 | 15 | 15 | 50 | 144 | 1010 | ▇▁▁▁▁ |
Variable type: POSIXct
skim_variable | n_missing | complete_rate | min | max | median | n_unique |
---|---|---|---|---|---|---|
enrolled_at | 0 | 1.00 | 2019-10-14 19:21:46 | 2019-12-06 22:17:51 | 2019-10-22 03:30:14 | 282226 |
account_created_at | 259040 | 0.09 | 2019-04-23 17:03:04 | 2019-12-11 13:42:29 | 2019-10-18 09:57:29 | 23855 |
trial_started_at | 268322 | 0.06 | 2019-10-15 06:52:41 | 2019-12-11 15:00:59 | 2019-10-24 08:58:03 | 15845 |
subscription_started_at | 282644 | 0.01 | 2019-10-15 10:02:53 | 2019-12-11 15:05:22 | 2019-11-04 22:58:53 | 1399 |
Let’s start with a quick validation of the visitor count split between the two experiment groups.
users %>%
  group_by(experiment_group) %>%
  summarise(
    visitors = n_distinct(anonymous_id),
    accounts = n_distinct(account_id),
    trials = n_distinct(trial_subscription_id),
    subscriptions = n_distinct(subscription_id)
  ) %>%
  mutate(visitor_split_perct = visitors / sum(visitors)) %>%
  kable() %>%
  kable_styling()
experiment_group | visitors | accounts | trials | subscriptions | visitor_split_perct |
---|---|---|---|---|---|
control | 141129 | 12148 | 8081 | 690 | 0.4999929 |
variant_1 | 141133 | 11709 | 7764 | 709 | 0.5000071 |
Great, there is a total of 282,262 unique visitors enrolled in the experiment, and the split between the two experiment groups is within a few visitors of exactly 50/50. This confirms that our experiment framework correctly split enrollments for the experiment.
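As an extra sanity check, we can formally test whether the observed split deviates from a fair 50/50 allocation, using a one-sample proportion test on the visitor counts from the table above:
# one-sample test: is the control group's share of visitors consistent with 0.5?
prop.test(x = 141129, n = 141129 + 141133, p = 0.5)
# the p-value is far above 0.05, consistent with a fair 50/50 split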
res <- prop.test(x = c(12080, 11656), n = c(141123, 141126), alternative = "two.sided")
res
##
## 2-sample test for equality of proportions with continuity correction
##
## data: c(12080, 11656) out of c(141123, 141126)
## X-squared = 8.2403, df = 1, p-value = 0.004097
## alternative hypothesis: two.sided
## 95 percent confidence interval:
## 0.0009514326 0.0050610215
## sample estimates:
## prop 1 prop 2
## 0.08559909 0.08259286
explain(res)
This was a two-sample proportion test of the null hypothesis that the true population proportions are equal. Using a significance level of 0.05, we reject the null hypothesis, and conclude that two population proportions are not equal. The observed difference in proportions is -0.00300622704010574. The observed proportion for the first group is 0.0855990873209895 (12,080 events out of a total sample size of 141,123). For the second group, the observed proportion is 0.0825928602808838 (11,656, out of a total sample size of 141,126).
The confidence interval for the true difference in population proportions is (0.0009514, 0.0050610). This interval will contain the true difference in population proportions 95 times out of 100.
The p-value for this test is 0.0040971. This, formally, is defined as the probability – if the null hypothesis is true – of observing a difference in sample proportions that is as or more extreme than the difference in sample proportions from this data set. In this case, this is the probability – if the true population proportions are equal – of observing a difference in sample proportions that is greater than 0.00300622704010574 or less than -0.00300622704010574.
We can see that the number of visitors who ended up creating a Buffer account (i.e., signed up for their first Buffer product) was a few hundred higher in the control group. Using a quick proportion test, we can see that this difference in the proportion of enrolled visitors that created a Buffer account is statistically significant, with a p-value of 0.004 (less than the generally accepted 0.05 threshold).
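For intuition, we can reproduce the test statistic behind prop.test by hand (a sketch without the continuity correction that prop.test applies by default, so the p-value differs very slightly):
# two-proportion z-test computed manually from the counts above
n1 <- 141123; n2 <- 141126   # enrolled visitors per group
x1 <- 12080;  x2 <- 11656    # visitors that created an account
p_pooled <- (x1 + x2) / (n1 + n2)
se <- sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z <- (x1 / n1 - x2 / n2) / se
2 * pnorm(-abs(z))  # two-sided p-value, ~0.004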
Next, we will calculate how many users from each experiment group started a trial from each product.
users %>%
  mutate(has_publish_trial = (trial_product == "publish")) %>%
  group_by(experiment_group, has_publish_trial) %>%
  summarise(users = n_distinct(account_id)) %>%
  ungroup() %>%
  filter(has_publish_trial) %>%
  group_by(experiment_group) %>%
  summarise(users_with_publish_trials = users) %>%
  kable() %>%
  kable_styling()
experiment_group | users_with_publish_trials |
---|---|
control | 7440 |
variant_1 | 7012 |
There were 7,372 users in the control group who started a Publish trial, and 6,954 in the variation group. Just like above, we should also run a proportion test here.
res <- prop.test(x = c(7372, 6954), n = c(12080, 11656), alternative = "two.sided")
res
##
## 2-sample test for equality of proportions with continuity correction
##
## data: c(7372, 6954) out of c(12080, 11656)
## X-squared = 4.5707, df = 1, p-value = 0.03252
## alternative hypothesis: two.sided
## 95 percent confidence interval:
## 0.001130084 0.026194502
## sample estimates:
## prop 1 prop 2
## 0.6102649 0.5966026
explain(res)
This was a two-sample proportion test of the null hypothesis that the true population proportions are equal. Using a significance level of 0.05, we reject the null hypothesis, and conclude that two population proportions are not equal. The observed difference in proportions is -0.0136622925634184. The observed proportion for the first group is 0.610264900662252 (7,372 events out of a total sample size of 12,080). For the second group, the observed proportion is 0.596602608098833 (6,954, out of a total sample size of 11,656).
The confidence interval for the true difference in population proportions is (0.0011301, 0.0261945). This interval will contain the true difference in population proportions 95 times out of 100.
The p-value for this test is 0.0325235. This, formally, is defined as the probability – if the null hypothesis is true – of observing a difference in sample proportions that is as or more extreme than the difference in sample proportions from this data set. In this case, this is the probability – if the true population proportions are equal – of observing a difference in sample proportions that is greater than 0.0136622925634184 or less than -0.0136622925634184.
We can see that the difference in the proportion of accounts that started a Publish trial is also statistically significant, with a p-value of 0.03.
users %>%
  mutate(has_analyze_trial = (trial_product == "analyze")) %>%
  group_by(experiment_group, has_analyze_trial) %>%
  summarise(users = n_distinct(account_id)) %>%
  ungroup() %>%
  filter(has_analyze_trial) %>%
  group_by(experiment_group) %>%
  summarise(users_with_analyze_trials = users) %>%
  kable() %>%
  kable_styling()
experiment_group | users_with_analyze_trials |
---|---|
control | 466 |
variant_1 | 588 |
There were 449 users in the control group who started an Analyze trial, and 571 in the variation group. Just like above, we should also run a proportion test here.
res <- prop.test(x = c(449, 571), n = c(12080, 11656), alternative = "two.sided")
res
##
## 2-sample test for equality of proportions with continuity correction
##
## data: c(449, 571) out of c(12080, 11656)
## X-squared = 19.862, df = 1, p-value = 8.324e-06
## alternative hypothesis: two.sided
## 95 percent confidence interval:
## -0.017073586 -0.006563957
## sample estimates:
## prop 1 prop 2
## 0.03716887 0.04898765
explain(res)
This was a two-sample proportion test of the null hypothesis that the true population proportions are equal. Using a significance level of 0.05, we reject the null hypothesis, and conclude that two population proportions are not equal. The observed difference in proportions is 0.0118187716754467. The observed proportion for the first group is 0.0371688741721854 (449 events out of a total sample size of 12,080). For the second group, the observed proportion is 0.0489876458476321 (571, out of a total sample size of 11,656).
The confidence interval for the true difference in population proportions is (-0.0170736, -0.006564). This interval will contain the true difference in population proportions 95 times out of 100.
The p-value for this test is 8.32 × 10^-6. This, formally, is defined as the probability – if the null hypothesis is true – of observing a difference in sample proportions that is as or more extreme than the difference in sample proportions from this data set. In this case, this is the probability – if the true population proportions are equal – of observing a difference in sample proportions that is greater than 0.0118187716754467 or less than -0.0118187716754467.
We can see that the difference in the proportion of accounts that started an Analyze trial is also statistically significant, with a p-value of 8.32 × 10^-6 (a very small number, far below the generally accepted 0.05 threshold). This means we can say with confidence that the observed difference in the number of Analyze trials started between the two experiment groups is very unlikely to be the result of random variance.
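To put this difference in practical terms, we can express it as a lift in the Analyze trial rate (a quick sketch using the counts from the test above):
# absolute and relative lift in the Analyze trial rate, variation vs control
p_control <- 449 / 12080
p_variant <- 571 / 11656
p_variant - p_control      # absolute lift: ~1.2 percentage points
p_variant / p_control - 1  # relative lift: ~32%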
Next, we will calculate how many users from each experiment group started a paid subscription from each product.
users %>%
  mutate(has_publish_sub = (subscription_product == "publish")) %>%
  group_by(experiment_group, has_publish_sub) %>%
  summarise(users = n_distinct(account_id)) %>%
  ungroup() %>%
  filter(has_publish_sub) %>%
  group_by(experiment_group) %>%
  summarise(paying_publish_users = users) %>%
  kable() %>%
  kable_styling()
experiment_group | paying_publish_users |
---|---|
control | 640 |
variant_1 | 639 |
There were 618 users in the control group who started a paid Publish subscription, and 620 in the variation group. Just like above, we should run proportion tests here, both for the account-to-paid-subscription proportion and for the trial-to-paid-subscription proportion.
res <- prop.test(x = c(618, 620), n = c(12080, 11656), alternative = "two.sided")
res
##
## 2-sample test for equality of proportions with continuity correction
##
## data: c(618, 620) out of c(12080, 11656)
## X-squared = 0.45546, df = 1, p-value = 0.4998
## alternative hypothesis: two.sided
## 95 percent confidence interval:
## -0.007776708 0.003711610
## sample estimates:
## prop 1 prop 2
## 0.05115894 0.05319149
explain(res)
This was a two-sample proportion test of the null hypothesis that the true population proportions are equal. Using a significance level of 0.05, we do not reject the null hypothesis, and cannot conclude that two population proportions are different from one another. The observed difference in proportions is 0.00203254896435114. The observed proportion for the first group is 0.051158940397351 (618 events out of a total sample size of 12,080). For the second group, the observed proportion is 0.0531914893617021 (620, out of a total sample size of 11,656).
The confidence interval for the true difference in population proportions is (-0.0077767, 0.0037116). This interval will contain the true difference in population proportions 95 times out of 100.
The p-value for this test is 0.4997515. This, formally, is defined as the probability – if the null hypothesis is true – of observing a difference in sample proportions that is as or more extreme than the difference in sample proportions from this data set. In this case, this is the probability – if the true population proportions are equal – of observing a difference in sample proportions that is greater than 0.00203254896435114 or less than -0.00203254896435114.
There is no statistically significant difference between the two groups in the proportion of accounts that started a paid Publish subscription, as the p-value is 0.499.
res <- prop.test(x = c(618, 620), n = c(7372, 6954), alternative = "two.sided")
res
##
## 2-sample test for equality of proportions with continuity correction
##
## data: c(618, 620) out of c(7372, 6954)
## X-squared = 1.2195, df = 1, p-value = 0.2695
## alternative hypothesis: two.sided
## 95 percent confidence interval:
## -0.014679446 0.004026229
## sample estimates:
## prop 1 prop 2
## 0.08383071 0.08915732
explain(res)
This was a two-sample proportion test of the null hypothesis that the true population proportions are equal. Using a significance level of 0.05, we do not reject the null hypothesis, and cannot conclude that two population proportions are different from one another. The observed difference in proportions is 0.00532660873071643. The observed proportion for the first group is 0.0838307107976126 (618 events out of a total sample size of 7,372). For the second group, the observed proportion is 0.089157319528329 (620, out of a total sample size of 6,954).
The confidence interval for the true difference in population proportions is (-0.0146794, 0.0040262). This interval will contain the true difference in population proportions 95 times out of 100.
The p-value for this test is 0.2694686. This, formally, is defined as the probability – if the null hypothesis is true – of observing a difference in sample proportions that is as or more extreme than the difference in sample proportions from this data set. In this case, this is the probability – if the true population proportions are equal – of observing a difference in sample proportions that is greater than 0.00532660873071643 or less than -0.00532660873071643.
Likewise, there is no statistically significant difference between the two groups in the proportion of Publish trials that converted to a paid Publish subscription, as the p-value is 0.269.
Next, let’s look into the number of Analyze paid subscriptions between the two experiment groups.
users %>%
  mutate(has_analyze_sub = (subscription_product == "analyze")) %>%
  group_by(experiment_group, has_analyze_sub) %>%
  summarise(users = n_distinct(account_id)) %>%
  ungroup() %>%
  filter(has_analyze_sub) %>%
  group_by(experiment_group) %>%
  summarise(paying_analyze_users = users) %>%
  kable() %>%
  kable_styling()
experiment_group | paying_analyze_users |
---|---|
control | 47 |
variant_1 | 59 |
The number of paying users for Analyze was 44 in the control group and 54 in the variation group. Just like above, we should also run a couple of proportion tests here.
explain(prop.test(x = c(44, 54), n = c(12080, 11656), alternative = "two.sided"))
This was a two-sample proportion test of the null hypothesis that the true population proportions are equal. Using a significance level of 0.05, we do not reject the null hypothesis, and cannot conclude that two population proportions are different from one another. The observed difference in proportions is 0.000990423031994436. The observed proportion for the first group is 0.00364238410596027 (44 events out of a total sample size of 12,080). For the second group, the observed proportion is 0.0046328071379547 (54, out of a total sample size of 11,656).
The confidence interval for the true difference in population proportions is (-0.0027099, 0.0007290). This interval will contain the true difference in population proportions 95 times out of 100.
The p-value for this test is 0.2764203. This, formally, is defined as the probability – if the null hypothesis is true – of observing a difference in sample proportions that is as or more extreme than the difference in sample proportions from this data set. In this case, this is the probability – if the true population proportions are equal – of observing a difference in sample proportions that is greater than 0.000990423031994436 or less than -0.000990423031994436.
There is no statistically significant difference between the two groups in the proportion of users that started a paid Analyze subscription, as the p-value is 0.276.
res <- prop.test(x = c(44, 54), n = c(449, 571), alternative = "two.sided")
res
##
## 2-sample test for equality of proportions with continuity correction
##
## data: c(44, 54) out of c(449, 571)
## X-squared = 0.0059629, df = 1, p-value = 0.9384
## alternative hypothesis: two.sided
## 95 percent confidence interval:
## -0.03506551 0.04191475
## sample estimates:
## prop 1 prop 2
## 0.09799555 0.09457093
explain(res)
This was a two-sample proportion test of the null hypothesis that the true population proportions are equal. Using a significance level of 0.05, we do not reject the null hypothesis, and cannot conclude that two population proportions are different from one another. The observed difference in proportions is -0.00342461746086847. The observed proportion for the first group is 0.0979955456570156 (44 events out of a total sample size of 449). For the second group, the observed proportion is 0.0945709281961471 (54, out of a total sample size of 571).
The confidence interval for the true difference in population proportions is (-0.0350655, 0.0419147). This interval will contain the true difference in population proportions 95 times out of 100.
The p-value for this test is 0.9384488. This, formally, is defined as the probability – if the null hypothesis is true – of observing a difference in sample proportions that is as or more extreme than the difference in sample proportions from this data set. In this case, this is the probability – if the true population proportions are equal – of observing a difference in sample proportions that is greater than 0.00342461746086847 or less than -0.00342461746086847.
There is no statistically significant difference between the two groups in the proportion of Analyze trials that converted to a paid Analyze subscription, as the p-value is 0.938. This means the difference in paid Analyze subscriptions is the result of more Analyze trials being started in the variation group, not of a difference in trial-to-paid conversion rate between the two groups.
Next, we will look at the number of accounts that started paid subscriptions for both Publish and Analyze in both experiment groups.
users %>%
  filter(!is.na(subscription_product)) %>%
  group_by(account_id, experiment_group) %>%
  summarise(products = n_distinct(subscription_product)) %>%
  ungroup() %>%
  filter(products > 1) %>%
  group_by(experiment_group, products) %>%
  summarise(users = n_distinct(account_id)) %>%
  kable() %>%
  kable_styling()
experiment_group | products | users |
---|---|---|
control | 2 | 32 |
variant_1 | 2 | 44 |
res <- prop.test(x = c(30, 40), n = c(12080, 11656), alternative = "two.sided")
res
##
## 2-sample test for equality of proportions with continuity correction
##
## data: c(30, 40) out of c(12080, 11656)
## X-squared = 1.5059, df = 1, p-value = 0.2198
## alternative hypothesis: two.sided
## 95 percent confidence interval:
## -0.0024163450 0.0005198145
## sample estimates:
## prop 1 prop 2
## 0.002483444 0.003431709
explain(res)
This was a two-sample proportion test of the null hypothesis that the true population proportions are equal. Using a significance level of 0.05, we do not reject the null hypothesis, and cannot conclude that two population proportions are different from one another. The observed difference in proportions is 0.000948265282468285. The observed proportion for the first group is 0.00248344370860927 (30 events out of a total sample size of 12,080). For the second group, the observed proportion is 0.00343170899107756 (40, out of a total sample size of 11,656).
The confidence interval for the true difference in population proportions is (-0.0024163, 0.0005198). This interval will contain the true difference in population proportions 95 times out of 100.
The p-value for this test is 0.2197602. This, formally, is defined as the probability – if the null hypothesis is true – of observing a difference in sample proportions that is as or more extreme than the difference in sample proportions from this data set. In this case, this is the probability – if the true population proportions are equal – of observing a difference in sample proportions that is greater than 0.000948265282468285 or less than -0.000948265282468285.
There is no statistically significant difference between the proportions of accounts in each experiment group that ended up starting both a paid Analyze subscription and a paid Publish subscription, as the p-value is 0.219. Put another way, if there were truly no difference between the groups, we would expect to observe a difference at least this large roughly 1 time in 5.
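Part of the story here may be statistical power: the baseline rate of accounts paying for both products is tiny, so even a meaningful lift is hard to detect with roughly 12,000 accounts per group. A rough sketch with base R’s power.prop.test shows the sample size needed to reliably detect a difference of the size we observed:
# accounts per group needed to detect 0.248% -> 0.343% with 80% power
power.prop.test(p1 = 30 / 12080, p2 = 40 / 11656,
                sig.level = 0.05, power = 0.80)
# n comes out on the order of 50,000 accounts per group,
# far more than the ~12,000 we had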
To ensure we have a full picture of the variation group’s behavior in converting to paid for both products, let’s also look at how many trials of both products were started from the experiment landing page, and what those trials’ conversion rate to paid was.
users %>%
  mutate(has_multiproduct_trial = !is.na(trial_multi_product_bundle_name)) %>%
  group_by(has_multiproduct_trial) %>%
  summarise(users = n_distinct(account_id)) %>%
  kable() %>%
  kable_styling()
has_multiproduct_trial | users |
---|---|
FALSE | 23776 |
TRUE | 115 |
users %>%
  mutate(has_multiproduct_trial = !is.na(trial_multi_product_bundle_name)) %>%
  filter(has_multiproduct_trial) %>%
  group_by(account_id) %>%
  summarise(
    products = n_distinct(subscription_product),
    paid_subscriptions = n_distinct(subscription_id)
  ) %>%
  ungroup() %>%
  filter(products > 1) %>%
  summarise(users = n_distinct(account_id)) %>%
  kable() %>%
  kable_styling()
users |
---|
3 |
3 of the 115 bundle trials converted to paid subscriptions for both products, a conversion rate of 2.6%. For comparison, the trial-to-paid conversion rate for Publish-only trials across both groups was roughly 8-9%, and for Analyze-only trials roughly 10%.
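With only 3 conversions out of 115 bundle trials, the 2.6% figure carries a lot of uncertainty. An exact binomial confidence interval (a quick sketch) makes this concrete:
# exact 95% CI for the bundle trial to dual-paid conversion rate
binom.test(x = 3, n = 115)$conf.int
# roughly 0.5% to 7.5%, wide enough that the true rate could plausibly
# overlap the single-product benchmarks quoted above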
Finally, let’s also look at the MRR value of all converted trials per experiment group, to see if there was any overall difference in value between the two groups.
users %>%
  filter(!is.na(subscription_id)) %>%
  mutate(mrr_value = if_else(subscription_billing_cycle == "year",
                             subscription_amount / 12,
                             subscription_amount)) %>%
  group_by(experiment_group) %>%
  summarise(
    paying_user_count = n_distinct(account_id),
    total_mrr_value = round(sum(mrr_value), 2),
    mrr_value_per_account = round(total_mrr_value / paying_user_count, 2)
  ) %>%
  kable() %>%
  kable_styling()
experiment_group | paying_user_count | total_mrr_value | mrr_value_per_account |
---|---|---|---|
control | 655 | 25232.25 | 38.52 |
variant_1 | 654 | 27187.50 | 41.57 |
Given that both groups ended up with almost the same number of paying customers, the difference in MRR value per account between the two groups indicates that the statistically significant lift in Analyze trial starts led to more overall revenue. This could imply that the realized benefit of this experiment was greater exposure to Analyze, not a more convenient way to start trials for both products.
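If we wanted to formally test whether the per-account MRR difference is itself statistically meaningful, one option would be a Welch two-sample t-test on per-account MRR values (a sketch, assuming the columns as named in the query above):
# compare per-account MRR value between experiment groups
paying <- users %>%
  filter(!is.na(subscription_id)) %>%
  mutate(mrr_value = if_else(subscription_billing_cycle == "year",
                             subscription_amount / 12,
                             subscription_amount)) %>%
  group_by(experiment_group, account_id) %>%
  summarise(account_mrr = sum(mrr_value), .groups = "drop")

t.test(account_mrr ~ experiment_group, data = paying)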
Final Results
Given the above observations, the result of the experiment is that there is insufficient evidence to confirm the hypothesis. Though it appears the variation treatment led to more Analyze trials, and thus a slightly higher MRR value per paying account, the landing page underperformed compared to trials started from other locations, and the number of new users who ended up on a paid subscription to both products showed no statistically significant difference between the two groups.
We recommend pursuing an iteration of this experiment, examining both ways to increase click-throughs on the experiment CTAs to the solutions landing page, and ways to improve the performance of the solutions landing page itself via changes to messaging and to which plans are paired for the solution offered.