
Publish & Analyze Bundle A/B Test (EID19)

In this analysis we’ll explore the results of the Publish and Analyze bundle A/B test, also known as experiment EID19. The experiment was run as an A/B test via our A/Bert framework, which split visitors randomly 50/50 between the control and variation groups. Visitors were enrolled in the experiment once they landed on a page with a CTA promoting the experiment landing page (i.e., a visitor did not have to land on the experiment landing page itself to be enrolled, as the variation treatment included menu items and CTAs prompting users to go to the experiment landing page).

The experiment hypothesis was:

  • If we offer an easier, more convenient way for new-to-Buffer users to create a Buffer account and trial Publish & Analyze all at the same time (i.e., create a Buffer account, sign up for Analyze, sign up for Publish, start an Analyze trial, and start a Publish trial all in one workflow), then we will see an increase in the number of users paying for both products.

Given this hypothesis, our success metric was:

  • Number of users paying for both products

It is also important to note that we are interested in comparing the overall impact between the two groups, not just the specific acquisition flow within the experiment landing page.

TLDR

The result of the experiment is that there is insufficient evidence to confirm the hypothesis. Though it appears the variation treatment led to more Analyze trials (which had a downstream effect of increasing the MRR value per paying account), the landing page underperformed compared to trials started from other locations, and there was no statistically significant difference between the two groups in the number of new users that ended up on paid subscriptions to both products. Given that only 3 users started a trial from the experiment landing page and converted to paid subscriptions for both products, it appears the positive results in the variation group were driven by increased exposure to Analyze during user acquisition.

We recommend pursuing an additional iteration of this experiment, examining both ways to increase click-throughs on the experiment CTAs to the solutions landing page, and ways to improve the performance of the solutions landing page itself via changes to messaging and to which plans are paired for the solution offered.

Data Collection

To analyze the results of this experiment, we will use the following query to retrieve data about users enrolled in the experiment.

# connect to bigquery
con <- dbConnect(
  bigrquery::bigquery(),
  project = "buffer-data"
)
# define sql query to get experiment enrolled visitors
sql <- "
  with enrolled_users as (
    select
      anonymous_id
      , experiment_group
      , first_value(timestamp) over (
      partition by anonymous_id order by timestamp asc
      rows between unbounded preceding and unbounded following) as enrolled_at
    from segment_marketing.experiment_viewed
    where 
      first_viewed 
      and experiment_id = 'eid19_publish_analyze_bundle_ab'
  )
  select
    e.anonymous_id
    , e.experiment_group
    , e.enrolled_at
    , i.user_id as account_id
    , c.email
    , c.publish_user_id
    , c.analyze_user_id
    , a.timestamp as account_created_at
    , t.product as trial_product
    , t.timestamp as trial_started_at
    , t.subscription_id as trial_subscription_id
    , t.stripe_event_id as stripe_trial_event_id
    , t.plan_id as trial_plan_id
    , t.cycle as trial_billing_cycle
    , t.cta as trial_started_cta
    , t.multi_product_bundle_name as trial_multi_product_bundle_name
    , s.product as subscription_product
    , s.timestamp as subscription_started_at
    , s.subscription_id as subscription_id
    , s.stripe_event_id as stripe_subscription_event_id
    , s.plan_id as subscription_plan_id
    , s.cycle as subscritpion_billing_cycle
    , s.revenue as subscription_revenue
    , s.amount as subscritpion_amount
    , s.cta as subscription_started_cta
    , s.multi_product_bundle_name as subscription_multi_product_bundle_name
  from enrolled_users e
    left join segment_login_server.identifies i
      on e.anonymous_id = i.anonymous_id
    left join dbt_buffer.core_accounts c
      on i.user_id = c.id
    left join segment_login_server.account_created a
      on i.user_id = a.user_id
    left join segment_publish_server.trial_started t
      on i.user_id = t.user_id
      and t.timestamp > e.enrolled_at
    left join segment_publish_server.subscription_started s
      on i.user_id = s.user_id
      and s.timestamp > e.enrolled_at
  group by 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26
"
  
# query BQ and store the results in a data frame
users <- dbGetQuery(con, sql)

Exploratory Analysis

Let’s start by reviewing a few of the summary statistics from our data.

skim(users)
## Skim summary statistics
##  n obs: 284224 
##  n variables: 26 
## 
## ── Variable type:character ────────────────────────────────────
##                                variable missing complete      n min max
##                              account_id  259122    25102 284224  24  24
##                         analyze_user_id  281756     2468 284224  24  24
##                            anonymous_id       0   284224 284224  36  36
##                                   email  259217    25007 284224   8  50
##                        experiment_group       0   284224 284224   7   9
##                         publish_user_id  259310    24914 284224  24  24
##            stripe_subscription_event_id  282627     1597 284224  18  18
##                   stripe_trial_event_id  268416    15808 284224  18  18
##                         subscription_id  282627     1597 284224  18  18
##  subscription_multi_product_bundle_name  284212       12 284224  22  22
##                    subscription_plan_id  282627     1597 284224  13  44
##                    subscription_product  282627     1597 284224   7   7
##                subscription_started_cta  282777     1447 284224  30  70
##              subscritpion_billing_cycle  282627     1597 284224   4   5
##                     trial_billing_cycle  268416    15808 284224   4   5
##         trial_multi_product_bundle_name  284021      203 284224  22  22
##                           trial_plan_id  268416    15808 284224  13  44
##                           trial_product  268416    15808 284224   7   7
##                       trial_started_cta  268531    15693 284224  28  70
##                   trial_subscription_id  268416    15808 284224  18  18
##  empty n_unique
##      0    23734
##      0     1372
##      0   282249
##      0    23546
##      0        2
##      0    23550
##      0     1347
##      0    15662
##      0     1346
##      0        1
##      0       14
##      0        2
##      0       40
##      0        2
##      0        2
##      0        1
##      0       13
##      0        2
##      0       52
##      0    15662
## 
## ── Variable type:integer ──────────────────────────────────────
##              variable missing complete      n  mean     sd p0 p25 p50 p75
##  subscription_revenue  283195     1029 284224 35.64  34.13 12  15  15  50
##   subscritpion_amount  282627     1597 284224 83.09 143    15  15  35  99
##  p100     hist
##   399 ▇▂▁▁▁▁▁▁
##  1010 ▇▂▁▁▁▁▁▁
## 
## ── Variable type:POSIXct ──────────────────────────────────────
##                 variable missing complete      n        min        max
##       account_created_at  259122    25102 284224 2019-04-23 2019-12-04
##              enrolled_at       0   284224 284224 2019-10-14 2019-11-29
##  subscription_started_at  282627     1597 284224 2019-10-15 2019-12-04
##         trial_started_at  268416    15808 284224 2019-10-15 2019-12-04
##      median n_unique
##  2019-10-18    23734
##  2019-10-22   282213
##  2019-11-04     1348
##  2019-10-24    15664

Let’s start with a quick validation of the visitor count split between the two experiment groups.

users %>% 
  group_by(experiment_group) %>% 
  summarise(visitors = n_distinct(anonymous_id), accounts = n_distinct(account_id), trials = n_distinct(trial_subscription_id), subscriptions = n_distinct(subscription_id)) %>% 
  mutate(visitor_split_perct = visitors / sum(visitors)) %>%
  kable() %>% 
  kable_styling()
experiment_group visitors accounts trials subscriptions visitor_split_perct
control 141123 12080 7987 665 0.4999947
variant_1 141126 11656 7677 683 0.5000053

Great, there are 282,249 unique visitors enrolled in the experiment in total, and the split between the two experiment groups is within a few visitors of exactly 50/50. This confirms that our experiment framework correctly split enrollments for the experiment.
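As an extra sanity check on the randomization itself, we can run a one-sample proportion test of the control group’s share of enrollments against the expected 50% split (a quick sketch using the counts from the table above).

prop.test(x = 141123, n = 282249, p = 0.5, alternative = "two.sided")

The p-value here is very close to 1, so there is no evidence of a sample ratio mismatch in the enrollment split.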

res <- prop.test(x = c(12080, 11656), n = c(141123, 141126), alternative = "two.sided")
res
## 
##  2-sample test for equality of proportions with continuity
##  correction
## 
## data:  c(12080, 11656) out of c(141123, 141126)
## X-squared = 8.2403, df = 1, p-value = 0.004097
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  0.0009514326 0.0050610215
## sample estimates:
##     prop 1     prop 2 
## 0.08559909 0.08259286
explain(res)

This was a two-sample proportion test of the null hypothesis that the true population proportions are equal. Using a significance level of 0.05, we reject the null hypothesis, and conclude that two population proportions are not equal. The observed difference in proportions is -0.00300622704010574. The observed proportion for the first group is 0.0855990873209895 (12,080 events out of a total sample size of 141,123). For the second group, the observed proportion is 0.0825928602808838 (11,656, out of a total sample size of 141,126).

The confidence interval for the true difference in population proportions is (0.0009514, 0.0050610). This interval will contain the true difference in population proportions 95 times out of 100.

The p-value for this test is 0.0040971. This, formally, is defined as the probability – if the null hypothesis is true – of observing a difference in sample proportions that is as or more extreme than the difference in sample proportions from this data set. In this case, this is the probability – if the true population proportions are equal – of observing a difference in sample proportions that is greater than 0.00300622704010574 or less than -0.00300622704010574.

We can see that the number of visitors that ended up creating a Buffer account (ie, signed up for their first Buffer product), was a few hundred higher in the control group. Using a quick proportion test we can see that this difference in proportion of enrolled visitors that created a Buffer account is statistically significant, with a p-value of 0.004 (which is less than the generally accepted 0.05 threshold).
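To put that difference in context, we can compute the relative change in signup rate implied by the counts above (a quick sketch using the table values).

p_control <- 12080 / 141123
p_variant <- 11656 / 141126
(p_variant - p_control) / p_control

This works out to roughly a 3.5% relative decline in account creation for the variation group.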

Next, we will calculate how many users from each experiment group started a trial from each product.

users %>% 
  mutate(has_publish_trial = trial_product == "publish") %>% 
  group_by(experiment_group, has_publish_trial) %>% 
  summarise(users = n_distinct(account_id)) %>% 
  ungroup() %>% 
  filter(has_publish_trial) %>% 
  group_by(experiment_group) %>% 
  summarise(users_with_publish_trials = users) %>%
  kable() %>% 
  kable_styling()
experiment_group users_with_publish_trials
control 7372
variant_1 6954

There were 7,372 users in the control group that started a Publish trial, and 6,954 in the variation group. Just like above, we should also run a proportion test here.

res <- prop.test(x = c(7372, 6954), n = c(12080, 11656), alternative = "two.sided")
res
## 
##  2-sample test for equality of proportions with continuity
##  correction
## 
## data:  c(7372, 6954) out of c(12080, 11656)
## X-squared = 4.5707, df = 1, p-value = 0.03252
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  0.001130084 0.026194502
## sample estimates:
##    prop 1    prop 2 
## 0.6102649 0.5966026
explain(res)

This was a two-sample proportion test of the null hypothesis that the true population proportions are equal. Using a significance level of 0.05, we reject the null hypothesis, and conclude that two population proportions are not equal. The observed difference in proportions is -0.0136622925634184. The observed proportion for the first group is 0.610264900662252 (7,372 events out of a total sample size of 12,080). For the second group, the observed proportion is 0.596602608098833 (6,954, out of a total sample size of 11,656).

The confidence interval for the true difference in population proportions is (0.0011301, 0.0261945). This interval will contain the true difference in population proportions 95 times out of 100.

The p-value for this test is 0.0325235. This, formally, is defined as the probability – if the null hypothesis is true – of observing a difference in sample proportions that is as or more extreme than the difference in sample proportions from this data set. In this case, this is the probability – if the true population proportions are equal – of observing a difference in sample proportions that is greater than 0.0136622925634184 or less than -0.0136622925634184.

We can see that the difference in proportion of accounts that started a Publish trial is also statistically significant, with a p-value of 0.03.

users %>% 
  mutate(has_analyze_trial = (trial_product == "analyze")) %>% 
  group_by(experiment_group, has_analyze_trial) %>% 
  summarise(users = n_distinct(account_id)) %>% 
  ungroup() %>% 
  filter(has_analyze_trial) %>% 
  group_by(experiment_group) %>% 
  summarise(users_with_analyze_trials = users) %>%
  kable() %>% 
  kable_styling()
experiment_group users_with_analyze_trials
control 449
variant_1 571

There were 449 users in the control group that started an Analyze trial, and 571 in the variation group. Just like above, we should also run a proportion test here.

res <- prop.test(x = c(449, 571), n = c(12080, 11656), alternative = "two.sided")
res
## 
##  2-sample test for equality of proportions with continuity
##  correction
## 
## data:  c(449, 571) out of c(12080, 11656)
## X-squared = 19.862, df = 1, p-value = 8.324e-06
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  -0.017073586 -0.006563957
## sample estimates:
##     prop 1     prop 2 
## 0.03716887 0.04898765
explain(res)

This was a two-sample proportion test of the null hypothesis that the true population proportions are equal. Using a significance level of 0.05, we reject the null hypothesis, and conclude that two population proportions are not equal. The observed difference in proportions is 0.0118187716754467. The observed proportion for the first group is 0.0371688741721854 (449 events out of a total sample size of 12,080). For the second group, the observed proportion is 0.0489876458476321 (571, out of a total sample size of 11,656).

The confidence interval for the true difference in population proportions is (-0.0170736, -0.006564). This interval will contain the true difference in population proportions 95 times out of 100.

The p-value for this test is 8.324e-06. This, formally, is defined as the probability – if the null hypothesis is true – of observing a difference in sample proportions that is as or more extreme than the difference in sample proportions from this data set. In this case, this is the probability – if the true population proportions are equal – of observing a difference in sample proportions that is greater than 0.0118187716754467 or less than -0.0118187716754467.

We can see that the difference in the proportion of accounts that started an Analyze trial is also statistically significant, with a p-value of 8.324e-06 (a very small number, much less than the generally accepted 0.05 threshold). This means we can say with confidence that the observed difference in the number of Analyze trials started between the two experiment groups was not the result of random variance.

Next, we will calculate how many users from each experiment group started a paid subscription from each product.

users %>% 
  mutate(has_publish_sub = (subscription_product == "publish")) %>% 
  group_by(experiment_group, has_publish_sub) %>% 
  summarise(users = n_distinct(account_id)) %>% 
  ungroup() %>% 
  filter(has_publish_sub) %>% 
  group_by(experiment_group) %>% 
  summarise(paying_publish_users = users) %>%
  kable() %>% 
  kable_styling()
experiment_group paying_publish_users
control 618
variant_1 620

There were 618 users in the control group that started a paid Publish subscription, and 620 in the variation group. Just like above, we should also run a proportion test here, for both the account to paid subscription proportion, and also the trial to paid subscription proportion.

res <- prop.test(x = c(618, 620), n = c(12080, 11656), alternative = "two.sided")
res
## 
##  2-sample test for equality of proportions with continuity
##  correction
## 
## data:  c(618, 620) out of c(12080, 11656)
## X-squared = 0.45546, df = 1, p-value = 0.4998
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  -0.007776708  0.003711610
## sample estimates:
##     prop 1     prop 2 
## 0.05115894 0.05319149
explain(res)

This was a two-sample proportion test of the null hypothesis that the true population proportions are equal. Using a significance level of 0.05, we do not reject the null hypothesis, and cannot conclude that two population proportions are different from one another. The observed difference in proportions is 0.00203254896435114. The observed proportion for the first group is 0.051158940397351 (618 events out of a total sample size of 12,080). For the second group, the observed proportion is 0.0531914893617021 (620, out of a total sample size of 11,656).

The confidence interval for the true difference in population proportions is (-0.0077767, 0.0037116). This interval will contain the true difference in population proportions 95 times out of 100.

The p-value for this test is 0.4997515. This, formally, is defined as the probability – if the null hypothesis is true – of observing a difference in sample proportions that is as or more extreme than the difference in sample proportions from this data set. In this case, this is the probability – if the true population proportions are equal – of observing a difference in sample proportions that is greater than 0.00203254896435114 or less than -0.00203254896435114.

There is no statistical difference between the two proportions of accounts that started a paid Publish subscription, as the p-value is 0.499.

res <- prop.test(x = c(618, 620), n = c(7372, 6954), alternative = "two.sided")
res
## 
##  2-sample test for equality of proportions with continuity
##  correction
## 
## data:  c(618, 620) out of c(7372, 6954)
## X-squared = 1.2195, df = 1, p-value = 0.2695
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  -0.014679446  0.004026229
## sample estimates:
##     prop 1     prop 2 
## 0.08383071 0.08915732
explain(res)

This was a two-sample proportion test of the null hypothesis that the true population proportions are equal. Using a significance level of 0.05, we do not reject the null hypothesis, and cannot conclude that two population proportions are different from one another. The observed difference in proportions is 0.00532660873071643. The observed proportion for the first group is 0.0838307107976126 (618 events out of a total sample size of 7,372). For the second group, the observed proportion is 0.089157319528329 (620, out of a total sample size of 6,954).

The confidence interval for the true difference in population proportions is (-0.0146794, 0.0040262). This interval will contain the true difference in population proportions 95 times out of 100.

The p-value for this test is 0.2694686. This, formally, is defined as the probability – if the null hypothesis is true – of observing a difference in sample proportions that is as or more extreme than the difference in sample proportions from this data set. In this case, this is the probability – if the true population proportions are equal – of observing a difference in sample proportions that is greater than 0.00532660873071643 or less than -0.00532660873071643.

There is no statistical difference between the two proportions of Publish trials that converted to a paid Publish subscription, as the p-value is 0.269.

Next, let’s look into the number of Analyze paid subscriptions between the two experiment groups.

users %>% 
  mutate(has_analyze_sub = (subscription_product == "analyze")) %>% 
  group_by(experiment_group, has_analyze_sub) %>% 
  summarise(users = n_distinct(account_id)) %>% 
  ungroup() %>% 
  filter(has_analyze_sub) %>% 
  group_by(experiment_group) %>% 
  summarise(paying_analyze_users = users) %>%
  kable() %>% 
  kable_styling()
experiment_group paying_analyze_users
control 44
variant_1 54

The number of paying users for Analyze was 44 in the control and 54 in the variation. Just like above, we should also run a couple of proportion tests here too.

explain(prop.test(x = c(44, 54), n = c(12080, 11656), alternative = "two.sided"))

This was a two-sample proportion test of the null hypothesis that the true population proportions are equal. Using a significance level of 0.05, we do not reject the null hypothesis, and cannot conclude that two population proportions are different from one another. The observed difference in proportions is 0.000990423031994436. The observed proportion for the first group is 0.00364238410596027 (44 events out of a total sample size of 12,080). For the second group, the observed proportion is 0.0046328071379547 (54, out of a total sample size of 11,656).

The confidence interval for the true difference in population proportions is (-0.0027099, 0.0007290). This interval will contain the true difference in population proportions 95 times out of 100.

The p-value for this test is 0.2764203. This, formally, is defined as the probability – if the null hypothesis is true – of observing a difference in sample proportions that is as or more extreme than the difference in sample proportions from this data set. In this case, this is the probability – if the true population proportions are equal – of observing a difference in sample proportions that is greater than 0.000990423031994436 or less than -0.000990423031994436.

There is no statistical difference between the two proportions of users that started a paid Analyze subscription, as the p-value is 0.276.

res <- prop.test(x = c(44, 54), n = c(449, 571), alternative = "two.sided")
res
## 
##  2-sample test for equality of proportions with continuity
##  correction
## 
## data:  c(44, 54) out of c(449, 571)
## X-squared = 0.0059629, df = 1, p-value = 0.9384
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  -0.03506551  0.04191475
## sample estimates:
##     prop 1     prop 2 
## 0.09799555 0.09457093
explain(res)

This was a two-sample proportion test of the null hypothesis that the true population proportions are equal. Using a significance level of 0.05, we do not reject the null hypothesis, and cannot conclude that two population proportions are different from one another. The observed difference in proportions is -0.00342461746086847. The observed proportion for the first group is 0.0979955456570156 (44 events out of a total sample size of 449). For the second group, the observed proportion is 0.0945709281961471 (54, out of a total sample size of 571).

The confidence interval for the true difference in population proportions is (-0.0350655, 0.0419147). This interval will contain the true difference in population proportions 95 times out of 100.

The p-value for this test is 0.9384488. This, formally, is defined as the probability – if the null hypothesis is true – of observing a difference in sample proportions that is as or more extreme than the difference in sample proportions from this data set. In this case, this is the probability – if the true population proportions are equal – of observing a difference in sample proportions that is greater than 0.00342461746086847 or less than -0.00342461746086847.

There is no statistical difference between the two proportions of Analyze trials that started a paid Analyze subscription, as the p-value is 0.938. This means that the difference in paid Analyze subscriptions is the result of more Analyze trials started in the variation group, not a difference in conversion rate to paid between the two groups.

Next, we will look at the number of accounts that started paid subscriptions for both Publish and Analyze in both experiment groups.

users %>% 
  filter(!is.na(subscription_product)) %>% 
  group_by(account_id, experiment_group) %>% 
  summarise(products = n_distinct(subscription_product)) %>% 
  ungroup() %>% 
  filter(products > 1) %>% 
  group_by(experiment_group, products) %>% 
  summarise(users = n_distinct(account_id)) %>%
  kable() %>% 
  kable_styling()
experiment_group products users
control 2 30
variant_1 2 40
res <- prop.test(x = c(30, 40), n = c(12080, 11656), alternative = "two.sided")
res
## 
##  2-sample test for equality of proportions with continuity
##  correction
## 
## data:  c(30, 40) out of c(12080, 11656)
## X-squared = 1.5059, df = 1, p-value = 0.2198
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  -0.0024163450  0.0005198145
## sample estimates:
##      prop 1      prop 2 
## 0.002483444 0.003431709
explain(res)

This was a two-sample proportion test of the null hypothesis that the true population proportions are equal. Using a significance level of 0.05, we do not reject the null hypothesis, and cannot conclude that two population proportions are different from one another. The observed difference in proportions is 0.000948265282468285. The observed proportion for the first group is 0.00248344370860927 (30 events out of a total sample size of 12,080). For the second group, the observed proportion is 0.00343170899107756 (40, out of a total sample size of 11,656).

The confidence interval for the true difference in population proportions is (-0.0024163, 0.0005198). This interval will contain the true difference in population proportions 95 times out of 100.

The p-value for this test is 0.2197602. This, formally, is defined as the probability – if the null hypothesis is true – of observing a difference in sample proportions that is as or more extreme than the difference in sample proportions from this data set. In this case, this is the probability – if the true population proportions are equal – of observing a difference in sample proportions that is greater than 0.000948265282468285 or less than -0.000948265282468285.

There is no statistically significant difference between the proportions of accounts in each experiment group that ended up starting both a paid Analyze subscription and a paid Publish subscription, as the p-value is 0.219. Put another way, if there were truly no difference between the groups, we would expect to see a difference at least this large about 1 time in 5 due to random variance alone.
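It is also worth noting that, with base rates this low, the experiment may simply have been underpowered for this success metric. A rough sensitivity check with power.prop.test, treating the observed proportions as a hypothetical true effect, sketches the sample size we would need per group to reliably detect it.

power.prop.test(p1 = 30 / 12080, p2 = 40 / 11656, power = 0.8, sig.level = 0.05)

The required n per group is far larger than the roughly 12,000 accounts per group we observed, so a true effect of this size could easily go undetected in this experiment.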

To ensure we have a full picture of the variation group’s behavior in converting to paid for both products, let’s also look at how many trials of both products were started from the experiment landing page and what those trials’ conversion to paid was.

users %>% 
  mutate(has_multiproduct_trial = !is.na(trial_multi_product_bundle_name)) %>% 
  group_by(has_multiproduct_trial) %>% 
  summarise(users = n_distinct(account_id)) %>%
  kable() %>% 
  kable_styling()
has_multiproduct_trial users
FALSE 23655
TRUE 115
users %>% 
  mutate(has_multiproduct_trial = !is.na(trial_multi_product_bundle_name)) %>%  
  filter(has_multiproduct_trial) %>% 
  group_by(account_id) %>% 
  summarise(products = n_distinct(subscription_product), paid_subscriptions = n_distinct(subscription_id)) %>% 
  ungroup() %>% 
  filter(products > 1) %>%
  summarise(users = n_distinct(account_id)) %>%
  kable() %>% 
  kable_styling()
users
3

Three of the 115 users who started a trial from the experiment landing page converted to paid subscriptions for both products, which is a 2.6% conversion rate. For comparison, based on the tables above, the trial-to-paid conversion rate for Publish trials across both groups was roughly 8.6%, and for Analyze trials roughly 9.6%.
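With only 3 conversions out of 115, that 2.6% figure comes with a very wide confidence interval, which an exact binomial test makes explicit.

binom.test(x = 3, n = 115)$conf.int

The resulting 95% interval spans from well under 1% to around 7%, so the landing page’s apparent underperformance should be interpreted cautiously.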

Finally, let’s also look at the MRR value of all converted trials per experiment group, to see if there was any overall difference in value between the two groups.

users %>% 
  filter(!is.na(subscription_id)) %>% 
  mutate(mrr_value = if_else(subscritpion_billing_cycle == "year", (subscritpion_amount / 12), (subscritpion_amount/1))) %>% 
  group_by(experiment_group) %>% 
  summarise(paying_user_count = n_distinct(account_id), total_mrr_value = round(sum(mrr_value),2), mrr_value_per_account = round(total_mrr_value / paying_user_count,2)) %>%
  kable() %>% 
  kable_styling()
experiment_group paying_user_count total_mrr_value mrr_value_per_account
control 632 23741.25 37.57
variant_1 634 26151.75 41.25

Given that both groups ended up with almost the same number of paying customers, the difference in MRR value per account between the two groups indicates that the statistically significant difference in Analyze trial starts led to more overall revenue. This could imply that the realized benefit of this experiment was increased exposure to Analyze, not a more convenient way to start trials for both products.
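We have not formally tested whether this per-account MRR difference is statistically significant. One way we could, sketched below using the users data frame from above, is a Welch two-sample t-test on per-account MRR values.

mrr_by_account <- users %>% 
  filter(!is.na(subscription_id)) %>% 
  mutate(mrr_value = if_else(subscritpion_billing_cycle == "year", subscritpion_amount / 12, subscritpion_amount)) %>% 
  group_by(experiment_group, account_id) %>% 
  summarise(account_mrr = sum(mrr_value)) %>% 
  ungroup()
t.test(account_mrr ~ experiment_group, data = mrr_by_account)

Note that this mirrors the aggregation above (including the misspelled column aliases from the query), and a non-significant result here would further temper the revenue interpretation.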

Final Results

Given the above observations, the result of the experiment is that there is insufficient evidence to confirm the hypothesis. Though it appears the variation treatment led to more Analyze trials, and thus a slightly higher MRR value per paying account, the landing page underperformed compared to trials started from other locations, and there was no statistically significant difference between the two groups in the number of new users that ended up on paid subscriptions to both products.

We recommend pursuing an iteration of this experiment, examining both ways to increase click-throughs on the experiment CTAs to the solutions landing page, and ways to improve the performance of the solutions landing page via changes to messaging and to which plans are paired for the solution offered.