AB Experiment: Calculating Statistical Significance in Power BI

Soo Reed
Oct 28, 2022
7 min read

This is the last post in the series for CRO report – you can find all the previous blog posts here. Some of the metrics/functionality in this blog assume you have read and followed these previous posts:

Problem

Per experiment, I want to know if the difference in Conversion rate between variants are statistically significant.

Solution

Let’s implement two sample proportion Z-Test in the CRO report within Power BI.

TLDR;

Have a foundational CRO report ready before starting this part. If not sure, follow other blog posts in the series first.
Create CVR Z-value measure using DAX
(Optional) create a cumulative distribution calculation for easier interpretation
(Optional) consider creating “overall” CVR z value measure for non-control variants for a consumable visualization

Fake Dataset used in this blog series. You can download it to follow:

Step by Step:

1. Have a foundational CRO report ready

Have all the basic measures created, such as conversion rate
My example will have a control selection available, because in reality, not all the control variant is usually named the same, and sometimes CRO team members may want to reverse compare between control and experiment variant. If you do not know what I am talking about or how to do this, check out my blog: How to create interactive growth rate calculation.

2. Create CVR Z-Value measure

This is the meaty part of the blog. Let’s first talk about two sample proportion z-tests.

Why Two Sample Proportion Z-Test?

Two Sample Proportion Z-Test is used because we are comparing two different samples (control vs. experiment) for their proportion (specifically, conversion rate). For more than two-sample comparison simultaneously (at the same time), this is not ideal, because a two-sample proportion z-test only compares one sample as a control group to one other variant group (meaning, if there are multiple variants to one control, our calculation will calculate z-value for each of the variant compared to the control group, but not to each other). As long as you understand how to interpret the result, however, the two sample proportion z-test is valuable. This is also why it’s great to have a control group selection – if you decide to compare two experiment variables to each other, you can do so by choosing one of them to be a control group and observing the z-value.

Once you have a z-value, you can evaluate it using a chart or by calculating cumulative distribution. More on that later in this blog post as an optional 1. You can read more about Two Sample Proportion Z-Test in these helpful articles/blogs: https://online.stat.psu.edu/stat200/book/export/html/196 https://www.statisticshowto.com/probability-and-statistics/hypothesis-testing/z-test/ https://www.itl.nist.gov/div898/handbook/prc/section3/prc33.htm

How to calculate a CVR Z-Value using two sample proportion z-test:

Z value is calculated by ([exp CVR] - [control CVR])/[pooled std]. The most “difficult” part is to calculate the pooled standard deviation, since getting the experiment's CVR and control’s CVR is easy.
So how is the pooled standard deviation calculated?
1. Square root of ([pooled CVR] * (1 - [pooled CVR]) * (1/[exp Session] + 1/[control Session]))
Which is:

This image is from: https://online.stat.psu.edu/stat200/book/export/html/196

So how do we translate that in Power BI?

My DAX is based on the assumption that you already have the [Conversion Rate] metric (calculated by: conversion/sessions), and also have a control group selection. If you do not need control group selection in your report, replace “SELECTEDVALUE('Control Group'[Control])” into your variant name within my DAX code.

var experiment =[Conversion rate]

var control = CALCULATE([Conversion rate], Experiments[ExperimentVariant] = SELECTEDVALUE('Control Group'[Control]))

I am also creating this measure to use it with the experiment variable in a table. Here’s an image of what I am trying to create:

Therefore, when I calculate “experiment CVR” (meaning, currently selected variant’s CVR), it will be just a regular CVR calculation. It’s alright that the control group's own CVR will be calculated in the same way when the currently selected variant is the control group – because the calculation ends up comparing it to itself, it will be 0.0.

So here’s my DAX code:

***again, [Conversion rate] is defined outside of this DAX code and it is calculated by SUM(conversion)/SUM(sessions).

CVR_ZValue =

var experiment =[Conversion rate]

var control = CALCULATE([Conversion rate], Experiments[ExperimentVariant] = SELECTEDVALUE('Control Group'[Control]))

var exp_n = SUM(Experiments[Session])

var control_n = CALCULATE(SUM(Experiments[Session]), Experiments[ExperimentVariant] = SELECTEDVALUE('Control Group'[Control]))

var pooled_orders = SUM(Experiments[conversion]) + CALCULATE(SUM(Experiments[conversion]), Experiments[ExperimentVariant] = SELECTEDVALUE('Control Group'[Control]))

var pooled_sessions = SUM(Experiments[Session]) + CALCULATE(SUM(Experiments[Session]), Experiments[ExperimentVariant] = SELECTEDVALUE('Control Group'[Control]))

var pooled_CVR = pooled_orders/pooled_sessions

var zvalue = DIVIDE(

(experiment-control), SQRT(pooled_CVR*(1-pooled_CVR)*((1/exp_n)+(1/control_n))))

return IF(exp_n > 0 && control_n > 0, zvalue, BLANK())

Let’s dig in each line in detail. Feel free to skip this part if desired:

CVR_ZValue =

var experiment =[Conversion rate]

As mentioned earlier, here, the var experiment represents the currently selected variant based on a current row. It is calculated SUM(Conversion)/SUM(sessions)

var control = CALCULATE([Conversion rate], Experiments[ExperimentVariant] = SELECTEDVALUE('Control Group'[Control]))

Regardless of which variant is currently selected as a row, this calculates the control group’s Conversion rate. You do not need to worry about making sure it’s per (in other words, “partitioned by”) experiment ID because later, we will add experiment ID as a row in the table we will create, so it will be automatically considered.

var exp_n = SUM(Experiments[Session])

the var exp_n shows the denominator of the currently selected variant based on a current row (experiment #, the # of sessions)

var control_n = CALCULATE(SUM(Experiments[Session]), Experiments[ExperimentVariant] = SELECTEDVALUE('Control Group'[Control]))

the var control_n shows the denominator of the control group’s CVR (control #, the # of sessions)

var pooled_orders = SUM(Experiments[conversion]) + CALCULATE(SUM(Experiments[conversion]), Experiments[ExperimentVariant] = SELECTEDVALUE('Control Group'[Control]))

Here we are calculating the pooled order by summing the current selected variant’s conversion and the control group’s conversion

var pooled_sessions = SUM(Experiments[Session]) + CALCULATE(SUM(Experiments[Session]), Experiments[ExperimentVariant] = SELECTEDVALUE('Control Group'[Control]))

Here we are calculating the pooled sessions (denominator) by summing the current selected variant’s sessions and the control group’s sessions

var pooled_CVR = pooled_orders/pooled_sessions

Here we finally have a pooled CVR

var zvalue = DIVIDE(

(experiment-control), SQRT(pooled_CVR*(1-pooled_CVR)*((1/exp_n)+(1/control_n))))

This is calculating the z-value as indicated in the image from earlier. Here, we are dividing (experiment CVR - control CVR) by Square Root of (pooled CVR * (1-pooled CVR) * (1/experiment# + 1/control#)

return IF(exp_n > 0 && control_n > 0, zvalue, BLANK())

If for any reason the experiment variant or control variant is missing any sessions, then the z-value cannot be calculated as it will cause division by 0. To prevent the error, we are making sure the z-value is calculated only when the experiment variant and control variant both have sessions.

Now you have the CVR z-value metric!

Now you have the z-values to add in the table! So let’s create the table. Select “table” from Power BI visualization options, and put Experiment ID and Variant in rows, then add session, conversion, conversion rate, conversion rate growth rate, and the Z-value you created in the values. You can double check your DAX code by calculating z-value for one experiment and make sure the numbers come out the same.

Optional 1 – create a cumulative distribution calculation for easier interpretation

Meaning, Z-value is great, but it’s easier to interpret if it’s in % to show the confidence level. Cumulative distribution calculation can do that for you. It’s simple in Power BI using DAX! Here’s the DAX (assuming one-tailed test): Cumul Dist = NORM.S.DIST([CVR_ZValue], True)

Once you have it, you can drop it in the table to make the data more consumable. Usually, you will want the confidence level to be 95% or above:

Optional 2 – create an overall Z-value for not entirely accurate, but helpful visualization

What is this about?

This is the section for creating “overall” CVR z value measure for non-control variants for a consumable visualization. When an experiment only has two variants (control and experiment), then the overall z-value will accurately portrait the experiment z-value but when an experiment has more than two variants (control, variant 1, variant 2, or more), then this overall z-value metric will treat all the variants other than control as one, and calculate z-value, rendering z-value to be not as accurate.

So why would I want this when it may not be accurate all the time?

Having a table is great, but it’s hard to consume unless you are looking for a specific experiment. What if you want to see which experiment had the biggest z-value in the last 3 months? How would you go about finding that? For that, it is reasonable to start by looking at “overall z-value” in the last 3 months to see the top 3~5 experiments with the biggest overall z-values to find which variant in which experiment had the biggest z-value. Like this, overall z-value could help identify overall trends among many experiments quickly to filter out which ones to examine more closely.

We will try to create a visualization like below:

So how do we compute this overall z-value? Here’s the dax I created for it:

CVR_ZValue2 = var experiment = CALCULATE([conversion rate], Experiments[ExperimentVariant] <> SELECTEDVALUE('Control Group'[Control])) var control = CALCULATE([conversion rate], Experiments[ExperimentVariant] = SELECTEDVALUE('Control Group'[Control])) var exp_n = CALCULATE(SUM(Experiments[Session]), Experiments[ExperimentVariant] <> SELECTEDVALUE('Control Group'[Control])) var control_n = CALCULATE(SUM(Experiments[Session]), Experiments[ExperimentVariant] = SELECTEDVALUE('Control Group'[Control])) var pooled_CVR = SUM(Experiments[conversion])/SUM(Experiments[Session]) var zvalue = DIVIDE( (experiment-control), SQRT(pooled_CVR*(1-pooled_CVR)*((1/exp_n)+(1/control_n)))) return IF(exp_n > 0 && control_n > 0, zvalue, BLANK())

It is similar to the previous CVR Zvalue we created. There are two differences though:

Variable experiment: Instead of var experiment =[Conversion rate], now you have var experiment = CALCULATE([conversion rate], Experiments[ExperimentVariant] <> SELECTEDVALUE('Control Group'[Control])) This is because before, we wanted CVR by variant as we are adding this metric to the table by experiment ID and variant, but now we want to have an overall experiment variant(s)’s CVR, so we want to calculate overall CVR for non-control variant.
Variable pooled_CVR: Having to created Pooled_sessions and pooled_conversion and pooled_CVR all separately, now we can just use: var pooled_CVR = SUM(Experiments[conversion])/SUM(Experiments[Session]) This is because we will be creating a visualization per experiment, not per variant.

Now let’s create the visualization shown before.

Choose [scatter plot] in Power BI visualization option.
Put the new overall z-value (for me, I named it [CVR_ZValue2]) in its x-axis.
Put [Experiment ID] as a legend.

For easier interpretation, I want to create constant lines at 0 and 1.65: (1) 0 is to just let people know the “middle” value quickly, and (2) 1.65 is the 95% confidence level z-value boundary for one-tailed test (two-tailed test will be 1.96). I am marking 1.65 so people can easily see which experiment potentially has a statistically significant winner. To do so, go to analytics in the visualization tab and create two x-axis constant lines at 0 and 1.65. Here’s an example screenshot showing how the constant line at 0 is set:

Now you have a visualization showing the overall z-value per experiment (that relies on control selection) with two constant lines for easier interpretation! Any experiment that is beyond the 1.65 line (blue line in below) should definitely get someone's attention.

And here's the page we created today using the visualizations we worked on. I have a control slicer (necessary), experiment ID slicer and date slicer, along with an explanation on CVR Z value: