The biggest problem when comparing outcomes between two groups, say group A and group B in an A/B test, is the presence of confounding factors.

A/B tests, or randomized controlled trials, are the gold standard in causal inference. By randomly exposing units to a treatment we ensure that treated and untreated individuals are comparable on average, so that any outcome difference we observe can be attributed to the treatment alone.
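To see this on a toy example (a minimal sketch with simulated data, not the dataset used below), we can assign a treatment completely at random and check that a simple difference in means recovers the true effect:

import numpy as np
import pandas as pd

np.random.seed(1)
n = 10_000
true_effect = 2.0

# Random assignment makes the groups comparable on average (hypothetical outcome model)
sim = pd.DataFrame({"treated": np.random.binomial(1, 0.5, n)})
sim["outcome"] = 10 + true_effect * sim["treated"] + np.random.normal(0, 5, n)

# The raw difference in means is an unbiased estimate of the treatment effect
naive_diff = (sim.loc[sim.treated == 1, "outcome"].mean()
              - sim.loc[sim.treated == 0, "outcome"].mean())
print(naive_diff)  # close to 2.0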

One way of handling this issue is to control for the possible confounding factors, for example by comparing sub-groups with similar attributes such as age, sex, and location.
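As a sketch of why this matters (hypothetical data, column names, and effect sizes, chosen only to mirror the dark-mode example used below), the snippet simulates a setting where gender drives both dark-mode adoption and the outcome: the naive comparison is biased, while comparisons within sub-groups of the confounder are not:

import numpy as np
import pandas as pd

np.random.seed(2)
n = 10_000

# Hypothetical confounded setting: 'male' affects both dark_mode adoption and the outcome
male = np.random.binomial(1, 0.5, n)
dark_mode = np.random.binomial(1, 0.2 + 0.5 * male)  # men adopt dark mode more often
outcome = 5 + 3 * male + 1 * dark_mode + np.random.normal(0, 2, n)
toy = pd.DataFrame({"male": male, "dark_mode": dark_mode, "outcome": outcome})

# Naive comparison is biased upwards (the true effect is 1)
print(toy.groupby("dark_mode")["outcome"].mean().diff().iloc[-1])

# Comparing within sub-groups that share the confounder value recovers roughly 1
print(toy.groupby(["male", "dark_mode"])["outcome"].mean())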

One very useful tool for such situations is the create_table_one function from Uber's [causalml](https://causalml.readthedocs.io/) package. It produces a covariate balance table containing the average value of each possible confounder across the treatment and control groups:

from causalml.match import create_table_one

# Covariate balance: average value of each possible confounder by treatment/control group
X = ['male', 'age', 'hours']
cv_table = create_table_one(df, 'dark_mode', X)
display(cv_table)

[Output: covariate balance table for the raw sample]

The covariate balance table gives a concise summary of how the treatment and control groups compare on the observed covariates. We can also inspect the full covariate distributions visually, for example with split violin plots:

import pandas as pd
import seaborn as sns

def plot_distributions(df, X, d):
    # Standardize the covariates and reshape to long format for a split violin plot
    df_long = df.copy()[X + [d]]
    df_long[X] = (df_long[X] - df_long[X].mean()) / df_long[X].std()
    df_long = pd.melt(df_long, id_vars=d, value_name='value')
    sns.violinplot(y="variable", x="value", hue=d, data=df_long, split=True).\
        set(xlabel="", ylabel="", title="Normalized Variable Distribution")

plot_distributions(df, X, "dark_mode")

[Output: split violin plots of the normalized covariate distributions by dark_mode]

<aside> 👨🏾‍💻 Unless we control for the observed confounding variables, we will not be able to estimate the true causal effect of the treatment.

</aside>

In graph lingo, the process of taking care of these confounding variables is called blocking the backdoor path: we condition the analysis on the confounders (e.g. gender) that open a non-causal path between the treatment and the outcome.
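In code, blocking the backdoor by stratification amounts to taking the within-stratum difference in means and averaging it with each stratum's population share (the backdoor adjustment formula). A minimal sketch, reusing the hypothetical toy dataframe from the earlier snippet:

# Backdoor adjustment: average the within-stratum effect over the confounder's distribution
adjusted_effect = 0.0
for _, g in toy.groupby("male"):
    diff = (g.loc[g.dark_mode == 1, "outcome"].mean()
            - g.loc[g.dark_mode == 0, "outcome"].mean())
    adjusted_effect += diff * len(g) / len(toy)

print(adjusted_effect)  # close to the true effect of 1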

Conditioning on several confounding factors allows you to increase the precision of the causal estimates, but it does not change the causal interpretation of the results.
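A sketch of the precision point (simulated data; statsmodels is assumed to be available): in a randomized experiment, adding a pre-treatment covariate to the outcome regression leaves the estimand unchanged but typically shrinks the standard error of the treatment coefficient:

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

np.random.seed(3)
n = 5_000
age = np.random.normal(40, 10, n)
treated = np.random.binomial(1, 0.5, n)  # randomized, hence independent of age
outcome = 1.0 * treated + 0.2 * age + np.random.normal(0, 3, n)
rct = pd.DataFrame({"treated": treated, "age": age, "outcome": outcome})

# Same estimand, smaller standard error once the covariate is included
print(smf.ols("outcome ~ treated", rct).fit().bse["treated"])
print(smf.ols("outcome ~ treated + age", rct).fit().bse["treated"])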


Matching

Here the idea is to run the analysis separately within sub-groups defined by the confounding variables, e.g. gender. But watch out for the infamous Simpson's paradox.
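A quick numerical illustration of Simpson's paradox (hypothetical counts): the treatment can look better within every sub-group and still look worse in the pooled data when group sizes differ across arms:

import pandas as pd

# Hypothetical success counts by gender and treatment
simpson = pd.DataFrame({
    "male":      [0, 0, 1, 1],
    "treated":   [0, 1, 0, 1],
    "n":         [800, 200, 200, 800],
    "successes": [640, 180, 40, 200],  # rates: 0.80, 0.90, 0.20, 0.25
})
simpson["rate"] = simpson["successes"] / simpson["n"]

# Within each gender the treated group does better...
print(simpson.pivot(index="male", columns="treated", values="rate"))

# ...but in the pooled data the treated group looks worse
pooled = simpson.groupby("treated")[["successes", "n"]].sum()
print(pooled["successes"] / pooled["n"])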

But what do you do when you have multiple confounding variables and still want to match sub-groups? This can be handled with a nearest-neighbour algorithm that matches each user in the treatment group with the most similar user in the control group. The causalml package lets you do this:

from causalml.match import NearestNeighborMatch

# Match each treated user to the most similar control user(s), with replacement
psm = NearestNeighborMatch(replace=True, ratio=1, random_state=1)
df_matched = psm.match(data=df, treatment_col="dark_mode", score_cols=X)

# Re-compute the covariate balance table on the matched sample
table1_matched = create_table_one(df_matched, "dark_mode", X)
display(table1_matched)

Now take a look at the covariate balance table after matching:

[Output: covariate balance table after matching]

We can also re-plot the normalized covariate distributions on the matched sample:

plot_distributions(df_matched, X, "dark_mode")

Note that the algorithm might drop part of the sample if it cannot find appropriate matches.

[Output: normalized covariate distributions after matching]
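Since matching can discard units, it is worth checking how much of the sample survives. A minimal sketch, assuming the df and df_matched objects defined above:

# Compare sample sizes before and after matching, overall and per group
print(df["dark_mode"].value_counts())
print(df_matched["dark_mode"].value_counts())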