Single Stage
Problem Setting
Suppose we have a dataset containing observations from
Real Data
1. Fetch_hillstrom
Fetch_hillstrom is a dataset from the scikit-uplift package with a single decision point. The study assesses the effectiveness of an e-mail campaign. The following customer features are available:
recency: # of months since the last purchase;
history: total dollars spent in the past year;
mens: binary, =1 if purchased Men’s merchandise in the past year;
womens: binary, =1 if purchased Women’s merchandise in the past year;
newbie: binary, =1 if the customer is a new customer in the past year;
zip_code_Surburban: binary, =1 if zip_code is classified as Suburban;
zip_code_Urban: binary, =1 if zip_code is classified as Urban;
channel_Phone: binary, =1 if the customer purchased from Phone in the past year;
channel_Web: binary, =1 if the customer purchased from Web in the past year.
Note: if zip_code_Surburban=0 and zip_code_Urban=0, the zip_code is classified as Rural; if channel_Phone=0 and channel_Web=0, the customer purchased through multiple channels in the past year.
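The reference-category convention in the note above can be made explicit with a small decoding helper. The sketch below is illustrative only: decode_dummies and the toy customer are hypothetical and are not part of causaldm or scikit-uplift.

```python
def decode_dummies(row, prefix, levels, reference):
    """Return the category encoded by the dummy columns `prefix + level`.

    `reference` is the level implied when every dummy equals 0.
    """
    active = [lvl for lvl in levels if row[prefix + lvl] == 1]
    if len(active) > 1:
        raise ValueError(f"more than one {prefix} dummy is set: {active}")
    return active[0] if active else reference


# Toy customer (hypothetical values) using the column names listed above.
customer = {
    "zip_code_Surburban": 0,
    "zip_code_Urban": 0,
    "channel_Phone": 0,
    "channel_Web": 1,
}

zip_code = decode_dummies(customer, "zip_code_", ["Surburban", "Urban"], "Rural")
channel = decode_dummies(customer, "channel_", ["Phone", "Web"], "Multichannel")
print(zip_code, channel)  # Rural Web
```

Both zip_code dummies are 0, so the helper falls back to the reference level Rural, while channel_Web=1 selects Web directly.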
Two types of action space can be specified:
Binary Treatment: with a binary action space, each customer either receives an e-mail campaign or receives no e-mail.
Multi Treatments: with a multinomial action space, each customer receives no e-mail, an e-mail campaign featuring Women’s merchandise, or an e-mail campaign featuring Men’s merchandise.
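The two action spaces can be illustrated by mapping a campaign label to an integer action. This is a minimal sketch, not the causaldm implementation: the segment labels follow the original Hillstrom data, while the integer codes (0 for no e-mail, 1 and 2 for the two campaigns), the choice to merge both campaigns into one arm in the binary case, and the helper name encode_action are assumptions made here for illustration.

```python
# Assumed integer coding for the multinomial action space.
MULTI_CODES = {"No E-Mail": 0, "Womens E-Mail": 1, "Mens E-Mail": 2}


def encode_action(segment, binary_trt=True):
    """Map a campaign label to an integer action."""
    code = MULTI_CODES[segment]
    # Binary setting (assumption): both campaigns collapse into one "e-mail" arm.
    return int(code > 0) if binary_trt else code


segments = ["No E-Mail", "Womens E-Mail", "Mens E-Mail"]
binary_actions = [encode_action(s, binary_trt=True) for s in segments]
multi_actions = [encode_action(s, binary_trt=False) for s in segments]
print(binary_actions)  # [0, 1, 1]
print(multi_actions)   # [0, 1, 2]
```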
Two weeks after the e-mail campaign, each customer’s total spending in dollars (R) is recorded.
The observed data are independent and identically distributed tuples of features, action, and spending (S, A, R), one per customer,
where larger values of R are preferred.
How to get the data?
from causaldm._util_causaldm import *
For a binary treatment, use S,A,R = get_data(target_col = 'spend', binary_trt = True); otherwise, use S,A,R = get_data(target_col = 'spend', binary_trt = False).
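Once the actions and rewards are loaded, a quick sanity check is to compare the average reward under each action. The sketch below uses toy stand-ins with hypothetical values in place of the get_data output, so it runs without causaldm; the helper mean_reward_by_action is likewise hypothetical, and the return types of get_data are not assumed here.

```python
from collections import defaultdict

# Toy stand-ins (hypothetical values) for the actions A and rewards R.
A = [0, 1, 1, 0, 2, 2]
R = [0.0, 10.0, 30.0, 4.0, 0.0, 50.0]


def mean_reward_by_action(actions, rewards):
    """Average the reward separately within each observed action."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for a, r in zip(actions, rewards):
        totals[a] += r
        counts[a] += 1
    return {a: totals[a] / counts[a] for a in sorted(totals)}


summary = mean_reward_by_action(A, R)
print(summary)  # {0: 2.0, 1: 20.0, 2: 25.0}
```

These raw averages only describe the sample; they are not by themselves an estimate of the campaign's causal effect.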
More details about the original dataset can be found at https://blog.minethatdata.com/2008/03/minethatdata-e-mail-analytics-and-data.html.