4. R-learner#

The classical R-learner dates back to Robinson (1988) [2] and was formalized by Nie and Wager (2021) [1]. The R-learner starts from the partially linear model setup, in which we assume that

(27)#\[\begin{equation} \begin{aligned} R&=A\tau(S)+g_0(S)+U,\\ A&=m_0(S)+V, \end{aligned} \end{equation}\]

where the error terms \(U\) and \(V\) satisfy \(\mathbb{E}[U|A,S]=0\) and \(\mathbb{E}[V|S]=0\).

Taking \(\mathbb{E}[\,\cdot\,|S]\) on both sides of the first equation gives \(\mathbb{E}[R|S]=\tau(S)\,\mathbb{E}[A|S]+g_0(S)\), since \(\mathbb{E}[U|S]=0\); subtracting this from the first equation yields

(28)#\[\begin{equation} R-\mathbb{E}[R|S]=\tau(S)\cdot(A-\mathbb{E}[A|S])+U. \end{equation}\]

Define \(m_0(S)=\mathbb{E}[A|S]\) and \(\eta_0(S)=\mathbb{E}[R|S]\). A natural way to estimate \(\tau(S)\), which is the main idea of the R-learner, is the following two-step procedure:

Step 1: Regress \(R\) on \(S\) to obtain \(\hat{\eta}(S)=\hat{\mathbb{E}}[R|S]\), and regress \(A\) on \(S\) to obtain \(\hat{m}(S)=\hat{\mathbb{E}}[A|S]\).

Step 2: Regress the outcome residual \(R-\hat{\eta}(S)\) on the treatment residual \(A-\hat{m}(S)\).

That is,

(29)#\[\begin{equation} \hat{\tau}(S)=\arg\min_{\tau}\left\{\mathbb{E}_n\left[\left(\{R_i-\hat{\eta}(S_i)\}-\{A_i-\hat{m}(S_i)\}\cdot\tau(S_i)\right)^2\right]\right\} \end{equation}\]

The simplest choice is to restrict \(\hat{\tau}(S)\) to the class of linear functions, \(\tau(S)=S\beta\). The problem then reduces to estimating \(\beta\) by solving the following least-squares regression:

(30)#\[\begin{equation} \hat{\beta}=\arg\min_{\beta}\left\{\mathbb{E}_n\left[\left(\{R_i-\hat{\eta}(S_i)\}-\{A_i-\hat{m}(S_i)\} S_i\cdot \beta\right)^2\right]\right\}. \end{equation}\]
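Before turning to the packaged implementation below, here is a minimal sketch of this two-step procedure for the linear case on synthetic data. All names here are illustrative; for brevity the nuisance models are fit in-sample, whereas a full implementation (including the one used below) cross-fits them.

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LogisticRegression

# Toy illustration of Eq. (30): generate data from the partially linear model
rng = np.random.default_rng(0)
S_toy = rng.normal(size=(2000, 3))                          # covariates S
m_true = 1 / (1 + np.exp(-S_toy[:, 0]))                     # true m_0(S) = E[A|S]
A_toy = rng.binomial(1, m_true)                             # binary treatment A
tau_true = S_toy @ np.array([0.5, -0.3, 0.0])               # true linear CATE S*beta
R_toy = A_toy * tau_true + np.sin(S_toy[:, 1]) + rng.normal(size=2000)

# Step 1: estimate the nuisance functions eta(S) = E[R|S] and m(S) = E[A|S]
eta_hat = GradientBoostingRegressor(max_depth=2).fit(S_toy, R_toy).predict(S_toy)
m_hat = LogisticRegression().fit(S_toy, A_toy).predict_proba(S_toy)[:, 1]

# Step 2: with tau(S) = S*beta, Eq. (30) is ordinary least squares of the
# outcome residual on the treatment residual multiplied by S
X_design = (A_toy - m_hat)[:, None] * S_toy                 # {A_i - m_hat(S_i)} S_i
beta_hat, *_ = np.linalg.lstsq(X_design, R_toy - eta_hat, rcond=None)
print(beta_hat.round(2))                                    # roughly (0.5, -0.3, 0.0)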
# import related packages
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LogisticRegression

from causaldm.learners.CEL.Single_Stage import _env_getdata_CEL
from causaldm.learners.CEL.Single_Stage.Rlearner import Rlearner
import warnings
warnings.filterwarnings('ignore')

MovieLens Data#

# Get the MovieLens data
MovieLens_CEL = _env_getdata_CEL.get_movielens_CEL()
# drop the leftover index column and the genres not involved in the Drama vs. Sci-Fi comparison
MovieLens_CEL.pop(MovieLens_CEL.columns[0])
MovieLens_CEL = MovieLens_CEL[MovieLens_CEL.columns.drop(['Comedy','Action', 'Thriller'])]
MovieLens_CEL
|       | user_id | movie_id | rating | age  | Drama | Sci-Fi | gender_M | occupation_academic/educator | occupation_college/grad student | occupation_executive/managerial | occupation_other | occupation_technician/engineer |
|-------|---------|----------|--------|------|-------|--------|----------|------------------------------|---------------------------------|---------------------------------|------------------|--------------------------------|
| 0     | 48.0    | 1193.0   | 4.0    | 25.0 | 1.0   | 0.0    | 1.0      | 0.0                          | 1.0                             | 0.0                             | 0.0              | 0.0                            |
| 1     | 48.0    | 919.0    | 4.0    | 25.0 | 1.0   | 0.0    | 1.0      | 0.0                          | 1.0                             | 0.0                             | 0.0              | 0.0                            |
| 2     | 48.0    | 527.0    | 5.0    | 25.0 | 1.0   | 0.0    | 1.0      | 0.0                          | 1.0                             | 0.0                             | 0.0              | 0.0                            |
| 3     | 48.0    | 1721.0   | 4.0    | 25.0 | 1.0   | 0.0    | 1.0      | 0.0                          | 1.0                             | 0.0                             | 0.0              | 0.0                            |
| 4     | 48.0    | 150.0    | 4.0    | 25.0 | 1.0   | 0.0    | 1.0      | 0.0                          | 1.0                             | 0.0                             | 0.0              | 0.0                            |
| ...   | ...     | ...      | ...    | ...  | ...   | ...    | ...      | ...                          | ...                             | ...                             | ...              | ...                            |
| 65637 | 5878.0  | 3300.0   | 2.0    | 25.0 | 0.0   | 1.0    | 0.0      | 0.0                          | 0.0                             | 0.0                             | 1.0              | 0.0                            |
| 65638 | 5878.0  | 1391.0   | 1.0    | 25.0 | 0.0   | 1.0    | 0.0      | 0.0                          | 0.0                             | 0.0                             | 1.0              | 0.0                            |
| 65639 | 5878.0  | 185.0    | 4.0    | 25.0 | 0.0   | 1.0    | 0.0      | 0.0                          | 0.0                             | 0.0                             | 1.0              | 0.0                            |
| 65640 | 5878.0  | 2232.0   | 1.0    | 25.0 | 0.0   | 1.0    | 0.0      | 0.0                          | 0.0                             | 0.0                             | 1.0              | 0.0                            |
| 65641 | 5878.0  | 426.0    | 3.0    | 25.0 | 0.0   | 1.0    | 0.0      | 0.0                          | 0.0                             | 0.0                             | 1.0              | 0.0                            |

65642 rows × 12 columns

n = len(MovieLens_CEL)
userinfo_index = np.array([3,6,7,8,9,10,11])  # column positions of the user covariates S
MovieLens_CEL.columns[userinfo_index]
Index(['age', 'gender_M', 'occupation_academic/educator',
       'occupation_college/grad student', 'occupation_executive/managerial',
       'occupation_other', 'occupation_technician/engineer'],
      dtype='object')
# R-learner for HTE estimation
np.random.seed(1)
outcome = 'rating'
treatment = 'Drama'
controls = ['age', 'gender_M', 'occupation_academic/educator',
       'occupation_college/grad student', 'occupation_executive/managerial',
       'occupation_other', 'occupation_technician/engineer']
n_folds = 5
y_model = GradientBoostingRegressor(max_depth=2)  # nuisance model for eta(S) = E[R|S]
ps_model = LogisticRegression()  # nuisance model for the propensity m(S) = E[A|S]
Rlearner_model = GradientBoostingRegressor(max_depth=2)  # final model for tau(S)

HTE_R_learner = Rlearner(MovieLens_CEL, outcome, treatment, controls, n_folds, y_model, ps_model, Rlearner_model)
HTE_R_learner = HTE_R_learner.to_numpy()
estimate with R-learner
fold 1,testing r2 y_learner: 0.019, ps_learner: 0.734
fold 2,testing r2 y_learner: 0.015, ps_learner: 0.739
fold 3,testing r2 y_learner: 0.017, ps_learner: 0.740
fold 4,testing r2 y_learner: 0.017, ps_learner: 0.736
fold 5,testing r2 y_learner: 0.018, ps_learner: 0.725
fold 1, training r2 R-learner: 0.028, testing r2 R-learner: 0.028
fold 2, training r2 R-learner: 0.031, testing r2 R-learner: 0.020
fold 3, training r2 R-learner: 0.029, testing r2 R-learner: 0.029
fold 4, training r2 R-learner: 0.030, testing r2 R-learner: 0.024
fold 5, training r2 R-learner: 0.030, testing r2 R-learner: 0.024
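For intuition, the Rlearner call above carries out a cross-fitting loop of roughly the following form. This is a minimal sketch assuming the standard weighted reformulation of Eq. (29) from Nie and Wager [1], with S, A, and R as NumPy arrays; the helper name r_learner_cv is hypothetical, and the actual causaldm internals may differ in details.

import numpy as np
from sklearn.base import clone
from sklearn.model_selection import KFold

def r_learner_cv(S, A, R, y_model, ps_model, tau_model, n_folds=5):
    # Cross-fit the nuisances: residuals on each fold come from models
    # trained on the remaining folds, which removes overfitting bias
    R_res, A_res = np.zeros(len(R)), np.zeros(len(R))
    for train, test in KFold(n_folds, shuffle=True, random_state=1).split(S):
        R_res[test] = R[test] - clone(y_model).fit(S[train], R[train]).predict(S[test])
        A_res[test] = A[test] - clone(ps_model).fit(S[train], A[train]).predict_proba(S[test])[:, 1]
    # Minimizing Eq. (29) is equivalent to regressing the pseudo-outcome
    # R_res/A_res on S with sample weights A_res**2 (in practice A_res is
    # clipped away from zero before dividing)
    return clone(tau_model).fit(S, R_res / A_res, sample_weight=A_res**2)

With this helper, r_learner_cv(S, A, R, y_model, ps_model, Rlearner_model).predict(S) would produce HTE estimates analogous to HTE_R_learner above.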

Let’s focus on the estimated HTEs for three randomly chosen users:

print("R-learner:  ",HTE_R_learner[np.array([0,1000,5000])])
R-learner:   [0.05127254 0.08881288 0.10304225]
ATE_R_learner = np.sum(HTE_R_learner)/n
print("Choosing Drama instead of Sci-Fi is expected to improve the rating of all users by",round(ATE_R_learner,4), "out of 5 points.")
Choosing Drama instead of Sci-Fi is expected to improve the rating of all users by 0.0755 out of 5 points.

Conclusion: Choosing Drama instead of Sci-Fi is expected to improve the average user rating by 0.0755 out of 5 points.

References#

  1. Xinkun Nie and Stefan Wager. Quasi-oracle estimation of heterogeneous treatment effects. Biometrika, 108(2):299–319, 2021.

  2. Peter M. Robinson. Root-N-consistent semiparametric regression. Econometrica, 56(4):931–954, 1988.