{ "cells": [ { "cell_type": "markdown", "id": "db3ddf94-3c3e-43e2-93f6-56abeca09776", "metadata": {}, "source": [ "# MovieLens\n", "MovieLens is a website that helps users find movies they like and lets them rate the recommended movies. The [MovieLens 1M dataset](https://grouplens.org/datasets/movielens/1m/) contains observations collected in an online movie recommendation experiment and is widely used to generate data for online bandit simulation studies. The goal of the simulation studies below is to learn the reward distribution of different movie genres and hence to recommend the optimal movie genre to each user so as to maximize cumulative user satisfaction. In other words, every time a user visits the website, the agent recommends a movie genre ($A_t$) to the user, and the user then gives a rating ($R_t$) to the recommended genre. We assume that users' satisfaction is fully reflected in the ratings. Therefore, the ultimate goal of the bandit algorithms is to maximize the cumulative ratings received by finding and recommending the movie genre that will receive the highest rating. In this tutorial, we focus on the following 5 genres:\n", "\n", "- **Comedy**: $a=0$,\n", "- **Drama**: $a=1$,\n", "- **Action**: $a=2$,\n", "- **Thriller**: $a=3$,\n", "- **Sci-Fi**: $a=4$.\n", "\n", "Therefore, $K=5$. 
For each user, feature information, including age, gender and occupation, is available:\n", "\n", "- **age**: numerical, from 18 to 56,\n", "- **gender**: binary, =1 if male,\n", "- **college/grad student**: binary, =1 if a college/grad student,\n", "- **executive/managerial**: binary, =1 if an executive/managerial,\n", "- **technician/engineer**: binary, =1 if a technician/engineer,\n", "- **other**: binary, =1 if the occupation is none of the other four occupations listed here,\n", "- **academic/educator**: if an academic/educator, then all the previous occupation-related variables = 0 (baseline).\n", "\n", "The realized reward $R_t$ is a numerical variable taking values in $\\{1,2,3,4,5\\}$, with 1 being the least satisfied and 5 being the most satisfied. In the following, we first perform causal effect learning on the logged data and output the estimated reward for each movie genre. Then, we conduct online learning to efficiently explore the optimal policy, utilizing both the estimation results and new information collected through real-time online interaction." 
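, "\n", "\n", "As a minimal sketch of the feature encoding described above, a user's age, gender and occupation can be mapped to a six-dimensional context vector in which academic/educator is the all-zero baseline. The helper `encode_user`, its argument names and the order of the occupation dummies are illustrative assumptions and are not part of `causaldm`:\n", "\n", "
```python
# Hypothetical helper, not part of causaldm. The order of the occupation
# dummies is an assumption made for this illustration.
OCCUPATIONS = ['college/grad student', 'executive/managerial',
               'other', 'technician/engineer']


def encode_user(age, is_male, occupation):
    '''Return [age, gender, four occupation dummies] as a list of floats.'''
    dummies = [1.0 if occupation == name else 0.0 for name in OCCUPATIONS]
    # 'academic/educator' matches none of the names, so all dummies stay 0.
    return [float(age), float(is_male)] + dummies


x = encode_user(25, True, 'executive/managerial')
baseline = encode_user(35, False, 'academic/educator')
```
"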
] }, { "cell_type": "markdown", "id": "931b6180-ebc9-4a45-bcb3-2a993146355b", "metadata": {}, "source": [ "## Causal Effect Learning" ] }, { "cell_type": "code", "execution_count": 8, "id": "2bfb5e7f", "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "from sklearn.linear_model import LinearRegression\n", "from causaldm.learners.CPL4.CMAB import _env_realCMAB as _env\n", "from causaldm.learners.CPL4.CMAB import LinTS\n", "import warnings\n", "warnings.filterwarnings('ignore')\n", "env = _env.Single_Contextual_Env(seed = 0, Binary = False)\n", "logged_data, arms = env.get_logged_dat()" ] }, { "cell_type": "code", "execution_count": 9, "id": "2d916f6d", "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "dict_keys(['Comedy', 'Drama', 'Action', 'Thriller', 'Sci-Fi'])" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "logged_data.keys()" ] }, { "cell_type": "code", "execution_count": 10, "id": "fad9aa26", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
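<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>user_id</th><th>movie_id</th><th>rating</th><th>age</th><th>Comedy</th><th>Drama</th><th>Action</th><th>Thriller</th><th>Sci-Fi</th><th>gender_M</th><th>occupation_academic/educator</th><th>occupation_college/grad student</th><th>occupation_executive/managerial</th><th>occupation_other</th><th>occupation_technician/engineer</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>685706</th><td>48</td><td>2042.0</td><td>2.0</td><td>25.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td></tr>\n",
"<tr><th>499683</th><td>48</td><td>2424.0</td><td>4.0</td><td>25.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td></tr>\n",
"<tr><th>694860</th><td>53</td><td>330.0</td><td>3.0</td><td>25.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td></tr>\n",
"<tr><th>207691</th><td>169</td><td>3671.0</td><td>4.0</td><td>25.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td></tr>\n",
"<tr><th>425839</th><td>169</td><td>3869.0</td><td>3.0</td><td>25.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td></tr>\n",
"<tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"<tr><th>213638</th><td>4140.0</td><td>153.0</td><td>2.0</td><td>25.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td></tr>\n",
"<tr><th>242271</th><td>4140.0</td><td>435.0</td><td>1.0</td><td>25.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td></tr>\n",
"<tr><th>717322</th><td>4411</td><td>910.0</td><td>5.0</td><td>18.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td></tr>\n",
"<tr><th>345927</th><td>4411.0</td><td>45.0</td><td>2.0</td><td>18.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td></tr>\n",
"<tr><th>870113</th><td>5878</td><td>1226.0</td><td>5.0</td><td>25.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td></tr>\n",
"</tbody>\n",
"</table>\n",
"<p>406 rows × 15 columns</p>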
\n", "
" ], "text/plain": [ " user_id movie_id rating age Comedy Drama Action Thriller \\\n", "685706 48 2042.0 2.0 25.0 1.0 0.0 0.0 0.0 \n", "499683 48 2424.0 4.0 25.0 1.0 0.0 0.0 0.0 \n", "694860 53 330.0 3.0 25.0 1.0 0.0 0.0 0.0 \n", "207691 169 3671.0 4.0 25.0 1.0 0.0 0.0 0.0 \n", "425839 169 3869.0 3.0 25.0 1.0 0.0 0.0 0.0 \n", "... ... ... ... ... ... ... ... ... \n", "213638 4140.0 153.0 2.0 25.0 1.0 0.0 0.0 0.0 \n", "242271 4140.0 435.0 1.0 25.0 1.0 0.0 0.0 0.0 \n", "717322 4411 910.0 5.0 18.0 1.0 0.0 0.0 0.0 \n", "345927 4411.0 45.0 2.0 18.0 1.0 0.0 0.0 0.0 \n", "870113 5878 1226.0 5.0 25.0 1.0 0.0 0.0 0.0 \n", "\n", " Sci-Fi gender_M occupation_academic/educator \\\n", "685706 0.0 1.0 0.0 \n", "499683 0.0 1.0 0.0 \n", "694860 0.0 1.0 0.0 \n", "207691 0.0 1.0 0.0 \n", "425839 0.0 1.0 0.0 \n", "... ... ... ... \n", "213638 0.0 1.0 0.0 \n", "242271 0.0 1.0 0.0 \n", "717322 0.0 1.0 0.0 \n", "345927 0.0 1.0 0.0 \n", "870113 0.0 0.0 0.0 \n", "\n", " occupation_college/grad student occupation_executive/managerial \\\n", "685706 1.0 0.0 \n", "499683 1.0 0.0 \n", "694860 0.0 0.0 \n", "207691 0.0 1.0 \n", "425839 0.0 1.0 \n", "... ... ... \n", "213638 0.0 0.0 \n", "242271 0.0 0.0 \n", "717322 1.0 0.0 \n", "345927 1.0 0.0 \n", "870113 0.0 0.0 \n", "\n", " occupation_other occupation_technician/engineer \n", "685706 0.0 0.0 \n", "499683 0.0 0.0 \n", "694860 1.0 0.0 \n", "207691 0.0 0.0 \n", "425839 0.0 0.0 \n", "... ... ... 
\n", "213638 1.0 0.0 \n", "242271 1.0 0.0 \n", "717322 0.0 0.0 \n", "345927 0.0 0.0 \n", "870113 1.0 0.0 \n", "\n", "[406 rows x 15 columns]" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "logged_data['Comedy']" ] }, { "cell_type": "code", "execution_count": 11, "id": "a0169329", "metadata": {}, "outputs": [], "source": [ "userinfo_index = np.array([3,9,11,12,13,14])\n", "movie_generes = ['Comedy', 'Drama', 'Action', 'Thriller', 'Sci-Fi']" ] }, { "cell_type": "code", "execution_count": 12, "id": "5588f06b", "metadata": {}, "outputs": [], "source": [ "# convert the sampled dataset of interest to dataframe format\n", "data_CEL_sample = logged_data['Comedy']\n", "for movie_genere in movie_generes[1:5]:\n", " data_CEL_sample = pd.concat([data_CEL_sample, logged_data[movie_genere]])" ] }, { "cell_type": "code", "execution_count": 13, "id": "c015ebce", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1286" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(data_CEL_sample) # the total sample size we selected: n=1286" ] }, { "cell_type": "markdown", "id": "53f234b9", "metadata": { "id": "Tcwr9EQB_btu" }, "source": [ "### nonlinear model fitting" ] }, { "cell_type": "code", "execution_count": 14, "id": "f44c395d", "metadata": { "id": "m_357p5PHrSi" }, "outputs": [], "source": [ "models_CEL = {}\n", " \n", "# initialize the models we'll fit in Causal Effect Learning\n", "for i in movie_generes:\n", " models_CEL[i] = None \n", "from lightgbm import LGBMRegressor\n", "for movie_genere in movie_generes: \n", " models_CEL[movie_genere] = LGBMRegressor(max_depth=3)\n", " models_CEL[movie_genere].fit(data_CEL_sample.iloc[np.where(data_CEL_sample[movie_genere]==1)[0],userinfo_index],data_CEL_sample.iloc[np.where(data_CEL_sample[movie_genere]==1)[0],2] )\n" ] }, { "cell_type": "code", "execution_count": 15, "id": "f8aa7005", "metadata": { "colab": { "base_uri": 
"https://localhost:8080/" }, "executionInfo": { "elapsed": 8, "status": "ok", "timestamp": 1676663330569, "user": { "displayName": "Yang Xu", "userId": "12270366590264264299" }, "user_tz": 300 }, "id": "1LWOEFXNNiOw", "outputId": "bee7be34-4cab-4d87-fd24-5535c5760628" }, "outputs": [], "source": [ "# record thev estimated expected reward for each movie genere, under each possible combination of state variable\n", "age_range = np.linspace(min(data_CEL_sample['age']),max(data_CEL_sample['age']),int(max(data_CEL_sample['age'])-min(data_CEL_sample['age'])+1)).astype(int)" ] }, { "cell_type": "code", "execution_count": 16, "id": "c2ca9d72", "metadata": { "executionInfo": { "elapsed": 602, "status": "ok", "timestamp": 1676663332414, "user": { "displayName": "Yang Xu", "userId": "12270366590264264299" }, "user_tz": 300 }, "id": "ZjV4BSJ717cm" }, "outputs": [], "source": [ "import itertools\n", "\n", "gender = np.array([0,1])\n", "occupation_college = np.array([0,1])\n", "occupation_executive = np.array([0,1])\n", "occupation_other = np.array([0,1])\n", "occupation_technician = np.array([0,1])\n", "\n", "# result contains all possible combinations.\n", "combinations = pd.DataFrame(itertools.product(age_range,gender,occupation_college,\n", " occupation_executive,occupation_other,occupation_technician))\n", "combinations.columns =['age','gender','occupation_college', 'occupation_executive','occupation_other','occupation_technician']" ] }, { "cell_type": "code", "execution_count": 17, "id": "79622c3d", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([3.1224, 3.0439, 3.7664, ..., 3.5822, 3.6663, 3.6364])" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "models_CEL['Comedy'].predict(combinations)\n", "#models_CEL['Comedy'].predict(data_CEL_sample.iloc[np.where(data_CEL_sample['Comedy']==1)[0],userinfo_index])" ] }, { "cell_type": "code", "execution_count": 11, "id": "c50dbb1f", "metadata": { "colab": { "base_uri": 
"https://localhost:8080/" }, "executionInfo": { "elapsed": 484, "status": "ok", "timestamp": 1676663532795, "user": { "displayName": "Yang Xu", "userId": "12270366590264264299" }, "user_tz": 300 }, "id": "sToJaW2I7JDR", "outputId": "dae5165b-6023-43b5-c557-19c831b649db" }, "outputs": [], "source": [ "values = np.zeros((5,1312))\n", "i=0\n", "for movie_genere in movie_generes:\n", " values[i,:] = models_CEL[movie_genere].predict(combinations)\n", " i=i+1\n", " #print(values)" ] }, { "cell_type": "code", "execution_count": 12, "id": "aa105642", "metadata": { "executionInfo": { "elapsed": 327, "status": "ok", "timestamp": 1676663610385, "user": { "displayName": "Yang Xu", "userId": "12270366590264264299" }, "user_tz": 300 }, "id": "ftSFf4P62_A8" }, "outputs": [], "source": [ "result_CEL_nonlinear = combinations.copy()\n", "i=0\n", "for movie_genere in movie_generes:\n", " #values = models_CEL[movie_genere].predict(combinations)\n", " result_CEL_nonlinear.insert(len(result_CEL_nonlinear.columns), movie_genere, values[i,:])\n", " i=i+1" ] }, { "cell_type": "code", "execution_count": 13, "id": "22fb7206", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 488 }, "executionInfo": { "elapsed": 312, "status": "ok", "timestamp": 1676663615813, "user": { "displayName": "Yang Xu", "userId": "12270366590264264299" }, "user_tz": 300 }, "id": "3b44q4xD8Gvh", "outputId": "a2846ad8-a2ef-4677-ff8f-1b0f3b87f386" }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
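<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>age</th><th>gender</th><th>occupation_college</th><th>occupation_executive</th><th>occupation_other</th><th>occupation_technician</th><th>Comedy</th><th>Drama</th><th>Action</th><th>Thriller</th><th>Sci-Fi</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>16</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>3.122379</td><td>3.576471</td><td>3.066448</td><td>3.583382</td><td>3.133766</td></tr>\n",
"<tr><th>1</th><td>16</td><td>0</td><td>0</td><td>0</td><td>0</td><td>1</td><td>3.043862</td><td>3.205939</td><td>3.232727</td><td>3.583382</td><td>3.133766</td></tr>\n",
"<tr><th>2</th><td>16</td><td>0</td><td>0</td><td>0</td><td>1</td><td>0</td><td>3.766441</td><td>3.910281</td><td>3.336623</td><td>3.717603</td><td>3.160268</td></tr>\n",
"<tr><th>3</th><td>16</td><td>0</td><td>0</td><td>0</td><td>1</td><td>1</td><td>3.687924</td><td>3.331463</td><td>3.345233</td><td>3.717603</td><td>3.160268</td></tr>\n",
"<tr><th>4</th><td>16</td><td>0</td><td>0</td><td>1</td><td>0</td><td>0</td><td>3.377340</td><td>3.649888</td><td>3.039056</td><td>3.923635</td><td>3.133766</td></tr>\n",
"<tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"<tr><th>1307</th><td>56</td><td>1</td><td>1</td><td>0</td><td>1</td><td>1</td><td>3.297553</td><td>3.090110</td><td>3.024221</td><td>3.658442</td><td>3.151436</td></tr>\n",
"<tr><th>1308</th><td>56</td><td>1</td><td>1</td><td>1</td><td>0</td><td>0</td><td>3.612166</td><td>3.695911</td><td>3.608458</td><td>3.740830</td><td>3.151436</td></tr>\n",
"<tr><th>1309</th><td>56</td><td>1</td><td>1</td><td>1</td><td>0</td><td>1</td><td>3.582210</td><td>3.165707</td><td>3.552889</td><td>3.740830</td><td>3.151436</td></tr>\n",
"<tr><th>1310</th><td>56</td><td>1</td><td>1</td><td>1</td><td>1</td><td>0</td><td>3.666311</td><td>3.283311</td><td>3.129195</td><td>3.740830</td><td>3.151436</td></tr>\n",
"<tr><th>1311</th><td>56</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>3.636355</td><td>3.103647</td><td>3.115987</td><td>3.740830</td><td>3.151436</td></tr>\n",
"</tbody>\n",
"</table>\n",
"<p>1312 rows × 11 columns</p>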
\n", "
" ], "text/plain": [ " age gender occupation_college occupation_executive occupation_other \\\n", "0 16 0 0 0 0 \n", "1 16 0 0 0 0 \n", "2 16 0 0 0 1 \n", "3 16 0 0 0 1 \n", "4 16 0 0 1 0 \n", "... ... ... ... ... ... \n", "1307 56 1 1 0 1 \n", "1308 56 1 1 1 0 \n", "1309 56 1 1 1 0 \n", "1310 56 1 1 1 1 \n", "1311 56 1 1 1 1 \n", "\n", " occupation_technician Comedy Drama Action Thriller Sci-Fi \n", "0 0 3.122379 3.576471 3.066448 3.583382 3.133766 \n", "1 1 3.043862 3.205939 3.232727 3.583382 3.133766 \n", "2 0 3.766441 3.910281 3.336623 3.717603 3.160268 \n", "3 1 3.687924 3.331463 3.345233 3.717603 3.160268 \n", "4 0 3.377340 3.649888 3.039056 3.923635 3.133766 \n", "... ... ... ... ... ... ... \n", "1307 1 3.297553 3.090110 3.024221 3.658442 3.151436 \n", "1308 0 3.612166 3.695911 3.608458 3.740830 3.151436 \n", "1309 1 3.582210 3.165707 3.552889 3.740830 3.151436 \n", "1310 0 3.666311 3.283311 3.129195 3.740830 3.151436 \n", "1311 1 3.636355 3.103647 3.115987 3.740830 3.151436 \n", "\n", "[1312 rows x 11 columns]" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "result_CEL_nonlinear" ] }, { "cell_type": "code", "execution_count": 14, "id": "3b49ef75", "metadata": { "executionInfo": { "elapsed": 297, "status": "ok", "timestamp": 1676663762150, "user": { "displayName": "Yang Xu", "userId": "12270366590264264299" }, "user_tz": 300 }, "id": "ZPelPjli1oTF" }, "outputs": [], "source": [ "# save the result to\n", "result_CEL_nonlinear.to_csv('result_CEL_nonlinear.csv')" ] }, { "cell_type": "code", "execution_count": 15, "id": "62a7c999", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 488 }, "executionInfo": { "elapsed": 18, "status": "ok", "timestamp": 1676663918591, "user": { "displayName": "Yang Xu", "userId": "12270366590264264299" }, "user_tz": 300 }, "id": "alx6VRzn8v-W", "outputId": "6bfbeeec-1a6c-4e68-bad9-39c0853b34a7" }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
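<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>age</th><th>gender</th><th>occupation_college</th><th>occupation_executive</th><th>occupation_other</th><th>occupation_technician</th><th>Comedy</th><th>Drama</th><th>Action</th><th>Thriller</th><th>Sci-Fi</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>16</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>3.122379</td><td>3.576471</td><td>3.066448</td><td>3.583382</td><td>3.133766</td></tr>\n",
"<tr><th>1</th><td>16</td><td>0</td><td>0</td><td>0</td><td>0</td><td>1</td><td>3.043862</td><td>3.205939</td><td>3.232727</td><td>3.583382</td><td>3.133766</td></tr>\n",
"<tr><th>2</th><td>16</td><td>0</td><td>0</td><td>0</td><td>1</td><td>0</td><td>3.766441</td><td>3.910281</td><td>3.336623</td><td>3.717603</td><td>3.160268</td></tr>\n",
"<tr><th>3</th><td>16</td><td>0</td><td>0</td><td>0</td><td>1</td><td>1</td><td>3.687924</td><td>3.331463</td><td>3.345233</td><td>3.717603</td><td>3.160268</td></tr>\n",
"<tr><th>4</th><td>16</td><td>0</td><td>0</td><td>1</td><td>0</td><td>0</td><td>3.377340</td><td>3.649888</td><td>3.039056</td><td>3.923635</td><td>3.133766</td></tr>\n",
"<tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"<tr><th>1307</th><td>56</td><td>1</td><td>1</td><td>0</td><td>1</td><td>1</td><td>3.297553</td><td>3.090110</td><td>3.024221</td><td>3.658442</td><td>3.151436</td></tr>\n",
"<tr><th>1308</th><td>56</td><td>1</td><td>1</td><td>1</td><td>0</td><td>0</td><td>3.612166</td><td>3.695911</td><td>3.608458</td><td>3.740830</td><td>3.151436</td></tr>\n",
"<tr><th>1309</th><td>56</td><td>1</td><td>1</td><td>1</td><td>0</td><td>1</td><td>3.582210</td><td>3.165707</td><td>3.552889</td><td>3.740830</td><td>3.151436</td></tr>\n",
"<tr><th>1310</th><td>56</td><td>1</td><td>1</td><td>1</td><td>1</td><td>0</td><td>3.666311</td><td>3.283311</td><td>3.129195</td><td>3.740830</td><td>3.151436</td></tr>\n",
"<tr><th>1311</th><td>56</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>3.636355</td><td>3.103647</td><td>3.115987</td><td>3.740830</td><td>3.151436</td></tr>\n",
"</tbody>\n",
"</table>\n",
"<p>1312 rows × 11 columns</p>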
\n", "
" ], "text/plain": [ " age gender occupation_college occupation_executive occupation_other \\\n", "0 16 0 0 0 0 \n", "1 16 0 0 0 0 \n", "2 16 0 0 0 1 \n", "3 16 0 0 0 1 \n", "4 16 0 0 1 0 \n", "... ... ... ... ... ... \n", "1307 56 1 1 0 1 \n", "1308 56 1 1 1 0 \n", "1309 56 1 1 1 0 \n", "1310 56 1 1 1 1 \n", "1311 56 1 1 1 1 \n", "\n", " occupation_technician Comedy Drama Action Thriller Sci-Fi \n", "0 0 3.122379 3.576471 3.066448 3.583382 3.133766 \n", "1 1 3.043862 3.205939 3.232727 3.583382 3.133766 \n", "2 0 3.766441 3.910281 3.336623 3.717603 3.160268 \n", "3 1 3.687924 3.331463 3.345233 3.717603 3.160268 \n", "4 0 3.377340 3.649888 3.039056 3.923635 3.133766 \n", "... ... ... ... ... ... ... \n", "1307 1 3.297553 3.090110 3.024221 3.658442 3.151436 \n", "1308 0 3.612166 3.695911 3.608458 3.740830 3.151436 \n", "1309 1 3.582210 3.165707 3.552889 3.740830 3.151436 \n", "1310 0 3.666311 3.283311 3.129195 3.740830 3.151436 \n", "1311 1 3.636355 3.103647 3.115987 3.740830 3.151436 \n", "\n", "[1312 rows x 11 columns]" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# read the result file\n", "result_CEL_nonlinear = pd.read_csv('result_CEL_nonlinear.csv')\n", "result_CEL_nonlinear = result_CEL_nonlinear.drop(result_CEL_nonlinear.columns[0], axis=1)\n", "result_CEL_nonlinear" ] }, { "cell_type": "markdown", "id": "9a8a68cb", "metadata": { "id": "nhhsw-DOwxzx" }, "source": [ "#### Analysis" ] }, { "cell_type": "code", "execution_count": 16, "id": "7e8d8c90", "metadata": { "executionInfo": { "elapsed": 343, "status": "ok", "timestamp": 1676664442590, "user": { "displayName": "Yang Xu", "userId": "12270366590264264299" }, "user_tz": 300 }, "id": "Dk9-vMRIQCyg" }, "outputs": [], "source": [ "# average the predicted rating of each genre over all female profiles (41 ages x 2^4 occupation combinations)\n", "TE_female=result_CEL_nonlinear.iloc[np.where(result_CEL_nonlinear['gender']==0)[0],6:11]/(41*(2**4))\n", "TE_female=pd.DataFrame(TE_female.sum(axis=0))\n", "TE_female.columns 
=['Expected Rating']" ] }, { "cell_type": "code", "execution_count": 17, "id": "06d13401", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 206 }, "executionInfo": { "elapsed": 17, "status": "ok", "timestamp": 1676664443598, "user": { "displayName": "Yang Xu", "userId": "12270366590264264299" }, "user_tz": 300 }, "id": "KfCTrgBDSyWv", "outputId": "896d8116-a377-484f-e9eb-7abca4cad453" }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
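<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>Expected Rating</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>Comedy</th><td>3.500268</td></tr>\n",
"<tr><th>Drama</th><td>3.309777</td></tr>\n",
"<tr><th>Action</th><td>3.562432</td></tr>\n",
"<tr><th>Thriller</th><td>3.605472</td></tr>\n",
"<tr><th>Sci-Fi</th><td>2.960134</td></tr>\n",
"</tbody>\n",
"</table>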
\n", "
" ], "text/plain": [ " Expected Rating\n", "Comedy 3.500268\n", "Drama 3.309777\n", "Action 3.562432\n", "Thriller 3.605472\n", "Sci-Fi 2.960134" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "TE_female" ] }, { "cell_type": "code", "execution_count": 18, "id": "428ea969", "metadata": { "executionInfo": { "elapsed": 310, "status": "ok", "timestamp": 1676664467325, "user": { "displayName": "Yang Xu", "userId": "12270366590264264299" }, "user_tz": 300 }, "id": "xNU7ibG4_UNF" }, "outputs": [], "source": [ "# average the predicted rating of each genre over all male profiles (41 ages x 2^4 occupation combinations)\n", "TE_male=result_CEL_nonlinear.iloc[np.where(result_CEL_nonlinear['gender']==1)[0],6:11]/(41*(2**4))\n", "TE_male=pd.DataFrame(TE_male.sum(axis=0))\n", "TE_male.columns =['Expected Rating']" ] }, { "cell_type": "code", "execution_count": 19, "id": "7eae7c50", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 206 }, "executionInfo": { "elapsed": 316, "status": "ok", "timestamp": 1676664470347, "user": { "displayName": "Yang Xu", "userId": "12270366590264264299" }, "user_tz": 300 }, "id": "4YBh4ljh_UNG", "outputId": "f54358c5-fe05-44fc-8693-f6a33ccf204e" }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
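<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>Expected Rating</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>Comedy</th><td>3.365749</td></tr>\n",
"<tr><th>Drama</th><td>3.321332</td></tr>\n",
"<tr><th>Action</th><td>3.256846</td></tr>\n",
"<tr><th>Thriller</th><td>3.447365</td></tr>\n",
"<tr><th>Sci-Fi</th><td>2.960134</td></tr>\n",
"</tbody>\n",
"</table>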
\n", "
" ], "text/plain": [ " Expected Rating\n", "Comedy 3.365749\n", "Drama 3.321332\n", "Action 3.256846\n", "Thriller 3.447365\n", "Sci-Fi 2.960134" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "TE_male" ] }, { "cell_type": "markdown", "id": "169be1ee", "metadata": { "id": "-OwDCcYP_XiE" }, "source": [ "**Conclusion**: Among these five selected movie genres, `Thriller` receives the highest expected rating in the tables above and `Sci-Fi` the lowest, for both women and men. Women's expected ratings are slightly higher than men's for `Comedy`, `Action` and `Thriller`; the `Sci-Fi` estimates coincide, and for `Drama` men's expected rating is about 0.01 points (on the 5-point scale) higher than women's." ] }, { "cell_type": "markdown", "id": "ff203c46", "metadata": { "id": "S6_rREHz_i8r" }, "source": [ "### linear model fitting" ] }, { "cell_type": "code", "execution_count": 20, "id": "8a0b53ec", "metadata": { "executionInfo": { "elapsed": 301, "status": "ok", "timestamp": 1676664659108, "user": { "displayName": "Yang Xu", "userId": "12270366590264264299" }, "user_tz": 300 }, "id": "RTBk0DSMADT0" }, "outputs": [], "source": [ "models_CEL_linear = {}\n", " \n", "# initialize the models we'll fit in Causal Effect Learning\n", "for i in movie_generes:\n", " models_CEL_linear[i] = None \n", " \n", "from sklearn.linear_model import LinearRegression\n", "for movie_genere in movie_generes: \n", " models_CEL_linear[movie_genere] = LinearRegression()\n", " models_CEL_linear[movie_genere].fit(data_CEL_sample.iloc[np.where(data_CEL_sample[movie_genere]==1)[0],userinfo_index],data_CEL_sample.iloc[np.where(data_CEL_sample[movie_genere]==1)[0],2] )\n" ] }, { "cell_type": "code", "execution_count": 21, "id": "c32620a4", "metadata": { "executionInfo": { "elapsed": 300, "status": "ok", "timestamp": 1676664695417, "user": { "displayName": "Yang Xu", "userId": "12270366590264264299" }, "user_tz": 300 }, "id": "gvpnxYTE_nSN" }, "outputs": 
[], "source": [ "import itertools\n", "\n", "gender = np.array([0,1])\n", "occupation_college = np.array([0,1])\n", "occupation_executive = np.array([0,1])\n", "occupation_other = np.array([0,1])\n", "occupation_technician = np.array([0,1])\n", "\n", "# combinations contains every possible combination of the state variables.\n", "combinations = pd.DataFrame(itertools.product(age_range,gender,occupation_college,\n", " occupation_executive,occupation_other,occupation_technician))\n", "combinations.columns =['age','gender','occupation_college', 'occupation_executive','occupation_other','occupation_technician']\n" ] }, { "cell_type": "code", "execution_count": 22, "id": "5d38fd08", "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "executionInfo": { "elapsed": 428, "status": "ok", "timestamp": 1676664715886, "user": { "displayName": "Yang Xu", "userId": "12270366590264264299" }, "user_tz": 300 }, "id": "ZY9IE6Wt_nSR", "outputId": "b5d43244-a65c-4e69-8369-c91a26f42e7f" }, "outputs": [], "source": [ "# one row of predictions per genre, one column per feature combination\n", "values = np.zeros((len(movie_generes), len(combinations)))\n", "i=0\n", "for movie_genere in movie_generes:\n", " values[i,:] = models_CEL_linear[movie_genere].predict(combinations)\n", " i=i+1\n", " #print(values)" ] }, { "cell_type": "code", "execution_count": 23, "id": "8195a7b7", "metadata": { "executionInfo": { "elapsed": 463, "status": "ok", "timestamp": 1676664728008, "user": { "displayName": "Yang Xu", "userId": "12270366590264264299" }, "user_tz": 300 }, "id": "PS7DNCGp_nSR" }, "outputs": [], "source": [ "# copy so that the feature grid itself is not modified when the genre columns are appended\n", "result_CEL_linear = combinations.copy()\n", "i=0\n", "for movie_genere in movie_generes:\n", " #values = models_CEL_linear[movie_genere].predict(combinations)\n", " result_CEL_linear.insert(len(result_CEL_linear.columns), movie_genere, values[i,:])\n", " i=i+1" ] }, { "cell_type": "code", "execution_count": 24, "id": "05449c57", "metadata": { "executionInfo": { "elapsed": 290, "status": "ok", "timestamp": 1676664747841, "user": { "displayName": "Yang Xu", "userId": "12270366590264264299" }, "user_tz": 300 }, "id": 
"da-uqtdb_nSS" }, "outputs": [], "source": [ "# the result is saved to\n", "result_CEL_linear.to_csv('result_CEL_linear.csv')" ] }, { "cell_type": "code", "execution_count": 25, "id": "32713df8", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 488 }, "executionInfo": { "elapsed": 297, "status": "ok", "timestamp": 1676664757858, "user": { "displayName": "Yang Xu", "userId": "12270366590264264299" }, "user_tz": 300 }, "id": "-geZLYVv_nSS", "outputId": "e17fd598-6f3e-451d-ab1d-f6a8fda18572" }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
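<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>age</th><th>gender</th><th>occupation_college</th><th>occupation_executive</th><th>occupation_other</th><th>occupation_technician</th><th>Comedy</th><th>Drama</th><th>Action</th><th>Thriller</th><th>Sci-Fi</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>16</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>3.323169</td><td>3.453650</td><td>3.692167</td><td>3.482883</td><td>3.357668</td></tr>\n",
"<tr><th>1</th><td>16</td><td>0</td><td>0</td><td>0</td><td>0</td><td>1</td><td>3.325782</td><td>3.098945</td><td>3.628177</td><td>3.145705</td><td>3.886814</td></tr>\n",
"<tr><th>2</th><td>16</td><td>0</td><td>0</td><td>0</td><td>1</td><td>0</td><td>3.578237</td><td>3.666234</td><td>3.464250</td><td>3.392727</td><td>3.691633</td></tr>\n",
"<tr><th>3</th><td>16</td><td>0</td><td>0</td><td>0</td><td>1</td><td>1</td><td>3.580850</td><td>3.311529</td><td>3.400261</td><td>3.055549</td><td>4.220779</td></tr>\n",
"<tr><th>4</th><td>16</td><td>0</td><td>0</td><td>1</td><td>0</td><td>0</td><td>3.530090</td><td>3.566701</td><td>3.455929</td><td>3.741310</td><td>3.570002</td></tr>\n",
"<tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"<tr><th>1307</th><td>56</td><td>1</td><td>1</td><td>0</td><td>1</td><td>1</td><td>3.494923</td><td>3.259652</td><td>2.899359</td><td>3.006980</td><td>2.469963</td></tr>\n",
"<tr><th>1308</th><td>56</td><td>1</td><td>1</td><td>1</td><td>0</td><td>0</td><td>3.444164</td><td>3.514824</td><td>2.955027</td><td>3.692742</td><td>1.819186</td></tr>\n",
"<tr><th>1309</th><td>56</td><td>1</td><td>1</td><td>1</td><td>0</td><td>1</td><td>3.446776</td><td>3.160120</td><td>2.891038</td><td>3.355564</td><td>2.348332</td></tr>\n",
"<tr><th>1310</th><td>56</td><td>1</td><td>1</td><td>1</td><td>1</td><td>0</td><td>3.699231</td><td>3.727408</td><td>2.727111</td><td>3.602585</td><td>2.153151</td></tr>\n",
"<tr><th>1311</th><td>56</td><td>1</td><td>1</td><td>1</td><td>1</td><td>1</td><td>3.701844</td><td>3.372704</td><td>2.663122</td><td>3.265407</td><td>2.682297</td></tr>\n",
"</tbody>\n",
"</table>\n",
"<p>1312 rows × 11 columns</p>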
\n", "
" ], "text/plain": [ " age gender occupation_college occupation_executive occupation_other \\\n", "0 16 0 0 0 0 \n", "1 16 0 0 0 0 \n", "2 16 0 0 0 1 \n", "3 16 0 0 0 1 \n", "4 16 0 0 1 0 \n", "... ... ... ... ... ... \n", "1307 56 1 1 0 1 \n", "1308 56 1 1 1 0 \n", "1309 56 1 1 1 0 \n", "1310 56 1 1 1 1 \n", "1311 56 1 1 1 1 \n", "\n", " occupation_technician Comedy Drama Action Thriller Sci-Fi \n", "0 0 3.323169 3.453650 3.692167 3.482883 3.357668 \n", "1 1 3.325782 3.098945 3.628177 3.145705 3.886814 \n", "2 0 3.578237 3.666234 3.464250 3.392727 3.691633 \n", "3 1 3.580850 3.311529 3.400261 3.055549 4.220779 \n", "4 0 3.530090 3.566701 3.455929 3.741310 3.570002 \n", "... ... ... ... ... ... ... \n", "1307 1 3.494923 3.259652 2.899359 3.006980 2.469963 \n", "1308 0 3.444164 3.514824 2.955027 3.692742 1.819186 \n", "1309 1 3.446776 3.160120 2.891038 3.355564 2.348332 \n", "1310 0 3.699231 3.727408 2.727111 3.602585 2.153151 \n", "1311 1 3.701844 3.372704 2.663122 3.265407 2.682297 \n", "\n", "[1312 rows x 11 columns]" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# read the result file\n", "result_CEL_linear = pd.read_csv('result_CEL_linear.csv')\n", "result_CEL_linear = result_CEL_linear.drop(result_CEL_linear.columns[0], axis=1)\n", "result_CEL_linear" ] }, { "cell_type": "markdown", "id": "8f1058a4", "metadata": { "id": "gBUDJ7pF_nST" }, "source": [ "#### Analysis" ] }, { "cell_type": "code", "execution_count": 26, "id": "44dffb75", "metadata": { "executionInfo": { "elapsed": 367, "status": "ok", "timestamp": 1676664791192, "user": { "displayName": "Yang Xu", "userId": "12270366590264264299" }, "user_tz": 300 }, "id": "vaCTRmcg_nSU" }, "outputs": [], "source": [ "# average the predicted rating of each genre over all female profiles (41 ages x 2^4 occupation combinations)\n", "TE_female_linear=result_CEL_linear.iloc[np.where(result_CEL_linear['gender']==0)[0],6:11]/(41*(2**4))\n", "TE_female_linear=pd.DataFrame(TE_female_linear.sum(axis=0))\n", "TE_female_linear.columns 
=['Expected Rating']" ] }, { "cell_type": "code", "execution_count": 27, "id": "6f1bc168", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 206 }, "executionInfo": { "elapsed": 292, "status": "ok", "timestamp": 1676664794563, "user": { "displayName": "Yang Xu", "userId": "12270366590264264299" }, "user_tz": 300 }, "id": "GeRnCtDm_nSU", "outputId": "192e750a-4732-4342-ea20-ac0d39c4dea2" }, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\">\n",
"      <th></th>\n",
"      <th>Expected Rating</th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <th>Comedy</th>\n",
"      <td>3.579924</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>Drama</th>\n",
"      <td>3.402675</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>Action</th>\n",
"      <td>3.282099</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>Thriller</th>\n",
"      <td>3.511989</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>Sci-Fi</th>\n",
"      <td>3.082199</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ " Expected Rating\n", "Comedy 3.579924\n", "Drama 3.402675\n", "Action 3.282099\n", "Thriller 3.511989\n", "Sci-Fi 3.082199" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "TE_female_linear" ] }, { "cell_type": "code", "execution_count": 28, "id": "46297093", "metadata": { "executionInfo": { "elapsed": 305, "status": "ok", "timestamp": 1676664826089, "user": { "displayName": "Yang Xu", "userId": "12270366590264264299" }, "user_tz": 300 }, "id": "WXe5ahtO_nSV" }, "outputs": [], "source": [ "# calculate the expected rating of each genre for male users\n", "TE_male_linear=result_CEL_linear.iloc[np.where(result_CEL_linear['gender']==1)[0],6:11]/(41*(2**4))\n", "TE_male_linear=pd.DataFrame(TE_male_linear.sum(axis=0))\n", "TE_male_linear.columns =['Expected Rating']" ] }, { "cell_type": "code", "execution_count": 29, "id": "8e59b318", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 206 }, "executionInfo": { "elapsed": 13, "status": "ok", "timestamp": 1676664826612, "user": { "displayName": "Yang Xu", "userId": "12270366590264264299" }, "user_tz": 300 }, "id": "d3cpJ3hG_nSV", "outputId": "30c5dd64-f5e2-4a70-e2ed-17bb8a987c68" }, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\">\n",
"      <th></th>\n",
"      <th>Expected Rating</th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <th>Comedy</th>\n",
"      <td>3.445089</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>Drama</th>\n",
"      <td>3.423679</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>Action</th>\n",
"      <td>3.073189</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>Thriller</th>\n",
"      <td>3.236301</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>Sci-Fi</th>\n",
"      <td>2.957766</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ " Expected Rating\n", "Comedy 3.445089\n", "Drama 3.423679\n", "Action 3.073189\n", "Thriller 3.236301\n", "Sci-Fi 2.957766" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "TE_male_linear" ] }, { "cell_type": "markdown", "id": "c0a91313", "metadata": {}, "source": [ "**Conclusion**: The expected ratings obtained under the linear model are generally consistent with the results under the nonlinear model." ] }, { "cell_type": "markdown", "id": "9bce2c09-db8c-464a-bdd3-c3ea3f97c166", "metadata": {}, "source": [ "## Online Learning\n", "\n", "In this section, we implement contextual Thompson sampling (TS) to learn the optimal policy online. Specifically, we assume that, for each arm $i$, \n", "$$R_t(i)\\sim \\mathcal{N}(\\boldsymbol{s}_i^T \\boldsymbol{\\gamma},\\sigma^2).$$" ] }, { "cell_type": "code", "execution_count": 30, "id": "220cf366-d18b-4d17-9a8a-6f934d8e07dc", "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "from sklearn.linear_model import LinearRegression\n", "from causaldm.learners.CPL4.CMAB import _env_realCMAB as _env\n", "from causaldm.learners.CPL4.CMAB import LinTS\n", "env = _env.Single_Contextual_Env(seed = 0, Binary = False)\n", "K = env.K\n", "p = env.p\n", "logged_data, arms = env.get_logged_dat()\n", "CEL_results = pd.read_csv('result_CEL_nonlinear.csv').iloc[:,1:] " ] }, { "cell_type": "markdown", "id": "db11b7b7-79e8-40aa-9fe3-f902f3e8e27e", "metadata": {}, "source": [ "### Estimated $\\sigma$ and $\\boldsymbol{\\gamma}$\n", "\n", "Here, we estimate $\\sigma$ and $\\boldsymbol{\\gamma}$ from the logged data and the estimates obtained in the causal effect learning (CEL) step."
] }, { "cell_type": "code", "execution_count": 31, "id": "2d03c533-4b65-46c8-8611-e6392a83a7e4", "metadata": {}, "outputs": [], "source": [ "# sigma: sample standard deviation of the residuals between the logged ratings\n", "# and a linear fit to the CEL estimates, pooled over all genres\n", "mean_error = []\n", "for genre in arms:\n", "    genre_dat = logged_data[genre][['age','gender_M',\n", "                                    'occupation_college/grad student', 'occupation_executive/managerial',\n", "                                    'occupation_other', 'occupation_technician/engineer','rating']] \n", "    model = LinearRegression().fit(CEL_results.iloc[:,:-5], CEL_results[genre])\n", "    genre_error = genre_dat.rating.to_numpy() - model.predict(np.array(genre_dat.iloc[:,:-1]))\n", "    mean_error += genre_error.tolist()\n", "sigma = np.std(mean_error,ddof=1)\n", "\n", "# gamma: intercept and coefficients of the linear fit, stacked genre by genre\n", "gamma = []\n", "for genre in arms:\n", "    model = LinearRegression().fit(CEL_results.iloc[:,:-5], CEL_results[genre])\n", "    gamma+=[model.intercept_] + list(model.coef_)" ] }, { "cell_type": "markdown", "id": "a46a0843-5f72-40ff-87e8-0c6d9d807a06", "metadata": {}, "source": [ "### Run Informative TS\n", "\n", "Here, we run TS with an informative prior built from the estimated $\\sigma$ and $\\boldsymbol{\\gamma}$. Specifically, we use $\\mathcal{N}(\\hat{\\boldsymbol{\\gamma}},0.3I)$ as the prior distribution of $\\boldsymbol{\\gamma}$ (`sigma1 = .3` in the code below). In total, we run 50 replicates, each with 20000 total steps, to get the expected performance of online learning."
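, "\n", "\n",
"As a reference sketch (this is the textbook conjugate update for a Gaussian linear model with known $\\sigma$; the internals of `LinTS_Gaussian` are not shown in this notebook and may differ in detail), write the prior as $\\mathcal{N}(\\boldsymbol{\\mu}_0,\\Sigma_0)$ with $\\boldsymbol{\\mu}_0=\\hat{\\boldsymbol{\\gamma}}$, and let $\\boldsymbol{s}_{A_\\tau}$ denote the feature vector of the arm pulled at round $\\tau$ (the symbols $\\boldsymbol{\\mu}_t$, $\\Sigma_t$ and $\\tilde{\\boldsymbol{\\gamma}}$ are our notation). After $t$ rounds, the posterior of $\\boldsymbol{\\gamma}$ is $\\mathcal{N}(\\boldsymbol{\\mu}_t,\\Sigma_t)$ with\n",
"$$\\Sigma_t^{-1}=\\Sigma_0^{-1}+\\sigma^{-2}\\sum_{\\tau=1}^{t}\\boldsymbol{s}_{A_\\tau}\\boldsymbol{s}_{A_\\tau}^T,\\qquad \\boldsymbol{\\mu}_t=\\Sigma_t\\Big(\\Sigma_0^{-1}\\boldsymbol{\\mu}_0+\\sigma^{-2}\\sum_{\\tau=1}^{t}\\boldsymbol{s}_{A_\\tau}R_\\tau\\Big).$$\n",
"At round $t+1$, TS draws $\\tilde{\\boldsymbol{\\gamma}}\\sim\\mathcal{N}(\\boldsymbol{\\mu}_t,\\Sigma_t)$ and pulls $A_{t+1}=\\arg\\max_i \\boldsymbol{s}_i^T\\tilde{\\boldsymbol{\\gamma}}$. A small $\\Sigma_0$ keeps the early draws close to $\\hat{\\boldsymbol{\\gamma}}$, which is how the CEL estimates enter the online stage."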
] }, { "cell_type": "code", "execution_count": 32, "id": "ee43e7f0", "metadata": {}, "outputs": [], "source": [ "T = 20000\n", "S = 50\n", "sigma1 = .3\n", "cum_reward_informative = []\n", "for seed in range(S):\n", "    env = _env.Single_Contextual_Env(seed = seed, Binary = False)\n", "    prior_theta_u = np.array(gamma)\n", "    prior_theta_cov = sigma1*np.identity(p)\n", "    informative_TS= LinTS.LinTS_Gaussian(sigma = sigma, prior_theta_u = prior_theta_u, \n", "                                        prior_theta_cov = prior_theta_cov, \n", "                                        K = K, p = p,seed = seed)\n", "    cum_reward_informative_t = []\n", "    rec_action_informative_t = []\n", "    for t in range(T):\n", "        X, feature_info= env.get_Phi(t)\n", "        A = informative_TS.take_action(X)\n", "        R = env.get_reward(t,A)\n", "        informative_TS.receive_reward(t,A,R,X)\n", "        cum_reward_informative_t.append(R)\n", "        rec_action_informative_t.append(A)\n", "    # running average reward up to each step\n", "    cum_reward_informative_t = np.cumsum(cum_reward_informative_t)/(np.array(range(T))+1)\n", "    cum_reward_informative.append(cum_reward_informative_t)" ] }, { "cell_type": "markdown", "id": "6f3f447b-53bd-43d6-8e68-9835ff0e275a", "metadata": {}, "source": [ "### Run Uninformative TS\n", "\n", "To further show the advantage of integrating the information from the CEL step, we run TS with an uninformative prior. Specifically, we use $\\mathcal{N}(\\boldsymbol{0},1000I)$ as the prior distribution of $\\boldsymbol{\\gamma}$. In total, we run 50 replicates, each with 20000 total steps, to get the expected performance of online learning."
] }, { "cell_type": "code", "execution_count": 33, "id": "394623b1", "metadata": {}, "outputs": [], "source": [ "T = 20000\n", "S = 50\n", "cum_reward_uninformative = []\n", "for seed in range(S):\n", "    env = _env.Single_Contextual_Env(seed = seed, Binary = False)\n", "    K = env.K\n", "    p = env.p\n", "    prior_theta_u = np.zeros(p)\n", "    prior_theta_cov = 1000*np.identity(p)\n", "    uninformative_TS= LinTS.LinTS_Gaussian(sigma = sigma, prior_theta_u = prior_theta_u, \n", "                                          prior_theta_cov = prior_theta_cov, \n", "                                          K = K, p = p,seed = seed)\n", "    cum_reward_uninformative_t = []\n", "    rec_action_uninformative_t = []\n", "    for t in range(T):\n", "        X, feature_info = env.get_Phi(t)\n", "        A = uninformative_TS.take_action(X)\n", "        R = env.get_reward(t,A)\n", "        uninformative_TS.receive_reward(t,A,R,X)\n", "        cum_reward_uninformative_t.append(R)\n", "        rec_action_uninformative_t.append(A)\n", "    # running average reward up to each step\n", "    cum_reward_uninformative_t = np.cumsum(cum_reward_uninformative_t)/(np.array(range(T))+1)\n", "    cum_reward_uninformative.append(cum_reward_uninformative_t)" ] }, { "cell_type": "markdown", "id": "34b1ac77-c6db-4b31-a44c-cedeb8ddeba2", "metadata": {}, "source": [ "### Run Personalized Greedy\n", "\n", "We also run a greedy algorithm using the results of the CEL-HTE step as a natural baseline. In each round, we estimate the expected reward of each arm solely from the CEL estimates and then pull the arm with the highest expected reward."
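, "\n", "\n",
"In the notation of the reward model above, this rule can be sketched as follows, where $\\hat{\\boldsymbol{\\gamma}}$ is the CEL-based estimate and $\\boldsymbol{s}_{i,t}$ (our shorthand, not used elsewhere in this notebook) is the feature vector of arm $i$ for the user arriving at round $t$:\n",
"$$A_t=\\arg\\max_{i\\in\\{0,\\dots,K-1\\}} \\boldsymbol{s}_{i,t}^T\\hat{\\boldsymbol{\\gamma}}.$$\n",
"Unlike TS, $\\hat{\\boldsymbol{\\gamma}}$ is never updated here, so any error in the CEL estimates persists for all rounds."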
] }, { "cell_type": "code", "execution_count": 34, "id": "8a3ac8d3", "metadata": {}, "outputs": [], "source": [ "T = 20000\n", "S = 50\n", "cum_reward_greedy = []\n", "for seed in range(S):\n", "    env = _env.Single_Contextual_Env(seed = seed, Binary = False)\n", "    cum_reward_greedy_t = []\n", "    rec_action_greedy_t = []\n", "    for t in range(T):\n", "        X, feature_info= env.get_Phi(t)\n", "        # pull the arm with the highest estimated mean reward under the fixed gamma\n", "        A = np.argmax(X.dot(np.array(gamma)))\n", "        R = env.get_reward(t,A)\n", "        cum_reward_greedy_t.append(R)\n", "        rec_action_greedy_t.append(A)\n", "    cum_reward_greedy_t = np.cumsum(cum_reward_greedy_t)/(np.array(range(T))+1)\n", "    cum_reward_greedy.append(cum_reward_greedy_t)" ] }, { "cell_type": "markdown", "id": "0958c189", "metadata": {}, "source": [ "### Run Naive Greedy\n", "\n", "We also run a greedy algorithm on the CEL-ATE results, which serves as another natural baseline. In each round, we recommend Thriller ($a=3$) to the user, as this genre has the highest expected average reward for both male and female users." ] }, { "cell_type": "code", "execution_count": 35, "id": "8d624fce", "metadata": {}, "outputs": [], "source": [ "T = 20000\n", "S = 50\n", "cum_reward_nvgreedy = []\n", "for seed in range(S):\n", "    env = _env.Single_Contextual_Env(seed = seed, Binary = False)\n", "    cum_reward_nvgreedy_t = []\n", "    rec_action_nvgreedy_t = []\n", "    for t in range(T):\n", "        A = 3  # always recommend Thriller\n", "        R = env.get_reward(t,A)\n", "        cum_reward_nvgreedy_t.append(R)\n", "        rec_action_nvgreedy_t.append(A)\n", "    cum_reward_nvgreedy_t = np.cumsum(cum_reward_nvgreedy_t)/(np.array(range(T))+1)\n", "    cum_reward_nvgreedy.append(cum_reward_nvgreedy_t)" ] }, { "cell_type": "markdown", "id": "5c377ca8-53db-4018-b3bd-ee008e4c9db4", "metadata": {}, "source": [ "### Results\n", "\n", "On the one hand, while the greedy algorithm outperforms the TS algorithms in the early stages, when the TS algorithms are still spending rounds on exploration, both TS algorithms continue to learn new information from the environment and eventually outperform the greedy algorithm. 
On the other hand, comparing the two TS algorithms shows that TS with an informative prior outperforms the uninformative TS, especially in the early stages, because it uses the prior information obtained from the CEL step. Based on the last replicate of the informative TS, the final posterior mean of $\\boldsymbol{\\gamma}$ is summarized as follows (it can be retrieved by `informative_TS.u`):\n", "\n", "| | intercept | age | gender | college/grad student | executive/managerial | other | technician/engineer |\n", "|----------|:---------:|-------|--------|----------------------|----------------------|-------|---------------------|\n", "| Comedy | 3.426 | 0.001 | -0.045 | -0.191 | -0.062 | -0.223 | -0.149 |\n", "| Drama | 3.884 | -0.010 | -0.029 | -0.161 | -0.011 | 0.039 | -0.117 |\n", "| Action | 2.429 | 0.023 | -0.287 | 0.151 | 0.391 | -0.391 | -0.182 |\n", "| Thriller | 3.674 | -0.008 | -0.040 | -0.267 | -0.028 | -0.000 | -0.061 |\n", "| Sci-Fi | 3.114 | 0.004 | -0.041 | 0.088 | -0.004 | -0.063 | -0.578 |\n", "\n", "To use the estimated results greedily, we can calculate the mean reward of each movie genre using the estimated $\\boldsymbol{\\gamma}$ and the incoming user's information, and then recommend the genre with the highest estimated mean reward." ] }, { "cell_type": "code", "execution_count": 37, "id": "9386bd02", "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "import seaborn as sns\n", "import matplotlib.pyplot as plt\n", "result = pd.DataFrame({'rep':np.concatenate([np.ones(T)*rep for rep in range(S)]*4),\n", " 't':np.concatenate([range(T)]*4*S),\n", " 'Reward':np.concatenate(cum_reward_informative+cum_reward_uninformative+cum_reward_greedy+cum_reward_nvgreedy),\n", " 'Algo':['info_LinTS']*T*S+['uninfo_LinTS']*T*S+['personalized greedy']*T*S+['naive greedy']*T*S})\n", "sns.lineplot(data=result[result.t>0], x='t', y=\"Reward\", hue=\"Algo\", ci = 95,\n", " n_boot = 20, linewidth = 1.0, markers = False)\n", "plt.legend(bbox_to_anchor=(1.02, 1), loc='upper left', borderaxespad=0)\n", "plt.ylabel('Average Reward')\n", "plt.ylim(3.35, 3.55)\n", "plt.savefig('my_figure.png', bbox_inches='tight')" ] }, { "cell_type": "code", "execution_count": null, "id": "f648587e", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.13" } }, "nbformat": 4, "nbformat_minor": 5 }