{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "ca14b891",
   "metadata": {},
   "source": [
    "# Heterogeneous Effects\n",
    "\n",
    "**Author**\n",
    "\n",
    "> - [Paul Schrimpf *UBC*](https://economics.ubc.ca/faculty-and-staff/paul-schrimpf/)  \n",
    "\n",
    "\n",
    "\n",
    "**Prerequisites**\n",
    "\n",
    "- [Regression](https://datascience.quantecon.org/../tools/regression.html)  \n",
    "- [Machine Learning in Economics](https://datascience.quantecon.org/ml_in_economics.html)  \n",
    "\n",
    "\n",
    "**Outcomes**\n",
    "\n",
    "- Understand potential outcomes and treatment effects  \n",
    "- Apply generic machine learning inference to data from a randomized experiment  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "875326a3",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "# Uncomment following line to install on colab\n",
    "#! pip install fiona geopandas xgboost gensim folium pyLDAvis descartes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "34a9105a",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import numpy as np\n",
    "import patsy\n",
    "from sklearn import linear_model, ensemble, base, neural_network\n",
    "import statsmodels.formula.api as smf\n",
    "import statsmodels.api as sm\n",
    "from sklearn.utils._testing import ignore_warnings\n",
    "from sklearn.exceptions import ConvergenceWarning\n",
    "\n",
    "%matplotlib inline\n",
    "import matplotlib.pyplot as plt\n",
    "import seaborn as sns"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f9c2896d",
   "metadata": {},
   "source": [
    "In this notebook, we will learn how to apply machine learning methods to analyze\n",
    "results of a randomized experiment. We typically begin analyzing\n",
    "experimental results by calculating the difference in mean\n",
    "outcomes between the treated and control groups. This difference estimates well\n",
    "the average treatment effect. We can obtain more\n",
    "nuanced results by recognizing that the effect of most experiments\n",
    "might be heterogeneous. That is, different people could be affected by\n",
    "the experiment differently. We will use machine learning methods to\n",
    "explore this heterogeneity in treatment effects."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "36a9d26d",
   "metadata": {},
   "source": [
    "## Background and Data\n",
    "\n",
    "We are going to use data from a randomized  experiment in Indonesia\n",
    "called Program Keluarga Harapan (PKH). PKH was a conditional cash\n",
    "transfer program designed to improve child health. Eligible pregnant\n",
    "women would receive a cash transfer if they attended at least 4\n",
    "pre-natal and 2 post-natal visits, received iron supplements, and had\n",
    "their baby delivered by a doctor or midwife. The cash transfers were\n",
    "given quarterly and were about 60-220 dollars or 15-20 percent of\n",
    "quarterly consumption. PKH eligibility was randomly assigned at the\n",
    "kecamatan (district) level. All pregnant women living in a treated\n",
    "kecamatan could choose to participate in the experiment. For more\n",
    "information see [[hetACE+11](#id21)] or [[hetTri16](#id22)].\n",
    "\n",
    "We are using the data provided with [[hetTri16](#id22)]."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "dc03b468",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "url = \"https://datascience.quantecon.org/assets/data/Triyana_2016_price_women_clean.csv\"\n",
    "df = pd.read_csv(url)\n",
    "df.describe()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e39194e0",
   "metadata": {},
   "source": [
    "## Potential Outcomes and Treatment Effects\n",
    "\n",
    "Since program eligibility was randomly assigned (and what\n",
    "policymakers could choose to change), we will focus on estimating\n",
    "the effect of eligibility. We will let\n",
    "$ d_i $ be a 1 if person $ i $ was eligible and be 0 if not.\n",
    "Let $ y_i $ be an outcome of interest. Below, we\n",
    "will look at midwife usage and birth weight as outcomes.\n",
    "\n",
    "It is\n",
    "useful to think about potential outcomes of the treatment. The potential treated\n",
    "outcome is $ y_i(1) $. For subjects who actually were treated,\n",
    "$ y_i(1) = y_i $ is the observed outcome. For untreated subjects,\n",
    "$ y_i(1) $ is what mother i ‘s baby’s birth weight would have\n",
    "been if she had been eligible for the program. Similarly, we can\n",
    "define the potential untreated outcome $ y_i(0) $ .\n",
    "\n",
    "The individual treatment effect for subject i is $ y_i(1) - y_i(0) $.\n",
    "\n",
    "Individual treatment effects are impossible to know since we always\n",
    "only observe $ y_i(1) $ or $ y_i(0) $, but never both.\n",
    "\n",
    "When treatment is randomly assigned, we can estimate average treatment\n",
    "effects because\n",
    "\n",
    "$$\n",
    "\\begin{align*}\n",
    "E[y_i(1) - y_i(0) ] = & E[y_i(1)] - E[y_i(0)] \\\\\n",
    "& \\text{random assignment } \\\\\n",
    "= & E[y_i(1) | d_i = 1] - E[y_i(0) | d_i = 0] \\\\\n",
    "= & E[y_i | d_i = 1] - E[y_i | d_i = 0 ]\n",
    "\\end{align*}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "44ded1f0",
   "metadata": {},
   "source": [
    "### Average Treatment Effects\n",
    "\n",
    "Let’s estimate the average treatment effect."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1c2b3284",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "# some data prep for later\n",
    "formula = \"\"\"\n",
    "bw ~ pkh_kec_ever +\n",
    "  C(edu)*C(agecat) + log_xp_percap + hh_land + hh_home + C(dist) +\n",
    "  hh_phone + hh_rf_tile + hh_rf_shingle + hh_rf_fiber +\n",
    "  hh_wall_plaster + hh_wall_brick + hh_wall_wood + hh_wall_fiber +\n",
    "  hh_fl_tile + hh_fl_plaster + hh_fl_wood + hh_fl_dirt +\n",
    "  hh_water_pam + hh_water_mechwell + hh_water_well + hh_water_spring + hh_water_river +\n",
    "  hh_waterhome +\n",
    "  hh_toilet_own + hh_toilet_pub + hh_toilet_none +\n",
    "  hh_waste_tank + hh_waste_hole + hh_waste_river + hh_waste_field +\n",
    "  hh_kitchen +\n",
    "  hh_cook_wood + hh_cook_kerosene + hh_cook_gas +\n",
    "  tv + fridge + motorbike + car + goat + cow + horse\n",
    "\"\"\"\n",
    "bw, X = patsy.dmatrices(formula, df, return_type=\"dataframe\")\n",
    "# some categories are empty after dropping rows will Null, drop now\n",
    "X = X.loc[:, X.sum() > 0]\n",
    "bw = bw.iloc[:, 0]\n",
    "treatment_variable = \"pkh_kec_ever\"\n",
    "treatment = X[\"pkh_kec_ever\"]\n",
    "Xl = X.drop([\"Intercept\", \"pkh_kec_ever\", \"C(dist)[T.313175]\"], axis=1)\n",
    "#scale = bw.std()\n",
    "#center = bw.mean()\n",
    "loc_id = df.loc[X.index, \"Location_ID\"].astype(\"category\")\n",
    "\n",
    "import re\n",
    "# remove [ ] from names for compatibility with xgboost\n",
    "Xl = Xl.rename(columns=lambda x: re.sub('\\[|\\]','_',x))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6e516637",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "# Estimate average treatment effects\n",
    "from statsmodels.iolib.summary2 import summary_col\n",
    "tmp = pd.DataFrame(dict(birthweight=bw,treatment=treatment,assisted_delivery=df.loc[X.index, \"good_assisted_delivery\"]))\n",
    "usage = smf.ols(\"assisted_delivery ~ treatment\", data=tmp).fit(cov_type=\"cluster\", cov_kwds={'groups':loc_id})\n",
    "health= smf.ols(\"bw ~ treatment\", data=tmp).fit(cov_type=\"cluster\", cov_kwds={'groups':loc_id})\n",
    "summary_col([usage, health])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "814da018",
   "metadata": {},
   "source": [
    "The program did increase the percent of births assisted by a medical\n",
    "professional, but on average, did not affect birth weight."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "652fda62",
   "metadata": {},
   "source": [
    "### Conditional Average Treatment Effects\n",
    "\n",
    "Although we can never estimate individual treatment effects, the\n",
    "logic that lets us estimate unconditional average treatment effects\n",
    "also suggests that we can estimate conditional average treatment effects.\n",
    "\n",
    "$$\n",
    "\\begin{align*}\n",
    "E[y_i(1) - y_i(0) |X_i=x] = & E[y_i(1)|X_i = x] - E[y_i(0)|X_i=x] \\\\\n",
    "& \\text{random assignment } \\\\\n",
    "= & E[y_i(1) | d_i = 1, X_i=x] - E[y_i(0) | d_i = 0, X_i=x] \\\\\n",
    "= & E[y_i | d_i = 1, X_i = x] - E[y_i | d_i = 0, X_i=x ]\n",
    "\\end{align*}\n",
    "$$\n",
    "\n",
    "Conditional average treatment effects tell us whether there are\n",
    "identifiable (by X) groups of people for with varying treatment effects vary.\n",
    "\n",
    "Since conditional average treatment effects involve conditional\n",
    "expectations, machine learning methods might be useful.\n",
    "\n",
    "However, if we want to be able to perform statistical inference, we\n",
    "must use machine learning methods carefully. We will detail one\n",
    "approach below. [[hetAI16](#id20)] and [[hetWA18](#id19)] are\n",
    "alternative approaches."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2b545418",
   "metadata": {},
   "source": [
    "## Generic Machine Learning Inference\n",
    "\n",
    "In this section, we will describe the “generic machine learning\n",
    "inference” method of [[hetCDDFV18](#id18)] to explore heterogeneity in\n",
    "conditional average treatment effects.\n",
    "\n",
    "This approach allows any\n",
    "machine learning method to be used to estimate $ E[y_i(1) -\n",
    "y_i(0) |X_i=x] $.\n",
    "\n",
    "Inference for functions estimated by machine learning methods is\n",
    "typically either impossible or requires very restrictive assumptions.\n",
    "[[hetCDDFV18](#id18)] gets around this problem by focusing on inference for\n",
    "certain summary statistics of the machine learning prediction for\n",
    "$ E[y_i(1) - y_i(0) |X_i=x] $ rather than\n",
    "$ E[y_i(1) - y_i(0) |X_i=x] $ itself."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b69c20eb",
   "metadata": {},
   "source": [
    "### Best Linear Projection of CATE\n",
    "\n",
    "Let $ s_0(x) = E[y_i(1) - y_i(0) |X_i=x] $ denote the true\n",
    "conditional average treatment effect. Let $ S(x) $ be an estimate\n",
    "or noisy proxy for $ s_0(x) $. One way to summarize how well\n",
    "$ S(x) $ approximates $ s_0(x) $ is to look at the best linear\n",
    "projection of $ s_0(x) $ on $ S(x) $.\n",
    "\n",
    "$$\n",
    "\\DeclareMathOperator*{\\argmin}{arg\\,min}\n",
    "\\beta_0, \\beta_1 = \\argmin_{b_0,b_1} E[(s_0(x) -\n",
    "b_0 - b_1 (S(x)-E[S(x)]))^2]\n",
    "$$\n",
    "\n",
    "Showing that $ \\beta_0 = E[y_i(1) - y_i(0)] $\n",
    "is the unconditional average treatment effect is not difficult. More interestingly,\n",
    "$ \\beta_1 $ is related to how well $ S(x) $ approximates\n",
    "$ s_0(x) $. If $ S(x) = s_0(x) $, then $ \\beta_1=1 $. If\n",
    "$ S(x) $ is completely uncorrelated with $ s_0(x) $, then\n",
    "$ \\beta_1 = 0 $.\n",
    "\n",
    "The best linear projection of the conditional average treatment\n",
    "effect tells us something about how well $ S(x) $ approximates\n",
    "$ s_0(x) $, but does not directly quantify how much the conditional\n",
    "average treatment effect varies with $ x $. We could try looking\n",
    "at $ S(x) $ directly, but if $ x $ is high dimensional, reporting or visualizing\n",
    "$ S(x) $ will be difficult. Moreover, most\n",
    "machine learning methods have no satisfactory method to determine inferences\n",
    "on $ S(x) $. This is very problematic if we want to use\n",
    "$ S(x) $ to shape future policy decisions. For example, we might\n",
    "want to use $ S(x) $ to target the treatment to people with\n",
    "different $ x $. If we do this, we need to know whether the\n",
    "estimated differences across $ x $ in $ S(x) $ are precise or\n",
    "caused by noise."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e016fa01",
   "metadata": {},
   "source": [
    "### Grouped Average Treatment Effects\n",
    "\n",
    "To deal with both these issues, [[hetCDDFV18](#id18)] focuses on\n",
    "grouped average treatment effects (GATE) with groups defined by\n",
    "$ S(x) $. Partition the data into a fixed, finite number of groups\n",
    "based on $ S(x) $ . Let\n",
    "$ G_{k}(x) = 1\\{\\ell_{k-1} \\leq S(x) \\leq \\ell_k \\} $ where\n",
    "$ \\ell_k $ could be a constant chosen by the researcher or evenly\n",
    "spaced quantiles of $ S(x) $. The $ k $ th grouped average\n",
    "treatment effect is then $ \\gamma_k = E[y(1) - y(0) | G_k(x)] $.\n",
    "If the true $ s_0(x) $ is not constant, and $ S(x) $\n",
    "approximates $ s_0(x) $ well, then the grouped average treatment\n",
    "effects will increase with $ k $. If the conditional average treatment effect\n",
    "has no heterogeneity (i.e. $ s_0(x) $ is constant) and/or\n",
    "$ S(x) $ is a poor approximation to $ s_0(x) $,\n",
    "then the grouped average treatment effect will tend to\n",
    "be constant with $ k $ and may even be non-monotonic due to\n",
    "estimation error."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fc33b6b0",
   "metadata": {},
   "source": [
    "### Estimation\n",
    "\n",
    "We can estimate both the best linear projection of the conditional average treatment\n",
    "effect and the grouped treatment effects by using\n",
    "particular regressions. Let $ B(x) $ be an estimate of the outcome\n",
    "conditional on no treatment, i.e. $ B(x) = \\widehat{E[y(0)|x]} $\n",
    ". Then the estimates of $ \\beta $ from the regression\n",
    "\n",
    "$$\n",
    "y_i = \\alpha_0 + \\alpha_1 B(x_i) + \\beta_0 (d_i-P(d=1)) + \\beta_1\n",
    "(d_i-P(d=1))(S(x_i) - E[S(x_i)]) + \\epsilon_i\n",
    "$$\n",
    "\n",
    "are consistent estimates of the best linear projection of the\n",
    "conditional average treatment effect if $ B(x_i) $ and\n",
    "$ S(x_i) $ are uncorrelated with $ y_i $ . We can ensure that\n",
    "$ B(x_i) $ and $ S(x_i) $ are uncorrelated with $ y_i $ by\n",
    "using the familiar idea of sample-splitting and cross-validation. The\n",
    "usual regression standard errors will also be valid.\n",
    "\n",
    "Similarly, we can estimate grouped average treatment effects from the\n",
    "following regression.\n",
    "\n",
    "$$\n",
    "y_i = \\alpha_0 + \\alpha_1 B(x_i) + \\sum_k \\gamma_k (d_i-P(d=1)) 1(G_k(x_i)) +\n",
    "u_i\n",
    "$$\n",
    "\n",
    "The resulting estimates of $ \\gamma_k $ will be consistent and\n",
    "asymptotically normal with the usual regression standard errors."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d2218925",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "# for clustering standard errors\n",
    "def get_treatment_se(fit, cluster_id, rows=None):\n",
    "    if cluster_id is not None:\n",
    "        if rows is None:\n",
    "            rows = [True] * len(cluster_id)\n",
    "        vcov = sm.stats.sandwich_covariance.cov_cluster(fit, cluster_id.loc[rows])\n",
    "        return np.sqrt(np.diag(vcov))\n",
    "\n",
    "    return fit.HC0_se"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "eb219f8e",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "def generic_ml_model(x, y, treatment, model, n_split=10, n_group=5, cluster_id=None):\n",
    "    nobs = x.shape[0]\n",
    "\n",
    "    blp = np.zeros((n_split, 2))\n",
    "    blp_se = blp.copy()\n",
    "    gate = np.zeros((n_split, n_group))\n",
    "    gate_se = gate.copy()\n",
    "\n",
    "    baseline = np.zeros((nobs, n_split))\n",
    "    cate = baseline.copy()\n",
    "    lamb = np.zeros((n_split, 2))\n",
    "\n",
    "    for i in range(n_split):\n",
    "        main = np.random.rand(nobs) > 0.5\n",
    "        rows1 = ~main & (treatment == 1)\n",
    "        rows0 = ~main & (treatment == 0)\n",
    "\n",
    "        mod1 = base.clone(model).fit(x.loc[rows1, :], (y.loc[rows1]))\n",
    "        mod0 = base.clone(model).fit(x.loc[rows0, :], (y.loc[rows0]))\n",
    "\n",
    "        B = mod0.predict(x)\n",
    "        S = mod1.predict(x) - B\n",
    "        baseline[:, i] = B\n",
    "        cate[:, i] = S\n",
    "        ES = S.mean()\n",
    "\n",
    "        ## BLP\n",
    "        # assume P(treat|x) = P(treat) = mean(treat)\n",
    "        p = treatment.mean()\n",
    "        reg_df = pd.DataFrame(dict(\n",
    "            y=y, B=B, treatment=treatment, S=S, main=main, excess_S=S-ES\n",
    "        ))\n",
    "        reg = smf.ols(\"y ~ B + I(treatment-p) + I((treatment-p)*(S-ES))\", data=reg_df.loc[main, :])\n",
    "        reg_fit = reg.fit()\n",
    "        blp[i, :] = reg_fit.params.iloc[2:4]\n",
    "        blp_se[i, :] = get_treatment_se(reg_fit, cluster_id, main)[2:]\n",
    "\n",
    "        lamb[i, 0] = reg_fit.params.iloc[-1]**2 * S.var()\n",
    "\n",
    "        ## GATEs\n",
    "        cutoffs = np.quantile(S, np.linspace(0,1, n_group + 1))\n",
    "        cutoffs[-1] += 1\n",
    "        for k in range(n_group):\n",
    "            reg_df[f\"G{k}\"] = (cutoffs[k] <= S) & (S < cutoffs[k+1])\n",
    "\n",
    "        g_form = \"y ~ B + \" + \" + \".join([f\"I((treatment-p)*G{k})\" for k in range(n_group)])\n",
    "        g_reg = smf.ols(g_form, data=reg_df.loc[main, :])\n",
    "        g_fit = g_reg.fit()\n",
    "        gate[i, :] = g_fit.params.values[2:] #g_fit.params.filter(regex=\"G\").values\n",
    "        gate_se[i, :] = get_treatment_se(g_fit, cluster_id, main)[2:]\n",
    "\n",
    "        lamb[i, 1] = (gate[i,:]**2).sum()/n_group\n",
    "\n",
    "    out = dict(\n",
    "        gate=gate, gate_se=gate_se,\n",
    "        blp=blp, blp_se=blp_se,\n",
    "        Lambda=lamb, baseline=baseline, cate=cate,\n",
    "        name=type(model).__name__\n",
    "    )\n",
    "    return out\n",
    "\n",
    "\n",
    "def generic_ml_summary(generic_ml_output):\n",
    "    out = {\n",
    "        x: np.nanmedian(generic_ml_output[x], axis=0)\n",
    "        for x in [\"blp\", \"blp_se\", \"gate\", \"gate_se\", \"Lambda\"]\n",
    "    }\n",
    "    out[\"name\"] = generic_ml_output[\"name\"]\n",
    "    return out"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "706cdc80",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "kw = dict(x=Xl, treatment=treatment, n_split=11, n_group=5, cluster_id=loc_id)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cccac2e3",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "@ignore_warnings(category=ConvergenceWarning)\n",
    "def evaluate_models(models, y, **other_kw):\n",
    "    all_kw = kw.copy()\n",
    "    all_kw[\"y\"] = y\n",
    "    all_kw.update(other_kw)\n",
    "    return list(map(lambda x: generic_ml_model(model=x, **all_kw), models))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3193562c",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "def generate_report(results):\n",
    "    summaries = list(map(generic_ml_summary, results))\n",
    "    df_plot = pd.DataFrame({\n",
    "        mod[\"name\"]: np.median(mod[\"cate\"], axis=1)\n",
    "        for mod in results\n",
    "    })\n",
    "\n",
    "    print(\"Correlation in median CATE:\")\n",
    "    display(df_plot.corr())\n",
    "    sns.pairplot(df_plot, diag_kind=\"kde\", kind=\"reg\")\n",
    "\n",
    "    print(\"\\n\\nBest linear projection of CATE\")\n",
    "    df_cate = pd.concat({\n",
    "        s[\"name\"]: pd.DataFrame(dict(blp=s[\"blp\"], se=s[\"blp_se\"]))\n",
    "        for s in summaries\n",
    "    }).T.stack()\n",
    "    display(df_cate)\n",
    "\n",
    "    print(\"\\n\\nGroup average treatment effects:\")\n",
    "    df_groups = pd.concat({\n",
    "        s[\"name\"]: pd.DataFrame(dict(gate=s[\"gate\"], se=s[\"gate_se\"]))\n",
    "        for s in summaries\n",
    "    }).T.stack()\n",
    "    display(df_groups)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "91491001",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "import xgboost as xgb\n",
    "models = [\n",
    "    linear_model.LassoCV(cv=10, n_alphas=25, max_iter=500, tol=1e-4, n_jobs=1),\n",
    "    ensemble.RandomForestRegressor(n_estimators=200, min_samples_leaf=20),\n",
    "    xgb.XGBRegressor(n_estimators=200, max_depth=3, reg_lambda=2.0, reg_alpha=0.0, objective=\"reg:squarederror\"),\n",
    "    neural_network.MLPRegressor(hidden_layer_sizes=(20, 10), max_iter=500, activation=\"logistic\",\n",
    "                                solver=\"adam\", tol=1e-3, early_stopping=True, alpha=0.0001)\n",
    "]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a35f846c",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "results = evaluate_models(models, y=bw)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "86f86236",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "generate_report(results)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f723aac6",
   "metadata": {},
   "source": [
    "From the second table above, we see that regardless of the machine\n",
    "learning method, the estimated intercept (the first row of the table)\n",
    "is near 0 and statistically insignificant. Given our results for the unconditional\n",
    "ATE above, we should expect this. The estimate of the\n",
    "slopes are also either near 0, very imprecise, or both. This means\n",
    "that either the conditional average treatment effect is near 0 or that all\n",
    "four machine learning methods are very poor proxies for the true\n",
    "conditional average treatment effect."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "95cc3ed7",
   "metadata": {},
   "source": [
    "### Assisted Delivery\n",
    "\n",
    "Let’s see what we get when we look at assisted delivery."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "637993fa",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "ad = df.loc[X.index, \"good_assisted_delivery\"]#\"midwife_birth\"]\n",
    "results_ad = evaluate_models(models, y=ad)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9a508cdc",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "generate_report(results_ad)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bcdfd8c6",
   "metadata": {},
   "source": [
    "Now, the results are more encouraging. For all four machine learning\n",
    "methods, the slope estimate is positive and statistically\n",
    "significant. From this, we can conclude that the true conditional\n",
    "average treatment effect must vary with at least some covariates, and\n",
    "the machine learning proxies are at least somewhat correlated with the\n",
    "true conditional average treatment effect."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5fda0a45",
   "metadata": {},
   "source": [
    "### Covariate Means by Group\n",
    "\n",
    "Once we’ve detected heterogeneity in the grouped average treatment effects\n",
    "of using medical professionals for assisted delivery, it’s interesting to see\n",
    "how effects vary across groups. This could help\n",
    "us understand why the treatment effect varies or how to\n",
    "develop simple rules for targeting future treatments."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "17f358f8",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "df2 = df.loc[X.index, :]\n",
    "df2[\"edu99\"] = df2.edu == 99\n",
    "df2[\"educ\"] = df2[\"edu\"]\n",
    "df2.loc[df2[\"edu99\"], \"educ\"] = np.nan\n",
    "\n",
    "variables = [\n",
    "    \"log_xp_percap\",\"agecat\",\"educ\",\"tv\",\"goat\",\n",
    "    \"cow\",\"motorbike\",\"hh_cook_wood\",\"pkh_ever\"\n",
    "]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "51b0cd7a",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "def cov_mean_by_group(y, res, cluster_id):\n",
    "    n_group = res[\"gate\"].shape[1]\n",
    "    gate = res[\"gate\"].copy()\n",
    "    gate_se = gate.copy()\n",
    "    dat = y.to_frame()\n",
    "\n",
    "    for i in range(res[\"cate\"].shape[1]):\n",
    "        S = res[\"cate\"][:, i]\n",
    "        cutoffs = np.quantile(S, np.linspace(0, 1, n_group+1))\n",
    "        cutoffs[-1] += 1\n",
    "        for k in range(n_group):\n",
    "            dat[f\"G{k}\"] = ((cutoffs[k] <= S) & (S < cutoffs[k+1])) * 1.0\n",
    "\n",
    "        g_form = \"y ~ -1 + \" + \" + \".join([f\"G{k}\" for k in range(n_group)])\n",
    "        g_reg = smf.ols(g_form, data=dat.astype(float))\n",
    "        g_fit = g_reg.fit()\n",
    "        gate[i, :] = g_fit.params.filter(regex=\"G\").values\n",
    "        rows = ~y.isna()\n",
    "        gate_se[i, :] = get_treatment_se(g_fit, cluster_id, rows)\n",
    "\n",
    "    out = pd.DataFrame(dict(\n",
    "        mean=np.nanmedian(gate, axis=0),\n",
    "        se=np.nanmedian(gate_se, axis=0),\n",
    "        group=list(range(n_group))\n",
    "    ))\n",
    "\n",
    "    return out"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "16a942a4",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "def compute_group_means_for_results(results):\n",
    "    to_cat = []\n",
    "    for res in results:\n",
    "        for v in variables:\n",
    "            to_cat.append(\n",
    "                cov_mean_by_group(df2[v], res, loc_id)\n",
    "                .assign(method=res[\"name\"], variable=v)\n",
    "            )\n",
    "\n",
    "    group_means = pd.concat(to_cat, ignore_index=True)\n",
    "    group_means[\"plus2sd\"] = group_means.eval(\"mean + 1.96*se\")\n",
    "    group_means[\"minus2sd\"] = group_means.eval(\"mean - 1.96*se\")\n",
    "    return group_means"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "33883d39",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "group_means_ad = compute_group_means_for_results(results_ad)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "37757dab",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "g = sns.FacetGrid(group_means_ad, col=\"variable\", col_wrap=3, hue=\"method\", sharey=False)\n",
    "g.map(plt.plot, \"group\", \"mean\")\n",
    "g.map(plt.plot, \"group\", \"plus2sd\", ls=\"--\")\n",
    "g.map(plt.plot, \"group\", \"minus2sd\", ls=\"--\")\n",
    "g.add_legend();"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c892e743",
   "metadata": {},
   "source": [
    "From this, we see that the group predicted to be most affected by treatment\n",
    "are less educated, less likely to own a TV or\n",
    "motorbike, and more likely to participate in the program.\n",
    "\n",
    "If we wanted to maximize the program impact with a limited budget, targeting the program towards\n",
    "less educated and less wealthy mothers could be a good idea. The existing financial incentive\n",
    "already does this to some extent. As one might expect, a fixed-size\n",
    "cash incentive has a bigger behavioral impact on less wealthy\n",
    "individuals. If we want to further target these individuals, we\n",
    "could alter eligibility rules and/or increase the cash transfer\n",
    "for those with lower wealth."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8aef0f4a",
   "metadata": {},
   "source": [
    "### Caution\n",
    "\n",
    "When exploring treatment heterogeneity like above, we need to\n",
    "interpret our results carefully. In particular, looking at grouped\n",
    "treatment effects and covariate means conditional on group leads to\n",
    "many hypothesis tests (although we never stated null hypotheses or\n",
    "reported p-values, the inevitable eye-balling of differences in the\n",
    "above graphs compared to their confidence intervals has the same\n",
    "issues as formal hypothesis tests).  When we perform many hypothesis\n",
    "tests, we will likely stumble upon some statistically\n",
    "significant differences by chance. Therefore, writing about a single large difference found in the\n",
    "above analysis as though it is our main\n",
    "finding would be misleading (and perhaps unethical). The correct thing to do is to present all\n",
    "results that we have looked at.  See [this excellent news article](https://slate.com/technology/2013/07/statistics-and-psychology-multiple-comparisons-give-spurious-results.html)\n",
    "by statistician Andrew Gelman for more information."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "112e9308",
   "metadata": {},
   "source": [
    "## Causal Trees and Forests\n",
    "\n",
    "[[hetAI16](#id20)] develop the idea of “causal trees.” The purpose and\n",
    "method are qualitatively similar to the grouped average treatment\n",
    "effects. The main difference is that the groups in [[hetAI16](#id20)]\n",
    "are determined by a low-depth regression tree instead of by quantiles\n",
    "of a noisy estimate of the conditional average treatment effect. As\n",
    "above, sample-splitting is used to facilitate inference.\n",
    "\n",
    "Causal trees share many downsides of regression trees. In\n",
    "particular, the branches of the tree and subsequent results can be\n",
    "sensitive to small changes in the data.  [[hetWA18](#id19)] develop a\n",
    "causal forest estimator to address this concern. This causal forest\n",
    "estimates $ E[y_i(1) - y_i(0) |X_i=x] $ directly. Unlike most\n",
    "machine learning estimators, [[hetWA18](#id19)] prove that causal\n",
    "forests are consistent and pointwise asymptotically normal, albeit\n",
    "with a slower than $ \\sqrt{n} $ rate of convergence. In practice,\n",
    "this means that either the sample size must be very large (and/or $ x $\n",
    "relatively low dimension) to get precise estimates."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "778cb9dd",
   "metadata": {},
   "source": [
    "## References\n",
    "\n",
    "<a id='id21'></a>\n",
    "\\[hetACE+11\\] Vivi Alatas, Nur Cahyadi, Elisabeth Ekasari, Sarah Harmoun, Budi Hidayat, Edgar Janz, Jon Jellema, H Tuhiman, and M Wai-Poi. Program keluarga harapan : impact evaluation of indonesia's pilot household conditional cash transfer program. Technical Report, World Bank, 2011. URL: [http://documents.worldbank.org/curated/en/589171468266179965/Program-Keluarga-Harapan-impact-evaluation-of-Indonesias-Pilot-Household-Conditional-Cash-Transfer-Program](http://documents.worldbank.org/curated/en/589171468266179965/Program-Keluarga-Harapan-impact-evaluation-of-Indonesias-Pilot-Household-Conditional-Cash-Transfer-Program).\n",
    "\n",
    "<a id='id20'></a>\n",
    "\\[hetAI16\\] Susan Athey and Guido Imbens. Recursive partitioning for heterogeneous causal effects. *Proceedings of the National Academy of Sciences*, 113(27):7353–7360, 2016. URL: [http://www.pnas.org/content/113/27/7353](http://www.pnas.org/content/113/27/7353), [arXiv:http://www.pnas.org/content/113/27/7353.full.pdf](https://arxiv.org/abs/http://www.pnas.org/content/113/27/7353.full.pdf), [doi:10.1073/pnas.1510489113](https://doi.org/10.1073/pnas.1510489113).\n",
    "\n",
    "<a id='id18'></a>\n",
    "\\[hetCDDFV18\\] Victor Chernozhukov, Mert Demirer, Esther Duflo, and Iván Fernández-Val. Generic machine learning inference on heterogenous treatment effects in randomized experimentsxo. Working Paper 24678, National Bureau of Economic Research, June 2018. URL: [http://www.nber.org/papers/w24678](http://www.nber.org/papers/w24678), [doi:10.3386/w24678](https://doi.org/10.3386/w24678).\n",
    "\n",
    "<a id='id22'></a>\n",
    "\\[hetTri16\\] Margaret Triyana. Do health care providers respond to demand-side incentives? evidence from indonesia. *American Economic Journal: Economic Policy*, 8(4):255–88, November 2016. URL: [http://www.aeaweb.org/articles?id=10.1257/pol.20140048](http://www.aeaweb.org/articles?id=10.1257/pol.20140048), [doi:10.1257/pol.20140048](https://doi.org/10.1257/pol.20140048).\n",
    "\n",
    "<a id='id19'></a>\n",
    "\\[hetWA18\\] Stefan Wager and Susan Athey. Estimation and inference of heterogeneous treatment effects using random forests. *Journal of the American Statistical Association*, 0(0):1–15, 2018. URL: [https://doi.org/10.1080/01621459.2017.1319839](https://doi.org/10.1080/01621459.2017.1319839), [arXiv:https://doi.org/10.1080/01621459.2017.1319839](https://arxiv.org/abs/https://doi.org/10.1080/01621459.2017.1319839), [doi:10.1080/01621459.2017.1319839](https://doi.org/10.1080/01621459.2017.1319839)."
   ]
  }
 ],
 "metadata": {
  "date": 1738727696.6522565,
  "filename": "heterogeneity.md",
  "kernelspec": {
   "display_name": "Python",
   "language": "python3",
   "name": "python3"
  },
  "title": "Heterogeneous Effects"
 },
 "nbformat": 4,
 "nbformat_minor": 5
}