|
| 1 | +{ |
| 2 | + "cells": [ |
| 3 | + { |
| 4 | + "cell_type": "markdown", |
| 5 | + "id": "fbb58475", |
| 6 | + "metadata": {}, |
| 7 | + "source": [ |
| 8 | + "# Running Multiple Chains (Sequentially or in Parallel) in StochTree\n", |
| 9 | + "\n", |
| 10 | + "Mixing of an MCMC sampler is a perennial concern for complex Bayesian models. BART and BCF are no exception. On common way to address such concerns is to run multiple independent \"chains\" of an MCMC sampler, so that if each chain gets stuck in a different region of the posterior, their combined samples attain better coverage of the full posterior.\n", |
| 11 | + "\n", |
| 12 | + "This idea works with the classic \"root-initialized\" MCMC sampler of Chipman et al (2010), but a key insight of He and Hahn (2023) and Krantsevich et al (2023) is that the GFR algorithm may be used to warm-start initialize multiple chains of the BART / BCF MCMC sampler.\n", |
| 13 | + "\n", |
| 14 | + "Operationally, the above two approaches have the same implementation (setting `num_gfr > 0` if warm-start initialization is desired), so this vignette will demonstrate how to run a multi-chain sampler sequentially.\n", |
| 15 | + "\n", |
| 16 | + "To begin, load `stochtree` and other relevant libraries" |
| 17 | + ] |
| 18 | + }, |
| 19 | + { |
| 20 | + "cell_type": "code", |
| 21 | + "execution_count": null, |
| 22 | + "id": "1310a192", |
| 23 | + "metadata": {}, |
| 24 | + "outputs": [], |
| 25 | + "source": [ |
| 26 | + "import numpy as np\n", |
| 27 | + "import matplotlib.pyplot as plt\n", |
| 28 | + "import arviz as az\n", |
| 29 | + "from sklearn.model_selection import train_test_split\n", |
| 30 | + "from stochtree import BARTModel, BCFModel" |
| 31 | + ] |
| 32 | + }, |
| 33 | + { |
| 34 | + "cell_type": "markdown", |
| 35 | + "id": "7ea3a091", |
| 36 | + "metadata": {}, |
| 37 | + "source": [ |
| 38 | + "# Demo 1: Supervised Learning" |
| 39 | + ] |
| 40 | + }, |
| 41 | + { |
| 42 | + "cell_type": "markdown", |
| 43 | + "id": "c85b6eac", |
| 44 | + "metadata": {}, |
| 45 | + "source": [ |
| 46 | + "## Data Simulation" |
| 47 | + ] |
| 48 | + }, |
| 49 | + { |
| 50 | + "cell_type": "markdown", |
| 51 | + "id": "ea171e49", |
| 52 | + "metadata": {}, |
| 53 | + "source": [ |
| 54 | + "Simulate a simple partitioned linear model" |
| 55 | + ] |
| 56 | + }, |
| 57 | + { |
| 58 | + "cell_type": "code", |
| 59 | + "execution_count": null, |
| 60 | + "id": "0e80f3b7", |
| 61 | + "metadata": {}, |
| 62 | + "outputs": [], |
| 63 | + "source": [ |
| 64 | + "# Generate the data\n", |
| 65 | + "random_seed = 1111\n", |
| 66 | + "rng = np.random.default_rng(random_seed)\n", |
| 67 | + "n = 500\n", |
| 68 | + "p_x = 10\n", |
| 69 | + "p_w = 1\n", |
| 70 | + "snr = 3\n", |
| 71 | + "X = rng.uniform(size=(n, p_x))\n", |
| 72 | + "leaf_basis = rng.uniform(size=(n, p_w))\n", |
| 73 | + "f_XW = (((X[:, 0] >= 0) & (X[:, 0] < 0.25)) *\n",
| 74 | + " (-7.5 * leaf_basis[:, 0]) +\n",
| 75 | + " ((X[:, 0] >= 0.25) & (X[:, 0] < 0.5)) * (-2.5 * leaf_basis[:, 0]) +\n",
| 76 | + " ((X[:, 0] >= 0.5) & (X[:, 0] < 0.75)) * (2.5 * leaf_basis[:, 0]) +\n",
| 77 | + " ((X[:, 0] >= 0.75) & (X[:, 0] < 1)) * (7.5 * leaf_basis[:, 0]))\n",
| 78 | + "noise_sd = np.std(f_XW) / snr\n", |
| 79 | + "y = f_XW + rng.normal(0, noise_sd, size=n)\n", |
| 80 | + "\n", |
| 81 | + "# Split data into test and train sets\n", |
| 82 | + "test_set_pct = 0.2\n", |
| 83 | + "train_inds, test_inds = train_test_split(np.arange(n), test_size=test_set_pct, random_state=random_seed)\n", |
| 84 | + "n_train = len(train_inds)\n", |
| 85 | + "n_test = len(test_inds)\n", |
| 86 | + "X_train = X[train_inds]\n", |
| 87 | + "X_test = X[test_inds]\n", |
| 88 | + "leaf_basis_train = leaf_basis[train_inds]\n", |
| 89 | + "leaf_basis_test = leaf_basis[test_inds]\n", |
| 90 | + "y_train = y[train_inds]\n", |
| 91 | + "y_test = y[test_inds]" |
| 92 | + ] |
| 93 | + }, |
| 94 | + { |
| 95 | + "cell_type": "markdown", |
| 96 | + "id": "dfb36dbe", |
| 97 | + "metadata": {}, |
| 98 | + "source": [ |
| 99 | + "## Sampling Multiple Chains Sequentially\n", |
| 100 | + "\n", |
| 101 | + "The simplest way to sample multiple chains of a stochtree model is to do so \"sequentially\": after chain 1 is sampled, chain 2 is sampled from a different starting state, and so on for each of the requested chains. This is supported internally by both `BARTModel.sample()` and `BCFModel.sample()`, via the `num_chains` entry of the `general_params` dictionary.\n",
| 102 | + "\n", |
| 103 | + "Define some high-level parameters, including the number of chains to run and the number of samples per chain. Here we run 4 independent chains of 5000 MCMC iterations each; every chain is initialized from a different \"grow-from-root\" sample (the last 4 of the 5 GFR samples) and burned in for 2000 iterations after the warm-start."
| 104 | + ] |
| 105 | + }, |
| 106 | + { |
| 107 | + "cell_type": "code", |
| 108 | + "execution_count": null, |
| 109 | + "id": "e3e978d6", |
| 110 | + "metadata": {}, |
| 111 | + "outputs": [], |
| 112 | + "source": [ |
| 113 | + "num_chains = 4\n", |
| 114 | + "num_gfr = 5\n", |
| 115 | + "num_burnin = 2000\n", |
| 116 | + "num_mcmc = 5000" |
| 117 | + ] |
| 118 | + }, |
| 119 | + { |
| 120 | + "cell_type": "markdown", |
| 121 | + "id": "89e67fe1", |
| 122 | + "metadata": {}, |
| 123 | + "source": [ |
| 124 | + "Run the sampler" |
| 125 | + ] |
| 126 | + }, |
| 127 | + { |
| 128 | + "cell_type": "code", |
| 129 | + "execution_count": null, |
| 130 | + "id": "c1af3f2d", |
| 131 | + "metadata": {}, |
| 132 | + "outputs": [], |
| 133 | + "source": [ |
| 134 | + "bart_model = BARTModel()\n", |
| 135 | + "bart_model.sample(\n", |
| 136 | + " X_train = X_train,\n", |
| 137 | + " leaf_basis_train = leaf_basis_train,\n", |
| 138 | + " y_train = y_train,\n", |
| 139 | + " num_gfr = num_gfr,\n", |
| 140 | + " num_burnin = num_burnin,\n", |
| 141 | + " num_mcmc = num_mcmc,\n", |
| 142 | + " general_params = {'num_chains' : num_chains}\n", |
| 143 | + ")" |
| 144 | + ] |
| 145 | + }, |
| 146 | + { |
| 147 | + "cell_type": "markdown", |
| 148 | + "id": "5bbcc00a", |
| 149 | + "metadata": {}, |
| 150 | + "source": [ |
| 151 | + "Now we have a `BARTModel` object with `num_chains * num_mcmc` samples stored internally. These samples are arranged sequentially: the first `num_mcmc` samples correspond to chain 1, the next `num_mcmc` samples to chain 2, and so on (a short illustration of this layout follows the prediction plot below).\n",
| 152 | + "\n", |
| 153 | + "Since each chain is a set of samples of the same model, we can analyze the samples collectively, for example, by looking at out-of-sample predictions." |
| 154 | + ] |
| 155 | + }, |
| 156 | + { |
| 157 | + "cell_type": "code", |
| 158 | + "execution_count": null, |
| 159 | + "id": "0cfd1a28", |
| 160 | + "metadata": {}, |
| 161 | + "outputs": [], |
| 162 | + "source": [ |
| 163 | + "y_hat_test = bart_model.predict(\n", |
| 164 | + " covariates = X_test,\n", |
| 165 | + " basis = leaf_basis_test, \n", |
| 166 | + " type = \"mean\", \n", |
| 167 | + " terms = \"y_hat\"\n", |
| 168 | + ")\n", |
| 169 | + "plt.scatter(y_hat_test, y_test)\n", |
| 170 | + "plt.xlabel(\"Estimated conditional mean\")\n", |
| 171 | + "plt.ylabel(\"Actual outcome\")\n", |
| 172 | + "plt.axline((0, 0), slope=1, color=\"black\", linestyle=(0, (3, 3)))" |
| 173 | + ] |
| 174 | + }, |
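| | + {
| | + "cell_type": "markdown",
| | + "id": "3f9a7c21",
| | + "metadata": {},
| | + "source": [
| | + "As a quick illustration of the sequential layout described above, we can slice a single chain's draws out of the flattened `global_var_samples` array (which also appears in the diagnostics below). This is a minimal sketch; it assumes only the chain-major ordering just described."
| | + ]
| | + },
| | + {
| | + "cell_type": "code",
| | + "execution_count": null,
| | + "id": "4b8d2e57",
| | + "metadata": {},
| | + "outputs": [],
| | + "source": [
| | + "# Slice out one chain's sigma^2 draws from the flattened sample array\n",
| | + "sigma2_flat = bart_model.global_var_samples\n",
| | + "chain_id = 1 # second chain (0-indexed)\n",
| | + "chain_draws = sigma2_flat[chain_id * num_mcmc:(chain_id + 1) * num_mcmc]\n",
| | + "chain_draws.shape # (num_mcmc,)"
| | + ]
| | + },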
| 175 | + { |
| 176 | + "cell_type": "markdown", |
| 177 | + "id": "89015a31", |
| 178 | + "metadata": {}, |
| 179 | + "source": [ |
| 180 | + "Now, suppose we want to analyze each of the chains separately to assess mixing / convergence.\n", |
| 181 | + "\n", |
| 182 | + "We can use our knowledge of the internal arrangement of the chain samples to reshape them into a `(num_chains, num_mcmc)` array for `arviz`, from which we can compute various mixing and convergence diagnostics."
| 183 | + ] |
| 184 | + }, |
| 185 | + { |
| 186 | + "cell_type": "code", |
| 187 | + "execution_count": null, |
| 188 | + "id": "96cee0e4", |
| 189 | + "metadata": {}, |
| 190 | + "outputs": [], |
| 191 | + "source": [ |
| 192 | + "sigma2_samples = bart_model.global_var_samples\n", |
| 193 | + "sigma2_samples_by_chain = {\"sigma2\": np.reshape(sigma2_samples, (num_chains, num_mcmc))}\n", |
| 194 | + "az.plot_trace(sigma2_samples_by_chain)" |
| 195 | + ] |
| 196 | + }, |
| 197 | + { |
| 198 | + "cell_type": "code", |
| 199 | + "execution_count": null, |
| 200 | + "id": "08137cda", |
| 201 | + "metadata": {}, |
| 202 | + "outputs": [], |
| 203 | + "source": [ |
| 204 | + "az.ess(sigma2_samples_by_chain)" |
| 205 | + ] |
| 206 | + }, |
| 207 | + { |
| 208 | + "cell_type": "code", |
| 209 | + "execution_count": null, |
| 210 | + "id": "552ba09c", |
| 211 | + "metadata": {}, |
| 212 | + "outputs": [], |
| 213 | + "source": [ |
| 214 | + "az.rhat(sigma2_samples_by_chain)" |
| 215 | + ] |
| 216 | + }, |
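| | + {
| | + "cell_type": "markdown",
| | + "id": "7c1d9b30",
| | + "metadata": {},
| | + "source": [
| | + "For intuition about what `az.rhat` measures, here is a hand-rolled version of the classic (non-split, non-rank-normalized) Gelman-Rubin statistic, which compares between-chain and within-chain variance. This is a sketch for illustration only; `arviz` computes a rank-normalized, split-chain R-hat by default, so the two numbers may differ slightly."
| | + ]
| | + },
| | + {
| | + "cell_type": "code",
| | + "execution_count": null,
| | + "id": "8e2f0a46",
| | + "metadata": {},
| | + "outputs": [],
| | + "source": [
| | + "# Classic Gelman-Rubin potential scale reduction factor (for intuition only)\n",
| | + "draws = sigma2_samples_by_chain[\"sigma2\"] # shape (num_chains, num_mcmc)\n",
| | + "n_draws = draws.shape[1]\n",
| | + "B = n_draws * draws.mean(axis=1).var(ddof=1) # between-chain variance\n",
| | + "W = draws.var(axis=1, ddof=1).mean() # within-chain variance\n",
| | + "var_plus = (n_draws - 1) / n_draws * W + B / n_draws # pooled variance estimate\n",
| | + "np.sqrt(var_plus / W) # values near 1 suggest the chains agree"
| | + ]
| | + },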
| 217 | + { |
| 218 | + "cell_type": "code", |
| 219 | + "execution_count": null, |
| 220 | + "id": "2a0c65f0", |
| 221 | + "metadata": {}, |
| 222 | + "outputs": [], |
| 223 | + "source": [ |
| 224 | + "az.plot_autocorr(sigma2_samples_by_chain)" |
| 225 | + ] |
| 226 | + }, |
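| | + {
| | + "cell_type": "markdown",
| | + "id": "9d4e1f88",
| | + "metadata": {},
| | + "source": [
| | + "`az.summary` consolidates several of these diagnostics (posterior mean, credible interval, ESS, and R-hat) into a single table; we include it here as an optional extra."
| | + ]
| | + },
| | + {
| | + "cell_type": "code",
| | + "execution_count": null,
| | + "id": "0a5b6c77",
| | + "metadata": {},
| | + "outputs": [],
| | + "source": [
| | + "az.summary(sigma2_samples_by_chain)"
| | + ]
| | + },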
| 227 | + { |
| 228 | + "cell_type": "code", |
| 229 | + "execution_count": null, |
| 230 | + "id": "5a0659ff", |
| 231 | + "metadata": {}, |
| 232 | + "outputs": [], |
| 233 | + "source": [ |
| 234 | + "az.plot_violin(sigma2_samples_by_chain)" |
| 235 | + ] |
| 236 | + } |
| 237 | + ], |
| 238 | + "metadata": { |
| 239 | + "language_info": { |
| 240 | + "name": "python" |
| 241 | + } |
| 242 | + }, |
| 243 | + "nbformat": 4, |
| 244 | + "nbformat_minor": 5 |
| 245 | +} |