202412112122
Status: #idea
Tags: Decision Trees, Bayes Theorem, Markov Chain Monte Carlo
State: #nascent

Bayesian Additive Regression Trees (BART)

Similar to both Random Forests and Boosting.
Each tree involves randomness in its construction, as in a random forest, and the trees are updated sequentially, with each tree trying to capture signal that the other trees have missed, as in Boosting.

Main novelty: how the trees are generated.

We fit K trees over B iterations.

  1. Initialise each of the K trees to a single root node whose prediction is just the mean of the response (split evenly across the trees).
  2. For each tree, compute the partial residual: the response minus the combined predictions of all the *other* trees.
  3. Apply a random perturbation to that tree (either change its structure by adding/pruning a branch, or change the predictions in its terminal nodes), fitting the perturbed tree to the partial residual.
  4. Repeat steps 2–3 for each of the K trees; that completes one iteration.
  5. Repeat for B iterations.
  6. Enjoy!?
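The loop above can be sketched in a toy form. This is *not* real BART: each "tree" is reduced to a single decision stump, the perturbation is a random threshold proposal, and a greedy accept-if-better rule stands in for the Metropolis–Hastings step a real sampler uses. The synthetic data and all variable names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y = x^2 + noise (assumed example, not from the note)
X = rng.uniform(-1, 1, size=200)
y = X**2 + rng.normal(0, 0.1, size=200)

K, B, L = 10, 200, 50  # number of trees, iterations, burn-in

# Each "tree" is reduced to a single stump: (threshold, left_value, right_value).
# This is a drastic simplification of real BART trees.
def stump_predict(stump, X):
    thr, left, right = stump
    return np.where(X < thr, left, right)

# Step 1: initialise every tree to a root node predicting mean(y)/K
stumps = [(0.0, y.mean() / K, y.mean() / K) for _ in range(K)]
preds = np.array([stump_predict(s, X) for s in stumps])  # shape (K, n)

posterior_sums = []
for b in range(B):
    for k in range(K):
        # Step 2: partial residual = response minus all *other* trees' predictions
        resid = y - (preds.sum(axis=0) - preds[k])
        # Step 3: random perturbation — propose a new threshold, refit leaf means
        thr = rng.uniform(-1, 1)
        left_mask = X < thr
        left = resid[left_mask].mean() if left_mask.any() else 0.0
        right = resid[~left_mask].mean() if (~left_mask).any() else 0.0
        proposal = (thr, left, right)
        # Greedy accept if the fit improves (real BART accepts stochastically)
        new_pred = stump_predict(proposal, X)
        if np.sum((resid - new_pred) ** 2) < np.sum((resid - preds[k]) ** 2):
            stumps[k] = proposal
            preds[k] = new_pred
    posterior_sums.append(preds.sum(axis=0).copy())

# Final estimate: average the sum-of-trees over post-burn-in iterations
f_hat = np.mean(posterior_sums[L:], axis=0)
print(np.sqrt(np.mean((f_hat - X**2) ** 2)))  # RMSE against the true signal
```

Even this crude version shows the key design choice: each tree is only ever asked to explain what the other K−1 trees currently leave unexplained.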

That's really it. The final prediction is obtained by summing the predictions of the K trees within each iteration, then averaging those sums over the iterations that remain after discarding a burn-in phase of the first L < B iterations.
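In symbols (a standard way to write this, with $\hat{f}_k^b(x)$ denoting tree $k$'s prediction at iteration $b$):

$$
\hat{f}(x) = \frac{1}{B - L} \sum_{b = L + 1}^{B} \sum_{k = 1}^{K} \hat{f}_k^{\,b}(x)
$$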

Pros