Adaptive Boosting From Scratch
To better understand how adaptive boosting works, I recently built a simple boosting model myself, which can be found here. In this post I'd like to go through what boosting is and what it can do for us.
Adaptive boosting uses many decision trees of depth one to split the data.
These shallow decision trees are called stumps because of their limited depth. A single stump splits the data in two; in classification, this means it tries to categorise data points based on that single split. Any data set that can't be separated by a single true-or-false question will therefore have points that a stump categorises wrongly.
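Concretely, a stump is one feature compared against one threshold. Here is a minimal numpy sketch of that idea (the function name and toy data are my own illustration, not code from the post); it also shows how a data set that isn't separable by one question caps a stump's accuracy:

```python
import numpy as np

# a stump is a single true-or-false question: "is feature j >= threshold t?"
def stump_predict(X, feature, threshold):
    return np.where(X[:, feature] >= threshold, 1, -1)

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([-1, -1, 1, 1])       # separable by one question: "is x >= 2?"
pred = stump_predict(X, feature=0, threshold=2.0)

y_hard = np.array([-1, 1, -1, 1])  # no single threshold separates this labelling
# best accuracy any single split can reach on y_hard (trying either polarity)
accs = [max((stump_predict(X, 0, t) == y_hard).mean(),
            (stump_predict(X, 0, t) != y_hard).mean())
        for t in X[:, 0]]
```

On the separable labels the stump is perfect; on the alternating labels no single split gets past 3 of 4 points right, which is exactly where boosting's reweighting comes in.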
The wrongly labelled data points are then given more weight. This extra weight encourages the next decision tree stump to focus on them, producing a new split.
This process continues across many decision tree stumps. Each stump on its own is a weak classifier, but combined they produce a strong model overall. The combination is done by weighted voting, where each classifier is assigned a weight corresponding to its accuracy.
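The whole loop can be sketched in numpy. The classifier weight α = ½·ln((1 − ε)/ε) and the exponential reweighting below follow the standard AdaBoost recipe; the exhaustive stump search and helper names are my own sketch, not necessarily the post's exact code:

```python
import numpy as np

def fit_stump(X, y, w):
    """Exhaustive search for the (feature, threshold, polarity) split
    with the lowest weighted error. Labels y are in {-1, +1}."""
    best, best_err = None, np.inf
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for p in (1, -1):
                pred = np.where(p * (X[:, j] - t) >= 0, 1, -1)
                err = w[pred != y].sum()
                if err < best_err:
                    best, best_err = (j, t, p), err
    return best, best_err

def predict_stump(stump, X):
    j, t, p = stump
    return np.where(p * (X[:, j] - t) >= 0, 1, -1)

def adaboost(X, y, n_rounds=10):
    n = len(y)
    w = np.full(n, 1.0 / n)          # start with uniform sample weights
    ensemble = []
    for _ in range(n_rounds):
        stump, err = fit_stump(X, y, w)
        err = max(err, 1e-10)        # guard against division by zero
        alpha = 0.5 * np.log((1 - err) / err)  # the stump's voting weight
        pred = predict_stump(stump, X)
        w *= np.exp(-alpha * y * pred)  # upweight the misclassified points
        w /= w.sum()                    # renormalise to a distribution
        ensemble.append((alpha, stump))
    return ensemble

def predict(ensemble, X):
    # weighted vote: each stump contributes its alpha-scaled +/-1 prediction
    votes = sum(a * predict_stump(s, X) for a, s in ensemble)
    return np.sign(votes)
```

Note that a more accurate stump (smaller ε) gets a larger α, so it counts for more in the final vote, which matches the "weight corresponds to its accuracy" idea above.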
This process rests on a fairly simple principle: with each split of the data, the wrongly categorised points become more and more heavily weighted, encouraging the model to produce specific rules for handling the most difficult data points.
This was a fairly simple model to make from scratch, though I deviated from the standard implementation quite significantly: I had all the weak classifiers produce their predictions, then trained a ridge classifier on the resulting prediction table together with the original data the boosted model was trained on. This gave the model greater accuracy, which is interesting because it implies the boosting process produces extra useful information that can be appended to tabular data for use by other methods.
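That variant amounts to stacking: augment the original feature matrix with one column per weak classifier's predictions, then fit a linear model on the widened table. A rough sketch using a closed-form ridge fit in numpy — the synthetic data and the stand-in stump predictions here are purely hypothetical, just to show the shape of the idea:

```python
import numpy as np

def ridge_fit(X, y, lam=1.0):
    # closed-form ridge regression, used as a classifier via the sign of the score
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def ridge_predict(X, coef):
    return np.sign(X @ coef)

# hypothetical setup: X_train stands in for the original tabular data, and
# stump_preds for the weak classifiers' +/-1 outputs (one column per stump)
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 3))
y_train = np.sign(X_train[:, 0] + 0.5 * X_train[:, 1])
stump_preds = np.sign(X_train[:, :2])   # stand-in prediction table

# append the prediction table to the original features, then fit the ridge model
X_aug = np.hstack([X_train, stump_preds])
coef = ridge_fit(X_aug, y_train)
acc = (ridge_predict(X_aug, coef) == y_train).mean()
```

The design choice worth noting is that the ridge classifier sees both the raw features and the stumps' outputs, so it can learn how much to trust each stump in context rather than relying on fixed voting weights.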