I am currently studying Andrew Ng's 'Machine Learning Specialization' on Coursera, and the most recent concept I have learned about is decision tree modeling. I expected this to be a whole lot more complicated than it is, and I believe it is something everyone could definitely benefit from understanding.
After all, a decision tree is close to how our brains make decisions in real life. Just think about it: suppose you had the choice between going to a cafe to study or studying at home. You would weigh up a few different features and decide which option is best for you. Those features could include how much money you have, the time of day, your energy level, the distance to the cafe, and the type of content you are trying to study. After weighing all of this up, you either decide to stay at home or go to the cafe. (There is a third option, which involves running the model again and again because you can't seem to make a decision.)
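That cafe-vs-home decision can be sketched as a few nested if/else checks, which is really all a decision tree is at prediction time. The features and thresholds below are invented purely for illustration:

```python
# A hand-written "decision tree" for the cafe-vs-home choice.
# All thresholds here are made up for the sake of the example.
def study_location(money: float, energy: float, distance_km: float) -> str:
    """Return 'cafe' or 'home' by walking a few nested decisions."""
    if money < 5:            # can't afford a coffee
        return "home"
    if energy < 3:           # too tired to travel
        return "home"
    if distance_km > 2:      # cafe is too far away
        return "home"
    return "cafe"

print(study_location(money=10, energy=7, distance_km=1))  # cafe
print(study_location(money=2, energy=7, distance_km=1))   # home
```

Each `if` is one decision node, and each `return` is a leaf making the final call.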
Anyway, enough waffle. Here is how I have understood a 'decision tree'. (Thanks again to Andrew Ng; I really recommend his course on Coursera.)
This next image was taken directly from the course and gives us a great overview of the data we have to work with.
You can see that the Y values, or labels, can be either 0 or 1, which means this is a binary classification task. You can also see that in this example each feature takes only two values. This relatively simple dataset is used so the model is easier to understand first; the more values each feature can take, the more complicated your tree is going to get.
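In code, data like this might look something like the snippet below. The rows themselves are invented; they only mirror the shape of the course's table: three binary features and a binary label (1 = cat, 0 = not cat):

```python
# Made-up rows in the same shape as the course's dataset:
# three binary features and a 0/1 label.
dataset = [
    # (ear_shape,  face_shape,   whiskers,  is_cat)
    ("pointy", "round",     "present", 1),
    ("floppy", "not round", "absent",  0),
    ("pointy", "round",     "absent",  1),
    ("floppy", "round",     "present", 0),
]

# Every feature takes exactly two values, so each split in the
# tree is a simple yes/no question about one feature.
labels = [row[3] for row in dataset]
print(labels)  # [1, 0, 1, 0]
```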
Here you can see that we have a new test example with pointy ears, a round face, and whiskers present. Using this tree, the algorithm will 'predict' that the new test example is a cat. This is only one specific tree we could train; there are many other possibilities that could be created.
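Tracing that prediction in code makes the walk down the tree concrete. The structure below is a sketch of one possible tree for this example (root asks about ear shape, the branches check face shape or whiskers); it is not necessarily the exact tree from the course:

```python
# One possible decision tree for the cat example, written as plain branches.
def predict(ear_shape: str, face_shape: str, whiskers: str) -> str:
    if ear_shape == "pointy":           # root node (also a decision node)
        if face_shape == "round":       # decision node
            return "cat"                # leaf node
        return "not cat"                # leaf node
    else:
        if whiskers == "present":       # decision node
            return "cat"                # leaf node
        return "not cat"                # leaf node

# The new test example: pointy ears, round face, whiskers present.
print(predict("pointy", "round", "present"))  # cat
```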
Here is what a decision tree COULD look like. Notice that the root node is the starting point and is also a decision node. The two nodes on the second layer are decision nodes, and the nodes on the third layer are referred to as leaf nodes, since they make predictions.
There could be many more layers in between, which can get confusing, but it helps to just remember this:
The starting point is the root node. (Can also be a decision node)
A node where a decision is being made is a decision node.
A node where a prediction is being made is called a leaf node.
It should be a lot easier.
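The three node types above can also be expressed as a tiny data structure, which is a hedged sketch of how a tree might be stored in code (the `Node` class and its fields are my own invention, not from the course):

```python
# A minimal node type: a node with children asks a question (decision node),
# a node without children holds a prediction (leaf node).
from dataclasses import dataclass, field

@dataclass
class Node:
    question: str = ""                  # set on decision nodes
    prediction: str = ""                # set on leaf nodes
    children: dict = field(default_factory=dict)

    def kind(self) -> str:
        return "decision" if self.children else "leaf"

leaf_cat = Node(prediction="cat")
leaf_not = Node(prediction="not cat")
root = Node(question="ear shape?",
            children={"pointy": leaf_cat, "floppy": leaf_not})

print(root.kind())      # decision  (the root is also a decision node)
print(leaf_cat.kind())  # leaf
```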
Here are a few other possible decision trees.
And the job of the decision tree algorithm is to pick, from all these possibilities, the tree whose predictions have the highest accuracy. There is a lot more math involved in this, which we can cover at a later point.