Definition
- A decision tree is a flow-chart-like tree structure
- Internal node denotes a test on an attribute (feature)
- Branch represents an outcome of the test
- All records routed down a branch share the same outcome of the test (e.g., the same value of the tested attribute)
- Leaf node represents a class label or a class-label distribution (sketched in code below)
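As a minimal sketch of this structure (not from the original; it assumes categorical tests, as in ID3), an internal node can hold the tested attribute and one subtree per outcome, while a leaf holds a class label:

```python
from dataclasses import dataclass, field
from typing import Dict, Union

@dataclass
class Leaf:
    """Leaf node: holds the predicted class label."""
    label: str

@dataclass
class Node:
    """Internal node: tests one attribute; each dict key is one
    outcome of the test, and its value is the subtree for that branch."""
    attribute: str
    children: Dict[str, Union["Node", "Leaf"]] = field(default_factory=dict)
```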
Advantages
Decision trees are a common type of classifier with several practical advantages:
- Predictive accuracy: Can capture complex, non-linear patterns in the data.
- Speed: Faster to build than many other classifiers, and very quick to apply once built.
- Robustness: Can handle noise and missing values.
- Scalability: Some implementations scale to very large datasets.
- Interpretability: Very readable. To classify an instance, walk your way down the tree following the rules at each node (see the sketch after this list).
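To make "walking down the tree" concrete, here is a minimal sketch continuing the Node/Leaf structure above; the toy tree and its single "student" test are hypothetical:

```python
def classify(node, instance):
    """Walk from the root to a leaf: at each internal node, follow the
    branch matching the instance's value for the tested attribute."""
    while isinstance(node, Node):
        node = node.children[instance[node.attribute]]
    return node.label

# Hypothetical toy tree: a single test on "student".
tree = Node("student", {"yes": Leaf("yes"), "no": Leaf("no")})
print(classify(tree, {"student": "yes", "age": "<=30"}))  # -> "yes"
```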
Decision Tree Classification Task
Example
The figure gives a worked example of a decision tree (DT) model, a fundamental concept in machine learning. On the left is the training data table, which contains several features (age, income, student, credit_rating) and a target variable (buys_computer). The goal is to predict whether a person will buy a computer based on their characteristics.
On the right, the decision tree visually represents the classification model learned from the data. The tree starts with a root node (age?) and uses a series of internal nodes (questions) and branches (answers) to classify an instance. By following a path down the tree based on the values of the features for a new data point, a prediction is made at a leaf node (the colored squares at the end of the branches).
For example, to predict whether a person buys a computer, the tree first checks their age. If their age is <=30, it then checks whether they are a student. If they are a student, the tree predicts "yes" (they will buy a computer). This recursive partitioning of the data on the most informative feature is the core mechanism of decision tree algorithms such as Quinlan's ID3, referenced in the figure.
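ID3 selects the "most informative" feature at each node by information gain, the drop in entropy after a split. A minimal sketch of that criterion (the toy rows and labels below are invented for illustration, not the table in the figure):

```python
import math
from collections import Counter
from typing import Dict, List

def entropy(labels: List[str]) -> float:
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows: List[Dict[str, str]], labels: List[str], attribute: str) -> float:
    """Entropy reduction from splitting the rows on `attribute`.
    ID3 greedily picks the attribute with the highest gain at each node."""
    n = len(labels)
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [lab for row, lab in zip(rows, labels) if row[attribute] == value]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder

# Toy illustration: splitting on "student" separates the labels perfectly,
# so the gain equals the full entropy of the labels (about 0.918 bits).
rows = [{"student": "yes"}, {"student": "no"}, {"student": "yes"}]
labels = ["yes", "no", "yes"]
print(information_gain(rows, labels, "student"))
```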
Geometric interpretation
The diagram provides a geometric interpretation of how a decision tree classifies data by recursively partitioning the feature space. The scatter plot on the left shows data points in a 2D space defined by two features: income and age. Each point is a customer, with green circles and purple plus signs representing two different classes.

The decision tree on the right corresponds to the splits in this geometric space. The first split, based on income, creates a vertical decision boundary at 50K, dividing the data into two regions: income less than 50K and income greater than or equal to 50K. The tree's next split is based on age, creating a horizontal decision boundary at age 45, but only for the group with income greater than or equal to 50K.

This process continues, creating a series of axis-parallel decision boundaries that divide the feature space into rectangular regions. Each region corresponds to a leaf node in the decision tree and is assigned a class label, which is used to make a prediction for any new point (like the red dot with a '?') that falls within that region.
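The two boundaries described above amount to nested threshold tests. A minimal sketch of that partition (the class label assigned to each rectangular region is an assumption for illustration; the figure encodes them only by color):

```python
def predict(income: float, age: float) -> str:
    """Replay the diagram's axis-parallel splits as nested threshold tests.
    Region labels ("green"/"purple") are assumed for illustration."""
    if income < 50_000:           # vertical boundary at income = 50K
        return "green"            # assumed label for the low-income region
    if age < 45:                  # horizontal boundary, only for income >= 50K
        return "purple"           # assumed
    return "green"                # assumed

# A new point (like the red '?') is classified by whichever rectangle it falls in:
print(predict(income=60_000, age=30))  # -> "purple" under these assumed labels
```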