A decision tree is a tool that supports decision making. It represents a decision and its consequences in a tree-like graphical form. One of the best-known algorithms for building decision trees is ID3, developed by J. Ross Quinlan.
In the simplest case, the search takes the form of a branching tree in which every decision has just two possible outcomes, although a decision node may in general have more than two branches.
This tool is widely used in operations research to develop strategies for achieving goals, because it requires a systematic process to arrive at an analysis. It formalizes the brainstorming process in the form of a document. It is also widely used in machine learning to interpret data.
1. What is a Decision Tree?
A decision tree is a predictive model, used for classification or regression, that is represented in the form of a tree. The algorithm divides the data into successively smaller subsets while incrementally building the associated decision tree, step by step. The tree has two kinds of nodes: decision nodes, which branch further, and leaf nodes, which do not branch but hold the final output or result. Leaf nodes are also referred to as end nodes, while the initial decision node is referred to as the root node. Because it sits at the root, it corresponds to the best predictor.
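The node structure described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the names `Leaf`, `DecisionNode` and `predict`, as well as the small hand-built tree and its attribute values, are hypothetical and do not come from any particular library or data set.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class Leaf:
    """Leaf (end) node: does not branch, only holds the final output."""
    prediction: str


@dataclass
class DecisionNode:
    """Decision node: tests one attribute and branches on its value."""
    attribute: str
    branches: Dict[str, Union["DecisionNode", Leaf]] = field(default_factory=dict)


def predict(node, record):
    """Walk from the root down to a leaf, following the record's attribute values."""
    while isinstance(node, DecisionNode):
        node = node.branches[record[node.attribute]]
    return node.prediction


# Hypothetical hand-built tree: the splits are illustrative, not learned from data.
root = DecisionNode(
    attribute="outlook",
    branches={
        "rainy": Leaf("stay in"),
        "sunny": DecisionNode(
            attribute="windy",
            branches={"yes": Leaf("stay in"), "no": Leaf("play")},
        ),
    },
)

print(predict(root, {"outlook": "sunny", "windy": "no"}))
```

The root node is simply the first `DecisionNode` that is tested; a learning algorithm such as ID3 would choose its attribute from the data instead of having it written by hand.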
2. Example of a Decision Tree
Suppose there is a sample of 30 high school students, described by three attributes: gender, height (5 to 6 ft) and class (IX or X). Of these 30 students, 15 like playing cricket. How can we develop a model describing the students who like playing cricket?
To solve this problem, we need to separate the cricket-playing students using the most significant of the three input variables.
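One common way to judge how significant an input variable is, and the one used by ID3, is information gain based on entropy. The sketch below assumes that measure. Only the 15 versus 15 split at the root comes from the example; the article gives no per-attribute counts, so the two candidate splits are hypothetical extremes chosen to show the range of possible gains, and the function names are illustrative.

```python
import math


def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(parent_counts, child_counts_list):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    total = sum(parent_counts)
    weighted = sum(sum(child) / total * entropy(child) for child in child_counts_list)
    return entropy(parent_counts) - weighted


# From the example: 15 of the 30 students like cricket, 15 do not.
root_counts = [15, 15]
print("root entropy:", entropy(root_counts))

# Hypothetical extremes (NOT data from the example):
# a split that separates the two groups completely ...
perfect_split = [[15, 0], [0, 15]]
# ... and a split that leaves both subsets as mixed as the root.
useless_split = [[5, 5], [10, 10]]

print("gain, perfect split:", information_gain(root_counts, perfect_split))
print("gain, useless split:", information_gain(root_counts, useless_split))
```

With real counts for gender, height and class, the attribute with the highest gain would be chosen as the root split.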
3. Types of Decision Trees
Based on the type of the target variable, decision trees are of the following two types:
- Categorical Variable Decision Tree: used when the target variable is categorical. For example, deciding whether to say Yes or No in a task or a game.
- Continuous Variable Decision Tree: used when the target variable is continuous. For example, having age as the target value.
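In practice, the two types differ mainly in how the "purity" of a subset is measured when choosing a split. The sketch below assumes two common choices, Gini impurity for a categorical target and variance for a continuous target; the article does not say which measures it has in mind, and the function names and sample values are hypothetical.

```python
from collections import Counter
from statistics import pvariance


def gini_impurity(labels):
    """Impurity of a categorical target: 0.0 when all labels agree."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def variance_impurity(values):
    """Impurity of a continuous target: population variance of the values."""
    return pvariance(values)


# Hypothetical target values, for illustration only.
yes_no_target = ["Yes", "No", "Yes", "No"]   # categorical target
age_target = [20, 30, 40]                    # continuous target

print("Gini impurity:", gini_impurity(yes_no_target))
print("Variance:", variance_impurity(age_target))
```

In both cases a split is considered good when the resulting subsets have lower impurity than the data they were split from.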
4. Why are Decision Trees Regarded as Superior to Other Algorithms?
A decision tree is an instrument for analysing multiple variables. It can be used to predict, explain, classify and describe the possible outcomes of an event. It goes beyond a simple one-to-one cause-and-effect relationship. The strength of these models lies in the ease with which they handle different types of data and levels of measurement, and in their ability to find strong relationships between input and target values.
5. Advantages and Disadvantages of Decision Trees
A decision tree visually depicts all the decisions together with their relevant factors and consequences, which makes the analysis easier. Its advantages and disadvantages are as follows:
Advantages:
- Easy to understand and code
- Handles skewed variables easily, as it makes no assumptions about the variable distribution
- Unlike other algorithms, it explains non-linearity in an intuitive way
- Allows forward and backward calculation along the decision path
- Helps in collapsing a set of categorical values into the range of the selected target
Disadvantages:
- It is a high-variance classifier
- Prone to overfitting
6. Application
Decision trees have been widely used as an integrity-checking mechanism to validate the data supplied by providers. Many software tools can build a decision tree from data. R and Python users have a choice of packages that let them develop a tree in order to arrive at a sound decision.