Tutorial on Decision Tree Classifier: What is Decision Tree

What is a Decision Tree?

A decision tree is a hierarchical tree structure used to classify records into classes based on a series of questions (or rules) about their attributes. The attributes can be variables of any type (binary, nominal, ordinal, or quantitative), while the classes must be qualitative (categorical, binary, or ordinal). In short, given data consisting of attributes together with their classes, a decision tree produces a sequence of rules (or series of questions) that can be used to recognize the class.


Example

Let us start with an example. Throughout this tutorial, we will use the following 10 training records. The training data is supposed to be part of a transportation study on mode choice (Bus, Car, or Train) among commuters along a major route in a city, gathered through a questionnaire. The data have 4 attributes, which I selected for the sake of clarity. Gender is a binary attribute. Car ownership is a quantitative integer (and thus behaves like a nominal attribute). Travel cost per km is a quantitative attribute of ratio type, but here I convert it into an ordinal type (because quantitative data need to be split into qualitative data). Income level is also an ordinal attribute.

The first four columns are attributes; the last column is the class.

Gender	Car ownership	Travel cost ($)/km	Income level	Transportation mode
Male	0	Cheap	Low	Bus
Male	1	Cheap	Medium	Bus
Female	1	Cheap	Medium	Train
Female	0	Cheap	Low	Bus
Male	1	Cheap	Medium	Bus
Male	0	Standard	Medium	Train
Female	1	Standard	Medium	Train
Female	1	Expensive	High	Car
Male	2	Expensive	Medium	Car
Female	2	Expensive	High	Car
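The ordinal travel cost column in the table comes from splitting a quantitative cost per km into labelled ranges. A minimal sketch of that kind of discretization follows. The cut points of 0.5 and 1.0 ($/km) and the name `discretize_cost` are hypothetical, chosen only for illustration, because the tutorial does not state the thresholds actually used.

```python
from bisect import bisect_right

# Hypothetical cut points in $/km; the tutorial does not give the real ones.
CUT_POINTS = [0.5, 1.0]
# Ordinal labels, ordered from lowest to highest cost.
LABELS = ["Cheap", "Standard", "Expensive"]


def discretize_cost(cost_per_km: float) -> str:
    """Map a quantitative cost per km onto an ordinal label."""
    # bisect_right counts how many cut points are <= the value,
    # which is exactly the index of the matching label.
    return LABELS[bisect_right(CUT_POINTS, cost_per_km)]


print([discretize_cost(c) for c in (0.2, 0.7, 1.4)])
```

Keeping the labels in an ordered list preserves the ordinal nature of the attribute: Cheap < Standard < Expensive.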

Based on the above training data, we can induce the following decision tree:
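As an illustrative sketch, one tree that is consistent with all ten training rows can be written as nested Python dictionaries and walked one question at a time. This structure was derived by hand from the table and is an assumption, not necessarily the exact tree the author drew. The names `TREE` and `classify` and the record keys are introduced here for illustration.

```python
# Each internal node asks about one attribute; each leaf is a class label.
# Hand-derived from the ten training rows (an assumption, see the text above).
TREE = {
    "attribute": "travel_cost",
    "branches": {
        "Expensive": "Car",
        "Standard": "Train",
        "Cheap": {
            "attribute": "gender",
            "branches": {
                "Male": "Bus",
                "Female": {
                    "attribute": "car_ownership",
                    "branches": {0: "Bus", 1: "Train"},
                },
            },
        },
    },
}


def classify(record, node=TREE):
    """Follow the questions from the root until a leaf (class label) is reached."""
    while isinstance(node, dict):
        answer = record[node["attribute"]]
        # An attribute value never seen in training raises KeyError here.
        node = node["branches"][answer]
    return node


# Third training row: Female, 1 car, Cheap, Medium income.
record = {"gender": "Female", "car_ownership": 1,
          "travel_cost": "Cheap", "income_level": "Medium"}
print(classify(record))  # Train
```

The record carries an `income_level` value, but no node in this sketch ever asks about it.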

Notice that the attribute "income level" is not included in the decision tree because, based on the given data, the attribute "travel cost per km" produces a better classification than "income level". We will see later how the decision tree is generated. In the next section, I will discuss how to use a decision tree to predict the class of an unseen record.
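To make the comparison concrete, one common way to score a candidate split is information gain based on entropy. The tutorial has not yet said which measure it uses, so treating it as information gain is an assumption here. The sketch only checks that, on these ten rows, travel cost separates the classes more cleanly than income level. The names `ROWS`, `COLUMNS`, `entropy`, and `information_gain` are introduced for illustration.

```python
from collections import Counter
from math import log2

# The ten training rows: (gender, car ownership, travel cost, income level, mode)
ROWS = [
    ("Male", 0, "Cheap", "Low", "Bus"),
    ("Male", 1, "Cheap", "Medium", "Bus"),
    ("Female", 1, "Cheap", "Medium", "Train"),
    ("Female", 0, "Cheap", "Low", "Bus"),
    ("Male", 1, "Cheap", "Medium", "Bus"),
    ("Male", 0, "Standard", "Medium", "Train"),
    ("Female", 1, "Standard", "Medium", "Train"),
    ("Female", 1, "Expensive", "High", "Car"),
    ("Male", 2, "Expensive", "Medium", "Car"),
    ("Female", 2, "Expensive", "High", "Car"),
]
COLUMNS = {"gender": 0, "car_ownership": 1, "travel_cost": 2, "income_level": 3}


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())


def information_gain(rows, col):
    """Entropy of the classes minus the weighted entropy after splitting on col."""
    parent = entropy([r[-1] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[col], []).append(r[-1])
    children = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return parent - children


for name, col in COLUMNS.items():
    print(f"{name:14s} {information_gain(ROWS, col):.3f}")
```

Under this measure, travel cost has the largest gain (about 1.21 bits), compared with about 0.70 bits for income level, which is consistent with the tree splitting on travel cost first.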


This tutorial is copyrighted.

The preferred reference for this tutorial is:

Teknomo, Kardi. (2009) Tutorial on Decision Tree. http://people.revoledu.com/kardi/tutorial/DecisionTree/
