Decision Tree Example: Function & Implementation [Step-by-step]

Updated on 24 November, 2022


Introduction

Decision trees are among the most popular algorithms for both regression and classification tasks. They have a flowchart-like structure and fall under the category of supervised learning algorithms. Because a decision tree can be drawn as a flowchart, it mirrors the way people reason through a sequence of questions, which is why decision trees are easy to understand and interpret.

What is a Decision Tree?

A decision tree is a tree-structured model. It has three types of nodes:

  • Root Nodes
  • Internal Nodes
  • Leaf Nodes


The root node is the topmost node; it represents the entire sample, which is then split into further nodes. Internal nodes represent a test on an attribute, while the branches represent the outcomes of that test. Finally, the leaf nodes denote the class label, which is the decision reached after all the tests along the path have been evaluated.


How do Decision Trees work?

A decision tree classifies a data point by sorting it down the tree from the root node to a leaf node. This is called the top-down approach. Once a data point is fed into the tree, it follows a single path: at each node it answers a Yes/No question about one attribute and moves along the matching branch until it reaches a leaf node.

Each internal node in the decision tree represents a test on an attribute, and each descent (branch) to a new node corresponds to one of the possible answers to that test. In this way, after a series of such tests, the decision tree predicts a value in a regression task or assigns a class in a classification task.
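The root-to-leaf walk described above can be sketched in a few lines of Python. The tree below is written by hand purely for illustration: its feature names and thresholds are hypothetical and are not learned from any data.

```python
# A hand-written toy tree (hypothetical thresholds, not learned from data).
# Internal nodes hold a test on one attribute; leaf nodes hold a class label.
toy_tree = {
    "feature": "petal_length", "threshold": 2.5,
    "yes": {"label": "setosa"},                      # leaf node
    "no": {
        "feature": "petal_width", "threshold": 1.7,
        "yes": {"label": "versicolor"},              # leaf node
        "no": {"label": "virginica"},                # leaf node
    },
}


def predict_one(node, sample):
    """Walk from the root to a leaf, answering one yes/no question per node."""
    while "label" not in node:
        answer = sample[node["feature"]] <= node["threshold"]
        node = node["yes"] if answer else node["no"]
    return node["label"]


flower = {"petal_length": 4.8, "petal_width": 1.5}
print(predict_one(toy_tree, flower))  # prints "versicolor"
```

Only the nodes on one path are visited, so the cost of a prediction grows with the depth of the tree rather than with its total number of nodes.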

Decision Tree Implementation

Now that we have covered the basics of a decision tree, let us go through one implementation in Python.

Problem Analysis 

In the following example we are going to use the famous Iris flower dataset. Originally published by Ronald Fisher in 1936 and available from the UCI Machine Learning Repository (Link: https://archive.ics.uci.edu/ml/datasets/Iris), this small dataset is widely used for testing machine learning algorithms and visualizations.

The dataset has 150 rows and 5 columns: 4 columns are the attributes, or features, and the last column is the species of Iris flower. Iris is a genus of flowering plants. The four attributes, measured in cm, are:

  • Sepal Length
  • Sepal Width
  • Petal Length 
  • Petal Width

These four features describe the size and shape of the flower and are used to classify its type. The fifth and last column holds the Iris flower class, which is one of Iris Setosa, Iris Versicolor and Iris Virginica.

For our problem, we have to build a machine learning model that uses the decision tree algorithm to learn from the features and classify each flower into its Iris class.

Let us go through its implementation in Python, step by step:

Step 1: Importing the libraries

The first step in building any machine learning model in Python is to import the necessary libraries, such as NumPy, Pandas and Matplotlib. The tree module is imported from the sklearn library to visualise the decision tree model at the end.
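The imports for this step might look like the following. The aliases np, pd and plt are the usual community conventions and are an assumption here, since the article's original code is not shown.

```python
import numpy as np                 # numerical arrays
import pandas as pd                # tabular data handling
import matplotlib.pyplot as plt    # plotting
from sklearn import tree           # decision tree utilities, used later for visualisation
```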

Step 2: Importing the dataset

Once we have obtained the Iris dataset, we load the .csv file into a Pandas DataFrame, from which we can easily access the columns and rows of the table. The first four columns of the dataframe are the independent variables, or features, which the decision tree classifier learns from; they are stored in the variable X.

The dependent variable, which is the Iris flower class with its 3 species, is stored in the variable y. The dataset is inspected by printing the first 5 rows.
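A minimal sketch of this step follows. It assumes scikit-learn's bundled copy of the Iris data in place of the article's .csv file, whose path is not given; the "species" column name is also an assumption made for illustration.

```python
from sklearn.datasets import load_iris

# Assumption: use scikit-learn's bundled copy instead of the article's .csv file.
iris = load_iris(as_frame=True)
dataset = iris.frame.copy()

# Replace the integer target codes with species names in a "species" column
# so that the last column holds the flower class, as in the article.
dataset["species"] = iris.target_names[iris.target.to_numpy()]
dataset = dataset.drop(columns="target")

X = dataset.iloc[:, :4].values   # first four columns: the features
y = dataset.iloc[:, 4].values    # last column: the Iris flower class

print(dataset.head())            # first 5 rows
```

With a local file, pd.read_csv on that file would take the place of the load_iris call, and the same iloc slicing would apply as long as the class is the last column.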


Step 3: Splitting the dataset into the Training set and Test set

After reading the dataset, we split it into a training set, on which the classifier is trained, and a test set, on which the trained model is evaluated. The predictions on the test set are compared with the true labels to measure the accuracy of the trained model.

Here, we have used a test size of 0.25, which means that 25% of the dataset is randomly selected as the test set and the remaining 75% forms the training set. Hence, out of 150 data points, 38 random data points are held out as the test set and the remaining 112 samples are used for training.
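The split can be sketched with scikit-learn's train_test_split. The data is loaded from scikit-learn's bundled copy so that the snippet runs on its own, and random_state=0 is an arbitrary assumption that only makes the split repeatable.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Assumption: bundled Iris data stands in for the article's .csv file.
X, y = load_iris(return_X_y=True)

# test_size=0.25 holds out a quarter of the rows for testing.
# random_state=0 is an arbitrary choice to make the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

print(X_train.shape, X_test.shape)  # prints (112, 4) (38, 4)
```

A quarter of 150 is 37.5, and scikit-learn rounds the test set up, which is where the 38 and 112 come from.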

Step 4: Training the Decision Tree Classification model on the Training Set

Once the dataset has been split and is ready for training, the DecisionTreeClassifier class is imported from the sklearn library and fitted on the training variables (X_train and y_train) to build the model. During training, the classifier grows the tree greedily from the root: at each node it picks the feature and threshold that give the purest child nodes (measured by default with the Gini index), and it repeats this recursively until the leaves are pure or a stopping criterion is met.
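A minimal training sketch is shown below. It repeats the loading and splitting so that it runs on its own; the bundled dataset and random_state=0 are assumptions, and criterion="gini" simply spells out scikit-learn's default.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: bundled Iris data and an arbitrary, repeatable split.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# criterion="gini" is the default impurity measure; random_state fixes tie-breaking.
classifier = DecisionTreeClassifier(criterion="gini", random_state=0)
classifier.fit(X_train, y_train)

# Inspect the size of the tree that was grown.
print(classifier.get_depth(), classifier.get_n_leaves())
```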

Step 5: Predicting the Test Set Results

As we have our model ready, shouldn’t we check its accuracy on the test set? In this step the model built with the decision tree algorithm predicts the classes for the test set that was split off earlier. These predictions are stored in a variable, “y_pred”.
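Prediction is a single call on the fitted classifier. The sketch below rebuilds the model first so that it is runnable on its own, under the same assumptions of bundled data and random_state=0.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: bundled Iris data and an arbitrary, repeatable split.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
classifier = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# One predicted class per test row.
y_pred = classifier.predict(X_test)
print(y_pred)
```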

Step 6: Comparing the Real Values with Predicted Values

This is another simple step, where we build a dataframe with two columns: the real values of the test set on one side and the predicted values on the other. This lets us compare the model's predictions with the actual classes row by row.
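The comparison table could be built as follows. The column names are illustrative assumptions, and the model is rebuilt here so that the snippet is self-contained.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: bundled Iris data and an arbitrary, repeatable split.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
classifier = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
y_pred = classifier.predict(X_test)

# Real and predicted classes side by side, one row per test sample.
comparison = pd.DataFrame({"Real Values": y_test, "Predicted Values": y_pred})
print(comparison.head())

# Rows where the two columns disagree are the misclassified samples.
mismatches = comparison[comparison["Real Values"] != comparison["Predicted Values"]]
print(len(mismatches))
```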

Step 7: Confusion Matrix and Accuracy

Now that we have both the real and predicted values of the test set, let us build a confusion matrix and calculate the accuracy of our model using library functions within sklearn. The accuracy score is calculated by passing in both the real and predicted values of the test set. The model built using the above steps gives us an accuracy of 92.1%, reported as 0.92105.

The confusion matrix is a table that shows the correct and incorrect predictions on a classification problem. Put simply, the values on the diagonal are correct predictions and the values off the diagonal are incorrect predictions.


Of the 38 test set data points, 35 are predicted correctly and 3 incorrectly, which gives an accuracy of 35/38, or about 92%. The accuracy may be improved by tuning the hyperparameters, which are passed as arguments to the classifier before the model is trained.
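The two metrics can be computed with confusion_matrix and accuracy_score from sklearn.metrics, as sketched below. The bundled dataset and random_state=0 are assumptions, so the exact numbers may differ from the 35 out of 38 reported in the article, which depend on the article's own random split.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: bundled Iris data and an arbitrary, repeatable split.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
classifier = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
y_pred = classifier.predict(X_test)

# Rows are the real classes, columns are the predicted classes.
cm = confusion_matrix(y_test, y_pred, labels=[0, 1, 2])
acc = accuracy_score(y_test, y_pred)

# The diagonal counts the correct predictions, so accuracy = trace / total.
print(cm)
print(acc, cm.trace() / cm.sum())

# The article's own figures: 35 correct out of 38.
print(round(35 / 38, 5))  # prints 0.92105
```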

Step 8: Visualizing the Decision Tree Classifier

Finally, in the last step we visualize the decision tree that was built. At the root node, the number of “samples” is 112, which matches the size of the training set from the earlier split. The Gini index is calculated at each node of the tree, and the number of training samples from each of the 3 classes at that node is shown in the “value” field.
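As a rough sketch, the fitted tree can be printed as text with export_text, and the root node's Gini index can be checked by hand using 1 minus the sum of squared class proportions. The bundled dataset and random_state=0 are assumptions, so the split thresholds may differ from the article's figure.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Assumption: bundled Iris data and an arbitrary, repeatable split.
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.25, random_state=0
)
classifier = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Text rendering of the tree: one line per test or leaf.
print(export_text(classifier, feature_names=list(iris.feature_names)))

# The root node (index 0) sees every training sample.
root_samples = classifier.tree_.n_node_samples[0]

# Gini index at the root, computed by hand from the class counts.
counts = np.bincount(y_train, minlength=3)
root_gini = 1.0 - np.sum((counts / counts.sum()) ** 2)

print(root_samples, counts, root_gini, classifier.tree_.impurity[0])
```

For a graphical flowchart like the one the article describes, sklearn.tree.plot_tree(classifier, filled=True) draws the same tree with Matplotlib.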

Conclusion

In this way, we have covered the concept of the decision tree algorithm and built a simple classifier that solves a classification problem with it.


Frequently Asked Questions (FAQs)

1. What are the cons of using decision trees?

While decision trees help in classifying data, they also have drawbacks. Decision trees often overfit the training data, which makes their predictions on new data less accurate. For large datasets, a single decision tree is not recommended because the tree can grow very complex. Also, decision trees are unstable: a small change in the dataset can change the structure of the tree considerably.

2. How does a random forest algorithm work?

A random forest is essentially a collection of diverse decision trees, just as a forest is made up of many trees. Its output is obtained by combining the predictions of the individual trees, which makes it an ensemble method and reduces the likelihood of overfitting. Each tree is trained on a random sample of the training observations, and at each node split only a randomly chosen subset of the attributes is considered.
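A minimal sketch of a random forest on the same Iris data, using scikit-learn's RandomForestClassifier, is given below. The number of trees, max_features="sqrt" and random_state=0 are illustrative assumptions rather than recommended settings.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Assumption: bundled Iris data and an arbitrary, repeatable split.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Each of the 100 trees is fitted on a bootstrap sample of the training rows,
# and each split considers only a random subset of the features.
forest = RandomForestClassifier(
    n_estimators=100, max_features="sqrt", random_state=0
)
forest.fit(X_train, y_train)

forest_accuracy = forest.score(X_test, y_test)
print(len(forest.estimators_), forest_accuracy)
```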

3. How is a decision table different from a decision tree?

A decision table may be produced from a decision tree, but not the other way around. A decision tree is made up of nodes and branches, whereas a decision table is made up of rows and columns. In decision tables, more than one "or" condition can be inserted; in decision trees, this is not the case. Decision tables are useful only when a few properties are involved; decision trees, on the other hand, can be used effectively with a large number of properties and sophisticated logic.