Decision Trees

Ahsan Ijaz

Loan risk assessment

Intelligent loan application

Decision tree

Step 1: Start with an empty tree

Step 2: Split on a feature

Decide class on majority vote
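A leaf predicts whichever class is most common among the training examples that reach it. The snippet below is a minimal sketch of that vote; the function name `majority_class` and the example labels are illustrative assumptions, not taken from the slides.

```python
from collections import Counter

def majority_class(labels):
    """Return the most common label; on a tie, the label seen first wins."""
    return Counter(labels).most_common(1)[0][0]

# Hypothetical labels for the loans that end up in one leaf.
leaf_labels = ["safe", "safe", "risky", "safe", "risky"]
prediction = majority_class(leaf_labels)
print(prediction)
```

Here three of the five loans are labelled safe, so the leaf predicts safe.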

Which feature to split on?

Data example

Term vs credit?

Quality metric: Classification

  • Error measures fraction of mistakes

\[ \color{blue}{\text{Error} = \frac{\text{number of incorrect predictions}}{\text{number of examples}}} \]

  • Best possible value: 0.0
  • Worst possible value: 1.0
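The error formula can be sketched directly in code. The function name `classification_error` and the sample labels are assumptions made for illustration only.

```python
def classification_error(y_true, y_pred):
    """Fraction of predictions that differ from the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("need at least one example")
    mistakes = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return mistakes / len(y_true)

# Hypothetical true labels and predictions for four loans.
y_true = ["safe", "risky", "safe", "safe"]
y_pred = ["safe", "safe", "safe", "risky"]
error = classification_error(y_true, y_pred)
print(error)  # 2 mistakes out of 4 examples -> 0.5
```

A perfect classifier gives 0.0 and one that gets every example wrong gives 1.0, matching the best and worst values listed above.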

Calculation of error

Splitting on credit

Splitting on term
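To compare candidate splits, one option is to compute the classification error that results when each branch predicts its majority class. The sketch below assumes a tiny made-up loan table (the rows, the feature values, and the helper name `split_error` are all hypothetical and are not the data from the slides).

```python
from collections import Counter, defaultdict

def split_error(rows, feature, target="label"):
    """Classification error if we split on `feature` and let each
    branch predict the majority class of the rows it receives."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[feature]].append(row[target])
    mistakes = 0
    for labels in groups.values():
        majority_count = Counter(labels).most_common(1)[0][1]
        mistakes += len(labels) - majority_count
    return mistakes / len(rows)

# Hypothetical data, invented for this example.
loans = [
    {"credit": "excellent", "term": "3yr", "label": "safe"},
    {"credit": "excellent", "term": "5yr", "label": "safe"},
    {"credit": "fair",      "term": "3yr", "label": "safe"},
    {"credit": "fair",      "term": "5yr", "label": "risky"},
    {"credit": "poor",      "term": "3yr", "label": "risky"},
    {"credit": "poor",      "term": "5yr", "label": "risky"},
]

credit_error = split_error(loans, "credit")  # 1 mistake out of 6
term_error = split_error(loans, "term")      # 2 mistakes out of 6
best_feature = min(["credit", "term"], key=lambda f: split_error(loans, f))
print(credit_error, term_error, best_feature)
```

On this invented table, splitting on credit leaves one misclassified loan and splitting on term leaves two, so credit would be chosen.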

Greedy decision tree algorithm

  • Step 1: Start with an empty tree
  • Step 2: Select a feature to split data
  • For each split of the tree:
    • Step 3: If there is nothing more to split on, make predictions
    • Step 4: Otherwise, go to Step 2 and continue on this split
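The steps above can be sketched as a short recursive function. This is one possible reading, not a definitive implementation: it assumes categorical features, picks the feature with the lowest classification error at each node, and stores the tree as nested dictionaries. The names `build_tree` and `predict`, the dictionary layout, and the loan rows are all hypothetical.

```python
from collections import Counter

def majority_class(labels):
    return Counter(labels).most_common(1)[0][0]

def split_error(rows, feature, target):
    """Error when each branch of the split predicts its majority class."""
    groups = {}
    for row in rows:
        groups.setdefault(row[feature], []).append(row[target])
    mistakes = sum(len(g) - Counter(g).most_common(1)[0][1]
                   for g in groups.values())
    return mistakes / len(rows)

def build_tree(rows, features, target="label"):
    labels = [row[target] for row in rows]
    # Step 3: nothing more to split on, so make a prediction (leaf).
    if len(set(labels)) == 1 or not features:
        return {"leaf": True, "prediction": majority_class(labels)}
    # Step 2: greedily select the feature with the lowest error.
    best = min(features, key=lambda f: split_error(rows, f, target))
    remaining = [f for f in features if f != best]
    children = {}
    for value in sorted({row[best] for row in rows}):
        subset = [row for row in rows if row[best] == value]
        # Step 4: continue the same procedure on this split.
        children[value] = build_tree(subset, remaining, target)
    return {"leaf": False, "feature": best, "children": children}

def predict(tree, row):
    # Raises KeyError for a feature value never seen during training.
    while not tree["leaf"]:
        tree = tree["children"][row[tree["feature"]]]
    return tree["prediction"]

# Hypothetical data, invented for this example.
loans = [
    {"credit": "excellent", "term": "3yr", "label": "safe"},
    {"credit": "excellent", "term": "5yr", "label": "safe"},
    {"credit": "fair",      "term": "3yr", "label": "safe"},
    {"credit": "fair",      "term": "5yr", "label": "risky"},
    {"credit": "poor",      "term": "3yr", "label": "risky"},
    {"credit": "poor",      "term": "5yr", "label": "risky"},
]
tree = build_tree(loans, ["credit", "term"])
print(tree["feature"])
print(predict(tree, {"credit": "fair", "term": "5yr"}))
```

The approach is greedy because each node commits to the locally best feature and never revisits that choice, which keeps the procedure simple but does not guarantee the smallest possible tree.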

Level 1 of learned tree

Level 2 of learned tree

Complete tree

Stopping condition 1: All data agree on y

Stopping condition 2: All features have been used
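The two stopping conditions can be expressed as a single check performed at every node before attempting a split. The function name `should_stop` and its arguments are assumptions made for illustration.

```python
def should_stop(labels, remaining_features):
    """Return True when a node should become a leaf."""
    # Stopping condition 1: all data at this node agree on y.
    all_agree = len(set(labels)) <= 1
    # Stopping condition 2: every feature has already been used.
    no_features_left = len(remaining_features) == 0
    return all_agree or no_features_left

print(should_stop(["safe", "safe"], ["term"]))   # condition 1 holds
print(should_stop(["safe", "risky"], []))        # condition 2 holds
print(should_stop(["safe", "risky"], ["term"]))  # neither holds, keep splitting
```

When the second condition fires while the labels still disagree, the leaf falls back on the majority class of the examples it holds.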