How to calculate the error rate for a decision tree with R?

Asked by GayatriJaiteley in Data Science on Nov 4, 2019
Answered by Gayatri Jaiteley

To calculate the error rate of a decision tree in R, assuming you mean the error rate computed on the sample used to fit the model, we can use printcp() from the rpart package.

> library(rpart)
> fit <- rpart(Kyphosis ~ Age + Number + Start, data=kyphosis)
> printcp(fit)

Classification tree:
rpart(formula = Kyphosis ~ Age + Number + Start, data = kyphosis)

Variables actually used in tree construction:
[1] Age   Start

Root node error: 17/81 = 0.20988

n= 81

        CP nsplit rel error  xerror    xstd
1 0.176471      0   1.00000 1.00000 0.21559
2 0.019608      1   0.82353 0.82353 0.20018
3 0.010000      4   0.76471 0.82353 0.20018

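The root node error is used to compute two measures of predictive performance from the values in the rel error and xerror columns, for a given value of the complexity parameter (CP, first column). For the final tree (row 3):

0.76471 x 0.20988 = 0.1605 is the misclassification error rate on the training sample (the resubstitution error rate), i.e. 13 of 81 cases.

0.82353 x 0.20988 = 0.1728 is the cross-validated error rate (rpart uses 10-fold cross-validation by default; see xval in rpart.control()).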
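The same scaling can be done programmatically instead of by hand. The following is a minimal sketch, assuming the fitted rpart object exposes the CP table as fit$cptable and the per-node misclassification counts as fit$frame$dev (which holds for classification trees); the names root_err, abs_err, resub_direct and resub_from_cp are made up for illustration.

```r
library(rpart)

# Same model as in the answer above
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)

# Root node error: for a classification tree, frame$dev holds the number of
# misclassified cases at each node, and row 1 of the frame is the root.
root_err <- fit$frame$dev[1] / fit$frame$n[1]

# Scale the relative errors in the CP table back to absolute error rates
cp <- fit$cptable
abs_err <- data.frame(
  CP          = cp[, "CP"],
  nsplit      = cp[, "nsplit"],
  train_error = cp[, "rel error"] * root_err,
  cv_error    = cp[, "xerror"] * root_err
)
print(abs_err)

# Cross-check: training error of the final tree, computed from predictions
pred <- predict(fit, type = "class")
resub_direct  <- mean(pred != kyphosis$Kyphosis)
resub_from_cp <- unname(tail(abs_err$train_error, 1))
```

The train_error column is deterministic, but cv_error depends on the random cross-validation folds, so it can differ slightly between runs unless set.seed() is called before fitting.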

The error rate on the training sample is also roughly in agreement with the misclassification error rate reported by the tree package:

> library(tree)
> summary(tree(Kyphosis ~ Age + Number + Start, data=kyphosis))

Classification tree:
tree(formula = Kyphosis ~ Age + Number + Start, data = kyphosis)
Number of terminal nodes:  10
Residual mean deviance:  0.5809 = 41.24 / 71
Misclassification error rate: 0.1235 = 10 / 81

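Here, the misclassification error rate is computed on the training sample. The tree is also larger (10 terminal nodes, against 5 for the rpart fit, which has 4 splits), which is consistent with its lower training error.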
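To see where that training-sample figure comes from, the predictions can be tabulated against the observed classes. This is only a sketch, assuming the tree package is installed; rpart is loaded here solely because it provides the kyphosis data set, and the names tfit, conf and train_err are illustrative.

```r
library(rpart)  # only needed for the kyphosis data set
library(tree)

tfit <- tree(Kyphosis ~ Age + Number + Start, data = kyphosis)

# Predicted class for each training case
pred <- predict(tfit, type = "class")

# Confusion matrix: rows are observed classes, columns are predicted classes
conf <- table(observed = kyphosis$Kyphosis, predicted = pred)
print(conf)

# Misclassification rate = share of cases off the diagonal
train_err <- 1 - sum(diag(conf)) / sum(conf)
print(train_err)
```

Because the same data are used for fitting and for prediction, this is an optimistic estimate of the error to expect on new data.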

