A user has a data frame containing a factor column, but after subsetting the data frame the column still keeps all the levels of the original factor, including those that no longer occur. How can this be fixed?

Asked by sahin khan in Data Science on Dec 13, 2019
Answered by sahin khan

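# stringsAsFactors = TRUE is needed on R >= 4.0.0, where strings are no longer converted to factors by default

df <- data.frame(letters = letters[1:5], numbers = 1:5, stringsAsFactors = TRUE)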

levels(df$letters)

## [1] "a" "b" "c" "d" "e"

subdf <- subset(df, numbers <= 3)

subdf

## letters numbers

## 1 a 1

## 2 b 2

## 3 c 3

# all levels are still there!

levels(subdf$letters)

## [1] "a" "b" "c" "d" "e"

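Subsetting keeps every level of the original factor, even the levels that no longer appear in the data. The issue can be solved by calling factor() on the column again after subsetting into the new data frame, which rebuilds the factor with only the levels still in use: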

> subdf$letters

[1] a b c

Levels: a b c d e

> subdf$letters <- factor(subdf$letters)

> subdf$letters

[1] a b c

Levels: a b c
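
As an alternative to re-applying factor() column by column, base R also provides droplevels(), which removes unused levels from a single factor or from every factor column of a data frame. The following is a minimal sketch using the same example data; the name subdf_clean is only an illustrative choice and does not come from the original answer.

```r
# Rebuild the example data; stringsAsFactors = TRUE is set explicitly because
# R >= 4.0.0 no longer converts strings to factors by default.
df <- data.frame(letters = letters[1:5], numbers = 1:5, stringsAsFactors = TRUE)
subdf <- subset(df, numbers <= 3)

# droplevels() on a data frame drops unused levels from every factor column.
subdf_clean <- droplevels(subdf)
levels(subdf_clean$letters)
## [1] "a" "b" "c"

# It can also be applied to a single factor.
levels(droplevels(subdf$letters))
## [1] "a" "b" "c"
```

This can be convenient when a data frame has several factor columns, since a single call cleans all of them and leaves non-factor columns unchanged.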


