Explain gather() and spread() in R along with an example.
The gather() function from the tidyr package collapses multiple columns into key-value pairs, converting wide data into long format. The data frame created below is considered wide because the time variable (represented as quarters) is structured so that each quarter is its own column.
Let us create fake data to clean with tidyr functions.
library(tidyr)  # provides gather(), spread() and the %>% pipe
comp <- c(1,1,1,2,2,2,3,3,3)
yr <- c(1998,1999,2000,1998,1999,2000,1998,1999,2000)
# Random revenue figures for each quarter
q1 <- runif(9, min=0, max=100)
q2 <- runif(9, min=0, max=100)
q3 <- runif(9, min=0, max=100)
q4 <- runif(9, min=0, max=100)
df <- data.frame(comp=comp,year=yr,Qtr1 = q1,Qtr2 = q2,Qtr3 = q3,Qtr4 = q4)
df
Now let us gather the quarter columns into Quarter/Revenue pairs using the pipe operator.
# Using Pipe Operator
head(df %>% gather(Quarter,Revenue,Qtr1:Qtr4))
Calling the function directly, without the pipe, gives the same result.
# With just the function
head(gather(df,Quarter,Revenue,Qtr1:Qtr4))
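Because the data above is random, the reshaping is easier to follow on a tiny fixed data frame. The sketch below uses a hypothetical `wide` data frame (not part of the original example) and assumes the tidyr package is installed. It compares the shape before and after gather().

```r
library(tidyr)

# Hypothetical wide data: one column per quarter (illustrative values only)
wide <- data.frame(comp = c(1, 2), Qtr1 = c(10, 30), Qtr2 = c(20, 40))

# Collapse Qtr1 and Qtr2 into a key column (Quarter) and a value column (Revenue)
long <- gather(wide, Quarter, Revenue, Qtr1:Qtr2)

dim(wide)  # rows and columns before gathering
dim(long)  # rows and columns after gathering
long
```

Each of the two rows in `wide` contributes one row per gathered column, so `long` ends up with four rows while the identifying column `comp` is repeated as needed.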
Now let us look at spread(), which does the reverse of gather(): it takes a key-value pair and spreads it across multiple columns.
Let us create a different data set.
stocks <- data.frame(
time = as.Date('2009-01-01') + 0:9,
X = rnorm(10, 0, 1),
Y = rnorm(10, 0, 2),
Z = rnorm(10, 0, 4)
)
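stocks
# Gather to long form first: one row per (time, stock) pair
stocksm <- stocks %>% gather(stock, price, -time)
# Spread back out with one column per stock
stocksm %>% spread(stock, price)
# Spread with one column per date instead
stocksm %>% spread(time, price)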
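To see the effect of spread() without random numbers, here is a minimal sketch on a hypothetical long data frame (the names `long`, `wide` and `back` are illustrative, not from the original). It assumes tidyr is installed, and also gathers the result again to show the two functions undoing each other.

```r
library(tidyr)

# Hypothetical long data: one row per (comp, Quarter) pair
long <- data.frame(
  comp = c(1, 1, 2, 2),
  Quarter = c("Qtr1", "Qtr2", "Qtr1", "Qtr2"),
  Revenue = c(10, 20, 30, 40)
)

# spread() turns each distinct Quarter value into its own column
wide <- spread(long, Quarter, Revenue)
wide

# Gathering the quarter columns again returns to one row per pair
back <- gather(wide, Quarter, Revenue, Qtr1:Qtr2)
back
```

The round trip preserves the values, though the row order of `back` may differ from `long`, since gather() stacks the columns one after another.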