In this example you will learn how to use reshape and ggplot together to investigate sensory data.
These data were collected by an experimenter interested in the effects of fryer oil on the taste of french fries. Two replicates of three different fryer oils were tasted weekly for 10 weeks. The flavours rated were potatoey (good), buttery, grassy, rancid and painty (bad).
You can find out more about the data with ?french_fries.
First, we can try using the missing values plot. This provides a univariate summary of the missing values:
> ggmissing(french_fries)
In this case, it's not very useful. We can see that the proportion of missing values is very small, but not much else. To do better, we have to look at more than one variable at a time, using the reshape package. The first step is to melt the data, which makes it easy to cast into new shapes:
> ffm <- melt(french_fries, id = 1:4, preserve.na = FALSE)
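To make the effect of melting concrete, here is a minimal base R sketch on a made-up two-row data frame. The `toy` data and the `melt_sketch` helper are hypothetical and are not part of reshape; the helper takes id columns by name rather than by position. It stacks the measured columns into variable/value pairs and drops missing values, which is roughly what `preserve.na = FALSE` asks for.

```r
# Hypothetical toy data: two id columns and two measured flavours.
toy <- data.frame(
  subject = c(1, 2),
  rep     = c(1, 1),
  potato  = c(2.9, NA),
  buttery = c(0.0, 1.1)
)

# Illustrative helper (not reshape's implementation): stack every
# non-id column into long form, one row per id/variable combination.
melt_sketch <- function(df, id, drop_na = TRUE) {
  measured <- setdiff(names(df), id)
  pieces <- lapply(measured, function(v) {
    data.frame(df[id], variable = v, value = df[[v]],
               stringsAsFactors = FALSE)
  })
  long <- do.call(rbind, pieces)
  # Optionally drop rows whose measurement is missing.
  if (drop_na) long <- long[!is.na(long$value), , drop = FALSE]
  rownames(long) <- NULL
  long
}

toy_long <- melt_sketch(toy, id = c("subject", "rep"))
```

Because missing values are dropped at this stage, later steps can find them simply by counting how many rows each combination of id variables has left.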
We can then use cast, along with the fluctuation plot, to investigate the pattern of missing values over time and subjects. There should be a total of 30 observations (2 reps x 5 flavours x 3 treatments) for each time/subject combination, so we subtract the number of observations from 30 to get the number of missing values.
> ggfluctuation(cast(ffm, subject + time ~ ., function(x) 30 - length(x)))
There are two types of missing values: either the subject didn't turn up at that time (grey boxes) or they missed some measurements (dark grey boxes).
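The counting step can be checked without plotting. The sketch below uses base R on a small invented long-format data frame; `toy_long`, `expected` and all the values are assumptions for illustration. It subtracts the number of recorded values from the expected total for each subject/time cell, which is the same arithmetic the cast call performs.

```r
# Hypothetical long-format data after melting with missing values dropped.
# Each subject/time cell should hold 4 values here (2 reps x 2 flavours).
toy_long <- data.frame(
  subject = c(1, 1, 1, 1, 2, 2, 2),
  time    = c(1, 1, 1, 1, 1, 1, 1),
  value   = c(2.9, 0.0, 3.1, 0.4, 1.1, 5.0, 0.2)
)
expected <- 4

# Count the recorded values in each subject/time cell...
counts <- aggregate(value ~ subject + time, data = toy_long, FUN = length)
# ...and convert the counts into numbers of missing values.
counts$missing <- expected - counts$value
```

Note that with `aggregate`, a subject/time cell with no rows at all does not appear in the result, so a subject who never turned up shows as an absent row rather than as a count equal to `expected`.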
Finally, we are going to look at the reliability of the scores. We do this by casting the data into a form where the replicates form the columns, and then plot a scatterplot for each measured variable. We can supplement the scatterplot with a 45 degree line to show optimal agreement, and with contours of a 2d density estimate to help with overplotting.
> p <- ggplot(cast(ffm, ... ~ rep), aes = list(x = X1, y = X2),
    formula = . ~ variable)
> p <- ggpoint(p, size = 0.8)
> ggabline(p, range = c(0, 15))
> gg2density(p)
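As a numeric companion to the scatterplots, the following base R sketch places two replicates side by side and measures how far each pair is from the 45 degree line. The data frame `toy_long` and its values are invented for illustration, and base `reshape()` is used here in place of cast, so the column names (`value.1`, `value.2`) differ from the `X1` and `X2` used above.

```r
# Hypothetical long data: one value per subject/variable/rep.
toy_long <- data.frame(
  subject  = rep(1:3, each = 4),
  variable = rep(rep(c("potato", "rancid"), each = 2), times = 3),
  rep      = rep(1:2, times = 6),
  value    = c(3.0, 3.5, 0.5, 0.0,
               7.0, 6.0, 1.0, 2.5,
               9.5, 9.0, 4.0, 3.0)
)

# Put the two replicates side by side, one row per subject/variable.
wide <- reshape(toy_long, idvar = c("subject", "variable"),
                timevar = "rep", direction = "wide")

# Points on the 45 degree line have a difference of zero.
wide$diff <- wide$value.1 - wide$value.2

# Mean absolute disagreement between replicates, per flavour.
agreement <- tapply(abs(wide$diff), wide$variable, mean)
```

Smaller values in `agreement` mean the two replicates for that flavour sit closer to the line of perfect agreement.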