Thursday, August 11, 2011

ggplot2: Determining the order in which lines are drawn

In a time series, I want to plot the values of an interesting cluster versus the background. However, if I'm not careful, ggplot will draw the items in an order determined by their name, so background items will obscure the interesting cluster:

Correct: Interesting lines in front of background
Wrong: Background lines obscure interesting lines

One way to solve this is to combine the label and name columns into one column that is used to group the individual lines. In this toy example, the line belonging to group 1 should overlay the other two lines:
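A minimal sketch of this idea, assuming a data frame with a name column `n`, a label column `l` (1 for the interesting cluster, 0 for background), and a helper column `g` that is introduced here purely for illustration. It relies on ggplot2 drawing groups in the sorted order of the grouping value, so a combined key that starts with the label puts the lines labeled 1 last, and therefore on top:

```r
library(ggplot2)

# Toy data: three lines (a, b, c); only line "a" belongs to the interesting cluster
df <- data.frame(n = c("a", "a", "b", "b", "c", "c"),
                 x = rep(c(1, 2), 3),
                 y = rep(1, 6),
                 l = c(1, 1, 0, 0, 0, 0))

# Combine label and name into one grouping key (hypothetical column name "g").
# Because the label comes first, the keys sort as "0 b", "0 c", "1 a".
df$g <- paste(df$l, df$n)

# Group by the combined key; color by the label (as a discrete variable)
p <- ggplot(df, aes(x, y)) +
  geom_line(aes(group = g, color = factor(l)))
print(p)
```

All three lines share the same coordinates in this toy data, so whichever color is visible shows which group was drawn last.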

2 comments:

Bob Muenchen said...

Thanks for that handy example! Another way to do that is to use the levels argument when you make l a factor:

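library(ggplot2)

# Toy data: n names each line, l labels the cluster (1) vs. background (0)
df <- data.frame(n = c("a", "a", "b", "b", "c", "c"),
                 x = rep(c(1, 2), 3),
                 y = rep(1, 6),
                 l = c(1, 1, 0, 0, 0, 0))
df
# Put level 1 first when converting l to a factor
df$l <- factor(df$l, levels = c(1, 0))
p <- ggplot(df, aes(x, y)) + geom_line(aes(group = n, color = l))
print(p)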

Or you could use the relevel or reorder functions after l is created.

Cheers,
Bob Muenchen

Michael Kuhn said...

Hi Bob, you're right, in this simple example making l a factor indeed helps. Curiously, in my actual use case it doesn't. But there I also specify alpha (where I have to use a continuous scale), so I'm not sure what ggplot does internally to decide the plotting order.

cheers, Michael