geom_density returns plot without considering real values

You are calculating the density of your x-axis variable, which in your case is Variable2. It takes the same values (1,2,...,7) for every Variable1, so every group gets the same density.
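To see why, here is a minimal base R sketch. The data frame is made up (the group labels E, L, N and the values are assumptions, not your real data); it only compares the kernel density of the index with the kernel density of the values:

```r
# Made-up data: three groups sharing the same index 1..7 but with different values
set.seed(1)
df = data.frame(
  Variable1 = rep(c("E", "L", "N"), each = 7),
  Variable2 = rep(1:7, times = 3),
  value     = c(rnorm(7, 0), rnorm(7, 5), rnorm(7, 10))
)

# Kernel density of the index, per group, evaluated on a common grid
dens_index = lapply(split(df$Variable2, df$Variable1),
                    function(x) density(x, from = 0, to = 8, n = 64)$y)

# Kernel density of the measured values, per group, on a common grid
dens_value = lapply(split(df$value, df$Variable1),
                    function(x) density(x, from = -5, to = 15, n = 64)$y)

# The index densities coincide for all groups; the value densities do not
same_index = isTRUE(all.equal(dens_index$E, dens_index$L)) &&
             isTRUE(all.equal(dens_index$E, dens_index$N))
same_value = isTRUE(all.equal(dens_value$E, dens_value$L))
print(c(same_index = same_index, same_value = same_value))
```

Since every group has exactly the same index values, the density of the index cannot tell the groups apart; only the values carry that information.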

So I think that you want your x-axis to be value, and you don’t actually need Variable2, as it’s merely an index.

library(ggplot2)
library(ggridges)  # provides geom_density_ridges
ggplot(df, aes(x=value, y=Variable1)) +
  geom_density_ridges(aes(fill=Variable1))


EDIT 1:

The geom you actually want is geom_line, or geom_smooth (for prettier graphs), or maybe geom_area to fill the area under the curves.

Now, one way of doing it would be putting all the curves on the same y scale:

ggplot(df, aes(x=Variable2, y=value, color=Variable1)) +
  geom_smooth(fill=NA)


But this doesn’t give the separation that you wanted. The way I know to get it is to make one plot for each Variable1 and arrange them together (there may be an option for this in the ggridges package, but I have never used it). To do that, we build a “base” graph:

g = ggplot(df, aes(x=Variable2, y=value)) +
  geom_smooth(fill=NA) +
  theme(axis.text.x  = element_blank(),
        axis.title.x = element_blank())

Here we removed the x-axis so that it can be added only once in the grid. Then we apply that base to each variable, one at a time, with a for loop:

for(i in unique(df$Variable1)){
  df2 = df[df$Variable1==i,]
  assign(i,
         g %+% df2 + ylab(i) +
               ylim(min(df2$value),max(df2$value)))}

This creates one graph for each value of Variable1, stored in a variable named after that value (here E, L and N). Now we add the x-axis back to the last plot and arrange them together:

N = N + theme(axis.text.x  = element_text(),
              axis.title.x = element_text())

gridExtra::grid.arrange(E,L,N, nrow=3)
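As a variation on the steps above, the plots can be kept in a named list instead of being created with assign(), so the final call does not need the group names typed out. This is only a sketch: the data frame is made up to stand in for your df, and the names groups, plots and last are introduced here for illustration.

```r
library(ggplot2)

# Made-up data standing in for the asker's df (an assumption, not the real data)
set.seed(1)
df = data.frame(
  Variable1 = rep(c("E", "L", "N"), each = 7),
  Variable2 = rep(1:7, times = 3),
  value     = c(rnorm(7, 0), rnorm(7, 5), rnorm(7, 10))
)

groups = unique(df$Variable1)

# One plot per group, stored in a named list instead of global variables
plots = lapply(groups, function(i) {
  df2 = df[df$Variable1 == i, ]
  ggplot(df2, aes(x = Variable2, y = value)) +
    geom_smooth(fill = NA) +
    ylab(i) +
    theme(axis.text.x  = element_blank(),
          axis.title.x = element_blank())
})
names(plots) = groups

# Restore the x-axis on the last plot only
last = length(plots)
plots[[last]] = plots[[last]] +
  theme(axis.text.x  = element_text(),
        axis.title.x = element_text())

# Stack the plots in one column, if gridExtra is installed
if (requireNamespace("gridExtra", quietly = TRUE)) {
  gridExtra::grid.arrange(grobs = plots, nrow = length(plots))
}
```

The list version also works unchanged when the number of groups is not three.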


EDIT 2:

To use colors, we first leave the geom out of g:

g = ggplot(df, aes(x=Variable2, y=value)) +
  theme(axis.text.x  = element_blank(),
        axis.title.x = element_blank())

Then we create a vector of colors that we’ll use in the loop:

color = c("red", "green", "blue")
names(color) = unique(df$Variable1)
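Naming the vector is what lets the loop pick the right color by group label rather than by position. A small base R sketch of that lookup, with the labels E, L and N assumed in place of unique(df$Variable1):

```r
# Hypothetical group labels standing in for unique(df$Variable1)
groups = c("E", "L", "N")

color = c("red", "green", "blue")
names(color) = groups

for (i in groups) {
  # color[i] looks the color up by name, not by position
  cat(i, "->", color[i], "\n")
}
```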

Then we pass the color argument inside the geom that we omitted earlier.

But first, a word about the available geoms. We could use a smoothed area (stat_smooth with geom="area"), which is good but leaves a lot of useless area under the curves, because the fill always extends down to zero. To change that, we can use geom_ribbon with aes(ymin=min(value)-0.1, ymax=value) together with ylim(min(df2$value)-0.1, max(df2$value)), so that the fill stops at the minimum value (minus 0.1). The problem is that ggplot’s smoothing doesn’t work well with geom_ribbon, so this only gives a “rough”, unsmoothed graph.

Code for the smooth area:

for(i in unique(df$Variable1)){
  df2 = df[df$Variable1==i,]
  assign(i,
         g %+% df2 + ylab(i) +
         stat_smooth(geom="area", fill=color[i]))}

Code for the rough ribbon:

for(i in unique(df$Variable1)){
  df2 = df[df$Variable1==i,]
  assign(i,
         g %+% df2 + ylab(i) + ylim(min(df2$value)-0.1,max(df2$value)) +
         geom_ribbon(aes(ymax=value, ymin=min(value)-0.1), fill=color[i]))}

I searched for a way to work around that smoothing problem but found nothing. I’ll ask a question on the site and, if I find a solution, I’ll show it here!

EDIT 3:

After asking about this in a separate question, I found that using after_stat inside the aes argument of stat_smooth(geom="ribbon", aes(...)) solves it (see the linked question for more details). Inside after_stat, the smoothed values are available as y, the variable computed by stat_smooth:

for(i in unique(df$Variable1)){
  df2 = df[df$Variable1==i,]
  assign(i,
         g %+% df2 + ylab(i) +
           stat_smooth(geom="ribbon", fill=color[i],
                       aes(ymax=after_stat(y), ymin=after_stat(min(y))-0.1)))}
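To check what the smoothed ribbon actually draws, here is a self-contained sketch for a single group. The data are made up, and method="loess" with formula=y~x is spelled out only to make the default smoother explicit. It relies on stat_smooth exposing the smoothed values as the computed variable y, and uses layer_data() to inspect the result:

```r
library(ggplot2)

# Made-up data for a single group (an assumption, not the asker's data)
set.seed(1)
df2 = data.frame(Variable2 = 1:20, value = cumsum(rnorm(20)))

p = ggplot(df2, aes(x = Variable2, y = value)) +
  stat_smooth(geom = "ribbon", fill = "red",
              method = "loess", formula = y ~ x,
              aes(ymax = after_stat(y),
                  ymin = after_stat(min(y)) - 0.1))

# Data the layer ends up drawing: ymax follows the smoothed curve,
# ymin is a single constant 0.1 below the curve's minimum
ld = layer_data(p)
print(range(ld$ymin))
print(head(ld[, c("x", "ymin", "ymax")]))
```

Because min(y) is evaluated on the smoothed values and not on the raw data, the ribbon always ends just below the lowest point of the curve.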
