
Order Bars in ggplot2 bar graph

360

I am trying to make a bar graph where the largest bar is nearest to the y axis and the shortest bar is furthest from it. My table looks something like this:

    Name   Position
1   James  Goalkeeper
2   Frank  Goalkeeper
3   Jean   Defense
4   Steve  Defense
5   John   Defense
6   Tim    Striker

I am trying to build a bar graph that shows the number of players in each position:

p <- ggplot(theTable, aes(x = Position)) + geom_bar(binwidth = 1)

but the graph shows the goalkeeper bar first, then the defense bar, and finally the striker bar. I want the graph to be ordered so that the defense bar is closest to the y axis, followed by the goalkeeper bar, and finally the striker bar.
Thanks

5

  • 17

    can’t ggplot reorder them for you without having to mess around with the table (or dataframe)?

    Mar 23, 2014 at 6:42

  • 2

    @MattO’Brien I find it incredible that this is not done in a single, simple command

    Dec 27, 2019 at 17:57

  • @Zimano Too bad that’s what you’re getting from my comment. My observation was towards the creators of ggplot2, not the OP

    Jan 24, 2020 at 14:10

  • 3

    @Euler_Salter Thank you for clarifying, my sincere apologies for jumping on you like that. I have deleted my original remark.

    – Zimano

    Jan 24, 2020 at 14:14

  • ggplot2 currently ignores binwidth = 1 with a warning. To control the width of the bars (and have no gaps between bars), you might want to use width = 1 instead.

    – stragu

    Oct 27, 2020 at 6:11


246

The key with ordering is to set the levels of the factor in the order you want. An ordered factor is not required; the extra information in an ordered factor isn’t necessary and if these data are being used in any statistical model, the wrong parametrisation might result — polynomial contrasts aren’t right for nominal data such as this.

## set the levels in the order we want
theTable <- within(theTable,
                   Position <- factor(Position,
                                      levels = names(sort(table(Position),
                                                          decreasing = TRUE))))
## plot (binwidth dropped: it does not apply to geom_bar() on a discrete x)
ggplot(theTable, aes(x = Position)) + geom_bar()

barplot figure
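To illustrate the point about ordered factors made at the start of this answer, here is a small base-R sketch. The vector `pos` is copied from the question's table, and the names `plain` and `ord` are only for illustration. It assumes R's default `options("contrasts")`.

```r
# Positions copied from the question's table; names here are illustrative
pos <- c("Goalkeeper", "Goalkeeper", "Defense", "Defense", "Defense", "Striker")
lev <- c("Defense", "Goalkeeper", "Striker")

plain <- factor(pos, levels = lev)                  # nominal factor
ord   <- factor(pos, levels = lev, ordered = TRUE)  # ordered factor

# Both carry the same level order, so both would plot the same way
same_levels <- identical(levels(plain), levels(ord))

# Under the default options, a model would use treatment contrasts
# for the plain factor and polynomial contrasts for the ordered one
plain_cols <- colnames(contrasts(plain))  # "Goalkeeper" "Striker"
ord_cols   <- colnames(contrasts(ord))    # ".L" ".Q"
```

The plotting order is the same either way, so the ordered version adds nothing for the bar graph and only changes how a model would encode the variable.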

In the most general sense, we simply need to set the factor levels to be in the desired order. If left unspecified, the levels of a factor will be sorted alphabetically. You can also specify the level order within the call to factor as above, and other ways are possible as well.

theTable$Position <- factor(theTable$Position, levels = c(...))
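As a self-contained sketch of the approach, the data frame below is rebuilt by hand from the table in the question. The construction, including the assumption that the levels start out in order of first appearance (which would match the plot the asker describes), is mine rather than the asker's actual code.

```r
library(ggplot2)

# Data rebuilt from the question's table (assumed construction)
positions_raw <- c("Goalkeeper", "Goalkeeper", "Defense",
                   "Defense", "Defense", "Striker")
theTable <- data.frame(
  Name     = c("James", "Frank", "Jean", "Steve", "John", "Tim"),
  # Assumption: levels initially follow order of first appearance
  Position = factor(positions_raw, levels = unique(positions_raw))
)
before <- levels(theTable$Position)  # "Goalkeeper" "Defense" "Striker"

# Count players per position and sort the counts, largest first
counts <- sort(table(theTable$Position), decreasing = TRUE)

# Rebuild the factor with the sorted names as levels;
# the values in each row are unchanged, only the level order moves
theTable$Position <- factor(theTable$Position, levels = names(counts))
after <- levels(theTable$Position)   # "Defense" "Goalkeeper" "Striker"

# Bars are drawn in level order
p <- ggplot(theTable, aes(x = Position)) + geom_bar()
```

Printing `p` should then show the defense bar first. Passing `decreasing = FALSE` to `sort()` would give the opposite order.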

10

  • 1

    @Gavin: 2 simplifications: since you already are using within, there’s no need to use theTable$Position, and you could just do sort(-table(...)) for decreasing order.

    Mar 6, 2011 at 15:16

  • 2

    @Prasad the former was a leftover from testing, so thanks for pointing that out. As for the latter, I prefer explicitly asking for the reversed sort rather than the - you use, as it is far easier to get the intention from decreasing = TRUE than to notice the - in all the rest of the code.

    Mar 6, 2011 at 15:22

  • 2

    @GavinSimpson; I think the part about levels(theTable$Position) <- c(...) leads to undesired behaviour where the actual entries of the data frame get reordered, and not just the levels of the factor. See this question. Maybe you should modify or remove those lines?

    – Anton

    Feb 18, 2019 at 11:56


  • 2

    Strongly agree with Anton. I just saw this question and went poking around on where they got the bad advice to use levels<-. I’m going to edit that part out, at least tentatively.

    Feb 18, 2019 at 23:03

  • 2

    @Anton Thanks for the suggestion (and to Gregor for the edit); I would never do this via levels<-() today. This is something from 8 years back and I can’t recall if things were different back then or whether I was just plain wrong, but regardless, it is wrong and should be erased! Thanks!

    Feb 19, 2019 at 4:09

182

Use scale_x_discrete(limits = ...) to specify the order of the bars.

positions <- c("Goalkeeper", "Defense", "Striker")
p <- ggplot(theTable, aes(x = Position)) + geom_bar() + scale_x_discrete(limits = positions)
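A sketch of how the `limits` vector might be derived from the data instead of being typed by hand. It assumes `theTable` is rebuilt from the question's table; that construction and the `geom_bar()` layer are my additions, not part of the answer.

```r
library(ggplot2)

# Data rebuilt from the question's table (assumed construction)
theTable <- data.frame(
  Name     = c("James", "Frank", "Jean", "Steve", "John", "Tim"),
  Position = c("Goalkeeper", "Goalkeeper", "Defense",
               "Defense", "Defense", "Striker")
)

# Category names ordered by frequency, largest first
positions <- names(sort(table(theTable$Position), decreasing = TRUE))

# The column itself is left untouched; only the scale fixes the bar order
p <- ggplot(theTable, aes(x = Position)) +
  geom_bar() +
  scale_x_discrete(limits = positions)
```

Here `positions` comes out as Defense, Goalkeeper, Striker. As far as I know, categories absent from `limits` are dropped from the plot, so the vector should list every value you want drawn.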

6

  • 13

    Your solution is the most suitable to my situation, as I want to write a program that plots with x being an arbitrary column of a data.frame, named by a variable. With the other suggestions it would be harder to express the ordering of x through an expression involving that variable. Thanks! If there is interest, I can share my solution using your suggestion. Just one more issue: after adding scale_x_discrete(limits = …), I found that there is blank space, as wide as the bar chart itself, on the right of the chart. How can I get rid of the blank space? It does not serve any purpose.

    – Yu Shen

    Apr 28, 2015 at 1:04


  • 1

    This seems necessary for ordering histogram bars

    – geotheory

    Aug 4, 2015 at 9:50

  • 10

    QIBIN: Wow…the other answers here work, but your answer by far seems not just the most concise and elegant, but the most obvious when thinking from within ggplot’s framework. Thank you.

    – dancow

    Sep 10, 2015 at 13:53

  • When I tried this solution on my data, it didn’t graph NAs. Is there a way to use this solution and have it graph NAs?

    May 25, 2017 at 18:13


  • This is an elegant and simple solution – thank you!!

    Nov 6, 2018 at 17:00