{highcharts} doesn’t like empty factors (and the .by argument is quite nice, actually)

Author: Charlie Hadley
Published: October 24, 2024

Heard of {plotly} for interactive charts? I massively prefer {highcharter} as it’s more accessible, more mobile friendly and simply more beautiful. Both packages are htmlwidgets, and there are many more htmlwidgets packages for making interactive charts, maps and more.

But. We’re here to discuss how {highcharter} doesn’t like empty factor levels - with my own YouTubesque thumbnail.

Let’s demonstrate how {ggplot2} handles ordering of factor levels that are empty:

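library(tidyverse)

# Count respondents per partyid/marital pair, then total them per partyid.
# The .by argument needs {dplyr} 1.1.0 or later.
gss_party_by_marital <- gss_cat %>%
  summarise(n_in_subcategory = n(), .by = c(partyid, marital)) %>%
  mutate(n_in_category = sum(n_in_subcategory), .by = partyid) %>%
  # Order the partyid levels by their totals so the bars come out sorted
  mutate(partyid = fct_reorder(partyid, n_in_category))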

gss_party_by_marital %>% 
  ggplot(aes(x = n_in_subcategory,
             y = partyid,
             fill = marital)) +
  geom_col() +
  scale_fill_viridis_d(option = "A", direction = 1) +
  guides(fill = guide_legend(reverse = TRUE))

That’s a nicely ordered stacked bar chart! We’ve got the bars going from big to small, even though not every category contains all of the marital states. Compare the two smallest categories:

gss_party_by_marital %>% 
  filter(partyid %in% c("No answer", "Don't know")) %>% 
  knitr::kable()
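| partyid    | marital       | n_in_subcategory | n_in_category |
|:-----------|:--------------|-----------------:|--------------:|
| No answer  | Married       |               60 |           154 |
| No answer  | Divorced      |               29 |           154 |
| No answer  | Never married |               35 |           154 |
| No answer  | Widowed       |               16 |           154 |
| No answer  | Separated     |                9 |           154 |
| No answer  | No answer     |                5 |           154 |
| Don't know | Married       |                1 |             1 |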
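To list every partyid that is missing at least one marital category, one option is to count the distinct marital values in each group. This is a sketch rather than part of the original analysis; the names `n_marital_levels`, `n_marital_present` and `incomplete_parties` are made up for illustration.

```r
library(tidyverse)

# Number of marital categories defined in the data (factor levels)
n_marital_levels <- nlevels(gss_cat$marital)

# Keep only the partyid groups that lack at least one marital category
incomplete_parties <- gss_cat %>%
  summarise(n_marital_present = n_distinct(marital), .by = partyid) %>%
  filter(n_marital_present < n_marital_levels)

incomplete_parties
```

Any partyid that shows up here has gaps that {ggplot2} quietly tolerates.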

Let’s throw this into {highcharter} and we see just a complete mess, because some combinations of partyid and marital are missing from the data.

library(highcharter)

gss_party_by_marital %>% 
  hchart(
    type = "bar",
    hcaes(
      y = n_in_subcategory,
      x = partyid,
      group = marital
    )
  ) %>% 
  hc_plotOptions(series = list(stacking = "normal"))

Using complete() and talking about that .by argument

You might have missed it in my code above, but I used groups without group_by() or ungroup()! Back in late 2022 there was a suggestion that the grouping functions gain a .by argument which was added as an experimental feature.
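For comparison, here is a sketch of the same count written both ways. The object names `counts_group_by` and `counts_by` are only for illustration. One difference worth knowing: group_by() sorts the result by the grouping columns, while .by keeps groups in order of first appearance, so the rows are sorted before comparing.

```r
library(tidyverse)

# Older style: group, summarise, then drop the remaining grouping.
# Without ungroup() the result would still be grouped by partyid.
counts_group_by <- gss_cat %>%
  group_by(partyid, marital) %>%
  summarise(n_in_subcategory = n()) %>%
  ungroup()

# Per-operation grouping with .by (needs {dplyr} 1.1.0 or later)
counts_by <- gss_cat %>%
  summarise(n_in_subcategory = n(), .by = c(partyid, marital))

# Same counts once both results are put in the same row order
all.equal(
  arrange(counts_group_by, partyid, marital),
  arrange(counts_by, partyid, marital)
)
```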

Initially, I really didn’t like it. But look how neat this is. The summarise() and mutate() lines below use the .by argument, which saves having to call group_by() and ungroup(). And just to be clear, complete() is filling in the empty factor levels with an explicit value of 0.

gss_completed_data <- gss_cat %>%
  # .by groups for this one call only
  summarise(n_in_subcategory = n(), .by = c(partyid, marital)) %>%
  # Add a row with a count of 0 for every missing partyid/marital pair
  complete(partyid,
           marital,
           fill = list(n_in_subcategory = 0)) %>%
  mutate(n_in_category = sum(n_in_subcategory), .by = partyid) %>%
  mutate(partyid = fct_reorder(partyid, n_in_category)) %>%
  # Put the largest categories first
  arrange(desc(n_in_category))

gss_completed_data %>% 
  filter(partyid %in% c("No answer", "Don't know")) %>% 
  knitr::kable()
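| partyid    | marital       | n_in_subcategory | n_in_category |
|:-----------|:--------------|-----------------:|--------------:|
| No answer  | No answer     |                5 |           154 |
| No answer  | Never married |               35 |           154 |
| No answer  | Separated     |                9 |           154 |
| No answer  | Divorced      |               29 |           154 |
| No answer  | Widowed       |               16 |           154 |
| No answer  | Married       |               60 |           154 |
| Don't know | No answer     |                0 |             1 |
| Don't know | Never married |                0 |             1 |
| Don't know | Separated     |                0 |             1 |
| Don't know | Divorced      |                0 |             1 |
| Don't know | Widowed       |                0 |             1 |
| Don't know | Married       |                1 |             1 |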
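If complete() is new to you, a toy example might make the behaviour easier to see. The data below is invented purely for illustration (`toy_counts`, `party`, `status` and `n` are not from gss_cat): party "B" has no "Single" row, so complete() adds one and the fill argument sets its count to 0.

```r
library(tidyverse)

# Three observed combinations; "B" / "Single" is missing
toy_counts <- tribble(
  ~party, ~status,   ~n,
  "A",    "Married", 3,
  "A",    "Single",  2,
  "B",    "Married", 1
)

# Expand to every party/status combination, filling new rows with n = 0
toy_completed <- toy_counts %>%
  complete(party, status, fill = list(n = 0))

toy_completed
```

With character columns complete() uses the values that appear in the data. With factor columns, as in gss_cat, it uses all of the factor levels.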

Now that those empty factor levels are filled, we can throw the data back into {highcharter}. Make sure to interact with the chart, and click on the legend items to see just how beautiful {highcharter} is.

gss_completed_data %>% 
  hchart(
    type = "bar",
    hcaes(
      y = n_in_subcategory,
      x = partyid,
      group = marital
    )
  ) %>% 
  hc_plotOptions(series = list(stacking = "normal"))
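Before charting, it can be worth a quick sanity check that every marital series has a row for every partyid. The sketch below rebuilds the completed counts so it runs on its own; `gss_completed_check` and `points_per_series` are illustrative names, and the check itself is my suggestion rather than anything {highcharter} requires.

```r
library(tidyverse)

# Rebuild the completed counts (same steps as above, minus the ordering)
gss_completed_check <- gss_cat %>%
  summarise(n_in_subcategory = n(), .by = c(partyid, marital)) %>%
  complete(partyid, marital, fill = list(n_in_subcategory = 0))

# How many rows (one per partyid) does each marital series have?
points_per_series <- gss_completed_check %>%
  count(marital, name = "n_points")

# Every series should have the same number of rows
stopifnot(n_distinct(points_per_series$n_points) == 1)

points_per_series
```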

A necessary aside on licensing

{highcharter} is a wrapper for the incredible Highcharts JavaScript framework (https://highcharts.com/). It is not free to use {highcharter} in commercial projects, as that requires a Highcharts license, but there are free license options for personal projects and educational usage.

Citation

BibTeX citation:
@online{hadley2024,
  author = {Hadley, Charlie},
  title = {\{Highcharts\} Doesn’t Like Empty Factors (and the .by
    Argument Is Quite Nice, Actually)},
  date = {2024-10-24},
  url = {https://visibledata.co.uk/posts/2024-10-24_Highchart-doesnt-like-empty-factors},
  langid = {en}
}
For attribution, please cite this work as:
Hadley, Charlie. 2024. “{Highcharts} Doesn’t Like Empty Factors (and the .by Argument Is Quite Nice, Actually).” October 24, 2024. https://visibledata.co.uk/posts/2024-10-24_Highchart-doesnt-like-empty-factors.