library(tidyverse)
# Count respondents per party/marital combination, then total per party
gss_party_by_marital <- gss_cat %>%
  summarise(n_in_subcategory = n(), .by = c(partyid, marital)) %>%
  mutate(n_in_category = sum(n_in_subcategory), .by = partyid) %>%
  # Order the party factor by its total so the bars run from big to small
  mutate(partyid = fct_reorder(partyid, n_in_category))
# Stacked horizontal bar chart of party by marital status
gss_party_by_marital %>%
  ggplot(aes(x = n_in_subcategory,
             y = partyid,
             fill = marital)) +
  geom_col() +
  scale_fill_viridis_d(option = "A", direction = 1) +
  guides(fill = guide_legend(reverse = TRUE))
Heard of {plotly} for interactive charts? I massively prefer {highcharter} as it’s more accessible, mobile friendly and simply more beautiful. Both of these packages are htmlwidgets, and there are so many htmlwidgets packages for making interactive charts, maps and more.
But we’re here to discuss how {highcharter} doesn’t like empty factor levels - with my own YouTubesque thumbnail.
Let’s demonstrate how {ggplot2} handles ordering of factor levels that are empty:
That’s a nicely ordered stacked bar chart! We’ve got the bars going from big to small, even though not every category contains all of the marital statuses: as the table below shows, “Don’t know” has just a single row.
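One way to check which party categories are missing marital statuses is to count the distinct marital values per partyid and compare that with the number of factor levels. This is only a minimal sketch, assuming the gss_cat dataset shipped with {forcats} and a version of {dplyr} that supports .by; the names marital_coverage and n_marital_present are made up for illustration.

```r
library(dplyr)
library(forcats) # provides the gss_cat dataset

# Count how many distinct marital statuses actually appear for each partyid,
# then keep only the parties that are missing at least one marital level
marital_coverage <- gss_cat %>%
  summarise(n_marital_present = n_distinct(marital), .by = partyid) %>%
  filter(n_marital_present < nlevels(gss_cat$marital))

marital_coverage
```

Any partyid listed in the result has at least one marital level with no rows, which is exactly the situation the rest of this post deals with.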
gss_party_by_marital %>%
  filter(partyid %in% c("No answer", "Don't know")) %>%
  knitr::kable()
partyid | marital | n_in_subcategory | n_in_category |
---|---|---|---|
No answer | Married | 60 | 154 |
No answer | Divorced | 29 | 154 |
No answer | Never married | 35 | 154 |
No answer | Widowed | 16 | 154 |
No answer | Separated | 9 | 154 |
No answer | No answer | 5 | 154 |
Don’t know | Married | 1 | 1 |
Let’s throw this into {highcharter} and we see just a complete mess.
library(highcharter)
gss_party_by_marital %>%
  hchart(
    type = "bar",
    hcaes(
      y = n_in_subcategory,
      x = partyid,
      group = marital
    )
  ) %>%
  hc_plotOptions(series = list(stacking = "normal"))
Using complete() and talking about that .by argument
You might have missed it in my code above, but I used groups without group_by() or ungroup()! Back in late 2022 there was a suggestion that the grouping functions gain a .by argument, which was added as an experimental feature.
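To make the difference concrete, here is a small sketch comparing the two styles on a made-up data set. The toy tibble and its column names (party, votes, party_total) are invented for illustration, and it assumes a version of {dplyr} that supports .by.

```r
library(dplyr)

# Small made-up data set, purely for illustration
toy <- tibble(
  party = c("a", "a", "b", "b", "b"),
  votes = c(1, 2, 3, 4, 5)
)

# Persistent grouping: group, mutate, then remember to ungroup
with_group_by <- toy %>%
  group_by(party) %>%
  mutate(party_total = sum(votes)) %>%
  ungroup()

# Per-operation grouping: .by applies to this one mutate() call only,
# so the result comes back ungrouped without any extra step
with_by <- toy %>%
  mutate(party_total = sum(votes), .by = party)

# Both approaches compute the same per-party totals
identical(with_group_by$party_total, with_by$party_total)
```

The main design difference is that group_by() leaves the grouping attached to the data until ungroup() is called, whereas .by only lasts for the single verb it is passed to.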
Initially I really didn’t like it. But look how neat this is: the summarise() and mutate() lines below make use of the .by argument, which saves having to call group_by() and ungroup(). And just to be clear, complete() is filling in the missing combinations of factor levels with an explicit value of 0.
gss_completed_data <- gss_cat %>%
  summarise(n_in_subcategory = n(), .by = c(partyid, marital)) %>%
  complete(partyid,
           marital,
           fill = list(n_in_subcategory = 0)) %>%
  mutate(n_in_category = sum(n_in_subcategory), .by = partyid) %>%
  mutate(partyid = fct_reorder(partyid, n_in_category)) %>%
  arrange(desc(n_in_category))
gss_completed_data %>%
  filter(partyid %in% c("No answer", "Don't know")) %>%
  knitr::kable()
partyid | marital | n_in_subcategory | n_in_category |
---|---|---|---|
No answer | No answer | 5 | 154 |
No answer | Never married | 35 | 154 |
No answer | Separated | 9 | 154 |
No answer | Divorced | 29 | 154 |
No answer | Widowed | 16 | 154 |
No answer | Married | 60 | 154 |
Don’t know | No answer | 0 | 1 |
Don’t know | Never married | 0 | 1 |
Don’t know | Separated | 0 | 1 |
Don’t know | Divorced | 0 | 1 |
Don’t know | Widowed | 0 | 1 |
Don’t know | Married | 1 | 1 |
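If the effect of complete() is hard to spot in the full data, a tiny example may help. This sketch uses an invented three-row tibble (toy_counts, with made-up columns party, status and n) in which party "b" has no "single" row; complete() adds that missing combination and the fill argument sets its count to 0.

```r
library(dplyr)
library(tidyr)

# Made-up counts: party "b" has no row for status "single"
toy_counts <- tibble(
  party = factor(c("a", "a", "b")),
  status = factor(c("married", "single", "married")),
  n = c(10L, 5L, 1L)
)

# complete() expands to every party/status combination;
# fill replaces the resulting NA counts with an explicit 0
completed <- toy_counts %>%
  complete(party, status, fill = list(n = 0L))

completed
```

The result has one row for each of the four party/status combinations, which is the shape the stacked chart needs.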
Now those empty factor levels are filled we can throw it back into {highcharter}. Make sure to interact with the chart, and click on the legend items to see just how beautiful {highcharter} is.
gss_completed_data %>%
  hchart(
    type = "bar",
    hcaes(
      y = n_in_subcategory,
      x = partyid,
      group = marital
    )
  ) %>%
  hc_plotOptions(series = list(stacking = "normal"))
A necessary aside on licensing
{highcharter} is a wrapper for the incredible Highcharts JavaScript framework (https://highcharts.com/). It is not free to use {highcharter} in commercial projects, but there are free license options for personal projects and educational usage.
Citation
@online{hadley2024,
author = {Hadley, Charlie},
title = {\{Highcharts\} Doesn’t Like Empty Factors (and the .by
Argument Is Quite Nice, Actually)},
date = {2024-10-24},
url = {https://visibledata.co.uk/posts/2024-10-24_Highchart-doesnt-like-empty-factors},
langid = {en}
}