How do you make a histogram with equally sized dots or squares for each observation, and colour them by another variable?

[This article was first published on pacha.dev/blog, and kindly contributed to R-bloggers].

I got this question from a reader: How do you make a histogram with equally sized dots or squares for each observation, and colour them by another variable?

I shall use the Palmer Penguins dataset to answer this, which contains observations of the species and body mass for a sample of penguins:

    library(palmerpenguins)
    library(dplyr)

    glimpse(penguins)

    Rows: 344
    Columns: 8
    $ species           <fct> Adelie, Adelie, Adelie, Adelie, Adelie, Adelie, Adel…
    $ island            <fct> Torgersen, Torgersen, Torgersen, Torgersen, Torgerse…
    $ bill_length_mm    <dbl> 39.1, 39.5, 40.3, NA, 36.7, 39.3, 38.9, 39.2, 34.1, …
    $ bill_depth_mm     <dbl> 18.7, 17.4, 18.0, NA, 19.3, 20.6, 17.8, 19.6, 18.1, …
    $ flipper_length_mm <int> 181, 186, 195, NA, 193, 190, 181, 195, 193, 190, 186…
    $ body_mass_g       <int> 3750, 3800, 3250, NA, 3450, 3650, 3625, 4675, 3475, …
    $ sex               <fct> male, female, female, NA, female, male, female, male…
    $ year              <int> 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007…

To use squares, one possibility is to create a discrete body mass variable by intervals and count by species and interval:

    library(tidyr)
    library(ggplot2)
    library(tintin)

    # Create equal-width bins (wider bins)
    n_bins [...] %>%
      mutate(body_mass_d = cut(body_mass_g, breaks = 4, dig.lab = 6)) %>%
      group_by(species, body_mass_d) %>%
      count()

    d

    # A tibble: 9 × 3
    # Groups:   species, body_mass_d [9]
      species   body_mass_d        n
      <fct>     <fct>          <int>
    1 Adelie    (2696.4,3600]     71
    2 Adelie    (3600,4500]       73
    3 Adelie    (4500,5400]        7
    4 Chinstrap (2696.4,3600]     26
    5 Chinstrap (3600,4500]       40
    6 Chinstrap (4500,5400]        2
    7 Gentoo    (3600,4500]       17
    8 Gentoo    (4500,5400]       72
    9 Gentoo    (5400,6303.6]     34

Now I can create a Tetris-style column plot where each square represents 5 penguins:

    square_size [...] 0) %>%
      uncount(full_squares) %>%
      group_by(species, body_mass_d) %>%
      mutate(
        square_id = row_number() - 1,
        y = square_id + 0.5,
        height = 1,
        square_type = "full"
      ) %>%
      ungroup()

    # Create partial squares
    partial_squares_df [...] %>%
      filter(remainder > 0) %>%
      mutate(
        square_id = full_squares,
        y = full_squares + partial_height / 2,
        height = partial_height,
        square_type = "partial"
      )

    # Combine both
    d_squares
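
To make the split into full and partial squares concrete, here is a minimal base R sketch of the arithmetic. It assumes each count `n` is divided into `n %/% 5` full squares plus, when there is a remainder, one shorter square whose height is the remainder divided by 5. The helper `split_into_squares()` is hypothetical and is not the code from the post; the column names simply mirror the ones used above.

```r
# Hypothetical helper (an assumption, not the post's code): split one count
# into unit squares of `square_size` observations, plus one shorter square
# for whatever is left over.
split_into_squares <- function(n, square_size = 5) {
  full_squares   <- n %/% square_size        # number of complete squares
  remainder      <- n %% square_size         # observations left over
  partial_height <- remainder / square_size  # height of the last square

  # Full squares have height 1 and are centred at 0.5, 1.5, 2.5, ...
  full <- data.frame(
    y           = seq_len(full_squares) - 0.5,
    height      = rep(1, full_squares),
    square_type = rep("full", full_squares)
  )

  if (remainder == 0) return(full)

  # The partial square sits on top of the stack of full squares
  partial <- data.frame(
    y           = full_squares + partial_height / 2,
    height      = partial_height,
    square_type = "partial"
  )

  rbind(full, partial)
}

# Adelie penguins in the (4500,5400] bin: n = 7 in the table above,
# so one full square plus a partial square of height 2/5
sq <- split_into_squares(7)
print(sq)
```

The heights add up to `n / square_size`, so the total column height stays proportional to the count. Rows like these could then be drawn with, for example, `geom_tile()` mapping `y` and `height`, though the post's own plotting code may differ.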