stats19 v4.0.0: 45 Years of UK Road Crash Data, Unified

[This article was first published on R | Robin Lovelace, and kindly contributed to R-bloggers].

The stats19 R package has been updated to version 4.0.0. The main change is a unified column schema that lets you work with 45 years of UK road crash data (1979 to 2024) without running into mismatched column names.

Unified schema

Older data files have columns like carriageway_hazards_historic while newer ones use carriageway_hazards. v4.0.0 detects these variants, merges them into the modern names, and drops the redundant columns.

    library(stats19)
    crashes = get_stats19(year = 1979:2024, type = "crashes")

Parsing fixes

read_stats19() now builds a custom parser from the CSV header, which removes the warnings about unmatched columns that appeared in previous versions. We also fixed a bug where 2024 latitude and longitude values were truncated to integers.

Missing values

Codes like -1, “Code deprecated”, and “Data missing or out of range” are now standardised to NA during formatting, so is.na() works consistently.

Performance

The package now uses readr Edition 2 by default, which supports multi-threaded parsing. Loading large files is noticeably faster.

New functions

- match_tag() joins government TAG cost estimates (RAS4001) to collision data
- clean_make(), clean_model(), and clean_make_model() standardise the 2,400+ raw strings in the vehicle dataset

Multi-year downloads

Year ranges now download bulk historic files once and filter efficiently. The 1979 file is also handled correctly (it used to be returned as a catch-all for any older year).

Feedback wanted

We plan to submit to CRAN soon. Please install, test, and report any issues:

    pak::pak("ropensci/stats19")

Issues: github.com/ropensci/stats19/issues

Acknowledgements

Contributions from David Ranzolin and Adam Sparks (rOpenSci review), Malcolm Morgan, Layik Hama, and Blaise Kelly.
Funding from the RAC Foundation.

Links

- GitHub
- Documentation
- Changelog
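To make the unified schema and missing-value handling described above more concrete, here is a minimal base-R sketch on a toy data frame. The helpers merge_historic() and codes_to_na() and the toy data are hypothetical and written for illustration only; they are not the functions stats19 uses internally, and the package's real logic may differ.

```r
# Illustrative sketch only: merge_historic() and codes_to_na() are
# hypothetical helpers, not part of the stats19 API.

# Merge "<name>_historic" columns into "<name>" and drop the old column.
merge_historic = function(df, suffix = "_historic") {
  pattern = paste0(suffix, "$")
  for (old in grep(pattern, names(df), value = TRUE)) {
    new = sub(pattern, "", old)
    if (new %in% names(df)) {
      # Fill gaps in the modern column from the historic one
      gaps = is.na(df[[new]])
      df[[new]][gaps] = df[[old]][gaps]
      df[[old]] = NULL
    } else {
      # Only the historic variant exists, so rename it
      names(df)[names(df) == old] = new
    }
  }
  df
}

# Replace the missing-value codes named in the post with NA.
missing_codes = c("-1", "Code deprecated", "Data missing or out of range")
codes_to_na = function(df, codes = missing_codes) {
  df[] = lapply(df, function(x) {
    x[as.character(x) %in% codes] = NA
    x
  })
  df
}

# Toy data (made up) mixing an old and a new column variant
toy = data.frame(
  carriageway_hazards = c(NA, "None", "Data missing or out of range"),
  carriageway_hazards_historic = c("Other object on road", NA, NA),
  speed_limit = c(30, -1, 60),
  stringsAsFactors = FALSE
)

clean = codes_to_na(merge_historic(toy))
print(clean)
```

After both steps the toy data has a single carriageway_hazards column, and is.na() picks up the coded missing values in the same way for the character and numeric columns.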