[This article was first published on free range statistics - R, and kindly contributed to R-bloggers].

How long do wars last, on average? If a war such as that currently under way in Iran has lasted 74 days so far, how long do we expect it to last in total? For all sorts of reasons, inquiring minds are interested. Luckily there are some very well curated datasets out there, including the Correlates of War, that make it easy to answer these questions.

A caveat to all this: I am not a military historian, just an interested amateur. I’m very open to having mistakes of interpretation or method pointed out to me.

Distribution of wars’ durations

The Correlates of War data lets us see, for example, that this is the distribution (on a logarithmic scale) of durations of wars post-Napoleon:

[Figure: density plot of inter-state war durations on a logarithmic scale, compared with a log-normal distribution]

You can see I’ve compared this to a log-normal distribution and found that it doesn’t have quite as fat tails as that. But that’s ok, I’m not too worried about the precise shape, because later on I’ll be using pretty straightforward empirical methods.

This data is only for inter-state wars, as opposed to intra-state wars (e.g. civil wars) and extra-state wars (e.g. with external non-state actors). As I’m interested in a reference population to compare the current USA-Israel-Iran war to, it’s the inter-state population I want.

The median length of a war is 139 days and the mean is 408 days. The four-day war in the dataset is the so-called “Football War” of 1969 between Honduras and El Salvador.
The 3,734-day war was the much better-known “Vietnam War Phase II”, involving the USA, Australia, Vietnam, Cambodia and others.

Here’s the code to import the data from the Correlates of War project and draw that first density plot:

library(tidyverse)
library(lubridate)
library(janitor)
library(glue)
library(ggrepel)
library(scales)

# https://correlatesofwar.org/data-sets/cow-war/

#----- import interstate war data----------------------
# The import call is missing from the source; read_csv() and the "..." path
# are assumptions standing in for the CSV downloaded from the URL above.
interstate <- read_csv("...") |>
  clean_names() |>
  mutate(start_date = as.Date(sprintf("%04d-%02d-%02d", start_year1, start_month1, start_day1)),
         end_date   = as.Date(sprintf("%04d-%02d-%02d", end_year1, end_month1, end_day1)))

# one row per war: earliest start, latest end and total battle deaths
# across all participants
interstate_wars <- interstate |>
  group_by(war_num, war_name) |>
  summarise(earliest_start = min(start_date),
            latest_end = max(end_date),
            bat_death = sum(bat_death)) |>
  mutate(duration = as.numeric(latest_end - earliest_start),
         start_year = year(earliest_start)) |>
  ungroup()

# what years covered? 1823 to 2003 at time of writing
range(interstate_wars$start_year)

#==========================plots=================
simple_caption <- ...

... |>
  mutate(cumulative_freq = 1:n() / n())

# smoothed model of the cumulative distribution, including estimates of where
# the Iran war is on it:
model