Introduction

Aneuploidy can arise during human meiosis through nondisjunction (NDJ) errors, but it is challenging to study because most aneuploidies do not give rise to viable embryos, with the exception of trisomies 13, 18 and especially 21, the most common aneuploid condition at birth. Down syndrome (DS) therefore offers a unique opportunity and is an ideal model for understanding the origin of these events. Individuals with DS are at increased risk of intellectual disability and congenital heart disease, and are predisposed to hematologic malignancies1. Trisomy 21 arises mostly from NDJ or mis-segregation of chromosome 21 (chr 21) during meiosis2,3 or, less commonly, from errors in postzygotic mitosis4. In more than 90% of cases, the extra copy of chr 21 originates from the maternal gamete5. This is because female gametogenesis is more prone than male gametogenesis to erroneous chromosomal segregation due to decades of arrest at prophase I6,7. Inaccurate chromosomal segregation during gametogenesis can be grossly classified into meiosis I (MI) or meiosis II (MII) errors using genotyping data8,9. MI errors are inferred when two homologous parental chromosomes are identified in the gamete, and MII errors are inferred when two sister chromatids are identified. These observations can result from several mechanisms of mis-segregation, such as NDJ of the homologous chromosomes, precocious separation of sister chromatids, or reverse segregation. The stage at which NDJ takes place is associated with the number and position of chromosomal crossovers. In maternal NDJ of chromosome 21, the absence of a crossover, or a crossover located distally from the centromere, is associated with MI errors.
In contrast, MII errors are associated with pericentromeric crossovers10.

The type or stage of the NDJ error in trisomy 21 is usually inferred from proband-parent trios by comparing heterozygosity patterns of genotypes in the pericentromeric region between the child with DS and the parents10,11,12,13. However, when the parents’ genetic data are unavailable, there is currently no generalizable method to infer the type of error, aside from recent work (refs. 14,15) that analyzes NDJs using low-coverage whole genome sequencing (WGS) data in preimplantation genetic testing.

In this work, we develop a method, Mis-segregation Error Identification through Hidden Markov Models (MeiHMM), to infer the type of NDJ and locate the chromosomal crossovers using the proband genotype alone, based on the frequencies of single nucleotide polymorphisms (SNPs) or haplotypes in the general population. Applying MeiHMM to WGS data from 152 DS cases demonstrates high accuracy compared to trio analysis. We further investigate the association between the type of NDJ error and disease characteristics in a cohort of DS-associated acute lymphoblastic leukemia (ALL). MeiHMM does not infer the parental origin of trisomy 21, but it is applicable to NDJ errors of both maternal and paternal origin. For simplicity, the following sections assume a maternal origin for trisomy 21, which is true in ~90% of DS cases5, though the framework is the same for NDJ of paternal origin.

Results

MeiHMM implements variant- and haplotype-based analyses to identify NDJ timing and crossover locations

The types of chr 21 NDJ errors can be distinguished by the number of unique haplotypes across the q arm of this chromosome as follows (Fig.
1A): (1) Errors of MI origin are characterized by three different haplotypes in the centromeric region, i.e., one allele with the paternal haplotype and the two remaining alleles with two different maternal haplotypes, due to the failed separation of the homologous chromosomes in the oocyte; (2) By contrast, errors of MII origin result when the duplicated sister chromatids fail to separate; such failure is inferred when only two haplotypes are represented in the centromeric region, i.e., one allele from the paternal haplotype and two identical alleles from the maternal haplotype; (3) Mitotic errors are inferred when there is complete duplication of maternal chr 21. (Alternatively, this could be inferred as an MII error with no recombination, a noted limitation of this gross classification10.) Applying a hidden Markov model, MeiHMM (1) segments chr 21 into blocks of two or three haplotypes, which inform the stage of NDJ, and (2) identifies the positions of chromosomal crossovers (recombination during meiosis) on the basis of the boundaries of these haplotype blocks.

Fig. 1: Overview of MeiHMM.
A The types of chr 21 nondisjunction (NDJ) and their patterns of haplotypes along chr 21. Here, trisomy 21 is assumed to be of maternal origin. B Summary of the workflow to categorize chr 21 heterozygous SNPs. C Type 1 informative SNPs occur predominantly in two-haplotype blocks and type 2 informative SNPs occur predominantly in three-haplotype blocks. D Diagram of the hidden states and the observations of MeiHMM. E Distribution of the allele frequencies (AF) of two-copy alleles of chr 21 SNPs in simulated SNPs, obtained by sampling two alleles (and duplicating one to make three) or three alleles from the gnomAD and AFLA references. F Distribution of the O/E ratio of hypothetical haplotypes. In panels E and F, yellow and blue indicate values simulated assuming two or three haplotypes, respectively. G The results of MeiHMM for an example case with a meiosis I error.
Red dashed lines indicate thresholds used to define type 1 (allele frequency