Optical genome mapping enables accurate testing of large repeat expansions
Dataset belonging to the study entitled: "Optical genome mapping enables accurate testing of large repeat expansions". This dataset contains, for each of the 85 samples, the local guided assembly generated in the study. The .zip files can be imported into the Bionano Access software. Please refer to the readme.txt for more information.

The study has the following abstract: Short tandem repeats (STRs) are common variations in human genomes that frequently expand or contract and can cause genetic disorders, mainly when expanded. Traditional diagnostic methods for identifying these expansions, such as repeat-primed PCR and Southern blotting, are often labor-intensive and locus-specific, and cannot precisely measure long repeat expansions. Sequencing-based methods, though capable of genome-wide detection, are limited by inaccuracy (short-read technologies) and high associated costs (long-read technologies). This study evaluated optical genome mapping (OGM) as an efficient, accurate approach for measuring STR lengths and assessing somatic stability in 85 samples with known pathogenic repeat expansions in DMPK, CNBP, and RFC1, causing myotonic dystrophy types 1 and 2 and cerebellar ataxia, neuropathy and vestibular areflexia syndrome (CANVAS), respectively. Three workflows were applied: manual de novo assembly, local guided assembly (local-GA), and a molecule distance script. The latter two were developed as part of this study to assess repeat sizes and somatic repeat stability. OGM successfully identified 84 of 85 (98.8%) pathogenic expansions, distinguishing between wild-type and expanded alleles or between two expanded alleles in recessive cases, with greater accuracy than standard of care (SOC) for long repeats and no apparent upper size limit. Notably, OGM detected somatic instability in a subset of DMPK, CNBP, and RFC1 samples. These findings suggest OGM could advance diagnostic accuracy for large repeat expansions, providing a more comprehensive genome-wide assay for repeat expansion disorders by measuring exact repeat lengths and somatic instability across multiple loci simultaneously.