Structural Variant Analysis with Linked-Reads
Why are structural variants (SVs) important?
Structural variants, including copy number variations (CNVs), have been associated with diseases and disorders such as cancer, developmental delay/intellectual disability, and congenital anomalies [1,2]. Despite their clinical relevance, SVs remain one of the most difficult classes of variants to discern from genomic data. Specialized tests have been developed for known SVs [3], but there is evidence that SVs can go undetected because of the limitations of current sequencing technology [4]. Identifying clinically relevant SVs is key to furthering our understanding of diseases, disorders, and basic human genomics.
What are the limitations of current short read sequencing methods for SV analysis?
There are many types of SV, including deletions, insertions, duplications, inversions, and translocations, and they range in size from 50 bp to entire chromosomes. Given this variety of types and sizes, SV detection can be tricky for the following reasons:
- SVs tend to be clustered in duplicated and repetitive regions of the genome that are typically not accessible by short read sequencing
- The haploid representation of the human genome makes it necessary to average the signal across the two haplotypes, which dilutes the variant signal and makes it difficult to separate true variation from background noise
- The primary SV-calling methods, read depth (RD), split read (SR), read pair (RP), and re-assembly, are frequently combined to overcome the drawbacks and blind spots of each method, yet together they still achieve an overall sensitivity of only 30-84% and a positive predictive value (PPV) of only 27-85% across the genome [5]
- Copy-neutral variation remains difficult to detect because read depth alone is not informative
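Sensitivity and positive predictive value, the two metrics quoted in the list above, can be made concrete with a short calculation. The sketch below uses the standard definitions; the function names and the example counts are hypothetical and are not taken from the cited benchmark.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of real SVs that the caller found: TP / (TP + FN)."""
    return true_positives / (true_positives + false_negatives)


def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """Fraction of reported SV calls that are real: TP / (TP + FP)."""
    return true_positives / (true_positives + false_positives)


# Hypothetical counts, chosen only to illustrate the arithmetic:
# a truth set of 100 SVs, of which the caller recovers 30,
# while also reporting 10 calls that are not in the truth set.
tp, fn, fp = 30, 70, 10

sens = sensitivity(tp, fn)
ppv = positive_predictive_value(tp, fp)

print(f"sensitivity = {sens:.0%}, PPV = {ppv:.0%}")
```

A caller with low sensitivity misses real variants, while a caller with low PPV reports many false calls, so both numbers are needed to judge an SV-calling method.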
How can 10x Linked-Reads be used for more complete structural variant analysis?
The new application note “Structural Variant Analysis with Linked-Reads” discusses how Chromium™ Genome and Exome Linked-Read data, in combination with Long Ranger™ Software, provide important SV information that is otherwise unclear with standard short read sequencing.
- Linked-Reads provide important haplotype information that can be used for more complete structural variant analysis
- Haplotype phasing can improve the confidence of SV calls by removing the “noise” of unphased data
- Linked-Reads enable the mapping of reads to repetitive regions of the genome, where structural variant breakpoints often cluster
- Long Ranger™ utilizes the Linked-Read data to reliably detect structural variants in genome and exome data
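To illustrate why haplotype information can sharpen an SV signal, consider a toy read-depth model of a heterozygous deletion. This is a simplified sketch under stated assumptions, not a description of how Long Ranger™ calls variants: the function name, the 30x coverage, and the assumption that each haplotype contributes exactly half of the reads are all hypothetical.

```python
def expected_depths(total_coverage: float, hap1_copies: int, hap2_copies: int) -> dict:
    """Toy model: each haplotype contributes half the coverage per copy present."""
    per_haplotype = total_coverage / 2
    hap1 = per_haplotype * hap1_copies
    hap2 = per_haplotype * hap2_copies
    return {"unphased": hap1 + hap2, "hap1": hap1, "hap2": hap2}


# Assumed 30x coverage; a normal region carries one copy on each haplotype.
normal = expected_depths(30, hap1_copies=1, hap2_copies=1)

# Heterozygous deletion: the region is missing from haplotype 1 only.
het_deletion = expected_depths(30, hap1_copies=0, hap2_copies=1)

# Relative drop in depth compared with the normal region.
unphased_drop = 1 - het_deletion["unphased"] / normal["unphased"]
hap1_drop = 1 - het_deletion["hap1"] / normal["hap1"]

print(f"unphased depth drop:    {unphased_drop:.0%}")
print(f"haplotype 1 depth drop: {hap1_drop:.0%}")
```

In this idealized model the deletion removes only half of the unphased depth but all of the depth on the affected haplotype, so the phased signal stands out more clearly from coverage noise. Real data are noisier, and not every read can be assigned to a haplotype.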