Dense and accurate whole-chromosome haplotyping of individual genomes
10x Genomics Linked-Reads provide long-range information from short-read sequencers, enabling haplotyping, de novo assembly, and detection of complex structural variants. In a recent publication in Nature Communications, Porubsky et al. combined strand-specific single-cell sequencing (Strand-seq) with long-read (PacBio) or Linked-Read (10x Genomics) sequencing to introduce an integrative method for phasing individual human genomes. Combining the long-range but sparse phase information from Strand-seq with the dense phased segments from long reads or Linked-Reads allowed the researchers to build contiguous haplotypes that span whole chromosomes. The authors also noted that pairing Strand-seq with 10x Linked-Reads was the most cost-effective way to phase an individual genome at high accuracy.
The diploid nature of the human genome is neglected in many analyses done today, where a genome is perceived as a set of unphased variants with respect to a reference genome. This lack of haplotype-level analyses can be explained by a lack of methods that can produce dense and accurate chromosome-length haplotypes at reasonable costs. Here we introduce an integrative phasing strategy that combines global, but sparse haplotypes obtained from strand-specific single-cell sequencing (Strand-seq) with dense, yet local, haplotype information available through long-read or linked-read sequencing. We provide comprehensive guidance on the required sequencing depths and reliably assign more than 95% of alleles (NA12878) to their parental haplotypes using as few as 10 Strand-seq libraries in combination with 10-fold coverage PacBio data or, alternatively, 10X Genomics linked-read sequencing data. We conclude that the combination of Strand-seq with different technologies represents an attractive solution to chart the genetic variation of diploid genomes.
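To make the idea of combining sparse global haplotypes with dense local ones more concrete, here is a toy Python sketch. It is not the authors' pipeline, and every name in it (`orient_blocks`, `local_blocks`, `sparse_global`) is hypothetical. It assumes each locally phased block is given as a mapping from variant position to the allele (0 or 1) carried by one of its two haplotypes, and that the sparse chromosome-wide haplotype is given in the same form. Each block is then kept or flipped according to a simple majority vote at the positions it shares with the sparse haplotype.

```python
def orient_blocks(local_blocks, sparse_global):
    """Orient locally phased blocks against a sparse chromosome-wide haplotype.

    local_blocks:  list of dicts {position: allele}, where allele (0 or 1) is the
                   allele on one arbitrary haplotype of the block (assumed format).
    sparse_global: dict {position: allele} for global haplotype 1 (assumed format).
    Returns (merged, unplaced): merged maps position -> allele on global
    haplotype 1; unplaced lists indices of blocks that could not be oriented.
    """
    merged = {}
    unplaced = []
    for index, block in enumerate(local_blocks):
        # Positions phased both locally and in the sparse global haplotype.
        shared = [pos for pos in block if pos in sparse_global]
        agree = sum(block[pos] == sparse_global[pos] for pos in shared)
        disagree = len(shared) - agree
        if agree == disagree:
            # No overlap or a tie: there is no evidence to orient this block.
            unplaced.append(index)
            continue
        flip = disagree > agree
        for pos, allele in block.items():
            merged[pos] = 1 - allele if flip else allele
    return merged, unplaced


# Illustrative, made-up data: three local blocks and a sparse global haplotype.
blocks = [
    {100: 0, 150: 1, 200: 0},
    {5000: 1, 5100: 1, 5200: 0},
    {9000: 0, 9100: 1},
]
sparse = {100: 0, 200: 0, 5000: 0, 5200: 1}

haplotype_1, unplaced = orient_blocks(blocks, sparse)
print(haplotype_1)
print(unplaced)
```

In this made-up example the first block already matches the sparse haplotype, the second is flipped, and the third stays unplaced because it shares no positions with the sparse data. A real integration would also need to handle sequencing and phasing errors inside blocks, which this majority vote ignores.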
Read the full article in Nature Communications.