10x Supernova Assembler
Supernova is a software package for de novo assembly from Chromium Linked-Reads that are made from a single whole-genome library from an individual DNA source. A key feature of Supernova is that it creates diploid assemblies, thus separately representing maternal and paternal chromosomes over very long distances. Almost all other methods instead merge homologous chromosomes into single incorrect 'consensus' sequences. Supernova is the only practical method for creating diploid assemblies of large genomes.
The Supernova software package includes two processing pipelines:
- supernova mkfastq wraps Illumina's bcl2fastq to correctly demultiplex Chromium-prepared sequencing samples and to convert barcode and read data to FASTQ files.
- supernova run takes FASTQ files containing barcoded reads from supernova mkfastq and builds a graph-based assembly. The approach is to first build an assembly using read kmers (K = 48), then resolve this assembly using read pairs (to K = 200), then use barcodes to effectively resolve this assembly to K ≈ 100,000. The final step pulls apart homologous chromosomes into phase blocks, which are typically multi-megabase for human genomes.
and for post-processing:
- supernova mkoutput takes Supernova's graph-based assemblies and produces several styles of FASTA suitable for downstream processing and analysis.
Re: 10x de novo Assembly Tools
I assume supernova runs quality trimming of the seqeucne data and adapter trimming as part of its processing pipeline.
Is this assumption correct? ( and thus any time trying to improve assemblies by playing around with this wasted?)