Supernova results on 1.2GB genome

Posted By: jgarbe, on Feb 2, 2017 at 9:05 AM

Here are the results from my first 10X de novo assembly:

I used Supernova 1.1.3 to assemble a fish genome estimated to be 1.2GB in size. A total of 161 million 150bp paired-end read pairs (323 million reads total) were generated from a sample prepared using Chromium V1 chemistry. Supernova successfully completed after running for 18 days, with a peak memory usage of 184GB (out of 256GB available), and using two cores for almost the entire run (peaking at 28 cores during the first day). 

 

--------------------------------------------------------------------------------
SUMMARY
--------------------------------------------------------------------------------
INPUT
- 323.74 M = READS = number of reads; ideal 800-1200 for human
- 139.00 b = MEAN READ LEN = mean read length after trimming; ideal 140
- 35.95 x = EFFECTIVE COV = effective read coverage; ideal ~42 for nominal 56x cov
- 84.31 % = READ TWO Q30 = fraction of Q30 bases in read 2; ideal 75-85
- 0.31 kb = MEDIAN INSERT = median insert size; ideal 0.35-0.40
- 88.08 % = PROPER PAIRS = fraction of proper read pairs; ideal >=75
- 53.62 kb = MOLECULE LEN = weighted mean molecule size; ideal 50-100
- 3.27 kb = HETDIST = mean distance between heterozygous SNPs
- 7.34 % = UNBAR = fraction of reads that are not barcoded
- 316.00 = BARCODE N50 = N50 reads per barcode
- 8.36 % = DUPS = fraction of reads that are duplicates
- 49.09 % = PHASED = nonduplicate and phased reads; ideal 45-50
--------------------------------------------------------------------------------
OUTPUT
- 2.21 K = LONG SCAFFOLDS = number of scaffolds >= 10 kb
- 3.62 kb = EDGE N50 = N50 edge size
- 18.53 kb = CONTIG N50 = N50 contig size
- 0.19 Mb = PHASEBLOCK N50 = N50 phase block size
- 1.36 Mb = SCAFFOLD N50 = N50 scaffold size
- 1.04 Mb = SCAFFOLD N60 = N60 scaffold size
- 0.65 Gb = ASSEMBLY SIZE = assembly size (only scaffolds >= 10 kb)
--------------------------------------------------------------------------------
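
For readers less familiar with the N50 and N60 figures in the report above, here is a minimal sketch of how an Nx statistic is commonly computed from a list of sequence lengths. The function name `nx` and the toy lengths are made up for illustration, and Supernova's exact definitions (for example, how gaps or short scaffolds are treated) may differ.

```python
def nx(lengths, x=50):
    """Return the Nx length: the largest L such that sequences of
    length >= L together cover at least x% of the total length.
    Illustrative helper, not part of Supernova."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= total * x:
            return length


# Toy scaffold lengths (hypothetical, not from the assembly above).
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
n50 = nx(toy_lengths, 50)
n60 = nx(toy_lengths, 60)
```

Because N60 requires covering a larger share of the total, it can never exceed N50, which matches the 1.36 Mb and 1.04 Mb scaffold values reported above.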

 

It would be great to see what kind of results others are getting with the 10X de novo platform.

  • Non-human

8 Replies

Re: Supernova results on 1.2GB genome

Posted By: shauna-10x, on Feb 3, 2017 at 11:24 AM

Thanks for sharing your assembly metrics with the community!  It would be great to see what other members are getting for different species and genome sizes.

Re: Supernova results on 1.2GB genome

Posted By: Lutz, on Feb 6, 2017 at 11:16 AM

Hi jgarbe,

 

Thanks for the data. We will soon run a fish genome of similar size. For comparison purposes, which sequencer did you use, and could you provide us with a Bioanalyzer trace of the library?

The genome coverage is a bit on the low side for your data so far.
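
As a rough check on the coverage comment, here is a back-of-the-envelope sketch using the read count and mean read length from the posted report and the 1.2 Gb genome size estimate. The function names are made up for illustration, and this plain "bases divided by genome size" figure is not the same as Supernova's EFFECTIVE COV metric, whose exact calculation is not given in this thread.

```python
import math


def raw_coverage(n_reads, mean_read_len_bp, genome_size_bp):
    """Total sequenced bases divided by genome size (simple estimate)."""
    return n_reads * mean_read_len_bp / genome_size_bp


def reads_needed(target_cov, mean_read_len_bp, genome_size_bp):
    """Number of reads needed to reach a target raw coverage."""
    return math.ceil(target_cov * genome_size_bp / mean_read_len_bp)


# Values taken from the posted summary; genome size is the poster's estimate.
n_reads = 323.74e6
mean_read_len = 139.0
genome_size = 1.2e9

cov = raw_coverage(n_reads, mean_read_len, genome_size)

# Assumption: use the "nominal 56x" figure from the report as the target.
target_reads = reads_needed(56, mean_read_len, genome_size)

print(f"raw coverage: {cov:.1f}x")
print(f"reads for 56x: {target_reads / 1e6:.0f} M")
```

With these inputs the raw estimate comes out near 37.5x, a little above the 35.95x effective coverage in the report, and reaching a nominal 56x would take on the order of 480 million reads at the same read length.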

 

Compared to our plant data for similar genome sizes, the contig sizes are smaller for the fish, but the scaffold sizes are significantly larger.

Re: Supernova results on 1.2GB genome

Posted By: jaffe, on Feb 8, 2017 at 8:01 PM

Hi,

 

Here are a few observations:

 

1. Internally, when we assemble human genomes, we observe run times of ~2 days. Many other genomes behave the same way; however, we have two examples of more repetitive genomes that run much longer. We identified one problem that causes this, which will be fixed in a future release. There are other problems that are not yet solved, which we hope to address in the relatively near future.

 

2. Some customers have experienced these same problems. A few also have problems that are associated with limitations of their computational infrastructure. These are not things that we can fix, but we plan to add better diagnostics to identify such conditions.

 

3. As noted, your coverage is on the low side.

 

4. Your molecule length is on the low side.  Very long DNA can be gotten from human blood.  I don't know enough about the biology to comment on whether one should be able to get the same length DNA from fish blood.  But it seems plausible.  Generally, longer molecules will produce better assemblies.

 

5. Even if your coverage and molecule length are 'perfect' I can't guarantee that you'll get a great assembly.  But these are steps that might help.  We have some algorithmic changes in progress that we think might help too, but it would be pure speculation at this point to propose that they will solve your problem.


Best regards,

 

David Jaffe

Re: Supernova results on 1.2GB genome

Posted By: jaffe, on Feb 9, 2017 at 4:47 AM

And now a few questions that I have!

 

1. What was your loading mass?  For a genome of that size, we recommend half the regular loading mass (so 0.625 ng).

 

2. Does the existing assembly meet your needs?  

 

3. What would you plan to do next with the assembly?

 

Thank you.

 

David