Running Cellranger pipeline with a published dataset.

Posted By: nsag, on Apr 14, 2018 at 4:31 PM


Hi all,


From the "Specifying Input FASTQ Files for 10x Pipelines" page:


"The cellranger pipeline requires FASTQ files as input, which will typically come from running cellranger mkfastq, a 10x-aware convenience wrapper for bcl2fastq. However, it is possible to use FASTQ files from other sources, such as Illumina's bcl2fastq, a published dataset, or our bamtofastq."


I'm trying to run the Cellranger pipeline on FASTQ files obtained by downloading SRA sequences published on NCBI and converting them to FASTQ with the sratoolkit. However, Cellranger does not recognize the resulting files because they are not split into separate index and read fastq.gz files (I1, R1, R2, etc.) and are instead single .fastq files.


  1. Is there support for running published datasets not sequenced by 10x?
  2. What is the recommended workflow for analyzing published sequences in SRA format?
  3. Is there a way to properly format a single fastq input file into three separate I1, R1, R2 files for use with cellranger count?
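
To make question 3 concrete, here is a minimal sketch of the kind of split I have in mind. It assumes each record in the single .fastq file is the index and the two reads concatenated in a fixed order with fixed lengths; the layout values (8, 26, 98) and the helper names are placeholders of mine, not something taken from the dataset or from the 10x documentation. The output file name follows the bcl2fastq-style pattern (Sample_S1_L001_R1_001.fastq.gz) as I understand it.

```python
# Hypothetical segment layout: (read type, length in bases), in the order the
# segments are ASSUMED to be concatenated inside each record of the single file.
# These numbers are placeholders and must be checked against the real data.
LAYOUT = (("I1", 8), ("R1", 26), ("R2", 98))


def split_record(seq, qual, layout=LAYOUT):
    """Split one concatenated sequence/quality pair into named segments.

    Returns a dict mapping read type to a (sequence, quality) tuple.
    Raises ValueError if the record does not match the assumed layout.
    """
    expected = sum(length for _, length in layout)
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    if len(seq) != expected:
        raise ValueError(f"expected {expected} bases, got {len(seq)}")

    parts = {}
    start = 0
    for name, length in layout:
        end = start + length
        parts[name] = (seq[start:end], qual[start:end])
        start = end
    return parts


def bcl2fastq_style_name(sample, read_type, lane=1):
    """Build a file name in the Sample_S1_L00X_<read type>_001.fastq.gz pattern."""
    return f"{sample}_S1_L{lane:03d}_{read_type}_001.fastq.gz"
```

Each segment would then be written to the file named by bcl2fastq_style_name for its read type, keeping the record order identical across the three files. If the records have variable lengths (for example after trimming), this fixed-length approach would not apply as written.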

Thank you so much, and any response is greatly appreciated.