SPAdes - commandline

This tutorial follows on from “PacBio assembly with commandline tools”.

Short-read assembly: a comparison

So far, we have assembled the long PacBio reads into one contig (the chromosome) and found an additional plasmid in the Illumina short reads.

If we had only Illumina reads, we could still assemble them with the tool SPAdes.

You can try this here, or try it later on your own data.

Get data

We will use the same Illumina data as we used above:

  • illumina_R1.fastq.gz: the Illumina forward reads
  • illumina_R2.fastq.gz: the Illumina reverse reads

This is from Sample 25747.
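Before assembling, it can be worth confirming that the two files hold the same number of reads, since paired-end assembly expects matching forward and reverse records. The following is a minimal sketch, assuming standard four-line FASTQ records; the helper name count_reads is illustrative and not part of any tool.

```shell
# Count the reads in a gzipped FASTQ file.
# Assumes each record is exactly four lines (standard FASTQ layout).
count_reads() {
    gzip -cd < "$1" | awk 'END { print NR / 4 }'
}

# Only compare the tutorial files if they are present in this directory.
if [ -s illumina_R1.fastq.gz ] && [ -s illumina_R2.fastq.gz ]; then
    r1=$(count_reads illumina_R1.fastq.gz)
    r2=$(count_reads illumina_R2.fastq.gz)
    echo "R1: $r1 reads, R2: $r2 reads"
    [ "$r1" = "$r2" ] || echo "Warning: read counts differ" >&2
fi
```

If the two counts differ, the files are probably truncated or mismatched and should be downloaded again before running the assembly.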

Assemble

Run SPAdes:

spades.py -1 illumina_R1.fastq.gz -2 illumina_R2.fastq.gz --careful --cov-cutoff auto -o spades_assembly_all_illumina
  • -1 is the input file of forward reads
  • -2 is the input file of reverse reads
  • --careful tries to reduce the number of mismatches and short indels
  • --cov-cutoff auto tells SPAdes to compute the coverage cutoff automatically (rather than using the default setting, “off”)
  • -o is the output directory
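The same command can be wrapped in a small guard so that it only starts when both read files are present and spades.py is on the PATH. This is a sketch rather than a required step; the helper name check_inputs is hypothetical, and the file names and options are the ones used above.

```shell
# Return 0 only if every file named on the command line exists and is non-empty.
check_inputs() {
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "Missing or empty input: $f" >&2
            return 1
        fi
    done
    return 0
}

# Run the assembly only when the inputs and the spades.py executable are available.
if check_inputs illumina_R1.fastq.gz illumina_R2.fastq.gz \
        && command -v spades.py >/dev/null 2>&1; then
    spades.py -1 illumina_R1.fastq.gz -2 illumina_R2.fastq.gz \
        --careful --cov-cutoff auto -o spades_assembly_all_illumina
fi
```

Checking up front avoids a failed run caused by a mistyped file name.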

Results

Move into the output directory and look at the contigs:

cd spades_assembly_all_illumina
infoseq contigs.fasta
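If infoseq (part of EMBOSS) is not installed, the contig names and lengths can also be listed with standard tools. The sketch below assumes it is run from inside the output directory; the function name contig_lengths is illustrative only.

```shell
# Print each contig's name and length (tab-separated), longest first.
contig_lengths() {
    awk '
        /^>/ {
            # A new header: report the previous contig, then start counting again.
            if (name != "") print name "\t" len
            name = substr($1, 2)
            len = 0
            next
        }
        { len += length($0) }   # sequence line: add its length
        END { if (name != "") print name "\t" len }
    ' "$1" | sort -k2,2nr
}

# Only run on the assembly if contigs.fasta is in the current directory.
if [ -f contigs.fasta ]; then
    contig_lengths contigs.fasta
fi
```

SPAdes contig names have the form NODE_1_length_..._cov_..., so the length in each name can be compared with the computed value as a quick check.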

Next

Run “Prokka” to annotate the contigs.