Introduction to Microbiome Analysis

dada2 and phyloseq are two complementary R packages for the analysis of microbial community data developed in Susan Holmesโ€™ research group at Stanford. Today we will go through the tutorials associated with these packages.

Notes

Installing phyloseq

phyloseq is installed through Bioconductor using the BiocManager

BiocManager::install("phyloseq")
BiocManager::install("dada2")
BiocManager::install("DECIPHER")

For the tutorials you will need to download and decompress (unzip) the files. I put this in my data folder. The results should be a directory MiSeq_SOP with the following files.

path <- "data/MiSeq_SOP" # CHANGE ME to the directory containing the fastq files after unzipping.
list.files(path)

You will need to download the silva training set. The current file is silva_nr99_v138_train_set.fa.gz and put it in your data folder

taxa <- assignTaxonomy(seqtab.nochim, "data/silva_nr99_v138_train_set.fa.gz", multithread=TRUE)

To use DECIPHER get the most recent .RData file from DECIPHER downloads then modify the tutorial code with your path and the new file names.

dna <- DNAStringSet(getSequences(seqtab.nochim)) # Create a DNAStringSet from the ASVs
load("data/SILVA_SSU_r138_2019.RData") # CHANGE TO THE PATH OF YOUR TRAINING SET

Exercises

Create your own Rmd files for the dada2 tutorials including the bonus phyloseq section.