dada2 and phyloseq are two complementary R packages for the analysis of microbial community data developed in Susan Holmes's research group at Stanford. Today we will go through the tutorials associated with these packages.
phyloseq is installed through Bioconductor using the BiocManager package, as are dada2 and DECIPHER:
BiocManager::install("phyloseq")
BiocManager::install("dada2")
BiocManager::install("DECIPHER")

For the tutorials you will need to download and decompress (unzip) the example files. I put them in my data folder. The result should be a directory MiSeq_SOP, whose files you can list with list.files.
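If you prefer to unzip from within R rather than through your file manager, the step can be sketched as below. This is a minimal sketch in base R, not part of the dada2 tutorial: the helper name `unzip_into` and the archive name `data/miseqsopdata.zip` are assumptions made for illustration, so substitute whatever name your downloaded file has.

```r
# Hypothetical helper (not from dada2): unzip an archive into a target folder.
unzip_into <- function(zip_file, dest_dir = "data") {
  if (!file.exists(zip_file)) {
    stop("Archive not found: ", zip_file)
  }
  # Create the destination folder if it does not exist yet
  dir.create(dest_dir, recursive = TRUE, showWarnings = FALSE)
  # utils::unzip returns the paths of the extracted files (invisibly)
  extracted <- utils::unzip(zip_file, exdir = dest_dir)
  invisible(extracted)
}

# Example call; the archive name is an assumption, adjust it to your download:
# unzip_into("data/miseqsopdata.zip", "data")
```

Failing early on a missing archive makes a wrong path obvious before any of the tutorial code runs.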
path <- "data/MiSeq_SOP" # CHANGE ME to the directory containing the fastq files after unzipping.
list.files(path)

You will need to download the SILVA training set. The current file is silva_nr99_v138_train_set.fa.gz; put it in your data folder.
taxa <- assignTaxonomy(seqtab.nochim, "data/silva_nr99_v138_train_set.fa.gz", multithread=TRUE)

To use DECIPHER, get the most recent .RData file from DECIPHER downloads, then modify the tutorial code with your path and the new file name.
dna <- DNAStringSet(getSequences(seqtab.nochim)) # Create a DNAStringSet from the ASVs
load("data/SILVA_SSU_r138_2019.RData") # CHANGE TO THE PATH OF YOUR TRAINING SET

Create your own Rmd files for the dada2 tutorials, including the bonus phyloseq section.
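Before knitting, it can help to confirm that the inputs named on this page are where the code expects them. The following is a minimal sketch in base R. The helper `missing_paths` is hypothetical, and the paths assume the same data folder layout used above, so adjust them to your own setup.

```r
# Paths as used on this page; change them if your layout differs.
needed <- c(
  fastq_dir    = "data/MiSeq_SOP",
  silva_train  = "data/silva_nr99_v138_train_set.fa.gz",
  decipher_set = "data/SILVA_SSU_r138_2019.RData"
)

# Hypothetical helper: return the subset of paths that do not exist.
# file.exists() is vectorised and also returns TRUE for directories.
missing_paths <- function(paths) {
  paths[!file.exists(paths)]
}

absent <- missing_paths(needed)
if (length(absent) > 0) {
  message("Missing: ", paste(absent, collapse = ", "))
} else {
  message("All input files found.")
}
```

Putting a check like this in the first chunk of each Rmd file means a missing download shows up immediately, not partway through a long taxonomy assignment.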