R code
library(tidyverse)
library(nycflights13)
In Lab 5: Data Transformation and Visualization with COVID-19 reporting data we used the pivot_longer function, but did not talk about it in depth. Spreadsheets are often designed for data entry, which is not the optimal format for data analysis and graphing. Today we will walk through some of the key aspects of Data Tidying. The data sets used in this chapter are loaded with the tidyverse package.
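As a quick reminder of what pivot_longer does, here is a minimal sketch using table4a, one of the small example data sets that ships with tidyr (and so is available once the tidyverse is loaded). The column names "year" and "cases" are choices made for this illustration, and the native pipe assumes R 4.1 or later.

```r
library(tidyverse)

# table4a has one row per country and one column per year (`1999`, `2000`),
# which is convenient for data entry but not tidy.
table4a

# Gather the two year columns into a single "year" column,
# with the cell values moved into a "cases" column.
tidy4a <- table4a |>
  pivot_longer(
    cols = c(`1999`, `2000`),
    names_to = "year",
    values_to = "cases"
  )

tidy4a
```

Each country now appears once per year, so the table has one observation per row and can be passed directly to ggplot2 or dplyr verbs.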
A common part of an analysis workflow is combining data from multiple sources. To do this, a common element is needed to link the data tables. This is the key that is often described as an important element of relational databases. In genome analysis the key is often a GenBank or other database ID. Chapter 18 Joins in R for Data Science discusses the key types primary, compound, foreign and surrogate. It then goes over how to use keys in mutating and filtering joins.
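The ideas of keys, mutating joins and filtering joins can be sketched with the nycflights13 tables loaded at the top of this page. This is only an illustration, not the chapter's own code: the object names dup_keys, flights2 and dest_airports are made up for this example, and the joins use the by = argument, which should work across dplyr versions.

```r
library(tidyverse)
library(nycflights13)

# Check that carrier is a primary key of airlines:
# no carrier code should appear more than once.
dup_keys <- airlines |>
  count(carrier) |>
  filter(n > 1)
dup_keys

# Mutating join: carrier in flights is a foreign key that points to airlines,
# so left_join() adds the airline name to every flight.
flights2 <- flights |>
  select(year, month, day, carrier, tailnum) |>
  left_join(airlines, by = "carrier")
flights2

# Filtering join: keep only the airports that appear as a destination in flights.
# No columns from flights are added.
dest_airports <- airports |>
  semi_join(flights, by = c("faa" = "dest"))
dest_airports
```

A mutating join adds columns from the second table, while a filtering join only changes which rows of the first table are kept.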
As an additional reference, the Tidyverse Cookbook has many practical solutions to common problems, including different ways to join tables.
Today you will go through Chapter 5 Data Tidying in R for Data Science. As we did previously, put the examples in your own Quarto Markdown file. You do not need to do the exercises in this chapter.
Then go through Sections 18.1 to 18.4 of Chapter 18 Joins in R for Data Science, putting the examples and exercises in your Quarto file.
After you Render the qmd file to an html file, export the file to your computer and upload it to Canvas.
Earlier we used solutions found in R for Data Science (2e) - Solutions to Exercises, but it does not yet have solutions for Ch 19 on table joins. They can be found in Solutions Manual: R for Data Science (2e).