For this lab you will create a .zip file called lab10.zip which contains the following:
lab10.Rmd - An RMarkdown file.
lab10.html - The results of knitting the RMarkdown file.
lab10.Rproj - An RStudio project file.
Submit your lab (the .zip file) to the corresponding assignment on Canvas. You have unlimited attempts before the deadline. Your final submission before the deadline will be graded.
Grading of this lab will largely be based on the ability of the grader to access and run your code. That is, the grader should be able to unzip your lab10.zip file, open lab10.Rproj, then finally open and knit lab10.Rmd without any modification or errors. If they are able to do so, and the resulting lab10.html contains the graphics described below, you will receive at least nine of the ten possible points for the lab.
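Before zipping, it may help to confirm that all three files are present. The following is a minimal sketch in base R, not part of the lab materials: the helper name `missing_lab_files` is hypothetical, and it assumes the files sit at the top level of the project folder.

```r
# Hypothetical helper (not part of the lab materials): report which of the
# required files are absent from a project folder.
missing_lab_files <- function(dir = ".") {
  required <- c("lab10.Rmd", "lab10.html", "lab10.Rproj")
  required[!file.exists(file.path(dir, required))]
}

# Run from the lab10 project folder; character(0) means nothing is missing.
missing_lab_files()
```

If the result lists any file names, create or knit those files before building lab10.zip.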
The following video describes how to create all of the files described above. It will also walk through each of the exercises and describe at least one valid solution.
Before creating lab10.Rmd you should first create an RStudio Project named lab10. (The video above will demonstrate this.) This will also create a folder named lab10. Create lab10.Rmd and place it inside this folder.
Add the following code to your .Rmd file which will load the tidyverse. Throughout this lab you may need functions from
Additionally, add the following code to your .Rmd file which will load the data needed for this lab:
mlb_pitches_2021 = as_tibble(readRDS(url("https://stat385.org/data/mlb_pitches_2021.rds")))
This data originates from Baseball Savant. In particular, it comes from the Statcast data that MLB collects. Several transformations have been applied to the data as originally accessed. Ultimately, this data contains information on the pitch type, velocity, and spin rate of every MLB pitch thrown in 2021.
The following video explains the various “pitch types” used in baseball:
The following table explains the abbreviations used by Statcast:
|Pitch Type|Pitch Name|
Create a bar plot that shows the frequency of each pitch type in 2021. Order the bars according to frequency.
# Drop pitches with no recorded type, then order the bars by frequency.
mlb_pitches_2021 %>%
  filter(pitch_type != "") %>%
  ggplot(aes(x = fct_infreq(pitch_type), fill = pitch_type)) +
  geom_bar(show.legend = FALSE) +
  labs(
    title = "Frequency of MLB Pitch Types",
    subtitle = "2021 Season",
    caption = "Data Source: Baseball Savant"
  ) +
  xlab("Pitch Type") +
  ylab("Count") +
  theme_bw()
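To see how the bars end up ordered by frequency, here is a small sketch of `fct_infreq()` from forcats (loaded with the tidyverse) applied to a made-up vector. The vector `toy_pitches` is invented for illustration and is not taken from the lab data.

```r
library(forcats)

# Invented example values, not real Statcast data.
toy_pitches <- c("SL", "FF", "FF", "CH", "FF", "SL")

# fct_infreq() reorders factor levels from most to least frequent,
# which is the order ggplot2 uses when drawing a discrete x axis.
ordered_pitches <- fct_infreq(toy_pitches)
levels(ordered_pitches)
```

Here the levels come back as FF, SL, CH, since FF appears three times, SL twice, and CH once.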
Can you guess the type of pitch just by watching it?
To get a sense of how this is more easily done by looking at velocity and spin rates, create a plot of spin rate versus velocity for Carlos Rodon. Use color and shapes to indicate the pitch types.
# Keep Carlos Rodon's typed pitches; na.omit() drops rows with a missing
# value in any column.
mlb_pitches_2021 %>%
  filter(pitch_type != "") %>%
  filter(name == "Carlos Rodon") %>%
  na.omit() %>%
  ggplot(aes(
    x = release_speed,
    y = release_spin_rate,
    color = pitch_type,
    shape = pitch_type
  )) +
  geom_point() +
  labs(
    title = "Spin Rate versus Velocity",
    subtitle = "Carlos Rodon, 2021",
    caption = "Data Source: Baseball Savant",
    color = "Pitch Type",
    shape = "Pitch Type"
  ) +
  xlab("Velocity") +
  ylab("Spin Rate") +
  scale_color_brewer(palette = "Set1") +
  theme_bw()
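As a numeric companion to the scatter plot, one could also summarize average velocity and spin rate per pitch type. This is only a sketch and is not required by the lab: the helper `summarize_pitches` and the `toy` data are hypothetical, while the column names are the ones used in the plotting code above.

```r
library(dplyr)

# Hypothetical helper: average velocity and spin rate per pitch type for one
# pitcher. Missing values are ignored within each average.
summarize_pitches <- function(pitches, pitcher) {
  pitches %>%
    filter(name == pitcher, pitch_type != "") %>%
    group_by(pitch_type) %>%
    summarise(
      n = n(),
      mean_speed = mean(release_speed, na.rm = TRUE),
      mean_spin = mean(release_spin_rate, na.rm = TRUE),
      .groups = "drop"
    )
}

# Invented rows for illustration only; with the lab data you would call
# summarize_pitches(mlb_pitches_2021, "Carlos Rodon").
toy <- tibble(
  name = c("A", "A", "A", "B"),
  pitch_type = c("FF", "FF", "SL", "FF"),
  release_speed = c(95, 97, 85, 90),
  release_spin_rate = c(2300, 2400, NA, 2200)
)
summarize_pitches(toy, "A")
```

Pitch types that separate clearly in the plot should also show distinct averages in such a table.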