See Lab 01 for instructions on cloning a repo and starting a new project in RStudio.
Once you have the new project, run the code below to configure git, filling in your GitHub username and email address.
library(usethis)
use_git_config(user.name = "your GitHub username", user.email = "your email")
library(tidyverse)
library(broom)
library(knitr)
In this AE, we will look at the price of textbooks and how it varies based on the number of pages. The data contains the price and number of pages for a random sample of 30 college textbooks from the Cal Poly-San Luis Obispo bookstore in Fall 2006.
textbooks <- read_csv("data/textbooks.csv")
We will use the following variables:
- Pages: Number of pages in the textbook
- Price: Price of the textbook in US dollars
textbook_model <- lm(Price ~ Pages, data = textbooks)
tidy(textbook_model) %>%
  kable(digits = 3)
term | estimate | std.error | statistic | p.value |
---|---|---|---|---|
(Intercept) | -3.422 | 10.464 | -0.327 | 0.746 |
Pages | 0.147 | 0.019 | 7.653 | 0.000 |
\[\hat{Price} = -3.422 + 0.147 \times Pages\]
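As a quick illustration of how to read the fitted equation, consider a hypothetical 500-page textbook. The page count is chosen here only for illustration; it is not taken from the data and is assumed to fall within the range of the sample. Plugging it into the equation with the rounded coefficients from the table above gives:
\[\hat{Price} = -3.422 + 0.147 \times 500 = 70.078\]
That is, the model predicts a price of about 70 US dollars for such a book. The result would differ slightly if the unrounded coefficients were used.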
We can calculate the ANOVA table in R using the following code:
anova(textbook_model) %>%
  kable(digits = 3)
Source | Df | Sum Sq | Mean Sq | F value | Pr(>F) |
---|---|---|---|---|---|
Pages | 1 | 51877.03 | 51877.030 | 58.573 | 0 |
Residuals | 28 | 24799.19 | 885.685 | NA | NA |
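To see how the columns of this table relate, note that each Mean Sq entry is the corresponding Sum Sq divided by its Df. The symbol \(MS_{Residuals}\) below is notation introduced here for illustration, not taken from the original handout. As a check against the Residuals row:
\[MS_{Residuals} = \frac{24799.19}{28} \approx 885.685\]
This matches the Mean Sq value reported in the table, up to rounding.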
Use the ANOVA table to complete the following exercises.
1. Calculate the total sum of squares (\(SS_{total}\)).
2. Calculate the total degrees of freedom.
3. What is \(\hat{\sigma}_\epsilon\), the regression standard error?
4. Calculate \(R^2\). Interpret this value.
Note: You can get model summaries in R using the glance function. Use the code below to get \(\hat{\sigma}_\epsilon\) and \(R^2\), then check your responses to exercises 3 and 4.
#Remove eval = F from the code chunk header
glance(textbook_model)$sigma
glance(textbook_model)$r.squared
5. State the null and alternative hypotheses we can test using the ANOVA table.
6. What is the test statistic? How is it calculated?
7. What distribution was used to calculate the p-value?
8. State the conclusion from the test in the context of the data.
Knit your Rmd file to view the updated output. Commit your changes with an informative commit message, and push the updated files to GitHub.
The data used in this exercise is from Stat2: Building Models for a World of Data.