See Lab 01 for instructions on cloning a repo and starting a new project in RStudio.
Once you have the new project, run the code below, filling in your GitHub username and email address, to configure Git.
library(usethis)
use_git_config(user.name = "your github username", user.email = "your email")
library(tidyverse)
library(broom)
porsche <- read_csv("data/PorschePrice.csv")
In this AE, we will analyze the relationship between mileage and price for 30 Porsches for sale. More specifically, we want to use the mileage to understand variation in the price. The data set includes the following variables:
Price: Asking price for the car (in $1,000’s)
Age: Age of the car (in years)
Mileage: Previous miles driven (in 1,000’s)
Let’s start by getting a quick view of the data.
glimpse(porsche)
## Rows: 30
## Columns: 3
## $ Price <dbl> 69.4, 56.9, 49.9, 47.4, 42.9, 36.9, 83.0, 72.9, 69.9, 67.9, 6…
## $ Age <dbl> 3, 3, 2, 4, 4, 6, 0, 0, 2, 0, 2, 2, 4, 3, 10, 11, 4, 4, 10, 3…
## $ Mileage <dbl> 21.50, 43.00, 19.90, 36.00, 44.00, 49.80, 1.30, 0.67, 13.40, …
price_model <- lm(Price ~ Mileage, data = porsche)
tidy(price_model)
We would like to test the following hypotheses:
\[H_0: \beta_1 = 0 \text{ vs. } H_a: \beta_1 \neq 0\]
State the null and alternative hypotheses in words.
What is the test statistic? What does it mean?
What is the p-value? What distribution was used to calculate the p-value? What does the p-value mean?
State your conclusion in the context of the data.
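As a rough sketch of how the test statistic and p-value in the `tidy()` output relate to each other, the snippet below assumes the usual two-sided t-test for a slope with \(n - 2\) degrees of freedom. The helper names `t_statistic()` and `two_sided_p()` are hypothetical, introduced only for illustration, and the numbers passed in are placeholders rather than the Porsche estimates.

```r
# Hypothetical helpers (not part of the original exercise).

# Test statistic under H0: beta_1 = null_value.
# It measures how many standard errors the estimate is from the null value.
t_statistic <- function(estimate, std_error, null_value = 0) {
  (estimate - null_value) / std_error
}

# Two-sided p-value: probability, under a t distribution with `df` degrees of
# freedom, of a statistic at least as extreme as `t_stat` in either direction.
two_sided_p <- function(t_stat, df) {
  2 * pt(abs(t_stat), df, lower.tail = FALSE)
}

# Placeholder inputs (NOT values from price_model):
t_val <- t_statistic(estimate = -1, std_error = 0.5)
p_val <- two_sided_p(t_val, df = 28)
```

For the fitted model, `estimate` and `std_error` would come from the `Mileage` row of `tidy(price_model)`, and `df` would be `nrow(porsche) - 2`.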
We would like to calculate and interpret a 95% confidence interval for \(\beta_1\).
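Recall that the confidence interval for the slope is
\[\hat{\beta}_1 \pm t^* SE(\hat{\beta}_1)\]
where \(t^*\) is the critical value from a \(t\) distribution with \(n - 2\) degrees of freedom.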
df <- nrow(porsche) - 2
critical_val <- qt(0.975, df)
critical_val
## [1] 2.048407
## calculate 95% confidence interval
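One possible way to carry out this calculation is sketched below. It assumes the interval has the form estimate ± critical value × standard error; the helper `slope_ci()` is a hypothetical name introduced here, not part of the exercise, and the inputs in the example call are placeholders rather than the Porsche estimates.

```r
# Hypothetical helper (not part of the original exercise): t-based confidence
# interval for a regression coefficient.
slope_ci <- function(estimate, std_error, df, level = 0.95) {
  # Critical value t* leaving (1 - level) / 2 probability in each tail
  t_star <- qt(1 - (1 - level) / 2, df)
  c(lower = estimate - t_star * std_error,
    upper = estimate + t_star * std_error)
}

# Placeholder inputs (NOT values from price_model); with df = 28 the
# half-width equals the critical value times the standard error.
slope_ci(estimate = 0, std_error = 1, df = 28)
```

With the fitted model, the estimate and standard error would be taken from the `Mileage` row of `tidy(price_model)`. As a check, `confint(price_model)` or `tidy(price_model, conf.int = TRUE)` should report the same interval.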
Knit your Rmd file to view the updated output. Commit your changes with an informative commit message, and push the updated files to GitHub.
The data used in this exercise is from Stat2: Building Models for a World of Data.