AE 04: Price vs. Mileage for Porsches

Announcements

Lab 01 due today at 11:59 PM
Email (see syllabus)
Introduce George, the R support TA
HW 01 released after class
Principles of data analysis (see vignettes)

Questions?

Review hypothesis test + confidence interval

Clone a repo + start a new project

See Lab 01 for instructions on cloning a repo and starting a new project in RStudio.

Once you have the new project, run the code below, filling in your GitHub username and email address, to configure git.

library(usethis)
use_git_config(user.name = "your github username", user.email = "your email")

Price vs. Mileage

library(tidyverse)
library(broom)

porsche <- read_csv("data/PorschePrice.csv")

In this AE, we will analyze the relationship between mileage and price for 30 Porsches for sale. More specifically, we want to use the mileage to understand variation in the price. The data set includes the following variables:

Price: Asking price for the car (in $1,000s)
Age: Age of the car (in years)
Mileage: Miles previously driven (in 1,000s)

Let’s start by getting a quick view of the data.

glimpse(porsche)

## Rows: 30
## Columns: 3
## $ Price   <dbl> 69.4, 56.9, 49.9, 47.4, 42.9, 36.9, 83.0, 72.9, 69.9, 67.9, 6…
## $ Age     <dbl> 3, 3, 2, 4, 4, 6, 0, 0, 2, 0, 2, 2, 4, 3, 10, 11, 4, 4, 10, 3…
## $ Mileage <dbl> 21.50, 43.00, 19.90, 36.00, 44.00, 49.80, 1.30, 0.67, 13.40, …

Linear model

price_model <- lm(Price ~ Mileage, data = porsche)
tidy(price_model)

Hypothesis test for $\beta_1$

We would like to test the following hypotheses:

\[H_0: \beta_1 = 0 \text{ vs. } H_a: \beta_1 \neq 0\]

State the null and alternative hypotheses in words.
What is the test statistic? What does it mean?
What is the p-value? What distribution was used to calculate the p-value? What does the p-value mean?
State your conclusion in the context of the data.
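As a rough guide to where the numbers in the questions above come from: for a test of $H_0: \beta_1 = 0$, the test statistic is the estimated slope divided by its standard error, and the p-value is a two-sided tail area from a $t$ distribution with $n - 2$ degrees of freedom. The sketch below recomputes both with base R. The helper name `slope_test` is hypothetical, and the demonstration uses R's built-in `cars` data as a stand-in, not the Porsche data.

```r
# Hypothetical helper (not part of the AE): recompute the slope's
# t statistic and two-sided p-value for a one-predictor lm() fit.
slope_test <- function(model) {
  coefs <- summary(model)$coefficients   # columns: Estimate, Std. Error, t value, Pr(>|t|)
  est <- coefs[2, "Estimate"]            # row 2 is the slope when there is one predictor
  se <- coefs[2, "Std. Error"]
  df_resid <- df.residual(model)         # n - 2 for simple linear regression
  t_stat <- (est - 0) / se               # distance from the null value 0, in SE units
  p_value <- 2 * pt(abs(t_stat), df_resid, lower.tail = FALSE)
  c(t_stat = t_stat, df = df_resid, p_value = p_value)
}

# Stand-in example using the built-in `cars` data (assumption: any
# simple linear regression works the same way).
demo_model <- lm(dist ~ speed, data = cars)
slope_test(demo_model)
```

Calling `slope_test(price_model)` should reproduce the `statistic` and `p.value` entries in the `Mileage` row of `tidy(price_model)`.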

Confidence interval for $\beta_1$

We would like to calculate and interpret a 95% confidence interval for $\beta_1$.

Recall that the confidence interval is

\[\hat{\beta}_1 \pm t^* SE(\hat{\beta}_1)\]

Use the code below to calculate the critical value, $t^*$.

df <- nrow(porsche) - 2
critical_val <- qt(0.975, df)
critical_val

## [1] 2.048407
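The argument 0.975 is used rather than 0.95 because a 95% interval leaves 2.5% in each tail, so $t^*$ is the 97.5th percentile of the $t$ distribution. A quick check of this, as a sketch with names (`n`, `df_resid`, `t_star`) chosen here for illustration:

```r
# Check that the middle 95% of a t distribution with n - 2 degrees of
# freedom lies between -t* and t*.
n <- 30                         # number of cars, from the glimpse() output
df_resid <- n - 2
t_star <- qt(0.975, df_resid)   # 97.5th percentile leaves 2.5% in the upper tail

# Area between -t* and t*
central_area <- pt(t_star, df_resid) - pt(-t_star, df_resid)
central_area
```

The central area comes out to 0.95, which matches the confidence level.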

Calculate the 95% confidence interval.

## calculate 95% confidence interval

Interpret the interval in the context of the data.
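One possible way to carry out the interval formula above is sketched here with base R. The helper name `slope_ci` is hypothetical, and the example again uses the built-in `cars` data as a stand-in for the Porsche data.

```r
# Hypothetical helper (not part of the AE): estimate +/- t* x SE for the
# slope of a one-predictor lm() fit.
slope_ci <- function(model, level = 0.95) {
  coefs <- summary(model)$coefficients
  est <- coefs[2, "Estimate"]                             # estimated slope
  se <- coefs[2, "Std. Error"]                            # its standard error
  t_star <- qt(1 - (1 - level) / 2, df.residual(model))   # critical value
  c(lower = est - t_star * se, upper = est + t_star * se)
}

# Stand-in example with the built-in `cars` data
demo_model <- lm(dist ~ speed, data = cars)
slope_ci(demo_model)

# Cross-check against R's built-in interval for the same coefficient
confint(demo_model, "speed", level = 0.95)
```

For this exercise, `slope_ci(price_model)` would give the interval for the `Mileage` slope. `tidy(price_model, conf.int = TRUE)` from broom reports the same bounds in its `conf.low` and `conf.high` columns.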

Knit your Rmd file to view the updated output. Commit your changes with an informative commit message, and push the updated files to GitHub.

The data used in this exercise are from Stat2: Building Models for a World of Data.


Inference

Your Name

2020-08-26