Getting started

Clone the hw-02- repo and start a new project in RStudio. For more detailed instructions about getting started, see the Lab 01 instructions.

In the RStudio console, type the following lines of code, filling in your GitHub username and the email address associated with your GitHub account.

library(usethis)
use_git_config(user.name = "your github username",
               user.email = "your email")

Tips

Here are some tips as you complete HW 02:

  • Periodically knit your document and commit changes. The more often the better, for example once per task.
  • Be sure to include all relevant code and resulting output.
  • Push all your changes back to your GitHub repo. The Git pane in RStudio should be empty after you push. You can also check your assignment repo on github.com to make sure it has updated.
  • Once you have completed your work, submit your assignment on Gradescope. You are welcome to resubmit your work until the assignment deadline. We will grade the most recent PDF submitted on Gradescope.

Packages

We will use the following packages in this assignment:

library(tidyverse)
library(broom)
library(knitr)
# add other packages as needed

Questions

Part 1: Conceptual questions

We will go back to the dataset from HW 01 that contains the nutrition information for food items at Starbucks.

Starbucks often shows the number of calories on its in-store displays but does not show any other nutrition information. Therefore, we want to use the number of calories to understand variability in the amount of carbohydrates (in grams) in Starbucks food items.

Use the code below to load the dataset. It is originally from the openintro R package.

starbucks <- read_csv("data/starbucks.csv")

We fit a linear model to describe the relationship between the amount of carbohydrates (carb) and the number of calories (calories) in Starbucks food items. Use the code below to fit and display the model.

carb_model <- lm(carb ~ calories, data = starbucks)
tidy(carb_model) %>%
  kable(digits = 3)

|term        | estimate| std.error| statistic| p.value|
|:-----------|--------:|---------:|---------:|-------:|
|(Intercept) |    8.944|     4.746|     1.884|   0.063|
|calories    |    0.106|     0.013|     7.923|   0.000|

  1. According to the Starbucks menu, pumpkin bread has 410 calories.

    a. How many grams of carbohydrates do you predict to be in pumpkin bread? Show how you calculated this answer.

    b. Show code and output to obtain a 90% confidence interval for the mean grams of carbohydrates in Starbucks food items with 410 calories.

    c. Show code and output to obtain a 90% prediction interval for the grams of carbohydrates in food items with 410 calories.

    d. You decide to purchase a piece of pumpkin bread, and you want to predict the amount of carbs in this piece. Should you use the interval from part b or part c? Briefly explain your choice.

  2. Let’s take a look at the model conditions. First, create a scatterplot of the residuals vs. predicted values. Show the code used to create the plot. Be sure your plot has informative axis labels and an informative title. Hint: You can use the augment() function from the broom package to calculate the residuals and predicted values for each observation.

  3. Make a histogram and Normal quantile plot of the residuals. Show the code used to create each plot. Be sure your plots have informative axis labels and titles.

  4. Are the model conditions satisfied? Briefly explain your reasoning for each condition.

    • Linearity
    • Constant variance
    • Normality
    • Independence

  5. What percent of the variation in carbohydrates is explained by the calories in Starbucks food items? Show how you calculated this value from the ANOVA table.
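
The techniques named in the questions above (predicting from fitted coefficients, confidence and prediction intervals, augment(), and computing R-squared from an ANOVA table) can be sketched on a different dataset. The block below is a minimal illustration, not a solution: it assumes the built-in mtcars data and a hypothetical model of mpg on wt, and every object name in it (example_model, new_car, and so on) is made up for this sketch.

```r
# Illustrative sketch on the built-in mtcars data, NOT the Starbucks data.
# All object names here are hypothetical and chosen only for this example.
library(broom)    # augment()
library(ggplot2)  # plotting (also loaded by tidyverse)

# Simple linear regression: fuel efficiency (mpg) on weight (wt, in 1000 lbs)
example_model <- lm(mpg ~ wt, data = mtcars)

# Hypothetical new observation: a car with wt = 3
new_car <- data.frame(wt = 3)

# Prediction computed by hand from the fitted coefficients
b0 <- coef(example_model)[["(Intercept)"]]
b1 <- coef(example_model)[["wt"]]
manual_pred <- b0 + b1 * 3

# 90% confidence interval for the mean response at wt = 3
ci_mean <- predict(example_model, newdata = new_car,
                   interval = "confidence", level = 0.90)

# 90% prediction interval for an individual response at wt = 3
pi_indiv <- predict(example_model, newdata = new_car,
                    interval = "prediction", level = 0.90)

# augment() returns the data with .fitted and .resid columns added
example_aug <- augment(example_model)

# Residuals vs. predicted values; stored in an object, not yet displayed
resid_plot <- ggplot(example_aug, aes(x = .fitted, y = .resid)) +
  geom_point() +
  geom_hline(yintercept = 0, linetype = "dashed") +
  labs(x = "Predicted mpg", y = "Residual",
       title = "Residuals vs. predicted values (mtcars example)")

# R-squared from the ANOVA table: model sum of squares / total sum of squares
example_anova <- anova(example_model)
ss_model <- example_anova$`Sum Sq`[1]
ss_total <- sum(example_anova$`Sum Sq`)
r_squared <- ss_model / ss_total
```

Running print(resid_plot) displays the plot, and printing ci_mean, pi_indiv, or r_squared shows the corresponding values. The same pattern should carry over to carb_model with a new-data frame that has a calories column.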

Part 2: Wrapping up SLR

Write your responses to the following questions:

  • What is one question you still have about simple linear regression or modeling in general?

  • What is one thing you learned about simple linear regression?

You must write a substantive response to receive credit. Responses equivalent to “I don’t have any questions” are not considered substantive and will not receive credit.

Submission

Knit, commit, and push your final changes to GitHub, then submit your finalized PDF on Gradescope. See the Lab 01 instructions for more details on submitting your work on Gradescope.

Grading

|Component                                             | Points|
|:-----------------------------------------------------|------:|
|Total                                                 |     50|
|Part 1: Conceptual questions                          |     34|
|Part 2: Wrapping up SLR                               |      6|
|Document submitted as PDF with clear question headers |      3|
|Name and date updated in YAML                         |      2|
|Narrative written in complete sentences               |      3|
|At least 3 informative commit messages                |      2|