Hang out with the TAs from STA 210! This is a casual conversation and a fun opportunity to meet the members of the STA 210 teaching team. The only rule is these can’t turn into office hours!
Tea with a TA counts as a statistics experience.
Cody Coombs, Thu, Nov 5, 4p - 5p
If you’re eligible, VOTE! Find out more information: https://vote.duke.edu/
Electronic Undergraduate Statistics Research Conference (eUSR) Nov 6, 11:30a - 4:40p
Q 4.2 thrown out (3 free points, +1 bonus if correct)
After fitting a logistic regression, you compute the raw residual, \(y_i - \hat{\pi}_i\), for each observation. 20% of the raw residuals are positive, and 80% are negative. Because there are far more raw residuals below zero than above zero, this logistic regression does not fit the data well.
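One way to probe this statement: because \(0 < \hat{\pi}_i < 1\), the raw residual \(y_i - \hat{\pi}_i\) is positive exactly when \(y_i = 1\) and negative exactly when \(y_i = 0\). The split of signs therefore mirrors the share of successes in the data. The sketch below uses simulated, hypothetical data (not course data) in which the fitted probability equals the true probability, so the "model" is as well calibrated as it can be.

```python
import random

random.seed(0)

# Hypothetical setup: every observation has the same true success probability,
# and the fitted probability equals that true value (a perfectly calibrated fit).
true_pi = 0.2
n = 10_000
y = [1 if random.random() < true_pi else 0 for _ in range(n)]
pi_hat = [true_pi] * n

# Raw residuals y_i - pi_hat_i.
raw_resid = [yi - pi for yi, pi in zip(y, pi_hat)]

# A raw residual is positive exactly when y_i = 1, because 0 < pi_hat_i < 1.
share_positive = sum(r > 0 for r in raw_resid) / n
share_ones = sum(y) / n

print(f"share of positive residuals: {share_positive:.3f}")
print(f"share of y = 1:              {share_ones:.3f}")
```

In this simulation the two shares are identical by construction, so the sign split alone carries no information beyond how common the outcome is.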
Thu, Nov 5 - Sun, Nov 8, not timed.
Basic premise: You will be given a case study scenario and a few question prompts. You will apply what you’ve learned throughout the semester to the given scenario.
You will submit your answer in narrative form.
More details will be emailed Thursday morning.
Today’s data come from an experiment by the Educational Testing Service to test the effectiveness of the children’s program Sesame Street. Sesame Street is an educational program designed to teach young children basic educational skills such as counting and the alphabet.
As part of the experiment, children were assigned to one of two groups: those who were encouraged to watch the program and those who were not.
The show is only effective if children watch it, so we want to understand what effect the encouragement had on how frequently children watched the program.
Response:
viewcat
Predictors:
age: child’s age in months
prenumb: score on numbers pretest (0 to 54)
prelet: score on letters pretest (0 to 58)
viewenc: 1 = encouraged to watch, 0 = not encouraged
site
The model below uses viewenc, prenumbCent, and site to predict how frequently a child viewed Sesame Street (viewcat).
y.level | term | estimate | std.error | statistic | p.value |
---|---|---|---|---|---|
2 | (Intercept) | -0.204 | 0.484 | -0.421 | 0.674 |
2 | site2 | -0.069 | 0.774 | -0.088 | 0.929 |
2 | site3 | -1.069 | 0.640 | -1.670 | 0.095 |
2 | site4 | -1.902 | 0.640 | -2.971 | 0.003 |
2 | site5 | -1.773 | 0.830 | -2.136 | 0.033 |
2 | prenumbCent | 0.023 | 0.024 | 0.967 | 0.334 |
2 | viewenc1 | 2.652 | 0.493 | 5.378 | 0.000 |
3 | (Intercept) | 0.050 | 0.467 | 0.108 | 0.914 |
3 | site2 | 0.222 | 0.739 | 0.300 | 0.764 |
3 | site3 | -0.880 | 0.629 | -1.399 | 0.162 |
3 | site4 | -2.465 | 0.681 | -3.621 | 0.000 |
3 | site5 | -3.674 | 1.235 | -2.974 | 0.003 |
3 | prenumbCent | 0.051 | 0.024 | 2.184 | 0.029 |
3 | viewenc1 | 2.467 | 0.494 | 4.997 | 0.000 |
4 | (Intercept) | -0.273 | 0.499 | -0.547 | 0.584 |
4 | site2 | 0.919 | 0.741 | 1.241 | 0.215 |
4 | site3 | -0.645 | 0.663 | -0.973 | 0.330 |
4 | site4 | -2.417 | 0.753 | -3.211 | 0.001 |
4 | site5 | -1.644 | 0.869 | -1.893 | 0.058 |
4 | prenumbCent | 0.067 | 0.023 | 2.831 | 0.005 |
4 | viewenc1 | 2.291 | 0.501 | 4.575 | 0.000 |
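As a rough illustration of how to read the table, the sketch below turns the estimates into predicted probabilities for a hypothetical child. It assumes viewcat = 1 is the baseline category (inferred from the table, which lists only levels 2 to 4), that site 1 is the reference site, and that prenumbCent = 0. The function name and dictionary layout are illustrative choices, not part of the original analysis.

```python
import math

# Estimates copied from the table above (assumed baseline category: viewcat = 1).
coefs = {
    2: {"intercept": -0.204, "viewenc1": 2.652},
    3: {"intercept": 0.050, "viewenc1": 2.467},
    4: {"intercept": -0.273, "viewenc1": 2.291},
}

def predicted_probs(viewenc):
    """Probabilities for a hypothetical child at site 1 with prenumbCent = 0."""
    # Log-odds of each category versus the baseline; the baseline's own log-odds is 0.
    log_odds = {1: 0.0}
    for level, b in coefs.items():
        log_odds[level] = b["intercept"] + b["viewenc1"] * viewenc
    denom = sum(math.exp(v) for v in log_odds.values())
    return {level: math.exp(v) / denom for level, v in log_odds.items()}

# Multiplicative change in the odds of category 2 versus category 1
# for encouraged children, holding site and prenumbCent fixed.
odds_ratio_2_vs_1 = math.exp(coefs[2]["viewenc1"])
print(f"odds ratio (2 vs 1, encouraged): {odds_ratio_2_vs_1:.2f}")

for viewenc in (0, 1):
    probs = predicted_probs(viewenc)
    summary = ", ".join(f"P(viewcat={k}) = {p:.3f}" for k, p in sorted(probs.items()))
    print(f"viewenc = {viewenc}: {summary}")
```

Under these assumptions, encouragement shifts probability away from the baseline category and toward the three higher viewing categories.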
Let’s get the predicted view category using the augment function.
Make a table to view the actual vs. predicted view categories. How well did the model perform?
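The idea behind the table can be sketched as follows: assign each child the category with the largest predicted probability, then cross-tabulate actual against predicted categories. The probabilities and observed categories below are made up for illustration and are not taken from the Sesame Street data.

```python
from collections import Counter

# Hypothetical predicted probabilities for five children (columns: viewcat 1 to 4).
pred_probs = [
    [0.60, 0.20, 0.10, 0.10],
    [0.10, 0.50, 0.30, 0.10],
    [0.05, 0.30, 0.45, 0.20],
    [0.10, 0.20, 0.30, 0.40],
    [0.25, 0.40, 0.20, 0.15],
]
actual = [1, 2, 2, 4, 3]  # hypothetical observed categories

# Predicted category = the category with the largest predicted probability.
predicted = [probs.index(max(probs)) + 1 for probs in pred_probs]

# Count each (actual, predicted) pair.
table = Counter(zip(actual, predicted))

levels = [1, 2, 3, 4]
print("actual\\pred " + " ".join(f"{p:>3}" for p in levels))
for a in levels:
    print(f"{a:>11} " + " ".join(f"{table[(a, p)]:>3}" for p in levels))

# Share of children whose predicted category matches the actual one.
accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)
print(f"accuracy: {accuracy:.2f}")
```

Counts on the diagonal are correct classifications; off-diagonal counts show which categories the model confuses.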
The data come from the 2015 Family Income and Expenditure Survey conducted by the Philippine Statistics Authority.
The variables in the data are
age: the age of the head of household
total: the number of people in the household other than the head
location: where the house is located (Central Luzon, Davao Region, Ilocos Region, Metro Manila, or Visayas)
numLT5: the number in the household under 5 years of age
roof: the type of roof in the household (either Predominantly Light/Salvaged Material, or Predominantly Strong Material, where stronger material can sometimes be used as a proxy for greater wealth)
We fit the following model:
term | estimate | std.error | statistic | p.value |
---|---|---|---|---|
(Intercept) | 1.436 | 0.017 | 82.339 | 0 |
ageCent | -0.004 | 0.001 | -3.584 | 0 |
I(ageCent^2) | -0.001 | 0.000 | -10.938 | 0 |
Interpret the coefficient of ageCent^2 in the context of the data.
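A small numerical sketch can help with the interpretation. It assumes the model is a Poisson regression with a log link for total (the setting of the textbook chapter the data come from; the model form is not stated here), and it uses the rounded estimates from the table, so the results are approximate. The function name is illustrative.

```python
import math

# Rounded estimates copied from the table above.
b0, b1, b2 = 1.436, -0.004, -0.001

def expected_total(age_cent):
    """Expected number of household members other than the head, assuming a log link."""
    return math.exp(b0 + b1 * age_cent + b2 * age_cent ** 2)

for age_cent in (-20, -10, 0, 10, 20):
    print(f"ageCent = {age_cent:>3}: expected total = {expected_total(age_cent):.2f}")

# On the log scale the curve peaks where the derivative b1 + 2 * b2 * ageCent equals zero.
peak_age_cent = -b1 / (2 * b2)
print(f"log-mean peaks near ageCent = {peak_age_cent:.1f}")
```

Because the squared term is negative, the fitted log-mean is a downward-opening parabola in ageCent: expected household size rises toward the peak and falls away from it on either side.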
Conduct a test to assess whether location is a useful predictor of the number of people in the household after accounting for age of the head of the household.
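One common approach for this kind of question is a drop-in-deviance (likelihood ratio) test comparing the model with and without location. The sketch below assumes that approach and a Poisson model. The two deviance values are hypothetical placeholders, not results from the actual fits, and the helper function is an illustrative stand-in for a statistical library's chi-square tail function.

```python
import math

def chisq_sf_even_df(x, df):
    """Upper-tail chi-square probability; this closed form is valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))

# Hypothetical residual deviances, NOT values from the actual model fits.
deviance_reduced = 1000.0  # model with ageCent and ageCent^2 only
deviance_full = 987.0      # model that also includes location

# location has 5 levels, so it adds 5 - 1 = 4 indicator terms to the model.
df = 5 - 1
test_stat = deviance_reduced - deviance_full
p_value = chisq_sf_even_df(test_stat, df)

print(f"test statistic = {test_stat:.2f}, df = {df}, p-value = {p_value:.4f}")
```

A small p-value would suggest that at least one location coefficient differs from zero after accounting for age; with the real data, the two deviances come from the fitted reduced and full models.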
The dataset for Part 2 is from Chapter 4 of Beyond Multiple Linear Regression.