
Simple Linear Regression

Partitioning variability

Prof. Maria Tackett

Topics

  • Use analysis of variance to partition variability in the response variable

  • Define and calculate $R^2$

  • Use ANOVA to test the hypothesis

$$H_0: \beta_1 = 0 \text{ vs } H_a: \beta_1 \neq 0$$

Cats data

The data set contains the heart weight (Hwt) and body weight (Bwt) for 144 domestic cats.


Distribution of response

Mean     Std. Dev.   IQR
10.631   2.435       3.175

The model

$$\widehat{\text{Hwt}} = -0.357 + 4.034 \times \text{Bwt}$$

How much of the variation in cats' heart weights can be explained by knowing their body weights?


ANOVA

We will use Analysis of Variance (ANOVA) to partition the variation in the response variable Y.



Response variable, Y


Total variation

$$SS_{Total} = \sum_{i=1}^{n}(y_i - \bar{y})^2 = (n - 1)s_y^2$$
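As a quick check of the second form of this formula, the total sum of squares can be approximated from the summary statistics for heart weight reported earlier (n = 144, standard deviation 2.435). This is only a sketch: the variable names are illustrative, and the small gap from the 847.625 reported later is presumably due to the standard deviation being rounded to three decimals.

```python
# Summary values taken from the slides; s_y is rounded, so the result is approximate.
n = 144        # number of cats
s_y = 2.435    # sample standard deviation of heart weight (Hwt)

# SSTotal = (n - 1) * s_y^2
ss_total_approx = (n - 1) * s_y ** 2
print(round(ss_total_approx, 2))  # close to, but not exactly, the reported 847.625
```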

Explained variation (Model)

$$SS_{Model} = \sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2$$

Unexplained variation (Residuals)

$$SS_{Error} = \sum_{i=1}^{n}(y_i - \hat{y}_i)^2$$

$$\sum_{i=1}^{n}(y_i - \bar{y})^2 = \sum_{i=1}^{n}(\hat{y}_i - \bar{y})^2 + \sum_{i=1}^{n}(y_i - \hat{y}_i)^2$$
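The partition can be checked numerically. The sketch below fits a least-squares line to a small made-up data set (hypothetical values, not the cats data) and confirms that the total sum of squares equals the model plus error sums of squares, and that it also equals (n - 1) times the sample variance. All variable names are illustrative.

```python
import statistics

# Hypothetical toy data, used only to illustrate the identity.
x = [2.0, 2.3, 2.7, 3.0, 3.4, 3.6]
y = [7.1, 9.5, 10.2, 12.4, 13.0, 14.8]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope and intercept for simple linear regression.
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

# The three sums of squares from the slides.
ss_total = sum((yi - y_bar) ** 2 for yi in y)
ss_model = sum((yh - y_bar) ** 2 for yh in y_hat)
ss_error = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))

print(ss_total, ss_model + ss_error)             # the two values agree up to rounding
print(ss_total, (n - 1) * statistics.variance(y))  # SSTotal = (n - 1) * s_y^2
```

The identity relies on the fitted values coming from a least-squares fit with an intercept; for an arbitrary line the cross term does not vanish.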

$R^2$

The coefficient of determination, $R^2$, is the proportion of variation in the response, Y, that is explained by the regression model.


$$R^2 = \frac{SS_{Model}}{SS_{Total}} = 1 - \frac{SS_{Error}}{SS_{Total}}$$

$R^2$ for our model

$$SS_{Model} = 548.092$$

$$SS_{Error} = 299.533$$

$$SS_{Total} = 847.625$$

$$R^2 = \frac{548.092}{847.625} = 0.647$$

About 64.7% of the variation in the heart weight of cats can be explained by variation in body weight.
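The same value follows from either form of the definition. A minimal sketch using the three sums of squares reported for this model (variable names are illustrative):

```python
# Sums of squares as reported on the slides.
ss_model = 548.092
ss_error = 299.533
ss_total = 847.625

# Two equivalent ways to compute the coefficient of determination.
r2_from_model = ss_model / ss_total
r2_from_error = 1 - ss_error / ss_total

print(round(r2_from_model, 3), round(r2_from_error, 3))  # 0.647 0.647
```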


ANOVA table

Source      Df    Sum Sq    Mean Sq   F Stat    Pr(> F)
Model       1     548.092   548.092   259.835   0
Residuals   142   299.533   2.109
Total       143   847.625

Sum of squares

$$SS_{Total} = 847.625 = 548.092 + 299.533$$

$$SS_{Model} = 548.092$$

$$SS_{Error} = 299.533$$

ANOVA Test

$$H_0: \beta_1 = 0 \qquad H_a: \beta_1 \neq 0$$

Degrees of freedom

$$df_{Total} = 144 - 1 = 143$$

$$df_{Model} = 1$$

$$df_{Error} = 143 - 1 = 142$$

Mean squares

$$MS_{Model} = \frac{548.092}{1} = 548.092$$

$$MS_{Error} = \frac{299.533}{142} = 2.109$$

F test statistic: ratio of explained to unexplained variability

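$$F = \frac{MS_{Model}}{MS_{Error}} = \frac{548.092}{2.109} = 259.835$$

The reported 259.835 is consistent with using the unrounded $MS_{Error}$; dividing by the rounded value 2.109 gives about 259.88.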
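The degrees of freedom, mean squares, and F statistic can all be reproduced from the sums of squares and the sample size. A minimal sketch using only values from these slides (variable names are illustrative):

```python
# Inputs reported on the slides.
n = 144
ss_model = 548.092
ss_error = 299.533

# Degrees of freedom for simple linear regression (one predictor).
df_model = 1
df_total = n - 1
df_error = df_total - df_model

# Mean squares are sums of squares divided by their degrees of freedom.
ms_model = ss_model / df_model
ms_error = ss_error / df_error

# F statistic: ratio of explained to unexplained variability.
f_stat = ms_model / ms_error

print(df_error, round(ms_error, 3), round(f_stat, 3))
```

Because `ms_error` is kept unrounded here, the result matches the F statistic in the ANOVA table to the reported precision.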

F distribution


ANOVA test

P-value: the probability of observing a test statistic at least as extreme as the observed F statistic, given that the population slope $\beta_1$ is 0

The p-value is calculated using an F distribution with 1 and $n - 2$ degrees of freedom.

Calculating p-value
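One way to sketch this calculation without a statistics library is to use the relationship between the upper tail of the F distribution and the regularized incomplete beta function, evaluated here with a standard continued-fraction expansion. This is an illustrative stand-in, not the method used to produce the table: the function names `reg_inc_beta` and `f_sf` are hypothetical, and in practice a library routine would normally be used instead.

```python
import math

def _beta_cf(a, b, x, max_iter=300, eps=1e-12):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log1p(-x))
    front = math.exp(log_front)
    # Use the symmetry relation where the continued fraction converges faster.
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b

def f_sf(f_stat, df1, df2):
    """Upper-tail probability P(F > f_stat) for an F(df1, df2) distribution."""
    x = df2 / (df2 + df1 * f_stat)
    return reg_inc_beta(df2 / 2.0, df1 / 2.0, x)

# Values from the ANOVA table: F = 259.835 with 1 and n - 2 = 142 degrees of freedom.
p_value = f_sf(259.835, 1, 144 - 2)
print(p_value < 0.001)  # True
```

The computed p-value is many orders of magnitude below 0.001, which is why it is displayed as 0 in the ANOVA table.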


ANOVA

The p-value is very small (it rounds to 0 in the ANOVA table), so we reject $H_0$.

The data provide strong evidence that the population slope, $\beta_1$, is different from 0.

The data provide sufficient evidence that there is a linear relationship between a cat's heart weight and body weight.


Recap

  • Used analysis of variance to partition variability in the response variable

  • Defined and calculated $R^2$

  • Used ANOVA to test the hypothesis $H_0: \beta_1 = 0$ vs $H_a: \beta_1 \neq 0$
