Interpretation of parameters in multiple regression, collinearity, and influence.

November 29, 2010

Today we focused on some issues that arise when there is more than one explanatory variable in a regression model (i.e., multiple regression). These included the interpretation of the partial slopes/effects, the importance in some cases of controlling for the effects of other variables, and collinearity: what it is and why it can be a problem. I also introduced the concepts of influence and leverage, in particular that the influence of an observation depends on both its leverage and its residual.
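To make the link between influence, leverage, and residual concrete, here is a minimal sketch. It assumes a single explanatory variable for brevity (in multiple regression the leverages come from the diagonal of the hat matrix instead), and it uses Cook's distance as the influence measure, which is one common choice and not necessarily the one used in lecture. The data and the function name are made up for illustration.

```python
def leverage_and_cooks(x, y):
    """Leverage, residual, and Cook's distance for each observation
    in a simple linear regression fitted by least squares."""
    n = len(x)
    p = 2  # number of beta parameters: intercept and slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in residuals) / (n - p)
    # Leverage grows with the distance of x_i from the mean of x.
    leverages = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    # Cook's distance combines the squared residual with the leverage.
    cooks = [
        e ** 2 * h / (p * mse * (1 - h) ** 2)
        for e, h in zip(residuals, leverages)
    ]
    return leverages, residuals, cooks


# Hypothetical data: the last observation sits far from the other x values.
x = [1, 2, 3, 4, 10]
y = [1.2, 1.9, 3.2, 3.8, 6.0]
leverages, residuals, cooks = leverage_and_cooks(x, y)
for h, e, d in zip(leverages, residuals, cooks):
    print(f"leverage={h:.3f}  residual={e:+.3f}  cooks_d={d:.3f}")
```

An observation with high leverage but a residual near zero has little influence, and so does one with a large residual but low leverage; Cook's distance is large only when both are sizable.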

 


Simultaneous tests — continued, special cases, and interactions in regression.

November 19, 2010

We continued looking at simultaneous tests and special cases thereof, and considered an example where we had both a quantitative explanatory variable and a categorical explanatory variable represented by indicator variables. We also considered how to model an interaction between the quantitative and categorical explanatory variables.
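As an illustration, suppose the categorical variable has just two levels, so a single indicator variable d (0 or 1) represents it, and x is the quantitative variable. The symbols here are chosen for this sketch and may not match the notation used in class. The interaction is modeled by including the product of x and d:

```latex
\begin{align*}
y_i &= \beta_0 + \beta_1 x_i + \beta_2 d_i + \beta_3 x_i d_i + \varepsilon_i \\
\text{if } d_i = 0:\quad E(y_i) &= \beta_0 + \beta_1 x_i \\
\text{if } d_i = 1:\quad E(y_i) &= (\beta_0 + \beta_2) + (\beta_1 + \beta_3) x_i
\end{align*}
```

So \beta_2 is the difference in intercepts between the two groups and \beta_3 is the difference in slopes. Testing \beta_3 = 0 is a test of whether the two lines are parallel, i.e., whether there is an interaction.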


Indicator variables and simultaneous tests.

November 17, 2010

Today we focused on the use of indicator variables so that we can make the same kinds of inferences with regression that we do with the analysis of variance. I also introduced the idea of the simultaneous test in regression, where we test the null hypothesis that a set of beta parameters are all equal to zero.
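A minimal sketch of the simultaneous test in the simplest case may help. It assumes a single categorical explanatory variable with k levels, represented by k - 1 indicator variables. In that case the least squares fitted values of the full model are the group means, and the reduced model (all indicator betas set to zero) fits the grand mean, so no matrix algebra is needed. The data and function name are made up for illustration.

```python
def simultaneous_f(groups):
    """F statistic for H0: all k - 1 indicator-variable betas equal zero.

    groups is a list of lists, one list of responses per level.
    """
    all_y = [y for g in groups for y in g]
    n = len(all_y)
    k = len(groups)
    grand_mean = sum(all_y) / n
    # Reduced model (intercept only): every fitted value is the grand mean.
    sse_reduced = sum((y - grand_mean) ** 2 for y in all_y)
    # Full model (intercept plus k - 1 indicators): fitted values are group means.
    sse_full = 0.0
    for g in groups:
        g_mean = sum(g) / len(g)
        sse_full += sum((y - g_mean) ** 2 for y in g)
    q = k - 1          # number of betas being tested
    df_error = n - k   # error degrees of freedom for the full model
    f_stat = ((sse_reduced - sse_full) / q) / (sse_full / df_error)
    return f_stat, q, df_error


# Hypothetical responses for three levels of a factor.
groups = [[4, 5, 6], [7, 8, 9], [10, 11, 12]]
f_stat, q, df_error = simultaneous_f(groups)
print(f_stat, q, df_error)
```

The resulting F statistic is the same one obtained from the one-way analysis of variance table for these data, which is the sense in which regression with indicator variables reproduces ANOVA inferences.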


Polynomial regression, and an introduction to indicator variables.

November 15, 2010

Today I expanded on the idea of using more than one explanatory variable. This is useful of course if there is more than one explanatory variable, but it is also useful for extending regression to handle nonlinear relationships through polynomials, and to handle categorical explanatory variables through the use of indicator/dummy variables.
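For instance, a quadratic relationship between the response and a single variable x can be written as a regression with two explanatory variables, x and its square (notation assumed for this sketch):

```latex
y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \varepsilon_i
```

The model is nonlinear in x but still linear in the beta parameters, which is why the usual least squares machinery continues to apply.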

Homework: 11.4, 11.7, 11.15, 11.22, 11.23, 11.25, 11.26, 11.33, 11.34, 12.5, 12.10.


Inference in simple linear regression — continued.

November 10, 2010

I continued the introduction of the simple linear regression model. I considered the confidence interval for the mean of the response variable, and the prediction interval for the value of the response variable, both for a given value of the explanatory variable. We also started considering multiple linear regression with an arbitrary number of explanatory variables.
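The difference between the two intervals shows up in their standard errors. The sketch below assumes the usual simple linear regression formulas; the data and function names are hypothetical, and the t critical value is passed in by hand (from a t table) to keep the example free of extra libraries.

```python
import math


def interval_standard_errors(x, y, x0):
    """Fitted value at x0 plus the standard errors used for the
    confidence interval (mean response) and the prediction interval."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    mse = sse / (n - 2)  # estimate of the variance parameter
    fitted = intercept + slope * x0
    dist = 1 / n + (x0 - x_bar) ** 2 / sxx
    se_mean = math.sqrt(mse * dist)        # for the mean of the response
    se_pred = math.sqrt(mse * (1 + dist))  # for a new value of the response
    return fitted, se_mean, se_pred, mse


def interval(center, se, t_crit):
    """Two-sided interval: center plus or minus t_crit standard errors."""
    return (center - t_crit * se, center + t_crit * se)


# Hypothetical data with n = 5, so the error degrees of freedom are 3.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
T_CRIT = 3.182  # 0.975 quantile of the t distribution with 3 df (t table)

fitted, se_mean, se_pred, mse = interval_standard_errors(x, y, x0=3)
ci = interval(fitted, se_mean, T_CRIT)
pi = interval(fitted, se_pred, T_CRIT)
print("confidence interval:", ci)
print("prediction interval:", pi)
```

The prediction interval is always wider because it accounts for the variability of an individual response around its mean in addition to the uncertainty in the estimated mean. Both intervals widen as x0 moves away from the mean of x.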


Inference with simple linear regression.

November 8, 2010

Today we focused on the simple linear regression model, its assumptions, and inferences. For inferences we looked at the estimation of the intercept and slope parameters via least squares, and the estimation of the variance parameter. We also considered the confidence interval and test statistic for the slope parameter. I also explained the ANOVA table in the context of regression. Finally I demonstrated the use of PROC REG and PROC GLM to facilitate these inferences.
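The calculations behind these inferences can be sketched by hand. This is a plain Python illustration with made-up data and a hypothetical function name, not a replacement for the PROC REG or PROC GLM output; it computes the least squares estimates, the estimate of the variance parameter, the t statistic for the slope, and the sums of squares from the ANOVA table.

```python
import math


def fit_simple_regression(x, y):
    """Least squares fit of y = b0 + b1 * x with basic inferential quantities."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    fitted = [intercept + slope * xi for xi in x]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # error SS
    ssr = sum((fi - y_bar) ** 2 for fi in fitted)           # regression SS
    sst = sum((yi - y_bar) ** 2 for yi in y)                # total SS
    mse = sse / (n - 2)               # estimate of the variance parameter
    se_slope = math.sqrt(mse / sxx)   # standard error of the slope
    return {
        "intercept": intercept,
        "slope": slope,
        "mse": mse,
        "se_slope": se_slope,
        "t_slope": slope / se_slope,  # test statistic for H0: slope = 0
        "f_stat": ssr / mse,          # F statistic from the ANOVA table
        "sse": sse,
        "ssr": ssr,
        "sst": sst,
    }


# Hypothetical data.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
result = fit_simple_regression(x, y)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

With one explanatory variable, the F statistic in the ANOVA table is the square of the t statistic for the slope, so the two tests agree.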

Note: Solutions for the most recent set of homework/practice problems are available on Blackboard.


Compound symmetry, sphericity, and an introduction to regression.

November 5, 2010

The analysis of designs with repeated measures assumes a condition known as sphericity. A stronger condition that implies sphericity is compound symmetry. The Greenhouse-Geisser and Huynh-Feldt indices can be used to measure the degree to which the sphericity assumption is violated. They can also be used to adjust p-values to compensate for violations of sphericity. In SAS this occurs automatically when one uses the repeated statement to analyze “short form” repeated measures data.
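To see why compound symmetry implies sphericity, write Y_j for the j-th repeated measurement on a subject (notation assumed for this sketch). Compound symmetry says all measurements share one variance and every pair shares one correlation; sphericity only requires that all pairwise differences have the same variance:

```latex
\begin{align*}
\text{Compound symmetry:}\quad
  & \operatorname{Var}(Y_j) = \sigma^2, \qquad
    \operatorname{Cov}(Y_j, Y_k) = \rho\sigma^2 \quad (j \neq k) \\
\text{Sphericity:}\quad
  & \operatorname{Var}(Y_j - Y_k) \text{ is the same for all } j \neq k \\
\text{Under compound symmetry:}\quad
  & \operatorname{Var}(Y_j - Y_k)
    = \sigma^2 + \sigma^2 - 2\rho\sigma^2
    = 2\sigma^2(1 - \rho)
\end{align*}
```

The last line does not depend on j or k, so sphericity holds. The converse is not true, which is why compound symmetry is the stronger condition.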

This concludes our discussion of design and the analysis of variance. We’ll now turn to regression for the rest of the course, first as a separate topic, but eventually as a general statistical framework that includes the models used in ANOVA.

Read: Chapter 11 (skip 11.5 and 11.6), Chapter 12 (skip 12.8, 12.9).


Split-plot and repeated measures designs.

November 3, 2010

We finished covering split-plot designs and looked briefly at repeated measures designs. A repeated measures design is similar in structure and analysis to a randomized block design or a split-plot design, but we don’t usually have randomization to the levels of the factor corresponding to the repeated measurements, since that factor usually involves something that can’t be randomized, like time or space.


Nested and split-plot designs.

November 1, 2010

Today we looked at the analysis of designs with nested factors. We also started looking at split-plot designs, which involve nesting and random effects.

Read: Chapter 18 (skip 18.5).

 

