> ################################ > ### Example: Students and parents smoking > ### Lesson 3 & 6: LOGLIN() & GLM() functions > ### Related: See smoke.sas in Lesson 6 > ################################# > > ### Fitting logistic regression for a 2x2 table ##### > ### Here is one way to read the data from the table and use glm() > ### NOTE: If the data come from a datafile, or are in a different format > ### we would need a slightly different code bellow > ### but you need to have the data for "success" in one column and "failure" in another > ### Notice now we need to use family=binomial (link=logit) > ### while with log-linear models we used family=poisson(link=log) > > ### define the explanatory variable with two levels: 1=one or more parents smoke, 0=no parents smoke > parentsmoke=as.factor(c(1,0)) > ###NOTE: if we do parentsmoke=c(1,0) R will treat this as a numeric and not categorical variable > > ### need to create a response vector such that has counts for both "success" and "failure" > response<-cbind(yes=c(816,188),no=c(3203,1168)) > response yes no [1,] 816 3203 [2,] 188 1168 > > ### fit the logistic regression model > smoke.logistic<-glm(response~parentsmoke, family=binomial(link=logit)) > > ### OUTPUT > smoke.logistic Call: glm(formula = response ~ parentsmoke, family = binomial(link = logit)) Coefficients: (Intercept) parentsmoke1 -1.8266 0.4592 Degrees of Freedom: 1 Total (i.e. Null); 0 Residual Null Deviance: 29.12 Residual Deviance: 2.554e-13 AIC: 19.24 > summary(smoke.logistic) Call: glm(formula = response ~ parentsmoke, family = binomial(link = logit)) Deviance Residuals: [1] 0 0 Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) -1.82661 0.07858 -23.244 < 2e-16 *** parentsmoke1 0.45918 0.08782 5.228 1.71e-07 *** --- Signif. codes: 0 Ô***Õ 0.001 Ô**Õ 0.01 Ô*Õ 0.05 Ô.Õ 0.1 Ô Õ 1 (Dispersion parameter for binomial family taken to be 1) Null deviance: 2.9121e+01 on 1 degrees of freedom Residual deviance: 2.5535e-13 on 0 degrees of freedom AIC: 19.242 Number of Fisher Scoring iterations: 2 > anova(smoke.logistic) Analysis of Deviance Table Model: binomial, link: logit Response: response Terms added sequentially (first to last) Df Deviance Resid. Df Resid. Dev NULL 1 29.121 parentsmoke 1 29.121 0 2.554e-13 > > ###NOW, treat parents smoking as a factor > > > > ### Fitting logistic regression for a 2x3 table ##### > ### Here is one way to read the data from the table and use glm() > ### Notice now we need to use family=binomial (link=logit) > ### while with log-linear models we used family=poisson(link=log) > > parentsmoke=as.factor(c(2,1,0)) > response<-cbind(c(400,416,188),c(1380,1823,1168)) > response [,1] [,2] [1,] 400 1380 [2,] 416 1823 [3,] 188 1168 > smoke.logistic<-glm(response~parentsmoke, family=binomial(link=logit)) > > ### OUTPUT > smoke.logistic Call: glm(formula = response ~ parentsmoke, family = binomial(link = logit)) Coefficients: (Intercept) parentsmoke1 parentsmoke2 -1.8266 0.3491 0.5882 Degrees of Freedom: 2 Total (i.e. Null); 0 Residual Null Deviance: 38.37 Residual Deviance: -5.498e-13 AIC: 28.16 > summary(smoke.logistic) Call: glm(formula = response ~ parentsmoke, family = binomial(link = logit)) Deviance Residuals: [1] 0 0 0 Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) -1.82661 0.07858 -23.244 < 2e-16 *** parentsmoke1 0.34905 0.09554 3.654 0.000259 *** parentsmoke2 0.58823 0.09695 6.067 1.30e-09 *** --- Signif. codes: 0 Ô***Õ 0.001 Ô**Õ 0.01 Ô*Õ 0.05 Ô.Õ 0.1 Ô Õ 1 (Dispersion parameter for binomial family taken to be 1) Null deviance: 3.8366e+01 on 2 degrees of freedom Residual deviance: -5.4978e-13 on 0 degrees of freedom AIC: 28.165 Number of Fisher Scoring iterations: 2 > anova(smoke.logistic) Analysis of Deviance Table Model: binomial, link: logit Response: response Terms added sequentially (first to last) Df Deviance Resid. Df Resid. Dev NULL 2 38.366 parentsmoke 2 38.366 0 -5.498e-13