R Help: Poisson & Nonlinear Regression

Hospital recovery (exponential regression)

  • Load the recovery data.
  • Create log(prog) variable.
  • Obtain starting values for nonlinear model parameters from fitting a simple linear regression model of log(prog) vs days.
  • Fit nonlinear regression model to data using these starting values.
  • Create a scatterplot of prog vs days and add a fitted line based on the nonlinear regression model.

recovery <- read.table("~/path-to-data/recovery.txt", header=T)
attach(recovery)
logprog <- log(prog)
summary(lm(logprog ~ days))
#              Estimate Std. Error t value Pr(>|t|)    
# (Intercept)  4.037159   0.084103   48.00 5.08e-16 ***
# days        -0.037974   0.002284  -16.62 3.86e-10 ***
exp(4.037159) # 56.66513
model.1 <- nls(prog ~ theta1 * exp(theta2 * days),
start=list(theta1=56.7, theta2=-0.038))
summary(model.1)
#         Estimate Std. Error t value Pr(>|t|)    
# theta1 58.606532   1.472159   39.81 5.70e-15 ***
# theta2 -0.039586   0.001711  -23.13 6.01e-12 ***
# ---
# Residual standard error: 1.951 on 13 degrees of freedom
plot(x=days, y=prog,
panel.last = lines(sort(days), fitted(model.1)[order(days)]))
detach(recovery)

U.S. Census population (population growth nonlinear regression)

  • Load the census data.
  • Obtain starting values for nonlinear model parameters from observing features of a scatterplot of population vs year.
  • Fit nonlinear regression model to data using these starting values.
  • Create a scatterplot of population vs year and add a fitted line based on the nonlinear regression model.
  • Create a residual plot.

census <- read.table("~/path-to-data/us_census.txt", header=T)
attach(census)
plot(x=year, y=population)
log(350/3.929-1) # 4.478259
log(350/5.308-1) - log(350/3.929-1) # -0.3048229
model.1 <- nls(population ~ beta1 / (1 + exp(beta2 + beta3 * (year - 1790) / 10)),
start=list(beta1=350, beta2=4.5, beta3=-0.3))
summary(model.1)
#        Estimate Std. Error t value Pr(>|t|)    
# beta1 389.16551   30.81196   12.63  2.2e-10 ***
# beta2   3.99035    0.07032   56.74  < 2e-16 ***
# beta3  -0.22662    0.01086  -20.87  4.6e-14 ***
# ---
# Residual standard error: 4.45 on 18 degrees of freedom
plot(x=year, y=population,
panel.last = lines(sort(year), fitted(model.1)[order(year)]))
plot(x=year, y=residuals(model.1),
panel.last = abline(h=0, lty=2))
detach(census)

Poisson example (Poisson regression)

  • Load the poisson data.
  • Create a scatterplot of the data.
  • Fit a Poisson regression model of y vs x.
  • Calculate 95% confidence intervals for the regression parameters based on asymptotic normality and based on profiling the least-squares estimation surface.
  • Create a scatterplot of y vs x and add a fitted line based on the Poisson regression model.
  • Conduct a likelihood ratio (or deviance) test for x.
  • Calculate the sum of squared deviance residuals and the sum of squared Pearson residuals and calculate p-values based on chi-squared goodness-of-fit tests.
  • Calculate pseudo \(R^2\) for Poisson regression.
  • Create residual plots using Pearson and deviance residuals.
  • Calculate hat values (leverages) and studentized residuals.

poisson <- read.table("~/path-to-data/poisson_simulated.txt", header=T)
attach(poisson)
plot(x=x, y=y)
model.1 <- glm(y ~ x, family="poisson")
summary(model.1)
#             Estimate Std. Error z value Pr(>|z|)    
# (Intercept)  0.30787    0.28943   1.064    0.287    
# x            0.07636    0.01730   4.413 1.02e-05 ***
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# 
# (Dispersion parameter for poisson family taken to be 1)
# 
# Null deviance: 48.310  on 29  degrees of freedom
# Residual deviance: 27.842  on 28  degrees of freedom
# AIC: 124.5
confint.default(model.1) # based on asymptotic normality
confint(model.1) # based on profiling the least-squares estimation surface
plot(x=x, y=y,
panel.last = lines(sort(x), fitted(model.1)[order(x)]))
anova(model.1, test="Chisq")
#      Df Deviance Resid. Df Resid. Dev  Pr(>Chi)    
# NULL                    29     48.310              
# x     1   20.468        28     27.842 6.065e-06 ***
sum(residuals(model.1, type="deviance")^2) # 27.84209
model.1$deviance # 27.84209
pchisq(model.1$deviance, 28, lower.tail=F) # p-value = 0.4728389
sum(residuals(model.1, type="pearson")^2) # 26.09324
pchisq(sum(residuals(model.1, type="pearson")^2), 28, lower.tail=F) # p-value = 0.5679192
1-model.1$deviance/model.1$null.deviance # Pseudo R-squared = 0.423676
plot(fitted(model.1), residuals(model.1, type="pearson"))
plot(fitted(model.1), residuals(model.1, type="deviance"))
summary(influence.measures(model.1))
#    dfb.1_ dfb.x dffit cov.r   cook.d hat    
# 10 -0.22   0.30  0.37  1.25_*  0.08   0.18  
# 21  0.37  -0.48 -0.57  1.30_*  0.15   0.23_*
residuals(model.1)[8] # 1.974329
rstudent(model.1)[8] # 2.028255
detach(poisson)