Analysis of my data

about the data

goal

Notice that the three lines above are headings at different levels.

eb <- read.delim("~/Dropbox/R_Class/EssentialR/Data/electric bill.txt")

summary(eb)
##      month             year           kwh              days       est   
##  Min.   : 1.000   Min.   :2003   Min.   : 294.0   Min.   :21.00   a:63  
##  1st Qu.: 4.000   1st Qu.:2005   1st Qu.: 657.0   1st Qu.:29.00   e:38  
##  Median : 7.000   Median :2007   Median : 849.0   Median :30.00         
##  Mean   : 6.673   Mean   :2007   Mean   : 876.6   Mean   :30.36         
##  3rd Qu.:10.000   3rd Qu.:2009   3rd Qu.:1076.0   3rd Qu.:32.00         
##  Max.   :12.000   Max.   :2011   Max.   :1992.0   Max.   :35.00         
##       cost             avgT           dT.yr              kWhd.1     
##  Min.   :-73.09   Min.   :13.00   Min.   :-13.0000   Min.   :10.50  
##  1st Qu.: 55.03   1st Qu.:39.00   1st Qu.: -3.0000   1st Qu.:21.97  
##  Median : 68.03   Median :53.00   Median :  1.0000   Median :27.83  
##  Mean   : 67.84   Mean   :52.35   Mean   :  0.2475   Mean   :28.74  
##  3rd Qu.: 83.94   3rd Qu.:69.00   3rd Qu.:  3.0000   3rd Qu.:34.18  
##  Max.   :174.70   Max.   :78.00   Max.   : 11.0000   Max.   :66.40

The data look OK, although the minimum cost is negative (-73.09), which may be a billing credit and is worth checking.

Regression model of usage (kwh) as a function of average temperature (avgT).

m1 <- lm(kwh ~ avgT, data = eb)
summary(m1)$coefficients
##                Estimate Std. Error   t value     Pr(>|t|)
## (Intercept) 1378.076386  85.362566 16.143802 1.741231e-29
## avgT          -9.579103   1.554991 -6.160229 1.568526e-08
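
As a rough worked reading of these coefficients, the fitted line can be evaluated by hand. The sketch below copies the two estimates from the output above; the temperature value of 50 is a hypothetical input chosen only for illustration, and the variable names are not from the original analysis.

```r
# Coefficients copied from the printed model summary
intercept <- 1378.076386
slope_avgT <- -9.579103

# Hypothetical average temperature, in the same units as avgT
temp <- 50

# Fitted usage: intercept + slope * temperature
pred_kwh <- intercept + slope_avgT * temp
pred_kwh
```

With the model object available, `predict(m1, newdata = data.frame(avgT = 50))` should give the same fitted value.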

The fit is OK: the adjusted R² is 0.2698.
The value above is dynamically linked to the data, so if I change the data or the model, the reported R² value changes as well.
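
A minimal sketch of how such a linked value can be produced, assuming the report is written in R Markdown and knitted with knitr. The built-in `cars` data set stands in for the electric bill data so that the example runs without the original file; `m_demo` and `r2_text` are illustrative names, not part of the original analysis.

```r
# Stand-in model on the built-in `cars` data (an assumption for this sketch)
m_demo <- lm(dist ~ speed, data = cars)

# Both versions of R-squared are stored in the model summary
r2 <- summary(m_demo)$r.squared
adj_r2 <- summary(m_demo)$adj.r.squared

# Format the value the way it would appear in the sentence
r2_text <- format(round(adj_r2, 4), nsmall = 4)
r2_text
```

In the R Markdown source, an inline expression such as `` `r round(summary(m1)$adj.r.squared, 4)` `` placed inside the sentence is evaluated each time the document is knitted, which is why the printed number follows any change to the data or the model.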