6 Components of linear models

Linear models are the bread and butter of ecological modeling. If you can fit a linear model, you can satisfactorily answer probably over 90% of the types of questions ecologists are interested in. They also include the basic building structures for almost all types of models in ecology, and the skills you would need to learn other ones, so this is why we will spend two weeks covering them.

6.1 Algebraic linear functions

There are a few important things to remember from your first algebra class that will be important for linear models.

Remember that we can represent a straight line in slope-intercept form as $y = mx + b$, where $m$ is the slope of the line and $b$ is the intercept. The intercept, $b$ is where the line hits the y-axis, in other words, it is the value of $y$ when $x=0$. For every one unit increase in $x$, we except an $m$ unit increase in $y$.

We represent the relationship between $x$ and $y$ in a Cartesian plane, where the $x$ axis is horizontal and the $y$ axis is vertical (and we are only operating in two dimensions).

In a function, no single $x$ value as input can have two different $y$ values as output. In other words, a function can never double back on itself / overlap vertically. If you know $x$, you know $y$ - there is no possibility of a different $y$ associated with a single $y$. The same $y$ value can be the output of multiple different $x$ input values, though.

Non-functional cow, input of x = 1 leads to y = 1.5 and y = -4

Normal linear functions are continuous with no break points / gaps. There are piecewise functions over which there are ranges of $x$ values that there can be no $y$ values, but these are not the norm. For this class, we are only going to use normal linear functions that are not piece-wise. If you’re interested in piece-wise type questions, you might also be interested in hurdle models.

Note on terminology. Some people refer to variable(s) on the x-axis as the independent variable and the variable on the y-axis as the dependent variable because y depends on x. This is confusing, however, and also implies that there is a strongly causal relationship between x and y. I, and many others, prefer the terms predictor for the x-axis variable(s), and response for the y-axis variable because these terms are much clearer for communicating your model structure.

6.2 Assumptions of linear models

All models have assumptions. These are conditions that should be met by your model in order for you to trust the inference you make based on it. If these assumptions are not met, take your model outputs with a grain of salt (or don’t believe them, depending on how badly you don’t meet the assumptions!).

It makes biological sense. This is the single most important assumption of your model. Your predictors must make sense in how they relate to your response variable, your underlying deterministic component is sensible, and you have a biological hypothesis or explanation for why you are using this model structure.
The effects are additive. In our simple model, the response variable is the sum of the intercept ($b$) and the slope ($m$) times the value of $x$, not the product of the two, or $m$ as a function of $x$, etc. The effects are additive because they are separate and the sum of the effects results in the deterministic estimate.
Effects are linear. This does not mean that everything needs to be a straight line! But the function does need to be linear.

# Let's explore some linear functions

x <- seq(-10, 10, by=0.1)
plot(x ~ x, type="l")

Warning in plot.formula(x ~ x, type = "l"): the formula 'x ~ x' is treated as
'x ~ 1'

plot(x^(1/2) ~ x, type="l")

plot(x^2 ~x, type="l")

plot(x^3 ~ x, type="l")

plot(x^4 ~ x, type="l")

plot(x^5 ~ x, type="l")

plot(x/(1-x) ~ x, type="l")

plot(x/(x^2) ~ x, type="l")

plot(sin(x) ~ x, type="l")

Errors (aka residuals, in this case) are independent. There are no shared patterns in the residuals, which might indicate that there is an issue with the observation process, or that there are other shared covariates that might explain patterns in the residuals.
Errors have equal variance (“homoscedasticity”).
Errors are normally distributed. The residuals of the model are not skewed, non-normal, all positive, etc.

6.3 Deterministic and stochastic processes

Linear models consist of a deterministic component and a stochastic component. For a normally distributed random variable $y$ with one predictor $x_1$, the (a) stochastic and (b) deterministic components can be represented as: \[\text{(a)} \quad y \sim Norm(\mu, \sigma) \\ \text{(b)} \quad \mu = \beta_0 + \beta_1*x_1\] In this example, you can think of $\beta_0$ as the same as $b$ in $y = mx + b$ and the coefficient $\beta_1$ as the slope $m$. But, the general linear model is a lot more flexible than a linear function represented in slope-intercept form (so we will switch to the more accurate representation for the rest of the semester).

The deterministic component will always result in the same answer every time (i.e., it is determined). This is akin to an algebraic linear function, i.e. the y-value always lies on the line. In ecology, the deterministic component represents our biological hypothesis, i.e. it is a statement about the way we think the world works. If the world worked the way we think it does, every time we would end up observing the same outcome. Note that the deterministic component uses $=$ because the answer is solvable.

Because there is always some amount of error or variation around that deterministic part, we also have a stochastic component. The stochastic component can represent a number of things depending on what you are modeling. It could be measurement or observation error, e.g. because we imperfectly observe the world, miscount individuals, our measurement tools are a bit faulty, etc. It could also represent true biological variability around a population mean (e.g. think about intraspecific trait variability). Or, it could represent unmeasured or unmodeled variables (i.e., the variable we are modeling is predicted by two covariates, but we only included one, so we are working with an incomplete picture). And, of course, it could be a combination of these different factors. Note that the stochastic component uses $\sim$ because the output is distributed as that probability distribution, but we know there will always be variation in our sample.

6.4 Ordinary least squares

In an algebraic linear function represented in slope-intercept form, all values of $y$ lie on the line, with no deviation from it. In the real world, data are messy, so how do we find the line of best fit through that data? Typically, we use ordinary least squares (OLS) regression. You can think of it as if we are attempting to find the combination of the slope ($m$) and intercept ($b$) that leads to a line that is as close as possible to all data points. In other words, we want to have the smallest residuals (the difference between each point and the predicted value on the line) as possible. In reality, we base this fit on the sum of the squared residuals. Why squared? Let’s simulate some data.

# First, let's create a function that generates observations as a linear function 
# of x, i.e. in an algebraic sense
linear_function <- function(m, x, b){
  y <- m*x + b
  return(y)
}

x.vals <- 1:10
slope <- 0.2
intercept <- 1.2

y.det <- linear_function(m = slope, x=x.vals, b=intercept)
plot(y.det ~ x.vals)
lines(y.det ~ x.vals)

# Now let's add error
set.seed(27)
y.obs <- rnorm(n = length(y.det), mean = y.det, sd=1)

plot(y.obs ~ x.vals, type="p")
lines(y.det ~ x.vals)
arrows(x0 = x.vals, x1=x.vals,
       y0=y.det, y1=y.obs, 
       code=3, 
       angle=90, 
       length=0, 
       col="red")

# calculate the residuals
res <- y.obs - y.det
sum(res^2)

[1] 14.89237

# what if we tried a different line?
alt.line <- linear_function(m=0.05, x=x.vals, b=2)
plot(y.obs ~ x.vals)
lines(alt.line ~ x.vals)
arrows(x0 = x.vals, x1=x.vals,
       y0=alt.line, y1=y.obs, 
       code=3, 
       angle=90, 
       length=0, 
       col="red")

res2 <- y.obs - alt.line
sum(res2^2)

[1] 18.02767

Because we simulated based on the deterministic relationship but added in error, neither of these attempts is actually the line of best fit for our ‘observed’ data. Let’s build a for-loop that will let us try a bunch of different potential slopes.

potential_m <- seq(-2, 2, by=0.1)

ssr <- c()

for(i in 1:length(potential_m)){
  line_i <- linear_function(m = potential_m[i], x = x.vals, b=0)
  res_i <- y.obs - line_i
  ssr[i] <- sum(res_i^2)
}

plot(ssr ~ potential_m, type="l")

which.min(ssr)

[1] 25

potential_m[which.min(ssr)]

[1] 0.4

est_line <- linear_function(m = 0.4, x=x.vals, b=0)

plot(y.obs ~ x.vals)
lines(est_line ~ x.vals)

# so far, we fixed the intercept at 0 and tried slopes based on that
# what if we want to estimate both the slope and intercept parameter?

potential_b <- seq(-3, 3, by=0.1)

ssr2 <- matrix(nrow=length(potential_m), ncol=length(potential_b))

for(i in 1:length(potential_m)){
  
  for(j in 1:length(potential_b)){
    
      line_i <- linear_function(m = potential_m[i], x = x.vals, b=potential_b[j])
      res_i <- y.obs - line_i
      ssr2[i, j] <- sum(res_i^2)
    
  }
}

# visualize the likelihood surface
heatmap(ssr2, Rowv = NA, Colv = NA)

min(ssr2)

[1] 13.63746

which.min(ssr2)

[1] 1909

which(ssr2==min(ssr2), arr.ind = T)

     row col
[1,]  23  47

potential_m[23]

[1] 0.2

potential_b[47]

[1] 1.6

# Are these what we actually set earlier? No, because we included a bit of stochasticity in our data generating process

Realistically, we rarely (if ever?) actually brute force our way to a combination of parameters that minimizes the sum of the squared residuals. Instead, we use built-in functions to do that for us. We will use the function lm to fit a linear model to our simulated data which has a slope and intercept that minimize the sum of the squared residuals. The argument that the function needs is a formula, which we represent as our response variable over our predictor variable(s) using the tilde (i.e. y ~ x).

mod1 <- lm(y.obs ~ x.vals)
summary(mod1)


Call:
lm(formula = y.obs ~ x.vals)

Residuals:
    Min      1Q  Median      3Q     Max 
-1.7480 -0.8622 -0.2009  0.8796  1.7493 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   1.3137     0.8860   1.483    0.176
x.vals        0.2442     0.1428   1.710    0.126

Residual standard error: 1.297 on 8 degrees of freedom
Multiple R-squared:  0.2677,    Adjusted R-squared:  0.1762 
F-statistic: 2.925 on 1 and 8 DF,  p-value: 0.1256

# From our "observed" data, the best line has slope=0.2442 and intercept=1.3137
# Let's double check this with our function and residuals
ols.line <- linear_function(m=0.2442, x=x.vals, b=1.3137)
plot(y.obs ~ x.vals)
lines(ols.line ~ x.vals)
arrows(x0 = x.vals, x1=x.vals,
       y0=ols.line, y1=y.obs, 
       code=3, 
       angle=90, 
       length=0, 
       col="red")

res.ols <- y.obs - ols.line
sum(res.ols^2)

[1] 13.45756

sum(res.ols) # Not actually zero because of rounding, but, very very close

[1] 0.0006278831

# Why doesn't this match what we got in our brute force approach?
# What would happen if we increased the number of parameters we checked with brute force?

7 Model fitting

7.1 Recap of OLS and residuals

Remember from the previous lecture that a simple way to fit a linear model to data is to use ordinary least squares (OLS) regression and minimize the sum of the squared residuals. We did this a brute force approach by trying combinations of parameter values that resulted in a linear model that minimized the sum of the squared residuals for the parameter values that we tried. Note that this last point is very important: we did not try all possible combinations of parameters, so even though we may have found a locally optimal solution, we could have missed the actual optimal solution.

For normally distributed errors, OLS is essentially a special case of maximum likelihood estimation (MLE). Maximum likelihood attempts to find the parameter combination that makes our observed data most probable. But, the likelihood function (not in the R function sense, but in the math function sense) can get stuck at local optima. Think of it sort of like a blindfolded person walking around a field trying to find the lowest point. If they keep walking downhill, eventually they will find themselves at the bottom. If any step they take in any direction is uphill, then they should stop where they are. This approach works fine if there is only one low point in the field, but not if there are multiple depressions. What if they’re just stuck in a bison wallow?

7.2 American bison example

For today’s lecture and next week, we will be using a dataset of bison ages and weights collected at the Konza Prairie LTER site. Let’s start by watching a video of bison to remind ourselves just how massive they are. When watching the video, pay careful attention specifically to the size (and approximate weight) of bison of different ages and different sexes, and anything that might affect an individual’s weight.

dat <- read.csv("https://tinyurl.com/bigbison")
head(dat)

  DataCode RecType RecYear RecMonth RecDay AnimalCode AnimalSex AnimalWeight
1    CBH01       2    1994       11      8        813         F          890
2    CBH01       2    1994       11      8        834         F         1074
3    CBH01       2    1994       11      8      B-301         F         1060
4    CBH01       2    1994       11      8      B-402         F          989
5    CBH01       2    1994       11      8      B-403         F         1062
6    CBH01       2    1994       11      8      B-502         F          978
  AnimalYOB
1      1981
2      1983
3      1983
4      1984
5      1984
6      1985

# calculate age
dat$age <- dat$RecYear - dat$AnimalYOB + 1

# sort with oldest bison at the top
dat <- dat[order(dat$age, decreasing = T),]

# remove duplicates of the same individual
# because we sorted by age first, we are keeping the oldest record per individual
dat <- dat[!duplicated(dat$AnimalCode),]

dat$weight <- as.numeric(dat$AnimalWeight)

Warning: NAs introduced by coercion

dat <- dat[!is.na(dat$AnimalWeight),]
dat$sex <- factor(dat$AnimalSex)

plot(dat$weight ~ dat$age, 
     col=dat$sex)

Let’s say we are interested in fitting a model of the effect of age and sex on bison weight. We assume that our response variable (weight) is normally distributed. The deterministic and stochastic components of our model can be written as:

\[y \sim Norm(\mu, sigma)\] \[\mu = \beta_0 + \beta_{a}*age + \beta_{s}*sex\]

When we perform OLS to find a combination of parameters that makes our observed data the most likely, we are attempting to find the combination of our three coefficients $\beta_0$, $\beta_a$ and $\beta_s$. When fitting the model, our coefficient estimates will now be the best combination of values for all three coefficients (i.e. the intercept, $\beta_0$, the effect of age, $\beta_{a}$, and the effect of sex $\beta_{s}$) that minimizes the sum of the squared residuals.

7.3 The Design Matrix

To understand why we get a combination of three coefficients that collectively minimize the sum of the squared residuals, rather than estimating the effect of each predictor separately, it is important to know just a little bit of matrix algebra. The three coefficient estimates (or ‘betas’ for short) are a vector, which has only one dimension and all elements are of the same type (typically, numeric).

A matrix, while still having elements of the same type, has two dimensions: the number of rows, and the number of columns. In ecological modeling, people typically either describe a matrix as being $n \times p$, where $n$ is the number of rows or number of observations and $p$ is the number of predictors, or as being $i \times j$ where $i$ is the number of observations and $j$ is the number of predictors. I tend to use $i \times j$ because it matches how I code indices (i.e. x[i]) and because very rarely are biological concepts denoted as $i$ or $j$ but $n$ often is used to describe sample sizes and $p$ is often used for probability, so I find the code is much more readable with $i$ and $j$.

7.3.1 Adding matrices

Let’s say we have the following 2x3 matrices $A$ and $B$. To add the two matrices together, we add the matching elements from each. We can only add the matrices together if they have the same dimensionality, i.e. we could not add a 2x3 matrix to a 2x5 matrix because there will not be matching elements.

\[A = \begin{bmatrix} 2 & 4 & 3 \\ 1&6 & 8 \end{bmatrix} \\ \\ B= \begin{bmatrix} 1 & 2&3\\ 5&8&1 \end{bmatrix} \\ \\ A+B = \begin{bmatrix} (2+1) & (4+2) & (3+3)\\ (1+5) & (6+8) & (8+1) \end{bmatrix} \\ A+B = \begin{bmatrix} 3 & 6 & 6 \\ 6 & 14 & 9 \end{bmatrix}\]

We can also do matrix calculations in R.

# create matrix A
A <- matrix(data=c(2, 4, 3, 1, 6, 8), 
            nrow=2, ncol=3,
            byrow=T # super important! default is to fill by column
            )

B <- matrix(data=c(1, 2, 3, 5, 8, 1),
            nrow=2, ncol=3, 
            byrow=T)

A+B

     [,1] [,2] [,3]
[1,]    3    6    6
[2,]    6   14    9

# what if we mad matrices with different dimensions?

C <- matrix(data=c(1, 0, 1, 1, 0, 1),
            nrow=3, ncol=2, # note we now have a 3x2 matrix!
            byrow=T
            )
print(C)

     [,1] [,2]
[1,]    1    0
[2,]    1    1
[3,]    0    1

#A+C # this will result in an error because they have different dimensions

7.3.2 Multiplying vectors and matrices

When multiplying matrices and vectors in R, we use the %*% operator, rather than *. When multiplying a vector by a matrix, the vector needs to have the same length as the number of columns in the matrix. One thing that is perhaps counterintuitive here is that vectors are oriented like columns so the length is vertical. But, when viewed in R, they will look horizontal.

$$ = \[\begin{bmatrix} 0 \\ 1 \\ 0.5 \\ \end{bmatrix}\] \ A = \[\begin{bmatrix} 2 & 4 & 3 \\ 1&6 & 8 \end{bmatrix}\]

A = \[\begin{bmatrix} (0\times2) + (1\times4) + (0.5\times3)\\ (0\times1) + (1\times6) + (0.5\times8) \end{bmatrix}\] = \[\begin{bmatrix} 0 + 4 + 1.5\\ 0 + 6 + 4 \end{bmatrix}\] = \[\begin{bmatrix} 5.5 \\ 10 \end{bmatrix}\]

betas <- matrix(c(0, 1, 0.5), ncol=1)
A%*%betas

     [,1]
[1,]  5.5
[2,] 10.0

This is where our Design Matrix comes in, where the number of columns needs to be equal to our number of betas, and the number of rows needs to be equal to the number of observations. Don’t forget to add a column of all 1s if you have an intercept! i.e. so that the intercept $\beta_0$ gets multiplied by $1$ and added in each time.

\[ \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \vdots \\ \beta_j \end{bmatrix} * \begin{bmatrix} 1 & x_{1,1} & x_{1,2} & \dots & x_{1,j} \\ 1 & x_{2,1} & x_{2,2} & \dots & x_{2,j} \\ \vdots & \vdots & & \vdots \\ 1 & x_{i,1} & x_{i,2} & \dots & x_{i,j} \\\end{bmatrix} = \begin{bmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_i \end{bmatrix}\]

\[ \mu_i = \beta_0*1 + \beta_1*x_{i,1} + \beta_2*x_{i,2} \dots + \beta_j*x_{i,j}\]

7.4 Fitting a simple linear model

Instead of going through the whole painful process of OLS every time we want to fit a model, R has a built in function called lm which will run a linear model for us.

mod1 <- lm(weight ~ age + sex, data=dat)

Before we get to model fit, we will go over interpreting coefficient estimates for categorical variables. In our dataset, sex is coded as M or F. Behind the scenes, R has automatically dummy-coded these for you. Dummy-coding is when you represent a categorical variable as 0 or 1, so there is a new temporary ‘column’ (not really, but thinking of it like that will be helpful) in our dataset called ‘M’ and it takes the value of 1 when the sex column is M, and 0 when it does not. There is also a new temporary ‘column’ called F that scales the value of 1 when sex is F, and 0 when it is not.

When dealing with categorical variables, there always has to be a comparison level that other levels are compared to because it is incorporated into the intercept. The default in R is the first value alphabetically. Let’s look at the summary of the model to help with understanding this.

summary(mod1)


Call:
lm(formula = weight ~ age + sex, data = dat)

Residuals:
    Min      1Q  Median      3Q     Max 
-835.03 -141.93  -19.71  125.43  961.16 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  404.216      9.585   42.17   <2e-16 ***
age           58.855      1.328   44.32   <2e-16 ***
sexM         213.787     10.183   21.00   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 235.2 on 2282 degrees of freedom
  (43 observations deleted due to missingness)
Multiple R-squared:  0.475, Adjusted R-squared:  0.4745 
F-statistic:  1032 on 2 and 2282 DF,  p-value: < 2.2e-16

We have a coefficient estimate for the intercept, for age, and for sex = M. There is no coefficient estimate for sex = F. That’s because the intercept now represents the predicted weight for a female bison at age 0. The coefficient estimate for sex = Mrepresents the difference in the intercept for a male bison still at age 0, i.e. it shifts the entire line up the y-axis by that amount.

As a side note that does not apply to this model but will come up in the future: if you have multiple categorical variables, the intercept is the comparison level for all of them. i.e. if we also had bison color morphs that were aqua and chartreuse, the intercept would be the predicted weight for an aqua-colored female bison at age 0 (because aqua alphabetically comes first). You can change what the comparison level is by converting categorical variables into factors with factor and using relevel to specify the order.

Let’s plot the predictions from the model to help with understanding the different coefficient estimates by visualizing the model (we will come back to model fit later!). Now, one obnoxious thing about plotting models fit with dummy variables, is that to predict new values, we need to be able to pass in a 0 or 1.

dat$sex <- as.numeric(dat$AnimalSex=="M")
mod1 <- lm(weight ~ age + sex, data=dat)

des.mat <- cbind(1, dat$age, dat$sex)
betas <- coefficients(mod1)

est.vals <- des.mat%*%betas

age.range <- seq(0, 22, 0.1)

des.mat.f <- cbind(1, age.range, 0)
des.mat.m <- cbind(1, age.range, 1)

f.ests <- des.mat.f%*%betas
m.ests <- des.mat.m%*%betas

plot(weight ~ age, data = dat, col=factor(dat$sex))
lines(f.ests ~ age.range, col="black")
lines(m.ests ~ age.range, col="red")

Okay, so I think we can all agree something seems very off about our model. We don’t even need diagnostic plots to know this is not a good representation of our data. In reality, animals do not continue to grow at at the same rate with age (though many fish do have indeterminate growth), and weight will eventually plateau.

Even fish with indeterminate growth do not grow linearly with age

Remember that a linear model does not have to be a straight line! Any linear function of x will do. In this case, we want something that will reach a horizontal asymptote. $x^{1/2}$ should do the trick here. If you’re not sure what linear form of x to use, I recommend setting up a range of x-values from -4 to 4, increasing in increments of 0.1 and then plotting y as different functions of x.

x <- seq(-4, 4, by=0.01)

y <- x^{1/2}
plot(y ~ x)

y <- sin(x)
plot(y ~ x)

y <- -x^{2}
plot(y ~ x)

To model weight as a function of the square root of age and sex, we will first create a new variable consisting of the square root of age, and then refit the model with this as a covariate.

dat$age.sq <- dat$age^{1/2}

mod4 <- lm(weight ~ age.sq + sex, data=dat)


summary(mod4)


Call:
lm(formula = weight ~ age.sq + sex, data = dat)

Residuals:
    Min      1Q  Median      3Q     Max 
-714.49 -116.32  -12.64  107.33  874.01 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   25.148     12.934   1.944    0.052 .  
age.sq       331.173      5.638  58.741   <2e-16 ***
sex          214.140      8.699  24.618   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 202.4 on 2282 degrees of freedom
  (43 observations deleted due to missingness)
Multiple R-squared:  0.6111,    Adjusted R-squared:  0.6108 
F-statistic:  1793 on 2 and 2282 DF,  p-value: < 2.2e-16

betas2 <- coefficients(mod4)

squage.range <- age.range^{1/2}

des.mat.f <- cbind(1, squage.range, 0)
des.mat.m <- cbind(1, squage.range, 1)

new.pred.f <- des.mat.f%*%betas2
new.pred.m <- des.mat.m%*%betas2

par(pty="s", las=1)
plot(weight ~ age, data = dat, col=factor(dat$sex))
lines(new.pred.f ~ age.range, col="black")
lines(new.pred.m ~ age.range, col="red")

This is closer, but we still aren’t capturing that there is a rapid increase at the beginning followed by a tapering off and that the slopes are really different for male and female bison. We can model this with an interaction between our categorical predictor (sex) and our continuous predictor (the square root of age).

mod.int <- lm(weight ~ age.sq + sex + age.sq*sex, data=dat)
summary(mod.int)


Call:
lm(formula = weight ~ age.sq + sex + age.sq * sex, data = dat)

Residuals:
    Min      1Q  Median      3Q     Max 
-677.19  -81.18    2.93   83.82  508.67 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  228.973      9.268   24.71   <2e-16 ***
age.sq       231.369      4.124   56.10   <2e-16 ***
sex         -690.224     17.437  -39.59   <2e-16 ***
age.sq:sex   513.445      9.353   54.89   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 132.9 on 2281 degrees of freedom
  (43 observations deleted due to missingness)
Multiple R-squared:  0.8325,    Adjusted R-squared:  0.8322 
F-statistic:  3778 on 3 and 2281 DF,  p-value: < 2.2e-16

betas3 <- coefficients(mod.int)

des.mat.f <- cbind(1, squage.range, 0, 0*squage.range)
des.mat.m <- cbind(1, squage.range, 1, 1*squage.range)

new.pred.f <- des.mat.f%*%betas3
new.pred.m <- des.mat.m%*%betas3

par(pty="s", las=1)
plot(weight ~ age, data = dat, col=factor(dat$sex))
lines(new.pred.f ~ age.range, col="black")
lines(new.pred.m ~ age.range, col="red")

8 Evaluating model fit

Just because we can fit a model, doesn’t necessarily mean it is a good model. How do we know if a model is a ‘good’ model? We could approach this in a whole lot of different ways.

8.1 Assumptions of linear models

First, does our model meet the assumptions of a linear model? If it doesn’t, it might not be a good model. But, how much are we willing to compromise on those assumptions to have any model at all? Remember that all models are going to be wrong and are imperfect representations of the world with a lot of simplifications, but the question is really whether or not we have cut too many corners. Earlier this semester, we played rectilinear pictionary where we were allowed to draw straight lines. What if instead we had played pointilism pictionary? How many guesses would it have taken people to figure out their animals if you were only allowed to draw one point per turn? Is that an assumption we would have been willing to accept was okay? What if, actually, one team had been allowed to draw curved lines while everyone else had to draw straight lines? Is that a rule we would have been okay with breaking? These sorts of questions, as ridiculous as they may seem, are the sort of things you want to consider when evaluating model fit. There is no right or wrong answer (okay, sometimes there are really, really bad models that would count as wrong answers…) but most of the time if you’re fitting a model, it will be somewhere in a grey zone of goodness.

8.1.1 The model makes biological sense

Remember that this is the single most important thing to keep in mind when fitting a model. If your model does not make biological sense, then you should not be fitting it, or quite frankly evaluating fit. It should still be in your mental checklist though, because it is very, very important to remind ourselves of this criterion! It can be tempting to consider a few different models, compare them by other metrics of ‘goodness’ and end up selecting a model that by and large is utter nonsense when you consider the biology or ecology of a system.

We used the sum of the squared residuals earlier when thinking about how to fit the best line through the data in an OLS sense. Let’s use that as a simplistic measure of ‘goodness’ for the two different models below. Remember that when using this as our metric, we want a smaller sum because that means the predicted and observed values are closer together. Let’s fit the model from above, with the square root of age and an interaction between sex and age, and a second model that also includes the day during the month we recorded them and an interaction term with sex. Is there any reason you can think of why, on later days in a 30-day calendar month, there should be some reason that individuals weigh more, and that this effect should be more pronounced (or less pronounced) for male than female bison? Do they perhaps get fed more at the beginning of the month because there was a new food shipment, and then later in the month males are more pushy so they continue to eat the dwindling food supply while females don’t get as much? Do they have monthly check-ins with a vet and are trying to watch their weight towards the end of the month? Is the grass mowed at the start of each calendar month so there is less food available and it is tied to a 12-month calendar? And for some reason this affects one sex more than the other? Most of these explanations seem pretty far-fetched, so even though the SSR is smaller for the model that includes this odd interaction term, it doesn’t really make biological sense, which means it is probably not as good of a model as our much more interpretable one. Sure, there could be better models than our original one that are also biologically interpretable, but the one I’ve proposed likely isn’t one.

calc_ssr <- function(mod){
  sum(residuals(mod)^2)
}


ssr1 <- calc_ssr(lm(weight ~ age.sq*sex, data=dat))
ssr2 <- calc_ssr(lm(weight ~ age.sq*sex + RecDay*sex, data=dat))

ssr1 - ssr2

[1] 1056229

ssr1 > ssr2 # so model two is better?

[1] TRUE

8.1.2 The effects are additive

Most of the time, the way you have conceptualized your model will mean you are meeting this assumption. A lot of the times in ecology when we have to deal with non-additive effects, we will have already done this through some sort of data transformation (e.g., things like exponentiation or logarithms) and still fit things in an additive manner. But, sometimes you really just need a model that just fits the data with a somewhat uninterpretable effect (i.e., nothing like a slope that we are used to seeing). For example, maybe you’ve got an effect that you think only kicks in at some point. We talked about drowning baby birds very early in the semester, and the potential for them to exhibit some either adaptive or plastic responses to sea level rise by having chicks climb to the top of nests to survive high tide. If, as has been hypothesized, they cannot actually climb until their bones are fully ossified, this effect may not kick in until say day six or seven. What we are dealing with in that case would be a step function or a hurdle model, where the predictor variable has to exceed some threshold before it has an effect. We could force this into a linear context, but it isn’t immediately clear how ‘additive’ it is (…but you can force most models to conform if you want to, hence however you conceptualize it is typically linear). Similarly, things like splines or local regression will fit different models to different portions of the data, so aren’t really additive in the broad sense but are within smaller areas, so we can kind of force most things to be additive. If you’re interested in this general question of modeling things without additive, linear effects, you should probably venture into GAM (Generalized Additive Model) world, which is something we won’t talk about in this course. Think of them sort of like just chucking all your data in a hat and seeing what comes out. As perhaps a more relatable example, ask any ‘AI’ model how it knows a picture of a cat is a cat, and it would probably tell you something like “it’s, y’know, kinda cat-like” and wave its hands a bit. A lot of ML models are sorta in this realm, and then people will fit post-hoc GAMs to be able to explain how the model eventually got to its conclusion. To be clear: GAMs are still additive, but they are kinda in the realm of “…yeah, but does my model really have to meet all those assumptions?”

8.1.3 Effects are linear

Same general comment as above: if you have conceptualized your model as having linear, additive effects, there is a pretty good chance you’re good to go with modeling your data as if it has linear, additive effects.

8.1.4 Errors are independent

When we talk about errors in this context, we are referring to the residuals of the model (not things like process or observational error). Remember that our ‘errors’ (or residuals) are the differences between the predicted and observed values. Let’s think about some reasons these errors might not be independent. Typically, in ecology this arises from either some sort of hierarchical grouping structure or an unmeasured covariate that would explain why we have ‘groups’ showing up in our errors. For example, what if for some reason we fit our bison model without including sex as a covariate?

errors <- residuals(lm(weight ~ age, data = dat))
hist(errors, 100)

plot(weight ~ age, data=dat)
abline(lm(weight ~ age, data=dat))

# looks like those big errors are maybe due to some grouping...

We aren’t going to talk about non-independent errors right now, because we will spend a lot of time on these later in the semester when we get to mixed effects models and hierarchical models. For now, remember that you should think about potential covariate that might explain structure in your errors!

8.1.5 Errors have equal variance (homoscedasticity)

What we mean by homoscedasticity is that the spread in the errors is the same across all different values or levels of the predicted value. So if the errors were homoscedastic, they should basically just be a big blob with no trend line across the line of best fit. What would be especially bad here is a trend line, fan shape, or really anything you could draw a discernible shape around. Fish-shaped = bad. Pufferfish-shaped = probably good. If we see variance in the errors showing a pattern over predicted values, we are probably missing a covariate that might explain that trend, or there is something really bad about our overall fit. For example, if we have a fairly flat relationship between predicted values and errors at small values, but a large error at larger values, that might mean our model is just pretty bad at capturing the range of values in our data. If there is a strong trend, maybe we’ve got a missing covariate (or several!). What would indicate that this is probably not a big problem is just having a nice, fluff cloud of residuals with no clear patterns.

plot(fitted(mod1) ~ residuals(mod1))

8.1.6 Errors are normally distributed

While a straight-up linear model is not often done in ecology (our data typically don’t meet the assumptions - we will get to this soon in the course!), I would guess that this is most likely either the most, or second most, common issue with models. It is either this, or the assumption about independent errors, that are violated most often and lead us to other types of models. So, let’s take a look at our errors. We will go back to the most sensible model we have fit so far for this check. This plot doesn’t look too bad. We’ve got a lot of data, and we do expect our weight response variable to be normally distributed, which often (but not always!) means the errors are also often normally distributed.

hist(residuals(mod.int))

qqnorm(residuals(mod.int)) # residual quantiles 
qqline(residuals(mod.int), col="red") # theoretical quantiles for normal

8.2 Diagnostic plots in R

By default, R has some diagnostic plots for visually inspecting the goodness of our models to our data, mostly based on residuals and assumptions of linear models. To view them, you can call plot() on your model.

par(mfrow=c(2,2), las=1) # 2 rows, 2 column layout
plot(mod.int)

First is the residuals vs fitted plot. The residuals vs fitted plot helps us check if our errors vary with magnitude, i.e. do we have larger errors for larger predicted values? If we see a trend here, that means our model may only be good over some ranges of values. If this relationship is relatively flat, that means we should not be concerned about the model estimates being biased at certain range of our observed data.

Next is the Normal Q-Q plot. Remember that one of the assumptions of linear models is that the error are normally distributed. If the errors are normally distributed, they should fall along the line matching up with theoretical quantiles. This is just a visual check, so it is not really diagnostic, but can give us a good gut check on whether or nor the errors are normally distributed. There seems to be a fairly long tail in the lower end that deviates from the assumption, which we might have anticipated from our histogram of the response variable.

The third default diagnostic plot is Scale-Location, which helps us understand if our errors have equal variance (another assumption of linear models: homoscedasticity). Again, a relatively flat line here passes the gut check that model assumptions are met.

The final visual diagnostic check is residuals versus leverage. If data points have large leverage, you can interpret this as meaning that the estimates would change quite a bit if they were removed from the dataset before fitting the model. If particularly large, or particularly small, data points have high leverage, they would stand out in this plot and could indicate issues with the model fit or data.

8.3 What about the data?

Besides meeting the most basic assumptions of a linear model, we may also want to know if our model actually describes our data well. We can do this via model evaluation.

An aside ramble: a lot of textbooks, papers, etc. will discuss model validation. As a quibble, there is really no such thing as model validation. Validation implies that we can test and validate a model, meaning we can confirm or prove it. The word ‘prove’ should set your hackles up, because never actually ‘prove’ anything in science. We just fail to disprove stuff. I’m personally still not sure the moon isn’t made of green cheese (unlikely, but… I haven’t disproven it; can’t prove it isn’t either). So, when you hear anyone talk about model ‘validaton’ you should pause for a second to think about what they actually mean!

We will start with $R^2$, which is a measure of how much your model matches the observed data. It is a measure of how much of the total variance in the response is described by the predictor variable(s). It is calculated as the sum of the squared residuals (sounds familiar!) divided by the total sum of squares. In equation form:

\[ SSR = \sum{(y_i - \hat{y_i})^2} \\ TSS = \sum(y_i - \bar{y_i})^2)\] Let’s calculate $R^2$ for two of the models we fit earlier to see how much of the variance in the response they describe.

ssr <- function(mod){
 sum(residuals(mod)^2) 
}

tss <- function(yi){
  sum((yi - mean(yi))^2)
}

calc_R2 <- function(mod, yi){
  1 - ssr(mod)/tss(yi)
}

calc_R2(mod1, mod1$model$weight)

[1] 0.4749924

summary(mod1)


Call:
lm(formula = weight ~ age + sex, data = dat)

Residuals:
    Min      1Q  Median      3Q     Max 
-835.03 -141.93  -19.71  125.43  961.16 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  404.216      9.585   42.17   <2e-16 ***
age           58.855      1.328   44.32   <2e-16 ***
sex          213.787     10.183   21.00   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 235.2 on 2282 degrees of freedom
  (43 observations deleted due to missingness)
Multiple R-squared:  0.475, Adjusted R-squared:  0.4745 
F-statistic:  1032 on 2 and 2282 DF,  p-value: < 2.2e-16

8.4 Overparameterizing

The more parameters we add to our model, the more we will be able to describe the data. Even if we just add random noise with a parameter value, we will still be able to describe our response variable better because the model has more to work with. If you take it to the extreme end, let’s imagine we have just as many covariates (or parameters) in our model as we do observations of our response variable. We could perfectly predict every data point because we would have the flexibility to use different parameter combinations for every data point and end up essentially drawing a line to connect all the dots. That’s why we aim to simplify our models so that we can actually make inference on them and explain our results, rather than just adding more and more parameters until we are essentially just building an airplane instead of a model of an airplane.

When we overparameterize a model, we also often overfit it, but those two concepts are not synonymous. A model that is overfit means that it is overly reliant on the data itself and really describes those data really well but is not generalizable. If you moved it to a new context, it would not perform well.

8.5 Within- and out-of-sample prediction

We can use within-sample (WS) and out-of-sample (OOS) prediction as one way of evaluating our model. An overfit model will perform really, really well on within sample data because it is fit to those data, but will perform poorly when confronted with new observations that it was not built with. This brings us to the idea of cross-validation (but remember, we don’t ever really ‘validate’ a model!) in which we set aside some parts of the data to build the model, and then test how well it performs on data that weren’t used to fit it. Let’s start simple and divide our data into training and evaluation datasets.

train <- as.logical(rbinom(n = nrow(dat), size=1, prob=0.2))

modA <- lm(weight ~ age.sq * sex, data=dat[train,])

dat.eval <- dat[!train, c('weight', 'age.sq', 'sex')]
desA <- as.matrix(cbind(1, dat.eval[,2:3], dat.eval[,2]*dat.eval[,3]))

y.eval <- dat.eval[,1]
y.pred <- desA%*%coefficients(modA)
plot(y.eval ~ y.pred)

res.train <- sum(residuals(modA)^2)
res.eval <- sum((y.eval - y.pred)^2, na.rm=T)

We could then look at a bunch of model evaluation metrics to see how well our model performs on the within versus out of sample data, and maybe consider alternative specifications that make it more generalizable. We can also extend this concept and rather than just doing one training and one evaluation dataset, we could split our data up into three, or five, or 10, or 100 different folds and repeat this process to get a sense of how the model performs on out of sample data. This is called k-fold cross-validation where $k$ is the number of folds. We aren’t going to go into this in any level of detail whatsoever in this course, but it is worth looking into if this is the sort of thing you may need for the types of models you use in your research.

9 Prediction and inference

9.1 Interpreting coefficients

The coefficient estimates that we get out of our model describe the relationships between the variables that we have included in our model. They are the answer to the question implied by our models when we fit them to test our hypotheses. If we think back to earlier in the semester and asking good questions with interesting questions, we might want to know something like “How does the weight of bison change across their lifespan, and how does it vary by sex?” rather than something like “Are male bison bigger than females?”. Our model answers the first question, and the coefficients are the answer.

Before we unpack these results for our best model (so far) with the interaction term, let’s go back to one of our earlier models that wasn’t very good, where we modeled weight as a function of age and sex separately, with no interaction term.

mod1


Call:
lm(formula = weight ~ age + sex, data = dat)

Coefficients:
(Intercept)          age          sex  
     404.22        58.86       213.79

# the first coefficient is our intercept. this tells us the expected weight for
# a 0 year old, female bison

coefficients(mod1)[1]

(Intercept) 
   404.2158

# the second parameter for age tells us how much more, or less, a bison weighs
# for each additional year older it is
coefficients(mod1)[2]

     age 
58.85509

# the third coefficient for sex is the difference we need to add or subtract
# if a bison is male (because we coded female = 0, male = 1)
coefficients(mod1)[3]

     sex 
213.7865

mod.int


Call:
lm(formula = weight ~ age.sq + sex + age.sq * sex, data = dat)

Coefficients:
(Intercept)       age.sq          sex   age.sq:sex  
      229.0        231.4       -690.2        513.4

# the first coefficient is our intercept. this tells us the expected weight for
# a 0 year old, female bison

# the second parameter for age.sq tells us how much more, or less, a bison weighs
# for each additional increase of that covariate
# if our covariate was just age, it would be for each additional year older it is
# because it is the square root of age, it is the increased weight for
# each additional square root of years a bison ages
# this is why it tapers off over time

sqrt(1)*coefficients(mod.int)[2]

  age.sq 
231.3685

sqrt(2)*coefficients(mod.int)[2]

  age.sq 
327.2045

sqrt(3)*coefficients(mod.int)[2]

 age.sq 
400.742

sqrt(4)*coefficients(mod.int)[2]

  age.sq 
462.7371

# the third coefficient for sex is the difference we need to add or subtract
# if a bison is male

# our final coefficient in this model is the interaction of the square root of 
# age and sex. it will behave similarly to the second coefficient, but only be 
# applied if a bison is male. this means we have an additional effect
# of the square root of age that only applies for male bison, allowing them
# to grow really, really fast early on and then taper off

9.2 Predicting

Instead of creating a design matrix and multiplying it by our vector of coefficient estimates, there is a built-in function in R to make predictions from a model called predict. If we run predict on our model without any other arguments, it will return the expected value for every observation in our dataset that we used to fit the model. We can also pass new data to the predict function to predict across a range of responses, which is typically what we are actually interested in doing because we want to make inference about the general relationship, not specific individuals. We can also use it to extrapolate beyond the range of values used to fit the model (but shouldn’t!).

# expected value for each observation in our dataset
predict(mod.int)

     2421      2422      2423      3932      3933      3934      2424      3935 
1314.1873 1289.2365 1289.2365 1289.2365 1289.2365 1289.2365 1263.6842 1263.6842 
     2425      3135      3137      6120      6121      6482      7573      1415 
1237.4848 1237.4848 1237.4848 1237.4848 1237.4848 1237.4848 1237.4848 1210.5862 
     2426      2427      2428      2429      3140      3532      3533      3936 
1210.5862 1210.5862 1210.5862 1210.5862 1210.5862 1210.5862 1210.5862 1210.5862 
     5066      5732      6483      6833      7957      7958      1418      2790 
1210.5862 1210.5862 1210.5862 1210.5862 1210.5862 1210.5862 1182.9296 1182.9296 
     3142      3534      5067      5733      6484      7204      7205      7206 
1182.9296 1182.9296 1182.9296 1182.9296 1182.9296 1182.9296 1182.9296 1182.9296 
     7207      7208      7575      7959      7960      2057      2793      3146 
1182.9296 1182.9296 1182.9296 1182.9296 1182.9296 1154.4468 1154.4468 1154.4468 
     3147      3148      3149      3937      4720      5068      5379      5381 
1154.4468 1154.4468 1154.4468 1154.4468 1154.4468 1154.4468 1154.4468 1154.4468 
     5735      5736      6124      6125      6127      6486      6835      6837 
1154.4468 1154.4468 1154.4468 1154.4468 1154.4468 1154.4468 1154.4468 1154.4468 
     7212      7213      7579      7961      7962      7963       149       901 
1154.4468 1154.4468 1154.4468 1154.4468 1154.4468 1154.4468 1125.0592 1125.0592 
     1147      1421      1425      3150      3938      3939      4339      4723 
1125.0592 1125.0592 1125.0592 1125.0592 1125.0592 1125.0592 1125.0592 1125.0592 
     5383      5385      5386      5387      5388      6130      6131      6490 
1125.0592 1125.0592 1125.0592 1125.0592 1125.0592 1125.0592 1125.0592 1125.0592 
     6493      6843      7215      7582      7583      7584      7964      7965 
1125.0592 1125.0592 1125.0592 1125.0592 1125.0592 1125.0592 1125.0592 1125.0592 
     7966      7967      7968      7969       286       625      1727      1728 
1125.0592 1125.0592 1125.0592 1125.0592 1094.6745 1094.6745 1094.6745 1094.6745 
     2065      5080      5391      5393      6499      6500      6504      6852 
1094.6745 1094.6745 1094.6745 1094.6745 1094.6745 1094.6745 1094.6745 1094.6745 
     7223      7589      7593      7970      7971      7972      7973      1434 
1094.6745 1094.6745 1094.6745 1094.6745 1094.6745 1094.6745 1094.6745 1063.1838 
     1436      1437      2069      3153      3542      3944      4342      4736 
1063.1838 1063.1838 1063.1838 1063.1838 1063.1838 1063.1838 1063.1838 1063.1838 
     5086      5394      5397      5399      5400      5403      5754      7230 
1063.1838 1063.1838 1063.1838 1063.1838 1063.1838 1063.1838 1063.1838 1063.1838 
     7974      7975      7976      7977      7978      7979      7980       487 
1063.1838 1063.1838 1063.1838 1063.1838 1063.1838 1063.1838 1063.1838 1030.4568 
     1167      1168      3948      4350      4351      4741      5759      6513 
1030.4568 1030.4568 1030.4568 1030.4568 1030.4568 1030.4568 1030.4568 1030.4568 
     6865      7237      7599      7981      7982      7983      7984      7985 
1030.4568 1030.4568 1030.4568 1030.4568 1030.4568 1030.4568 1030.4568 1030.4568 
     7986      7987      7988      7989      7990       490       492       918 
1030.4568 1030.4568 1030.4568 1030.4568 1030.4568  996.3353  996.3353  996.3353 
      929      1447      2075      3163      3164      3546      4364      4367 
 996.3353 2009.0155  996.3353  996.3353  996.3353  996.3353  996.3353  996.3353 
     5098      5409      5411      5767      5771      6163      6528      7615 
 996.3353  996.3353  996.3353  996.3353  996.3353  996.3353  996.3353  996.3353 
     7991      7992      7993      7994      7995      7996      7997      7998 
 996.3353  996.3353  996.3353  996.3353  996.3353  996.3353  996.3353  996.3353 
      159       298       299       498      1454      3166      3174      4370 
 960.6242 1894.0556 1894.0556  960.6242 1894.0556  960.6242 1894.0556  960.6242 
     5426      5429      5779      6530      6534      7625      7999      8000 
 960.6242  960.6242  960.6242  960.6242  960.6242  960.6242  960.6242  960.6242 
     8001      8002      8003      8004      8005      8006      8007      8008 
 960.6242  960.6242  960.6242  960.6242  960.6242  960.6242  960.6242  960.6242 
     8009      8010      8011       162       306       307       653       654 
 960.6242  960.6242  960.6242  923.0783  923.0783 1773.1890 1773.1890  923.0783 
      655       656       661      1456      2458      2460      3181      3187 
1773.1890  923.0783  923.0783 1773.1890 1773.1890 1773.1890 1773.1890  923.0783 
     3568      3978      3981      4380      4381      5432      5433      5440 
 923.0783  923.0783  923.0783  923.0783  923.0783  923.0783  923.0783 1773.1890 
     5442      6178      6542      6883      6884      6893      7263      7637 
 923.0783  923.0783 1773.1890  923.0783 1773.1890 1773.1890  923.0783 1773.1890 
     8012      8013      8014      8015      8016      8017      8018      8019 
 923.0783  923.0783  923.0783  923.0783  923.0783  923.0783  923.0783  923.0783 
     8020      8021      8022      8023       171       315       519       663 
 923.0783  923.0783  923.0783  923.0783 1645.3992  883.3817  883.3817  883.3817 
      944       945       946      1751      2092      2466      2467      2469 
 883.3817  883.3817  883.3817  883.3817  883.3817 1645.3992 1645.3992 1645.3992 
     2475      2476      2828      2833      3189      3193      3198      3199 
1645.3992  883.3817  883.3817 1645.3992 1645.3992  883.3817 1645.3992 1645.3992 
     3575      3583      3585      3586      3587      3591      3987      3991 
1645.3992 1645.3992 1645.3992 1645.3992  883.3817 1645.3992 1645.3992 1645.3992 
     3992      3995      3998      3999      4388      4390      4772      4778 
1645.3992 1645.3992 1645.3992 1645.3992  883.3817  883.3817 1645.3992 1645.3992 
     4780      4781      4782      4788      4790      5128      5132      5134 
 883.3817 1645.3992 1645.3992  883.3817 1645.3992 1645.3992 1645.3992 1645.3992 
     5136      5444      5445      5446      5451      5452      5453      5456 
1645.3992  883.3817  883.3817 1645.3992 1645.3992 1645.3992 1645.3992 1645.3992 
     5458      5799      5803      5804      5806      5807      5808      6184 
 883.3817 1645.3992 1645.3992 1645.3992 1645.3992 1645.3992 1645.3992 1645.3992 
     6186      6191      6192      6194      6555      6556      6558      6897 
1645.3992 1645.3992 1645.3992 1645.3992 1645.3992 1645.3992 1645.3992 1645.3992 
     6898      6902      6903      6905      6906      7272      7274      7277 
1645.3992 1645.3992 1645.3992 1645.3992  883.3817 1645.3992  883.3817 1645.3992 
     7282      7284      7642      7648      7650      7651      7655      8024 
1645.3992 1645.3992 1645.3992  883.3817 1645.3992 1645.3992 1645.3992  883.3817 
     8025      8026      8027      8028      8029      8030      8031      8032 
 883.3817  883.3817 1645.3992  883.3817  883.3817 1645.3992  883.3817 1645.3992 
     8033      8034      8035      8036      8037       179       181       183 
1645.3992  883.3817 1645.3992  883.3817  883.3817 1509.3398 1509.3398 1509.3398 
      328       329       672       948       951       952      1464      1469 
1509.3398  841.1163 1509.3398  841.1163  841.1163  841.1163 1509.3398 1509.3398 
     1754      1759      2483      2843      2848      2854      2855      2856 
 841.1163  841.1163  841.1163  841.1163 1509.3398 1509.3398 1509.3398 1509.3398 
     2857      2858      3207      3215      3216      3597      4003      4007 
1509.3398 1509.3398  841.1163 1509.3398  841.1163 1509.3398 1509.3398 1509.3398 
     4011      4400      4402      4405      4421      4794      5467      5468 
1509.3398 1509.3398 1509.3398 1509.3398  841.1163  841.1163 1509.3398 1509.3398 
     5470      5471      5473      5475      5477      5479      5813      5820 
1509.3398 1509.3398 1509.3398 1509.3398  841.1163 1509.3398  841.1163  841.1163 
     6211      6214      6569      6574      6920      6932      7291      7662 
1509.3398 1509.3398  841.1163 1509.3398 1509.3398  841.1163  841.1163  841.1163 
     8038      8039      8040      8041      8042      8043      8044      8045 
 841.1163  841.1163  841.1163  841.1163  841.1163  841.1163  841.1163  841.1163 
     8046      8047      8048      8049      8050      8051      8052      8053 
 841.1163 1509.3398 1509.3398  841.1163  841.1163 1509.3398 1509.3398  841.1163 
     8054      8055        20       193       194       195       197       198 
 841.1163 1509.3398  795.7076 1363.1615 1363.1615 1363.1615 1363.1615 1363.1615 
      199       339       678       679       683       958       959      1470 
1363.1615 1363.1615 1363.1615  795.7076  795.7076  795.7076  795.7076  795.7076 
     1471      1771      2109      2118      3223      3230      4436      4438 
1363.1615  795.7076  795.7076  795.7076  795.7076  795.7076 1363.1615  795.7076 
     4824      5154      5158      5481      5482      5489      5492      5829 
 795.7076 1363.1615  795.7076 1363.1615  795.7076  795.7076  795.7076  795.7076 
     5835      6218      6232      6233      6237      6947      7306      7317 
1363.1615 1363.1615 1363.1615  795.7076 1363.1615 1363.1615  795.7076  795.7076 
     7319      7684      7687      8056      8057      8058      8059      8060 
 795.7076  795.7076  795.7076  795.7076 1363.1615  795.7076 1363.1615  795.7076 
     8061      8062      8063      8064      8065      8066      8067      8068 
 795.7076  795.7076  795.7076  795.7076  795.7076 1363.1615 1363.1615  795.7076 
     8069      8070      8071       206       207       208       209       347 
 795.7076 1363.1615  795.7076 1204.2021  746.3285  746.3285  746.3285 1204.2021 
      348       349       687       690       692       693       695       960 
1204.2021 1204.2021 1204.2021 1204.2021  746.3285  746.3285  746.3285  746.3285 
      963       964       969      1202      1782      1788      2137      3247 
 746.3285  746.3285  746.3285 1204.2021  746.3285  746.3285  746.3285 1204.2021 
     3253      3254      3625      4442      4452      4458      4463      4825 
1204.2021  746.3285  746.3285  746.3285  746.3285 1204.2021  746.3285  746.3285 
     4843      4848      5182      5184      5503      5507      5856      5867 
 746.3285 1204.2021  746.3285  746.3285  746.3285  746.3285  746.3285  746.3285 
     6239      6245      6255      6260      6601      6610      6616      6966 
1204.2021  746.3285  746.3285  746.3285 1204.2021  746.3285 1204.2021  746.3285 
     7695      7699      7709      7710      8072      8073      8074      8075 
1204.2021  746.3285 1204.2021  746.3285 1204.2021  746.3285  746.3285  746.3285 
     8076      8077      8078      8079      8080      8081      8082      8083 
 746.3285  746.3285 1204.2021  746.3285 1204.2021  746.3285  746.3285  746.3285 
     8084      8085      8086      8087      8088      8089      8090      8091 
1204.2021  746.3285 1204.2021  746.3285  746.3285  746.3285  746.3285  746.3285 
     8092      8093        34        35        36        37        38        39 
 746.3285 1204.2021 1028.3755 1028.3755 1028.3755 1028.3755 1028.3755 1028.3755 
       40        41        42        43        44       216       219       220 
1028.3755 1028.3755 1028.3755 1028.3755 1028.3755 1028.3755 1028.3755 1028.3755 
      698       699       702       703       704       705       707       711 
 691.7098  691.7098 1028.3755 1028.3755  691.7098  691.7098  691.7098  691.7098 
      712       713       718       973      1500      1796      1799      1810 
 691.7098  691.7098  691.7098  691.7098  691.7098  691.7098  691.7098  691.7098 
     1817      2156      2161      3277      3647      3659      3664      4465 
 691.7098  691.7098  691.7098  691.7098  691.7098  691.7098  691.7098  691.7098 
     4483      4485      4489      4856      5517      5526      5531      6273 
1028.3755  691.7098  691.7098  691.7098  691.7098 1028.3755  691.7098  691.7098 
     6274      6283      6626      8094      8095      8096      8097      8098 
1028.3755 1028.3755 1028.3755  691.7098  691.7098 1028.3755 1028.3755 1028.3755 
     8099      8100      8101      8102      8103      8104      8105      8106 
 691.7098  691.7098  691.7098  691.7098  691.7098  691.7098  691.7098  691.7098 
     8107      8108      8109      8110      8111      8112      8113      8114 
 691.7098  691.7098  691.7098 1028.3755 1028.3755  691.7098  691.7098 1028.3755 
       48        54        55        56        57        58        61        62 
 828.8033  828.8033  828.8033  629.7148  629.7148  629.7148  629.7148  629.7148 
       63        64        65        66        67        68        69        70 
 629.7148  828.8033  828.8033  629.7148  629.7148  828.8033  629.7148  629.7148 
       71       222       223       224       225       226       361       362 
 828.8033  828.8033  629.7148  828.8033  828.8033  828.8033  828.8033  828.8033 
      363       364       366       367       368       369       370       371 
 828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  828.8033 
      372       373       374       379       381       382       383       384 
 828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  828.8033 
      386       387       388       545       721       722       723       725 
 828.8033  828.8033  629.7148  629.7148  828.8033  828.8033  828.8033  828.8033 
      726       728       729       730       731       732       734       737 
 828.8033  828.8033  629.7148  828.8033  828.8033  828.8033  629.7148  629.7148 
      738       739       740       741       742       743       744       745 
 629.7148  828.8033  828.8033  629.7148  828.8033  828.8033  828.8033  828.8033 
      747       748       750       751       752       756       757       760 
 828.8033  629.7148  828.8033  828.8033  828.8033  828.8033  828.8033  629.7148 
      761       762       763       765       986       997       998      1227 
 828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  629.7148 
     1228      1229      1230      1231      1232      1233      1234      1236 
 629.7148  629.7148  629.7148  828.8033  828.8033  629.7148  828.8033  629.7148 
     1237      1238      1239      1240      1241      1243      1244      1246 
 629.7148  629.7148  828.8033  629.7148  828.8033  828.8033  629.7148  828.8033 
     1247      1249      1250      1253      1255      1256      1257      1258 
 828.8033  828.8033  629.7148  828.8033  828.8033  828.8033  629.7148  828.8033 
     1259      1260      1261      1262      1263      1266      1267      1268 
 828.8033  828.8033  629.7148  828.8033  828.8033  828.8033  629.7148  828.8033 
     1271      1519      1520      1521      1523      1524      1525      1526 
 629.7148  828.8033  828.8033  629.7148  828.8033  828.8033  828.8033  828.8033 
     1528      1529      1531      1532      1533      1534      1537      1538 
 828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  828.8033 
     1540      1541      1544      1546      1547      1548      1549      1823 
 629.7148  828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  629.7148 
     1824      1825      1827      1828      1829      1830      1831      1832 
 828.8033  828.8033  629.7148  828.8033  828.8033  828.8033  828.8033  828.8033 
     1833      1834      1836      1838      1841      1842      1843      1844 
 629.7148  828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  828.8033 
     1847      1848      1849      1851      1854      1857      1859      1860 
 828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  629.7148  828.8033 
     1861      1862      1863      1868      2171      2173      2175      2176 
 828.8033  828.8033  828.8033  629.7148  629.7148  828.8033  828.8033  629.7148 
     2177      2178      2183      2184      2185      2187      2189      2190 
 828.8033  828.8033  828.8033  828.8033  629.7148  828.8033  828.8033  828.8033 
     2191      2194      2195      2196      2197      2198      2199      2200 
 828.8033  629.7148  828.8033  828.8033  629.7148  629.7148  828.8033  828.8033 
     2201      2202      2203      2205      2208      2209      2210      2553 
 828.8033  629.7148  828.8033  828.8033  629.7148  629.7148  828.8033  828.8033 
     2554      2555      2556      2557      2558      2559      2563      2564 
 828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  629.7148 
     2565      2566      2567      2568      2569      2572      2573      2574 
 828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  828.8033 
     2576      2581      2583      2584      2585      2586      2587      2588 
 828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  629.7148 
     2589      2590      2593      2594      2595      2599      2600      2601 
 828.8033  629.7148  828.8033  828.8033  828.8033  828.8033  828.8033  828.8033 
     2604      2605      2607      2608      2609      2610      2919      2920 
 828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  828.8033 
     2921      2922      2926      2929      2931      2932      2935      2936 
 828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  828.8033 
     2940      2942      2943      2947      2948      2949      2950      2951 
 828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  828.8033 
     2952      2953      2954      2955      2957      2961      2962      2964 
 828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  629.7148  828.8033 
     2965      2966      2967      2968      2969      2971      2972      2973 
 828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  828.8033 
     2974      2975      2976      2978      2979      2980      2982      2984 
 828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  828.8033 
     3284      3288      3289      3290      3294      3295      3296      3297 
 828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  828.8033 
     3299      3303      3309      3310      3311      3314      3316      3317 
 828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  629.7148  828.8033 
     3318      3319      3321      3322      3324      3668      3669      3670 
 828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  828.8033 
     3671      3672      3674      3675      3677      3678      3683      3686 
 828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  828.8033 
     3687      3688      3690      3691      3694      3695      3699      3700 
 828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  828.8033 
     3701      3704      4082      4083      4084      4087      4088      4089 
 828.8033  828.8033  629.7148  828.8033  828.8033  629.7148  828.8033  828.8033 
     4090      4092      4093      4094      4095      4096      4099      4100 
 629.7148  828.8033  828.8033  828.8033  629.7148  828.8033  828.8033  828.8033 
     4101      4102      4105      4106      4109      4113      4115      4120 
 629.7148  828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  629.7148 
     4121      4122      4124      4125      4128      4130      4132      4133 
 828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  828.8033  629.7148 
     4137      4138      4140      4141      4493      4494      4495      4496 
 828.8033  828.8033  828.8033  828.8033  629.7148  828.8033  629.7148  629.7148 
     4497      4498      4499      4500      4501      4503      4504      4505 
 828.8033  828.8033  629.7148  629.7148  828.8033  629.7148  629.7148  629.7148 
     4507      4508      4509      4510      4511      4512      4513      4514 
 828.8033  828.8033  629.7148  828.8033  828.8033  629.7148  629.7148  828.8033 
     4515      4519      4520      4522      4523      4524      4525      4529 
 629.7148  828.8033  828.8033  629.7148  828.8033  629.7148  629.7148  828.8033 
     4530      4531      4532      4533      4534      4535      4537      4538 
 629.7148  828.8033  828.8033  629.7148  629.7148  828.8033  629.7148  828.8033 
     4543      4545      4546      4548      4550      4551      4552      4553 
 629.7148  828.8033  828.8033  828.8033  828.8033  629.7148  629.7148  828.8033 
     4554      4556      4557      4558      4871      4873      4874      4875 
 629.7148  828.8033  828.8033  828.8033  629.7148  828.8033  629.7148  629.7148 
     4877      4878      4879      4880      4881      4887      4888      4890 
 828.8033  828.8033  828.8033  828.8033  629.7148  629.7148  629.7148  828.8033 
     4891      4892      4894      4897      4898      4899      4901      4902 
 828.8033  828.8033  629.7148  629.7148  629.7148  828.8033  828.8033  828.8033 
     4906      4908      4909      4911      4912      5209      5213      5214 
 629.7148  629.7148  828.8033  629.7148  828.8033  629.7148  629.7148  828.8033 
     5215      5218      5219      5220      5221      5222      5223      5229 
 629.7148  828.8033  629.7148  629.7148  828.8033  828.8033  629.7148  629.7148 
     5231      5235      5236      5238      5239      5240      5242      5243 
 828.8033  828.8033  828.8033  629.7148  828.8033  828.8033  828.8033  828.8033 
     5244      5540      5541      5542      5543      5545      5546      5547 
 828.8033  828.8033  828.8033  629.7148  828.8033  828.8033  629.7148  629.7148 
     5548      5552      5554      5555      5556      5557      5559      5561 
 828.8033  828.8033  828.8033  629.7148  629.7148  629.7148  828.8033  828.8033 
     5562      5563      5564      5571      5574      5575      5576      5577 
 828.8033  828.8033  828.8033  629.7148  828.8033  629.7148  629.7148  828.8033 
     5579      5585      5586      5896      5901      5902      5903      5904 
 828.8033  629.7148  828.8033  629.7148  828.8033  828.8033  828.8033  629.7148 
     5905      5906      5907      5909      5912      5914      5917      5918 
 828.8033  828.8033  828.8033  629.7148  629.7148  629.7148  828.8033  629.7148 
     5923      5924      5925      5929      5931      5932      5938      5939 
 828.8033  828.8033  629.7148  828.8033  828.8033  828.8033  828.8033  629.7148 
     6291      6292      6295      6297      6298      6299      6300      6302 
 828.8033  828.8033  828.8033  828.8033  629.7148  828.8033  629.7148  629.7148 
     6303      6306      6307      6308      6309      6310      6311      6312 
 629.7148  828.8033  828.8033  828.8033  629.7148  828.8033  629.7148  828.8033 
     6313      6314      6316      6317      6318      6320      6321      6324 
 828.8033  629.7148  828.8033  828.8033  629.7148  629.7148  828.8033  629.7148 
     6325      6327      6328      6330      6332      6333      6334      6336 
 629.7148  629.7148  828.8033  828.8033  629.7148  629.7148  828.8033  629.7148 
     6644      6645      6648      6654      6657      6659      6665      6666 
 629.7148  629.7148  629.7148  629.7148  629.7148  629.7148  629.7148  629.7148 
     6667      6670      6671      6994      6996      6997      6999      7003 
 629.7148  629.7148  629.7148  629.7148  629.7148  629.7148  629.7148  828.8033 
     7006      7013      7019      7362      7363      7364      7368      7373 
 629.7148  629.7148  629.7148  629.7148  629.7148  629.7148  629.7148  629.7148 
     7376      7377      7385      7391      7392      7734      7737      7742 
 629.7148  629.7148  629.7148  629.7148  629.7148  629.7148  629.7148  629.7148 
     7748      7749      7751      7753      7756      7758      8115      8116 
 629.7148  629.7148  629.7148  629.7148  629.7148  629.7148  828.8033  629.7148 
     8117      8118      8119      8120      8121      8122      8123      8124 
 629.7148  828.8033  629.7148  629.7148  629.7148  629.7148  629.7148  828.8033 
     8125      8126      8127      8128      8129      8130      8131      8132 
 828.8033  629.7148  629.7148  629.7148  629.7148  629.7148  629.7148  629.7148 
     8133      8134      8135      8136      8137      8138      8139      8140 
 629.7148  828.8033  629.7148  629.7148  629.7148  629.7148  629.7148  629.7148 
     8141        73        74        75        76        77        79        80 
 828.8033  556.1772  556.1772  592.0738  592.0738  556.1772  592.0738  556.1772 
       81        82        84        85        86        87        88        89 
 592.0738  592.0738  556.1772  556.1772  592.0738  592.0738  556.1772  592.0738 
       90        91        92        94        96        97        98        99 
 592.0738  556.1772  556.1772  556.1772  592.0738  592.0738  556.1772  556.1772 
      100       101       102       247       390       391       392       393 
 556.1772  556.1772  556.1772  556.1772  592.0738  592.0738  592.0738  592.0738 
      395       397       398       399       402       403       405       407 
 592.0738  592.0738  592.0738  592.0738  592.0738  592.0738  556.1772  592.0738 
      410       414       416       420       424       428       429       561 
 556.1772  592.0738  592.0738  592.0738  592.0738  556.1772  556.1772  592.0738 
      766       768       769       771       772       774       775       777 
 556.1772  592.0738  556.1772  592.0738  556.1772  556.1772  592.0738  556.1772 
      779       780       781       782       783       784       785       786 
 592.0738  556.1772  556.1772  556.1772  556.1772  592.0738  592.0738  556.1772 
      787       788       789       791       793       796       797       799 
 556.1772  592.0738  556.1772  556.1772  592.0738  592.0738  556.1772  556.1772 
      801       802       803       804       805       806       807       808 
 592.0738  556.1772  556.1772  592.0738  556.1772  556.1772  556.1772  556.1772 
      810       811       812       813       814       815      1005      1009 
 592.0738  556.1772  556.1772  556.1772  556.1772  592.0738  592.0738  592.0738 
     1023      1038      1052      1055      1273      1274      1279      1280 
 592.0738  592.0738  592.0738  592.0738  592.0738  556.1772  592.0738  592.0738 
     1283      1285      1290      1296      1301      1302      1307      1309 
 556.1772  592.0738  556.1772  592.0738  556.1772  592.0738  592.0738  592.0738 
     1310      1315      1318      1319      1562      1564      1566      1569 
 592.0738  556.1772  592.0738  556.1772  556.1772  592.0738  592.0738  556.1772 
     1575      1576      1582      1588      1589      1594      1599      1605 
 556.1772  556.1772  556.1772  556.1772  592.0738  592.0738  592.0738  556.1772 
     1609      1612      1871      1875      1876      1877      1881      1882 
 592.0738  592.0738  556.1772  592.0738  592.0738  556.1772  592.0738  556.1772 
     1883      1889      1890      1891      1892      1901      1902      1905 
 592.0738  556.1772  556.1772  592.0738  556.1772  556.1772  556.1772  592.0738 
     1906      1909      1910      1912      1913      1927      1929      2214 
 556.1772  556.1772  592.0738  556.1772  592.0738  556.1772  556.1772  556.1772 
     2218      2220      2221      2222      2228      2232      2234      2236 
 556.1772  556.1772  556.1772  556.1772  556.1772  556.1772  556.1772  592.0738 
     2239      2245      2247      2250      2273      2280      2283      2284 
 556.1772  556.1772  556.1772  556.1772  556.1772  556.1772  556.1772  556.1772 
     2285      2287      2290      2291      2292      2616      2617      2618 
 556.1772  556.1772  556.1772  556.1772  556.1772  556.1772  556.1772  556.1772 
     2621      2622      2631      2637      2638      2640      2647      2651 
 556.1772  556.1772  556.1772  556.1772  556.1772  556.1772  556.1772  556.1772 
     2652      2653      2657      2661      2663      2666      2668      2671 
 556.1772  556.1772  556.1772  556.1772  556.1772  556.1772  556.1772  556.1772 
     2672      2673      2695      2696      2699      2989      2999      3013 
 556.1772  556.1772  556.1772  556.1772  556.1772  556.1772  556.1772  556.1772 
     3020      3025      3027      3030      3035      3036      3327      3328 
 556.1772  556.1772  556.1772  556.1772  556.1772  556.1772  556.1772  592.0738 
     3329      3333      3334      3335      3337      3338      3342      3343 
 592.0738  592.0738  592.0738  556.1772  556.1772  592.0738  556.1772  556.1772 
     3344      3351      3358      3359      3360      3361      3363      3365 
 556.1772  592.0738  556.1772  592.0738  556.1772  592.0738  592.0738  592.0738 
     3369      3370      3373      3374      3375      3376      3377      3378 
 592.0738  592.0738  556.1772  592.0738  556.1772  592.0738  592.0738  556.1772 
     3379      3380      3382      3394      3395      3398      3402      3706 
 556.1772  556.1772  556.1772  556.1772  556.1772  556.1772  592.0738  592.0738 
     3711      3713      3715      3716      3723      3724      3728      3729 
 556.1772  556.1772  556.1772  556.1772  592.0738  556.1772  592.0738  592.0738 
     3732      3734      3735      3736      3737      3743      3744      3746 
 556.1772  556.1772  556.1772  592.0738  592.0738  592.0738  556.1772  592.0738 
     3748      3750      3754      3758      3760      3761      3762      3764 
 556.1772  556.1772  592.0738  556.1772  592.0738  556.1772  592.0738  592.0738 
     3765      3768      3769      3770      3771      3775      3776      3778 
 592.0738  592.0738  592.0738  592.0738  556.1772  592.0738  592.0738  592.0738 
     3779      3781      3783      3784      3792      3794      3795      3796 
 592.0738  556.1772  556.1772  556.1772  592.0738  556.1772  556.1772  592.0738 
     3799      3800      3802      3804      3809      3810      3812      4142 
 556.1772  556.1772  556.1772  556.1772  592.0738  592.0738  592.0738  556.1772 
     4147      4153      4155      4165      4172      4173      4174      4178 
 592.0738  556.1772  556.1772  556.1772  556.1772  556.1772  592.0738  592.0738 
     4179      4181      4184      4185      4188      4189      4191      4194 
 556.1772  592.0738  592.0738  592.0738  556.1772  556.1772  556.1772  556.1772 
     4195      4197      4201      4203      4206      4208      4210      4211 
 556.1772  592.0738  592.0738  592.0738  592.0738  592.0738  556.1772  592.0738 
     4216      4218      4220      4223      4225      4226      4227      4229 
 556.1772  556.1772  556.1772  592.0738  556.1772  592.0738  556.1772  592.0738 
     4234      4235      4236      4239      4240      4244      4248      4251 
 556.1772  592.0738  592.0738  556.1772  556.1772  556.1772  592.0738  556.1772 
     4559      4561      4563      4564      4567      4568      4570      4571 
 556.1772  592.0738  556.1772  592.0738  592.0738  556.1772  556.1772  556.1772 
     4574      4575      4580      4581      4582      4583      4588      4589 
 592.0738  556.1772  592.0738  592.0738  592.0738  592.0738  592.0738  556.1772 
     4590      4594      4595      4597      4599      4609      4611      4612 
 556.1772  556.1772  556.1772  556.1772  556.1772  556.1772  556.1772  556.1772 
     4616      4619      4622      4623      4624      4627      4628      4629 
 592.0738  592.0738  556.1772  592.0738  592.0738  556.1772  592.0738  592.0738 
     4634      4635      4918      4919      4920      4923      4925      4935 
 556.1772  592.0738  592.0738  592.0738  592.0738  592.0738  556.1772  592.0738 
     4936      4937      4940      4941      4944      4945      4948      4954 
 592.0738  556.1772  592.0738  556.1772  592.0738  556.1772  592.0738  592.0738 
     4955      4956      4958      4959      4965      4967      4968      4969 
 556.1772  556.1772  592.0738  592.0738  592.0738  556.1772  556.1772  556.1772 
     4971      4975      4976      4977      4978      4981      4982      4985 
 592.0738  592.0738  592.0738  556.1772  592.0738  592.0738  592.0738  556.1772 
     5249      5251      5252      5254      5255      5258      5259      5262 
 556.1772  556.1772  592.0738  556.1772  556.1772  592.0738  592.0738  592.0738 
     5268      5271      5273      5274      5283      5285      5287      5292 
 592.0738  556.1772  556.1772  592.0738  556.1772  556.1772  556.1772  556.1772 
     5293      5299      5302      5303      5304      5306      5307      5312 
 556.1772  556.1772  556.1772  556.1772  556.1772  592.0738  592.0738  592.0738 
     5589      5590      5591      5592      5593      5596      5599      5600 
 556.1772  556.1772  592.0738  592.0738  592.0738  592.0738  556.1772  592.0738 
     5602      5604      5605      5606      5608      5610      5612      5617 
 556.1772  592.0738  556.1772  556.1772  556.1772  592.0738  556.1772  556.1772 
     5618      5620      5621      5622      5623      5626      5627      5629 
 556.1772  592.0738  592.0738  592.0738  556.1772  556.1772  556.1772  556.1772 
     5630      5631      5633      5634      5635      5636      5638      5640 
 556.1772  592.0738  592.0738  592.0738  556.1772  592.0738  592.0738  556.1772 
     5641      5642      5643      5644      5646      5649      5651      5942 
 592.0738  592.0738  592.0738  556.1772  592.0738  592.0738  556.1772  556.1772 
     5943      5944      5947      5949      5950      5951      5954      5955 
 556.1772  556.1772  556.1772  556.1772  592.0738  592.0738  592.0738  556.1772 
     5956      5957      5958      5963      5964      5965      5966      5967 
 556.1772  592.0738  556.1772  556.1772  592.0738  556.1772  592.0738  592.0738 
     5968      5977      5979      5980      5982      5983      5984      5989 
 592.0738  592.0738  556.1772  592.0738  592.0738  556.1772  592.0738  556.1772 
     5991      6000      6001      6002      6003      6004      6007      6008 
 592.0738  556.1772  556.1772  556.1772  592.0738  556.1772  592.0738  592.0738 
     6011      6012      6014      6017      6018      6025      6028      6029 
 592.0738  592.0738  556.1772  556.1772  556.1772  592.0738  556.1772  556.1772 
     6030      6031      6032      6034      6037      6038      6039      6040 
 592.0738  592.0738  556.1772  592.0738  556.1772  556.1772  592.0738  592.0738 
     6041      6042      6043      6044      6341      6342      6344      6349 
 592.0738  556.1772  592.0738  556.1772  592.0738  592.0738  592.0738  592.0738 
     6350      6351      6353      6357      6358      6359      6360      6361 
 592.0738  592.0738  556.1772  592.0738  556.1772  592.0738  592.0738  556.1772 
     6362      6365      6367      6368      6370      6371      6375      6376 
 556.1772  592.0738  592.0738  592.0738  592.0738  592.0738  592.0738  592.0738 
     6377      6379      6382      6384      6386      6388      6389      6392 
 556.1772  592.0738  592.0738  592.0738  556.1772  592.0738  556.1772  592.0738 
     6394      6395      6396      6397      6398      6400      6401      6402 
 592.0738  556.1772  592.0738  556.1772  556.1772  592.0738  592.0738  592.0738 
     6404      6407      6409      6411      6412      6413      6414      6675 
 556.1772  592.0738  556.1772  556.1772  592.0738  592.0738  556.1772  556.1772 
     6677      6680      6681      6682      6683      6684      6685      6689 
 592.0738  592.0738  592.0738  556.1772  592.0738  556.1772  592.0738  556.1772 
     6690      6691      6692      6693      6694      6696      6697      6699 
 592.0738  556.1772  556.1772  556.1772  592.0738  592.0738  556.1772  592.0738 
     6702      6704      6705      6707      6712      6715      6716      6717 
 556.1772  556.1772  592.0738  592.0738  556.1772  592.0738  556.1772  592.0738 
     6720      6722      6724      6725      6726      6728      6730      6734 
 592.0738  592.0738  592.0738  556.1772  592.0738  592.0738  592.0738  592.0738 
     6735      6736      7020      7022      7023      7025      7027      7028 
 556.1772  592.0738  592.0738  592.0738  556.1772  556.1772  592.0738  592.0738 
     7030      7031      7032      7033      7034      7037      7039      7040 
 592.0738  556.1772  556.1772  592.0738  556.1772  592.0738  556.1772  592.0738 
     7042      7043      7044      7046      7047      7048      7049      7050 
 556.1772  592.0738  592.0738  556.1772  592.0738  592.0738  592.0738  556.1772 
     7051      7052      7053      7060      7061      7062      7063      7064 
 592.0738  592.0738  592.0738  592.0738  592.0738  592.0738  592.0738  592.0738 
     7066      7068      7069      7070      7071      7073      7074      7075 
 592.0738  592.0738  556.1772  556.1772  556.1772  592.0738  556.1772  592.0738 
     7076      7078      7079      7081      7082      7083      7085      7086 
 556.1772  556.1772  592.0738  592.0738  592.0738  556.1772  556.1772  592.0738 
     7088      7091      7092      7093      7094      7096      7097      7099 
 592.0738  592.0738  556.1772  592.0738  556.1772  556.1772  556.1772  592.0738 
     7104      7105      7106      7108      7110      7111      7112      7393 
 592.0738  592.0738  592.0738  592.0738  592.0738  592.0738  556.1772  592.0738 
     7394      7395      7398      7400      7401      7404      7406      7407 
 556.1772  556.1772  556.1772  592.0738  592.0738  556.1772  592.0738  592.0738 
     7408      7409      7410      7411      7412      7413      7414      7416 
 592.0738  592.0738  592.0738  592.0738  556.1772  556.1772  556.1772  592.0738 
     7417      7418      7419      7420      7421      7422      7423      7424 
 592.0738  556.1772  592.0738  592.0738  592.0738  556.1772  592.0738  592.0738 
     7425      7426      7429      7430      7431      7433      7434      7436 
 592.0738  592.0738  592.0738  556.1772  592.0738  592.0738  556.1772  592.0738 
     7437      7439      7443      7444      7445      7446      7447      7448 
 592.0738  592.0738  592.0738  556.1772  592.0738  592.0738  592.0738  592.0738 
     7450      7451      7453      7454      7455      7458      7459      7460 
 592.0738  556.1772  556.1772  592.0738  592.0738  592.0738  592.0738  592.0738 
     7463      7464      7465      7466      7469      7472      7475      7477 
 592.0738  556.1772  592.0738  556.1772  556.1772  556.1772  556.1772  592.0738 
     7478      7480      7482      7764      7765      7766      7769      7770 
 592.0738  592.0738  592.0738  592.0738  556.1772  556.1772  592.0738  556.1772 
     7771      7772      7773      7774      7776      7777      7778      7780 
 556.1772  592.0738  592.0738  556.1772  556.1772  592.0738  592.0738  592.0738 
     7781      7782      7783      7784      7786      7787      7788      7791 
 592.0738  592.0738  592.0738  556.1772  592.0738  592.0738  556.1772  592.0738 
     7792      7793      7794      7795      7796      7797      7798      7799 
 556.1772  592.0738  592.0738  556.1772  556.1772  556.1772  592.0738  556.1772 
     7800      7801      7802      7804      7805      7809      7810      7811 
 592.0738  556.1772  556.1772  556.1772  592.0738  592.0738  592.0738  592.0738 
     7812      7813      7815      7818      7819      7820      7823      7824 
 556.1772  592.0738  556.1772  592.0738  556.1772  592.0738  556.1772  592.0738 
     7825      7826      7827      7829      7830      7834      7835      7836 
 556.1772  592.0738  592.0738  556.1772  592.0738  592.0738  592.0738  556.1772 
     7837      7839      7842      7844      7845      7847      7850      7851 
 592.0738  592.0738  592.0738  592.0738  592.0738  556.1772  556.1772  556.1772 
     7852      8142      8143      8144      8145      8146      8147      8148 
 592.0738  556.1772  556.1772  556.1772  592.0738  556.1772  592.0738  592.0738 
     8149      8150      8151      8152      8153      8154      8155      8156 
 556.1772  592.0738  556.1772  556.1772  592.0738  556.1772  592.0738  556.1772 
     8157      8158      8159      8160      8161      8162      8163      8164 
 556.1772  556.1772  592.0738  592.0738  592.0738  556.1772  592.0738  556.1772 
     8165      8166      8167      8168      8169      8170      8171      8172 
 592.0738  556.1772  592.0738  592.0738  556.1772  556.1772  592.0738  556.1772 
     8173      8174      8175      8176      8177      8178      8179      8180 
 556.1772  592.0738  556.1772  556.1772  592.0738  592.0738  592.0738  592.0738 
     8181      8182      8183      8184      8185      8186      8187      8188 
 556.1772  556.1772  556.1772  556.1772  592.0738  592.0738  592.0738  556.1772 
     8189      8190      8191      8192      8193      8194      8195      8196 
 592.0738  556.1772  592.0738  592.0738  556.1772  556.1772  592.0738  592.0738 
     8197      8198      8199      8200      8201      8202      8203      8204 
 592.0738  592.0738  592.0738  556.1772  556.1772  556.1772  592.0738  556.1772 
     8205      8206      8207      8208      8209      8210      8211      8212 
 556.1772  592.0738  556.1772  592.0738  556.1772  592.0738  592.0738  556.1772 
     8213      8214      8215      8216      8217      8218      8219      8220 
 592.0738  556.1772  556.1772  556.1772  556.1772  556.1772  592.0738  556.1772 
     8221      8222      8223      8224      8225      8226      8227      8228 
 592.0738  556.1772  556.1772  592.0738  592.0738  592.0738  556.1772  592.0738 
     8229      8230      8231      8232      8233      8234      8235      8237 
 556.1772  592.0738  556.1772  592.0738  592.0738  556.1772  556.1772  556.1772 
     8238      8239      8240      8241      8242       105       110       113 
 592.0738  556.1772  556.1772  556.1772  592.0738  283.5620  460.3413  460.3413 
      114       115       117       118       122       123       124       127 
 283.5620  460.3413  460.3413  460.3413  460.3413  460.3413  283.5620  460.3413 
      128       131       137       143       258       431       432       433 
 460.3413  460.3413  283.5620  283.5620  283.5620  460.3413  460.3413  460.3413 
      439       441       444       446       448       449       451       452 
 460.3413  460.3413  460.3413  460.3413  460.3413  283.5620  460.3413  460.3413 
      453       471       473       827       828       829       833       844 
 460.3413  283.5620  460.3413  283.5620  283.5620  460.3413  460.3413  460.3413 
      863       866       872       875       880       881      1081      1349 
 283.5620  283.5620  283.5620  283.5620  283.5620  460.3413  460.3413  460.3413 
     1366      1374      1383      1386      1639      1645      1648      1660 
 460.3413  460.3413  283.5620  283.5620  283.5620  460.3413  460.3413  460.3413 
     1671      1934      1950      1952      1957      2304      2305      2310 
 283.5620  283.5620  460.3413  460.3413  460.3413  460.3413  283.5620  460.3413 
     2312      2364      2370      2381      2385      2387      2716      2717 
 460.3413  460.3413  460.3413  283.5620  460.3413  460.3413  460.3413  460.3413 
     2725      2726      2731      2737      3413      3422      3857      3873 
 460.3413  460.3413  460.3413  283.5620  460.3413  283.5620  283.5620  460.3413 
     3889      3904      4257      4313      4648      4649      4666      4667 
 283.5620  460.3413  283.5620  460.3413  460.3413  460.3413  460.3413  460.3413 
     4685      4703      5027      5041      5358      5371      5654      5655 
 460.3413  283.5620  283.5620  283.5620  283.5620  460.3413  283.5620  283.5620 
     5659      5660      5662      5663      5664      5671      5673      5674 
 283.5620  460.3413  283.5620  283.5620  283.5620  283.5620  283.5620  283.5620 
     5676      5679      5680      5682      5684      5686      5689      5690 
 460.3413  460.3413  460.3413  460.3413  460.3413  460.3413  283.5620  283.5620 
     5692      5696      5708      5709      5710      5711      5712      5715 
 283.5620  283.5620  283.5620  460.3413  283.5620  460.3413  283.5620  460.3413 
     5718      5722      5724      5726      5727      5728      6095      6427 
 460.3413  283.5620  460.3413  460.3413  283.5620  283.5620  460.3413  460.3413 
     6442      6446      6467      6787      6790      6795      7514      7867 
 460.3413  283.5620  460.3413  460.3413  283.5620  460.3413  460.3413  460.3413 
     7876      7913      8243      8244      8245      8246      8247      8248 
 460.3413  460.3413  460.3413  460.3413  460.3413  283.5620  283.5620  460.3413 
     8249      8250      8251      8252      8253      8254      8255      8256 
 460.3413  460.3413  283.5620  283.5620  283.5620  283.5620  283.5620  460.3413 
     8257      8258      8259      8260      8261      8262      8263      8264 
 460.3413  460.3413  460.3413  283.5620  460.3413  283.5620  283.5620  460.3413 
     8265      8266      8267      8268      8269      8270      8271      8272 
 283.5620  283.5620  283.5620  283.5620  283.5620  460.3413  283.5620  460.3413 
     8273      8274      8275      8276      8277      8278      8279      8280 
 283.5620  283.5620  283.5620  283.5620  460.3413  460.3413  460.3413  283.5620 
     8281      8282      8283      8284      8285      8286      8287      8288 
 460.3413  283.5620  460.3413  460.3413  283.5620  283.5620  460.3413  460.3413 
     8289      8290      8291      8292      8293      8294      8295      8296 
 283.5620  460.3413  283.5620  460.3413  460.3413  460.3413  460.3413  283.5620 
     8297      8298      8299      8300      8301      8302      8303      8304 
 460.3413  283.5620  460.3413  460.3413  460.3413  283.5620  460.3413  460.3413 
     8305      8306      8307      8308      8309      8310      8311      8312 
 283.5620  283.5620  283.5620  460.3413  460.3413  460.3413  283.5620  460.3413 
     8313      8314      8315      8316      8317      8318      8319      8320 
 460.3413  460.3413  283.5620  283.5620  460.3413  283.5620  283.5620  460.3413 
     8321      8322      8323      8324      8325 
 283.5620  460.3413  283.5620  460.3413  283.5620

# predict responses across a range of ages
# remember that our model uses the square root of ages, so we need to transform our predictor
xvals <- seq(0, 30, 0.1)
p.f <- predict(mod.int, newdata = data.frame(age.sq=sqrt(xvals), sex=0), type="response")
plot(p.f ~ xvals)

p.m <- predict(mod.int, newdata = data.frame(age.sq=sqrt(xvals), sex=1), type="response")
plot(p.m ~ xvals)

# typically, we want to visualize our data on the scale that it was collected
# so, because it is easier for us as humans to visualize age, rather than the square
# root of age, we should plot across that scale
plot(weight ~ age, data=dat, col=c("orange", "red")[dat$sex+1])
lines(p.f ~ (xvals), col="orange")
lines(p.m ~ (xvals), col="red")

9.2.1 Predicting with SE

Now, back to the bison. We can add the standard error of our model estimate to our plots to indicated that there is some uncertainty in our estimated values.

p.f.se <- predict(mod.int, newdata = data.frame(age.sq=sqrt(xvals), sex=0), 
                  type="response",
                  se.fit=T)

p.m.se <- predict(mod.int, newdata = data.frame(age.sq=sqrt(xvals), sex=1), 
                  type="response",
                  se.fit=T)

# typically, we want to visualize our data on the scale that it was collected
# so, because it is easier for us as humans to visualize age, rather than the square
# root of age, we should plot across that scale
plot(weight ~ age, data=dat, col=c("orange", "red")[dat$sex+1])
lines(p.f.se$fit ~ (xvals), col="orange")
lines(p.m.se$fit ~ (xvals), col="red")
polygon(x=c(xvals, rev(xvals)),
        y=c(p.f.se$fit + 1.96*p.f.se$se.fit,
            rev(p.f.se$fit - 1.96*p.f.se$se.fit)), col="#F0F00088", border=F)

polygon(x=c(xvals, rev(xvals)),
        y=c(p.m.se$fit + 1.96*p.m.se$se.fit,
            rev(p.m.se$fit - 1.96*p.m.se$se.fit)), col="#FF000088", border=F)

9.3 Extrapolating

Because we can predict our model to any value, we can extrapolate outside the realm of values used to fit the model. But, our estimates will be far less certain.

xvals <- seq(0, 100, 0.1)
p.f.se <- predict(mod.int, newdata = data.frame(age.sq=sqrt(xvals), sex=0), 
                  type="response",
                  se.fit=T)

p.m.se <- predict(mod.int, newdata = data.frame(age.sq=sqrt(xvals), sex=1), 
                  type="response",
                  se.fit=T)

# typically, we want to visualize our data on the scale that it was collected
# so, because it is easier for us as humans to visualize age, rather than the square
# root of age, we should plot across that scale
plot(weight ~ age, data=dat, col=c("orange", "red")[dat$sex+1], xlim=c(0, 100), ylim=c(0, 8000))
lines(p.f.se$fit ~ (xvals), col="orange")
lines(p.m.se$fit ~ (xvals), col="red")
polygon(x=c(xvals, rev(xvals)),
        y=c(p.f.se$fit + 1.96*p.f.se$se.fit,
            rev(p.f.se$fit - 1.96*p.f.se$se.fit)), col="#F0F00088", border=F)

polygon(x=c(xvals, rev(xvals)),
        y=c(p.m.se$fit + 1.96*p.m.se$se.fit,
            rev(p.m.se$fit - 1.96*p.m.se$se.fit)), col="#FF000088", border=F)