For each land use category, the final models were then developed as follows. Scatter plots of the dependent variable against each independent variable were inspected to identify and remove extreme-value points as outliers. Partial regression plots, which illustrate the relationship between the dependent variable and each independent variable after the effects of the other independent variables have been accounted for (De Veaux et al. 2011), were then inspected, both to identify and remove any overly influential points as outliers and to visually assess the significance of each candidate independent variable for the dependent variable. One outlier was removed from the ‘General Residential’ land use and six from the ‘Low-Income Residential’ land use. Of the remaining points, 20% were randomly set aside for validity testing, leaving 80% as the training set for model development. For each land use, a preliminary OLS model was built, and the variables significant at p < 0.05 were retained for use in the final models. The final models were built using both OLS and WLS with three different weighting systems, giving four final models for each land use. Provided the five assumptions of OLS were satisfied, these models were then compared in terms of log-likelihood, AIC (Akaike's information criterion) and BIC (Bayesian information criterion) to determine the best-performing model for each land use. These indicators, and how their values are to be interpreted, are presented in Table 6.
Table 6. Indicators used for model comparison
Indicator | Interpretation |
---|---|
Log-Likelihood | |
Akaike's Information Criterion (AIC) | |
Bayesian Information Criterion (BIC) | |
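The model development and comparison steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, only a single predictor is used, and the three weighting systems (1/√x, 1/x and 1/x²) are hypothetical choices, since the weighting systems are not specified here. The function names `fit_wls` and `indicators` are likewise illustrative. The sketch assumes Gaussian errors with variance σ²/wᵢ, uses the usual definitions AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L, and counts only the regression coefficients in k (some conventions also count the error variance).

```python
import math
import random


def fit_wls(x, y, w):
    """Weighted least squares for y = b0 + b1 * x; unit weights give OLS."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1


def indicators(x, y, w, b0, b1, k=2):
    """Maximised Gaussian log-likelihood, AIC and BIC, assuming Var(e_i) = sigma^2 / w_i."""
    n = len(y)
    wrss = sum(wi * (yi - b0 - b1 * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    sigma2 = wrss / n  # maximum-likelihood estimate of sigma^2
    loglik = (-0.5 * n * (math.log(2 * math.pi * sigma2) + 1)
              + 0.5 * sum(math.log(wi) for wi in w))
    return {"loglik": loglik,
            "aic": 2 * k - 2 * loglik,
            "bic": k * math.log(n) - 2 * loglik}


# Synthetic data (assumption): error spread grows with x, which motivates WLS.
rng = random.Random(0)
xs = [rng.uniform(1.0, 10.0) for _ in range(100)]
ys = [2.0 + 3.0 * xi + rng.gauss(0.0, 0.5 * xi) for xi in xs]

# Randomly reserve 20% of the points for validity testing; train on the other 80%.
idx = list(range(len(xs)))
rng.shuffle(idx)
n_test = len(idx) // 5
test_idx, train_idx = idx[:n_test], idx[n_test:]
x_train = [xs[i] for i in train_idx]
y_train = [ys[i] for i in train_idx]

# OLS plus three hypothetical weighting systems, giving four candidate models.
weightings = {
    "OLS": lambda x: 1.0,
    "WLS 1/sqrt(x)": lambda x: 1.0 / math.sqrt(x),
    "WLS 1/x": lambda x: 1.0 / x,
    "WLS 1/x^2": lambda x: 1.0 / x ** 2,
}

results = {}
for name, weight_fn in weightings.items():
    w = [weight_fn(xi) for xi in x_train]
    b0, b1 = fit_wls(x_train, y_train, w)
    results[name] = indicators(x_train, y_train, w, b0, b1)

# Higher log-likelihood is better; lower AIC and BIC are better.
best = min(results, key=lambda name: results[name]["aic"])
for name, r in results.items():
    print(f"{name:14s} logL={r['loglik']:9.2f} AIC={r['aic']:8.2f} BIC={r['bic']:8.2f}")
print("Best by AIC:", best)
```

Because all four candidates in this sketch have the same number of parameters and are fitted to the same training set, the three indicators rank them identically. The AIC and BIC penalties only affect the ranking when the compared models differ in the number of estimated parameters.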