Spatial Econometrics

James P. LeSage
Department of Economics
University of Toledo

May, 1999


Contents

Preface

This text provides an introduction to spatial econometrics as well as a set of MATLAB functions that implement a host of spatial econometric estimation methods. The intended audience is faculty and students involved in modeling spatial data sets using spatial econometric methods. The MATLAB functions described in this book have been used in my own research as well as teaching both undergraduate and graduate econometrics courses.

Toolboxes are the name given by the MathWorks to related sets of MATLAB functions aimed at solving a particular class of problems. Toolboxes of functions useful in signal processing, optimization, statistics, finance and a host of other areas are available from the MathWorks as add-ons to the standard MATLAB software distribution. I use the term Econometrics Toolbox to refer to my collection of function libraries described in a manual entitled Applied Econometrics using MATLAB available at http://www.econ.utoledo.edu.

The MATLAB spatial econometrics functions used to apply the spatial econometric models discussed in this text rely on many of the functions in the Econometrics Toolbox. The spatial econometric functions constitute a ``library'' within the broader set of econometric functions. To use the spatial econometrics functions library you need to install the entire set of Econometrics Toolbox functions in MATLAB. The spatial econometrics functions library is part of the Econometrics Toolbox and will be installed and available for use as are the econometrics functions.

Researchers currently using Gauss, RATS, TSP, or SAS for econometric programming might find switching to MATLAB advantageous. MATLAB software has always had excellent numerical algorithms, and has recently been extended to include: sparse matrix algorithms and very good graphical capabilities. MATLAB software is available on a wide variety of computing platforms including mainframe, Intel, Apple, and Linux or Unix workstations. A Student Version of MATLAB is available for less than $100. This version is limited in the size of problems it can solve, but many of the examples in this text rely on a small data sample with 49 observations that can be used with the Student Version of MATLAB.

The collection of around 450 functions and demonstration programs are organized into libraries, with approximately 30 spatial econometrics library functions described in this text. For those interested in other econometric functions or in adding programs to the spatial econometrics library, see the manual for the Econometrics Toolbox. The 350 page manual provides many details regarding programming techniques used to construct the functions and examples of adding new functions to the Econometrics Toolbox. This text does not focus on programming methods. The emphasis here is on applying the existing spatial econometric estimation functions to modeling spatial data sets.

A consistent design was implemented that provides documentation, example programs, and functions to produce printed as well as graphical presentation of estimation results for all of the econometric functions. This was accomplished using the ``structure variables'' introduced in MATLAB Version 5. Information from econometric estimation is encapsulated into a single variable that contains ``fields'' for individual parameters and statistics related to the econometric results. A thoughtful design by the MathWorks allows these structure variables to contain scalar, vector, matrix, string, and even multi-dimensional matrices as fields. This allows the econometric functions to return a single structure that contains all estimation results. These structures can be passed to other functions that can intelligently decipher the information and provide a printed or graphical presentation of the results.

The Econometrics Toolbox along with the spatial econometrics library functions should allow faculty to use MATLAB in undergraduate and graduate level courses with absolutely no programming on the part of students or faculty. In addition to providing a set of spatial econometric estimation routines and documentation, the book has another goal, applied modeling strategies and data analysis. Given the ability to easily implement a host of alternative models and produce estimates rapidly, attention naturally turns to which models and estimates work best to summarize a spatial data sample. Much of the discussion in this text is on these issues.

This text is provided in Adobe PDF and HTML formats for online use. It attempts to draw on the unique aspects of a computer presentation platform. The ability to present program code, data sets and applied examples in an online fashion is a relatively recent phenomenon, so issues of how to best accomplish a useful online presentation are numerous. For the online text the following features were included in the PDF and HTML documents.

1.
A detailed set of ``bookmarks'' that allow the reader to jump to any section or subsection in the text including examples or figures in the text.
2.
A set of ``bookmarks'' that allow the reader to view the spatial datasets and documentation for the datasets using a Web browser.
3.
A set of ``bookmarks'' that allow the reader to view all of the sample programs using a Web browser.

All of the examples in the text and the datasets are available offline and on my Web site: http://www.econ.utoledo.edu under the MATLAB gallery icon.

Finally, there are obviously omissions, bugs and perhaps programming errors in the Econometrics Toolbox and the spatial econometrics library functions. This would likely be the case with any such endeavor. I would be grateful if users would notify me when they encounter problems. It would also be helpful if users who produce generally useful functions that extend the toolbox would submit them for inclusion. Much of the econometric code I encounter on the internet is simply too specific to a single research problem to be generally useful in other applications. If econometrics researchers are serious about their newly proposed estimation methods, they should take the time to craft a generally useful MATLAB function that others could use in applied research. Inclusion in the spatial econometrics function library would have the added benefit of introducing new research methods to faculty and their students.

The latest version of the Econometrics Toolbox functions can be found on the Internet at: http://www.econ.utoledo.edu under the MATLAB gallery icon. Instructions for installing these functions are in an Appendix to this text along with a listing of the functions in the library and a brief description of each.

  
1. Introduction

This chapter provides an overview of the nature of spatial econometrics. An applied model-based approach is taken where various spatial econometric methods are introduced in the context of spatial data sets and models based on the data. The remaining chapters of the text are organized along the lines of alternative spatial econometric estimation procedures. Each chapter illustrates applications of a different econometric estimation method and provides references to the literature regarding these methods.

Section 1.1 sets forth the nature of spatial econometrics and discusses differences with traditional econometrics. We will see that spatial econometrics is characterized by: 1) spatial dependence between sample data observations at various points in the Cartesian plane, and 2) spatial heterogeneity that arises from relationships or model parameters that vary with our sample data as we move over the Cartesian plane.

The nature of spatially dependent or spatially correlated data is taken up in Section 1.2 and spatial heterogeneity is discussed in Section 1.3. Section 1.4 takes up the subject of how we formally incorporate the locational information from spatial data in econometric models. In addition to the theoretical discussion of incorporating locational information in econometric models, Section 1.4 provides a preview of alternative spatial econometric estimation methods that will be covered in Chapters 2 through  4.

Finally, Section 1.5 describes software design issues related to a spatial econometric function library based on MATLAB software from the MathWorks Inc. Functions are described throughout the text that implement the spatial econometric estimation methods discussed. These functions provide a consistent user-interface in terms of documentation and related functions that provide printed as well as graphical presentation of the estimation results. Section 1.5 introduces the spatial econometrics function library which is part of a broader collection of econometric estimation functions available in my public domain Econometrics Toolbox.

  
1.1 Spatial econometrics

Applied work in regional science relies heavily on sample data that are collected with reference to locations measured as points in space. The subject of how we incorporate the locational aspect of sample data is deferred until Section 1.4. What distinguishes spatial econometrics from traditional econometrics? Two problems arise when sample data has a locational component: 1) spatial dependence exists between the observations and 2) spatial heterogeneity occurs in the relationships we are modeling.

Traditional econometrics has largely ignored these two issues that violate the Gauss-Markov assumptions used in regression modeling. With regard to spatial dependence between observations, recall that Gauss-Markov assumes the explanatory variables are fixed in repeated sampling. Spatial dependence violates this assumption, a point that will be made clear in the next section. This gives rise to the need for alternative estimation approaches. Similarly, spatial heterogeneity violates the Gauss-Markov assumption that a single linear relationship exists across the sample data observations. If the relationship varies as we move across the spatial data sample, alternative estimation procedures are needed to successfully model this type of variation and draw appropriate inferences.

The subject of this text is alternative estimation approaches that can be used when dealing with spatial data samples. For example, no discussion of issues and models related to spatial data samples occurs in Amemiya (1985), Chow (1983), Dhrymes (1978), Fomby et al. (1984), Green (1997), Intrilligator (1978), Kelijian and Oates (1989), Kmenta (1986), Maddala (1977), Pindyck and Rubinfeld (1981), Schmidt (1976), and Vinod and Ullah (1981).

Anselin (1988) provides a complete treatment of many facets of spatial econometrics which this text draws upon. In addition to introducing ideas set forth in Anselin (1988), this presentation includes some more recent approaches based on Bayesian methods applied to spatial econometric models. In terms of focus, the materials presented here are more applied than Anselin (1988), providing program functions and illustrations of hands-on approaches to implementing the estimation methods described. Another departure from Anselin (1988) is in the use of sparse matrix algorithms available in the MATLAB software to implement spatial econometric estimation procedures. These implementation details represent previously unpublished material that describes a set of (freely available) programs for solving large-scale spatial econometric problems involving thousands of observations in a few minutes on a modest desktop computer. Students as well as researchers can use these programs without any programming to implement some of the latest estimation procedures on large-scale spatial data sets. A commerically available program called SpaceStat is available from Anselin that implements the maximum likelihood estimation methods by relying on Gauss software. Of course another distinction of the presentation here is the interactive aspect of a Web-based format that allows a hands-on approach that provides links to code, sample data and examples.

  
1.2 Spatial dependence

Spatial dependence in a collection of sample data observations refers to the fact that one observation associated with a location which we might label i depends on other observations at locations $j \ne i$. Formally, we might state:


 \begin{displaymath}y_{i} = f(y_{j}), i=1,\ldots,n \ \ \ j \ne i
 \end{displaymath} (1.1)

Note that we allow the dependence to be among several observations, as the index i can take on any value from $i=1,\ldots,n$. Why would we expect sample data observed at one point in space to be dependent on values observed at other locations? There are two reasons commonly given. First, data collection of observations associated with spatial units such as zip-codes, counties, states, census tracts and so on, might reflect measurement error. This would occur if the administrative boundaries for collecting information do not accurately reflect the nature of the underlying process generating the sample data. As an example, consider the case of unemployment rates and labor force measures. Because laborers are mobile and can cross county or state lines to find employment in neighboring areas, labor force or unemployment rates measured on the basis of where people live could exhibit spatial dependence.

A second and perhaps more important reason we would expect spatial dependence is that the spatial dimension of socio-demographic, economic or regional activity may truly be an important aspect of a modeling problem. Regional science is based on the premise that location and distance are important forces at work in human geography and market activity. All of these notions have been formalized in regional science theory that relies on notions of spatial interaction and diffusion effects, hierarchies of place and spatial spillovers.

As a concrete example of this type of spatial dependence, we use a spatial data set on annual county-level counts of Gypsy moths established by the Michigan Department of Natural Resources (DNR) for the 68 counties in lower Michigan.

The North American gypsy moth infestation in the United States provides a classic example of a natural phenomena that is spatial in character. During 1981, the moths ate through 12 million acres of forest in 17 Northeastern states and Washington, DC. More recently, the moths have been spreading into the northern and eastern Midwest and to the Pacific Northwest. For example, in 1992 the Michigan Department of Agriculture estimated that more than 700,000 acres of forest land had experienced at least a 50% defoliation rate.


  
Figure 1.1: Gypsy moth counts in lower Michigan, 1985
\fbox{\includegraphics[width=4.5in]{1985.eps}}

Figure 1.1 shows a map of the moth counts for 1985 in lower Michigan. We see the highest level of moth counts near Midland county Michigan in the center. As we move outward from the center, lower levels of moth counts occur taking the form of concentric rings. A set of k data points $y_{i},i=1,\dots,k$ taken from the same ring would exhibit a high correlation with each other. In terms of (1.1), yi and yj where both observations i and j come from the same ring should be highly correlated. The correlation of k1 points taken from one ring and k2 points from a neighboring ring should also exhibit a high correlation, but not as high as points sampled from the same ring. As we examine the correlation between points taken from more distant rings, we would expect the correlation to diminish.

Over time the Gypsy moths spread to neighboring areas. They cannot fly, so the diffusion should be relatively slow. Figure 1.2 shows a similarly constructed contour map of moth counts for the next year, 1986. We see some evidence of diffusion to neighboring areas between 1985 and 1986. The circular pattern of higher levels in the center and lower levels radiating out from the center is still quite evident.


  
Figure 1.2: Gypsy moth counts in lower Michigan, 1992
\fbox{\includegraphics[width=4.5in]{1986.eps}}

How does this situation differ from the traditional view of the process at work to generate economic data samples? The Gauss-Markov view of a regression data sample is that the generating process takes the form of (1.2), where y represent a vector of n observations, X denotes an nxk matrix of explanatory variables, $\beta$ is a vector of k parameters and $\varepsilon$ is a vector of n stochastic disturbance terms.


 \begin{displaymath}y = X \beta + \varepsilon
 \end{displaymath} (1.2)

The generating process is such that the X matrix and true parameters $\beta$ are fixed while repeated disturbance vectors $\varepsilon$ work to generate the samples y that we observe. Given that the matrix X and parameters $\beta$ are fixed, the distribution of sample y vectors will have the same variance-covariance structure as $\varepsilon$. Additional assumptions regarding the nature of the variance-covariance structure of $\varepsilon$ were invoked by Gauss-Markov to ensure that the distribution of individual observations in y exhibit a constant variance as we move across observations, and zero covariance between the observations.

It should be clear that observations from our sample of moth level counts do not obey this structure. As illustrated in Figures 1.1 and 1.2, observations from counties in concentric rings are highly correlated, with a decay of correlation as we move to observations from more distant rings.

Spatial dependence arising from underlying regional interactions in regional science data samples suggests the need to quantify and model the nature of the unspecified functional spatial dependence function f(), set forth in (1.1). Before turning attention to this task, the next section discusses the other underlying condition leading to a need for spatial econometrics -- spatial heterogeneity.

  
1.3 Spatial heterogeneity

The term spatial heterogeneity refers to variation in relationships over space. In the most general case we might expect a different relationship to hold for every point in space. Formally, we write a linear relationship depicting this as:


 \begin{displaymath}y_{i} = X_{i} \beta_{i} + \varepsilon_{i}
 \end{displaymath} (1.3)

Where i indexes observations collected at $i=1,\ldots,n$ points in space, Xi represents a (1 x k) vector of explanatory variables with an associated set of parameters $\beta_{i}$, yi is the dependent variable at observation (or location) i and $\varepsilon_{i}$ denotes a stochastic disturbance in the linear relationship.

A slightly more complicated way of expressing this notion is to allow the function f() from (1.1) to vary with the observation index i, that is:


 \begin{displaymath}y_{i} = f_{i}(X_{i} \beta_{i} + \varepsilon_{i})
 \end{displaymath} (1.4)

Restricting attention to the simpler formation in (1.3), we could not hope to estimate a set of n parameter vectors $\beta_{i}$ given a sample of n data observations. We simply do not have enough sample data information with which to produce estimates for every point in space, a phenomena referred to as a ``degrees of freedom'' problem. To proceed with the analysis we need to provide a specification for variation over space. This specification must be parsimonious, that is, only a handful of parameters can be used in the specification. A large amount of spatial econometric research centers on alternative parsimonious specifications for modeling variation over space. Questions arise regarding: 1) how sensitive the inferences are to a particular specification regarding spatial variation? 2) is the specification consistent with the sample data information? 3) how do competing specifications perform and what inferences do they provide? and 4) a host of other issues that will be explored in this text.

One can also view the specification task as one of placing restrictions on the nature of variation in the relationship over space. For example, suppose we classified our spatial observations into urban and rural regions. We could then restrict our analysis to two relationships, one homogeneous across all urban observational units and another for the rural units. This raises a number of questions: 1) are two relations consistent with the data, or is there evidence to suggest more than two? 2) is there a trade-off between efficiency in the estimates and the number of restrictions we use? 3) are the estimates biased if the restrictions are inconsistent with the sample data information? and other issues we will explore.

One of the compelling motivations for the use of Bayesian methods in spatial econometrics is their ability to impose restrictions that are stochastic rather than exact in nature. Bayesian methods allow us to impose restrictions with varying amounts of prior uncertainty. In the limit, as we impose a restriction with a great deal of certainty, the restriction becomes exact. Carrying out our econometric analysis with varying amounts of prior uncertainty regarding a restriction allows us to provide a continuous mapping of the restriction's impact on the estimation outcomes.


  
Figure 1.3: Distribution of home prices versus distance
\fbox{\includegraphics[width=4.5in]{figure1p4.eps}}

As a concrete illustration of spatial heterogeneity, we use a sample of 35,000 homes that sold within the last 5 years in Lucas county, Ohio. The selling prices were sorted from low to high and three samples of 5,000 homes were constructed. The 5,000 homes with the lowest selling prices were used to represent a sample of low-price homes. The 5,000 homes with selling prices that ranked from 15,001 to 20,000 in the sorted list were used to construct a sample of medium-price homes and the 5,000 highest selling prices from 30,0001 to 35,000 served as the basis for a high-price sample. It should be noted that the sample consisted of 35,702 homes, but the highest 702 selling prices were omitted from this exercise as they represent very high prices that are atypical.

Using the latitude-longitude coordinates, the distance from the central business district (CBD) in the city of Toledo, which is at the center of Lucas county was calculated. The three samples of 5,000 low, medium and high priced homes were used to estimate three empirical distributions that are graphed with respect to distance from the CBD in Figure 1.3.

We see three distinct distributions, with low-priced homes nearest to the CBD and high priced homes farthest away from the CBD. This suggests different relationships may be at work to describe home prices in different locations. Of course this is not surprising, numerous regional science theories exist to explain land usage patterns as a function of distance from the CBD. Nonetheless, these three distinct distributions provide a contrast to the Gauss-Markov assumption that the distribution of sample data exhibits a constant mean and variance as we move across the observations.


  
Figure 1.4: Distribution of home prices versus living area
\fbox{\includegraphics[width=4.5in]{figure1p5.eps}}

Another illustration of spatial heterogeneity is provided by three distributions for total square feet of living area of low, medium and high priced homes shown in Figure 1.4. Here we see only two distinct distributions, suggesting a pattern where the highest priced homes are the largest, but low and medium priced homes have roughly similar distributions with regard to living space.

It may be the case that important explanatory variables in the house value relationship change as we move over space. Living space may be unimportant in distinguishing between low and medium priced homes, but significant for higher priced homes. Distance from the CBD on the other hand appears to work well in distinguishing all three categories of house values.

  
1.4 Quantifying location in our models

A first task we must undertake before we can ask questions about spatial dependence and heterogeneity is quantification of the locational aspects of our sample data. Given that we can always map a set of spatial data observations, we have two sources of information on which we can draw.

The location in Cartesian space represented by latitude and longitude is one source of information. This information also allows us to calculate distances from any point in space, or the distance of observations located at distinct points in space to observations at other locations. Spatial dependence should conform to the fundamental theorem of regional science i.e., that distance matters. Observations that are near each other should reflect a greater degree of spatial dependence than those more distant from each other. In other words, the strength of spatial dependence between observations should decline with the distance between observations.

The second source of locational information is contiguity, reflecting the relative position in space of one regional unit of observation to other such units. Measures of contiguity rely on knowledge of the size and shape of the observational units depicted on a map. From this, we can determine which units are neighbors (have borders that touch) or represent observational units in reasonable proximity to each other. Regarding spatial dependence, neighboring units should exhibit a higher degree of spatial dependence than units located far apart.

I note in passing that these two types of information are not necessarily different. Given the latitude-longitude coordinates of an observation, we could construct a contiguity structure by defining a ``neighboring observation'' as one that lies within a certain distance. Consider also that given the centroid coordinates of a set of observations associated with contiguous map regions, we can calculate distances between the regions (or observations).

We will illustrate how both types of locational information can be used in spatial econometric modeling. We first take up the issue of quantifying spatial contiguity, which is used in the models presented in Chapter 2.

  
1.4.1 Quantifying spatial contiguity

Figure 1.2 shows a hypothetical example of five regions as they would appear on a map. We wish to construct a 5 by 5 binary matrix W containing 25 elements taking values of 0 or 1 that captures the notion of ``connectiveness'' between the five entities depicted in the map configuration. We record in each row of the matrix W a set of contiguity relations associated with one of the five regions. For example the matrix element in row 1, column 2 would record the presence (represented by a 1) or absence (denoted by 0) of a contiguity relationship between regions 1 and 2. As another example, the row 3, column 4 element would reflect the presence or absence of contiguity between regions 3 and 4. Of course, a matrix constructed in such fashion must be symmetric -- if regions 3 and 4 are contiguous, so are regions 4 and 3.


  
Figure 1.5: An illustration of contiguity
\fbox{\includegraphics[width=4in]{figure1p3.eps}}

It turns out there are many ways to accomplish our task. Below, we enumerate some of the alternative ways we might define a binary matrix W that represent alternative definitions of the ``contiguity'' relationships between the five entities in Figure 1.5. For the enumeration below, start with a matrix filled with zeros, then consider the following alternative ways to define the presence of a contiguity relationship.

Linear contiguity: Define Wij = 1 for entities that share a common edge to the immediate right or left of the region of interest. For row 1, where we record the relations associated with region 1, we would have all $W_{1j} = 0,
 j=1,\ldots,5$. On the other hand, for row 5, where we record relationships involving region 5, we would have W53 = 1 and all other row-elements equal to zero.

Rook contiguity: Define Wij = 1 for regions that share a common side with the region of interest. For row 1, reflecting region 1's relations we would have W12 = 1 with all other row elements equal to zero. As another example, row 3 would record W34=1, W35=1 and all other row elements equal to zero.

Bishop contiguity: Define Wij=1 for entities that share a common vertex with the region of interest. For region 2 we would have W23 = 1 and all other row elements equal to zero.

Double linear contiguity: For two entities to the immediate right or left of the region of interest, define Wij=1. This definition would produce the same results as linear contiguity for the regions in Figure 1.5.

Double rook contiguity: For two entities to the right, left, north and south of the region of interest define Wij=1. This would result in the same matrix W as rook contiguity for the regions shown in Figure 1.5.

Queen contiguity: For entities that share a common side or vertex with the region of interest define Wij=1. For region 3 we would have: W32=1,W34=1,W35=1 and all other row elements zero.

Believe it or not, there are even more ways one could proceed. For a good discussion of these issues, see Appendix 1 of Kelejian and Robinson (1995). Note also that the double linear and double rook definitions are sometimes referred to as ``second order'' contiguity, whereas the other definitions are termed ``first order''. More elaborate definitions sometimes rely on the distance of shared borders. This might impact whether we considered regions (4) and (5) in Figure 1.5 as contiguous or not. They have a common border, but it is very short. Note that in the case of a vertex, the rook definition rules out a contiguity relation, whereas the bishop and queen definitions would record a relationship.

The guiding principle in selecting a definition should be the nature of the problem being modeled, together with any additional non-sample information that is available. For example, suppose that a major highway connecting regions (2) and (3) existed and we knew that region (2) was a ``bedroom community'' for persons who work in region (3). Given this non-sample information, we would not want to rely on the rook definition that would rule out a contiguity relationship, as there is quite reasonably a large amount of spatial interaction between these two regions.

We will use the rook definition to define a first-order contiguity matrix for the five regions in Figure 1.5 as a concrete illustration. This is a definition that is often used in applied work. Perhaps the motivation for this is that we simply need to locate all regions on the map that have common borders with some positive length.

The matrix W reflecting first-order rook's contiguity relations for the five regions in Figure 1.5 is:


 \begin{displaymath}W = \left( \begin{array}{ccccc}
 0 & 1 & 0 & 0 & 0 \\
 1 & 0...
 ...& 0 & 1 & 0 & 1 \\
 0 & 0 & 1 & 1 & 0 \\
 \end{array} \right)
 \end{displaymath} (1.5)

Note that the matrix W is symmetric as indicated above, and by convention the matrix always has zeros on the main diagonal. A transformation often used in applied work is to convert the matrix W to have row-sums of unity. This is referred to as a ``standardized first-order'' contiguity matrix, which we denote as C:


 \begin{displaymath}C = \left( \begin{array}{ccccc}
 0 & 1 & 0 & 0 & 0 \\
 1 & 0...
 ...2 & 0 & 1/2 \\
 0 & 0 & 1/2 & 1/2 & 0 \\
 \end{array} \right)
 \end{displaymath} (1.6)

The motivation for the standardization can be seen by considering what happens if we use matrix multiplication of C and a vector of observations on some variable associated with the five regions which we label y. This matrix product $y^{\star} = Cy$ represents a new variable equal to the mean of observations from contiguous regions:


 
$\displaystyle \left( \begin{array}{c} y_{1}^{\star} \\
 y_{2}^{\star} \\
 y_{3}^{\star} \\
 y_{4}^{\star} \\
 y_{5}^{\star} \\
 \end{array} \right)$ = $\displaystyle \left( \begin{array}{ccccc} 0 & 1 & 0 & 0 & 0 \\
 1 & 0 & 0 & 0 &...
 ...array}{c} y_{1} \\
 y_{2} \\
 y_{3} \\
 y_{4} \\
 y_{5} \\
 \end{array} \right)$  
$\displaystyle \left( \begin{array}{c} y_{1}^{\star} \\
 y_{2}^{\star} \\
 y_{3}^{\star} \\
 y_{4}^{\star} \\
 y_{5}^{\star} \\
 \end{array} \right)$ = $\displaystyle \left( \begin{array}{c}
 y_{2} \\
 y_{1} \\
 1/2 y_{4} + 1/2 y_{5} \\
 1/2 y_{3} + 1/2 y_{5} \\
 1/2 y_{3} + 1/2 y_{4} \end{array} \right)$ (1.7)

This is one way of quantifying the notion that $y_{i} = f(y_{j}), j \ne i$, expressed in (1.1). Consider now a linear relationship that uses the variable $y^{\star}$ we constructed in (1.7) as an explanatory variable in a linear regression relationship to explain variation in y across the spatial sample of observations.


 \begin{displaymath}y = \rho C y + \varepsilon
 \end{displaymath} (1.8)

Where $\rho $ represents a regression parameter to be estimated and $\varepsilon$ denotes the stochastic disturbance in the relationship. The parameter $\rho $ would reflect the spatial dependence inherent in our sample data, measuring the average influence of neighboring or contiguous observations on observations in the vector y. If we posit spatial dependence between the individual observations in the data sample y, some part of the total variation in y across the spatial sample would be explained by each observation's dependence on its neighbors. The parameter $\rho $ would reflect this in the typical sense of regression. In addition, we could calculate the proportion of the total variation in y that is explained by spatial dependence. This would be represented by $\hat \rho C y$, where $\hat \rho$ represents the estimated value of $\rho $. We will examine spatial econometric models that rely on this type of formulation in great detail in Chapter 2, where we set forth maximum likelihood estimation procedures for a taxonomy of these models known as spatial autoregressive models.

One point to note is that traditional explanatory variables of the type encountered in regression can be added to the model in (1.8). We can represent these with the traditional matrix notation: $X \beta$, allowing us to modify (1.8) to take the form shown in (1.9).


 \begin{displaymath}y = \rho C y + X \beta + \varepsilon
 \end{displaymath} (1.9)

As an illustration, consider the following example which is intended to serve as a preview of material covered in the next two chapters. We provide a set of regression estimates based on maximum likelihood procedures for a spatial data set consisting of 49 neighborhoods in Columbus, Ohio set forth in Anselin (1988). The data set consists of observations on three variables: neighborhood crime incidents, household income, and house values for all 49 neighborhoods. The model uses the income and house values to explain variation in neighborhood crime incidents. That is, y= neighborhood crime, X=(a constant, household income, house values). The estimates are shown below, printed in the usual regression format with associated statistics for precision of the estimates, fit of the model and an estimate of the disturbance variance, $\hat \sigma_{\varepsilon}^{2}$.

 Spatial autoregressive Model Estimates 
 Dependent Variable =      Crime       
 R-squared      =    0.6518 
 Rbar-squared   =    0.6366 
 sigma^2        =   95.5032 
 log-likelihood =       -165.41269 
 Nobs, Nvars    =     49,     3 
 ***************************************************************
 Variable         Coefficient      t-statistic    t-probability 
 constant           45.056251         6.231261         0.000000 
 income             -1.030641        -3.373768         0.001534 
 house value        -0.265970        -3.004945         0.004331 
 rho                 0.431381         3.625340         0.000732
 

For this example, we can calculate the proportion of total variation explained by spatial dependence with a comparison of the fit measured by $\bar R^{2}$ from this model to the fit of a least-squares model that excludes the spatial dependence variable C y. The least-squares regression for comparison is shown below:

 Ordinary Least-squares Estimates 
 Dependent Variable =      Crime       
 R-squared      =    0.5521 
 Rbar-squared   =    0.5327 
 sigma^2        =  130.8386 
 Durbin-Watson  =    1.1934 
 Nobs, Nvars    =     49,     3 
 ***************************************************************
 Variable         Coefficient      t-statistic    t-probability 
 constant           68.609759        14.484270         0.000000 
 income             -1.596072        -4.776038         0.000019 
 house value        -0.274079        -2.655006         0.010858
 

We see that around 10 percent of the variation in the crime incidents is explained by spatial dependence, because the $\bar R^{2}$ is roughly 0.63 in the model that takes spatial dependence into account and 0.53 in the least-squares model that ignores this aspect of the spatial data sample. Note also that the t-statistic on the parameter for the spatial dependence variable Cy is 3.62, indicating that this explanatory variable has a coefficient estimate that is significantly different from zero. In addition, the coefficient on income falls in absolute value when we include the spatial lagged variable Cy in the model. We will pursue more examples in Chapters 2 and  3, with this example provided as a concrete demonstration of some of the ideas we have discussed.

  
1.4.2 Quantifying spatial position

Associating location in space with observations is essential to modeling relationships that exhibit spatial heterogeneity. Recall this means there is variation in the relationship being modeled over space. We illustrate two approaches to using location that allow locally linear regressions to be fit over sub-regions of space. These form the basis for models we will discuss in Chapter 4.

Casetti (1972, 1992) introduced our first approach that involves a method he labels ``spatial expansion''. The model is shown in (1.10), where y denotes an nx1 dependent variable vector associated with spatial observations and X is an nxnk matrix consisting of terms xi representing kx1 explanatory variable vectors, as shown in (1.11). The locational information is recorded in the matrix Z which has elements $Z_{xi}, Z_{yi}, i =
 1,\ldots,n$, that represent latitude and longitude coordinates of each observation as shown in (1.11).

The model posits that the parameters vary as a function of the latitude and longitude coordinates. The only parameters that need be estimated are the parameters in $\beta_{0}$ that we denote, $\beta_{x}, \beta_{y}$. These represent a set of 2k parameters. Recall our discussion about spatial heterogeneity and the need to utilize a parsimonious specification for variation over space. This represents one approach to this type of specification.

We note that the parameter vector $\beta$ in (1.10) represents an nkx1 matrix in this model that contains parameter estimates for all k explanatory variables at every observation. The parameter vector $\beta_{0}$ contains the 2k parameters to be estimated.


 
y = $\displaystyle X \beta + \varepsilon$  
$\displaystyle \beta$ = $\displaystyle Z J \beta_{0}$ (1.10)

Where:


 
y = $\displaystyle \left( \begin{array}{c}
 y_{1} \\  y_{2} \\  \vdots \\  y_{n}
 \end...
 ...ilon_{1} \\  \varepsilon_{2} \\  \vdots \\  \varepsilon_{n}
 \end{array} \right)$  
Z = $\displaystyle \left( \begin{array}{cccc}
 Z_{x1} \otimes I_k & Z_{y1} \otimes I_...
 ...begin{array}{cc}
 I_k & 0 \\  0 & I_k \\  \vdots \\  0 & I_k
 \end{array} \right)$  
$\displaystyle \beta_{0}$ = $\displaystyle \left( \begin{array}{c}
 \beta_{x} \\  \beta_{y}
 \end{array} \right)$ (1.11)

This model can be estimated using least-squares to produce estimates of the 2k parameters $\beta_{x}, \beta_{y}$. Given these estimates, the remaining estimates for individual points in space can be derived using the second equation in (1.10). This process is referred to as the ``expansion process''. To see this, substitute the second equation in (1.10) into the first, producing:


 \begin{displaymath}y = X Z J \beta_{0} + \varepsilon
 \end{displaymath} (1.12)

Here it is clear that X, Z and J represent available information or data observations and only $\beta_{0}$ represents parameters in the model that need be estimated.

The model would capture spatial heterogeneity by allowing variation in the underlying relationship such that clusters of nearby or neighboring observations measured by latitude-longitude coordinates take on similar parameter values. As the location varies, the regression relationship changes to accommodate a locally linear fit through clusters of observations in close proximity to one another.

Another approach to modeling variation over space is based on locally weighted regressions to produce estimates for every point in space by using a sub-sample of data information from nearby observations. McMillen (1996) and Brundson, Fotheringham and Charlton (1996) introduce this type of approach. It has been labeled ``geographically weighted regression'' (GWR) by Brundson, Fotheringham and Charlton (1996). Let y denote an nx1 vector of dependent variable observations collected at n points in space, X an nxk matrix of explanatory variables, and $\varepsilon$ an nx1 vector of normally distributed, constant variance disturbances. Letting Wi represent an nxn diagonal matrix containing distance-based weights for observation i that reflects the distance between observation i and all other observations, we can write the GWR model as:


 \begin{displaymath}W_{i} y = W_{i} X \beta_{i} + W_{i} \varepsilon_{i}
 \\
 \end{displaymath} (1.13)

The subscript i on $\beta_{i}$ indicates that this kx1 parameter vector is associated with observation i. The GWR model produces n such vectors of parameter estimates, one for each observation. These estimates are produced using:


 \begin{displaymath}\hat \beta_{i} = (X^{\prime} W_{i}^{2} X)^{-1} (X^{\prime} W_{i}^{2} y)
 \\
 \end{displaymath} (1.14)

One confusing aspect of this notation is that Wi y denotes an n-vector of distance-weighted observations used to produce estimates for observation i. The notation is confusing because we usually use subscripts to index scalar magnitudes representing individual elements of a vector. Note also, that Wi X represents a distance-weighted data matrix, not a single observation and $\varepsilon_{i}$ represents a n-vector. The precise nature of the distance weighting is taken up in Chapter 4.

It may have occurred to the reader that a homogeneous model fit to a spatial data sample that exhibits heterogeneity will produce residuals that exhibit spatial dependence. The residuals or errors obtained from the homogeneous model should reflect unexplained variation attributable to heterogeneity in the underlying relationship over space. Spatial clustering of the residuals would occur with positive and negative residuals appearing in distinct regions and patterns on the map. This of course was our motivation and illustration of spatial dependence as illustrated in Figure 1.2. You might infer correctly that spatial heterogeneity and dependence are often related in the context of modeling. An inappropriate model that fails to capture spatial heterogeneity will result in residuals that exhibit spatial dependence. This is another topic we discuss in the following chapters of this text.

  
1.4.3 Spatial lags

A fundamental concept that relates to spatial contiguity is the notion of a spatial lag operator. Spatial lags are analogous to the backshift operator B from time series analysis. This operator shifts observations back in time, where B yt = yt-1, defines a first-order lag and Bp yt = yt-p represents a pth order lag. In contrast to the time domain, spatial lag operators imply a shift over space but are restricted by some complications that arise when one tries to make analogies between the time and space domains.

Cressie (1991) points out that in the restrictive context of regular lattices or grids the spatial lag concept implies observations that are one or more distance units away from a given location, where distance units can be measured in two or four directions. In applied situations where observations are unlikely to represent a regular lattice or grid because they tend to be irregularly shaped map regions, the concept of a spatial lag relates to the set of neighbors associated with a particular location. The spatial lag operator works in this context to produce a weighted average of the neighboring observations.


  
Figure 1.6: First-order spatial contiguity for 49 neighborhoods
\fbox{\includegraphics[width=3.5in]{figure1p7.eps}}

In Section 1.4.1 we saw that the concept of ``neighbors'' in spatial analysis is not unambiguous, it depends on the definition used. By analogy to time series analysis it seems reasonable to simply raise our first-order binary contiguity matrix W containing 0 and 1 values to a power, say p to create a spatial lag. However, Blommestein (1985) points out that doing this produces circular or redundant routes, where he draws an analogy between binary contiguity and the graph theory notion of an adjacency matrix. If we use spatial lag matrices produced in this way in maximum likelihood estimation methods, spurious results can arise because of the circular or redundant routes created by this simplistic approach. Anselin and Smirnov (1994) provide details on many of the issues involved here.

For our purposes, we simply want to point out that an appropriate approach to creating spatial lags requires that the redundancies be eliminated from spatial weight matrices representing higher-order contiguity relationships. The spatial econometrics library contains a function to properly construct spatial lags of any order and the function deals with eliminating redundancies.

We provide a brief illustration of how spatial lags introduce information regarding ``neighbors to neighbors'' into our analysis. These spatial lags will be used in Chapter 3 when we discuss spatial autoregressive models.

To illustrate these ideas, we use a first-order contiguity matrix for a small data sample containing 49 neighborhoods in Columbus, Ohio taken from Anselin (1988). This contiguity matrix is typical of those encountered in applied practice as it relates irregularly shaped regions representing each neighborhood. Figure 1.6 shows the pattern of 0 and 1 values in a 49 by 49 grid. Recall that a non-zero entry in row i, column j denotes that neighborhoods i and j have borders that touch which we refer to as ``neighbors''. Of the 2401 possible elements in the 49 by 49 matrix, there are only 232 are non-zero elements designated on the axis in the figure by `nz = 232'. These non-zero entries reflect the contiguity relations between the neighborhoods. The first-order contiguity matrix is symmetric which can be seen in the figure. This reflects the fact that if neighborhood i borders j, then j must also border i.


  
Figure 1.7: A second-order spatial lag matrix
\fbox{\includegraphics[width=3.5in]{figure1p8.eps}}


  
Figure 1.8: A contiguity matrix raised to a power 2
\fbox{\includegraphics[width=3.5in]{figure1p9.eps}}

Figure 1.7 shows the original first-order contiguity matrix along with a second-order spatially lagged matrix, whose non-zero elements are represented by a `+' symbol in the figure. This graphical depiction of a spatial lag demonstrates that the spatial lag concept works to produce a contiguity or connectiveness structure that represents ``neighbors of neighbors''.

How might the notion of a spatial lag be useful in spatial econometric modeling? We might encounter a process where spatial diffusion effects are operating through time. Over time the initial impacts on neighbors work to influence more and more regions. The spreading impact might reasonably be considered to flow outward from neighbor to neighbor, and the spatial lag concept would capture this idea.

As an illustration of the redundancies produced by simply raising a first-order contiguity matrix to a higher power, Figure 1.8 shows a second-order spatial lag matrix created by simply powering the first-order matrix. The non-zero elements in this inappropriately generated spatial lag matrix are represented by `+' symbols with the original first-order non-zero elements denoted by `o' symbols. We see that this second order spatial lag matrix contains 689 non-zero elements in contrast to only 410 for the correctly generated second order spatial lag matrix that eliminates the redundancies.

We will have occasion to use spatial lags in our examination of spatial autoregressive models in Chapters 3,  4 and  5. The MATLAB function from the spatial econometrics library as well as other functions for working with spatial contiguity matrices will be presented along with examples of their use in spatial econometric modeling.

  
1.5 The MATLAB spatial econometrics library

As indicated in the preface, all of the spatial econometric methods discussed in this text have been implemented using the MATLAB software from MathWorks Inc. Toolboxes are the name given by the MathWorks to related sets of MATLAB functions aimed at solving a particular class of problems. Toolboxes of functions useful in signal processing, optimization, statistics, finance and a host of other areas are available from the MathWorks as add-ons to the standard MATLAB distribution. We will reserve the term Econometrics Toolbox to refer to my larger collection of econometric functions available in the public domain at www.econ.utoledo.edu. The spatial econometrics library represents a smaller part of this larger collection of software functions for econometric analysis. I have used the term library to denote subsets of functions aimed at various categories of estimation methods. The Econometrics Toolbox contains libraries for econometric regression analysis, time-series and vector autoregressive modeling, optimization functions to solve general maximum likelihood estimation problems, Bayesian Gibbs sampling diagnostics, error correction testing and estimation methods, simultaneous equation models and a collection of utility functions that I designate as the utility function library. Taken together, these constitute the Econometrics Toolbox that is described in a 350 page manual available at the Web site listed above.

The spatial econometrics library functions rely on some of the utility functions and are implemented using a general design that provides a common user-interface for the entire toolbox of econometric estimation functions. In Chapter 2 we will use MATLAB functions to carry out spatial econometric estimation methods. Here, we discuss the general design that is used to implement all of the spatial econometric estimation functions. Having some feel for the way in which these functions work and communicate with other functions in the Econometric Toolbox should allow you to more effectively use these functions to solve spatial econometric estimation problems.

The entire Econometrics Toolbox has been included in the internet-based materials provided here, as well as an online HTML interface to examine the functions available along with their documentation. All functions have accompanying demonstration files that illustrate the typical use of the functions with sample data. These demonstration files can be viewed using the online HTML interface. We have also provided demonstration files for all of the estimation functions in the spatial econometrics library that can be viewed online along with their documentation. Examples are provided in this text and the program files along with the datasets that have been included in the Web-based module.

In designing a spatial econometric library of functions, we need to think about organizing our functions to present a consistent user-interface that packages all of our MATLAB functions in a unified way. The advent of `structures' in MATLAB version 5 allows us to create a host of alternative spatial econometric functions that all return `results structures'.

A structure in MATLAB allows the programmer to create a variable containing what MATLAB calls `fields' that can be accessed by referencing the structure name plus a period and the field name. For example, suppose we have a MATLAB function to perform ordinary least-squares estimation named ols that returns a structure. The user can call the function with input arguments (a dependent variable vector y and explanatory variables matrix x) and provide a variable name for the structure that the ols function will return using:

 result = ols(y,x);
 

The structure variable `result' returned by our ols function might have fields named `rsqr', `tstat', `beta', etc. These fields might contain the R-squared statistic, t-statistics and the least-squares estimates $\hat \beta$. One virtue of using the structure to return regression results is that the user can access individual fields of interest as follows:

 bhat = result.beta;
 disp(`The R-squared is:');
 result.rsqr
 disp(`The 2nd t-statistic is:');
 result.tstat(2,1)
 

There is nothing sacred about the name `result' used for the returned structure in the above example, we could have used:

 bill_clinton = ols(y,x);
 result2      = ols(y,x);
 restricted   = ols(y,x);
 unrestricted = ols(y,x);
 

That is, the name of the structure to which the ols function returns its information is assigned by the user when calling the function.

To examine the nature of the structure in the variable `result', we can simply type the structure name without a semi-colon and MATLAB will present information about the structure variable as follows:

 result = 
      meth: 'ols'
         y: [100x1 double]
      nobs: 100.00
      nvar: 3.00
      beta: [  3x1 double]
      yhat: [100x1 double]
     resid: [100x1 double]
      sige: 1.01
     tstat: [  3x1 double]
      rsqr: 0.74
      rbar: 0.73
        dw: 1.89
 

Each field of the structure is indicated, and for scalar components the value of the field is displayed. In the example above, `nobs', `nvar', `sige', `rsqr', `rbar', and `dw' are scalar fields, so their values are displayed. Matrix or vector fields are not displayed, but the size and type of the matrix or vector field is indicated. Scalar string arguments are displayed as illustrated by the `meth' field which contains the string `ols' indicating the regression method that was used to produce the structure. The contents of vector or matrix strings would not be displayed, just their size and type. Matrix and vector fields of the structure can be displayed or accessed using the MATLAB conventions of typing the matrix or vector name without a semi-colon. For example,

 result.resid
 result.y
 

would display the residual vector and the dependent variable vector y in the MATLAB command window.

Another virtue of using `structures' to return results from our regression functions is that we can pass these structures to another related function that would print or plot the regression results. These related functions can query the structure they receive and intelligently decipher the `meth' field to determine what type of regression results are being printed or plotted. For example, we could have a function prt that prints regression results and another plt that plots actual versus fitted and/or residuals. Both these functions take a regression structure as input arguments. Example 1.1 provides a concrete illustration of these ideas.

 % ----- Example 1.1 Demonstrate regression using the ols() function 
 load y.data;
 load x.data;
 result = ols(y,x);
 prt(result);
 plt(result);
 

The example assumes the existence of functions ols, prt, plt and data matrices y,x in files `y.data' and `x.data'. Given these, we carry out a regression, print results and plot the actual versus predicted as well as residuals with the MATLAB code shown in example 1.1. We will discuss the prt and plt functions in Section 1.5.2.

  
1.5.1 Estimation functions

Now to put these ideas into practice, consider implementing an ols function. The function code would be stored in a file `ols.m' whose first line is:

 function results=ols(y,x)
 

The keyword `function' instructs MATLAB that the code in the file `ols.m' represents a callable MATLAB function.

The help portion of the MATLAB `ols' function is presented below and follows immediately after the first line as shown. All lines containing the MATLAB comment symbol `%' will be displayed in the MATLAB command window when the user types `help ols'.

 function results=ols(y,x)
 % PURPOSE: least-squares regression
 %---------------------------------------------------
 % USAGE: results = ols(y,x)
 % where: y = dependent variable vector (nobs x 1)
 %        x = independent variables matrix (nobs x nvar)
 %---------------------------------------------------
 % RETURNS: a structure
 %        results.meth  = 'ols'
 %        results.beta  = bhat
 %        results.tstat = t-stats
 %        results.yhat  = yhat
 %        results.resid = residuals
 %        results.sige  = e'*e/(n-k)
 %        results.rsqr  = rsquared
 %        results.rbar  = rbar-squared
 %        results.dw    = Durbin-Watson Statistic
 %        results.nobs  = nobs
 %        results.nvar  = nvars
 %        results.y     = y data vector
 % --------------------------------------------------
 % SEE ALSO: prt(results), plt(results)
 %---------------------------------------------------
 

All functions in the spatial econometrics library present a unified documentation format for the MATLAB `help' command by adhering to the convention of sections entitled, `PURPOSE', `USAGE', `RETURNS', `SEE ALSO', and perhaps a `REFERENCES' section, delineated by dashed lines.

The `USAGE' section describes how the function is used, with each input argument enumerated along with any default values. A `RETURNS' section portrays the structure that is returned by the function and each of its fields. To keep the help information uncluttered, we assume some knowledge on the part of the user. For example, we assume the user realizes that the `.residuals' field would be an (nobs x 1) vector and the `.beta' field would consist of an (nvar x 1) vector.

The `SEE ALSO' section points the user to related routines that may be useful. In the case of our ols function, the user might what to rely on the printing or plotting routines prt and plt, so these are indicated. The `REFERENCES' section would be used to provide a literature reference (for the case of our more exotic spatial estimation procedures) where the user could read about the details of the estimation methodology.

As an illustration of the consistency in documentation, consider the function sar that provides estimates for the spatial autoregressive model that we presented in Section 1.4.1. The documentation for this function is shown below:

   PURPOSE: computes spatial autoregressive model estimates
            y = p*W*y + X*b + e, using sparse matrix algorithms
  ---------------------------------------------------
   USAGE: results = sar(y,x,W,rmin,rmax,convg,maxit)
   where:  y = dependent variable vector
           x = explanatory variables matrix
           W = standardized contiguity matrix 
        rmin = (optional) minimum value of rho to use in search  
        rmax = (optional) maximum value of rho to use in search             
       convg = (optional) convergence criterion (default = 1e-8)
       maxit = (optional) maximum # of iterations (default = 500)
  ---------------------------------------------------
   RETURNS: a structure
          results.meth  = 'sar'
          results.beta  = bhat
          results.rho   = rho
          results.tstat = asymp t-stat (last entry is rho)
          results.yhat  = yhat
          results.resid = residuals
          results.sige  = sige = (y-p*W*y-x*b)'*(y-p*W*y-x*b)/n
          results.rsqr  = rsquared
          results.rbar  = rbar-squared
          results.lik   = -log likelihood
          results.nobs  = # of observations
          results.nvar  = # of explanatory variables in x 
          results.y     = y data vector
          results.iter   = # of iterations taken
          results.romax  = 1/max eigenvalue of W (or rmax if input)
          results.romin  = 1/min eigenvalue of W (or rmin if input)
   --------------------------------------------------
   SEE ALSO: prt(results), sac, sem, far
  ---------------------------------------------------
  REFERENCES: Anselin (1988), pages 180-182.
  ---------------------------------------------------
 

The actual execution code to produce least-squares or spatial autoregressive parameter estimates would follow the documentation in the file discussed above. We do not discuss programming of the spatial econometric functions in the text, but you can of course examine all of the functions to see how they work. The manual for the Econometrics Toolbox provides a great deal of discussion of programming in MATLAB and examples of how to add new functions to the toolbox or change existing functions in the toolbox.

  
1.5.2 Using the results structure

To illustrate the use of the `results' structure returned by our ols function, consider the associated function plt_reg which plots actual versus predicted values along with the residuals. The results structure contains everything needed by the plt_reg function to carry out its task. Earlier, we referred to functions plt and prt rather than plt_reg, but prt and plt are ``wrapper'' functions that call the functions prt_reg and plt_reg where the real work of printing and plotting regression results is carried out. The motivation for taking this approach is that separate smaller functions can be devised to print and plot results from all of the spatial econometric procedures, facilitating development. The wrapper functions eliminate the need for the user to learn the names of different printing and plotting functions associated with each group of spatial econometric procedures -- all results structures can be printed and plotted by simply invoking the prt and plt functions.

Documentation for the plt function which plots results from all spatial econometrics functions as well as the Econometrics Toolbox is shown below. This function is a wrapper function that calls an appropriate plotting function, plt_spat based on the econometric method identified in the results structure `meth' field argument.

  PURPOSE: Plots results structures returned by most functions
           by calling the appropriate plotting function
 ---------------------------------------------------
  USAGE: plt(results,vnames)
  Where: results = a structure returned by an econometric function
         vnames  = an optional vector of variable names
         e.g. vnames = vnames = strvcat('y','const','x1','x2');
  --------------------------------------------------
  NOTES: this is simply a wrapper function that calls another function
  --------------------------------------------------        
  RETURNS: nothing, just plots the results
  --------------------------------------------------
  SEE ALSO: prt()
 ---------------------------------------------------
 

A decision was made not to place the `pause' command in the plt function, but rather let the user place this statement in the calling program or function. An implication of this is that the user controls viewing regression plots in `for loops' or in the case of multiple invocations of the plt function. For example, only the second `plot' will be shown in the following code.

 result1 = sar(y,x1,W);
 plt(result1);
 result2 = sar(y,x2,W);
 plt(result2);
 

If the user wishes to see the plots associated with the first spatial autoregression, the code would need to be modified as follows:

 result1 = sar(y,x1,W);
 plt(result1);
 pause;
 result2 = sar(y,x2,W);
 plt(result2);
 

The `pause' statement would force a plot of the results from the first spatial autoregression and wait for the user to strike any key before proceeding with the second regression and accompanying plot of these results.

A more detailed example of using the results structure is the prt function which produces printed output from all of the functions in the spatial econometrics library. The printout of estimation results is similar to that provided by most statistical packages.

The prt function allows the user an option of providing a vector of fixed width variable name strings that will be used when printing the regression coefficients. These can be created using the MATLAB strvcat function that produces a vertical concatenated list of strings with fixed width equal to the longest string in the list. We can also print results to an indicated file rather than the MATLAB command window. Three alternative invocations of the prt function illustrating these options for usage are shown below:

 vnames = strvcat('crime','const','income','house value');
 res = sar(y,x,W);
 prt(res);                    % print with generic variable names
 prt(res,vnames);             % print with user-supplied variable names
 fid = fopen('sar.out','wr'); % open a file for printing
 prt(res,vnames,fid);         % print results to file `sar.out'
 

The first use of prt produces a printout of results to the MATLAB command window that uses `generic' variable names:

 Spatial autoregressive Model Estimates 
 R-squared       =    0.6518 
 Rbar-squared    =    0.6366 
 sigma^2         =   95.5033 
 Nobs, Nvars     =     49,     3 
 log-likelihood  =       -165.41269 
 # of iterations =     17   
 min and max rho =   -1.5362,   1.0000 
 ***************************************************************
 Variable        Coefficient      t-statistic    t-probability 
 variable 1        45.056480         6.231281         0.000000 
 variable 2        -1.030647        -3.373784         0.001513 
 variable 3        -0.265970        -3.004944         0.004290 
 rho                0.431377         3.625292         0.000720
 

The second use of prt uses the user-supplied variable names. The MATLAB function strvcat carries out a vertical concatenation of strings and pads the shorter strings in the `vnames' vector to have a fixed width based on the longer strings. A fixed width string containing the variable names is required by the prt function. Note that we could have used:

     vnames = ['crime      ',
               'const      ',
               'income     ',
               'house value'];
 

but, this takes up more space and is slightly less convenient as we have to provide the padding of strings ourselves. Using the `vnames' input in the prt function would result in the following printed to the MATLAB command window.

 Spatial autoregressive Model Estimates 
 Dependent Variable =      crime       
 R-squared       =    0.6518 
 Rbar-squared    =    0.6366 
 sigma^2         =   95.5033 
 Nobs, Nvars     =     49,     3 
 log-likelihood  =       -165.41269 
 # of iterations =     12   
 min and max rho =   -1.5362,   1.0000 
 ***************************************************************
 Variable         Coefficient      t-statistic    t-probability 
 const              45.056481         6.231281         0.000000 
 income             -1.030647        -3.373784         0.001513 
 house value        -0.265970        -3.004944         0.004290 
 rho                 0.431377         3.625292         0.000720
 

The third case specifies an output file opened with the command:

 fid = fopen('sar.out','wr');
 

The file `sar.out' would contain output identical to that from the second use of prt. It is the user's responsibility to close the file that was opened using the MATLAB command:

 fclose(fid);
 

In the following chapters that present various spatial estimation methods we will provide the documentation but not the details concerning implementation of the estimation procedures in MATLAB. A function has been devised and incorporated in the spatial econometrics library for each of the estimation procedures that we discuss and illustrate. These functions carry out estimation and provide printed as well as graphical presentation of the results using the design framework that was set forth in this section.

  
1.6 Chapter summary

This chapter introduced two main features of spatial data sets, spatial dependence and spatial heterogeneity. Spatial dependence refers to the fact that sample data observations exhibit correlation with reference to points or location in space. We often observe spatial clustering of sample data observations with respect to map regions. An intuitive motivation for this type of result is the existence of spatial hierarchical relationships, spatial spillovers and other types of spatial interactivity studied in regional science.

Spatial heterogeneity refers to the fact that underlying relationships we wish to study may vary systematically over space. This creates problems for regression and other econometric methods that do not accommodate spatial variation in the relationships being modeled. A host of methods have been developed in spatial econometrics that allow the estimated relationship to vary systematically over space.

A large part of the chapter was devoted to introducing how locational information regarding sample data observations is formally incorporated in spatial econometric models. After introducing the concept of a spatial contiguity matrix, we provided a preview of the spatial autoregressive model that relies on the contiguity concept. Chapters 2 and 3 cover this spatial econometric method in detail.

In addition to spatial contiguity, other spatial econometric methods rely on the latitude-longitude information available for spatial data samples to allow variation in the relationship being studied over space. Two approaches to this were introduced, the spatial expansion model and geographically weighted regression, which are the subject of Chapter 4.

Finally, a software design for implementing the spatial econometric estimation methods discussed in this text was set forth. Our estimation methods will be implemented using MATLAB software from the MathWorks Inc. A design based on MATLAB structure variables was set forth. This approach to developing a set of spatial econometric estimation functions can provide a consistent user-interface for the function documentation and help information as well as encapsulation of the estimation results in a MATLAB structure variable. This construct can be accessed by related functions to provide printed and graphical presentation of the estimation results.

  
2. Spatial autoregressive models

This chapter discusses in detail the spatial autoregressive models introduced in Chapter 1. A class of spatial autoregressive models has been introduced to model cross-sectional spatial data samples taking the form shown in (2.1) (Anselin, 1988).


 
y = $\displaystyle \rho W_1 y + X \beta + u$ (2.1)
u = $\displaystyle \lambda W_2 u + \varepsilon$  
$\displaystyle \varepsilon$ $\textstyle \sim$ $\displaystyle N(0,\sigma^2 I_n)$  

Where y contains an nx1 vector of cross-sectional dependent variables and X represents an nxk matrix of explanatory variables. W1 and W2 are known nxn spatial weight matrices, usually containing first-order contiguity relations or functions of distance. As explained in Section 1.4.1, a first-order contiguity matrix has zeros on the main diagonal, rows that contain zeros in positions associated with non-contiguous observational units and ones in positions reflecting neighboring units that are (first-order) contiguous based on one of the contiguity definitions.

From the general model in (2.1) we can derive special models by imposing restrictions. For example, setting X=0 and W2=0 produces a first-order spatial autoregressive model shown in (2.2).


 
y = $\displaystyle \rho W_1 y + \varepsilon$ (2.2)
$\displaystyle \varepsilon$ $\textstyle \sim$ $\displaystyle N(0,\sigma^2 I_n)$  

This model attempts to explain variation in y as a linear combination of contiguous or neighboring units with no other explanatory variables. The model is termed a first order spatial autoregression because its represents a spatial analogy to the first order autoregressive model from time series analysis, $y_t
 = \rho y_{t-1} + \varepsilon_t$, where total reliance is placed on the past period observations to explain variation in yt.

Setting W2 = 0 produces a mixed regressive-spatial autoregressive model shown in (2.3). This model is analogous to the lagged dependent variable model in time series. Here we have additional explanatory variables in the matrix X that serve to explain variation in y over the spatial sample of observations.


 
y = $\displaystyle \rho W_1 y + X \beta + \varepsilon$ (2.3)
$\displaystyle \varepsilon$ $\textstyle \sim$ $\displaystyle N(0,\sigma^2 I_n)$  

Letting W1 = 0 results in a regression model with spatial autocorrelation in the disturbances as shown in (2.4).


 
y = $\displaystyle X \beta + u$ (2.4)
u = $\displaystyle \lambda W_2 u + \varepsilon$  
$\displaystyle \varepsilon$ $\textstyle \sim$ $\displaystyle N(0,\sigma^2 I_n)$  

This chapter is organized into sections that discuss and illustrate each of these special cases of the spatial autoregressive model as well as the most general model form in (2.1). Section 2.1 deals with the first-order spatial autoregressive model presented in (2.2). The mixed regressive-spatial autoregressive model is taken up in Section 2.2. Section 2.3 takes up the regression model containing spatial autocorrelation in the disturbances and illustrates various tests for spatial dependence using regression residuals. The most general model is the focus of Section 2.4. Applied illustrations of all the models are provided using a variety of spatial data sets. Spatial econometrics library functions that utilize MATLAB sparse matrix algorithms allow us to estimate models with over 3,000 observations in around 100 seconds on an inexpensive desktop computer.

  
2.1 The first-order spatial AR model

This model is seldom used in applied work, but it serves to motivate some of the ideas that we draw on in later sections of the chapter. The model which we label FAR, takes the form:


 
y = $\displaystyle \rho W y + \varepsilon$ (2.5)
$\displaystyle \varepsilon$ $\textstyle \sim$ $\displaystyle N(0,\sigma^2 I_n)$  

where the spatial contiguity matrix W has been standardized to have row sums of unity and the variable vector y is expressed in deviations from the means form to eliminate the constant term in the model.

To illustrate the problem with least-squares estimation of spatial autoregressive models, consider applying least-squares to the model in (2.5) which would produce an estimate for the single parameter $\rho $ in the model:


 \begin{displaymath}\hat \rho = (y^{\prime} W^{\prime} W y)^{-1} y^{\prime} W^{\prime} y
 \end{displaymath} (2.6)

Can we show that this estimate is unbiased? If not, is it consistent? Taking the same approach as in least-squares, we substitute the expression for y from the model statement and attempt to show that $E(\hat \rho) = \rho$ to prove unbiasedness.


 
$\displaystyle E(\hat \rho)$ = $\displaystyle (y^{\prime} W^{\prime} W y)^{-1} y^{\prime} W^{\prime} (\rho W y +
 \varepsilon)$  
  = $\displaystyle \rho + (y^{\prime} W^{\prime} W y)^{-1} y^{\prime} W^{\prime} \varepsilon$ (2.7)

Note that the least-squares estimate is biased, since we cannot show that $E(\hat \rho) = \rho$. The usual argument that the explanatory variables matrix X in least-squares is fixed in repeated sampling allows one to pass the expectation operator over terms like $(y^{\prime} W^{\prime} W y)^{-1}
 y^{\prime} W^{\prime}$ and argue that $E(\varepsilon) = 0$, eliminating the bias term. Here however, because of spatial dependence, we cannot make the case that Wy is fixed in repeated sampling. This also rules out making a case for consistency of the least-squares estimate of $\rho $, because the probability limit (plim) of the term $y^{\prime} W^{\prime} \varepsilon$ is not zero. In fact, Anselin (1988) establishes that:


\begin{displaymath}\mbox{plim} N^{-1} (y^{\prime} W^{\prime} \varepsilon) = \mbo...
 ... N^{-1}
 \varepsilon^{\prime} W (I - \rho W)^{-1} \varepsilon
 \end{displaymath} (2.8)

This is equal to zero only in the trivial case where $\rho $ equals zero and we have no spatial dependence in the data sample.

Given that least-squares will produce biased and inconsistent estimates of the spatial autoregressive parameter $\rho $ in this model, how do we proceed to estimate $\rho $? The maximum likelihood estimator for $\rho $ requires that we find a value of $\rho $ that maximizes the likelihood function shown in (2.9).


 \begin{displaymath}L(y\vert\rho, \sigma^2) = { 1 \over{2 \pi \sigma^2}^{(n/2)}} ...
 ... \over{ 2 \sigma^2}} (y - \rho W y)^{\prime} (y - \rho W y) \}
 \end{displaymath} (2.9)

In order to simplify the maximization problem, we obtain a concentrated log likelihood function based on eliminating the parameter $\sigma^{2}$ for the variance of the disturbances. This is accomplished by substituting $\hat
 \sigma^2 = (1/n) (y - \rho W y)^{\prime} (y - \rho W y)$ in the likelihood (2.9) and taking logs which yields:


 \begin{displaymath}\mbox{Ln} (L) \propto - {n \over{2}} \mbox{ln} (y - \rho W y)^{\prime}
 (y - \rho W y) + \mbox{ln} \vert I_n - \rho W\vert
 \end{displaymath} (2.10)

This expression can be maximized with respect to $\rho $ using a simplex univariate optimization routine. The estimate for the parameter $\sigma^{2}$ can be obtained using the value of $\rho $ that maximizes the log-likelihood function (say, $\tilde \rho$) in: $\hat \sigma^2 = (1/n) (y - \tilde \rho W
 y)^{\prime} (y - \tilde \rho W y)$. In the next section, we discuss a sparse matrix algorithm approach to evaluating this likelihood function that allows us to solve problems involving thousands of observations quickly with small amounts of computer memory.

Two implementation details arise with this approach to solving for maximum likelihood estimates. First, there is a constraint that we need to impose on the parameter $\rho $. This parameter can take on feasible values in the range (Anselin and Florax, 1994):


\begin{displaymath}1/\lambda_{min} < \rho < 1/\lambda_{max}
 \end{displaymath}

where $\lambda_{min}$ represents the minimum eigenvalue of the standardized spatial contiguity matrix W and $\lambda_{max}$ denotes the largest eigenvalue of this matrix. This suggests that we need to constrain our optimization procedure search over values of $\rho $ within this range.

The second implementation issue is that the numerical Hessian matrix that would result from a gradient-based optimization procedure and provide estimates of dispersion for the parameters is not available with simplex optimization. We can overcome this problem in two ways. For problems involving a small number of observations, we can use our knowledge of the (theoretical) information matrix to produce estimates of dispersion. An asymptotic variance matrix based on the Fisher information matrix shown below for the parameters $\theta = (\rho,
 \sigma^2)$ can be used to provide measures of dispersion for the estimates of $\rho $ and $\sigma^{2}$. (see Anselin, 1980 page 50):


 \begin{displaymath}[I(\theta)]^{-1} = - E [ {\partial^2 L \over{ \partial \theta \partial
 \theta^{\prime} } } ]^{-1}
 \end{displaymath} (2.11)

This approach is computationally impossible when dealing with large scale problems involving thousands of observations. In these cases we can evaluate the numerical hessian matrix using the maximum likelihood estimates of $\rho $ and $\sigma^{2}$ as well as our sparse matrix function to compute the likelihood. We will demonstrate results from using both of these approaches in the next section.

  
2.1.1 The far() function

Building on the software design set forth in section 1.5 for our spatial econometrics function library, we have implemented a function far to produce maximum likelihood estimates for the first-order spatial autoregressive model. We rely on on the sparse matrix functionality of MATLAB so large-scale problems can be solved using a minimum of time and computer memory. We demonstrate this function using a data set involving 3,107 U.S. contiguous counties.

Estimating the FAR model requires that we find eigenvalues for the large n by n matrix W, as well as the determinant of the related n by n matrix $(I_{n} - \rho W)$. In addition, matrix multiplications involving W and $(I_{n} - \rho W)$ are required to compute the information matrix used to produce estimates of dispersion.

We constructed a function far that can produce estimates for the fi