Introduction
Advances in data acquisition, storage and database technology have generated enormous amounts of data, necessitating efficient information extraction and giving rise to the emerging field of data mining, or knowledge discovery in databases (KDD). Data mining integrates several domains such as machine learning, database management, data visualisation, statistics and information theory. Although numerous studies have been carried out on data mining of relational and transaction databases (Agarwal and Srikant, 1994; Fayyad et al., 1996; Han and Fu, 1996; Piatetsky-Shapiro and Frawley, 1991), data mining of spatio-temporal databases, object-oriented databases, multimedia databases, etc. remains largely unexplored. Spatial data mining, or knowledge discovery in spatial databases (KDSD), is therefore an active field of research. This chapter focuses on spatial data mining, i.e., the discovery of interesting knowledge from geospatial data. Spatial data obtained from remote sensing (RS) provide land use and land cover (LULC) patterns of the imaged area, permitting frequent updating of maps on a near real-time basis.
Diverse LULC features of the Earth’s surface, owing to their inherent spectral reflectance and emittance properties, have different combinations of digital numbers (DNs; a pixel’s intensity or grey value) in the image. Radiance/reflectance measurements obtained in various wavelength bands for each pixel provide spectral patterns (referred to as the data henceforth) that can be classified and correlated to different LULC classes on the ground using statistical learning. Recent advances in statistical learning theories have generated promising tools and techniques in the field of pattern recognition that are applicable to RS data for deriving information on land use (LU) classes such as urban areas, agricultural land, water bodies, etc. (Kwan et al., 1994; Fukushima et al., 1998; Gori et al., 1998; Lee et al., 2006). Spectral pattern recognition in LULC classification uses the spectral information in multiple bands of remote sensors as the basis for automated image classification; the spectral pattern present within the data for each pixel serves as the numerical basis for categorisation (Lillesand and Kiefer, 2002). The performance of spectral classification depends on factors such as the definition and representation of classes, selection of appropriate features, classifier design, selection of training and test samples, and evaluation of classification accuracy. The overall idea of the per-pixel classification (hard classification) procedure is to categorise all pixels in an image into LULC classes or themes automatically, using either unsupervised classification (such as ISODATA clustering) or supervised spectral classification techniques (such as the Maximum Likelihood Classifier).
Unsupervised classification techniques do not require training data and form clusters based on inherent properties of the pixels, such as spectral distance to class means. K-Means and ISODATA (Iterative Self-Organizing Data Analysis Technique) clustering, for example, begin with arbitrary cluster means; each time the clustering repeats, these means are shifted, and the new cluster means are used for the next iteration. The ISODATA utility repeats clustering of the image until either (a) a maximum number of iterations has been performed, or (b) a specified percentage of pixels remains unchanged between two iterations (Tou and Gonzalez, 1974; Jain and Dubes, 1998; PCI Geomatics Corp; Memarsadeghi et al., 2003).
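The iterate-and-shift loop and its two stopping rules can be sketched in a few lines of NumPy. This is a minimal K-Means-style illustration only, not the ISODATA utility cited above: it omits the cluster splitting and merging that ISODATA performs, and the function name `cluster_pixels`, its parameters and the default thresholds are all hypothetical choices made for this example.

```python
import numpy as np

def cluster_pixels(pixels, k, max_iter=20, unchanged_fraction=0.98,
                   init_means=None, seed=0):
    """Cluster pixel spectra (n_pixels x n_bands) around k shifting means.

    Hypothetical helper for illustration; stops after max_iter iterations
    or once the fraction of unchanged pixels reaches unchanged_fraction.
    """
    pixels = np.asarray(pixels, dtype=float)
    if init_means is None:
        # Arbitrary starting means: k pixels drawn at random from the image.
        rng = np.random.default_rng(seed)
        means = pixels[rng.choice(len(pixels), size=k, replace=False)].copy()
    else:
        means = np.array(init_means, dtype=float)
    labels = np.full(len(pixels), -1)
    for iteration in range(1, max_iter + 1):
        # Spectral (Euclidean) distance from every pixel to every cluster mean.
        dist = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=2)
        new_labels = dist.argmin(axis=1)
        unchanged = np.mean(new_labels == labels)
        labels = new_labels
        # Shift each mean to the centroid of the pixels now assigned to it.
        for c in range(len(means)):
            members = pixels[labels == c]
            if len(members) > 0:
                means[c] = members.mean(axis=0)
        # Stopping rule (b): enough pixels kept their label since last pass.
        if unchanged >= unchanged_fraction:
            break
    # Stopping rule (a) is the loop bound max_iter.
    return labels, means, iteration
```

Here `pixels` would be the image reshaped to one row per pixel and one column per band; the returned labels are arbitrary cluster indices that an analyst still has to relate to LULC classes.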
Supervised classifiers need reference class samples (training data, usually obtained from the ground) to predict or classify the unknown pixels in the image. Pixel categorisation is done by specifying numerical descriptors of the various LU types and involves three stages: (i) training stage: representative training areas are identified and a numerical description of the spectral attributes of each LU class type in the image, known as the training set, is developed; (ii) classification stage: each pixel in the image data set is categorised into the LU class that it resembles most closely; and (iii) output stage: the result is presented as a matrix of interpreted LU category types (Lillesand and Kiefer, 2002). However, in heterogeneous, highly fragmented and complex landscapes, selecting sufficient training samples becomes rather difficult, and the problem worsens with medium or coarse resolution data due to the presence of mixed pixels. Selection of training samples must take into account the spatial resolution of the RS data and the complexity of the landscape, along with the appropriate data mining (classification) technique or scheme. In this context, performance evaluation of different classification algorithms on data of varying resolution is desirable, as different classification methods have their own genesis and merits. Although many classification approaches have evolved over time, which approach is appropriate for the features of interest in a given study area is not yet fully understood (Lu and Weng, 2007), necessitating qualitative and quantitative evaluation of the performance of classifiers with multi-resolution data.
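The three stages can be illustrated with a small Gaussian maximum likelihood sketch in NumPy. It is a simplified illustration under stated assumptions rather than a production classifier: equal prior probabilities are assumed for all classes, each class is assumed to have a non-singular covariance matrix, and the names `train_classes` and `classify_image` are hypothetical.

```python
import numpy as np

def train_classes(samples_by_class):
    """Training stage: mean vector and covariance matrix for each LU class.

    samples_by_class maps a class name to an (n_samples x n_bands) array.
    """
    stats = {}
    for name, samples in samples_by_class.items():
        x = np.asarray(samples, dtype=float)
        stats[name] = (x.mean(axis=0), np.cov(x, rowvar=False))
    return stats

def classify_image(image, stats):
    """Classification and output stages for a (rows x cols x bands) image."""
    rows, cols, bands = image.shape
    pixels = image.reshape(-1, bands).astype(float)
    names = list(stats)
    scores = np.empty((len(pixels), len(names)))
    for j, name in enumerate(names):
        mean, cov = stats[name]
        diff = pixels - mean
        inv_cov = np.linalg.inv(cov)
        _, logdet = np.linalg.slogdet(cov)
        # Squared Mahalanobis distance of every pixel to the class mean.
        maha = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
        # Gaussian log-likelihood up to a constant (equal priors assumed).
        scores[:, j] = -0.5 * (logdet + maha)
    # Each pixel goes to the class it resembles most closely; the output
    # is a rows x cols matrix of class names.
    return np.array(names)[scores.argmax(axis=1)].reshape(rows, cols)
```

A call such as `classify_image(image, train_classes({"water": w, "vegetation": v}))` returns the thematic matrix of the output stage; in practice the training arrays `w` and `v` would come from ground-truth polygons, and mixed pixels would still be forced into a single class.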
The objective of this chapter is to discuss various advanced data mining techniques (Maximum Likelihood Classifier, Decision Tree, K-Nearest Neighbour, Neural Network, Random Forest, Contextual Classification using sequential maximum a posteriori estimation, and Support Vector Machine) and to identify the best-performing classifier for data of varying spatial resolution: high-resolution IKONOS (4 m), medium-resolution IRS LISS-III MS (23.5 m) and Landsat TM (30 m), and low-resolution MODIS (250 m). Understanding the strengths and weaknesses of multi-resolution RS sensor data is essential for selecting a suitable RS image classification method (Lu and Weng, 2007). This requires prior knowledge of factors such as scale, characteristics of the study area, the availability of data and their characteristics, economic viability, time constraints, etc. The application at the user’s end must also take the scale and image resolution into account.
The chapter is organised as follows: Section 2 discusses the advanced classifiers and Section 3 details the data used in the classification experiments. Section 4 (4.1 to 4.4) presents the implementation of the algorithms on data from different sensors and assesses their comparative performance with changing spatial and spectral resolutions, followed by a discussion in Section 5. A new Hybrid Bayesian Classifier is presented in Section 6, with concluding remarks in Section 7.