Discussion

Two data mining techniques, MLP and DT, were compared on three input data sets derived from MODIS: the MODIS 7 bands product, the PCs derived from the MODIS 36 bands, and the MNF components derived from the same 36 bands. The six outputs at 250 m spatial resolution (three from the MLP implementation and three from the DT algorithm) were compared with the high spatial resolution LISS-III classified image. In general, hard classification techniques such as MLP, DT and MLC perform better on high spatial resolution data (such as IRS LISS-III or Landsat ETM+) than on moderate or low spatial resolution data (such as MODIS). Many pixels in the MODIS classified images were misclassified, as is clear from the accuracy assessment (table 2 and table 6). Some of these errors may have occurred because the signal of a pixel is ambiguous, perhaps as a result of spectral mixing. As an additional argument, as the pixel size increases (in this case to 250 m), the chance that high accuracies are the product of random assignment of values also declines. Classification accuracy assessed using ground truth and pixel to pixel mapping revealed that MLP on MODIS MNF components was superior overall to all other techniques, followed by DT on MNF, and both techniques performed worst on the PCs. At the pixel level lower accuracies were reported, since only 10 x 10 pixel windows with ≥ 90% homogeneity in LISS-III were considered for comparison; because the LC is very fragmented, only 65% of the pixels in the study area were homogeneous. At the sub-regional level, however, the algorithms performed differently for the various classes, revealing that an algorithm that is good at mapping one particular class may not be equally good at mapping all other classes. Pre-processing techniques such as PCA and MNF had varied effects on the accuracy.
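The homogeneity filter used in the pixel-level comparison can be illustrated with a minimal sketch. It assumes that a 10 x 10 window of LISS-III pixels approximately covers one 250 m MODIS pixel and that both classified images are available as nested lists of class ids; the function names `block_majority` and `accuracy_on_homogeneous` are hypothetical and are not taken from the study.

```python
from collections import Counter


def block_majority(fine_labels, block=10):
    """Majority class and its share for each block x block window.

    fine_labels is a list of equal-length rows of class ids. Incomplete
    windows at the right and bottom edges are ignored.
    """
    n_rows = len(fine_labels) // block
    n_cols = len(fine_labels[0]) // block
    result = []
    for br in range(n_rows):
        row_out = []
        for bc in range(n_cols):
            window = [fine_labels[r][c]
                      for r in range(br * block, (br + 1) * block)
                      for c in range(bc * block, (bc + 1) * block)]
            label, count = Counter(window).most_common(1)[0]
            row_out.append((label, count / len(window)))
        result.append(row_out)
    return result


def accuracy_on_homogeneous(coarse_pred, fine_labels, block=10, threshold=0.9):
    """Overall accuracy over coarse pixels whose reference window is homogeneous.

    Returns (accuracy, share of coarse pixels that passed the filter).
    Accuracy is None when no window reaches the threshold.
    """
    majority = block_majority(fine_labels, block)
    kept = correct = total = 0
    for pred_row, ref_row in zip(coarse_pred, majority):
        for pred, (label, share) in zip(pred_row, ref_row):
            total += 1
            if share >= threshold:
                kept += 1
                correct += (pred == label)
    if kept == 0:
        return None, 0.0
    return correct / kept, kept / total
```

Heterogeneous windows are simply excluded from the accuracy figure, so in a fragmented landscape the reported accuracy describes only the share of pixels returned as the second value.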
Both techniques performed better on MNF components and worse on PCs than on the MODIS 7 bands data. Figure 5 illustrates that the training of the neurons was smooth in the case of MNF components; a likely reason is that both the noise component and the redundancy are removed from the data, whereas in the PCs only the redundancy is removed. The result also gives an insight into which technique is better for mapping heterogeneous LC. The pixels in the PCs were not very distinct and were clustered into subgroups comprising two or three pixels, leading to inaccurate results. Classifying each pixel based on its signature was difficult for both the PCs and the MNF components, since the image was slightly pixelated, although the class separability was very good. The same trend was observed at the pixel level: the two techniques performed moderately better on MNF and relatively poorly on PCs, and maintained the same position in the rankings for MODIS 7 bands data, in the range of 62% (see table 2 and table 3). This indicates that the highly preprocessed MOD 09 data (Level 3) take care of the atmospheric disturbances, whereas the 36 band MOD 02 data at Level 1B require further preprocessing to represent a good estimate of the surface spectral reflectance as it would have been measured at ground level without atmospheric scattering or absorption. In other words, MOD 09 is an 8-day composite product built from acquisitions on 8 consecutive days, while the MOD 02 product was obtained by processing a single-day image and was therefore possibly still affected by atmospheric and angular effects. MLP classifiers have proven superior to conventional classifiers, often recording overall accuracy improvements in the range of 10%. Similar results have also been reported by Kim (2006) while empirically comparing the performance of NN and DT.
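The difference between the two pre-processing transforms discussed above can be made explicit with the standard textbook formulation. The study itself gives no equations, so the symbols below are assumptions: $\Sigma$ is the band covariance matrix of the image, split into uncorrelated signal and noise parts $\Sigma_S$ and $\Sigma_N$.

```latex
\begin{align}
\Sigma &= \Sigma_S + \Sigma_N, \\
\text{PCA:}\quad \Sigma\, a_i &= \lambda_i\, a_i, \\
\text{MNF:}\quad \Sigma\, b_i &= \mu_i\, \Sigma_N\, b_i, \qquad
\mu_i = \frac{b_i^{\top}\Sigma\, b_i}{b_i^{\top}\Sigma_N\, b_i} = 1 + \mathrm{SNR}_i .
\end{align}
```

PCA orders components by total variance $\lambda_i$, which does not separate signal from noise, whereas MNF orders them by $\mu_i$, that is, by signal-to-noise ratio. Retaining only the leading MNF components therefore discards noise as well as redundancy, which is consistent with the smoother training behaviour noted for the MNF input.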
It was observed that the performance of NN improved faster than that of DT as the number of classes of the categorical variable increased, while varying the number of independent variables, the types of independent variables, the number of classes of the independent variables, and the sample size. As the number of successful applications of MLP increases, it is increasingly clear that the technique can produce more accurate results for RS applications. However, the back propagation NN is not guaranteed to find the ideal solution to a particular problem, since the network may get caught in a local minimum of the output error surface rather than reaching the absolute minimum error. Alternatively, the network may begin to oscillate between two slightly different states, each of which results in approximately equal error. DTs are less appropriate for estimation tasks, where the goal is to predict the value of a continuous variable, unless considerable effort is put into presenting the data in such a way that trends and sequential patterns are made visible. Growing a DT is also computationally expensive, since at each node every candidate splitting field must be sorted before its best split can be found. Pruning algorithms can be computationally expensive as well, since many candidate sub-trees must be formed and compared.
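The sorting cost mentioned above can be seen in a minimal sketch of the split search for a single numeric field. It assumes Gini impurity as the split criterion, which the source does not specify, and the function names `gini` and `best_split` are hypothetical. A full tree repeats this search for every candidate field at every node.

```python
from collections import Counter


def gini(counts, n):
    """Gini impurity of a node with the given class counts and n samples."""
    return 1.0 - sum((c / n) ** 2 for c in counts.values()) if n else 0.0


def best_split(values, labels):
    """Best threshold on one numeric field by weighted Gini impurity.

    Returns (threshold, impurity). The threshold is None when no split
    improves on the impurity of the unsplit node.
    """
    n = len(values)
    # Sorting the field is the O(n log n) step paid per field and per node.
    order = sorted(range(n), key=lambda i: values[i])
    left, right = Counter(), Counter(labels)
    best_thr, best_imp = None, gini(right, n)
    for k in range(n - 1):
        i, j = order[k], order[k + 1]
        # Move one sample from the right child to the left child.
        left[labels[i]] += 1
        right[labels[i]] -= 1
        if values[i] == values[j]:
            continue  # no threshold can separate equal values
        n_left = k + 1
        imp = (n_left * gini(left, n_left)
               + (n - n_left) * gini(right, n - n_left)) / n
        if imp < best_imp:
            best_thr, best_imp = (values[i] + values[j]) / 2, imp
    return best_thr, best_imp
```

After the sort, a single linear sweep evaluates every candidate threshold by updating the class counts incrementally, so the sort dominates the cost of each field.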
Citation: Uttam Kumar, Norman Kerle, Milap Punia and T. V. Ramachandra, 2011, Mining Land Cover Information Using Multilayer. J Indian Soc Remote Sens,
DOI 10.1007/s12524-011-0061-y.