



Department of Electronic Engineering, University of Valencia, Valencia, Spain.

Late diagnosis frequently requires surgery to rescue the impacted permanent canine. In many cases, interceptive treatment to redirect canine eruption is needed. However, some patients treated by interceptive means end up requiring fenestration to orthodontically guide the canine to its normal occlusal position.

The significance values for each variable are shown in Table 3. In cases with a value of. On analyzing the pattern of each of these groups, it can be observed that the initial inclination of the canine and the sector where it is located are important prognostic factors: the greater the inclination and the higher the sector, the greater the probability that fenestration will be needed.

In fact, these results are also clear if a simple comparison of the mean values of patients who needed fenestration with those who did not is made. However, the methodology that we have employed allows us to add to these two factors the valuation of the rest of the variables that make up the pattern, some of which present statistically significant differences in terms of the probability of needing fenestration.

The dependence of these other magnitudes does not appear when a customary statistics method comparing the means between the 2 groups is undertaken. Although it is impossible to predict with absolute certainty which patients are going to require fenestration in order to achieve canine eruption, using the approach taken in this work, we present patterns of patients depending on their probability of either needing fenestration or not in order to achieve the eruption of the canine following an interceptive treatment.

This pattern corresponds to the following values. Although most studies focus on the possibility of the impaction or not of the PMC, and to explain this alteration numerous variables have been presented, not all are significant when it comes to explaining the need for a final surgical solution to carry the PMC to its correct final location.

Hence, for example, the PMC alteration that presents itself most often in girls does not mean that this group is going to require a greater percentage of fenestrations. Likewise, other variables of interest in studies on impacted canines, such as age at beginning of treatment, coronal development of the lateral incisor; LID; the palatal or vestibular position of the canine and exfoliation of the deciduous canine that some consider effective 8 , are of no importance to our study when it comes to predicting whether fenestration is going to be required or not.

Even the duration of the treatment, which does show a statistically significant difference between the groups, as shown in table 3, is not representative for our study, as no difference exists in the cases of high and low probability.

These are also the variables that show a greater relationship with canine impaction in the different studies reviewed.

Thus, in Lindauer et al. Wardorf et al. In our study we were able to confirm that, in the pattern with the greatest probability of the need for fenestration, this variable presents a mean value of 3, on considering the right and left canine, as against a value of 1. It should be pointed out at this point that although we present mean values of the contralateral teeth, the right and left parts do not behave equally in our study: In our study, we were able to observe that the pattern of greatest probability of the need for fenestration corresponds to high levels of this angle, The development of the lateral incisor and Nolla stage also presented interesting results in our study, these variables being referred to in many canine impaction studies 1.

In our case, the need for fenestration is associated with little-developed lateral incisors and lower Nolla stages. Becker and Chaushu 21 also sustain this eruptive theory in their article of , as they observed that half the patients with impacted canines in the palate presented late development of dentition a mean of 1.

Broadbent as long ago as , described the mechanism of eruption and alignment of the maxillary front teeth. Apart from evaluating the dental variables analyzed, we believe that the interesting point about this study resides in the fact that it shows the advantages gained by using new methodologies for representing complex clinical problems involving a large number of variables.

Sets containing a large number of variables provide a better way of finding solutions to those problems. This methodology has allowed us to group together eruptive behavior patterns and to observe the pattern of patients by using a set of many of the variables involved, as well as to eliminate various variables that are redundant to the problem or of very scant repercussion. Unlike conventional statistics, self-organizing maps allow us to observe the problem of canine impaction and the need or not for fenestration, showing us how the variables are grouped together in one or the other of the different suppositions.

We can assess the repercussion of each variable without the need to attend to concrete and absolute data by determining which set of variables appears to be associated with the greatest proportion for each clinical situation in our case.

Hence, we can predict more faithfully the different types of treatment prognosis, and, therefore, arrive at a safer diagnosis for future similar situations. National Center for Biotechnology Information , U. Published online Feb 4. Vanessa Paredes 1 Orthodontics.

Conflict of interest statement: The authors have declared that no conflict of interest exists.

Received Jul 1; Accepted Oct

This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Material and Methods: To study the dental characteristics associated with canine impaction, conventional statistics have traditionally been used.

Conclusions: In this study, we describe the process of debugging variables and selecting the appropriate number of cells in SOM so as to adequately visualize the problem posed and the dental characteristics of patients with regard to a greater or lesser probability of the need for fenestration.

Introduction: The permanent maxillary canine (PMC) is the tooth that presents the highest incidence of displacement from its normal eruptive path due to its peculiar morphological, topographical and chronological characteristics.

Table 1. Variables represented on the SOM.

Figure 1.

Figure 2.

Results: As mentioned in the previous section, each patient is represented by a vector of as many components and variables as there are in the study:

Figure 3.

Table 2. Average values of the patient pattern corresponding to each neuron (N) considered.

Table 3. Probability of successful interceptive treatment for the eruption of the canine.

References

1. Eruption of the permanent upper canine: Am J Orthod Dentofacial Orthop.
Litsas G, Acar A. A review of early displaced maxillary canines: The Open Dentistry Journal.

Canine impaction identified early with panoramic radiographs. Rebellato J.

Hence the number of iterations should in part be related to the number of samples. A sample that is left out altogether can still be fitted into the map, but it has no influence on training. As the number of iterations increases, the region of cells that is adjusted around the BMU is reduced, and the amount of adjustment (often called the learning rate) also reduces.
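The shrinking neighbourhood and learning rate can be illustrated with a simple schedule. This is a minimal sketch, assuming an exponential decay between a start value and an end value; the function name `decayed` and all numerical values are illustrative assumptions and are not taken from the text.

```python
def decayed(initial, final, iteration, n_iterations):
    """Decay exponentially from `initial` (first iteration) to `final` (last iteration).

    Both values must be positive. The exponential form is an assumption;
    linear or inverse-time schedules are also used in practice.
    """
    if n_iterations <= 1:
        return final
    fraction = iteration / (n_iterations - 1)
    return initial * (final / initial) ** fraction


n_iterations = 1000
for t in (0, 250, 500, 750, 999):
    radius = decayed(10.0, 1.0, t, n_iterations)  # neighbourhood radius in cells
    rate = decayed(0.5, 0.01, t, n_iterations)    # learning rate
    print(t, round(radius, 3), round(rate, 4))
```

Both quantities fall monotonically, so early iterations move large regions of the map by large amounts and late iterations only fine-tune individual cells.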

This means that the maps start to stabilise. The more iterations there are, the more computationally intensive the SOM becomes, although sometimes it is possible to reach an acceptable solution fast. Most SOMs are developed using a random starting point, although there are modifications that allow an initial map that reduces the number of iterations by basing it on the pattern of the samples, e.

The variables that are used to describe the map usually are the raw measurements, such as spectral intensities or chromatographic peaks. Under such circumstances it is possible to interpret the component planes to provide chemical insight.


However, sometimes the number of variables is large, and it can be time consuming to use all the original variables, especially if some are primarily noise. Hence an alternative is to use functions of the variables, such as Principal Components. The simplest geometry is a rectangular map.

The rectangle refers to the arrangement of cells and not the shape of the cells. Often the cells are represented as hexagons, as we will do in this paper, but can be represented by squares. However there is no obligation to restrict the maps to rectangular ones, and circular, cylindrical or spherical maps can be visualised.

One problem of rectangular maps is that samples at the edges tend to be farther away from other groups of samples than those in the middle, which may have many more neighbours. Some datasets do, indeed, have extreme groups of samples, and so the rectangular approach is the most appropriate. But in other cases there may well not be any reason to separate out samples that are on the extreme edges, and so a spherical or cylindrical representation is more appropriate.

The trouble with the latter representations is that they are harder to visualise on paper. If, however, a representation can be retained in a computer, and the aim is not so much to present a graph to the user as to use the co-ordinates of samples to show which are most similar, then other geometries could be worthwhile. In this paper we will restrict representations primarily to the most common type of map, which is rectangular, and use hexagonal cells.

For case study 1 (NIR of food) the number of cells far exceeds the number of samples (72), and as such the samples are well separated. The map of case study 2 is somewhat more crowded with a ratio of cells: However still the samples are reasonably well spread out.

Plot of (left) the scores of the first two PCs and (right) the BMUs for (top) case study 1, NIR of four foods, and (bottom) thermal profiles of nine groups of polymers.

First of all, the full space is used efficiently. PC scores plots can sometimes be crowded, as there may be many samples that have to be represented in scores space. In other cases, the space is used inefficiently, with lots of blank space. The second advantage is that there is no need to choose which PCs are to be used for visualisation.

Third, there are many more options for graphical representation, as discussed in this paper. For case study 1, although the groups are tightly clustered, the majority of the PC space is wasted, and basically meaningless, as there are no samples and no information available for the "empty" regions. The groups are so tightly clustered that we cannot see any structure within the groups. For case study 2, again much of the PC space is wasted, and the groups overlap considerably, the symbols becoming quite crowded and hard to distinguish.

These problems are no longer disadvantages in the SOMs. In addition there are a large number of ways of shading and representing symbols. People that are not trained data analysis experts often find SOMs easier to understand and interpret, a map being more intuitive than a scores plot or complex graph.

Note that BMUs can also be used for predicting the provenance of unknown samples, or a test set, simply by seeing which places in the map they fit into. This concept of having a "board" where unknown samples are slotted into is also intuitively easier for most users to understand than predicted positions of points on a graph.
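Slotting an unknown sample into the "board" amounts to finding its best matching unit in the trained map. The following is a minimal sketch, assuming Euclidean distance and a map stored as a dictionary from (row, column) to weight vector; the tiny 2 x 2 map, its weights and the function names are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def find_bmu(weights, sample):
    """Return the (row, col) cell whose weight vector is closest to the sample."""
    return min(weights, key=lambda cell: distance(weights[cell], sample))


# Hypothetical trained 2 x 2 map with two variables per cell.
weights = {
    (0, 0): [0.1, 0.2], (0, 1): [0.9, 0.1],
    (1, 0): [0.2, 0.8], (1, 1): [0.8, 0.9],
}

unknown = [0.85, 0.15]
print(find_bmu(weights, unknown))  # the cell into which the unknown sample fits
```

If the cells have been labelled with the groups of the training samples, the label of the returned cell gives a prediction of provenance for the unknown.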

A hit histogram can be considered as a three dimensional projection of the BMU map. In each cell that corresponds to a BMU for the training set, there is a vertical bar that represents the hits. Each sample in the training set will be represented on the map. For case study 1, there are 72 samples and each hits a different cell in the map, so there are 72 vertical bars.

The map is of size 30 x 20, that is, it consists of 600 cells, so there is plenty of room for the samples to spread around. We notice that for case study 2, there are samples, or roughly half the number of samples compared to the map. This is not clear on the BMU map, but is clarified in the hit histogram.

If there is more than one sample associated with an individual BMU, then either this is tolerated or a map with more cells can be generated.
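A hit histogram is simply a count of how many training samples share each BMU. A minimal sketch, assuming the BMU of each sample is already known as a (row, column) pair; the example coordinates are made up.

```python
from collections import Counter


def hit_histogram(bmus):
    """Count the number of samples that hit each cell."""
    return Counter(bmus)


# Hypothetical BMU coordinates of four training samples.
bmus = [(0, 0), (0, 1), (0, 1), (2, 3)]
hits = hit_histogram(bmus)

# Cells hit by more than one sample: either tolerate these or use a larger map.
shared = {cell: count for cell, count in hits.items() if count > 1}
print(hits)
print(shared)
```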

The problem with maps that have more cells is that they are slower to train. For case study 2, most people would tolerate a small level of overlap. If samples fall into groups, or classes, this additional information can be used to shade the background on the SOM. A cell is shaded in the colour of its closest BMU. If more than one BMU is equidistant from the cell, it is shaded in a combination of colours, according to how many BMUs from each group it is closest to. In the right column, the BMUs are also presented.
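The shading rule can be sketched as follows. This is an illustration only: it assumes distances are measured between (row, column) grid coordinates, ignoring the offset of hexagonal cells, and the function name `shade_cell` and the class labels are hypothetical.

```python
from collections import Counter


def shade_cell(cell, bmu_classes):
    """Return the class counts among the BMUs closest to `cell`.

    bmu_classes maps the (row, col) of each BMU to the class of its sample.
    A single entry in the result means a pure colour; several entries mean
    the cell is shaded in a combination of colours in those proportions.
    """
    def sq_dist(other):
        return (cell[0] - other[0]) ** 2 + (cell[1] - other[1]) ** 2

    nearest = min(sq_dist(bmu) for bmu in bmu_classes)
    return Counter(label for bmu, label in bmu_classes.items()
                   if sq_dist(bmu) == nearest)


bmu_classes = {(0, 0): "A", (0, 4): "B"}
print(shade_cell((0, 1), bmu_classes))  # closest to the class A BMU only
print(shade_cell((0, 2), bmu_classes))  # equidistant from both BMUs
```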

Class diagrams of (top) case study 1 and (bottom) case study 2. BMUs are indicated in the right hand column. Different types of structure can be represented on such diagrams.

For case study 2, we would classify the samples into amorphous or semi-crystalline or else into one of nine groups. The two types of information can be presented on a single diagram. Superimposing different types of information. The BMUs shaded dark blue, red and light green represent amorphous polymers (blue background) whereas the remaining classes represent semi-crystalline polymers (red background).


The aim of a U-Matrix is to show the similarity of a unit to its neighbours and hence reveal potential clusters present in the map. If there are classes present in the data, then the border between neighbouring clusters can be interpreted as a class border.

The lower it is the more similar the neighbouring cells are.


When going from one class to another, we anticipate that the barrier will be high. A U matrix ideally separates different groups. Consider case study 1. Corn margarine is on the bottom right and can be seen to be quite different to the others.

Safflower oil and corn oil are on the left and are seen to be fairly similar. The original division of samples into groups is not always reflected in large differences in the corresponding spectra. A close examination of the U matrix for case study 2 suggests that there is some substructure in certain of the polymer groups.
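One common way of computing a U matrix value for a cell is to average the distance between its weight vector and those of its immediate neighbours. The sketch below is an illustration under stated assumptions: it uses Euclidean distance and the four edge-sharing neighbours of a rectangular grid of square cells, whereas the maps in the text use hexagonal cells with six neighbours.

```python
import math


def u_matrix(weights, n_rows, n_cols):
    """Average distance from each cell's weight vector to its grid neighbours."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    u = {}
    for r in range(n_rows):
        for c in range(n_cols):
            neighbours = [(r + dr, c + dc)
                          for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
                          if 0 <= r + dr < n_rows and 0 <= c + dc < n_cols]
            if not neighbours:  # a 1 x 1 map has no neighbours
                u[(r, c)] = 0.0
                continue
            total = sum(dist(weights[(r, c)], weights[n]) for n in neighbours)
            u[(r, c)] = total / len(neighbours)
    return u


# Hypothetical 1 x 3 map: two similar cells and one very different cell.
weights = {(0, 0): [0.0], (0, 1): [0.0], (0, 2): [10.0]}
print(u_matrix(weights, 1, 3))
```

Low values mark the interior of a cluster, and the high value at the last cell marks the barrier between it and the other two.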

Each variable has its own component plane. Each has a different profile. Variable 1 has a very high intensity in the top right hand corner, suggesting it is highly diagnostic or of high intensity for Olive oil.


It has its lowest intensity in the bottom centre group (safflower oil). Variable 2 is highly intense in corn margarine but of low or negligible intensity for all the other groups.

Variable 3 is primarily diagnostic of corn margarine and olive oil. This representation is a slice through the weights vector, scaling the highest or most positive weight to 1 and the lowest or least positive to 0 for each of the variables.
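The scaling of a component plane can be sketched as below, assuming the map is stored as a dictionary from cell to weight vector. The function name and the example weights are hypothetical, and returning zeros for a constant variable is a choice made here for the sketch.

```python
def component_plane(weights, variable):
    """Slice one variable out of the weight vectors and scale it to [0, 1]."""
    values = {cell: w[variable] for cell, w in weights.items()}
    lowest, highest = min(values.values()), max(values.values())
    if highest == lowest:
        # A constant variable carries no pattern (assumption of this sketch).
        return {cell: 0.0 for cell in values}
    return {cell: (v - lowest) / (highest - lowest) for cell, v in values.items()}


# Hypothetical map with three cells and two variables.
weights = {(0, 0): [2.0, 5.0], (0, 1): [4.0, 5.0], (1, 0): [6.0, 5.0]}
print(component_plane(weights, 0))
print(component_plane(weights, 1))
```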

Component planes for three variables or spectral wavelengths in case study 1 (top), together with the class map (bottom) for reference.

Component planes can be regarded as an analogue of loadings plots, allowing one to determine which variables, if any, are markers or diagnostic of a group of samples. There are a number of ways of doing this, but one is to see how similar a component plane of a variable is to its class component plane [ 56 ]. A class component plane can be represented by 1 for all cells that are closest to BMUs for that class, 0 for cells that are closest to BMUs for another class, and an intermediate value if there are neighbouring BMUs from the class of interest and one or more other classes, rather like the class maps, but in this situation each single class has its own corresponding plane.

All component planes for the variables are likewise scaled between 0 and 1. Multiplying the two and summing provides an index for how strongly a variable represents a particular class and can be employed as a form of variable selection or ranking.

If there are two classes or groups in the data it is possible to subtract the index of one class B from that of the other A. A positive value represents a marker for class A and a negative value for class B. The magnitude of this difference allows ranking of variables according to their perceived relative importance as markers.


Where there are more than two groups, the index can be calculated for each of the groups, and subtracted from the index calculated from the groups left out. For example a marker for class A would have a positive value if the index for class A minus the index for all other groups together is positive.
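The index described above can be sketched as follows. Both planes are assumed to be scaled between 0 and 1 and stored as dictionaries over the same cells. For more than two groups, "all other groups together" is interpreted here as the sum of the indices of the other classes, which is an assumption of this sketch; the function names are hypothetical.

```python
def class_index(component, class_plane):
    """Multiply a variable's component plane by a class plane, cell by cell, and sum."""
    return sum(component[cell] * class_plane[cell] for cell in component)


def marker_score(component, class_planes, target):
    """Index for the target class minus the indices of all other classes.

    With two classes this reduces to index(A) - index(B).
    """
    own = class_index(component, class_planes[target])
    others = sum(class_index(component, plane)
                 for label, plane in class_planes.items() if label != target)
    return own - others


# Hypothetical three-cell map with two classes.
component = {(0, 0): 1.0, (0, 1): 0.5, (1, 0): 0.0}
class_planes = {
    "A": {(0, 0): 1.0, (0, 1): 0.5, (1, 0): 0.0},
    "B": {(0, 0): 0.0, (0, 1): 0.5, (1, 0): 1.0},
}
print(marker_score(component, class_planes, "A"))  # positive: marker for class A
print(marker_score(component, class_planes, "B"))  # negative: not a marker for B
```

Ranking variables by the magnitude of this score gives the ordering of candidate markers for each class.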

SOMs as originally described were primarily for visualisation or exploratory data analysis. However adaptations have been described that allow SOMs to be used in a supervised method, that is for predictive modelling [ 55 , 56 , 61 - 63 ]. In addition to the variable component planes, another set of component planes are added that correspond to the class membership.

If there are four classes, there are four such planes. These have a value of 1 if a cell corresponds to a sample definitely belonging to a specified class, and 0 if definitely not (intermediate values are possible where there is uncertainty).

Initially the values are randomly set to a value between 0 and 1. These then are used as extra planes in the training. The relative weight or importance of the variable and class planes can be adjusted. If the class information has a relative weight of 0, the result is the same as an unsupervised map. If the relative weight is very high, the objects are in effect forced into a class structure. When there are many classifiers it is possible to train the map separately for each classifier [ 56 ].
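The effect of the relative weight can be illustrated with a combined distance. This is only a sketch under an assumed form, a squared Euclidean distance on the variables plus a weighted squared distance on the class planes; the cited supervised SOM papers may scale the two blocks differently. All names and numbers are hypothetical.

```python
def combined_distance(sample_x, sample_c, cell_x, cell_c, class_weight):
    """Squared distance on variables plus class_weight times squared distance on classes."""
    d_x = sum((a - b) ** 2 for a, b in zip(sample_x, cell_x))
    d_c = sum((a - b) ** 2 for a, b in zip(sample_c, cell_c))
    return d_x + class_weight * d_c


def find_bmu_supervised(cells, sample_x, sample_c, class_weight):
    """cells maps (row, col) to a pair (variable weights, class weights)."""
    return min(cells, key=lambda k: combined_distance(
        sample_x, sample_c, cells[k][0], cells[k][1], class_weight))


# Hypothetical two-cell map, one variable and two classes.
cells = {
    (0, 0): ([0.0], [1.0, 0.0]),   # looks like class 1
    (0, 1): ([1.0], [0.0, 1.0]),   # looks like class 2
}
sample_x, sample_c = [0.8], [1.0, 0.0]  # measured value near cell (0, 1), labelled class 1

print(find_bmu_supervised(cells, sample_x, sample_c, 0.0))   # unsupervised behaviour
print(find_bmu_supervised(cells, sample_x, sample_c, 10.0))  # class information dominates
```

With a weight of 0 the sample goes where its measurements place it; with a very high weight it is forced into the cell matching its class label, which is the overfitting risk associated with supervised maps.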

For case study 3 (NMR of saliva), the samples can be classified according to whether they were treated with mouthwash or according to sampling day or donor. For unsupervised SOMs these factors are mixed together. Note that the training for each of the factors is quite different, so the samples are positioned in different cells in each of the supervised maps.

Note also that the maps have not been fully forced to provide complete class separation which can be controlled by adjusting the relative weights of the two types of information. Training set BMUs are indicated on the class plots. However a dramatic difference can be seen when comparing the supervised and unsupervised version of the sampling day.

In the former the samples are clearly divided into their day of sampling because this has forced the model, but in the latter they are more or less randomly distributed, as this factor has little or no influence, being a dummy factor (case study 3).

Hence supervised SOMs can overfit models. However, an advantage over, for example PLS type approaches is that it is possible to specify the relative importance of the classifier and the measured variable, whereas in PLS they have equal importance. Supervised SOM representations can, therefore, in themselves, be misleading under certain circumstances, but if correctly employed can be used safely in many situations and as such do provide valuable tools as described below.

Note that there is not much literature on how to optimise the relative weights of the class and variable information. However in methods such as PLS, the relative importance of these two types of information is usually fixed so that they are equal, and an advantage of supervised SOMs is that this can be adjusted. One of the most important uses of supervised SOMs involves determining what variables are important [ 55 , 56 ] for the purpose of defining a class or group of samples, often called marker variables.

These may, for example, be characteristic chromatographic peaks or wavelengths. The principles are similar to those described in the section on 'Component planes', with a number of additional features.

The first is that maps can be forced or trained separately for each type of grouping. For case study 3, there are three types of grouping, so an unsupervised SOM would mix these together.

A supervised SOM would distinguish these causes of variation and hence can be employed in cases where there are several different factors.

Remember too that the component planes for donors and treatment type are not comparable. This allows different variables to be found. For each class, variables can be ranked according to the similarity of their supervised component plane and the supervised map for the corresponding class and factor. Component planes for supervised SOMs for case study 3, illustrating marker variables for donor J and for treatment.

Note light colours indicate a high level of the variable. Although a similar exercise could also be performed for the unsupervised map, this only makes sense if the class of interest is the predominant factor and shows grouping in the map. In this section we describe how to determine which variables are most significant, or are the most likely to be markers, for each class or grouping.

However, this does not necessarily mean they are significant; it simply ranks variables in order of importance.

In case study 3, we expect there to be several strongly significant variables for the treatment type, but none for the sampling day, which is a dummy variable. Yet all variables will be ranked for each type of factor. There are a number of ways of determining significance.

One way [ 55 ] is to reform the map many times, from different random starting points. A factor that is significant will remain significant or a "positive marker" over all the iterations.

A variable that is not significant will only randomly appear on the list as a positive marker, and will sometimes appear as a negative marker. A marker that is positive some of the time and negative other times is not a stable marker and is therefore not considered significant. Of course, the more times the SOM is formed, the higher the confidence level. Naturally this method requires good computing power.
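The stability check can be sketched as a filter over the signed marker scores obtained from repeated maps. It assumes one dictionary of scores per SOM run, each formed from a different random starting point; the variable names and scores are made up.

```python
def stable_markers(scores_per_run):
    """Return the variables whose marker score is positive in every run."""
    variables = scores_per_run[0].keys()
    return [v for v in variables
            if all(run[v] > 0 for run in scores_per_run)]


# Hypothetical signed marker scores from three SOMs with different starting points.
runs = [
    {"peak_1": 0.8, "peak_2": 0.1, "peak_3": -0.4},
    {"peak_1": 0.7, "peak_2": -0.2, "peak_3": -0.5},
    {"peak_1": 0.9, "peak_2": 0.3, "peak_3": -0.3},
]
print(stable_markers(runs))  # peak_2 changes sign between runs, so it is not stable
```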

If 10, iterations are required to form a SOM, then this is repeated times, 1 million iterations are needed. This can be expedited using parallel processors, such as quadcore or even cluster computers, using for example parallel processing in Matlab using Linux. Although many packages may have been written prior to the widespread advent of parallel processors, it is a simple task to code SOMs into most modern environments using widespread programming tools.

It is possible to perform predictive modelling to determine what class an unknown is a member of. In such circumstances the sample is not part of the original training set, but after training, a test set of samples that are left out [ 3 ] can then be assessed.

If this happens a lot, one solution would be to increase the resolution of the map. Using an independent test set protects against overfitting. By increasing the relative importance of the classifier, apparently excellent separation between groups can be obtained but this is not always meaningful. Another problem arises if new unknown samples are members of none of the predefined groups. We will show how to deal with this in the next section.

As the training progresses, this neighborhood gets smaller, so that only the neurons that are very close to the winner get updated. The training stops when there are no more neurons in the neighborhood. Usually, the neighborhood function, N(j, t), is chosen as an L-dimensional Gaussian function as given below:

Initialisation: set the initial synaptic weights to small random values, say in the interval [0,1], and assign a small positive value to the learning rate parameter.

For instance, Euclidean distance measurement is denoted as:

Cooperation: identify all output nodes j within the neighborhood of J defined by the neighborhood size R. For these nodes, do the following for all input records.

Reduce the radius with an exponential decay function:

Iteration: adjust the learning rate and neighborhood size, as needed, until no changes occur in the feature map. Repeat step ii and stop when the termination criteria are met.
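The steps above can be sketched as a short training loop. This is a minimal illustration and not the authors' implementation: it assumes a rectangular grid, squared Euclidean distance for choosing the winner J, a Gaussian neighbourhood of the form exp(-d^2 / (2 R^2)) on the grid distance d, and exponential decay of both the radius R and the learning rate. All parameter names and default values are assumptions.

```python
import math
import random


def train_som(data, n_rows, n_cols, n_iterations=100, rate0=0.5, seed=0):
    """Train a small SOM and return a dict mapping (row, col) to a weight vector."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Initialisation: small random weights in the interval [0, 1].
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(n_rows) for c in range(n_cols)}
    radius0 = max(n_rows, n_cols) / 2.0
    # Time constant chosen so the radius shrinks towards about 1 cell (assumption).
    time_constant = n_iterations / math.log(radius0) if radius0 > 1 else n_iterations

    for t in range(n_iterations):
        radius = radius0 * math.exp(-t / time_constant)  # exponential decay of R
        rate = rate0 * math.exp(-t / n_iterations)       # decaying learning rate
        for x in data:
            # Competition: winner J is the node closest to the input pattern x.
            winner = min(weights, key=lambda j: sum(
                (a - b) ** 2 for a, b in zip(x, weights[j])))
            # Cooperation and adaptation: update nodes inside the neighbourhood.
            for j, w in weights.items():
                grid_sq = (j[0] - winner[0]) ** 2 + (j[1] - winner[1]) ** 2
                if grid_sq <= radius ** 2:
                    h = math.exp(-grid_sq / (2 * radius ** 2))
                    for i in range(dim):
                        w[i] += rate * h * (x[i] - w[i])
    return weights


# Hypothetical two-variable data, already scaled to [0, 1].
data = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
som = train_som(data, n_rows=3, n_cols=3, n_iterations=50)
print(len(som), "cells trained")
```

A fixed number of iterations is used here as the termination criterion; stopping when the map no longer changes, as the text describes, would require comparing successive weight sets.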

The improved hexagonal lattice area consists of six important points: Figure 3 illustrates the formulation of the improved hexagonal lattice area. A detailed explanation of the proposed method is given in the next paragraph. PSO is a global optimisation, population-based evolutionary algorithm for dealing with problems in which the best solution can be presented as a point or surface in an n-dimensional space. Hypotheses are plotted in this space and seeded with an initial velocity, as well as a communication channel between the particles.

An improved hexagonal lattice area is introduced for SOM learning enhancement; and PSO is integrated into this proposed SOM to evolve the weights for the learning prior to the weights adjustments. This is because PSO can find the best reduced search space for a particular input and support the algorithm to take more nodes into consideration while determining search space and not to be trapped by the same node continuously [ 15 ].

At this stage, the enhanced SOM will be implemented for the classification purpose to obtain the weights and later will be optimised using PSO. Input feature vector x is presented to the network and the winner node J , that is closest to the input pattern, x is chosen using the equation:.

Initialise the population array of particle representing random solutions for d dimensional problem space. To investigate the effectiveness of PSO in evolving the weights from SOM, the proposed method has been performed in the testing and validation process.
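For orientation, a generic PSO loop is sketched below. It uses the widely used update with an inertia term, a pull towards each particle's own best position and a pull towards the global best position; it is not the specific SOM-PSO hybrid proposed in the text. The objective function, the coefficient values and all names are assumptions.

```python
import random


def pso_minimise(objective, dim, n_particles=20, n_steps=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise `objective` over a dim-dimensional space with a basic PSO."""
    rng = random.Random(seed)
    # Initialise the population of particles as random candidate solutions.
    positions = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    p_best = [p[:] for p in positions]
    p_best_cost = [objective(p) for p in positions]
    g = min(range(n_particles), key=lambda i: p_best_cost[i])
    g_best, g_best_cost = p_best[g][:], p_best_cost[g]

    for _ in range(n_steps):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                velocities[i][d] = (inertia * velocities[i][d]
                                    + c1 * r1 * (p_best[i][d] - positions[i][d])
                                    + c2 * r2 * (g_best[d] - positions[i][d]))
                positions[i][d] += velocities[i][d]
            cost = objective(positions[i])
            if cost < p_best_cost[i]:        # update the particle's own best
                p_best[i], p_best_cost[i] = positions[i][:], cost
                if cost < g_best_cost:       # update the global best
                    g_best, g_best_cost = positions[i][:], cost
    return g_best, g_best_cost


def sphere(p):
    """Toy objective with its minimum at the origin."""
    return sum(v * v for v in p)


best, best_cost = pso_minimise(sphere, dim=2)
print(best, best_cost)
```

In a SOM setting the position of a particle would encode candidate weights and the objective would be a map quality measure, but that coupling is left out of this sketch.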

In the testing phase, data is presented to the network with target nodes for each input sets. The reference attributes or classifier computed during training process is used to classify input data set. The algorithm identifies the winning node that will be used for determining the output of the network.

Then, the output of the network is compared to the expected result to decide the ability of the network for classification phase. This classification stage will classify test data into correct predefined classes obtained during training process.

A number of data samples are presented to the network, and the percentage of correctly classified data is calculated. The percentage of correct classifications is used to measure the accuracy and the learning ability of the network.

The result is validated and compared using several performance measurements: Later, the error differences between the proposed methods are computed for further validations. The performance measurement of the proposed methods is based on quantisation error QE and classification accuracy CA. The efficiency of the proposed methods is validated accordingly; if QE values are smaller and the classification accuracy is higher, then the results are promising. QE is used for measuring the quality of SOM map.

QE of an input vector is defined by the difference between the input vector and the closest codebook vector. QE describes how accurately the neurons respond to the given dataset. For example, if the reference vector of the BMU calculated for a given testing vector x_i is exactly the same as x_i, the error in precision is 0. The equation is given as follows. While the classification accuracy indicates how well the classes are separated on the map, the classification accuracy of new samples measures the network's generalisation for better quality of SOM's mapping.
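Both measures can be sketched in a few lines. The sketch assumes the commonly used definition of QE as the average Euclidean distance between each input vector and its closest codebook vector, and classification accuracy as the percentage of correctly labelled samples; the function names and example data are hypothetical.

```python
import math


def quantisation_error(data, codebook):
    """Mean distance between each input vector and its closest codebook vector."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    return sum(min(dist(x, w) for w in codebook) for x in data) / len(data)


def classification_accuracy(predicted, expected):
    """Percentage of samples whose predicted class equals the expected class."""
    correct = sum(p == e for p, e in zip(predicted, expected))
    return 100.0 * correct / len(expected)


codebook = [[0.0, 0.0], [1.0, 1.0]]
data = [[0.0, 0.0], [1.0, 0.0]]  # the first vector coincides with a codebook vector
print(quantisation_error(data, codebook))
print(classification_accuracy(["a", "b", "a", "a"], ["a", "b", "b", "a"]))
```

The first input contributes an error of 0 because its BMU reference vector is identical to it, which matches the example in the text.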

The goal of the conducted experiments is to investigate the performance of the proposed methods. The results are validated in terms of classification accuracy and quantisation error QE on standard universal machine learning datasets: As PSO and improved lattice structure are being implemented, the convergence time is increasing. This scenario is due to the PSO process in searching for the g best of BMU as well as wider coverage for updating nodes with the improved lattice structure.

The basic SOM architecture consists of a lattice that acts as an output layer with its input nodes fully connected. In this study, the network architecture is designed based on the selected real world classification problems. Table 1 provides the specification for each dataset. The input layer is comprised of input pattern with different nodes that is randomly chosen from training data set. Input patterns are presented to all output nodes neurons in the network simultaneously.

The number of input nodes determines the number of data values required to be fed into the network, while the number of nodes in the Kohonen layer represents the maximum number of possible classes. The training starts once the dataset has been initialised and input patterns have been selected. The learning phase of the SOM algorithm repeatedly presents numerous patterns to the network. The learning rule of the classifier allows these training cases to organize in a two-dimensional feature map.

Patterns which resemble each other are mapped onto a specific cluster. During the training phase, the class for a randomly selected input node is determined. This is done by labeling the output node that is most similar to the input node (the best-matching unit) compared to the other nodes in the Kohonen mapping structure. The output of the training is the resulting map, which contains the winning neurons and their associated weight vectors.

Subsequently, these weight vectors are optimised by PSO.
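The text does not state which PSO variant or fitness function is used, so the following is only a rough sketch of how trained weight vectors could be refined with a global-best PSO. It assumes the quantisation error as the fitness to minimise, and all names and parameter values (`pso_refine`, `inertia`, `c1`, `c2`, the perturbation scale 0.1) are hypothetical choices for illustration.

```python
import math
import random

def quantisation_error(weights, samples):
    """Average distance between each sample and its closest weight vector."""
    return sum(min(math.dist(w, x) for w in weights) for x in samples) / len(samples)

def pso_refine(weights, samples, n_particles=10, n_iter=30,
               inertia=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    n_nodes, dim = len(weights), len(weights[0])
    flat = [v for w in weights for v in w]

    def unflatten(p):
        return [p[i * dim:(i + 1) * dim] for i in range(n_nodes)]

    def fitness(p):
        return quantisation_error(unflatten(p), samples)

    # Particle 0 is the trained map itself; the rest are small perturbations of it.
    particles = [flat[:]] + [[v + rng.gauss(0, 0.1) for v in flat]
                             for _ in range(n_particles - 1)]
    velocities = [[0.0] * len(flat) for _ in particles]
    p_best = [p[:] for p in particles]
    p_best_fit = [fitness(p) for p in particles]
    g = min(range(len(particles)), key=p_best_fit.__getitem__)
    g_best, g_best_fit = p_best[g][:], p_best_fit[g]

    for _ in range(n_iter):
        for i, p in enumerate(particles):
            for k in range(len(p)):
                r1, r2 = rng.random(), rng.random()
                # Standard velocity update: inertia + cognitive + social terms.
                velocities[i][k] = (inertia * velocities[i][k]
                                    + c1 * r1 * (p_best[i][k] - p[k])
                                    + c2 * r2 * (g_best[k] - p[k]))
                p[k] += velocities[i][k]
            f = fitness(p)
            if f < p_best_fit[i]:
                p_best[i], p_best_fit[i] = p[:], f
                if f < g_best_fit:
                    g_best, g_best_fit = p[:], f
    return unflatten(g_best), g_best_fit

weights = [[0.2, 0.2], [0.8, 0.8]]
samples = [[0.0, 0.0], [0.1, 0.1], [1.0, 1.0], [0.9, 1.0]]
before = quantisation_error(weights, samples)
refined, after = pso_refine(weights, samples)
```

Because the swarm is seeded with the trained map, the returned g best can never have a worse fitness than the map it started from.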


The classification accuracy is calculated to investigate the behavior of the network on the training data. In the testing phase, for any input pattern, if the m th neuron is the winner, the pattern belongs to the m th cluster. This makes it possible to test the capacity of the network to assign a new, independent test set to the correct classes. An independent test set is a set similar to the input set but not part of the training set.

The testing set can be seen as a representative of the general case. There is no weight updating in the recalling phase.


A series of datasets that had not been used in the learning phase, but had previously been interpreted, was presented to the network. For each case, the response of the network (the label of the associated neuron) was compared to the expected result, and the percentage of correct responses was computed.
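The recall-phase evaluation described above amounts to a lookup and a count. The sketch below assumes Euclidean distance and a dictionary `node_labels` mapping node indices to class labels; both, along with the function names, are illustrative assumptions rather than details given in the text.

```python
import math

def find_bmu(weights, x):
    """Return the index of the node whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda j: math.dist(weights[j], x))

def classification_accuracy(weights, node_labels, test_samples, expected):
    """Percentage of test patterns whose BMU label matches the expected class.

    No weights are updated here, matching the recall phase.
    """
    correct = 0
    for x, y in zip(test_samples, expected):
        if node_labels.get(find_bmu(weights, x)) == y:
            correct += 1
    return 100.0 * correct / len(test_samples)

weights = [[0.0, 0.0], [1.0, 1.0]]
node_labels = {0: "a", 1: "b"}
test_samples = [[0.1, 0.2], [0.8, 0.9], [0.4, 0.3], [0.9, 0.7]]
expected = ["a", "b", "b", "b"]
accuracy = classification_accuracy(weights, node_labels, test_samples, expected)
```

In this toy case the third pattern falls closer to node 0 and is misclassified, so three of the four responses are correct.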

It is often reported in the literature that the success of Self-Organizing Map (SOM) formation depends critically on the initial weights and on the main parameters of the algorithm, namely, the learning rate parameter and the neighborhood set [36, 37].

These parameters usually have to be tuned by trial and error, which makes retraining time consuming. Owing to time constraints, all parameter values were fixed and used unchanged throughout the experiments.

According to [ 38 ], the number of map units is usually in the range of to Deboeck and Kohonen [ 39 ] recommend using ten times the dimension of the input patterns as the number of neurons, and this was adopted in these experiments. There is no guideline in suggesting good learning rates to any given learning problem. In standard SOM, too large and too small learning rates can lead to poor network performance [ 40 ]. Neighborhood function and the number of neurons determine the granularity of the resulting mapping.

A larger neighborhood is used at the beginning of training and is then gradually decreased to a suitable final radius. The larger the area over which the neighborhood function has high values, the more rigid the map will be. In these experiments, the initial radius is set to half the size of the lattice. A more recent version of the feature map uses the Gaussian function to describe the neighborhood and the learning rate.

The Gaussian function is supposed to describe a more natural mapping so as to help the algorithm converge in a more stable manner. The accuracy of the map also depends on the number of iterations of the SOM algorithm.
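As a rough illustration of a Gaussian neighbourhood, the sketch below weights each node's update by exp(-d^2 / (2 r^2)), where d is the lattice distance to the BMU and r is the current radius. It assumes a rectangular grid of integer coordinates for simplicity (not the hexagonal lattice studied here), and the function names and the 4 x 4 grid are hypothetical.

```python
import math

def gaussian_neighbourhood(bmu_pos, node_pos, radius):
    """Gaussian weighting of a node by its lattice distance to the BMU."""
    d2 = sum((a - b) ** 2 for a, b in zip(bmu_pos, node_pos))
    return math.exp(-d2 / (2.0 * radius ** 2))

def update_weights(weights, positions, x, bmu, learning_rate, radius):
    """Move every node towards x, scaled by the neighbourhood function."""
    updated = []
    for j, w in enumerate(weights):
        h = gaussian_neighbourhood(positions[bmu], positions[j], radius)
        updated.append([wi + learning_rate * h * (xi - wi) for wi, xi in zip(w, x)])
    return updated

rows, cols = 4, 4
positions = [(r, c) for r in range(rows) for c in range(cols)]
initial_radius = max(rows, cols) / 2.0  # half the lattice size, as in the text

weights = [[0.0, 0.0] for _ in positions]
updated = update_weights(weights, positions, [1.0, 1.0], bmu=0,
                         learning_rate=0.5, radius=initial_radius)
```

The BMU itself receives the full learning-rate step, while nodes far away on the lattice move only slightly, which is what makes the map stiffer when the radius is large.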

A rule of thumb states, for good statistical accuracy, number of iterations should be at least times the number of neurons. According to [ 36 ], the total learning time is always to If the time taken is longer, the clustering result becomes inaccurate. A more serious problem is that the topology preserving mapping is not guaranteed even if a huge number of iterations were used.

Here, the SOM classifiers were evaluated on the classification accuracy of the clustering result and on the computation time [41]. To measure the quality of the SOM, the quantisation error was also calculated, defined as the average distance between every input vector and its BMU. The experiments were conducted on various datasets with three distance measures: Euclidean, Manhattan, and Chebyshev distance.
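The quantisation error under the three distance measures can be sketched as follows. The function names are hypothetical, and the sketch assumes that the same measure is used both to pick the BMU and to report the error.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

def quantisation_error(weights, samples, distance):
    """Average distance between every input vector and its BMU."""
    return sum(min(distance(w, x) for w in weights) for x in samples) / len(samples)

weights = [[0.0, 0.0], [1.0, 1.0]]
samples = [[0.0, 0.0], [2.0, 2.0]]

qe_euclidean = quantisation_error(weights, samples, euclidean)
qe_manhattan = quantisation_error(weights, samples, manhattan)
qe_chebyshev = quantisation_error(weights, samples, chebyshev)
```

The first sample coincides with a weight vector and contributes zero error, so the three values differ only through the second sample, which shows how the choice of measure changes the reported QE for the same map.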

The choice of distance measure influences the accuracy, efficiency, and generalisation ability of the results. The least quantisation error is 0. It shows that the improved lattice structure of ESOM yields significant impact on the accuracy of the classifications.

The distance measures are Euclidean distance, Manhattan distance (MAN), and Chebyshev distance. The results were compared in terms of classification accuracy, quantisation error, and convergence time.

It shows that the improved lattice structure of ESOM has a significant impact on the accuracy of the classifications, despite the slower convergence time.

A larger grid size results in a longer training time. Furthermore, the larger the lattice, the more nodes must be considered in the BMU calculation. However, the focus of this study is on the performance of the proposed method in terms of higher accuracy and lower QE. Regardless of the type of distance measure, the results of the proposed method are significant.

This is due to the improved lattice structure and to the use of PSO in optimising the weights. As discussed before, the improved formulation of the hexagonal lattice structure gives more coverage in the neighbourhood updating procedure.

Hence, the probability of finding the salient nodes as winner nodes is higher, and this is reflected in the accuracy and quantisation error. However, convergence is slower for the proposed method because of the natural behaviour of the particles in searching for g best both globally and locally. This tradeoff, that is, higher accuracy at the cost of longer convergence time and vice versa, does not greatly affect the success of the proposed methods, in line with the No Free Lunch Theorem [42].

The theorem implies that a general-purpose universal algorithm is impossible: an algorithm may be good at one class of problems, but its performance will suffer on others. In more detail, achieving higher accuracy depends not only on the type of dataset but also on the purpose for which the problem is being solved. From the findings, it seems that the choice of the SOM's lattice structure is crucial for updating the neighbourhood structures during network learning.

The standard formulation for basic and improved hexagonal lattice structure is illustrated in Figure 6.


However, after training, the number of nodes to be updated was Using the basic hexagonal formula, a wide area was left uncovered, which caused insufficient neighborhood updating. A potential node might not be counted during the updating process. We now illustrate the improved hexagonal lattice structure, which gives wider and better coverage (Figure 6).

The radius decreases according to an exponential decay function. The improved neighborhood hexagonal lattice area is defined as 2.
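The exact decay formula is not given here, so the sketch below uses a common form, r(t) = r0 * exp(-t / tau), as an assumption. The choice tau = T / ln(r0), which brings the radius down to 1 at the final iteration T, is likewise an illustrative convention, and the names and values are hypothetical.

```python
import math

def decayed_radius(initial_radius, t, time_constant):
    """Neighbourhood radius after t iterations of exponential decay."""
    return initial_radius * math.exp(-t / time_constant)

n_iterations = 1000
initial_radius = 5.0  # e.g. half of a 10 x 10 lattice
# Assumed convention: the radius shrinks to 1 at the last iteration.
time_constant = n_iterations / math.log(initial_radius)

radius_start = decayed_radius(initial_radius, 0, time_constant)
radius_mid = decayed_radius(initial_radius, n_iterations // 2, time_constant)
radius_end = decayed_radius(initial_radius, n_iterations, time_constant)
```

With this schedule the neighbourhood covers a wide area early in training and narrows to the BMU's immediate surroundings by the end.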