Statistical methods for non-hierarchical data clustering (Métodos estatísticos para agrupamento não hierárquicos de dados)

Authors

  • E. Acarini USP; Instituto de Geociências
  • G. Amaral USP; Instituto de Geociências; Departamento de Paleontologia e Estratigrafia

DOI:

https://doi.org/10.11606/issn.2317-8078.v0i13p45-63

Abstract

A cluster is usually understood as a subset of data points that are highly similar or associated with one another and relatively unassociated with data points outside the subset. Cluster analysis makes it possible to study data sets and subdivide them into several distinct clusters, each with similar internal characteristics. The K-means and ISODATA algorithms can establish this partition without any hierarchical structure influencing the cluster formation process. This gives great mobility to the data units analyzed and facilitates the formation of clusters with the closest internal relationships. The K-means algorithm uses the Euclidean distance between each unit of the data set and the mean of the variables of each cluster, and establishes K clusters such that these distances are as small as possible. In the ISODATA technique (which operates in the same way), each pattern is assigned to the cluster for which the squared distance between it and the cluster mean is smallest. The new cluster means are then computed and the whole procedure is repeated.
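
The assign-and-recompute loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`squared_distance`, `mean_point`, `kmeans`), the `max_iter` cap, and the choice of the first K data units as initial means are all assumptions made for illustration. The sketch covers only the shared step of assigning each pattern to the nearest cluster mean and recomputing the means; the cluster splitting and merging usually associated with ISODATA are left out.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(data, k, max_iter=100):
    """Partition `data` into k clusters by repeated assignment and averaging.

    Assumption: the first k data units serve as the initial cluster means.
    Returns (labels, means), where labels[i] is the cluster index of data[i].
    """
    means = [tuple(p) for p in data[:k]]
    labels = None
    for _ in range(max_iter):
        # Assign each unit to the cluster whose mean is closest
        # (smallest squared Euclidean distance).
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, means[j]))
            for p in data
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the means are stable
        labels = new_labels
        # Recompute each cluster mean from its current members.
        for j in range(k):
            members = [p for p, lab in zip(data, labels) if lab == j]
            if members:  # an empty cluster keeps its previous mean
                means[j] = mean_point(members)
    return labels, means


# Usage on a small hypothetical data set with two well-separated groups.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
labels, means = kmeans(points, 2)
print(labels)  # [0, 0, 1, 1]
print(means)   # [(0.0, 0.5), (10.0, 10.5)]
```

Because no hierarchy constrains the process, a unit can move between clusters from one pass to the next. In the example above, the second unit starts in its own cluster and joins the first once the means are updated.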

Published

1992-10-01