Browse Prior Art Database

CLASSIFICATION AND SPECIFICATION OF FLAT CLUSTER METHODS

IP.com Disclosure Number: IPCOM000128286D
Original Publication Date: 1978-Dec-31
Included in the Prior Art Database: 2005-Sep-15
Document File: 24 page(s) / 53K

Publishing Venue

Software Patent Institute

Related People

William H. E. Day: AUTHOR [+3]

Abstract

Jardine and Sibson's flat cluster methods exhibit mathematical prop-erties that make them particularly appropriate for use in taxonomy. But since the number of flat cluster methods increases at least exponentially with the number of objects to be classified, it seems desirable to classify flat cluster methods, so as to isolate subclasses appropriate to the con-struction of taxonomic systems, andto develop an effective mechanism for specifying flat cluster methods. In this paper.a classification scheme is developed that is based on characteristics of confinement, consistency, stability, and authenticity. A type of graph-theoretic property, the normal property, is described, and it is shown by means of simple transformations that every normal property specifies a flat cluster method and that each flat cluster method is specified by at least one normal property. To illustrate these results, ten clustering properties described by Hubert are used to con struct normal properties, and the corresponding flat cluster methods are class-ified and compared with known methods. The ten normal properties specify three distinct sequences of flat cluster methods based on the graph-theoretic concepts of k-minimum, degree, k-line-connectivity, and k-point-connectivity.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 10% of the total text.

Page 1 of 24

THIS DOCUMENT IS AN APPROXIMATE REPRESENTATION OF THE ORIGINAL.

CLASSIFICATION AND SPECIFICATION OF FLAT CLUSTER METHODS*

William H. E. Day

Department of Computer Science Southern Methodist University Dallas, Texas 75275

February, 1978

*This research was supported in part by grants GJ-40487 and DCR75-10930 of the National Science Foundation.

ABSTRACT

Jardine and Sibson's flat cluster methods exhibit mathematical prop-erties that make them particularly appropriate for use in taxonomy. But since the number of flat cluster methods increases at least exponentially with the number of objects to be classified, it seems desirable to classify flat cluster methods, so as to isolate subclasses appropriate to the con-struction of taxonomic systems, andto develop an effective mechanism for specifying flat cluster methods. In this paper.a classification scheme is developed that is based on characteristics of confinement, consistency, stability, and authenticity. A type of graph-theoretic property, the normal property, is described, and it is shown by means of simple transformations that every normal property specifies a flat cluster method and that each flat cluster method is specified by at least one normal property. To illustrate these results, ten clustering properties described by Hubert are used to con struct normal properties, and the corresponding flat cluster methods are class-ified and compared with known methods. The ten normal properties specify three distinct sequences of flat cluster methods based on the graph-theoretic concepts of k-minimum, degree, k-line-connectivity, and k-point-connectivity.

TABLE OF CONTENTS

NO. PAGE

1. Introduction and Summary 1 2. Classification of Flat Cluster Methods 8 3. Specification of Flat Cluster Methods 16 4. Illustrations of Classification and Specification 20 5. References 23

LIST OF FIGURES

NO. PAGE

1. Fiat Cluster Method Classification 24

LIST OF TABLES

Southern Methodist University Page 1 Dec 31, 1978

Page 2 of 24

CLASSIFICATION AND SPECIFICATION OF FLAT CLUSTER METHODS

NO. PAGE

1. Graph-Theoretic Properties 25 2. Normal Properties 26

1. INTRODUCTION AND SUMMARY

In this paper P is a finite nonempty set of objects to be class-ified, and G is a loopless labeled undirected graph with points labeled by the objects in P and with two points joined by a line if the corres-ponding objects are similar with respect to a specified criterion. Line lengths and relative positions of points have no significance in sche-matic representations of such graphs. The set of all loopless labeled undirected graphs-with point set P is denoted by G(P).

A graph-theoretic cluster method operates on each graph G in G(P) to obtain a simple clustering of the objects in P. This clustering is a subset of the set P(P) of all nonempty subsets of P such that: each object in P appears in at least one subset in the clustering; 'and no sub-set in the clustering contains another subset in the clustering. The subsets in the clusteri...