The Remarkable Simplicity of Very High Dimensional Data: Application of Model-Based Clustering

Murtagh, Fionn

(2009)

Murtagh, Fionn (2009) The Remarkable Simplicity of Very High Dimensional Data: Application of Model-Based Clustering. Journal of Classification, 26 (3).

Abstract

An ultrametric topology formalizes the notion of hierarchical structure. An ultrametric embedding, referred to here as ultrametricity, is implied by a hierarchical embedding. Such hierarchical structure can be global in the data set, or local. By quantifying extent or degree of ultrametricity in a data set, we show that ultrametricity becomes pervasive as dimensionality and/or spatial sparsity increases. This leads us to assert that very high dimensional data are of simple structure. We exemplify this finding through a range of simulated data cases. We discuss also application to very high frequency time series segmentation and modeling.
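
The claim that ultrametricity grows with dimensionality can be illustrated with a small simulation. The sketch below is not the measure used in the paper; it assumes a simple stand-in: a triangle of three points counts as ultrametric when its two largest pairwise Euclidean distances agree to within a relative tolerance, which is the strong triangle inequality d(x, z) <= max(d(x, y), d(y, z)) up to that tolerance. The function names, the uniform test data, and the tolerance value are all hypothetical choices made for illustration.

```python
import math
import random


def euclidean(p, q):
    """Euclidean distance between two equal-length coordinate lists."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def ultrametric_fraction(points, n_triples=300, tol=0.1, seed=0):
    """Fraction of sampled triangles that are approximately ultrametric.

    Assumption (not taken from the paper): a triangle passes when its two
    largest sides differ by at most `tol` relative to the largest side.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_triples):
        x, y, z = rng.sample(points, 3)
        _, d_mid, d_max = sorted(
            (euclidean(x, y), euclidean(y, z), euclidean(x, z))
        )
        if d_max == 0 or (d_max - d_mid) / d_max <= tol:
            hits += 1
    return hits / n_triples


def uniform_cloud(n_points, dim, seed=0):
    """Hypothetical test data: points drawn uniformly from the unit hypercube."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(dim)] for _ in range(n_points)]


if __name__ == "__main__":
    # Same number of points, increasing dimensionality.
    for dim in (2, 20, 200, 1000):
        cloud = uniform_cloud(40, dim, seed=1)
        print(dim, ultrametric_fraction(cloud))
```

Under these assumptions the reported fraction should rise toward 1 as the dimension grows, because pairwise distances concentrate around a common value and triangles become close to equilateral. The tolerance controls how strict the test is, so absolute values are only meaningful when compared across dimensions at a fixed tolerance.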

Information about this Version

This is a Submitted version
This version's date is: 2009
This item is not peer reviewed

Link to this Version

https://repository.royalholloway.ac.uk/items/19df89d8-6daa-6265-29aa-2189238d8fa5/6/

Item Type: Journal Article
Title: The Remarkable Simplicity of Very High Dimensional Data: Application of Model-Based Clustering
Authors: Murtagh, Fionn
Uncontrolled Keywords: stat.ME, math.GM
Departments: Faculty of Science\Computer Science

Identifiers

DOI: http://dx.doi.org/10.1007/s00357-009-9037-9

Deposited by Research Information System (atira) on 03-Jul-2014 in Royal Holloway Research Online. Last modified on 03-Jul-2014

Notes

36 pages, 18 figures, 36 references