28/08/2018
Publications
Are your data gathered? The Folding Test of Unimodality
Alban Siffer, Pierre-Alain Fouque, Alexandre Termier, Christine Largouët
Abstract
Understanding data distributions is one of the most fundamental research topics in data analysis. The literature provides a great deal of powerful statistical learning algorithms to gain knowledge of the underlying distribution given multivariate observations. With them, we are likely to discover a dependence between features, the appearance of clusters or the presence of outliers. Before such deep investigations, we propose the folding test of unimodality. As a simple statistical description, it detects whether data are gathered or not (unimodal or multimodal). To the best of our knowledge, this is the first multivariate and purely statistical unimodality test. It makes no distributional assumption and relies only on a straightforward p-value. Through real-world data experiments, we show its relevance and how it could be useful for clustering.