Machine learning with ordinal data

The problem of learning from ordinal or comparison-based information has its roots in psychology, and has gained importance with the popularity of crowdsourcing. In a comparison-based framework, one is given a set of items but has no access to a Euclidean representation or to pairwise similarity or distance information. The only available information consists of binary responses to questions of the form: Is item A more similar to item B than to item C? Ordinal data is significant partly because humans are better at expressing preferences than at stating absolute values. Its usefulness goes beyond this psychological phenomenon, however: ordinal information such as k-nearest-neighbour relations is a versatile tool in data engineering. With the increasing scope of crowdsourcing, it has become important to extend classical machine learning algorithms for tasks such as classification and clustering to the ordinal setting. The statistical theory of learning from ordinal data is in its early stages, and the project provides several foundational results in this setting.
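To make the query model concrete, the following is a minimal sketch of a triplet comparison oracle and of a nearest-neighbour search that relies only on its answers. It is an illustration under stated assumptions, not the method of any publication listed below: the names `make_triplet_oracle` and `nearest_neighbor` are hypothetical, and the hidden Euclidean points exist only to simulate the responses that a human annotator would otherwise provide.

```python
def make_triplet_oracle(points):
    """Simulate answers to: is item a more similar to item b than to item c?

    Assumption: `points` is a hidden Euclidean embedding used only to
    generate responses. The learner never reads it directly.
    """
    def sq_dist(i, j):
        return sum((x - y) ** 2 for x, y in zip(points[i], points[j]))

    def oracle(a, b, c):
        # Binary response: True if a is closer to b than to c.
        return sq_dist(a, b) < sq_dist(a, c)

    return oracle


def nearest_neighbor(query, candidates, oracle):
    """Find the candidate most similar to `query` using only triplet queries.

    A simple linear scan that issues len(candidates) - 1 comparisons.
    """
    best = candidates[0]
    for item in candidates[1:]:
        if oracle(query, item, best):
            best = item
    return best


# Hypothetical toy data: four items, identified by their indices 0..3.
hidden_points = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (0.2, 0.1)]
oracle = make_triplet_oracle(hidden_points)

print(oracle(0, 3, 1))                         # is item 0 closer to 3 than to 1?
print(nearest_neighbor(0, [1, 2, 3], oracle))  # nearest item to item 0
```

The learner in this sketch only ever calls `oracle`, which mirrors the setting described above: no coordinates and no distances, only binary answers to triplet questions.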

Publications

  • M. Perrot, P. Esser, D. Ghoshdastidar. Near-optimal comparison based clustering. NeurIPS, 2020. [Paper] [Preprint] [Code]
  • D. Ghoshdastidar, M. Perrot, U. von Luxburg. Foundations of comparison-based hierarchical clustering. NeurIPS, 2019. [Preprint]
  • S. Haghiri, D. Ghoshdastidar, U. von Luxburg. Comparison based nearest neighbor search. AISTATS, 2017. [Paper] [Preprint]