Exploration of Data Summaries

Conference Publication

AUTHORS:

Sihem Amer-Yahia (CNRS, France)

Aurélien Personnaz (CNRS, France)

ADDITIONAL AUTHORS:

Brit Youngmann

PUBLISHED IN:

accepted in:

PVLDB 2022

CURRENT STATUS

Yet to be published

DATE:

April 20, 2022

Data summarization is the process of producing interpretable and representative subsets of an input dataset. It is usually performed as a one-shot process whose purpose is to find the best summary. A useful summary contains k individually uniform sets that are collectively diverse, so that together they are representative. Uniformity addresses interpretability, and diversity addresses representativity. Finding such a summary is difficult when the data is large and highly diverse. We examine the applicability of Exploratory Data Analysis (EDA) to data summarization and formalize the problem of guided exploration of data summaries, which seeks to sequentially produce connected summaries with the goal of maximizing their cumulative utility. Our problem generalizes one-shot summarization. We propose two approaches to solve it: (i) TOPSUM, which chooses the most useful summary at each step; and (ii) RLSUM, which trains a policy with Deep Reinforcement Learning that rewards an agent for finding a diverse and new collection of uniform sets at each step. We compare these approaches with one-shot summarization and top-performing EDA solutions in extensive experiments on three large datasets. Our results demonstrate the superiority of our approaches for summarizing very large data, and the need to provide guidance to domain experts.
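The idea of choosing the most useful summary at each step can be illustrated with a small sketch. This is not the paper's TOPSUM implementation: the function names (`uniformity`, `diversity`, `utility`, `greedy_explore`), the scoring formulas, the equal weighting of the three terms, and the toy rows are all assumptions made for illustration. Here a summary is a list of groups of rows, uniformity is the share of the most common attribute value within a group, diversity is the mean pairwise Jaccard distance between groups, and novelty is the share of rows not covered in earlier steps.

```python
from collections import Counter
from itertools import combinations


def uniformity(group, attrs):
    """Mean share of the most common value per attribute (1.0 means all rows agree).

    Assumes a non-empty group and a non-empty attribute list.
    """
    shares = [
        Counter(row[a] for row in group).most_common(1)[0][1] / len(group)
        for a in attrs
    ]
    return sum(shares) / len(shares)


def diversity(summary):
    """Mean pairwise Jaccard distance between the row-id sets of the groups."""
    id_sets = [frozenset(row["id"] for row in group) for group in summary]
    pairs = list(combinations(id_sets, 2))
    if not pairs:
        return 0.0
    return sum(1 - len(a & b) / len(a | b) for a, b in pairs) / len(pairs)


def covered_ids(summary):
    """Ids of all rows that appear in any group of the summary."""
    return {row["id"] for group in summary for row in group}


def utility(summary, attrs, seen_ids):
    """Assumed score: mean uniformity + diversity + share of rows not seen before."""
    ids = covered_ids(summary)
    mean_uniformity = sum(uniformity(g, attrs) for g in summary) / len(summary)
    novelty = len(ids - seen_ids) / len(ids)
    return mean_uniformity + diversity(summary) + novelty


def greedy_explore(candidates_per_step, attrs):
    """At each step, keep the candidate summary with the highest utility."""
    seen_ids, path = set(), []
    for candidates in candidates_per_step:
        best = max(candidates, key=lambda s: utility(s, attrs, seen_ids))
        path.append(best)
        seen_ids |= covered_ids(best)
    return path


# Toy data (hypothetical): six rows described by two categorical attributes.
rows = [
    {"id": 0, "color": "red", "size": "S"},
    {"id": 1, "color": "red", "size": "S"},
    {"id": 2, "color": "blue", "size": "L"},
    {"id": 3, "color": "blue", "size": "L"},
    {"id": 4, "color": "red", "size": "L"},
    {"id": 5, "color": "green", "size": "M"},
]
attrs = ["color", "size"]
a = [[rows[0], rows[1]], [rows[2], rows[3]]]  # two uniform, disjoint groups
b = [[rows[0], rows[2]], [rows[1], rows[3]]]  # two mixed groups
c = [[rows[4]], [rows[5]]]                    # rows not covered by a or b

# Step 1 offers b and a; step 2 offers a again and c.
path = greedy_explore([[b, a], [a, c]], attrs)
```

In this toy run the first step prefers `a` over `b` because its groups are more uniform, and the second step prefers `c` over repeating `a` because the novelty term rewards rows that have not been shown yet. A learned policy such as the one RLSUM trains would replace the per-step `max` with a decision that can account for later steps.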

