Big Data Systems

The focus of this work is information infrastructures and data management systems, with an emphasis on big data systems, data stream systems, interactive data exploration, uncertain data management, high-performance genomic data processing, RFID/sensor data management, flash memory databases, publish/subscribe, and XML message brokering.

Provenance, Causality, and Reverse Data Management

Data is critical in almost every aspect of society, including education, technology, healthcare, the economy, and science. Poor understanding and handling of data, poor data quality, and errors in data-driven processes are detrimental in all domains that rely on data. The goal of this research is to address these challenges: to develop tools that improve our understanding of data and facilitate the diagnosis of errors, and to extend the capabilities of modern database systems to support complex decisions and strategy-planning queries.

Private Dissemination and Analysis of Data

The goal of this work is to understand how accurately aggregate properties of a data set can be studied while preserving the privacy of individual participants. Our recent work focuses on complex graph-structured data and trace data. Please see the following project pages for details, publications, and code releases:

Privacy, Provenance, and Data Retention

The goal of this work is to achieve the benefits of preserving history, namely accountability through the ability to audit the past, while avoiding the threats to privacy posed by preserved data. Our work has included investigations of database forensics and models for the protection of audit histories. Please see the following project page for details and publications: