
Private Dissemination of Social Network Data

Many phenomena are naturally modeled as networks, in which entities (represented by nodes) participate in binary relationships (represented by edges). In a social network, nodes are individuals and edges are personal contacts or relationships. In a communication network, nodes are individuals and edges are flows of information such as phone calls or email messages. In a technological network, nodes are machines, such as computers or power stations, which are connected by a means of transmission.
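As a minimal illustration of the node-and-edge model described above, a small social network can be stored as a mapping from each individual to the set of their contacts. The names, the contact list, and the `build_network` helper are hypothetical and appear only for illustration; they are not taken from the project's data or code.

```python
from collections import defaultdict


def build_network(edges):
    """Build an undirected network as a mapping from node to its set of neighbors."""
    adjacency = defaultdict(set)
    for u, v in edges:
        # Each edge is a binary relationship, recorded in both directions.
        adjacency[u].add(v)
        adjacency[v].add(u)
    return dict(adjacency)


# Hypothetical personal contacts among four individuals.
contacts = [
    ("alice", "bob"),
    ("bob", "carol"),
    ("carol", "alice"),
    ("carol", "dave"),
]

network = build_network(contacts)

# The degree of a node is the number of relationships it participates in.
degrees = {node: len(neighbors) for node, neighbors in network.items()}
```

Even this toy structure shows why such data is sensitive: the set of neighbors of a node directly lists that individual's relationships.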

By analyzing data sets representing networks, researchers have studied topics as diverse as the robustness of the Internet, the spread of HIV, the causes of financial fraud, and the processes of cellular metabolism. However, a looming barrier threatens to disrupt the growing flood of research on networks. Many of the key systems under study record personal information about participants. The pattern and content of communications, social relationships, organizational affiliations, and online behavior are often laid bare within these data sets, and many participants object to wide disclosure of this information.

The goal of this project is to understand the extent to which accurate analysis of networks can be performed while safeguarding the privacy of participants. The publications below explore threats to privacy in networked data, anonymizing transformations of network data, and output perturbation techniques for accurate network analysis.
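To make the idea of output perturbation concrete, the sketch below answers a single query (the number of edges in a network) with random noise added to the true answer. It assumes the standard Laplace mechanism from the differential privacy literature, in which the protected unit is a single edge. This page does not state which mechanisms the project's publications use, so the function names, the `epsilon` parameter, and the choice of query are illustrative assumptions rather than the project's method.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace distribution with mean 0 and the given scale."""
    # The difference of two independent exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_edge_count(edges, epsilon, rng=None):
    """Return the number of distinct undirected edges plus Laplace noise.

    Assumption: privacy is defined at the level of a single edge, so adding or
    removing one edge changes the true count by at most 1 (sensitivity 1).
    """
    rng = rng or random.Random()
    true_count = len({frozenset(edge) for edge in edges})
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical example: four contacts, queried with a fixed seed so the run is repeatable.
contacts = [("alice", "bob"), ("bob", "carol"), ("carol", "alice"), ("carol", "dave")]
rng = random.Random(0)
samples = [noisy_edge_count(contacts, epsilon=1.0, rng=rng) for _ in range(20000)]
mean_answer = sum(samples) / len(samples)
```

A smaller `epsilon` means more noise and stronger protection; a larger one gives more accurate answers. Individual answers vary, but their average stays close to the true count, which is the sense in which perturbed output can still support accurate analysis.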

Publications

Project Members

  • Michael Hay (graduate student)
  • Chao Li (graduate student)
  • Gerome Miklau (faculty)
  • David Jensen (faculty)
  • Don Towsley (faculty)

This project is funded in part by NSF Cybertrust grant CNS-0627642, "Preserving utility while ensuring privacy for linked data," 2006-2009.