Courses
- CMPSCI 445: Information Management
Fall 2006: This course provides an introduction to the design and use of database systems, as well as the key issues in building such systems. The World Wide Web is the largest distributed information management system, so in addition to database topics, this course also introduces key technologies for managing and exchanging data on the Web. In presenting the fundamental principles of databases we cover the relational model, conceptual design, and query languages. We also cover core database implementation issues including storage and indexing, query processing and optimization, and transaction management and recovery. In presenting modern Internet-based data management we will cover XML data, web application development, and selected topics in information retrieval and information integration. The course will also emphasize the secure management of data in both conventional databases and the World Wide Web.
- CMPSCI 591Q: Database Systems Lab
Fall 2007: This is a self-paced laboratory course exploring advanced topics in data management. Students will devise efficient solutions to real-world data management problems on realistic data sets. Topics include performance tuning of relational databases, querying graph-structured databases, querying streaming databases, indexing high-dimensional data, managing uncertainty in databases, information retrieval in databases, and implementation of database internals. Students will use an open-source database management system and a programming language API (like JDBC/ODBC).
- CMPSCI 645: Database Design and Implementation
Spring 06, Spring 07: This course covers the design and implementation of traditional relational database systems and advanced data management systems. The course will treat fundamental principles of databases: the relational model, conceptual design, query languages, and selected theoretical topics. We also cover core database implementation issues including storage and indexing, query processing and optimization, as well as transaction management, concurrency, and recovery. Additional topics will address the challenges of modern Internet-based data management. These include XML data management, stream-based systems, information integration, and database security. Prerequisites: an undergraduate-level course on operating systems or databases.
- CMPSCI 691LL: Networked Information Systems
Fall 06: This graduate seminar covers advanced information systems and data management issues in emerging network-connected environments. The first part of the course addresses the design and implementation of advanced information systems including geospatial databases, data warehouses, parallel databases, distributed databases, sequence databases, and XML databases. The second part of the course explores recent research topics in networked data management including stream systems, publish/subscribe systems, sensor networks, RFID networks, and data uncertainty and lineage.
- CMPSCI 691R: Reliability in Information Integration
Spring 07: Databases and text collections are increasingly formed by integrating data that is gathered from diverse sources or authored by diverse individuals. Examples include structured databases extracted and integrated from the Web, collaboratively authored blogs and wikis, and scientific databases that are shared, freely annotated, and republished. These applications are characterized by open authorship: it is usually impossible to restrict who can contribute to the underlying data, or to directly constrain contributors' actions. Human error, malice, or simple disagreement is often unavoidable. In addition, data integration and data extraction procedures are imperfect. Databases and text collections must therefore cope with incorrect, uncertain, or inconsistent data. This seminar will survey a range of techniques for preventing, detecting, or in some cases merely tolerating unreliable data.
- CMPSCI 745: Advanced Topics in Database Systems
Fall 2008: This graduate course covers advanced data and information management systems. The first part of the course addresses the design and implementation of advanced database systems including data warehouses, data mining, column-based databases, parallel databases, and distributed databases. The second part of the course explores advanced research topics including data streams, sensor data management, data provenance and lineage, and probabilistic databases.
This is a three-credit graduate database course. The prerequisite is a graduate course on the principles and implementation of traditional database systems, equivalent to CMPSCI 645. Students with other backgrounds should contact the instructor for approval to enroll.