CS564 Database Management Systems

204 Educational Sciences on WedFri 2:30-3:45pm

Chat with us on the course Piazza site if you have any questions!

Description

This course covers database management systems, primarily relational systems (DBMSs). The first part of the course will focus on the use of a DBMS. We will introduce the concept of a data model, the entity-relationship (ER) model, the relational model, and learn how to use the SQL query language. We will also cover logical and physical database design issues. The other half of the course will concentrate on DBMS implementation. We will cover file organization, various indexing methods, techniques for external sorting. We will also learn about how a DBMS implements relational operators, and the basics of query optimization and transaction processing.

There will be two problem sets and a programming project (split into four parts). These assignments will explore database design and implementation as well as data management in web applications by utilizing appropriate features of SQL.

Class Logistics

Prerequisites

  • CS 367 is absolutely essential. CS 537 might be helpful.

For certain assignments and activities we will utilize Jupyter notebooks. Installation instructions are available here. The in-class activities, homework and projects will require the use of either Python or C++:

  • The programming language C++ will be used for the DBMS internals project. To brush up your C++ skills, you can go through the lecture material for CS 368: C++ for Java Programmers, or the material from a more recent class found here. Check here and here for more C++ tutorials. Keep the links handy throught the semester as you may need to go to it for C++ help.
  • All other assignments will require the use of Python.


Assignments

  • Problem sets (PSs) are individual assignments. Each student must send us an individual submission.
  • For the class project you will form groups of three and work on a single submission.
  • The honor code described below will be enforced for both types of assignments.

Lecture Plan

The reading material listed below is optional, but it refers to the two textbooks listed below. The lecture plan below may change.


# Date Topic Lecture Materials Extra Reading Material Assignments
Introduction and Querying
1 9/6 Course Logistics and Database History Lecture 1 (pdf)

Activities:
Notebook data: dataset_1.db
2 9/8 SQL: Introduction Lecture 2 (pdf)
Lecture 2 Notebook
References: Cow book Ch 5.1 - 5.2

Activities:
Notebook data: dataset_1.db
Greenspun - SQL for web nerds
3 9/13 SQL: Advanced I Lecture 3 (pdf)
References: Cow book Ch 5.3 - 5.4

Activities:
PS #1:
4 9/15 SQL: Advanced II Lecture 4 (pdf)
References: Cow book Ch 5.5 - 5.6

Activities:
Notebook data: dataset_1.db
Database Design and Normal Forms
5 9/20 Database Design: ER Diagrams Lecture 5 (pdf)
References: Cow book Ch 2

Activity Solutions (pdf)
PS #1 due

Project Part 1
6 9/22 Database Design: Theory 1 Lecture 6 (pdf)
References: Cow book Ch 19.1 - 19.3

Activities:
Closure Algorithms: closure.py (Download and keep in same directory as the activity)
7 9/27 Database Design: Theory 2 Lecture 7 (pdf)
References: Cow book Ch 19.4.1, Ch 19.5, Ch 19.6.1

Activities:
Closure Algorithms: closure.py (Download and keep in same directory as the activity)
8 9/29 Database Design: Theory 3 Lecture 8 (pdf)
References: Cow book Ch 19.4.2, Ch 19.6.2, Ch 19.8.1

Activities:
MVDs Notebook: MVDs.ipynb
PS #2:
Database Internals: Introduction
9 10/4 Data Storage and IO Models Lecture 9 (pdf)
References: Cow book Ch 9.1, Ch 9.3, Ch 9.4

Project Part 1 due
10 10/6 Buffer Manager and File Organization Lecture 10 (pdf)
References: Cow book Ch 9.4, 9.5 - 9.7

PS #2 due

Project Part 2:
11 10/11 External Sorting Lecture 11 (pdf)
References: Cow book Ch 13.1 - 13.3

12 10/13 Midterm Review Midterm Review (pdf)
13 10/18 Midterm
Database Internals: Indexing and Hashing
14 10/20 Indexing Lecture 12 (pdf)
References: Cow book Ch 8

15 10/25 The B+ Tree Index Lecture 13 (pdf)
References: Cow book Ch 10.3 - 10.7

Activities:
Notebook data: complaint.db
16 10/27 Hash Indices Lecture 14 (pdf)
References: Cow book Ch 11.1 - 11.2
Project Part 2 due

Project Part 3:
17 11/1 Bitmap Indices Lecture 15 (pdf)
Database Internals: Query Processing
18 11/3 Access Methods and Operators Lecture 16 (pdf)
References: Cow book Ch 12.1-12.3, 14
19 11/8 Joins I Lecture 17 (pdf)
References: Cow book Ch 12.1-12.3, 14
19 11/10 Joins II Lecture 17 (cont'd) (pdf)
References: Cow book Ch 12.1-12.3, 14
20 11/15 Relational Algebra Lecture 18 (pdf)
References: Cow book Ch 4.1,4.2

Activities:
Notebook files:
22 11/17 Query Optimization Lecture 19 (pdf)
References: Cow book Ch 15.1 - 15.4

Activities:
Notebook files:
23 11/22 Thanksgiving Break Project Part 3 due
Transactions
24 11/29 Transactions from a User's Perspective Lecture 20 (pdf)
References: Cow book Ch 16.1, 16.2, 16.6 (for Project)

Activities:
Notebook files:
Project Part 4
25 12/1 Mechanisms for Transactions: Logging and Locking Lecture 21 (pdf)
References: Cow book Ch 17
26 12/6 Guest Lecture: Stratis Viglas, Google
27 12/8 Research Talk: Machine Learning meets Data Management Lecture 23
28 12/13 Final Review Final Review (pdf) Project Part 4 due

Change log:
Note: you may have to clear your browser's cache to get newest files!
  • 9/4/17 - Lecture 1 Updated
  • 9/7/17 - Lecture 2 Updated
  • 9/12/17 - Lecture 2 and Activities 2 Updated
  • 9/13/17 - Lecture 3 Updated; Problem Set 1
  • 9/15/17 - Lecture 4 Updated
  • 9/20/17 - Lecture 5 Updated; Project Part 1 Released
  • 9/22/17 - Lecture 6 Updated
  • 9/27/17 - Lecture 7 Updated
  • 9/29/17 - Lecture 8 Updated; Problem Set 2
  • 10/4/17 - Lecture 9 Updated
  • 10/6/17 - Lecture 10 Updated; Project Part 2 Released
  • 10/11/17 - Lecture 11 Updated
  • 10/13/17 - Midterm review Updated
  • 10/20/17 - Lecture 12 Updated
  • 10/25/17 - Lecture 13 Updated
  • 10/27/17 - Lecture 14 Updated
  • 11/01/17 - Lecture 15 Updated
  • 11/03/17 - Lecture 16 Updated
  • 11/08/17 - Lecture 17 Updated
  • 11/15/17 - Lecture 18 Updated
  • 11/17/17 - Lecture 19 Updated
  • 11/29/17 - Lecture 20 Updated

Midterm Exam
The midterm exam will be in class on October 18th from 2:30pm - 3:45pm. The location will be:
  • 204 Educational Sciences
Final Exam
The final exam will be on December 20th from 2:45pm - 4:45pm.
Grading
Problem Sets15%
Programming Project30%
Midterm20%
Final35%
Textbooks

Recommended textbook:

  • Database management systems (3rd edition), by Raghu Ramakrishnan and Johannes Gehrke (also called the “cow book”).

Additional book that can be used:

  • Database Systems: The Complete Book (2nd edition), by Hector Garcia-Molina, Jennifer Widom, and Jeffrey Ullman.

Office Hours

Theo: Wed 11:00 pm -12:00 pm (after class), Fri 11:00 - 12:00, or by appointment @ Room CS4361

Minzhen: Thu 4:00-5:00 pm @ Room CS6372

Ting: Mon 3:00-4:00 pm @ Room CS4291

Note: the schedule of office hours may change from time to time, in which case an announcement will be made on the course Piazza.

Staff
Theo
Minzhen Yi (minzhenyi@cs.wisc.edu)
Ting Wang (xwang973@wisc.edu)
Late Policy
There will be no late dates for both the Problem Sets and Programming Projects. We will only grant extensions in the case of a severe medical or family emergency.
Honor Code and Collaboration Policy

We encourage you to discuss the Programming Projects and Problem Sets with other students; it's fine to discuss overall strategy and collaborate with a partner or in a small group, as both giving and receiving advice will help you to learn.

However, for the Problem Sets you must write your own solutions to all of the problems, and you must cite all people you worked with.

For the Programming Projects, each group must write its own code: it's not OK to share code or write code collaboratively across groups. (This includes posting and/or sharing your code publicly, such as on GitHub!)

If you do not do so, we will consider this a violation of the University of Wisconsin Honor Code.

If you consult any resources outside of the materials provided in class, you must cite these sources. We reserve the right to assign a penalty if your answers are substantially derivative, but, as long as you provide appropriate citations, we will not consider this an Honor Code violation.

Credit
The template of this website was created by HazyReseach@Stanford. A significant portion of the class is based on material from HazyResearch and previous offerings of CS564.