EDAF75

Lab 2: Database design

In this lab you will:

develop an E/R model for a movie theater chain,
identify relations in your E/R model,
translate the E/R model to a relational database,
write some SQL code to test your database, and
write the answers to some questions in a markdown file.

Background

A movie theater chain operates several theaters, each with a unique name and a given capacity. The company screens a number of movies at their theaters, each movie can be screened in many theaters, and each theater can have several screenings (performances) each day. Each performance has a start time, and only one movie is shown in each performance.

People who want to buy a ticket for a performance must sign up as customers – each customer can pick their own user name, which must be unique, they also have a full name (which needn't be unique), and a password.

Once the customer is registered, she can book tickets. A ticket is valid for a specific performance (i.e., one movie screened in a given theater at a given time), and seating is free, so there are no seat numbers.

A movie has a title, a production year, an IMDB-key (like tt0280707), and a running time (in minutes). Movie titles are not necessarily unique, but we'll assume that each title is used at most once every calendar year (the title "The awakening" has been used more than 30 times over the years, and "Love" almost doubles that – but we'll stick with our assumption of uniqueness for each year).

Some important points:

Each ticket should have an id (which is printed on it), and it should not be consecutive integers, since that would reveal sensitive information to the chain's competitors (it would give them a pretty good idea of how many tickets the chain is selling). Instead we want something like a randomblob(16) (or uuid).
Customer passwords should never be saved in plaintext, they must be encrypted, using argon2 or something like it. This doesn't have to effect your database design, and for this lab you can save whatever you want in the password field – we'll return to this issue in lab 3.

Assignment

We want you to work through the steps below, and deliver the following (which we'll discuss during the lab session):

A beautiful, shiny UML diagram which shows your E/R-model – it should be drawn in standard UML notation, and it should be drawn with some program (if you can't find anything fancier, you can always use UMLet as your fallback – that's what I use).
One SQL script, lab2.sql, which contains SQL code for creating your database, and adding some sample data.
One markdown file, lab2-answers.md, with answers to some questions in the text below.
Some SQL statements in a notebook (lab2-nb.ipynb – see below for instructions).

Develop an E/R model for the database described above, start by finding suitable entity sets. Hint: We will not need any entity set for the company itself.
Find relationships between the entity sets. Indicate the multiplicities of all relationships.
Find attributes of the entity sets and (possibly) of the relationships.
Identify keys, both primary keys and foreign keys. Answer these questions (put your answer in lab2-answers.md):
1. Which relations have natural keys?
2. Is there a risk that any of the natural keys will ever change?
3. Are there any weak entity sets?
4. In which relations do you want to use an invented key. Why?
Draw a UML diagram of your E/R model, using some program – it should be as clear and tidy as possible, and it must follow the standard for UML class diagrams. Make sure to have the file with your model easily available for the lab session.
Convert the E/R model to a relational model, use the method described during lecture 4.

Describe your model in your lab2-answers.md file – use the following conventions:
- underscores for primary keys
- slashes for foreign keys
- underscores and slashes for attributes which are both (parts of) primary keys and foreign keys
For the college application example we had during lecture 2 we would end up with:
```
students(_s_id_, s_name, gpa, size_hs)
colleges(_c_name_, state, enrollment)
applications(/_s_id_/, /_c_name_/, _major_, decision)
```
There are at least two ways of keeping track of the number of seats available for each performance – describe them both, with their upsides and downsides (write your answer in lab2-answers.md).

For your own database, you can choose either method, but you should definitely be aware of both.
Write SQL statements for the following tasks, put them in lab2.sql.
1. Create the tables, make sure to define primary keys and foreign keys.
  
  To get a unique and truly random key in SQLite3, we can use the following technique (obviously, you'll have to use other names for your table and attribute):
```
CREATE TABLE students (
  s_id         TEXT DEFAULT (lower(hex(randomblob(16)))),
  ...
  PRIMARY KEY  (s_id);
);
```
2. Insert some data into the tables. Use real-world movie names and theater names, and invent users. We want some rows in all tables except for the one holding tickets (it can be empty, but feel free to add tickets if you like). Use the data type DATE for dates and TIME for times. Dates are entered and displayed on the form '2022–02-01', and times can be entered like '19:30'.

Let SQLite3 execute the statements in your script by running the command:

sqlite3 movies.sqlite < lab2.sql

in a Command Line Interpreter (CLI).

It will create a database in the file movies.sqlite, and you can import it into a Jupyter notebook as in the attached notebook (lab2-nb.ipynb).

Your script should have the following structure:

-- Delete the tables if they exist.
-- Disable foreign key checks, so the tables can
-- be dropped in arbitrary order.
PRAGMA foreign_keys=OFF;

DROP TABLE IF EXISTS ...;
DROP TABLE IF EXISTS ...;
...

PRAGMA foreign_keys=ON;

-- Create the tables.
CREATE TABLE ... (
  ...
);
...

-- Insert data into the tables.
INSERT
INTO    ... (...)
VALUES  (...);
...

Download and save the notebook lab2-nb.ipynb – you should save it as a text file with the extension .ipynb. Then run it using jupyter lab, and solve the problems described in the notebook. If you can't run the notebook, you can have a look at the instructions as a html-page instead (and run your SQL code in the Sqlite3 command line tool, or some other way).