Lecture 4 – notes
Picking up where we left last time…
We first finished the model from Monday, and ended up with:
Translating our model into a database
In the slides I had some 'recipes' for translating an ER model into tables in a relational database, we started out by creating the 'obvious' parts of some of the 'obvious' tables (i.e., each entity set is translated into a corresponding table, and each attribut is translated into a column in the table).
During the lecture I only wrote a few tables, to illustrate some points, here I've first written the obvious parts of all entity sets – I've also defined a primary key where we can (there's no good key in the loans
table at the moment):
-- This is only the first iteration, more will be added below CREATE TABLE authors ( author_name TEXT, nationality TEXT, PRIMARY KEY (author_name) ); CREATE TABLE books ( isbn TEXT, year INT, title TEXT, PRIMARY KEY (isbn) ); CREATE TABLE copies ( copy_ean TEXT, shelf TEXT, condition INT, PRIMARY KEY (copy_ean) ); CREATE TABLE libraries ( library_name TEXT, muncipality TEXT, PRIMARY KEY (library_name) ); CREATE TABLE librarian ( librarian_id TEXT, librarian_name TEXT, PRIMARY KEY (librarian_id) ); CREATE TABLE loans ( out_date DATE, last_in_date DATE, return_date DATE ); CREATE TABLE borrower ( card_id TEXT, ssn TEXT, name TEXT, email TEXT, hashed_pwd TEXT, PRIMARY KEY (card_id) );
We then stopped and looked at how to translate different kinds of associations into SQL, and we started by trying to figure out how to implement the *-1
from Copy
to Book
.
You came up with the idea of letting the copies
table have an attribute which keeps track of the proper row in the books
table – in the slides this kind of attribute is called a foreign key:
CREATE TABLE copies ( copy_ean TEXT, shelf TEXT, condition INT, isbn TEXT, PRIMARY KEY (copy_ean), FOREIGN KEY (isbn) REFERENCES books(isbn) );
Now we can use a JOIN
if we want to get information about the book a copy refers to – let's say we are given an ean code, and want the corresponding title and year:
SELECT title, year FROM copies JOIN books USING (isbn) WHERE copy_ean = '...';
In the same vein, we could add foreign keys to the tables copies
(we also need to link them to libraries
), loans
, and borrowers
:
-- Now with added foreign keys, but we still need more... CREATE TABLE copies ( copy_ean TEXT, shelf TEXT, condition INT, isbn TEXT, PRIMARY KEY (copy_ean), FOREIGN KEY (isbn) REFERENCES books(isbn) ); CREATE TABLE librarians ( librarian_id TEXT, librarian_name TEXT, library_name TEXT, PRIMARY KEY (librarian_id), FOREIGN KEY (library_name) REFERENCES libraries(library_name) ); CREATE TABLE loans ( copy_ean TEXT, card_id TEXT, out_date DATE, last_in_date DATE, return_date DATE, PRIMARY KEY (copy_ean, card_id, out_date), FOREIGN KEY (copy_ean) REFERENCES copies(copy_ean), FOREIGN KEY (card_id) REFERENCES borrowers(card_id) ); CREATE TABLE borrowers ( card_id TEXT, ssn TEXT, name TEXT, email TEXT, hashed_pwd TEXT, library_name TEXT, PRIMARY KEY (card_id), FOREIGN KEY (library_name) REFERENCES libraries(library_name) );
Since a borrower could potentially have library cards at many libraries, the primary key in the borrowers
table can't be just ssn
. We can use the tuple (ssn, library_name)
, but easier is to just use the card_id
(which is what I did above).
In the loans
table, our foreign keys to copies
and borrowers
help us getting a key (we had none above), but since a borrower can borrow the same copy many times, we also need the date of the loan to make it unique (assuming no one in their right mind borrows the same copy several times in a day… – in that case we could instead use an invented key).
To simplify things here, I haven't implemented the foreign keys to Librarian
in the ER-diagram above (we could have one out_librarian_id
for the librarian handling the borrowing of the copy, and one in_librarian_id
for the librarian handling the returning of the book).
Exercise 1: Write a query which shows the title and year of all books the borrower with the ssn '20000101-1234'
has borrowed at 'Lunds stadsbibliotek'
(there is an answer at the end of this page, but you should definitely try to write the query yourself before looking at the answer!).
We then talked about how to implement a *-*
association, and I hope it was clear to you why we couldn't just let each table have a foreign key to the other (we can only do that if we're referencing just one element in the other table, here we're potentially referencing many).
So we came up with join tables (see the slides), and we connected the authors
table and books
table with:
CREATE TABLE authorships ( author_name TEXT, isbn TEXT, PRIMARY KEY (author_name, isbn), FOREIGN KEY (author_name) REFERENCES authors(name), FOREIGN KEY (isbn) REFERENCES books(isbn) );
We also talked a bit about how to implement guardians for children – one way is to have the guardian's card-id as a foreign key:
CREATE TABLE borrowers ( card_id TEXT, ssn TEXT, name TEXT, email TEXT, hashed_pwd TEXT, library_name TEXT, guardian_card_id TEXT, PRIMARY KEY (card_id), FOREIGN KEY (library_name) REFERENCES libraries(library_name), FOREIGN KEY (guardian_card_id) REFERENCES borrowers(card_id) );
This works fine, and would mean that everyone who doesn't have a guardian will have their guardian_card_id
as NULL
(a foreign key is allowed to be NULL
, it's only when it isn't NULL
that the database checks that it refers to a valid row in the foreign table).
To get the name of a guardian with this implementation is a bit tricky – we need to join the table borrowers
with itself.
This is called a self join, and it will introduce potential naming conflicts (see the notebook for lecture one, specifically the section about correlated subqueries).
To list the names of all children and their guardians, we can use the following query:
SELECT borrowers.name, guardians.name FROM borrowers JOIN borrowers AS guardians ON borrowers.guardian_card_id = guardians.card_id ORDER BY borrowers.name;
We want to join each row in the borrowers
table where guardian_card_id
isn't NULL
with the corresponding row where card_id
is the guardian_card_id
of the original row.
To do this, we need to 'rename' the table of the guardian borrowers, in the code above I call it 'guardians
', and then the query works.
But since the guardians
table (which is the same as the borrowers
table) have the column name
, I must be specific about which name I'm referring to.
We can introduce the alias children
for the 'outer' table (the table in which we look for rows with references to guardians):
SELECT children.name, children.name FROM borrowers AS children JOIN borrowers AS guardians ON children.guardian_card_id = guardians.card_id ORDER BY children.name;
but I'm not sure this makes the query easier to understand (but YMMV).
An alternative would be to introduce a separate table with guardians (it would make sense if we have only few children in our database, and don't want all the NULL
guardians hanging around):
CREATE TABLE guardians ( card_id TEXT, guardian_card_id TEXT, PRIMARY KEY (card_id), FOREIGN KEY (card_id) REFERENCES borrowers(card_id), FOREIGN KEY (guardian_card_id) REFERENCES borrowers(card_id) );
The 'rules' above will take us quite far when we translate an ER model into a relational database:
- an entity set will become a table
- a
*-1
('many-to-one') association will become a foreign key on the*
side of the association - a
*-*
('many-to-many) association will become a new (join) table
For those *-*
associations with an association class, the attributes of the association class will become attributes in the join table.
Exercise 2: Write a query to find the names and nationalities of the three most borrowed authors (answer below).
We then looked at some kinds of association which requires careful thought (see the slides), and three methods to implement inheritance (see the slides).
We concluded by looking at how to insert, update and delete data in our database – see the links above, and the slides.
If you have any questions about what we did, please go to Moodle and sign up for the QA session next week.
Solutions to exercises
Exercise 1
We need to join several tables: we start at the loans, then we need to:
- add information about the copy (to get the isbn code), so we
JOIN copies
, using theean_code
, which is in bothloans
andcopies
- add information about the book (to get its title and year), so we
JOIN books
, using theisbn
code, which is incopies
andbooks
- add information about the borrowers (to get the ssn and library), so we
JOIN borrowers
using the card-id:
SELECT title, year FROM loans JOIN copies USING (copy_ean) JOIN books USING (isbn) JOIN borrowers USING (card_id) WHERE ssn = '20000101-1234' AND library_name = 'Lunds stadsbibliotek';
Exercise 2
Now we need to join even more tables, and then to group on the author names (which, amazingly!, were unique):
SELECT author_name, nationality, count() FROM loans JOIN copies USING (copy_ean) JOIN books USING (isbn) JOIN authorships USING (isbn) JOIN authors USING (author_name) GROUP BY author_name ORDER BY count() DESC LIMIT 3;