
Midterm 15w2

This document is a midterm exam for a database systems course. It provides instructions that examinees are only permitted to use a basic calculator and are not allowed to use notes, books, or cell phones. The exam consists of 10 pages including this cover sheet and contains 9 multiple choice and short answer questions worth a total of 43 marks. Examinees are instructed to show their work, write down any assumptions, and to attempt the easier questions first.


The University of British Columbia

Computer Science 404: Database Systems


Midterm #2, February 11, 2015. Instructor: E.M. Knorr

Time: 48 minutes. Only a simple, non-programmable, non-communicating calculator is permitted.


Closed book. No notes. No cell phones.

Name Student No
(PRINT) (Last) (First)

Signature

Marks

Question   Max.   Achieved
1          4
2          9
3          5
4          4
5          4
6          4
7          4
8          5
9          4
Total      43

The examination has 8 pages + 2 appendix pages, and that includes this cover sheet. Check that you have a complete paper consisting of 10 pages.

Print your name and ID at the top of this page, and provide your signature. Have your student ID ready.

You do not need to print your name on any pages other than the cover page because we use serial numbers.

A simple, non-programmable, non-communicating calculator is permitted. No other aids are allowed.

Work quickly and do the easy questions first. Part marks are available.

The marks for each question are given in braces. Do not spend too much time on any one question.

To minimize disruptions during the exam, please avoid asking the invigilators for help, tips, or explanations.

Write down any reasonable assumptions that you are making, if you believe that a question is ambiguous.
Question 1 is a multiple-choice question worth 4 marks. You’ll get 3 if you only get 1 wrong, 2 if you
get 2 wrong, and 0 otherwise. Circle all TRUE answers. There may be as few as 0 TRUE answers to
the question, and as many as 5 correct answers—so be sure to read all parts of the question!

In case of ambiguity, please write down any reasonable assumptions that you need to make.

1. {4 marks} Which of the following statements about Alt. 2 extendible hash indexes are generally true?
Assume that we are dealing with indexes on unique keys (i.e., no duplicates). A “bucket” is a “page”.
a) A bad hash function will still distribute keys pretty evenly across the entire hash structure.
b) It is often faster to find a unique key in a hash index than in a B+ tree index.
c) Usually, hash indexes can support range queries efficiently.
d) The local depth nub in an extendible hashing index bucket tells you how many <key, rid>
pairs are in the bucket.
e) Hash buckets need to be at least half full.

2. {9 marks} Consider the following schema for a mining table created by an investment firm. Assume
that we use 4K pages (4096 bytes), that a tuple cannot span pages, and that we try to fill pages to
capacity. All rid and child pointers are 10 bytes long, and as usual, we’ll ignore housekeeping bytes.
Don’t worry about whether there is an odd or an even number of data entries in your answer.

Details: Of the many thousands of mines around the world, suppose an investment firm tracks 200
mines and has 10 full years’ worth of data for each of those mines (assume 365.25 days/year). Numeric
fields take up 4 bytes, dates are stored in integer format (e.g., 20150211 means February 11, 2015), and
strings are 15 bytes each. Each data record contains 110 bytes. The table and its fields are:

MineDailyProduction (mineID, mineDate, KilogramsOfOre, TypeOfMetal, ...)

Suppose there is an Alternative 2 B+ tree clustering index on the MineDailyProduction table’s
mineDate and mineID fields taken together (i.e., it is a composite key containing 2 fields).

a) {5 marks} Compute the number of leaf pages (only) in this index. Show your work.

b) {2 marks} How many complete records will fit on one data page if we plan to leave approximately
30% free space on each data page of the table? Show your work. Make reasonable assumptions.

c) {2 marks} Note that the leading field of the clustering index’s key is the date. Why is this firm
planning to leave 30% free space on each data page? Be specific.
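
The arithmetic for parts (a) and (b) can be checked with a minimal Python sketch like the one below. It is not an official solution. It assumes that mineID is a 4-byte numeric field (the question does not state its type), so the composite key is 8 bytes and each Alt. 2 data entry is the key plus a 10-byte rid. All variable names are illustrative.

```python
PAGE_BYTES = 4096
RID_BYTES = 10
DATE_BYTES = 4       # mineDate is stored in integer format
MINE_ID_BYTES = 4    # assumption: mineID is a numeric field
RECORD_BYTES = 110

# 200 mines, 10 years each, 365.25 days per year, one record per mine per day
num_records = int(200 * 10 * 365.25)

# Part (a): Alt. 2 leaf pages hold <key, rid> data entries, filled to capacity
entry_bytes = DATE_BYTES + MINE_ID_BYTES + RID_BYTES
entries_per_leaf = PAGE_BYTES // entry_bytes
leaf_pages = -(-num_records // entries_per_leaf)  # ceiling division

# Part (b): leave about 30% of each data page free, whole records only
usable_bytes = PAGE_BYTES * 70 // 100
records_per_data_page = usable_bytes // RECORD_BYTES

print(num_records, entry_bytes, entries_per_leaf, leaf_pages)
print(usable_bytes, records_per_data_page)
```

If mineID were instead stored as a 15-byte string, only MINE_ID_BYTES would change; the rest of the calculation has the same shape.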

3. {5 marks} Suppose we have the following entries that we’re going to insert into an empty Extendible
Hash index structure shown below. You can assume that each bucket can hold up to 3 entries. Use the
same hash function from class (i.e., modulo last n bits of each binary value). The index contains no keys
so far, but we’ll start with pointers to 2 initially empty buckets.

Add these keys, in this order: 12, 15, 7, 24, 8, 6, 16—and show the resulting structure every time the
directory doubles, and at the end. Note that 12 = (1100)₂.

For the “end”, it is OK to add any remaining keys directly to the existing structure if it doesn’t split
again (to save you writing time).
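
A hand-drawn answer to this question can be cross-checked with a small simulation. The sketch below is one possible Python model of extendible hashing, not the course's reference implementation. It assumes that "the hash function from class" means taking the last global-depth bits of the key, that the directory starts at global depth 1 with two empty buckets, and that a full bucket is split immediately. The class and method names are hypothetical.

```python
class Bucket:
    def __init__(self, local_depth):
        self.local_depth = local_depth
        self.keys = []


class ExtendibleHash:
    def __init__(self, capacity=3, global_depth=1):
        self.capacity = capacity
        self.global_depth = global_depth
        self.directory = [Bucket(global_depth) for _ in range(2 ** global_depth)]
        self.doublings = 0

    def _slot(self, key):
        # Assumed hash function: the last global_depth bits of the key
        return key & ((1 << self.global_depth) - 1)

    def insert(self, key):
        bucket = self.directory[self._slot(key)]
        if len(bucket.keys) < self.capacity:
            bucket.keys.append(key)
            return
        # The bucket is full. Double the directory only if this bucket
        # already uses every bit the directory distinguishes.
        if bucket.local_depth == self.global_depth:
            self.directory = self.directory + self.directory
            self.global_depth += 1
            self.doublings += 1
        # Split the bucket on the next bit and create its split image.
        bucket.local_depth += 1
        image = Bucket(bucket.local_depth)
        high_bit = 1 << (bucket.local_depth - 1)
        for i, b in enumerate(self.directory):
            if b is bucket and i & high_bit:
                self.directory[i] = image
        # Redistribute the old keys, then retry the pending insert.
        old_keys, bucket.keys = bucket.keys, []
        for k in old_keys:
            self.directory[self._slot(k)].keys.append(k)
        self.insert(key)

    def snapshot(self):
        # Map each directory bit pattern to the keys of the bucket it points to
        width = self.global_depth
        return {format(i, f"0{width}b"): list(b.keys)
                for i, b in enumerate(self.directory)}


index = ExtendibleHash(capacity=3, global_depth=1)
for key in [12, 15, 7, 24, 8, 6, 16]:
    depth_before = index.global_depth
    index.insert(key)
    if index.global_depth != depth_before:
        print(f"inserting {key} doubled the directory to depth {index.global_depth}")
        print(index.snapshot())
print("final:", index.snapshot())
```

Directory slots that print the same key list share one bucket, which corresponds to a bucket whose local depth is smaller than the global depth.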

4. {4 marks} A data entry is a <key, rid> pair in a hash bucket. What is the maximum key length k
permitted so that at least 90 data entries will be able to fit on a 4K (4096-byte) bucket in an Alt. 2 hash
index? In other words, if the key length is any longer than k, then we won’t be able to fit 90 data entries
on a page. As usual, assume that each rid is 10 bytes long. Write down any reasonable assumptions, if
any, that you have to make. Show your work.
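
One way to set up the calculation for Question 4 is sketched below in Python. It assumes, as the exam does elsewhere, that housekeeping bytes are ignored, so the whole 4096-byte bucket is available for data entries. The names are illustrative.

```python
PAGE_BYTES = 4096
RID_BYTES = 10
MIN_ENTRIES = 90

# Each data entry needs key_len + RID_BYTES bytes, and MIN_ENTRIES of them must fit:
#   MIN_ENTRIES * (key_len + RID_BYTES) <= PAGE_BYTES
max_entry_bytes = PAGE_BYTES // MIN_ENTRIES
max_key_len = max_entry_bytes - RID_BYTES

print(max_entry_bytes, max_key_len)
```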

5. {4 marks} Suppose we have a B+ tree of height 2 containing 28,000 unique primary keys, kept
sorted in alphanumeric order. The leaf pages can contain a maximum of 140 data entries per page, and
these leaf pages are filled to capacity. How many pages need to be retrieved into an empty buffer pool
to determine the first 10,000 primary keys (only), in order? Show your work.
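
The page count for Question 5 can be reasoned about with a short Python sketch. It rests on several assumptions that are not stated in the question: the index is Alt. 2 (so the keys can be read from the leaf level without fetching data records), leaf pages are chained so a scan can move from leaf to leaf, and a tree of height 2 has two non-leaf levels (the root and one internal level) above the leaves.

```python
ENTRIES_PER_LEAF = 140
KEYS_WANTED = 10_000
HEIGHT = 2  # assumed: number of non-leaf levels above the leaf level

# One descent from the root reads one page per non-leaf level
non_leaf_pages = HEIGHT

# Then scan leaves left to right until the first 10,000 keys are covered
leaf_pages_read = -(-KEYS_WANTED // ENTRIES_PER_LEAF)  # ceiling division

total_pages = non_leaf_pages + leaf_pages_read
print(non_leaf_pages, leaf_pages_read, total_pages)
```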

6. {4 marks} Suppose a parent node in a B+ tree has 3 children: A, B, and C—where the middle node
is B. All 3 of A, B, and C are leaf pages, and they currently contain 56 data entries each. The order of
the tree is 42. Using the principles of B+ tree deletions and maintenance from class, how many keys can
we delete from B without having to merge two of A, B, and C? Assume that only node B has keys that
are being deleted from the tree. Briefly, justify your answer.
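
The counting argument for Question 6 can be sketched in Python as follows. This is only one reading of the question. It assumes that a leaf of a tree of order d must keep at least d entries, and that an underflowing leaf may redistribute with either adjacent sibling before a merge is considered. If the rules from class allow redistribution with only one sibling, the second figure applies instead.

```python
ORDER = 42            # d: each leaf holds between d and 2d entries
ENTRIES_PER_LEAF = 56

# Entries each leaf can give up before it reaches the minimum occupancy
spare_per_leaf = ENTRIES_PER_LEAF - ORDER

# Assumption 1: B may borrow from both A and C through redistribution
deletions_using_both_siblings = 3 * spare_per_leaf

# Assumption 2: B may borrow from only one sibling
deletions_using_one_sibling = 2 * spare_per_leaf

print(spare_per_leaf, deletions_using_both_siblings, deletions_using_one_sibling)
```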

7. {4 marks} Consider a B+ tree index of order d = 5 (i.e., max. 10 entries for any of its nodes/pages).
What is the maximum possible number of data entries at the leaf level that can appear in this B+ tree, if
its height is exactly 2 (i.e., has 3 levels)?
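
A minimal Python sketch of the capacity calculation for Question 7 follows. It assumes the usual convention that a non-leaf node with 2d keys has 2d + 1 child pointers, and that each leaf holds at most 2d data entries. The variable names are illustrative.

```python
D = 5
MAX_ENTRIES = 2 * D          # maximum keys (or data entries) per node
MAX_FANOUT = MAX_ENTRIES + 1 # maximum child pointers per non-leaf node
HEIGHT = 2                   # two non-leaf levels above the leaves (3 levels in all)

max_leaves = MAX_FANOUT ** HEIGHT
max_data_entries = max_leaves * MAX_ENTRIES

print(max_leaves, max_data_entries)
```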

8. {5 marks} Compute the total number of cylinders that the disk arm has to move for each of these 2
algorithms: (a) the regular elevator algorithm (SCAN with LOOK), just like in class; and (b) the SSTF
(Shortest Seek Time First) algorithm. Let us assume that the disk assembly is currently at cylinder 0
and is heading higher.

The cylinder requests that have already arrived are, in order: 1500, 2100, 4000, and 800. Assume that,
immediately after the page(s) for cylinder 2100 have been fetched (i.e., that request has been serviced),
the following requests arrive nearly simultaneously: 1300, 2000, 4500, and 1600. Thus, there are 8 service
requests in all.
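
Both schedules can be traced with a small simulation. The Python sketch below is an illustration, not the official solution. It assumes that the late requests join the queue as soon as cylinder 2100 has been serviced, that the elevator variant reverses direction only when no pending request lies ahead of the arm (LOOK), and that SSTF always picks the closest pending cylinder. The function name and its parameters are hypothetical.

```python
def simulate(policy, start, initial, late, trigger):
    """Return (total cylinders moved, service order) for 'look' or 'sstf'."""
    pending = list(initial)
    waiting = list(late)      # requests that arrive after `trigger` is serviced
    pos, direction = start, +1  # the arm starts out heading higher
    total, order = 0, []
    while pending:
        if policy == "sstf":
            nxt = min(pending, key=lambda c: abs(c - pos))
        else:
            # LOOK: keep going in the current direction while requests lie ahead
            ahead = [c for c in pending if (c - pos) * direction >= 0]
            if not ahead:
                direction = -direction
                ahead = pending
            nxt = min(ahead, key=lambda c: abs(c - pos))
        total += abs(nxt - pos)
        pos = nxt
        pending.remove(nxt)
        order.append(nxt)
        if nxt == trigger and waiting:
            pending.extend(waiting)
            waiting = []
    return total, order


initial = [1500, 2100, 4000, 800]
late = [1300, 2000, 4500, 1600]

look_total, look_order = simulate("look", 0, initial, late, trigger=2100)
sstf_total, sstf_order = simulate("sstf", 0, initial, late, trigger=2100)
print("elevator:", look_total, look_order)
print("SSTF:    ", sstf_total, sstf_order)
```

Changing the trigger cylinder or the arrival list makes it easy to see how sensitive each policy is to when the late requests show up.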

9. {4 marks} Consider the SYSIBM.SYSCOLUMNS and SYSIBM.SYSTABLES DB2 catalog tables
on the next 2 pages, and the query below. Recall that a TBCREATOR is a unique identifier in case
multiple tables have the same name.

In plain English, in one sentence, explain what the purpose of the following (relatively simple) SQL
catalog query is.

SELECT C.TBCREATOR, C.TBNAME
FROM   SYSIBM.SYSTABLES T,
       SYSIBM.SYSCOLUMNS C
WHERE  C.NAME = 'SALARY'
  AND  C.TBNAME = T.NAME
  AND  C.TBCREATOR = T.CREATOR
  AND  T.TYPE = 'T'
  AND  T.DBNAME = 'PAYROLL';
