CS571 Week3 Assignment

CS571_Week3_Assignment3.1

Provided below are tables with a small amount of data. For each, identify the normal form the table currently is in, and then describe the procedure you would follow to bring the design to the next normal form. Then follow that procedure and present the data in the new design. (The videos for this lesson should be able to help with this a lot.) For example, if a table meets the criteria for first normal form but fails to meet the criteria for second normal form, state the table is in first normal form, and then describe the process you would follow to bring the design into second normal form. If a table is not even in first normal form just state “Not even 1NF”.

We want to capture: the pizzas eaten by each person.

Name	Pizzas
Amy	mushroom, pepperoni
Ben	cheese, pepperoni
Cal	supreme
Dan	cheese, mushroom, pepperoni, sausage, supreme

Not in 1NF, due to a list of values in one column “pizzas”. Rules of 1NF: Each column should contain atomic values, a column should contain values that are the same type, each column should have a unique name, and the order in which data is saved doesn’t matter. The list of multi-values should be broken up into separate row for each value in the multi-list.

NAME	PIZZA
Amy	mushroom
Amy	Pepperoni
Ben	Cheese
Ben	Pepperoni
Cal	Supreme
Dan	Cheese
Dan	Mushroom
Dan	Pepperoni
Dan	Sausage
Dan	Supreme

We want to capture: The project codes associated with a client code, as well as the client contact and the project name.

ClientCode	ClientContact	ProjectCode	ProjectName
BV	Lambright, Teodoro	LDU	Aberrant Longitude
BV	Lambright, Teodoro	MEK	Pink Cloud
HK	Boyers	EFS	Intensive Indigo Moose
JU	Valley, Russell	XDB	Strong Omega
JU	Valley, Russell	JPQ	Lonesome Cloud

Table is in 1NF, but not 2NF. There is redundancy in the client code and client contact. The client code functionally determines the client contact. Process to achieve consists of decomposition-splitting the single table into multiple tables.

Client Code	Client Contact
BV	Lambright, Teodoro
HK	Boyers
JU	Valley, Russell

Client Code	ProjectCode	ProjectName
BV	LDU	Aberrant Longitude
BV	MEK	Pink Cloud
HK	EFS	Intensive Indigo Moose
JU	XDB	Strong Omega
JU	JPQ	Lonesome Cloud

We want to capture: Employees’ names as well as those employees’ dependents names.

Lname	Fname	Dependent1	Dependent2	Dependent3
Smith	John	Alice	Elizabeth	Michael
Wong	Franklin	Alice	Joy	Theodore
Wallace	Jennifer	Abner

Not in 1NF.

We want to capture: Employees’ names along with the department they belong to, as well as the location of the main office for each department.

EmployeeName	DepartmentName	DepartmentMainOffice
Smith, John	Research	Sugarland
Wong, Franklin	Research	Sugarland
English, Joyce	Research	Sugarland
Narayan, Ramesh	Research	Sugarland
Borg, James	Headquarters	Houston
Wallace, Jennifer	Administration	Stafford
Jabbar, Ahmad	Administration	Stafford
Zelaya, Alicia	Administration	Stafford

Often when normalizing a table design you don’t yet have data in your table. For instance, in cases where you are designing a new data structure, not analyzing an existing table. So, the small amount of data provided in the above examples cannot always be counted on. However, sometimes you might have to determine whether a table is in 2NF or 3NF based solely on the knowledge of what attributes functionally determine other attributes. Presented here is a new type of notation that enables you to do that type of analysis without being able to see data.

A -> B, C

This means that column A is the key for the table and it functionally determines columns B and C. In other words, this means that for every unique value in column A, there can be only one row and thus only one distinct value for the data in columns B and C.

Determining if a column’s data functionally determines another column’s data can usually be accomplished by asking the question “Can X have only one Y or can it have more than one Y?” An example might be “Can a doctor have only one patient or more than one patient?” If the answer is only one, then doctor functionally determines patient, otherwise if a doctor can have more than one patient, the doctor does not functionally determine the patient. In other words, if doctor functionally determines patient, then if you know the doctor, you can know the one and only patient he/she has. If doctor does not functionally determine patient (which of course it doesn’t), then you can’t just talk about the doctor and his/her patient. You have to specify the patient. This means that the key would have to include both doctor and patient.

Is the following true? Why or why not?

Doctor, Patient -> Prescription

Is the following true? Why or why not?

Doctor, Patient, Prescription -> Refills Remaining

Sometimes a key that is made up of more than one attribute (like Doctor, Patient) can technically functionally determine another attribute (like the Patient’s birth date). But it is also true that you don’t need Doctor in order to functionally determine a Patient’s birth date. This situation is what 2NF is all about. Making sure that all functionally determined attributes need all of the key columns to functionally determine them. In cases where you don’t, you need to decompose the related data into two or more tables. One for the full key (and any attributes you need the full key to determine), and one for each attribute that is only determined by a part of they key (along with that part of the key). For example, if you had:

Doctor, Patient -> Last Appointment Date and Time, Patient Birth Date

You would want to create two tables. One for:

Doctor, Patient -> Last Appointment Date and Time

And one for:

Patient -> Patient Birth Date

The problem this solves is that if a patient is seeing more than one doctor (perhaps a general practitioner and a neurologist), you don’t have to duplicate the patient’s birth date for each doctor he/she visits. This is redundant, and additionally makes it possible for the “two” birth dates to differ in the database when clearly a person doesn’t have two birth dates.

Should the following design be decomposed into two or more tables? Why or why not? And if decomposing, what columns belong in each resultant table?

Patient, Hospital Room Number -> Hospital Floor

Should the following design be decomposed into two or more tables? Why or why not? And if decomposing, what columns belong in each resultant table?

Doctor, Patient -> Hospital Admit Date and Time, Doctor’s Pager #

Sometimes a key can technically functionally determine two attributes but does so transitively. An example would be:

Employee -> Employee’s Boss, Employee’s Boss’s Email Address

The problem is that a boss likely has several people that report to them. So, if everywhere we record an employee’s boss, we also record their boss’s email address, it’s possible that the same boss might appear to have different email addresses when looked up by one employee vs. another. Solving this problem is what 3NF is all about. Again, this is not just about reducing redundancy but also reducing the chance for inconsistent data. To solve this problem, we’d want to decompose the design into two tables:

Employee -> Employee’s Boss

Boss -> Boss’s Email Address

Aside:

This decomposition is sufficient to meet the requirements of 3NF. However, an even better design would involve realizing that a boss is also an employee and store all employees in one table and have the relationship between employee and boss recursively refer to other rows in the same table:

Employee -> Employee Email Address, Boss

The Boss attribute would refer to the Employee column of another row in that same table.

Should the following design be decomposed into two or more tables? Why or why not? And if decomposing, what columns belong in each resultant table?

Doctor -> Department, Department Office Location

(Note: a department can have only one office location)

Should the following design be decomposed into two or more tables? Why or why not? And if decomposing, what columns belong in each resultant table?

Doctor, Patient, Date and Time -> Diagnosis, Typical Prognosis

(By Typical Prognosis, assume of the diagnosis in general, not the particular doctor diagnosing the particular patient.)

Lost Password