Tuesday, March 27, 2012

Normalising Your Database - Second Normal Form (2NF)


Now we've looked at normalising a database to 1NF (First Normal Form), we will continue to investigate normalising to Second Normal Form.
A table is in first normal form and each non-key field is functionally dependent on the entire primary key.

Look for values that occur multiple times in a non-key field. This tells you that you have too many fields in a single table.

A Practical Approach

In the example below, see all the repeating values in the name and ProjectTitle fields. This is an inefficient way to store and maintain data. In a well-designed database, the only data that is duplicated is in key fields used to connect tables. The presumption is that the data in key fields will rarely change (so it's OK if it's repeated) while the data in non-key fields may change frequently (so it's not OK to repeat it).


*EmployeeID
LastName
FirstName
*ProjectNumber
ProjectTitle
EN1-26
O'Brien
Sean
30-452-T3
STAR manual
EN1-26
O'Brien
Sean
30-457-T3
ISO procedures
EN1-26
O'Brien
Sean
31-124-T3
Employee handbook
EN1-33
Guya
Amy
30-452-T3
STAR manual
EN1-33
Guya
Amy
30-482-TC
Web Site
EN1-33
Guya
Amy
31-241-TC
New catalog
EN1-35
Baranco
Steven
30-452-T3
STAR manual
EN1-35
Baranco
Steven
31-238-TC
STAR prototype
EN1-36
Roslyn
Elizabeth
35-152-TC
STAR pricing
EN1-38
Schaaf
Carol
36-272-TC
Order system
EN1-40
Wing
Alexandra
31-238-TC
STAR prototype
EN1-40
Wing
Alexandra
31-241-TC
New catalog


A table with a multi-field primary key and repeating data in non-key fields


If a ProjectTitle changed, you would have to edit it in several records. And what would happen in this table if the EmployeeID was part of the primary key and you wanted to add a new ProjectNum and ProjectTitle even though no employees had yet been assigned?

The primary key cannot contain a null value so you couldn't add the new project. Additionally, if a project ended and you wanted to delete it, you would have to delete the individual values because, if you deleted the records containing the titles and an employee was assigned to only that project, you would also delete that employee's record - something that you may not want to do.

In the above example, the asterisks indicate the fields that make up the primary key of this table as it now stands. A multi-field primary key is necessary because neither the EmployeeID nor the ProjectNum fields contain unique values.

The reason there are repeated values in LastName, FirstName, and ProjectTitle is that these fields are dependent on only part of the primary key. The value in EmployeeID determines what the value in LastName will be but the value in ProjectNum has nothing to do with it. Similarly, the value in ProjectNum determines the value in ProjectTitle but EmployeeID does not. These non-key fields relate to only part of the primary key. They are not functionally dependent on the entire primary key.

The solution to this lies in breaking the table into smaller tables that do meet second normal form. You will find that more tables is the solution to most problems encountered during data normalisation.

Complying with second normal form
Now we'll take the table above and design new tables that will eliminate the repeated data in the non-key fields.

1. To decide what fields belong together in a table, think about which field determines the values in other fields. Create a table for those fields and enter the sample data.
2. Think about what the primary key for each table would be and about the relationship between the tables. If necessary, add foreign keys or a junction table.
3. Mark the primary key for each table and make sure that you don't have repeating data in non-key fields.

EMPLOYEES

*EmployeeID
Last Name
First Name
EN1-26
O'Brien
Sean
EN1-33
Guya
Amy
EN1-35
Baranco
Steven
EN1-36
Roslyn
Elizabeth
EN1-38
Schaaf
Carol
EN1-40
Wing
Alexandra


EMPLOYEES_PROJECTS
*EmployeeID
*ProjectNum
EN1-26
30-452-T3
EN1-26
30-457-T3
EN1-26
31-124-T3
EN1-33
30-328-TC
EN1-33
30-452-T3
EN1-33
32-244-T3
EN1-35
30-452-T3
EN1-35
31-238-TC
EN1-36
35-152-TC
EN1-38
36-272-TC
EN1-40
31-238-TC
EN1-40
31-241-TC

PROJECTS
*ProjectNum
ProjectTitle
30-452-T3
STAR manual
30-457-T3
ISO procedures
30-482-TC
Web site
31-124-T3
Employee handbook
31-238-TC
STAR prototype
31-238-TC
New catalog
35-152-TC
STAR pricing
36-272-TC
Order system

Examine the tables to make sure there are no repeating values in non-key fields and that the value in each non-key field is determined by the value(s) in the key field(s). This removes the modification anomaly of having the repeated values.

That's all for the second article on normal forms. Check the following article to view details of normalising your database to 3NF (3rd normal form)




No comments:

Post a Comment