What Is Normalization in Relational Databases?
Normalization is a Systematic Approach that is used to design a Relational Database table up to the higher normal form. The basic goal of normalization is to keep each attribute in only one place to eliminate redundancy and the inconsistencies that can result from redundancy. The process requires a deep understanding of each attribute and each attribute’s relationship to its primary key.
Key Deliverable of Normalization
- No Data Redundancy
- No Data Inconsistency
Understanding the Normal Forms
1NF – 1st Normalization Form
- Add a Primary Key and ensure that every attribute depends on that Primary Key. Each row as a whole should be Unique. Identify each set of related data with a Primary Key.
- Eliminate Repeating Groups in Individual Tables.
- Create a Separate Table for each set of Related Data.
- Make sure that each attribute is Atomic (not Multi-Valued). Values stored in a Column should be of the Same Domain. Multi-Valued can be divided, like instead of Full_Name, it can be First_Name, Middle_Name, and Last_Name.
2NF – 2nd Normalization Form
- Be in 1NF
- Create Foreign Key.
- Create separate tables for sets of values that apply to multiple records.
- Relate these tables with a foreign key.
- Single Column Primary Key that is not functionally dependent on any subset of the candidate key relation
- It should not have Partial Dependency.
3NF – 3rd Normal Form
- Be in 2NF
- To move our 2NF table into 3NF, we again need to divide our table.
- Eliminate Data not dependent on the Primary Key
- It doesn’t have Transitive Dependency.
Example: We have again divided our tables and created a new table that stores Salutations. There are no Transitive Functional Dependencies, and hence our table is in 3NF.
Transitive Dependency Example
Changing the non-key column Full Name may change the Salutation.
Name: William Salutation: Mr.
Name: Lina Salutation: Ms.
Name | Salutation |
---|---|
William | Mr. |
Lina | Ms. |
If changing “Name” also changes “Salutation”, it means there’s a transitive dependency, which breaks 3NF rules.
Real-World Consideration: When to Break the Rules?
Microsoft Says:
Adhering to the third normal form, while theoretically desirable, is not always practical. If you have a Customers table and you want to eliminate all possible inter-field dependencies, you must create separate tables for cities, ZIP codes, sales representatives, customer classes, and any other factor that may be duplicated in multiple records. In theory, normalization is worth pursuing. However, many small tables may degrade performance or exceed open file and memory capacities.