DBMS Normalization

Kamesh Shekhar Prasad
4 min readFeb 20, 2021

Normalization is the way to organize data in database.

It is used to remove insertion, updation and deletion anomalies.

Normal form is used to reduce redundancy from the database.

Types of Normal Forms:-

  1. First Normal Form:-

First criteria of 1NF is that table must contain atomic value.

Single attributes of a table can’t hold multiple values.

Above table is not in 1NF because there are two phone numbers. So we should write each phone number in separate row.

So like above table we can write phone number of each employee in separate row.

2. Second Normal Form:-

First criteria is table must be in 1NF.

All non key attributes must be dependent on primary key.

Here non prime attribute TEACHER_AGE is dependent on TEACHER_ID which is a proper subset of a candidate key. So it violates the rule of 2NF.

So teacher id and age is kept in another table and id and subject kept in another to convert it into 2NF.

3. Third Normal Form

If a table is in 2NF and not contain any transitive partial dependency then it is in 3NF.

It is used to reduce the data duplication.

A relation is in 3NF if it holds it holds one of the following conditions for every non- trivial function dependency X-> Y.

  1. X is a super key.
  2. Y is a prime attribute.

Here in above table super key are {EMP_ID}, {EMP_ID, EMP_NAME}.. so on.

Here candidate key is {EMP_ID}

Here except EMP_ID, all attributes are non-prime.

Here, EMP_STATE & EMP_CITY dependent on EMP_ZIP and EMP_ZIP dependent on EMP_ID.

The non-prime attributes (EMP_STATE, EMP_CITY) transitively dependent on super key(EMP_ID). It violates the rule of third normal form.

So we seperated EMP_CITy and EMP_STATE to new EMPLOYEE_ZIP table.

4. Boyce Codd normal form

BCNF is advanced version of 3NF.

For BCNF, the table should be in 3NF, and for every FD, LHS is super key.

Here employee work in more than one department.

Here employee id column is dependent on employee country and employee department column is dependent on department type and employee department number.

EMP_ID and EMP_DEPT are candidate key.

So we brake whole table into 3 tables.

This is in BCNF because left side part of both the functional dependencies is a key.

5. Fourth normal form

If a table is in BCNF and has no multi-valued dependency then it will be in 4NF. For dependency A->B, if for single value of A, multiple values of B exists, then this will be in multi-valued dependency.

This table is in 3NF but course and hobby are having no relation to each other and one student id 21 having 2 courses and 2 hobby. so it is repetition of data.

So, we decompose it into two tables to make it in 4NF.

6. Fifth normal form(5NF)

A relation is in 5NF if it is in 4NF and not contains any join dependency and joining should be lossless.

So here john is teaching two subject compute and math and that for semester 1.

So, we decompose it into 3 tables to achieve 5NF.

Conclusion:-

So, main thing in normalization is that we focus on how to remove anomalies. And we also decompose table. For more details you can see Normalization.

--

--