Normal forms:
- Normalization divides larger tables into smaller tables and links them using relationships.
- The purpose of Normalization is to eliminate redundant (useless) data and ensure data is stored logically.
- Normalization is a process of organizing the data in the database to avoid data redundancy, insertion anomaly, update anomaly & deletion anomaly.
- Normalization is used for mainly two purposes,
- Eliminating redundant(useless) data.
- Ensuring data dependencies make sense i.e data is logically stored.
- There are three types of anomalies that occur when the database is not normalized.
- These are – Insertion, update and deletion anomaly.
- Insert anomaly: Suppose a new employee joins the company, who is under training and currently not assigned to any department then we would not be able to insert the data into the table if emp_dept field doesn’t allow nulls.
- Update anomaly: In the above table, we have two rows for employee Rick as he belongs to two departments of the company. If we want to update the address of Rick then we have to update the same in two rows or the data will become inconsistent.
- Delete anomaly: Suppose, if at a point of time the company closes the department D890 then deleting the rows that are having emp_dept as D890 would also delete the information of employee Maggie since she is assigned only to this department.
- To overcome these anomalies we need to normalize the data.
- Normalization rules are divided into the following normal forms:
- First normal form(1NF)
- Second normal form(2NF)
- Third normal form(3NF)
- Boyce & Codd normal form (BCNF)
- Fourth Normal Form (4NF)
- Fifth Normal Form (5NF)
- First Normal Form (1NF)
- For a table to be in the First Normal Form, it should follow the following 4 rules:
- It should only have single(atomic) valued attributes/columns.
- Values stored in a column should be of the same domain.
- All the columns in a table should have unique names.
- And the order in which data is stored does not matter.
- Second Normal Form (2NF)
- For a table to be in the Second Normal Form,
- It should be in the First Normal form.
- And, it should not have Partial Dependency.
- Third Normal Form (3NF)
- A table is said to be in the Third Normal Form when,
- It is in the Second Normal form.
- And, it doesn't have Transitive Dependency.
- Boyce and Codd Normal Form (BCNF)
- Boyce and Codd Normal Form is a higher version of the Third Normal form.
- This form deals with certain type of anomaly that is not handled by 3NF.
- A 3NF table which does not have multiple overlapping candidate keys is said to be in BCNF. For a table to be in BCNF, following conditions must be satisfied:
- R must be in 3rd Normal Form
- and, for each functional dependency ( X → Y ), X should be a super Key.
- Fourth Normal Form (4NF)
- A table is said to be in the Fourth Normal Form when,
- It is in the Boyce-Codd Normal Form.
- And, it doesn't have Multi-Valued Dependency.
- Fifth Normal Form (5NF)
- A table is said to be in the Fifth Normal Form when,
- It is in the Fourth Normal Form.
- And, it doesn't have non-loss decomposed (Join dependency).
Tags:
DBMS