Second Normal Form (2NF)
- For a table to be in the Second Normal Form, it must satisfy two conditions:
- The table should be in the First Normal Form.
- There should be no Partial Dependency.
- What is Dependency?
- Let's take an example of a Student table with columns student_id, name, reg_no(registration number), branch and address(student's home address).
- In this table, student_id is the primary key.
- It will be unique for every row.
- Hence we can use student_id to fetch any row of data from this table.
- Even for a case, where student names are the same, if we know the student_id we can easily fetch the correct record.
- So, a Primary Key (PK) for a table is the column or a group of columns(composite key) which can uniquely identify each record in the table.
- Example:
- Search for branch name of student with student_id 10.
- Search for branch name of student with student_id 11.
- Therefore, every other column depends on PK, or can be fetched using PK.
- This is called Dependency.
- What is Partial Dependency?
- Example:
- For a simple table like Student, a single column like student_id can uniquely identfy all the records in a table.
- But this is not true all the time.
- Let's create another table for Subject, which will have subject_id and subject_name fields and subject_id will be the primary key.
- Now we have a Student table with student information and another table Subject for storing subject information.
- Let's create another table Score, to store the marks obtained by students in the respective subjects.
- We will also be saving name of the teacher who teaches that subject along with marks.
- In the score table we are saving the student_id to know which student's marks are these and subject_id to know for which subject the marks are for.
- Together, student_id + subject_id forms a Candidate Key for this table, which can be the Primary key.
- Example:
- if I ask you to get me marks of student with student_id 10, can you get it from this table? No.
- Because you don't know for which subject.
- if I give you subject_id, you would not know for which student.
- Hence we need student_id + subject_id to uniquely identify any row.
- But where is Partial Dependency?
- Now if you look at the Score table, we have a column names teacher which is only dependent on the subject, for Java it's Java Teacher and for C++ it's C++ Teacher & so on.
- A primary key for this table is a composition of two columns which is student_id & subject_id.
- The teacher's name only depends on subject, hence the subject_id.
- This is Partial Dependency, where an attribute in a table depends on only a part of the primary key and not on the whole key.
- How to remove Partial Dependency?
- There can be many different solutions for this.
- Here, our objective is to remove teacher's name from Score table.
- The simplest solution is to remove columns teacher from Score table and add it to the Subject table.
- Hence, the Subject table will become:
- And our Score table is now in the second normal form, with no partial dependency.
Tags:
DBMS