GATE CS - DATABASES (DBMS):Normalization
Mastering normalization concepts and implementation.
Normalization
Normalization is the process of organizing data to reduce redundancy and eliminate anomalies.
Database Anomalies
1. Insertion Anomaly
Cannot insert data without other related data.
Example: Cannot insert student without course enrollment.
2. Deletion Anomaly
Deleting data causes loss of other related data.
Example: Deleting last enrollment deletes course information.
3. Update Anomaly
Updating data requires updating multiple rows.
Example: Updating course name requires updating all enrollment rows.
Functional Dependencies
Functional Dependency (FD): X → Y means Y is functionally dependent on X.
Meaning: For each value of X, there is exactly one value of Y.
Example: SID → Name (Student ID determines Name)
Types of Functional Dependencies
Trivial FD: X → Y where Y ⊆ X
Non-trivial FD: X → Y where Y ⊄ X
Full FD: X → Y where no proper subset of X determines Y
Partial FD: X → Y where proper subset of X determines Y
Transitive FD: X → Y and Y → Z, then X → Z (transitively)
Closure of Functional Dependencies
Closure (F+): Set of all FDs that can be inferred from given FDs.
Armstrong's Axioms:
- Reflexivity: If Y ⊆ X, then X → Y
- Augmentation: If X → Y, then XZ → YZ
- Transitivity: If X → Y and Y → Z, then X → Z
Normal Forms
First Normal Form (1NF)
Rule: All attributes must be atomic (single-valued).
Example:
Not 1NF:
Student(SID, Name, Courses)
Courses: "CS101, CS102" // Not atomic
1NF:
Student(SID, Name)
Enrollment(SID, Course)
Second Normal Form (2NF)
Rule:
- Must be in 1NF
- No partial dependency (non-key attribute fully dependent on primary key)
Example:
Not 2NF:
Enrollment(SID, CID, Name, Grade)
Primary Key: (SID, CID)
FD: SID → Name // Partial dependency
2NF:
Student(SID, Name)
Enrollment(SID, CID, Grade)
Third Normal Form (3NF)
Rule:
- Must be in 2NF
- No transitive dependency (non-key attribute depends on another non-key attribute)
Example:
Not 3NF:
Student(SID, Name, Dept, DeptHead)
FD: SID → Dept, Dept → DeptHead // Transitive
3NF:
Student(SID, Name, Dept)
Department(Dept, DeptHead)
Boyce-Codd Normal Form (BCNF)
Rule:
- Must be in 3NF
- Every determinant is a candidate key
Example:
Not BCNF:
Enrollment(SID, CID, Instructor)
FD: (SID, CID) → Instructor, CID → Instructor
Candidate Key: (SID, CID)
Determinant: CID (not candidate key)
BCNF:
Enrollment(SID, CID)
Course(CID, Instructor)
Decomposition
Decomposition splits a relation into multiple relations.
Properties:
- Lossless Join: No information loss
- Dependency Preserving: All FDs preserved
Lossless Join Test:
- R1 ∩ R2 → R1 or R1 ∩ R2 → R2
GATE CS Important Points
- Functional Dependencies: Understand and identify FDs
- Normal Forms: Know 1NF, 2NF, 3NF, BCNF rules
- Anomalies: Understand insertion, deletion, update anomalies
- Decomposition: Practice lossless join decomposition
- Closure: Calculate closure of attribute sets
Practice Tips
- Identify FDs: Practice finding functional dependencies
- Check Normal Forms: Determine which normal form a relation is in
- Normalize: Convert relations to higher normal forms
- Decompose: Practice lossless join decomposition
- Previous Year Questions: Solve GATE normalization questions