Data Integrity
Data Integrity refers to the accuracy, consistency, and reliability of data stored in a database. In DB2, ensuring data integrity means that the data remains correct, unchanged (unless intended), and trustworthy throughout its lifecycle—whether during storage, processing, or retrieval.
Data integrity is crucial for preventing data corruption, redundancy, or invalid data from entering the system. DB2 ensures data integrity by using constraints, rules, and relationships to maintain a valid database state even in multi-user environments.
Types -
DB2 supports four primary types of data integrity:
- Entity Integrity
- Referential Integrity
- Domain Integrity
- User-Defined Integrity
Entity Integrity -
Entity Integrity ensures that every table row is uniquely identifiable by using a primary key. A primary key is a column (or set of columns) that uniquely identifies each row in a table and cannot contain NULL values.
Example - Consider the Employees table:
Employee_ID | Name | Department |
---|---|---|
101 | Alice | HR |
102 | Bob | Finance |
- Primary Key: Employee_ID
- Why Important?: No two employees can have the same ID, and every employee must have an ID (no NULL values allowed).
Violation Example: If another row is inserted with Employee_ID = 101, the system will reject the operation since it would violate entity integrity by creating duplicate primary keys.
Referential Integrity -
Referential Integrity ensures that the relationships between tables remain valid by using foreign keys. A foreign key in one table refers to the primary key of another table, creating a parent-child relationship between the two tables. Referential integrity ensures that data in the child table always has a valid reference in the parent table.
Example - Consider two tables: Employees and Departments.
Employees table:
Employee_ID | Name | Department_ID |
---|---|---|
101 | Alice | 1 |
102 | Bob | 2 |
Departments table:
Department_ID | Department_Name |
---|---|
1 | HR |
2 | Finance |
- Foreign Key: Department_ID in the Employees table references the Department_ID in the Departments table.
- Why Important?: Referential integrity ensures that every employee is assigned to a valid department.
Violation Example: If an employee is added with Department_ID = 3 but no such department exists in the Departments table, DB2 will reject the operation.
Domain Integrity -
Domain Integrity ensures that values in a column are valid and consistent by enforcing data types, constraints, and rules. It defines the acceptable range, format, or type of values for each column.
Example - Consider the Products table:
Product_ID | Product_Name | Price |
---|---|---|
1 | Laptop | 50000 |
2 | Mouse | 1000 |
- Price Constraint: The Price column can only store positive numbers.
- Data Type: The Product_ID column must store integers.
Violation Example: If you try to insert a row with Product_ID = 'ABC', DB2 will reject the operation because the column only allows integers.
User-Defined Integrity -
User-Defined Integrity involves custom business rules and logic that ensure the data meets specific business requirements. These rules are not predefined in DB2 but can be implemented using triggers, stored procedures, or application logic.
Example - n a payroll system, a rule might state: "An employee's bonus cannot exceed 20% of their salary". You can implement this rule using a trigger:
CREATE TRIGGER bonus_check BEFORE INSERT OR UPDATE ON Salaries FOR EACH ROW WHEN (NEW.Bonus > (NEW.Salary * 0.2)) SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = 'Bonus cannot exceed 20% of salary';
- Why Important?: User-defined integrity ensures that data follows custom rules specific to the business context.
Violation Example: If a user tries to insert a bonus greater than 20% of the salary, the system will reject the operation.