File Handling


Before starting, remember that the mainframe environment does not work directly with files from external environments (e.g., Windows, Linux, etc.). Files in the mainframe environment differ from those used on other platforms: instead of simple text files, the mainframe environment uses PS (Physical Sequential) and VSAM files.

Knowing the basic terminology makes the file concept much easier to understand. The basic terms are Information, Field (or Data item), Record, Physical record, Logical record, and File.

Example - Let us take a simple example to understand this concept clearly.

Scenario - Let us assume we have an employee file with 5 employee records. Each record contains EMPLOYEE-ID, EMPLOYEE-NAME, EMPLOYEE-DESG, and EMPLOYEE-SALARY.

The physical employee file with records -

The file record structure or file record layout is -

01 EMPFILE-RECORD. 
   05 EMPLOYEE-ID        PIC X(05).
   05 EMPLOYEE-NAME      PIC X(15).
   05 EMPLOYEE-DESG      PIC X(10). 
   05 EMPLOYEE-SALARY    PIC 9(10). 
   05 FILLER             PIC X(40). 

Information -

Information is the meaningful data that is stored in the file. In the above example, the data is highlighted according to its meaning.

Field or Data item -

A field is a named item of meaningful information in the record. Each field name corresponds to a variable in the record structure. In the above example - EMPLOYEE-ID, EMPLOYEE-NAME, EMPLOYEE-DESG, and EMPLOYEE-SALARY are the fields in the record.

The number of characters in a field is called the field size. In the above example - the EMPLOYEE-ID size is five characters.

Fields are of three types based on their usage -

  • Primary Key Fields - These fields hold a value that is unique across all records; duplicates are strictly not allowed. In the above example - the EMPLOYEE-ID field has a unique id for each record.
  • Alternate Key Fields - These fields hold non-unique values, so duplicates are allowed. For example - the EMPLOYEE-DESG field can be used as an alternate key to get employee records by designation.
  • Descriptive Fields - These fields are for informational purposes only and add meaning to the record. In the above example - EMPLOYEE-SALARY and EMPLOYEE-NAME are descriptive fields.
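
The primary and alternate key roles described above could be declared roughly as in the fragment below. This is only a sketch under assumptions: it assumes the employee file is an indexed (VSAM KSDS) file, and the names EMPFILE, EMPDD, and WS-FILE-STATUS are hypothetical. It is a declaration fragment in fixed-format source, not a complete program.

```cobol
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * EMPFILE and EMPDD are hypothetical names for illustration.
           SELECT EMPFILE ASSIGN TO EMPDD
               ORGANIZATION IS INDEXED
               ACCESS MODE IS DYNAMIC
      * Primary key: unique for every record.
               RECORD KEY IS EMPLOYEE-ID
      * Alternate key: the same designation may repeat.
               ALTERNATE RECORD KEY IS EMPLOYEE-DESG
                   WITH DUPLICATES
               FILE STATUS IS WS-FILE-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  EMPFILE.
       01 EMPFILE-RECORD.
          05 EMPLOYEE-ID        PIC X(05).
          05 EMPLOYEE-NAME      PIC X(15).
          05 EMPLOYEE-DESG      PIC X(10).
          05 EMPLOYEE-SALARY    PIC 9(10).
          05 FILLER             PIC X(40).
       WORKING-STORAGE SECTION.
      * Hypothetical item that receives the two-character status.
       01 WS-FILE-STATUS        PIC X(02).
```

The descriptive fields (EMPLOYEE-NAME and EMPLOYEE-SALARY) need no clause of their own; they are simply part of the record. On z/OS, an alternate key also requires an alternate index to be defined for the VSAM data set, which is outside the scope of this sketch.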

Record -

A record is a collection of related fields treated as a single entity. One or more fields together form a record. In the above example, EMPLOYEE-ID, EMPLOYEE-NAME, EMPLOYEE-DESG, and EMPLOYEE-SALARY (plus the FILLER) together form the EMPFILE-RECORD.

The total size of all fields in the record is known as the record size. In the above example, the record size is 80 characters (5 + 15 + 10 + 10 + 40, including the FILLER).
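
As a rough check of the field size and record size, the sketch below copies the layout into WORKING-STORAGE and asks the compiler for the lengths. The program name RECSIZE and the items WS-FIELD-SIZE and WS-RECORD-SIZE are hypothetical, and the code assumes fixed-format source and a compiler that supports the LENGTH intrinsic function (for example, Enterprise COBOL or GnuCOBOL).

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. RECSIZE.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Same layout as EMPFILE-RECORD in the example above.
       01 EMPFILE-RECORD.
          05 EMPLOYEE-ID        PIC X(05).
          05 EMPLOYEE-NAME      PIC X(15).
          05 EMPLOYEE-DESG      PIC X(10).
          05 EMPLOYEE-SALARY    PIC 9(10).
          05 FILLER             PIC X(40).
      * Hypothetical work items that hold the computed sizes.
       01 WS-FIELD-SIZE         PIC 9(03).
       01 WS-RECORD-SIZE        PIC 9(03).
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Size of one field: EMPLOYEE-ID is PIC X(05), so 5.
           COMPUTE WS-FIELD-SIZE = FUNCTION LENGTH(EMPLOYEE-ID).
      * Size of the whole record: 5 + 15 + 10 + 10 + 40 = 80.
           COMPUTE WS-RECORD-SIZE = FUNCTION LENGTH(EMPFILE-RECORD).
           DISPLAY "EMPLOYEE-ID SIZE: " WS-FIELD-SIZE.
           DISPLAY "RECORD SIZE     : " WS-RECORD-SIZE.
           STOP RUN.
```

All fields here use the default DISPLAY usage, so each character or digit position occupies one byte. With packed or binary usages the record size would no longer equal the sum of the PIC lengths.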

Physical record -

One line in the file is one physical record. In the above example, the file has five physical records.

Logical record -

A logical record is the record layout used by the COBOL program. The program processes only one record at a time. The logical record layout is -

01 EMPFILE-RECORD. 
   05 EMPLOYEE-ID        PIC X(05).
   05 EMPLOYEE-NAME      PIC X(15).
   05 EMPLOYEE-DESG      PIC X(10). 
   05 EMPLOYEE-SALARY    PIC 9(10). 
   05 FILLER             PIC X(40). 
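
To illustrate how a program works on one logical record at a time, here is a minimal sketch of a sequential read loop. It assumes the employee file is a sequential (PS) file with 80-character records; the names EMPREAD, EMPFILE, EMPDD, WS-FILE-STATUS, and WS-EOF-FLAG are hypothetical, and the source is in fixed format.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. EMPREAD.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * EMPDD is a hypothetical DD name supplied by the JCL.
           SELECT EMPFILE ASSIGN TO EMPDD
               ORGANIZATION IS SEQUENTIAL
               FILE STATUS IS WS-FILE-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  EMPFILE
           RECORD CONTAINS 80 CHARACTERS.
      * Logical record: each READ fills this area with one record.
       01 EMPFILE-RECORD.
          05 EMPLOYEE-ID        PIC X(05).
          05 EMPLOYEE-NAME      PIC X(15).
          05 EMPLOYEE-DESG      PIC X(10).
          05 EMPLOYEE-SALARY    PIC 9(10).
          05 FILLER             PIC X(40).
       WORKING-STORAGE SECTION.
       01 WS-FILE-STATUS        PIC X(02).
       01 WS-EOF-FLAG           PIC X(01) VALUE "N".
          88 END-OF-FILE        VALUE "Y".
       PROCEDURE DIVISION.
       MAIN-PARA.
           OPEN INPUT EMPFILE.
           IF WS-FILE-STATUS NOT = "00"
               DISPLAY "OPEN FAILED, STATUS: " WS-FILE-STATUS
               STOP RUN
           END-IF.
      * Each pass of the loop handles exactly one record.
           PERFORM UNTIL END-OF-FILE
               READ EMPFILE
                   AT END
                       SET END-OF-FILE TO TRUE
                   NOT AT END
                       DISPLAY EMPLOYEE-ID " " EMPLOYEE-NAME
               END-READ
           END-PERFORM.
           CLOSE EMPFILE.
           STOP RUN.
```

Every successful READ overwrites EMPFILE-RECORD with the next record, so the fields always describe only the record currently being processed.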

File -

A file is a named collection of records that stores data permanently on secondary storage (e.g., disks, tapes). In a mainframe environment, files are usually stored on disk.

For the above example, the physical file name is highlighted below -