Joining records
Joining records
The JOINKEYS statement is used to join two files based on matching key fields. It is similar to performing a JOIN operation, where records from two datasets are combined when they share common values in coded key fields.
JOINKEYS allows users to perform various types of joins, including -
- Inner Join: Includes only records that have matching keys in both files.
- Left Outer Join (JOIN UNPAIRED,F1): Includes all records from the first file and only matching records from the second file.
- Right Outer Join (JOIN UNPAIRED,F2): Includes all records from the second file and only matching records from the first file.
- Full Outer Join (JOIN UNPAIRED,F1,F2): Includes all records from both files, regardless of matching keys.
Syntax -
//SYSIN DD *
JOINKEYS FILE=F1,FIELDS=(…),[SORTED],
JOINKEYS FILE=F2,FIELDS=(….),[SORTED],
JOIN UNPAIRED,F1,[F2,ONLY]
REFORMAT FIELDS=(…)
OPTION COPY
OUTFIL FNAME=BOTH,INCLUDE/OMIT COND=(…),BUILD=(..)
OUTFIL FNAME=F1ONLY,INCLUDE/OMIT COND=(…),BUILD=(..)
OUTFIL FNAME=F2ONLY,INCLUDE/OMIT COND=(…),BUILD=(..)
/*
F1 | Specifies file1 |
F2 | Specifies file2 |
JOIN UNPAIRED | Used to specify whether unpaired records (i.e., records without a match in the other dataset) should be included. |
SORTED | Specifies the records are already sorted and do a copy only. |
REFORMAT FIELDS | Specifies to reformat the records from both files. |
Points to Note -
- The two files can be of different types, lengths and have keys in different locations.
- The records from input files can be processed in several ways before and after they are joined together.
- DFSORT control statements are allowed to be added with JOINKEYS including SORT or COPY, INCLUDE or OMIT, INREC, OUTREC, and OUTFIL.
- The first JOINKEYS statement identifies SORTJNF1 as the DDname for the F1 data set and indicates ascending keys. The second JOINKEYS statement identifies SORTJNF2 as the DDname for the F1 data set and indicates ascending keys.
Examples -
Scenario1 - Left Outer Join.
//STEP01 EXEC PGM=SORT
//JOIN1 DD DSN=MATEPK.INPUT.FILE1,DISP=SHR
//JOIN2 DD DSN=MATEPK.INPUT.FILE2,DISP=SHR
//SORTOUT DD DSN=MATEPK.OUTPUT.FILE,
// DISP=(NEW,CATLG,DELETE),
// SPACE=(CYL,(100,10)),UNIT=SYSDA
//SYSIN DD *
JOINKEYS F1=JOIN1,FIELDS=(1,5,CH,A)
JOINKEYS F2=JOIN2,FIELDS=(1,5,CH,A)
REFORMAT FIELDS=(F1:1,10,F2:11,10)
SORT FIELDS=(1,5,CH,A)
JOIN UNPAIRED,F1
/*
This specifies a left outer join, meaning all records from the first file (JOIN1) will be included in the output, even if no matching records are found in the second file(JOIN2).
Scenario2 - Full Outer Join.
//STEP01 EXEC PGM=SORT
//JOIN1 DD DSN=MATEPK.INPUT.FILE1,DISP=SHR
//JOIN2 DD DSN=MATEPK.INPUT.FILE2,DISP=SHR
//SORTOUT DD DSN=MATEPK.OUTPUT.FILE,
// DISP=(NEW,CATLG,DELETE),
// SPACE=(CYL,(100,10)),UNIT=SYSDA
//SYSIN DD *
JOINKEYS F1=JOIN1,FIELDS=(1,5,CH,A)
JOINKEYS F2=JOIN2,FIELDS=(1,5,CH,A)
REFORMAT FIELDS=(F1:1,10,F2:11,10)
SORT FIELDS=(1,5,CH,A)
JOIN UNPAIRED,F1,F2
/*
This specifies a full outer join, meaning all records from both datasets (JOIN1 and JOIN2) will be included, even if no match is found in the other dataset.