SUM Suppressing duplicate records
The SUM statement with FIELDS=NONE suppresses duplicate records: for each group of records with identical sort key values, only one record is kept and no fields are summed. This is useful for removing duplicates from a dataset when no totals are needed.
Syntax -
//SYSIN DD *
  SORT FIELDS=...
  SUM FIELDS=NONE
/*
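SUM FIELDS=NONE - Specifies that no fields are summed and that only one record is retained for each set of records with duplicate keys. Duplicates are identified by the keys defined on the SORT statement. Which record of a duplicate set is retained is unpredictable unless the EQUALS option is in effect, in which case the first record in input order is kept.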
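For context, the control statements above sit inside a complete sort step. The following is a minimal sketch rather than a definitive job: the step name DEDUP and the dataset names YOUR.INPUT.DATA and YOUR.OUTPUT.DATA are placeholders, the 10-byte key is an assumption, and the space values need adjusting for real data. OPTION EQUALS is included on the assumption that the first record of each duplicate set is the one you want to keep.

```jcl
//DEDUP    EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//* Input dataset (placeholder name)
//SORTIN   DD DSN=YOUR.INPUT.DATA,DISP=SHR
//* Output dataset (placeholder name and space values)
//SORTOUT  DD DSN=YOUR.OUTPUT.DATA,
//            DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,5),RLSE)
//SYSIN    DD *
* Assumed key: positions 1-10, character, ascending
  OPTION EQUALS
  SORT FIELDS=(1,10,CH,A)
  SUM FIELDS=NONE
/*
```

Control statements start in column 2 or later; a line with an asterisk in column 1 is treated as a comment.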
Examples -
Scenario 1 - Suppressing duplicates based on a key field.
//SYSIN DD *
  SORT FIELDS=(1,10,CH,A)
  SUM FIELDS=NONE
/*
Sorts the records in ascending order on the character field in positions 1-10, then suppresses duplicates on that key, keeping only one record for each unique 10-byte key value.
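To make the effect concrete, here is a small sketch that runs the same control statements against in-stream data. The four input records are made up for illustration, the step name is hypothetical, and OPTION EQUALS is an added assumption so that the retained record is predictable.

```jcl
//SCEN1    EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//* Made-up sample records; the key is in positions 1-10
//SORTIN   DD *
CUST000002 ORDER-A
CUST000001 ORDER-B
CUST000002 ORDER-C
CUST000001 ORDER-D
/*
//* Write the result to the job output for inspection
//SORTOUT  DD SYSOUT=*
//SYSIN    DD *
  OPTION EQUALS
  SORT FIELDS=(1,10,CH,A)
  SUM FIELDS=NONE
/*
```

Under these assumptions, SORTOUT should hold two records: CUST000001 ORDER-B and CUST000002 ORDER-A. The ORDER-C and ORDER-D records are dropped because their keys duplicate earlier records.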
Scenario 2 - Suppressing duplicates for a subset of fields.
//SYSIN DD *
  SORT FIELDS=(1,5,CH,A,10,3,CH,A)
  SUM FIELDS=NONE
/*
Sorts the records in ascending order on two character keys: positions 1-5 and positions 10-12. Records are treated as duplicates only when both keys match, so only one record is kept for each unique combination. Bytes outside the keys (for example, positions 6-9) are not compared.
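The two-key behavior can be sketched with in-stream data as well. The records below are invented for illustration: positions 1-5 hold a department code, positions 7-9 hold a sequence number that is not part of the key, and positions 10-12 hold a location code. The step name is hypothetical, and OPTION EQUALS is again an added assumption.

```jcl
//SCEN2    EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//* Made-up sample records; keys are positions 1-5 and 10-12
//SORTIN   DD *
DEPT1 001NYC ALICE
DEPT1 002NYC BOB
DEPT1 003LAX CAROL
DEPT2 004NYC DAVE
/*
//* Write the result to the job output for inspection
//SORTOUT  DD SYSOUT=*
//SYSIN    DD *
  OPTION EQUALS
  SORT FIELDS=(1,5,CH,A,10,3,CH,A)
  SUM FIELDS=NONE
/*
```

Under these assumptions, SORTOUT should hold three records in this order: the CAROL record (DEPT1, LAX), the ALICE record (DEPT1, NYC), and the DAVE record (DEPT2, NYC). The BOB record is dropped because it matches ALICE on both keys, even though positions 7-9 differ.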