Occasionally, a hardware or system failure may prevent the successful completion of a sort or merge. Examples include a physically defective output volume or device, or a failure of the operating system for reasons unrelated to the sort. Because sorts tend to be among the most resource-intensive applications, it may be advantageous to resume execution from a point shortly before the failure rather than restart the job at the beginning of the failed job step. MFX provides this restart capability through its support of the standard z/OS Checkpoint-Restart feature.
To instruct MFX to take checkpoints, code CKPT or CHKPT (either spelling) on the SORT/MERGE control statement and supply a SORTCKPT DD statement.
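The two requirements above can be sketched in a job step as follows. This is an illustrative fragment only: the program name PGM=SORT, all data set names, the unit names, the space values, and the sort key are assumptions and should be replaced with values that suit the installation.

```jcl
//* Sketch only: data set names, units, and space values are
//* hypothetical placeholders, not taken from the product manual.
//SORTSTEP EXEC PGM=SORT
//SYSOUT   DD  SYSOUT=*
//SORTIN   DD  DSN=HLQ.INPUT.DATA,DISP=SHR
//SORTOUT  DD  DSN=HLQ.OUTPUT.DATA,DISP=(NEW,CATLG,DELETE),
//             UNIT=TAPE
//SORTWK01 DD  UNIT=SYSDA,SPACE=(CYL,(50,10))
//* Checkpoint data set: kept on abnormal end so that it is still
//* available when the step is restarted.
//SORTCKPT DD  DSN=HLQ.SORT.CKPT,DISP=(NEW,CATLG,CATLG),
//             UNIT=SYSDA,SPACE=(TRK,(15,15))
//SYSIN    DD  *
  SORT FIELDS=(1,10,CH,A),CKPT
/*
```

The abnormal-termination disposition of the SORTCKPT data set is set to CATLG here because a checkpoint data set that is deleted when the step fails cannot be used for a restart.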
For a sort, checkpoints are taken at the beginning of Phase 3 before the output data sets (if any) are opened, and at every end-of-volume of a SORTOUT data set when OUTFIL is not in use. An operator may then restart the sort at Phase 3 or at any end-of-volume checkpoint. If necessary, a new output volume or device with characteristics identical to those of the defective volume or device may be substituted.
For a merge or copy, MFX takes a checkpoint at every end-of-volume of a SORTOUT data set when OUTFIL is not in use.
Checkpoints cannot be taken within a user exit routine.
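As a rough illustration of how a job might later be resubmitted to restart from one of these checkpoints, the fragment below uses the standard z/OS deferred checkpoint restart conventions: a RESTART parameter on the JOB statement and a SYSCHK DD statement placed ahead of the first EXEC statement. The job name, accounting information, step name, data set name, and the checkid C0000002 are all hypothetical; the actual checkid is the one the system reports when the checkpoint is taken. How the step's remaining DD statements must be adjusted for a restart is not covered here and should be checked against the z/OS Checkpoint-Restart documentation.

```jcl
//* Sketch only: job name, step name, checkid, and data set name
//* are hypothetical placeholders.
//SORTJOB  JOB (ACCT),'RESTART SORT',CLASS=A,MSGCLASS=X,
//             RESTART=(SORTSTEP,C0000002)
//* SYSCHK identifies the checkpoint data set that was written
//* through the SORTCKPT DD statement in the original run.
//SYSCHK   DD  DSN=HLQ.SORT.CKPT,DISP=OLD
//SORTSTEP EXEC PGM=SORT
//* The DD statements of the original step follow here.
```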