Job Decomposition
Splitting Synchronisation Stages
The Server Engine ensures that all ‘writes’ (input pins) to passive stages are completed before starting any ‘reads’ (output pins). This effectively introduces into a Job a set of synchronisation points (to borrow terminology from parallel computing).
The Parallel Engine does not support the equivalent behaviour, so S2PX decomposes the job into smaller pieces whose execution is explicitly synchronised by a parent Job Sequence (a sketch of how stages are classified as synchronisation points follows the list of exceptions below).
As always, there are exceptions to this rule:
- InterProcess stages are treated as Active stages even though they are technically Passive.
- Shared Container references are classified as Synchronisation stages if the referenced Shared Container includes one or more synchronisation stages.
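To make the split rule and its exceptions concrete, here is a minimal Python sketch of the classification. The stage model, the example passive stage types, and the helper function are assumptions made for illustration only; they are not the actual S2PX implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Stage:
    name: str
    stage_type: str
    input_links: List[str] = field(default_factory=list)
    output_links: List[str] = field(default_factory=list)
    referenced_container: Optional["Container"] = None  # set for Shared Container references

@dataclass
class Container:
    stages: List[Stage] = field(default_factory=list)

# Example passive stage types only; the real list is defined by DataStage itself.
PASSIVE_STAGE_TYPES = {"Hashed File", "Sequential File", "ODBC", "UniVerse"}

def is_synchronisation_stage(stage: Stage) -> bool:
    """Does this stage introduce a synchronisation point that forces a job split?"""
    # Exception: InterProcess stages are treated as Active even though they are
    # technically Passive.
    if stage.stage_type == "InterProcess":
        return False
    # Exception: a Shared Container reference counts as a synchronisation stage
    # if the referenced container includes one or more synchronisation stages.
    if stage.stage_type == "Shared Container" and stage.referenced_container:
        return any(is_synchronisation_stage(s) for s in stage.referenced_container.stages)
    # General rule: a passive stage that is both written to and read from in the
    # same job must have all writes complete before any reads begin.
    return (stage.stage_type in PASSIVE_STAGE_TYPES
            and bool(stage.input_links) and bool(stage.output_links))

# Example: an ODBC stage with both an input and an output link forces a split.
odbc = Stage("ODBC_0", "ODBC", input_links=["load"], output_links=["read"])
print(is_synchronisation_stage(odbc))  # True
```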
Splitting Server Synchronization Processes
Unlike in a Parallel Job, a single stage in a Server Job can act as both an input and an output. When a Server Job contains such a design pattern, the Server Engine will ‘synchronize’ execution so that any load processes (handling the data arriving on the stage’s input links) are completed before the read processes (which place data on the stage’s output links) are performed. For example, consider the following job design:
The processes used by the Server Job will be synchronized such that the stages up to and including the write operation of the first ODBC stage (ODBC_0 above) are fully executed before the ODBC read operation and all of its successive stages to the right are started. To convert this job design to run on the Parallel Engine, S2PX creates a Parallel Job for each of the separate sub-jobs, along with a Job Sequence which runs each job in order to mimic the synchronization provided by the Server Engine:
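The synchronization the generated Job Sequence provides amounts to running the ‘write’ sub-job to completion before the ‘read’ sub-job starts. The sketch below illustrates that ordering using the dsjob command line; the project and job names are hypothetical, the exact dsjob flags and exit-code mapping vary by DataStage release, and the artefact S2PX actually generates is a Job Sequence of Job Activity stages, not a script.

```python
import subprocess

def run_job(project: str, job: str) -> None:
    """Run one child job to completion and fail fast if it does not finish cleanly."""
    # 'dsjob -run -jobstatus' waits for the job to finish and reflects its status
    # in the exit code; the exact mapping depends on the DataStage release.
    result = subprocess.run(["dsjob", "-run", "-jobstatus", project, job], check=False)
    if result.returncode != 0:
        raise RuntimeError(f"{job} did not finish successfully")

# The write side (everything up to and including the ODBC_0 write) must finish
# before the read side (the ODBC_0 read and the stages to its right) may start.
run_job("MyProject", "MyJob_part1")   # load: stages up to the ODBC_0 write
run_job("MyProject", "MyJob_part2")   # read: ODBC_0 read and downstream stages
```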
Splitting Server Synchronization Processes within Containers
A similar challenge exists for containers (whether local or shared) that include stages acting as both a source and a target. The same principles apply here, but the containers themselves must also be split:
This Server Job design would be converted to this Parallel equivalent:
New Parameters
If a Server Job makes use of DataStage Job Status Macros (IBM Documentation) then decomposition means that some of these macros will no longer produce the same output. For this reason the generated Job Sequence captures their values and passes them to the child jobs as newly-introduced parameters. Any existing uses of the following macros are automatically updated to use these new parameters (a simplified sketch of this substitution follows the table):
Parameter Name | Value Expression |
---|---|
SV_JobName | DSJobName |
SV_JobController | DSJobController |
SV_JobStartDate | DSJobStartDate |
SV_JobStartTime | DSJobStartTime |
SV_JobStartTimestamp | DSJobStartTimestamp |
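The sketch below is a simplified illustration of that substitution: the macro-to-parameter mapping is taken from the table above, while the rewrite helper and the example expression are assumptions made for this sketch, not the actual S2PX implementation.

```python
import re

# Mapping from the table: each Job Status Macro is captured by the Job Sequence
# and passed to the child job as a new parameter with the corresponding name.
MACRO_TO_PARAMETER = {
    "DSJobName": "SV_JobName",
    "DSJobController": "SV_JobController",
    "DSJobStartDate": "SV_JobStartDate",
    "DSJobStartTime": "SV_JobStartTime",
    "DSJobStartTimestamp": "SV_JobStartTimestamp",
}

def rewrite_expression(expression: str) -> str:
    """Replace Job Status Macro references with the new parameter names.

    The exact parameter-reference syntax depends on where the expression appears
    in the job, so this sketch substitutes the bare names only.
    """
    # Match whole macro names, longest first, so DSJobStartTime does not clip
    # DSJobStartTimestamp.
    pattern = re.compile(
        r"\b(" + "|".join(sorted(MACRO_TO_PARAMETER, key=len, reverse=True)) + r")\b")
    return pattern.sub(lambda m: MACRO_TO_PARAMETER[m.group(1)], expression)

# Example: a derivation that used the macros directly now reads the parameters
# whose values the parent Job Sequence captured at run time.
print(rewrite_expression("DSJobName : '_' : DSJobStartTimestamp"))
# -> "SV_JobName : '_' : SV_JobStartTimestamp"
```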