
Performance of S2PX-Generated Parallel Jobs

Your S2PX-generated Parallel Jobs may take longer to execute than the Server Jobs from which they were derived. There are several potential reasons for this:

  • As you may already be aware, Parallel Jobs have a longer startup time than Server Jobs, so low-volume jobs may run more slowly. Higher-volume jobs, however, are likely to perform better thanks to the inherent performance benefits of the Parallel engine.

  • For each Server Job, S2PX may generate multiple Parallel Jobs (each with its own startup overhead), along with a coordinating Job Sequence. This decomposition is unavoidable because it preserves stage synchronisation.

  • Replacing Hashed Files with the DRS stage may slow bulk record reads, but lookups (when they can exploit database indices) may be faster.

  • All S2PX-generated jobs use Parallel stages defined to run in Sequential mode, because S2PX cannot infer the information needed to generate partitioning specifications. Developers can easily remediate this in the generated Parallel Jobs where there is a business case to do so.

We recommend that your first attempts at improving the performance of your Parallel Jobs focus on this last topic (partition parallelism). Providing your jobs with the keys they can use for partitioning is likely to yield the most significant performance improvement for the time invested.
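To illustrate why key-based partitioning helps, here is a minimal conceptual sketch (plain Python, not DataStage or S2PX syntax; the field name `cust_id` and the helper `hash_partition` are illustrative assumptions, not part of the product): hash partitioning assigns every record with the same key value to the same partition, so each partition can then be processed independently and in parallel without cross-partition coordination.

```python
# Conceptual sketch of hash partitioning, NOT DataStage code.
# Records with equal key values always land in the same partition,
# so key-based operations (aggregation, joins, lookups) remain
# correct when each partition is processed in parallel.
from collections import defaultdict

def hash_partition(records, key, num_partitions):
    """Distribute records across partitions by hashing the given key field."""
    partitions = defaultdict(list)
    for rec in records:
        partitions[hash(rec[key]) % num_partitions].append(rec)
    return partitions

# Hypothetical example data: orders keyed by customer.
orders = [
    {"cust_id": 1, "amount": 10},
    {"cust_id": 2, "amount": 20},
    {"cust_id": 1, "amount": 5},
]
parts = hash_partition(orders, "cust_id", 4)
# Both cust_id=1 records are guaranteed to share a partition, so a
# per-partition sum by customer gives the same answer as a serial pass.
```

This is exactly the information S2PX cannot infer for you: which field plays the role of `cust_id`. Once a Developer supplies that key in the generated Parallel Job's partitioning specification, the Parallel engine can spread the work across nodes safely.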


© 2015-2024 Data Migrators Pty Ltd.