Aggregator Stage

Summary

The Aggregator Stage classifies data rows from a single input link into groups and computes totals or other aggregate functions for each group.

IBM Documentation

Server Stage: Aggregator Stages - IBM Documentation

Parallel Stage: Aggregator stage - IBM Documentation

Conversion Notes

  • The Server Aggregator does not require a group key. If the group key is omitted, all rows are treated as a single group and the stage returns one row for all input data. The Parallel Aggregator stage, however, requires at least one group key column. To address this, when no group key is specified the translated Parallel Aggregator stage inserts an additional dummy column (with a constant value) to use as the key.

  • IBM’s documentation states that the order of aggregation results is not guaranteed in Server jobs. The generated Parallel jobs may also exhibit this characteristic, so users should take care that their job designs do not rely on this undefined behaviour. In testing, we have seen the generated Parallel Aggregator solutions behave consistently with the Server version.
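The dummy key column described in the first note can be illustrated with a minimal Python sketch. This is not the code the conversion generates; the function names and the DUMMY_KEY column name are hypothetical and chosen only for illustration. It shows that grouping on a constant-valued column produces a single output row for all input data, which is the behaviour of a Server Aggregator with no group key.

```python
from collections import defaultdict


def aggregate_sum(rows, key_columns, value_column):
    """Group rows by key_columns and sum value_column for each group.

    Like the Parallel Aggregator, this refuses to run without a key.
    """
    if not key_columns:
        raise ValueError("at least one group key column is required")
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[col] for col in key_columns)
        totals[key] += row[value_column]
    return dict(totals)


def aggregate_without_key(rows, value_column):
    """Emulate a keyless aggregation by grouping on a constant dummy column."""
    # DUMMY_KEY is a hypothetical column name; every row gets the same value,
    # so all rows fall into one group.
    with_dummy = [{**row, "DUMMY_KEY": 1} for row in rows]
    grouped = aggregate_sum(with_dummy, ["DUMMY_KEY"], value_column)
    return list(grouped.values())


rows = [{"amount": 10}, {"amount": 5}, {"amount": 7}]
print(aggregate_without_key(rows, "amount"))  # prints [22]
```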

Structural changes

First and Last aggregation types require custom handling: additional functions, transforms and sorts are applied before the actual aggregation takes place.
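As a rough illustration of why First and Last need a sort ahead of the aggregation, the Python sketch below tags each row with its arrival position, sorts by group key and position, and then picks the first and last value in each group. This is an assumed approach for illustration only; the stages and functions that the conversion actually generates may differ, and the function and column names here are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def first_and_last(rows, key_column, value_column):
    """Return the first and last value seen for each group key."""
    # Tag each row with its arrival position so the original order
    # is preserved once the rows are sorted by key.
    tagged = [
        (row[key_column], seq, row[value_column])
        for seq, row in enumerate(rows)
    ]
    # Sort by group key, then by arrival position within the group.
    tagged.sort(key=itemgetter(0, 1))

    result = {}
    for key, group in groupby(tagged, key=itemgetter(0)):
        values = [value for _, _, value in group]
        result[key] = {"first": values[0], "last": values[-1]}
    return result


rows = [
    {"region": "EU", "price": 3},
    {"region": "US", "price": 8},
    {"region": "EU", "price": 4},
    {"region": "US", "price": 1},
    {"region": "EU", "price": 9},
]
print(first_and_last(rows, "region", "price"))
```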

Count Rows is also replaced with a simple sum operation: a preceding Transformer stage introduces a dummy column with a constant value of 1, which is then summed.
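The equivalence between counting rows and summing a constant 1 can be sketched in Python as follows. The add_count_column function stands in for the preceding Transformer stage and sum_by_key for the Aggregator; both names, and the ROW_COUNT_ONE column name, are hypothetical.

```python
from collections import defaultdict


def add_count_column(rows):
    """Transformer-like step: append a constant column with the value 1."""
    return [{**row, "ROW_COUNT_ONE": 1} for row in rows]


def sum_by_key(rows, key_column, value_column):
    """Aggregator-like step: sum value_column for each key."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key_column]] += row[value_column]
    return dict(totals)


rows = [{"region": "EU"}, {"region": "US"}, {"region": "EU"}]
counts = sum_by_key(add_count_column(rows), "region", "ROW_COUNT_ONE")
print(counts)  # prints {'EU': 2, 'US': 1}
```

Summing the constant column gives the same result as counting the rows in each group, so no dedicated count function is needed in the aggregation step.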

By way of example, this Server job:

Server Job

…is translated to this Parallel job:

Parallel Job - Top level

 

See Parallel Job Structural Differences for more details of how your job’s structure may alter during conversion.

Server features not supported

Feature

Asset Query (?)

Comment

All features supported

© 2015-2024 Data Migrators Pty Ltd.