Aggregator Stage
Summary
The Aggregator Stage classifies data rows from a single input link into groups and computes totals or other aggregate functions for each group.
IBM Documentation
Server Stage: Aggregator Stages - IBM Documentation
Parallel Stage: Aggregator stage - IBM Documentation
Conversion Notes
The Server Aggregator stage does not require a group key. If the group key is omitted, all rows are treated as a single group and the stage returns a single row for all input data. The Parallel Aggregator stage, however, requires at least one group key column. To address this, when no group key is specified, the translated Parallel Aggregator stage inserts an additional dummy column (with a constant value) to use as the key.
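The dummy-key approach can be illustrated outside DataStage. The following is a minimal Python sketch of the idea only, not the code the conversion generates: the column name `DUMMY_KEY`, the helper names and the sample rows are all hypothetical.

```python
from collections import defaultdict

# Hypothetical column name; the name used in a converted job may differ.
DUMMY_KEY = "DUMMY_KEY"


def aggregate_sum(rows, key_columns, value_column):
    """Group rows by key_columns and sum value_column.

    Like the Parallel Aggregator, this requires at least one key column.
    """
    if not key_columns:
        raise ValueError("at least one group key column is required")
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[c] for c in key_columns)
        totals[key] += row[value_column]
    return dict(totals)


def aggregate_without_key(rows, value_column):
    """Emulate a keyless aggregation by adding a constant-valued key column."""
    keyed = [{**row, DUMMY_KEY: 1} for row in rows]
    return aggregate_sum(keyed, [DUMMY_KEY], value_column)


rows = [{"amount": 10}, {"amount": 5}, {"amount": 7}]
result = aggregate_without_key(rows, "amount")
print(result)
```

Because every row carries the same key value, all rows fall into one group and a single output row is produced, which matches the keyless Server behaviour described above.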
IBM’s documentation states that the order of aggregation results is not guaranteed in Server jobs. The generated Parallel jobs may also exhibit this characteristic, so users should check that their job designs do not rely on this ‘undefined’ behaviour. In testing, we have seen the generated Parallel Aggregator solutions behave consistently with the Server version.
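One way to avoid depending on output order when comparing Server and Parallel results is to sort both outputs by group key first. The sketch below is illustrative only: the `normalise` helper and the sample rows are hypothetical and do not come from an actual job.

```python
def normalise(rows, key_columns):
    """Return rows sorted by their group key so that row order no longer matters."""
    return sorted(rows, key=lambda row: tuple(row[c] for c in key_columns))


# Made-up outputs containing the same groups in a different order.
server_output = [{"region": "west", "total": 13}, {"region": "east", "total": 17}]
parallel_output = [{"region": "east", "total": 17}, {"region": "west", "total": 13}]

# A direct comparison is order-sensitive; the normalised comparison is not.
same_order = server_output == parallel_output
same_content = normalise(server_output, ["region"]) == normalise(parallel_output, ["region"])
print(same_order, same_content)
```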
Structural changes
The First and Last aggregation types require custom handling: additional functions, transforms and sorts are applied before the actual aggregation takes place.
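The general pattern of preparing rows before aggregating can be sketched in Python. This is an assumption about the shape of the technique (number the rows, sort, then take the first and last value per group), not a description of the stages the conversion actually generates. The function and column names are hypothetical.

```python
def first_and_last(rows, key_column, value_column):
    """Return the first and last value seen per group, in input order."""
    # "Transform" step: tag each row with its arrival position.
    numbered = list(enumerate(rows))
    # "Sort" step: order by group key, then by arrival position.
    numbered.sort(key=lambda item: (item[1][key_column], item[0]))
    # "Aggregate" step: the first row of each group sets both values,
    # and every later row of the group overwrites "last".
    result = {}
    for _, row in numbered:
        key = row[key_column]
        if key not in result:
            result[key] = {"first": row[value_column], "last": row[value_column]}
        else:
            result[key]["last"] = row[value_column]
    return result


rows = [
    {"region": "east", "amount": 10},
    {"region": "west", "amount": 4},
    {"region": "east", "amount": 7},
    {"region": "west", "amount": 9},
]
summary = first_and_last(rows, "region", "amount")
print(summary)
```

Carrying the arrival position through the sort is what keeps "first" and "last" tied to input order rather than to whatever order the sort happens to produce within a group.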
Count Rows is replaced with a simple sum operation: a preceding Transformer stage adds a dummy column with a constant value of 1, which the Aggregator then sums.
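The equivalence between counting rows and summing a constant 1 can be shown with a short Python sketch. The column name `ROW_COUNT_ONE` and both helpers are hypothetical and stand in for the Transformer and Aggregator stages respectively.

```python
from collections import defaultdict


def add_count_column(rows, column="ROW_COUNT_ONE"):
    """Stand-in for the Transformer: add a column whose value is always 1."""
    return [{**row, column: 1} for row in rows]


def sum_by_key(rows, key_column, value_column):
    """Stand-in for the Aggregator: sum value_column within each group."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key_column]] += row[value_column]
    return dict(totals)


rows = [{"region": "east"}, {"region": "west"}, {"region": "east"}]
counts = sum_by_key(add_count_column(rows), "region", "ROW_COUNT_ONE")
print(counts)
```

Each row contributes exactly 1 to its group's total, so the sum equals the number of rows in the group.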
By way of example, this Server job:
…is translated to this Parallel job:
See Parallel Job Structural Differences for more details of how your job’s structure may alter during conversion.
Server features not supported
Feature | Asset Query (?) | Comment |
---|---|---|
All features supported | | |
© 2015-2024 Data Migrators Pty Ltd.