Most DataStage jobs can be tested via MettleCI’s Unit Testing function simply by replacing input and output stages. However, some job designs - while commonplace - will necessitate a more advanced Unit Testing configuration.

The sections below outline MettleCI Unit Test Spec patterns that best match these job designs.

Input stage with rejects

The Input stage can be Unit Tested by including both read and reject links in the given clause of the Unit Test Spec.

The CSV data specified for the rejects link should contain records that will actually test the flow of records through the reject path(s) of the job.

Info
If you use MettleCI’s automated Unit Test creation function for a Job that needs this Unit Testing pattern, MettleCI will ensure the resulting Spec in that new Unit Test reflects the above approach.

Code Block

language	yaml

given:
  - stage: Input
    link: Read
    path: Input-Read.csv
  - stage: Input
    link: Rejects
    path: Input-Rejects.csv
when:
...

Output stage with rejects

The Output stage can be Unit Tested by including the write link in the then clause of the Unit Test Spec and the reject link in the given clause of the Unit Test Spec.

The CSV data specified for the rejects link should contain records that will actually test the flow of records through the reject path(s) of the job.

Info
If you use MettleCI’s automated Unit Test creation function for a Job that needs this Unit Testing pattern, MettleCI will ensure the resulting Spec in that new Unit Test reflects the above approach.

Code Block

language	yaml

given:
  - stage: Output
    link: Rejects
    path: Output-Rejects.csv
when:
...
then:
  - stage: Output
    link: Write
    path: Output-Write.csv

Stored Procedure stage

A Stored Procedure stage not only connects to an external Database for processing, it also produces output records which are not deterministic. MettleCI’s Unit Test function therefore needs to be told that the Stored Procedure stage is to be replaced by Unit Test data during Unit Test execution. This is done by adding the input link to the then clause of the Unit Test Spec and the output link to the given clause of the Unit Test Spec.

The CSV input specified by the given clause contains the data that will become the flow of records from the Stored Procedure stage. The data could simulate what the real stored procedure would have produced had it processed the Unit Test input records, but it does not have to.

...

Code Block

language	yaml

given:
  - stage: StoredProcedure
    link: Output
    path: StoredProcedure-Output.csv
when:
...
then:
  - stage: StoredProcedure
    link: Input
    path: StoredProcedure-Input.csv

Classic Surrogate Key Generator stage

The classic Surrogate Key Generator stage generates sequential keys from a given start value (typically set via a Job Parameter). To ensure that the values generated by the Surrogate Key Generator stage remain the same from one Unit Test execution to the next, set a fixed value for the start value Job Parameter in the when clause of the Unit Test Spec.

Code Block

language	yaml

given:
...
when:
  job: KeyGeneratorExample
  parameters:
    START_KEY: 100
then:
...
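
The effect of fixing the start value can be illustrated with a toy model in Python. This is only a sketch of the idea and not how DataStage implements the stage; the function name `sequential_keys` is hypothetical, and the start value 100 mirrors the `START_KEY` parameter in the Spec above.

```python
from itertools import count, islice


def sequential_keys(start_key, how_many):
    """Toy model of a sequential key generator: start_key, start_key + 1, ..."""
    return list(islice(count(start_key), how_many))


# Two "executions" with the same fixed start value yield identical keys,
# so the expected CSV in the then clause can contain stable key values.
first_run = sequential_keys(100, 3)
second_run = sequential_keys(100, 3)

print(first_run)
print(first_run == second_run)
```

If the start value were allowed to vary between executions, the generated keys would differ and any expected output containing them could no longer be compared reliably.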

Database or Flat File-backed Surrogate Key Generator stage

Surrogate Key Generators backed by a Database or a Flat File will produce output records which are not deterministic. The use of a Database-backed Surrogate Key Generator will also require a live connection to an external Database which is not ideal for Unit Testing. To Unit Test job designs containing this type of Surrogate Key Generator, the Surrogate Key Generator stage needs to be removed from the job and replaced with Unit Test Data. This is done by adding the input link in the then clause of the Unit Test Spec and the output link in the given clause of the Unit Test Spec.

The CSV input specified by the given clause contains the data that will become the flow of records from the Surrogate Key Generator stage. The data could simulate what the real Surrogate Key Generator would have produced had it processed the Unit Test input records, but it does not have to. The easiest way to simulate the Surrogate Key Generator output records using MettleCI Workbench is to copy the CSV specified in the then clause, add a new column to represent the generated key, and set appropriate key values.

Code Block

language	yaml

given:
  - stage: KeyGenerator
    link: Output
    path: KeyGenerator-Output.csv
when:
...
then:
  - stage: KeyGenerator
    link: Input
    path: KeyGenerator-Input.csv
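
The copy-and-add-a-column approach described above can also be scripted outside MettleCI Workbench. The following is a minimal sketch using Python's standard `csv` module. It assumes the CSV files carry a header row, and the column names (`CUSTOMER_ID`, `NAME`, `CUSTOMER_KEY`), the sample records, and the helper `add_surrogate_keys` are all hypothetical.

```python
import csv
import io


def add_surrogate_keys(then_csv_text, key_column="CUSTOMER_KEY", start=1):
    """Return CSV text with an extra key column appended to every record."""
    reader = csv.DictReader(io.StringIO(then_csv_text))
    fieldnames = list(reader.fieldnames) + [key_column]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for offset, row in enumerate(reader):
        # Assign fixed, sequential key values so the test data is repeatable.
        row[key_column] = start + offset
        writer.writerow(row)
    return out.getvalue()


# Hypothetical contents of KeyGenerator-Input.csv (the then clause data).
then_csv = "CUSTOMER_ID,NAME\nC001,Alice\nC002,Bob\n"

# Candidate contents for KeyGenerator-Output.csv (the given clause data).
given_csv = add_surrogate_keys(then_csv, start=100)
print(given_csv)
```

In-memory strings are used here to keep the sketch self-contained; in practice the same function could be fed the text of the real CSV files.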

Sparse Lookup stage

When building DataStage jobs using the Lookup stage, switching between a Sparse and a Normal lookup is as simple as changing the lookup type of the reference Database stage. However, when a DataStage job is compiled to OSH and executed, the Lookup stage is not used to perform the sparse lookup. Instead, the Lookup stage is replaced with the Database operator, which is responsible for reading input rows, looking up values from the database, and producing output records. This is why all Database log messages in the DataStage Director are attributed to the Lookup stage, and why the Database stage never appears in the Monitor of the DataStage Director.

It is not possible for the MettleCI Unit Test feature to change the lookup from Sparse to Normal without fundamentally transforming the run-time job design. Doing so would invalidate any Unit Test results, defeating the purpose of this MettleCI function. To Unit Test job designs using Sparse lookups, add the input link in the then clause of the Unit Test Spec and the output link in the given clause of the Unit Test Spec.

The CSV input specified by the given clause contains the data that will become the flow of records from the Sparse Lookup stage. The data could simulate what the real Sparse Lookup stage would have produced had it processed the Unit Test input records, but it does not have to.

Code Block

language	yaml

given:
  - stage: SparseLookup
    link: Output
    path: SparseLookup-Output.csv
when:
...
then:
  - stage: SparseLookup
    link: Input
    path: SparseLookup-Input.csv

Alternative approach for testing Sparse-Lookup-heavy jobs

For jobs where the vast majority of job logic is implemented using Sparse Lookup stages, replacing all lookups with Unit Test data would result in little-to-no DataStage logic being tested.

For this type of Job design, an alternative testing approach is to leave the Sparse Lookup in place and replace the input and output stages with Unit Test data. A live Database connection will be required during testing but the when clause can be used to set job parameters that dictate database connection settings.

Note
Technically this is an Integration Test, not a Unit Test. The Unit Test Harness does not provide any functionality for populating database reference tables with Unit Test data prior to test execution, so users are responsible for managing Integration Test setup and tear down through governance and/or CI/CD pipeline customisation.

Code Block

language	yaml

given:
  - stage: Source
    link: Input
    path: Source-Output.csv
when:
  parameters:
    DbName: MyUnitTestDb
    DbUser: MyUnitTestUser
    DbPass: "{iisenc}dKANDae233DJIQeiidm=="
then:
  - stage: Target
    link: Output
    path: Target-Output.csv
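
Because setup and tear down are left to the user, one possible shape for a CI/CD pipeline step is a wrapper that populates the reference table before the test run and removes it afterwards. The sketch below is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` with an in-memory database as a stand-in for the real test database, and the table name `customer_ref`, its columns, and the helper `reference_table` are hypothetical. It does not call any MettleCI function.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def reference_table(conn, rows):
    """Create and populate a reference table, then drop it once the test is done."""
    conn.execute(
        "CREATE TABLE customer_ref (customer_id TEXT PRIMARY KEY, region TEXT)"
    )
    conn.executemany("INSERT INTO customer_ref VALUES (?, ?)", rows)
    conn.commit()
    try:
        yield
    finally:
        # Tear down runs even if the test step raises an error.
        conn.execute("DROP TABLE customer_ref")
        conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for the real test database

with reference_table(conn, [("C001", "EMEA"), ("C002", "APAC")]):
    # In a real pipeline, the test execution would be triggered here.
    rows_during_test = conn.execute(
        "SELECT COUNT(*) FROM customer_ref"
    ).fetchone()[0]

tables_after = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE name = 'customer_ref'"
).fetchone()[0]

print(rows_during_test, tables_after)
```

Wrapping the tear down in a `finally` clause keeps the database clean for the next run even when a test fails, which helps the reference data stay predictable between executions.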

...
