At present, the MettleCI automated unit test specification generator does not fully handle input and output stages within shared or local containers. The YAML specification it generates models the job's own inputs and outputs correctly but omits any inputs or outputs that sit inside containers. The unit test harness can handle inputs and outputs inside containers, but the YAML must be modified by hand to include them. They go in the usual place (given: for inputs and then: for outputs) but with names that disambiguate the links and stages referenced.

...

This is resolved by prefixing the container name to the stage name, separated by a dot, to give a name of the form containerInvocation.stage. In the case of nested containers, the outer container name is prepended to the inner one, to as many levels as necessary to model the nesting accurately.
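The naming rule can be sketched in a few lines of Python. This is an illustration only and not part of MettleCI: the function name qualified_stage_name, the nested container names OuterC1 and InnerC2, and the check that rejects dots inside a name are all assumptions made for the example.

```python
def qualified_stage_name(container_path, stage):
    """Join container invocation names (outermost first) and a stage name with dots.

    Hypothetical helper, not a MettleCI API. Rejecting empty names and names
    containing a dot is an assumption, made so the result stays unambiguous.
    """
    parts = [*container_path, stage]
    for part in parts:
        if not part or "." in part:
            raise ValueError(f"invalid name component: {part!r}")
    return ".".join(parts)


# Single level: the invocation name is prefixed to the stage name.
print(qualified_stage_name(["OrderAddressC1"], "ds_cust"))

# Nested (hypothetical names): the outer container comes first.
print(qualified_stage_name(["OuterC1", "InnerC2"], "ds_cust"))
```

A stage that is not inside any container keeps its plain name, which corresponds to passing an empty container path.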

Note: Container inputs and outputs themselves are not modeled or intercepted since they are really just connectors with no physically manifested input or output.

For example, consider this container, which has a container input and output (which do not need to be modeled in the YAML) as well as actual physical inputs and outputs, which we do need to intercept and test.

...

Here is a job using the above shared container with an invocation ID of “OrderAddressC1”:

...

When we generate a unit test spec from this job, the resulting YAML looks like this:

Code Block
---
given:
- stage: "sf_orders"
  link: "ln_filter"
  path: "sf_orders-ln_filter.csv"
when:
  job: "processOrders"
  controller: null
  parameters: {}
then:
- stage: "sf_samples"
  link: "ln_samples"
  path: "sf_samples-ln_samples.csv"
  cluster: null
  ignore: null
- stage: "sf_addressedOrders"
  link: "ln_addressed"
  path: "sf_addressedOrders-ln_addressed.csv"
  cluster: null
  ignore: null
- stage: "sf_summary"
  link: "ln_summary"
  path: "sf_summary-ln_summary.csv"
  cluster: null
  ignore: null

...

If we generate a unit test spec for the above monolithic job, it looks like this. As expected, all the stages are present. Note in particular the stages ds_cust in given: and ds_flaggedCust in then:. These are the stages inside the container that we need to add manually to the containerized job's test spec.

Code Block
---
given:
- stage: "sf_orders"
  link: "ln_filter"
  path: "sf_orders-ln_filter.csv"
- stage: "ds_cust"
  link: "ln_cust"
  path: "ds_cust-ln_cust.csv"
when:
  job: "monolithic_v1"
  controller: null
  parameters: {}
then:
- stage: "ds_flaggedCust"
  link: "ln_flagged"
  path: "ds_flaggedCust-ln_flagged.csv"
  cluster: null
  ignore: null
- stage: "sf_samples"
  link: "ln_samples"
  path: "sf_samples-ln_samples.csv"
  cluster: null
  ignore: null
- stage: "sf_addressedOrders"
  link: "ln_addressed"
  path: "sf_addressedOrders-ln_addressed.csv"
  cluster: null
  ignore: null
- stage: "sf_summary"
  link: "ln_summary"
  path: "sf_summary-ln_summary.csv"
  cluster: null
  ignore: null

Note: We do not always need to “undo” containers to derive the required stage names, but doing so can make it clearer how to add the missing inputs and outputs the first few times.

...

Code Block
---
given:
- stage: "sf_orders"
  link: "ln_filter"
  path: "sf_orders-ln_filter.csv"
- stage: "OrderAddressC1.ds_cust"
  link: "ln_cust"
  path: "OrderAddressC1-ds_cust-ln_cust.csv"
when:
  job: "processOrders"
  controller: null
  parameters: {}
then:
- stage: "OrderAddressC1.ds_flaggedCust"
  link: "ln_flagged"
  path: "OrderAddressC1-ds_flaggedCust-ln_flagged.csv"
  cluster: null
  ignore: null
- stage: "sf_samples"
  link: "ln_samples"
  path: "sf_samples-ln_samples.csv"
  cluster: null
  ignore: null
- stage: "sf_addressedOrders"
  link: "ln_addressed"
  path: "sf_addressedOrders-ln_addressed.csv"
  cluster: null
  ignore: null
- stage: "sf_summary"
  link: "ln_summary"
  path: "sf_summary-ln_summary.csv"
  cluster: null
  ignore: null

Lines 6-8

Code Block
- stage: "OrderAddressC1.ds_cust"
  link: "ln_cust"
  path: "OrderAddressC1-ds_cust-ln_cust.csv"

and 14-18

Code Block
- stage: "OrderAddressC1.ds_flaggedCust"
  link: "ln_flagged"
  path: "OrderAddressC1-ds_flaggedCust-ln_flagged.csv"
  cluster: null
  ignore: null
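The manual edit shown above can also be expressed programmatically. The sketch below works on a plain Python dictionary that mirrors the YAML structure, so it needs no YAML library. The helper container_entry is hypothetical, and the file name pattern invocation-stage-link.csv simply copies the example above; it is treated here as a convention rather than a documented requirement.

```python
def container_entry(invocation, stage, link, is_output):
    """Build a given:/then: entry for a stage inside a container invocation.

    Hypothetical helper. The path pattern is copied from the example spec
    and is an assumption, not a rule stated by MettleCI.
    """
    entry = {
        "stage": f"{invocation}.{stage}",
        "link": link,
        "path": f"{invocation}-{stage}-{link}.csv",
    }
    if is_output:
        # Output entries in the generated spec carry these two extra keys.
        entry["cluster"] = None
        entry["ignore"] = None
    return entry


# A cut-down version of the generated spec for processOrders.
spec = {
    "given": [
        {"stage": "sf_orders", "link": "ln_filter", "path": "sf_orders-ln_filter.csv"},
    ],
    "when": {"job": "processOrders", "controller": None, "parameters": {}},
    "then": [
        {"stage": "sf_samples", "link": "ln_samples",
         "path": "sf_samples-ln_samples.csv", "cluster": None, "ignore": None},
    ],
}

# Add the input and output that live inside the OrderAddressC1 container.
spec["given"].append(container_entry("OrderAddressC1", "ds_cust", "ln_cust", is_output=False))
spec["then"].insert(0, container_entry("OrderAddressC1", "ds_flaggedCust", "ln_flagged", is_output=True))

for section in ("given", "then"):
    for item in spec[section]:
        print(section, item["stage"], item["path"])
```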

...

The structure of a MettleCI Unit Test Specification (“Spec”) is modelled loosely on the Gherkin syntax of a testing tool called Cucumber. It associates Unit Test data, which are stored in CSV files identified by the path property, with each of your Job’s input and output links, which are identified by the stage and link properties of your Spec. The Given clause defines the Unit Test data for your Job’s input links and the Then clause for the output links.

Example

Here's a simple example:

Job design

Image Added

Unit test Specification

Code Block
given:
  - stage: sqInput
    link: in
    path: given.csv
...
then:
  - stage: sqOutput
    link: out
    path: expected.csv

Logical view

Image Added

Local and Shared Containers complicate this, because Stage names in DataStage are unique only within a given Job or Local/Shared Container.

Consider writing a Unit Test Spec for the following Job, MyJob, which includes a Shared Container stage, ContainerC1, that references the Shared Container scWriteAccounts:

Job MyJob

Image Added

Unit Test Specification

Code Block
given:
  - stage: sqAccounts
    link: inAccounts
    path: GivenAccounts.csv
when:
...
then:
  - stage: sqAccounts
    link: out
    path: ExpectedAccounts.csv


Shared Container scWriteAccounts

Image Added

The resulting Unit Test Spec is ambiguous because the MettleCI Unit Test Harness cannot tell which Unit Test Data file belongs to which sqAccounts stage. To avoid this sort of issue, the stage properties within Unit Test Specs expect fully qualified stage names. A fully qualified stage name is prefixed with any relevant parent Container names using the format <container name>.<stage name>.

Here’s an example of a fully qualified stage name:

Job MyJob

Image Added

Unit Test Specification

Code Block
given:
  - stage: sqAccounts
    link: inAccounts
    path: GivenAccounts.csv
when:
...
then:
  - stage: ContainerC1.sqAccounts
    link: out
    path: ExpectedAccounts.csv

Shared Container scWriteAccounts

Image Added

Since the output sqAccounts stage is within ContainerC1, its fully qualified stage name is ContainerC1.sqAccounts (line 8) and the Unit Test Spec is no longer ambiguous. When working with Shared Containers, the <container name> within a fully qualified stage name refers to the stage name in the parent Job (ContainerC1) rather than the Shared Container itself (scWriteAccounts).

Shared Containers

...

When automatically generating a Unit Test Spec, MettleCI will correctly generate the references to input and output stages within Local Containers, including where input or output stages are defined within those Local Containers.

Automatically generated Unit Test Specs for Jobs which feature one or more Shared Containers will also be modelled correctly, with the exception that they will omit any input or output stages defined within any constituent Shared Containers. Unfortunately, the design-time Job information available from DataStage does not provide enough detail for MettleCI to identify these stages and complete the generation of Unit Test Specs where a Shared Container includes a stage which needs to be referenced by the Spec.

Once a MettleCI Unit Test Spec is adequately configured (with correct Shared Container references, if required) then the MettleCI Unit Test Harness can correctly handle input and output stages within Local and Shared Containers.

Summary

When Unit Testing stages within Local and/or Shared Containers, developers should be aware of the following requirements and constraints:

  • When working with Local Containers the <container name> is the name of the Stage on the parent canvas.

  • When working with Shared Containers, the <container name> is also the name of the Stage on the parent canvas rather than the name of the Shared Container itself.

  • Stages inside multi-level nested containers can be defined using <container name>.<container name>.<container name>.<stage name> where the left-most container name is the top level container, the next container name is the second level and so on.

  • Input and Output container stages do not exist at runtime and can’t be referenced in a MettleCI Unit Test Spec. They can simply be ignored.

  • The Unit Test Spec and Unit Test Harness make no distinction between Shared and Local Containers. Local and Shared Containers can therefore be used interchangeably, or even mixed, within a single Job.
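The rules in this summary can be combined into a short sketch that walks a nested description of a Job and lists the fully qualified name of every stage. The dictionary layout, the kind values and the stage name scInput are invented for the illustration and do not correspond to any DataStage or MettleCI export format. The walk lists all stages; in practice only those with input or output links under test would appear in a Spec.

```python
def qualified_names(stages, prefix=()):
    """Return fully qualified names for all stages, descending into containers.

    Hypothetical job model: each stage is a dict with "name" and "kind";
    containers (Local or Shared, treated alike) also carry a "stages" list.
    """
    names = []
    for stage in stages:
        kind = stage["kind"]
        if kind in ("container_input", "container_output"):
            # Container input/output stages do not exist at runtime: skip them.
            continue
        if kind == "container":
            # Prefix with the stage name on the parent canvas, outermost first.
            names.extend(qualified_names(stage["stages"], prefix + (stage["name"],)))
        else:
            names.append(".".join(prefix + (stage["name"],)))
    return names


# MyJob from the example: sqAccounts on the canvas and inside ContainerC1.
my_job = [
    {"name": "sqAccounts", "kind": "stage"},
    {"name": "ContainerC1", "kind": "container", "stages": [
        {"name": "scInput", "kind": "container_input"},
        {"name": "sqAccounts", "kind": "stage"},
    ]},
]

print(qualified_names(my_job))
```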