The MettleCI automated unit test specification generator will handle input/output stages within local containers but not shared containers. Due to the design-time information available for a job, test specifications generated for jobs using shared containers will model the job's own inputs and outputs correctly but will omit any inputs or outputs that are within shared containers. The unit test harness can handle inputs and outputs contained within containers, but we must modify the yaml by hand to include them. They go in the usual place (given: for inputs and then: for outputs) but with names that disambiguate the links and stages referenced.
...
This is resolved by prefixing the stage name with the container invocation name and a dot, giving a name of the form containerInvocation.stage.
In the case of nested containers, the outer container name is prepended in front of the inner one, to as many levels as necessary to model the nesting accurately.
Note: Container inputs and outputs themselves are not modeled or intercepted since they are really just connectors with no physically manifested input or output.
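For illustration only (the container and stage names below are hypothetical, not part of the example that follows), an input stage one container level deep and another nested two levels deep would be referenced like this:
Code Block |
---|
given:
  # ds_lookup sits directly inside the container invocation AddrC1 (hypothetical names)
  - stage: "AddrC1.ds_lookup"
    link: "ln_lookup"
    path: "AddrC1-ds_lookup-ln_lookup.csv"
  # ds_ref sits inside InnerC2, which is itself nested inside OuterC1 (hypothetical names)
  - stage: "OuterC1.InnerC2.ds_ref"
    link: "ln_ref"
    path: "OuterC1-InnerC2-ds_ref-ln_ref.csv" |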
For example, consider this container, which has container input/output stages (which do not need to be modeled in the yaml) as well as actual physical inputs and outputs, which we do need to intercept and test against.
...
Here is a job using the above shared container with an invocation ID of “OrderAddressC1”:
...
When we generate a unit test spec from this job, the resulting yaml looks like this:
...
A Unit Test Spec associates Unit Test Data defined by the path property with links within the DataStage Job under test. Links within a Job are uniquely identified by the stage and link properties:
Code Block |
---|
...
then:
  - stage: sf_addressedOrders
    link: ln_addressed
    path: ... |
If we generate a unit test spec for the above monolithic job, it looks like this. As you can see, all the stages are present, as expected. Note particularly the stages ds_cust in the given: section and ds_flaggedCust in the then: section; these are the stages inside the container that we need to manually add to the containerized job's test spec.
Code Block |
---|
---
given:
- stage: "sf_orders"
link: "ln_filter"
path: "sf_orders-ln_filter.csv"
- stage: "ds_cust"
link: "ln_cust"
path: "ds_cust-ln_cust.csv"
when:
job: "monolithic_v1"
controller: null
parameters: {}
then:
- stage: "ds_flaggedCust"
link: "ln_flagged"
path: "ds_flaggedCust-ln_flagged.csv"
cluster: null
ignore: null
- stage: "sf_samples"
link: "ln_samples"
path: "sf_samples-ln_samples.csv"
cluster: null
ignore: null
- stage: "sf_addressedOrders"
link: "ln_addressed"
path: "sf_addressedOrders-ln_addressed.csv"
cluster: null
ignore: null
- stage: "sf_summary"
link: "ln_summary"
path: "sf_summary-ln_summary.csv"
cluster: null
ignore: null |
Note: We do not always need to “undo” containers to derive the needed stage names, but doing so can make it clearer how to add the missing inputs/outputs your first few times.
...
Code Block |
---|
---
given:
- stage: "sf_orders"
link: "ln_filter"
path: "sf_orders-ln_filter.csv"
- stage: "OrderAddressC1.ds_cust"
link: "ln_cust"
path: "OrderAddressC1-ds_cust-ln_cust.csv"
when:
job: "processOrders"
controller: null
parameters: {}
then:
- stage: "OrderAddressC1.ds_flaggedCust"
link: "ln_flagged"
path: "OrderAddressC1-ds_flaggedCust-ln_flagged.csv"
cluster: null
ignore: null
- stage: "sf_samples"
link: "ln_samples"
path: "sf_samples-ln_samples.csv"
cluster: null
ignore: null
- stage: "sf_addressedOrders"
link: "ln_addressed"
path: "sf_addressedOrders-ln_addressed.csv"
cluster: null
ignore: null
- stage: "sf_summary"
link: "ln_summary"
path: "sf_summary-ln_summary.csv"
cluster: null
ignore: null |
Lines 6-8 (contrast these with lines 6-8 in the “monolithic” yaml; the only significant difference is that OrderAddressC1 has been prepended to ds_cust. The path (CSV file name) can be anything, but was chosen here to make the correlation obvious):
Code Block |
---|
- stage: "OrderAddressC1.ds_cust"
link: "ln_cust"
path: "OrderAddressC1-ds_cust-ln_cust.csv" |
and lines 14-18 (contrast these with lines 14-18 in the “monolithic” yaml; the only significant difference is that OrderAddressC1 has been prepended to ds_flaggedCust. Again, the path can be anything, but was chosen here to make the correlation obvious):
Code Block |
---|
- stage: "OrderAddressC1.ds_flaggedCust"
link: "ln_flagged"
path: "OrderAddressC1-ds_flaggedCust-ln_flagged.csv"
cluster: null
ignore: null |
...
Local and Shared Containers complicate this association, as Stage names in DataStage are only unique within a given Job or Local/Shared Container. Consider writing a Unit Test Spec for the following Job and Shared Container:
Code Block |
---|
given:
- stage: sqAccounts
link: inAccounts
path: GivenAccounts.csv
when:
...
then:
- stage: sqAccounts
link: out
path: ExpectedAccounts.csv |
The resulting Unit Test Spec is ambiguous because the Unit Test Harness will not be able to uniquely identify which Unit Test Data file is associated with each sqAccounts stage. To avoid this sort of issue, the stage properties within Unit Test Specs expect fully qualified stage names. A fully qualified stage name is the stage name prefixed with all parent Local/Shared Container names, using the format <container name>.<stage name>.
Code Block |
---|
given:
- stage: sqAccounts
link: inAccounts
path: GivenAccounts.csv
when:
...
then:
- stage: ContainerC1.sqAccounts
link: out
path: ExpectedAccounts.csv |
Since the output sqAccounts stage is within ContainerC1, its fully qualified stage name is ContainerC1.sqAccounts and the Unit Test Spec is no longer ambiguous (line 8). When working with Shared Containers, the <container name> within a fully qualified stage name refers to the stage name in the parent Job (ContainerC1) rather than the Shared Container itself (scWriteAccounts).
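As a sketch reusing the stage, link, and file names from the example above, the qualification must use the container stage's name on the parent Job canvas; qualifying with the Shared Container's own name would not identify any stage in the Job:
Code Block |
---|
then:
  # correct: ContainerC1 is the container stage's name on the parent Job canvas
  - stage: ContainerC1.sqAccounts
    link: out
    path: ExpectedAccounts.csv
  # incorrect: scWriteAccounts is the Shared Container's own name,
  # so it does not identify the stage within this Job
  # - stage: scWriteAccounts.sqAccounts |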
When Unit Testing stages within Local and/or Shared Containers, developers should be aware of the following requirements and constraints:
- Stages in multiple levels of containers can be referenced using <container name>.<container name>.<container name>.<stage name>, where the left-most container name is the top-level container, the next container name is the second level, and so on.
- When working with Shared Containers, the <container name> is the name of the Stage on the parent canvas rather than the Shared Container name itself.
- Input and Output container stages do not exist at runtime and can't be stubbed during MettleCI Unit Testing.
- The Unit Test Spec and Unit Test Harness make no distinction between Shared and Local Containers, so Local and Shared Containers can be used interchangeably or even mixed within a single Job (see the sketch after this list).
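As a minimal sketch (all container, stage, and job names here are hypothetical), a single Unit Test Spec can reference a stage inside a Local Container and a stage inside a Shared Container using exactly the same qualification format:
Code Block |
---|
given:
  # stage inside a Local Container whose stage on the Job canvas is named lcCleanseC1 (hypothetical)
  - stage: lcCleanseC1.sqRawInput
    link: ln_raw
    path: lcCleanseC1-sqRawInput-ln_raw.csv
when:
  job: processCustomers
then:
  # stage inside a Shared Container whose stage on the Job canvas is named scOutputC1 (hypothetical)
  - stage: scOutputC1.sqResults
    link: ln_results
    path: scOutputC1-sqResults-ln_results.csv |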