...

This is resolved by dot-prefixing the container name to the stage name, giving a name of the form containerInvocation.stage. In the case of nested containers, the outer container name is prepended in front of the inner one, to as many levels as necessary to model the nesting accurately.
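The naming rule can be sketched as a small helper. This is only an illustration, not part of the testing tool: the function name `qualified_stage_name` and the nested invocation names `OuterC1` and `InnerC1` are hypothetical.

```python
def qualified_stage_name(invocations, stage):
    """Build a stage name for a test spec.

    `invocations` lists the container invocation names from the
    outermost to the innermost; it is empty for a stage that sits
    directly in the job.
    """
    return ".".join([*invocations, stage])


# A stage directly in the job keeps its plain name.
print(qualified_stage_name([], "sf_orders"))
# A stage inside one container invocation gets one prefix.
print(qualified_stage_name(["OrderAddressC1"], "ds_cust"))
# Hypothetical nesting: the outer invocation comes first.
print(qualified_stage_name(["OuterC1", "InnerC1"], "ds_cust"))
```

Passing the invocation names outermost first keeps the helper in line with the rule that the outer container is prepended in front of the inner one.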

...

Note: Container inputs and outputs themselves are not modeled or intercepted since they are really just connectors with no physically manifested input or output.

For example, consider this container, which has container inputs/outputs (which do not need to be modeled in the yaml) as well as actual physical inputs/outputs, which we do need to intercept and test with.

...

Here is a job using the above shared container with an invocation ID of “OrderAddressC1”.

...

When we generate a unit test spec from this job, the resulting yaml looks like this:

Code Block
---
given:
- stage: "sf_orders"
  link: "ln_filter"
  path: "sf_orders-ln_filter.csv"
when:
  job: "processOrders"
  controller: null
  parameters: {}
then:
- stage: "sf_samples"
  link: "ln_samples"
  path: "sf_samples-ln_samples.csv"
  cluster: null
  ignore: null
- stage: "sf_addressedOrders"
  link: "ln_addressed"
  path: "sf_addressedOrders-ln_addressed.csv"
  cluster: null
  ignore: null
- stage: "sf_summary"
  link: "ln_summary"
  path: "sf_summary-ln_summary.csv"
  cluster: null
  ignore: null

As can be seen, the physical inputs within the container are not present in the given: section and the physical outputs are not present in the then: section. We must add them. The rules given above tell us to construct the stage name for these inputs/outputs by prepending the container name. In our case this is OrderAddressC1 (the invocation name, not the name of the container itself, as we need to be able to handle multiple uses of the same container within a job).

To understand this better, consider this “monolithic” job in which the container was “undone” and all stages are present.

...

If we generate a unit test spec for the above monolithic job, it looks like this. As you can see, all the stages are present, as expected. Note particularly the stages ds_cust in the given: section and ds_flaggedCust in the then: section. These are the stages inside the container that we need to manually add to the containerized job's test spec.

Code Block
---
given:
- stage: "sf_orders"
  link: "ln_filter"
  path: "sf_orders-ln_filter.csv"
- stage: "ds_cust"
  link: "ln_cust"
  path: "ds_cust-ln_cust.csv"
when:
  job: "monolithic_v1"
  controller: null
  parameters: {}
then:
- stage: "ds_flaggedCust"
  link: "ln_flagged"
  path: "ds_flaggedCust-ln_flagged.csv"
  cluster: null
  ignore: null
- stage: "sf_samples"
  link: "ln_samples"
  path: "sf_samples-ln_samples.csv"
  cluster: null
  ignore: null
- stage: "sf_addressedOrders"
  link: "ln_addressed"
  path: "sf_addressedOrders-ln_addressed.csv"
  cluster: null
  ignore: null
- stage: "sf_summary"
  link: "ln_summary"
  path: "sf_summary-ln_summary.csv"
  cluster: null
  ignore: null

Note: We do not always need to “undo” containers to derive the needed stage names, but doing so can make it clearer how to add the missing inputs/outputs your first few times.

The yaml we need can be created by modifying the originally generated yaml to add the appropriate stage/link/path entries.
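The same modification can be sketched on the spec held as plain Python data. This is a hedged illustration rather than a feature of the tool: the helper name `add_container_entries` is hypothetical, the generated spec is abbreviated to a single then: entry, and the sketch simply appends the new entries, since the source does not say whether the order of entries matters.

```python
import copy


def add_container_entries(spec, given=(), then=()):
    """Return a copy of `spec` with extra entries appended to given/then."""
    updated = copy.deepcopy(spec)  # leave the generated spec untouched
    updated["given"] = list(updated.get("given") or []) + list(given)
    updated["then"] = list(updated.get("then") or []) + list(then)
    return updated


# Abbreviated form of the generated spec (only one then: entry kept).
generated = {
    "given": [
        {"stage": "sf_orders", "link": "ln_filter",
         "path": "sf_orders-ln_filter.csv"},
    ],
    "when": {"job": "processOrders", "controller": None, "parameters": {}},
    "then": [
        {"stage": "sf_samples", "link": "ln_samples",
         "path": "sf_samples-ln_samples.csv",
         "cluster": None, "ignore": None},
    ],
}

modified = add_container_entries(
    generated,
    given=[{"stage": "OrderAddressC1.ds_cust", "link": "ln_cust",
            "path": "OrderAddressC1-ds_cust-ln_cust.csv"}],
    then=[{"stage": "OrderAddressC1.ds_flaggedCust", "link": "ln_flagged",
           "path": "OrderAddressC1-ds_flaggedCust-ln_flagged.csv",
           "cluster": None, "ignore": None}],
)

print([entry["stage"] for entry in modified["given"]])
print([entry["stage"] for entry in modified["then"]])
```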

Here is the yaml for the processOrders job after we modify it.

Code Block
---
given:
- stage: "sf_orders"
  link: "ln_filter"
  path: "sf_orders-ln_filter.csv"
- stage: "OrderAddressC1.ds_cust"
  link: "ln_cust"
  path: "OrderAddressC1-ds_cust-ln_cust.csv"
when:
  job: "processOrders"
  controller: null
  parameters: {}
then:
- stage: "OrderAddressC1.ds_flaggedCust"
  link: "ln_flagged"
  path: "OrderAddressC1-ds_flaggedCust-ln_flagged.csv"
  cluster: null
  ignore: null
- stage: "sf_samples"
  link: "ln_samples"
  path: "sf_samples-ln_samples.csv"
  cluster: null
  ignore: null
- stage: "sf_addressedOrders"
  link: "ln_addressed"
  path: "sf_addressedOrders-ln_addressed.csv"
  cluster: null
  ignore: null
- stage: "sf_summary"
  link: "ln_summary"
  path: "sf_summary-ln_summary.csv"
  cluster: null
  ignore: null

Lines 6-8

Code Block
- stage: "OrderAddressC1.ds_cust"
  link: "ln_cust"
  path: "OrderAddressC1-ds_cust-ln_cust.csv"

and 14-18

Code Block
- stage: "OrderAddressC1.ds_flaggedCust"
  link: "ln_flagged"
  path: "OrderAddressC1-ds_flaggedCust-ln_flagged.csv"
  cluster: null
  ignore: null

were added to the generated yaml. In each case the stage name has been prefixed with OrderAddressC1. because that is the name of the container invocation. This ensures global uniqueness across all container invocations in the job. The link names do not need this disambiguation as they are already scoped to the source/target stages. The path (name of the csv file) can be anything you like. We chose a name that shows the connection (container-stage-link.csv) but we did not have to.
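As a sketch of how such an entry is put together, the helper below builds one entry for a stage inside a container invocation. The function name `container_entry` and the second invocation `OrderAddressC2` are hypothetical, and the container-stage-link.csv file name is only the convention chosen in this example, not a requirement.

```python
def container_entry(invocation, stage, link, is_output=False):
    """Build a given:/then: entry for a stage inside a container invocation."""
    entry = {
        # The stage name carries the invocation prefix.
        "stage": f"{invocation}.{stage}",
        # The link name is left as-is; it is scoped to its stages.
        "link": link,
        # File name convention used in this example; any name would do.
        "path": f"{invocation}-{stage}-{link}.csv",
    }
    if is_output:
        # then: entries in the generated yaml also carry these two keys.
        entry["cluster"] = None
        entry["ignore"] = None
    return entry


first = container_entry("OrderAddressC1", "ds_cust", "ln_cust")
# Hypothetical second use of the same container in the same job.
second = container_entry("OrderAddressC2", "ds_cust", "ln_cust")
flagged = container_entry("OrderAddressC1", "ds_flaggedCust", "ln_flagged",
                          is_output=True)

print(first["stage"], second["stage"])
print(flagged["path"])
```

Because the prefix is the invocation name, two uses of the same container yield distinct stage names even though the stage inside the container is identical.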

Here is a successful test run

...

Note: this example is worked using shared containers but the process is exactly the same for local containers. The invocation name of the container is prepended to any stage names that need disambiguation, and everything else is done exactly the same way.