The parameter which sets a Job's Unit Test 'execution mode' (to either 'Interception' or 'Testing') is an environment variable which can be added to individual Jobs or to Job Sequences. When applied to a Job Sequence (and propagated to the Jobs within that Sequence) this approach enables you to run an entire batch of Jobs in Interception mode, capturing all Jobs' inputs and outputs as an integrated set of MettleCI Unit Tests, or in Testing mode, where you can replay that set of Unit Tests for each Job in turn.
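For illustration, the pattern looks like this when the variable is defined at the Sequence level. Note that the variable name and values below are placeholders, not MettleCI's actual identifiers; consult the MettleCI documentation for the real ones:

Code Block
# Hypothetical environment variable, shown as name/value pairs for
# illustration only; see the MettleCI documentation for the actual
# variable name and its permitted values.
# Defined on the Job Sequence, the value propagates to every Job it runs:
UNIT_TEST_MODE=INTERCEPT   # capture each Job's inputs and outputs as test data
# Replay the captured data as Unit Tests by instead setting:
# UNIT_TEST_MODE=TEST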

MettleCI’s Unit Test capability deliberately avoids treating a Job Sequence as a testing unit, meaning that it does not support injecting test data into the inputs of the first Job(s) in a Sequence and testing the outputs of the last Job(s) while treating any intermediate Jobs as a “black box”. While it has been common practice for some DataStage developers to ‘unit test’ multiple Jobs as a single test event using one or more Job Sequences (prior to handing them over for some form of batch-level testing), this, in effect, forms an integration test. By strictly defining the scope of a unit test as a single DataStage Job, MettleCI reduces the risk and additional root-cause analysis effort required to address test failures that arise during the execution of multiple Jobs.
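For contrast, a conventional single-Job Unit Test Specification stubs both boundaries of one Job, injecting test data at its inputs and comparing its outputs against expected results. A minimal sketch, using the same illustrative stage, link, and file names as the examples below:

Code Block
given:
  # Inject test data at the Job's inputs
  - stage: Sequential_File_0
    link: Link_1
    path: Sequential_File_0.csv
when:
...
  # Test conditions
then:
  # Compare the Job's outputs against expected results
  - stage: ODBC_1
    link: Link_2
    path: ODBC_1.csv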

A Pragmatic Approach

Notwithstanding the caution expressed above, it is possible to hand-craft a set of unit test specifications which support the testing of multiple sequenced Jobs:

You can create MettleCI Unit Test Specifications which effectively inject test data into the Job(s) at the beginning of the Job Sequence and test the data produced at the output of the Job Sequence. For example, given a simple sequence of three interdependent DataStage Jobs…

[Diagram: Job Sequence UT]

… you could test this sequence using Test Specifications which look like this:

Parallel Job 1

Code Block
given:
  # Inject test data
  - stage: Sequential_File_0
    link: Link_1
    path: Sequential_File_0.csv
when:
...
  # Test conditions
then:
  # No outputs tested
  # Allow normal output operation to support downstream jobs

Parallel Job 2

No MettleCI Test Specification is required for this Job as it runs normally during a unit test, with no runtime intervention from MettleCI.

Parallel Job 3

Code Block
given:
  # No input data injected
  # Allow normal input operation to read from upstream job
when:
...
  # Test conditions
then:
  # Compare output
  - stage: ODBC_1
    link: Link_2
    path: ODBC_1.csv

…and which would effectively look like this at runtime:

[Diagram: Job Sequence Unit Test Process]
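In other words, only the external boundaries of the Sequence are stubbed: the test data injected at Sequential_File_0 flows through all three Jobs via their normally-operating intermediate links before being compared against the expected results at ODBC_1.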