Null Handling
In Server Jobs a null is represented by a special placeholder value which can be assigned to columns, stage variables and even intermediate results. The @Null
system variable contains this special placeholder value and can be used used in expression just like any other value. In contrast, nulls in Parallel Jobs are represented by a separate ‘flag’, implemented by toggling a single bit of a column or a stage variable to indicate whether it is null (rather than just empty).
The style of null representation used by Server Jobs is sometimes referred to as an ‘in-band’ null, and the type used by Parallel is known as an ‘out-of-band’ null, each described here.
Intermediate values can’t be null as they do not have a flag which can be set.
As a result, Parallel jobs use IsNull()
and SetNull()
functions to read/write the null flag associated with columns or stage variables and are more restrictive in their use than Server’s @Null
system variable.
For example, take the following server column derivation:
If input.column = "test" Then @Null Else input.column
This can be converted to Parallel by replacing @Null
with SetNull()
because it is the last statement of the derivation to be evaluated:
If input.column = "test" Then SetNull() Else input.column
However, none of the following derivations will compile as replacing @Null
with SetNull()
will not set the null flag of the target column or stage variable:
"Hello" : If input.column = "test" Then @Null Else input.column # Works in server
"Hello" : If input.column = "test" Then SetNull() Else input.column # Doesn't compile in PX
5 + @Null # Works in server
5 + SetNull() # Doesn't compile in PX
Trim(@Null) # Works in server
Trim(SetNull()) # Doesn't compile in PX
Fortunately, Data Migrators' analysis indicates that none of the above uses of @Null
are rarely, if ever, observed in the real-world Server Jobs. S2PX therefore …
Converts
@Null
toSetNull()
when it will result in a valid Parallel expression, orLeaves
@Null
in place and reports the problem to the user if the conversion would result in an invalid PX expression.
Server Jobs can also use a special System Variable called @Null.Str
. This is a String representation of @Null
which is only useful for working with multi-value strings. In Server Jobs it is possible to construct a multi-valued using 'System Marks’:
For most Server Transform functions this is just a normal string with a special character separating Hello
and World
. However, functions which work with multi-valued strings will interpret this string as an array with two values Hello
and World
. DataStage Basic can access these arrays directly using the array index operator using references of the form array<1>
and array<2>
which would return Hello
and World
respectively. This array index operator does not exist in Server Transformers and the Extract(array, index)
function needs to be used instead.
To construct a multi-valued array with 3 values - the second value being @Null
- intuition would lead us to use the following expression:
This expression, however, will actually result in a @Null
result as concatenating a String with another @Null
will result in @Null
(see https://en.wikipedia.org/wiki/Three-valued_logic). This is where @Null.Str
is used:
Now calling…
…to get the second field in the array will return @Null
.
It's worth noting that the Extract()
function can also be emulated with the Field()
function:
The Field
function, however, works with strings rather than multi-valued arrays, so getting the second field from the array will actually result in DataStage returning @Null.Str
rather than interpreting it as @Null
.
© 2015-2024 Data Migrators Pty Ltd.