Differences between Server and Parallel VarChar
This page uses the terms VarChar and String interchangeably
There are two major differences between Server and Px strings, these are not changes that can be easily accounted for in the conversion and as such will be required for manual remediation. The main reason for this is that the behaviour can occur anywhere the data allows for it, it’s a problem fundamental to differences how Server handles data vs how Px handles data if you like.
Quote characters inside strings
Consider cases where a DataStage Server Jobs is reading data from a Sequential file:
# | Item (VarChar 36) | Description (VarChar 256) |
---|---|---|
1 | “Apple” | “A sweet juicy apple” |
2 | “Golden Apple“ | “An apple with special powers, gives regeneration effect” |
3 | “Enchanted Golden Apple” | “Also referred to as ““Notch Apple”” gives strength and all elemental resistances” |
The potential for a problem occurs in column 2 row 3, where quote characters occur inside a string. Server jobs handle this situation by escaping quote characters with another quote (i.e., it uses two adjacent quotes).
The Parallel engine does not have a built-in solution that is accessible to the Server-to-Px plugin, so it will generate invalid CSV data for those data:
# | Item (VarChar 36) | Description (VarChar 256) |
---|---|---|
1 | “Apple” | “A sweet juicy apple” |
2 | “Golden Apple“ | “An apple with special powers, gives regeneration effect” |
3 | “Enchanted Golden Apple” | “Also referred to as “Notch Apple” gives strength and all elemental resistances” |
Solutions
The simplest solution to this problem is to manually change the “Quote” field to something that is not used by the text.
This of course changes either what is output to the sequential file or changes the expected input. Either way it has further implications for your project that you will need to account for.
Empty Strings and Null strings
Server strings provide null handling behaviour which cannot be fully replicated by Parallel Sequential Files. Consider the following example:
# | Item (VarChar 36) | Description (Nullable VarChar 256) |
---|---|---|
1 | “Apple” | “” |
2 | “Golden Apple“ |
|
3 | “Enchanted Golden Apple” | “Also referred to as Notch Apple gives strength and all elemental resistances” |
Note that the Server engine will interpret the Description
values in rows 1 and 2 as being different (i.e., row 1 is an empty string and row two is null
).
Parallel, on the other hand, does not differentiate null
and empty strings. Parallel jobs do support null handling, but not in a form that support handling Sequential files. The closest we could possibly get to this is to define null_field_value=""
however this doesn’t really work since all we have done is forced empty strings to be read as null
. In this case the data are interpreted thus:
Item (VarChar 36) | Description (Nullable VarChar 256) |
---|---|
“Apple” | “” |
“Golden Apple“ | ““ |
“Enchanted Golden Apple” | “Also referred to as Notch Apple gives strength and all elemental resistances” |
© 2015-2024 Data Migrators Pty Ltd.