
Why use ISX files and not DSX?

IBM provide a number of formats in which you can export your DataStage assets, depending upon the version of DataStage you're using.

There are a number of good reasons to select ISX as the management format for Information Server artefacts:

  • ISX files are compressed, and Git delta-compresses the objects it stores, so the disk footprint is minimised

  • Git will attempt to merge non-binary files, which would likely corrupt a DataStage asset; because ISX files are binary, Git will not attempt to merge them (see the .gitattributes sketch after this list)

  • ISX is the only format which supports all Information Server asset types (not just those in DataStage) and is therefore more useful and future-proof.
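
As a minimal sketch of the merge point above, ISX archives can be declared binary in the repository's .gitattributes file so that Git never attempts a textual diff or merge on them (the pattern below assumes your exports use the .isx extension):

    # .gitattributes - treat ISX archives as opaque binary files so Git
    # never attempts a textual diff or three-way merge on them
    *.isx binary

With this in place, a conflicting change surfaces as a whole-file conflict to be resolved back in DataStage, rather than as a silently corrupted automatic merge.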

There are multiple ISX format variations:

  1. Vanilla: usable as a flexible, single-job-version format by all tools

  2. Information Server Manager-specific ISX format.

  3. ISTool releases (multiple job versions)


We use the 'vanilla' ISX format (1), which contains a single version of the job and can be exported and imported without tooling restrictions.
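
As an illustration, a vanilla ISX archive can be produced from the command line with IBM's istool utility. The host names, credentials and job path below are placeholders, and option names vary between Information Server versions, so treat this as a sketch and confirm against the istool documentation for your release:

    # Export a single parallel job to a vanilla ISX archive
    # (domain, credentials and asset path are placeholders)
    istool export \
      -domain services-host:9445 \
      -username dsadm -password secret \
      -archive /tmp/MyJob.isx \
      -datastage ' "engine-host/MyProject/Jobs/MyJob.pjb" '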

Although DSX and the XML wrapped inside an ISX are ASCII-readable text, they are not easily 'human readable', and there is not, in our view, any value in inspecting a job's source code outside of the DataStage Designer. DataStage jobs cannot usefully be loaded into an external diff tool to identify the changes between them: job exports list stages in a non-deterministic order, so two successive exports of an unaltered job may mistakenly be flagged as significantly changed by a traditional diff tool. The only way to read and understand a job, or to compare two jobs, is to let DataStage load them into memory and present a logical, human-readable representation. This remains true regardless of whether the export is encoded as binary or ASCII.


Both the ISX and DSX formats include far more information than just the update, export and compile dates that change between exports. They also record 'non-functional' details such as viewport position, zoom level, snap-to-grid settings and link label positions. Normalising the export might cut down some of this noise, but it does not provide a robust way of determining whether a given job has changed between check-ins.
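
To make the normalisation idea concrete, here is a minimal Python sketch that strips volatile lines from a DSX text export before hashing it. The field names matched below are hypothetical examples of the noise described above, not an authoritative list, which is precisely why this approach is fragile:

    import hashlib
    import re

    # Hypothetical volatile DSX fields: timestamps and canvas geometry.
    # A real export contains many more, and no fixed list is guaranteed
    # to be complete.
    VOLATILE = re.compile(
        r'^\s*(DateModified|TimeModified|GridLines|SnapToGrid|Zoom\w*)\b')

    def normalised_digest(dsx_path):
        """Hash a DSX export with volatile lines filtered out."""
        with open(dsx_path, encoding='latin-1') as f:
            kept = [line for line in f if not VOLATILE.match(line)]
        return hashlib.sha256(''.join(kept).encode('latin-1')).hexdigest()

Even with such filtering, two exports of an unaltered job can still produce different digests, because stages may be written in a different order each time.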

Earlier iterations of MettleCI did not provide the quick and easy check-in process we have already demonstrated. Checking in jobs manually was tedious and time-consuming, so check-ins were usually performed in bulk. Because so much time elapsed between a change being made and a developer performing the check-in, there was a high number of "false" check-ins. We used normalisation to reduce some of the noise, but it wasn't robust enough to identify all unchanged jobs.

Since introducing the MettleCI check-in process, it is very rare for developers to check in a job in which they haven't made any changes. On the odd occasion that it does occur there is little impact: the job will be retested in the CI build, and a few more kilobytes will be consumed in Git storage.

If, for some reason, developers are regularly unsure of what has changed, then the team should question why there is such a long gap between modifying a job and checking it in.

