Parameter Management for Automated Deployment

Introduction

Continuous Integration and Delivery pipelines built with MettleCI will typically deploy a DataStage Project to multiple target environments, including at least one for testing and, ultimately, production. Even though a DataStage Project deployed to multiple target environments consists of the same assets (Jobs, Stages, Job Sequences, etc.), the deployed DataStage Project requires a degree of environment-specific ‘configuration’ to operate correctly within each target environment. For example, DataStage Jobs usually read from and write to one or more databases, and will use different connection parameter values when executed in Testing and Production environments. Configuration changes of this nature are usually performed by changing the following types of DataStage Project elements:

Type

Description

DSParams

Project-level environment variables

*.apt configuration files

Parallel Engine configuration files that describe node distribution and associated resources

Parameter Set Value files

Collections of parameter values used with Parameter Sets, discussed in detail later in this guide.

MettleCI provides a simple-yet-powerful parameter management system to ensure automated deployments of your ETL solutions are correctly configured for the target environment. A DataStage Project’s parameter management configurations are checked into its corresponding Git repository, alongside the other assets within that DataStage Project. By managing your configuration in tandem with your code, your parameter management practices gain the same benefits that MettleCI provides for your code.

Parameter Substitution

MettleCI’s parameter management feature relies on its ability to substitute ${variable key} placeholders found in text files with variable values loaded from a Variable Properties file. A Variable Properties file is a text-based .properties file which defines a set of variable key-value pairs. To illustrate how MettleCI performs variable substitution, consider the content of the following example files:

Example Variable Properties file (example.var)

greeting.text=Hello
person.name=Joe Blow

Example Text Document (example.txt)

${greeting.text} ${person.name}, how are you today?

If MettleCI were configured to substitute the content of example.txt using variables loaded from example.var, a new version of example.txt would be output with these contents:

Hello Joe Blow, how are you today?
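The substitution described above can be illustrated with a short Python sketch. This is not MettleCI's implementation: the function names parse_properties and substitute are hypothetical, the parser handles only simple key=value lines, and raising an error for an unknown key is an assumption rather than documented MettleCI behaviour.

```python
import re

# Matches ${variable key} placeholders and captures the key.
PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")

def parse_properties(text):
    """Parse simple key=value lines; blank lines and # or ! comments are skipped."""
    variables = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, _, value = line.partition("=")
        variables[key.strip()] = value.strip()
    return variables

def substitute(text, variables):
    """Replace each ${key} with its value (assumption: unknown keys raise KeyError)."""
    return PLACEHOLDER.sub(lambda match: variables[match.group(1)], text)

# Contents of example.var and example.txt from the text above.
variables = parse_properties("greeting.text=Hello\nperson.name=Joe Blow\n")
result = substitute("${greeting.text} ${person.name}, how are you today?", variables)
print(result)  # prints: Hello Joe Blow, how are you today?
```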

All Variable Properties files reside in the root of your DataStage solution's Git repository and follow the var.${environment id} naming convention, where ${environment id} is a short identifier assigned to each target environment to which MettleCI will deploy (e.g. ‘PROD’). When MettleCI performs a deployment to a target environment, the contents of the Variable Properties file matching the target environment identifier are used to replace ${variable key} placeholders found in DSParams, APT Config, and Parameter Set Value files. It is the substituted versions of these files that MettleCI deploys to the target environment.

APT Configuration File Example

Consider the following Git repository structure:

 

datastage

DSParams

1node.apt

4node.apt

filesystem

var.ci

var.qa

var.prod

 

When MettleCI performs a deployment to the Quality Assurance environment (environment id = qa), the Variable Properties file var.qa is loaded from the root of the repository. All ${variable key} placeholders in datastage/DSParams, datastage/1node.apt and datastage/4node.apt are replaced, and the resulting versions are deployed to the respective target environment.
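As a rough illustration of this file selection, the Python sketch below builds a temporary copy of the repository layout and resolves the Variable Properties file and substitution targets for one environment. The helper names variables_file and substitution_targets are hypothetical, and the lookup logic is an assumption based only on the naming convention described here, not on MettleCI's internals.

```python
import tempfile
from pathlib import Path

def variables_file(repo_root, environment_id):
    """Return the path to var.<environment id> in the repository root."""
    path = Path(repo_root) / f"var.{environment_id}"
    if not path.is_file():
        raise FileNotFoundError(f"No Variable Properties file found at {path}")
    return path

def substitution_targets(repo_root):
    """Return DSParams (if present) followed by every *.apt file in datastage/."""
    datastage = Path(repo_root) / "datastage"
    targets = [datastage / "DSParams"] if (datastage / "DSParams").is_file() else []
    targets.extend(sorted(datastage.glob("*.apt")))
    return targets

# Build a throwaway repository mirroring the layout above, then query it.
with tempfile.TemporaryDirectory() as repo:
    root = Path(repo)
    (root / "datastage").mkdir()
    for name in ("DSParams", "1node.apt", "4node.apt"):
        (root / "datastage" / name).write_text("")
    for env in ("ci", "qa", "prod"):
        (root / f"var.{env}").write_text("")
    selected = variables_file(root, "qa").name
    targets = [p.name for p in substitution_targets(root)]

print(selected)  # var.qa
print(targets)   # ['DSParams', '1node.apt', '4node.apt']
```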

Taking the content of datastage/1node.apt in Git as an example:

The content of this file will change depending on which environment MettleCI is deploying to:

Environment

Variable File

Deployed 1node.apt

Continuous Integration (ci)

var.ci

1node.apt

Quality Assurance (qa)

var.qa

1node.apt

Production (prod)

var.prod

1node.apt
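To make the table above more concrete, the following Python sketch renders one APT template against three sets of variables. Everything specific in it is hypothetical: the template body, the variable keys engine.host and data.dir, and the per-environment values are invented for illustration and are not the actual contents of 1node.apt or of the var.ci, var.qa and var.prod files.

```python
import re

# Hypothetical template; the real 1node.apt content is not reproduced here.
APT_TEMPLATE = """{
  node "node1"
  {
    fastname "${engine.host}"
    pools ""
    resource disk "${data.dir}/datasets" {pools ""}
    resource scratchdisk "${data.dir}/scratch" {pools ""}
  }
}
"""

# Hypothetical values standing in for var.ci, var.qa and var.prod.
ENVIRONMENTS = {
    "ci": {"engine.host": "ci-engine", "data.dir": "/data/ci"},
    "qa": {"engine.host": "qa-engine", "data.dir": "/data/qa"},
    "prod": {"engine.host": "prod-engine", "data.dir": "/data/prod"},
}

def render(template, variables):
    """Replace every ${key} placeholder with the matching variable value."""
    return re.sub(r"\$\{([^}]+)\}", lambda m: variables[m.group(1)], template)

# One rendered file per environment, all derived from the same template in Git.
deployed = {env: render(APT_TEMPLATE, values) for env, values in ENVIRONMENTS.items()}

for env, text in deployed.items():
    print(f"--- 1node.apt as deployed to {env} ---")
    print(text)
```

The point of the sketch is that a single file is kept in Git, while each environment receives its own substituted copy.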

 

Parameter Sets and Value Files

A DataStage Parameter Set comprises two distinct components:

  1. A collection of parameters and default values, which we will refer to as the “Parameter Set Schema” for the remainder of this document.

  2. One or more Parameter Set Value Files, each of which provides a named set of parameter values.

The schema portion of the Parameter Set is stored in your DataStage Project repository, whereas each Value File is stored as a separate, editable file on the DataStage Engine, in ${Project Directory}/ParameterSets/${Parameter Set Name}/.

For the dstage1 project shown above, for example, the Sales value file for the Database Parameter Set would reside in the /opt/IBM/InformationServer/Server/Projects/dstage1/ParameterSets/Database/ directory:

The Sales value file contains the following lines:

A Parameter Set exported from DataStage contains the Parameter Set Schema and all Value files combined as a single ISX file. The initial on-boarding of a Project into Git using MettleCI will place these combined per-Parameter Set ISX files in folders that match their original Categories.

The Parameter Set Value files used to override those in the original Parameter Set are plain text files and must be placed in a Git folder that follows the datastage/Parameter Sets/${Parameter Set Name}/ naming convention, regardless of where the actual Parameter Set lives within the DataStage repository. A convenient way to start the override configuration process is to take copies of the Parameter Set Value files found on the Engine filesystem, manually check them into Git alongside all other ISX files, and then edit them to insert the relevant ${variable key} placeholders.

Parameter Set ISX files can be checked into Version Control using MettleCI Workbench. However, Parameter Set Value files found on the DataStage Engine filesystem can only be checked in using a standard Git client.
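The copy step can be sketched in Python as follows. The helper copy_value_files is hypothetical, the demonstration uses temporary directories with placeholder file contents instead of a real Engine, and committing the copied files is still left to your Git client.

```python
import shutil
import tempfile
from pathlib import Path

def copy_value_files(project_dir, parameter_set, repo_root):
    """Copy one Parameter Set's Value Files from the Engine into the Git working tree."""
    source = Path(project_dir) / "ParameterSets" / parameter_set
    target = Path(repo_root) / "datastage" / "Parameter Sets" / parameter_set
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for value_file in sorted(source.iterdir()):
        if value_file.is_file():
            shutil.copy2(value_file, target / value_file.name)
            copied.append(value_file.name)
    return copied

# Demonstration against temporary directories standing in for the Engine and the repository.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "Projects" / "dstage1"
    engine_dir = project / "ParameterSets" / "Database"
    engine_dir.mkdir(parents=True)
    for name in ("Sales", "Support"):
        (engine_dir / name).write_text("placeholder content\n")
    repo = Path(tmp) / "repo"
    copied = copy_value_files(project, "Database", repo)
    git_folder = repo / "datastage" / "Parameter Sets" / "Database"
    in_git = sorted(p.name for p in git_folder.iterdir())

print(copied)  # ['Sales', 'Support']
```

After copying, the files would be edited to insert ${variable key} placeholders and then added and committed with a standard Git client.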

In our example, the original Database Parameter Set has coincidentally been saved to the default Parameter Set category in the DataStage repository, but it could just as easily have been saved to a different location. The listing below reflects the resulting Git folder and file structure for the Database Parameter Set.

 

datastage

Parameter Sets

Database

Sales

Support

Database.isx

DSParams

1node.apt

4node.apt

filesystem

var.ci

var.qa

var.prod

 

When MettleCI performs a DataStage Project deployment, the Parameter Set ISX files are updated to include the Parameter Set Value Files in datastage/Parameter Sets/. If a Parameter Set Value File of the same name is found in both the ISX and the datastage/Parameter Sets/${Parameter Set Name}/ directory, the Parameter Set Value File in the ISX is overwritten with the Parameter Set Value File in datastage/Parameter Sets/${Parameter Set Name}/.

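For example, consider our Database.isx Parameter Set export, which contains Sales and Support Value files. If the datastage/Parameter Sets/Database/ folder in version control contained Sales and Accounting Parameter Set Value files, the Database Parameter Set deployed to your DataStage Project would contain three Value files: Sales (taken from version control, overwriting the copy in the ISX), Support (taken from the ISX) and Accounting (taken from version control).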
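The overwrite rule can be expressed as a simple dictionary merge. The Python sketch below assumes Value Files are matched purely by name; the function merge_value_files and the placeholder contents are hypothetical and do not reflect how MettleCI manipulates ISX files internally.

```python
def merge_value_files(isx_files, git_files):
    """Combine Value Files by name; a Git copy replaces an ISX copy of the same name."""
    merged = {name: ("ISX", content) for name, content in isx_files.items()}
    merged.update({name: ("Git", content) for name, content in git_files.items()})
    return merged

# Value Files embedded in Database.isx versus those in datastage/Parameter Sets/Database/.
isx_files = {"Sales": "<from Database.isx>", "Support": "<from Database.isx>"}
git_files = {"Sales": "<from Git>", "Accounting": "<from Git>"}

deployed = merge_value_files(isx_files, git_files)
for name in sorted(deployed):
    source, _ = deployed[name]
    print(f"{name}: {source}")
# Accounting: Git
# Sales: Git
# Support: ISX
```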

Since the Parameter Set Value Files in Version Control are human-readable text files, the same variable substitution that is applied to DSParams and APT config files can also be applied to Parameter Set Value files.

 

© 2015-2024 Data Migrators Pty Ltd.