...

The advantages of short-lived feature branches need to be weighed against the merge risks and branching constraints imposed by DataStage:

  1. Creating branches for use with DataStage introduces significant overhead compared to the expected lifetime of a short-lived feature branch

  2. DataStage merge conflicts can occur if the same DataStage asset is modified in two concurrently developed feature branches

In our experience, the advantages of using short-lived feature branches with DataStage are outweighed by the branch creation overhead and the impact of conflicts when branches are merged. For this reason, we strongly recommend that DataStage users integrate changes by committing straight to Trunk.

...

In this mode of working, developers collaborate in a “development” DataStage project. When a developer is ready to integrate their changes into Trunk, they should perform the following steps:

  1. Compile their jobs

  2. Validate that the jobs pass compliance testing

  3. Ensure all related unit tests pass

  4. Commit changes
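
The four steps above act as a gate: the commit should only happen once every earlier step has succeeded. A minimal Python sketch of that ordering follows. The step functions are hypothetical placeholders, since the real compile, compliance, and unit test actions depend on your tooling and are not shown here:

```python
from typing import Callable, List, Tuple

# Each step is a (name, check) pair; the check returns True on success.
Step = Tuple[str, Callable[[], bool]]


def run_integration_gate(steps: List[Step]) -> Tuple[bool, List[str]]:
    """Run steps in order and stop at the first failure.

    Returns (ok, completed), where completed lists the steps that passed.
    """
    completed: List[str] = []
    for name, check in steps:
        if not check():
            return False, completed
        completed.append(name)
    return True, completed


# Hypothetical placeholders: replace each lambda with a call into your tooling.
example_steps: List[Step] = [
    ("compile jobs", lambda: True),
    ("compliance testing", lambda: True),
    ("unit tests", lambda: False),  # simulate a failing unit test
    ("commit changes", lambda: True),
]

ok, completed = run_integration_gate(example_steps)
print(ok, completed)
```

Because the loop stops at the first failing step, the simulated unit test failure means the commit step is never reached.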

If a change committed to the Trunk causes the CI/CD Pipeline to fail, developers can either resolve the issue with a new commit or, if developing a fix will take an extended period of time, use Git to revert the commit until a fix is available.
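
Reverting with Git can be sketched as below. This is an illustration only: the commit id `abc1234` is made up, and the helper names `revert_command` and `revert_commit` are hypothetical. `git revert` adds a new commit that undoes the named one, so the Trunk history is left intact:

```python
import subprocess
from typing import List


def revert_command(commit_sha: str) -> List[str]:
    """Build the git command that creates a new commit undoing commit_sha."""
    # --no-edit accepts Git's default revert message without opening an editor.
    return ["git", "revert", "--no-edit", commit_sha]


def revert_commit(commit_sha: str, repo_dir: str = ".") -> None:
    """Run the revert in repo_dir; raises CalledProcessError if Git fails."""
    subprocess.run(revert_command(commit_sha), cwd=repo_dir, check=True)


# "abc1234" is a made-up commit id used only for illustration.
print(" ".join(revert_command("abc1234")))
```

The revert commit still needs to be pushed to the Trunk so that the CI/CD Pipeline runs against it.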

...

Frequent integration of changes into Trunk is great for rapid development throughput, but there will come a time when the software needs to be released. Trunk Based Development has two main strategies for releasing software:

  1. Release from Trunk

  2. Branch for Release

Release from Trunk is the simplest strategy to adopt and it is very easy to transition from Release from Trunk to Branch for Release at a later date. For this reason, we recommend teams start with Release from Trunk and only consider Branch for Release when there is an immediate need to do so.

...

In the event that a defect is discovered in production, developers can choose one of two possible actions:

  1. Roll back to a previous good release until a fix is available

  2. Deploy a new version containing a fix

The Release from Trunk strategy assumes that there is no need to maintain multiple software releases. This assumption holds true for typical ETL development, as the only release that needs to be maintained is the release currently deployed to production. However, successful implementation of this strategy requires a high release cadence. A low release cadence results in many changes accumulating “in development” between releases, which limits the feasibility of fixing production issues by rolling back to a previous release or deploying a new release:

...

[Diagram: Release from trunk]

Planned releases proceed in the same way as Release from Trunk. However, when a critical issue is discovered in production, a branch is retroactively created for the release and fixes are committed to the branch. Like the Trunk, each release branch should trigger a CI/CD pipeline to determine if the updated release is fit for purpose. When a bug fix has passed CI/CD and is ready to be released, the commit is tagged with the relevant version and the CI/CD pipeline deploys it from the branch into production.
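
The retroactive branch and the version tag described above map onto two ordinary Git commands. A minimal sketch, assuming the released commit already carries a tag. The tag and branch names (`v1.4.0`, `release/1.4`, `v1.4.1`) are invented for illustration, and the functions only build the command lines rather than running them:

```python
from typing import List


def create_release_branch(release_tag: str, branch: str) -> List[str]:
    """Create and switch to a new branch starting at the released tag."""
    return ["git", "checkout", "-b", branch, release_tag]


def tag_fix(version: str) -> List[str]:
    """Create an annotated tag on the current commit for the bug-fix release."""
    return ["git", "tag", "-a", version, "-m", f"Release {version}"]


# Hypothetical names: v1.4.0 is the release in production, v1.4.1 the fix.
branch_cmd = create_release_branch("v1.4.0", "release/1.4")
tag_cmd = tag_fix("v1.4.1")
print(" ".join(branch_cmd))
print(" ".join(tag_cmd))
```

The tag should only be created after the fix has passed the CI/CD pipeline on the release branch, matching the order given in the text.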

...

A variation of Branch for Release is to create a branch just before a planned release and deploy from the branch for both planned and bug-fix releases. This approach can be useful when there is a high volume of change on the Trunk and the Trunk is not always “stable”:

[Diagram: Deploy from Branch]

Branching before release allows development on the Trunk to continue unaffected while also allowing “stabilizing” changes to be made to the release prior to production deployment. A CI/CD Pipeline should include automated tests which give confidence that a given version of software is releasable. Committing more than a couple of “stabilizing” changes prior to deploying to production is an anti-pattern and an indication that the CI/CD Pipeline does not have adequate test coverage, or that development practices are breaking the CI/CD Pipeline for extended periods of time. The goal is for teams to ensure that the Trunk is releasable at all times.

...

Given that typical DataStage development teams will only be maintaining a single release branch at any point in time, we recommend setting up “development” and “maintenance” DataStage Projects. The development DataStage Project will be the working copy that commits to the Trunk while the maintenance DataStage Project will be the working copy that commits to the currently active release branch:

[Diagram: Release Working Copy]

Each time a new Git release branch becomes active, the maintenance DataStage Project will need to be re-initialized from the newly created branch. Including the re-initialization of the maintenance DataStage Project as part of the CI/CD Pipeline is a simple way of significantly reducing the overhead of this activity:

...

An alternative approach, based on test-driven development, can be used to avoid the merge while preventing regression. This approach is only feasible because bug fixes committed to release branches are usually small in number and in scope; it is not applicable when merging branches more generally:

  1. Create/update unit tests in the maintenance branch to demonstrate the bug which is being fixed

  2. Implement bug fix and commit to release branch when unit tests pass

  3. Transfer unit test changes from maintenance branch to Trunk

  4. Implement bug fix and commit to Trunk when unit tests pass

This process will result in duplicated effort in Steps 2 and 4. However, using unit tests to first demonstrate and then fix the bug will eliminate the risk normally associated with this sort of activity. Unlike DataStage exports, unit test specs and data are text based, so Step 3 can be completed by cherry picking the relevant unit test changes. Cherry picking is an advanced Git technique; developers who are not familiar with Git should consider transferring unit test changes manually using MettleCI Workbench.
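
If the unit test changes were committed to the release branch separately from the DataStage fix, Step 3 reduces to a single cherry-pick. That separation is an assumption: a commit mixing both would bring the DataStage export changes across too. A hedged sketch that only builds the commands, with `main` assumed as the Trunk branch name and a made-up commit id:

```python
from typing import List


def cherry_pick_commands(test_commit_sha: str, trunk_branch: str = "main") -> List[List[str]]:
    """Build the commands that copy one unit test commit onto the Trunk."""
    return [
        # Switch to the Trunk branch ("main" is an assumed name).
        ["git", "checkout", trunk_branch],
        # -x appends a "(cherry picked from commit ...)" line to the message,
        # which records where the unit test change originally came from.
        ["git", "cherry-pick", "-x", test_commit_sha],
    ]


# "def5678" is a hypothetical id for the commit holding the unit test changes.
commands = cherry_pick_commands("def5678")
for cmd in commands:
    print(" ".join(cmd))
```

After the cherry-pick, the transferred unit tests are expected to fail on the Trunk until the bug fix from Step 4 is implemented there.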

...