Continuous Integration Practices–Precommit Process

Often what gets all the attention in Continuous Integration is what happens on the continuous integration server.  But there is more to CI than what happens on the server: what happens in a developer's workspace is part of CI as well.  An easy example of this is the precommit process, which describes what a developer does before committing changes to source control.

On a simple and small project this would entail the following (see the sketch after this list):

  1. Update to the latest source
  2. Recompile (a clean compile)
  3. Run all the tests (may require a local deployment)
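To make that concrete, here is a minimal sketch of such a precommit script.  The svn and mvn commands are assumptions for illustration; substitute whatever version control and build tools your project actually uses.

```python
#!/usr/bin/env python3
"""Minimal precommit sketch: update, clean build, run all tests.

The svn and mvn commands are placeholders, not prescriptions;
substitute whatever your project actually uses.
"""
import subprocess
import sys

STEPS = [
    ["svn", "update"],            # 1. update to the latest source
    ["mvn", "clean", "compile"],  # 2. recompile from scratch
    ["mvn", "verify"],            # 3. run all the tests
]

def main() -> int:
    for cmd in STEPS:
        print("-->", " ".join(cmd))
        if subprocess.run(cmd).returncode != 0:
            print("Precommit failed; do not commit.")
            return 1
    print("Precommit passed; safe to commit.")
    return 0

if __name__ == "__main__":
    sys.exit(main())
```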

The larger and more complex the project gets, the less tenable this procedure becomes.  When you end up with thousands of unit tests, thousands of integration tests, and hundreds of user acceptance tests, functional tests, etc., it becomes unreasonable to run all of them before every commit; it would simply take too long.  So how long is acceptable?  You need to decide that for yourself.  The project I am working on as I write this has decided that the whole precommit process should take about 30 minutes.  That doesn't mean that we throw away tests until the process fits in the allotted time.  We make educated guesses about which tests to run: which tests are most relevant to the changes being tested.  For example, if I were making changes to how the system processes orders, there would be no value in running tests on product comparison.
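A toy sketch of that kind of educated guess follows.  The path prefixes and suite names are invented for illustration; a real mapping would come from your project's actual structure, and unrecognized changes should err on the side of running more.

```python
"""Toy sketch: pick relevant test suites from changed file paths.

All prefixes and suite names here are invented for illustration.
"""
SUITES_BY_PREFIX = {
    "src/orders/": ["tests/orders", "tests/checkout"],
    "src/catalog/": ["tests/catalog", "tests/product_comparison"],
    "src/common/": ["tests/orders", "tests/catalog"],  # shared code touches more
}

def suites_to_run(changed_files):
    suites = set()
    for path in changed_files:
        for prefix, mapped in SUITES_BY_PREFIX.items():
            if path.startswith(prefix):
                suites.update(mapped)
    if not suites:  # unrecognized change: be safe and run everything mapped
        for mapped in SUITES_BY_PREFIX.values():
            suites.update(mapped)
    return sorted(suites)

# Changing order processing selects order tests, not product comparison.
print(suites_to_run(["src/orders/processor.py"]))
# -> ['tests/checkout', 'tests/orders']
```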

In the book “Software Configuration Management Patterns: Effective Teamwork, Practical Integration” the pattern Private Build System is basically the same thing:

Context

A Private Workspace allows you, as a developer, to insulate yourself from external changes to your environment. But your changes need to work with the rest of the system too. To verify this, you need to build the system consistently, including building with your changes. This pattern explains how you can check whether your code will still be consistent with the latest published code base when you submit your changes.

Problem

How do you verify that your changes do not break the build or the system before you check them in?

Solution

Before making a submission to source control, build the system using a Private System Build similar to the nightly build.

This patlet raises the why of it: why do we need a precommit process?  And it answers just as it says: to make sure that we do not break the codeline (i.e., trunk or a branch).

Why don’t we want to break the codeline (build)? So that we don’t negatively affect the rest of the team.

How would breaking the codeline (build) negatively affect the rest of the team?  In two ways: first, it prevents anyone else from committing, since we follow the rule of not committing on top of a broken build; second, anyone who updates from source control will have a broken local build too.

Why is blocking the build a bad thing?  It is not so bad if you fix the build quickly.  It really only becomes a bad thing when the build is broken for an extended period of time.  That prevents others from committing, so they keep working, increasing the size of their changesets.  Large changesets are more likely to break the build, and when they do break it they usually take a long time to fix.  This can easily lead to a situation where the build is often broken for extended periods of time.

Why is having the build broken for extended periods of time, or all the time, a bad thing?  The build then loses its usefulness.  It is clearly no longer keeping the codeline stable, which is its purpose.

Those five whys get us to the root of it.  We need a precommit process to support a centralized build process.  They also paint a picture that many feel is a slippery slope, causing great fear of breaking the build.  The danger is in how long you let any given broken build remain broken.  As long as you fix the build quickly, or roll back the offending changes to the last good build quickly, there is no danger.

Fear of Broken Builds

Fear of breaking the build has caused some people to adopt measures to keep the build green at all costs.  Several commercial CI servers play into this fear with features that provide precommit isolated private builds: builds that occur outside of the developer's workspace, private in the sense that the results are shown only to the developer who submitted the changes, and the changes don't make their way into source control unless the build passes.  Another common measure is to insist that all tests are run as part of the precommit process.  Both of these have the side effect of increasing the average size of a changeset committed to the build.  Granted, these large changesets will pass the build, but they will not necessarily be easy to integrate into each developer's workspace.  The larger the changeset, the greater the chance it will impact changes in a developer's workspace, especially if the local changes are large as well.  Gradual changes over time are easier to integrate into a developer's workspace than big-bang changes.

A significant compounding factor with big-bang changes is that they tend to be committed at the end of an iteration, which means you would likely have multiple developers competing to commit large changesets at the same time.  When the competition gets rough, developers often rationalize breaking the build in favor of being able to commit their changes, hoping to defer testing to the next iteration: "We finished the story, except for the testing; we'll test it next iteration."


Working in Small Changes

If the team works by decomposing stories into small tasks, tasks that can be completed in less than a day, preferably a couple of hours, things can work very smoothly.  Imagine that every hour or two the developers are committing changes as they complete tasks.  Their precommit process would involve integrating a small number of changes from a source control update and executing a small number of tests to exercise their changes.  I am sure you can see how this would result in a short precommit process, with little chance that the updates from source control would impact local changes, as sketched below.
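As a sketch of what that hour-to-hour precommit might look like (svn, pytest, and the suite name are assumptions, as before):

```python
"""Sketch: the short precommit that small tasks make possible.

svn, pytest, and the suite names are assumptions for illustration.
"""
import subprocess
import sys
import time

def short_precommit(relevant_suites):
    start = time.time()
    # A small update has little chance of conflicting with local changes.
    if subprocess.run(["svn", "update"]).returncode != 0:
        sys.exit("update failed")
    # Run only the handful of tests that exercise this task's changes.
    if subprocess.run(["pytest", *relevant_suites]).returncode != 0:
        sys.exit("tests failed; do not commit")
    print(f"Precommit took {time.time() - start:.0f}s; safe to commit.")

if __name__ == "__main__":
    short_precommit(["tests/orders"])  # hypothetical suite for an orders task
```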

March 3, 2012

Continuous Integration Principles–Task Size Rules

Principles are the kind of thing you don't have to believe in for them to exist.  For that matter, they will still push you around without your consent.  I have never seen a set of Continuous Integration (CI) principles, so I thought I would share the ones that I have found throughout my career.  The first one I elect to share is on task size…


Below is a loop diagram that shows some of the forces and effects task size can exert on CI.  If you have never seen a loop diagram before, read it this way: each link relates two ideas, which move either in the same direction (one increases, so does the other) or in the opposite direction (one increases, the other decreases).

As task size increases, changeset size increases as well.  People tend to commit changes when they have completed a task.

As changeset size increases, so does the average time to fix a broken build.  When a build breaks and the changeset was small, it is generally easy to know what went wrong and fix it.  If the changeset was large, there are more things to suspect, investigate, and debug, and that takes longer.

As the average time to fix a broken build increases, build availability decreases.  This assumes that you don't allow committing changes on top of a broken build: the more often you have hard-to-fix breaks, the less available the build will be.

As the build becomes less available, changeset size increases.  Developers continue working while the build is unavailable, growing their changesets.

As changeset size increases, build stability decreases, because the build breaks more often with large changesets than with small ones.

As build stability decreases, the development environment's tractability decreases.  It is more difficult to work on an unstable codeline than on a stable one.

As the environment's tractability decreases, the rate of change to the codeline decreases.  Again, it is more difficult to work on an unstable codebase.

As the rate of change decreases, there will be fewer broken builds.  Conversely, a high rate of change means more broken builds.

As the number of broken builds increases, build availability decreases.
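To see how these arrows reinforce each other, here is a toy discrete-time simulation of the loop.  Every coefficient is invented; only the direction of each effect mirrors the diagram (bigger changesets break more often and take longer to fix, and a red build makes pending changesets grow).

```python
"""Toy simulation of the task-size feedback loop.

All coefficients are invented; only the direction of each
effect mirrors the loop diagram described above.
"""
import random

def availability(task_size_hours, days=50, seed=1):
    """Fraction of working hours the build was green."""
    random.seed(seed)
    local_changes = 0.0  # hours of uncommitted work (changeset size)
    broken_left = 0.0    # hours until the current break is fixed
    broken_total = 0.0
    hours = days * 8
    for _ in range(hours):
        local_changes += 1.0
        if broken_left > 0:
            # Red build: nobody commits, so changesets keep growing.
            broken_left -= 1.0
            broken_total += 1.0
            continue
        if local_changes >= task_size_hours:  # task done: commit
            changeset, local_changes = local_changes, 0.0
            p_break = min(0.9, 0.02 * changeset)  # bigger -> likelier break
            if random.random() < p_break:
                broken_left = 0.5 * changeset     # bigger -> longer fix
    return 1.0 - broken_total / hours

for size in (2, 8, 24):
    print(f"task size {size:>2}h -> build available {availability(size):.0%}")
```

With these made-up numbers, larger tasks drive availability down, and every hour the build spends red inflates the next changeset, which is exactly the reinforcing loop the diagram describes.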


I don’t mean to present this loop diagram as a standalone or complete system.  There are other forces that can come into play to address the number of build breaks, how long it takes to fix a broken build, or even isolating build breaks.  Those are important, but they are not the core system.  This is the core system, and the root input affecting the system's output is the size of tasks.  If people are working with small tasks they will break the build far less often, and when they do it will be easy to fix.  If people are working with large tasks they will break the build often, and when they do it will be difficult to fix.  From what I have seen it works best when one or more small tasks can be completed in a day.  You will need to figure out what works on your project.

You can also add to this diagram mitigating forces such as a precommit developer build to help control the number of broken builds.

P.S. If you are having a hard time getting task sizes down maybe you should draw a loop diagram of the forces at play keeping task sizes large on your project.

October 23, 2011

AgileDC 2011

I will be speaking at AgileDC this week on agile testing on government contracts.  I have not been to this event before, but it looks like it will be fun with all the great speakers showing up.  I did not see where my slides have been posted on the agiledc.org site, so I will link them here.  Hope to see you there.

