Chapter 7. Advanced CI tools and recipes

 

This chapter covers

  • Tools and recipes for continuous integration
  • Approaches for integrating different artifact types
  • Strategies and tooling for staging artifacts

 

This chapter provides techniques to help you implement advanced continuous integration (CI). As you learned in earlier chapters, CI is the backbone of an Agile ALM. But why do we need to discuss advanced CI? CI is a widespread discipline. Its basic concepts are widely understood and supported by all build servers (such as Jenkins, Bamboo, TeamCity, Continuum, and CruiseControl), but advanced topics, such as those I cover in this chapter, are rarely discussed elsewhere. This chapter explains how to implement Agile ALM in the context of CI, as illustrated in figure 7.1.

Figure 7.1. Advanced CI scenarios for Agile ALM that are covered in this chapter: building and integrating platforms or languages (.NET and integrating Cobol by using Java and Ant), enabling traceable deployment of artifacts, building artifacts for multiple target environments (staging these artifacts by configuration, without rebuilding them), bridging different VCSs, and performing audits.

 

Note

If you’re interested in CI basics that we don’t cover in this book, consider reading Continuous Integration: Improving Software Quality and Reducing Risk by Paul M. Duvall (Addison-Wesley, 2007) or Martin Fowler’s free online resources (http://martinfowler.com/articles/continuousIntegration.html).

 

This chapter starts with examples of integrating different artifact types into a comprehensive CI ecosystem. The first approach is to use a platform or language (such as Java), and the build systems it offers, to drive and manage the builds of other languages or platforms (such as Cobol). This approach is helpful in situations where languages and platforms don’t have their own native build systems (or offer only limited support for integrating with an enterprise integration system). The examples we’ll look at will deal with those languages and platforms (specifically, integrating Cobol by using Java and Ant).

Where possible, we’ll use the tools available for the platform and then integrate the native build scripts on the common build servers. We’ll also cover .NET and look at how to build and integrate source code using lightweight (open and flexible) tools, such as Subversion and TeamCity, without having to use proprietary products like Microsoft’s Team Foundation Server.

After discussing builds (spanning different artifact types), we’ll talk about creating builds for multiple target environments. This is a good example of staging, discussed in chapters 2 and 3. We’ll cover strategies and solutions for promoting artifacts to other target environments, merely by configuring them (without rebuilding).

Then, we’ll discuss bridging different VCSs. As you already know, all artifact types (Java, Cobol, and so on) should be stored in a VCS, but sometimes you might need to work with different VCSs in a complex project setup. You may need to view this effort as a soft migration from one VCS tool to another, where you don’t try to replace the entire VCS in one step. These are all common scenarios, and I’ll explain how to deal with them in an Agile ALM context. We’ll look at an example of how to bridge a widespread enterprise VCS (Subversion) to another one (Git, a distributed VCS) to implement feature branching.

We’ll look at builds and audits with Jenkins. Checkstyle, FindBugs, PMD, Cobertura, and Sonar perform code audits. Then, you’ll learn strategies for building specific facets of your builds depending on where the build is running, and for injecting version numbers into the built applications. Another important aspect of this discussion will be deploying and staging artifacts or builds with Jenkins and Artifactory. In this section, I’ll show you how to deploy and stage a Maven-based project to the component repository in a consistent and traceable way. By default, deploying a multimodule Maven project (a project that consists of multiple modules) results in each module being deployed in isolation from the others. I’ll show how to deploy the complete application in a single downstream step, but only after each module build (including successful compiles and tests) has already succeeded.

All these views and issues are important facets of running a comprehensive, uniform Agile ALM infrastructure. This chapter explains strategies, and it shows by example how you can orchestrate tool chains in a cohesive and integrated way. Strategies and specific tool examples will show you how to set up an open Agile ALM infrastructure.

Let’s start by looking at how to integrate legacy Cobol applications.

7.1. Integrating other artifact types: Cobol

Integrating Java artifacts is a common task. All you need to get started is a build server and a build script. Compiling, packaging, and deploying Java artifacts are routine jobs. But Agile ALM comprises more than operating on only Java and derived artifacts compiled to Java bytecode (like Groovy, Scala, JRuby, and more). In this book, I can’t possibly illustrate how to integrate all the different artifact types and programming languages. In this section, though, we’ll look at how to prototype the processing of Cobol host sources. Host refers to IBM System/360-compatible product lines with operating systems like OS/390, MVS, and z/OS.

In this section, we’ll also look at an example of how to set up a CI environment to support Cobol development and how to control the processing using Java. We’ll integrate building Cobol artifacts into our CI ecosystem. This real-world solution also demonstrates a strategy for continuously integrating and building non-Java artifact types: using one platform or language (like Java) to drive and manage the builds of other languages or platforms.

7.1.1. Preconditions and basic thoughts

Traditionally, Cobol development is done in a mainframe-based environment: Cobol sources are written and compiled on the host. Compiled Cobol sources are called “load modules,” or “modules.” After compiling sources, the generated modules are loaded to host libraries. This mainframe-based approach is different from how other applications are developed, such as Java applications, where you don’t develop sources on a host or transfer sources to a central host in order to compile them.

Different approaches often lead to silos, but that need not be the case. There are ways to bring those two worlds—developing Java and Cobol applications—together, and to foster a comprehensive Agile ALM approach. Bridging the two platform ecosystems is possible by using the same IDE for developing software in Java and Cobol and using CI in both cases. Feature-rich, commercial tools (such as products of the IBM Rational product family) nowadays enable developing and even compiling Cobol sources on the developer’s desktop workspaces.

Many projects find it helpful to transfer Cobol sources from the developer’s workspace to the host manually, such as via FTP, in order to compile the sources on the host. As this section will show, you can also develop Cobol sources on the desktop and use the host compiler to compile the sources, in an automatic, lightweight, and convenient way.

A basic precondition for processing Cobol sources is that the sources must be imported and managed in a VCS. We’ve already discussed the importance of managing sources in a VCS, and Cobol source code should be stored in a VCS as is the source code of any other programming language. Some people may suggest that this isn’t necessary because Cobol is a mainframe host language, and there have been mainframe facilities (such as libraries) in which both source and compiled binaries have been managed for some time. But the advantages of putting Cobol sources into a VCS include benefiting from all the features of a modern VCS and being able to add Cobol processing to a CI system. The CI system can trigger Cobol processing on the host continuously. Compiled Cobol sources (binaries, load modules) can be stored in a VCS or a component repository so they can be reused later by other CI build jobs that stage those modules to other test environments on the host.

During software development, a repetitive activity is editing sources in an IDE, such as Eclipse (with its Cobol support), and synchronizing them with a VCS, using Eclipse’s excellent support for all common types of VCSs. But after editing sources, how do we compile them, and how do we put them onto the host? With desktop Cobol compilers, the compiling can be done on the developer’s machine, but it’s much more common to upload the sources to the host for compiling rather than to upload compiled binaries into host libraries.

Developing Cobol applications in an IDE on the desktop and offloading the build onto the mainframe further improves quality, reduces the risk of late bug detection, accelerates feedback cycles, and prevents the desktop from being blocked by long processing times. Offloading the build to the host fosters productive workspaces (discussed in chapter 5). Besides that, desktop Cobol compilers may differ from the host compiler in functionality and handling. But how can we transfer sources to the host?

You can use FTP for communication between a developer’s desktop and the host. To do this, you need the following:

  • A host machine that runs an FTP server.
  • An authorized user with a valid user account.

Once you have the FTP server and the authorized user up and running, you can set up FTP-based processing. In general, the FTP server can address and communicate with the host’s job entry subsystem (JES). The JES is the part of the operating system used to schedule and run jobs and control their output. On a mainframe, the job control language (JCL) controls the jobs. For instance, a JCL script may contain commands to compile a Cobol source. JCL scripts consist of steps (which are commands for the host) and programming features to implement conditional statements and workflow. If you submit JCL scripts into the JES, the resulting jobs are processed promptly, according to their priorities. Here is a small JCL example:

//XXXXXXXJ JOB (ACCTCODE),'ABCDEF',NOTIFY=D123456,CLASS=I,
//             MSGLEVEL=(1,1),MSGCLASS=C
//*
//STEP1    EXEC PGM=ABCDE99

As you can see, it’s a simple file with only one step. An FTP server provides access to JES functions, including submitting and deleting jobs, displaying the status of jobs, and receiving the output of JCL messages.

There are many possible ways to manage your FTP communication. We’ll discuss the two main ones here, using either Ant or the Java API. We’ll start with the pragmatic approach, which uses Ant.

7.1.2. FTP communication with Ant

The Ant approach is pragmatic, in that it helps you handle the processing of JCL jobs in a straightforward way. This approach has two benefits: first, it’s lightweight; second, it’s mainly scripted in a native build scripting language (Ant) and can be included directly in a CI process. You don’t need to write a full-featured program to set up the communication between the build server or the developers’ machines and the host.

In brief, the approach of processing the Cobol sources is as follows:

  • The CI server checks out Cobol sources from the VCS and generates signal files, as well as JCL files (one signal file and one JCL file for each Cobol source), and a bash script (see figure 7.2).
    Figure 7.2. The processing of Cobol sources is based on Ant scripts that are dynamically generated and triggered by the CI server. Files are transferred from the CI server to the host and vice versa via FTP. CI with Cobol is similar to how we integrate Java applications: Cobol sources are managed by the VCS. The CI server checks out the Cobol sources, triggers the Cobol compilation on the host, and then monitors its success. Compiled Cobol sources can be loaded into libraries on the host and transferred back to the CI server to store them in the VCS or a component repository for further reuse.

  • The bash script is triggered by CI and transfers the Cobol and JCL files to the host.
  • On the host, the JCL files kick off host jobs that process the Cobol files according to the type of processing defined in the JCL files. Typically, processing means that Cobol sources are compiled and the generated load modules are loaded into libraries on the host. The host runs the JCL jobs asynchronously, so we use signal files on the CI server to monitor the process.
  • FTP commands inside the JCL files trigger the transfer of the generated load modules (the compiled Cobol sources) and log files back to the CI server.
  • In each JCL job, an FTP command removes the signal file from the CI server to indicate that the processing of this specific Cobol source is finished. The CI server can put the load modules into the VCS or a component repository in order to reuse these binaries for other test environments without compiling them again (compile once, run everywhere).

This lightweight process is based on a sequence of dynamically created Ant scripts that check out sources from the VCS, generate files, run a bash[1] script to upload them to the host, wait for the host to process all the files, and collect the returned files.

1 Bash, or a Windows equivalent. Although Java is platform-independent, I’ve assumed Linux or Unix to be the test and production environments in this section. This won’t necessarily be the case, though.

Here are the steps of the solution in more detail:

  1. Check out all Cobol sources from the VCS. Use Ant for this.
    Note

    This is a generic solution (which will work regardless of the names of the Cobol sources), so the following steps include the generation of generic scripts.


  2. Iterate over the Cobol sources to identify the filenames and store them in a Java collection. Do this in a Java class that’s called by your Ant script. You can also write your own Ant task that wraps this functionality and can be called directly from your Ant script.
  3. In the same Java class, generate a bash script to transport the files (the Cobol sources and the to-be-generated JCL files that will include logic about what to do with the Cobol source on the host) via FTP directly. This means you’ll use FTP commands in your bash script. The script starts with #!/bin/sh, and it’s essential to set the FTP file type correctly via the SITE command: filetype=jes for the jobs and filetype=seq for the artifacts, as shown in listing 7.1. You can use variables for environment-specific configuration settings. The specific values for these placeholders are held in a Java properties file and can be injected into Ant scripts by filter-chaining at execution time (more on Ant’s filtering feature in section 7.3).
    Listing 7.1. Generating bash script for uploading files
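    The generated upload script might look something like the following sketch; the host name, user, password, and dataset names are placeholders, and the exact put commands depend on how your host libraries are organized.

#!/bin/sh
# Sketch of a generated upload script (illustrative names only).
ftp -n myhost.example.com <<END_SCRIPT
quote USER D123456
quote PASS secret
quote SITE filetype=seq
put MYPROG.cbl 'D123456.COBOL.SOURCE(MYPROG)'
quote SITE filetype=jes
put MYPROG.jcl
quit
END_SCRIPT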

  4. In the Java class, you also generate an Ant script that uses the Ant touch task.[2] The generated script touches every Cobol source you checked out from the VCS, creating one empty file per source. These empty files can be called barriers or signal files. The reason for creating these files is that once you put JCL files on the host, they are processed by the host asynchronously. By introducing these barriers, the transferring system (the CI server) can monitor the asynchronous processing and detect when the processing on the host is complete. You call the generated Ant script dynamically to create these empty files; you can place these barrier files into an inbox folder.

    2 See the Ant documentation of the touch task: http://ant.apache.org/manual/Tasks/touch.html.

    The use case of this section shows that you can upload the Cobol artifacts and the corresponding JCL files in pairs to control the processing of the Cobol artifacts. All files (Cobol sources as well as the JCL files) must be in place for the script to find them. You can configure where to place the files. It’s good practice to copy all Cobol sources to an outbox folder where you’ve also placed the corresponding JCL files. The following listing shows an example snippet that generates the Ant touch script.
    Listing 7.2. Generating touch script
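    A generator for such a touch script might look roughly like this sketch; the class name, file names, and the inbox folder are illustrative assumptions.

import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.List;

// Sketch only: writes a small Ant script containing one touch task per Cobol
// source, creating the signal (barrier) files in an inbox folder.
public class TouchScriptGenerator {

    public static void generate(List<String> cobolSources, String scriptPath) throws IOException {
        try (PrintWriter out = new PrintWriter(new FileWriter(scriptPath))) {
            out.println("<project name=\"signals\" default=\"touch\">");
            out.println("  <target name=\"touch\">");
            for (String source : cobolSources) {
                out.println("    <touch file=\"inbox/" + source + ".sig\"/>");
            }
            out.println("  </target>");
            out.println("</project>");
        }
    }
}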

  5. In the Java class, you dynamically generate one JCL file for each Cobol source. This JCL contains the JCL steps for processing the individual Cobol resources on the host. The JCL can vary, depending on the type of Cobol file (for instance, online or batch) and will include different JCL steps or Cobol compiling options. You can also include FTP commands in your JCL snippets. Firing these into the JES results in host jobs uploading log files or load libraries to the build server. You must add an FTP command to your JCL to remove the signal file that monitors the processing for this JCL.

Executing the bash script transports the files to the host. You don’t need to call the script manually, as it can be part of your build logic. For example, a sequence of Ant scripts can start by generating the files and then running them. In your build logic, you must include logic to pause until the host removes all signals. You can, for instance, write a small Java class that monitors the inbox folder and its entries. When the folder is empty, the host has processed all jobs. Afterward, you can collect possible return files (like load modules) and store them in the VCS.
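Such a wait barrier can be as small as the following sketch; the inbox folder name and the polling interval are assumptions.

import java.io.File;

// Sketch only: blocks until the host has removed all signal files from the inbox folder.
public class SignalBarrier {

    public static void awaitEmpty(File inbox, long pollMillis) throws InterruptedException {
        File[] remaining = inbox.listFiles();
        while (remaining != null && remaining.length > 0) {
            Thread.sleep(pollMillis);
            remaining = inbox.listFiles();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        awaitEmpty(new File("inbox"), 10000);   // poll every 10 seconds
        System.out.println("All host jobs finished; collecting return files ...");
    }
}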

 

Maven and Ant

Ant scripts can be integrated with Maven builds by using Maven’s AntRun plug-in: http://maven.apache.org/plugins/maven-antrun-plugin/.

 

We’ve reviewed automating common tasks, such as FTP, using Ant and scripting. Next we’ll examine how to use Java for FTP communication.

7.1.3. FTP communication with Java

Instead of using Ant to handle communication with the host, you can code it with Java. You can work with sockets yourself, or you can use the Commons Net library (http://commons.apache.org/net/). This library is an easy-to-use abstraction for handling different protocols, including FTP.

 

Note

Wherever possible, you should use common abstractions for communication, which means using higher levels of the Open Systems Interconnection (OSI) model (such as the application layer). FTP and HTTP are examples of such common abstractions.

 

Let’s transfer the JCL file shown earlier onto the host and execute it in the JES. The following listing shows how to achieve this with Java.

Listing 7.3. Uploading artifacts by using FTP
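A minimal sketch of such an upload with Commons Net’s FTPClient might look like this; the host name, credentials, and file names are placeholders, and the error handling is deliberately basic.

import java.io.FileInputStream;
import java.io.InputStream;
import org.apache.commons.net.ftp.FTPClient;
import org.apache.commons.net.ftp.FTPReply;

// Sketch only: submits a JCL file to the host's JES via FTP.
public class JclSubmitter {

    public static void main(String[] args) throws Exception {
        FTPClient ftp = new FTPClient();
        ftp.connect("myhost.example.com");
        if (!FTPReply.isPositiveCompletion(ftp.getReplyCode())) {
            throw new IllegalStateException("FTP connect failed: " + ftp.getReplyString());
        }
        ftp.login("D123456", "secret");

        // Switch the FTP server into job-submission mode; the reply should be
        // "200 SITE Command was accepted".
        ftp.sendSiteCommand("filetype=jes");
        System.out.println(ftp.getReplyString());

        // Read the JCL file and submit it to the host.
        try (InputStream jcl = new FileInputStream("compile.jcl")) {
            ftp.storeFile("COMPILE.JCL", jcl);
        }

        ftp.logout();
        ftp.disconnect();
    }
}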

For more details on how to manage host jobs with Java, see the IBM documentation.[3]

3 For instance, see “Submit batch jobs from Java on z/OS” by Nagesh Subrahmanyam (www.ibm.com/developerworks/systems/library/es-batch-zos.html) and “Access z/OS batch jobs from Java” by Evan Williams (www.ibm.com/developerworks/systems/library/es-zosbatchjavav/index.html).

Listing 7.3 starts by connecting and logging in to the remote FTP server. Commons Net provides an FTP client, so you don’t need to work with sockets on your own. We then configure the file type (the server replies with “200” if this worked), retrieve the result and confirm that “200 SITE Command was accepted”, and finally read the JCL file and submit it to the host.

Please keep in mind that those are the most important high-level points. In addition, we handle exceptions (but only on a basic level, in order to avoid long code listings and an overly complex example).

For uploading JCL and generating jobs, it’s essential to set the file type to jes so the jobs get executed. On the other hand, for uploading Cobol sources, you need to set the file type to seq. This way they’re stored in a library instead of in the job queue.

Afterward, you can access the results of your operation and wait for the host to execute the job. The API is powerful: You can monitor the host job queue and scan for your jobs. You can also configure your FTP client by assigning a custom parser to work on the results in a more convenient way.

You can integrate the compiled Java class into a CI process depending on your requirements. For example, the Java class can be called by an Ant script or be included as an Ant task.

Java is powerful, but Microsoft .NET also has many helpful features, which we’ll cover next.

7.2. Integrating other artifact types: .NET

This section contributed by Hadi Hariri

This section will show you how easy it is to use a build framework such as MSBuild to build .NET software. Additionally, it will demonstrate how to add CI with .NET applications to an Agile ALM CI ecosystem that can also integrate other artifact types, such as Java.

To demonstrate using CI with .NET, we’ll use TeamCity as the build server, but build frameworks like MSBuild are agnostic regarding build servers, so you can add your build scripts to other build servers as well. TeamCity is a powerful build server that can be used to integrate different platforms and languages, including Java and Microsoft-related ones, in parallel.

The strategy presented in this section is to orchestrate best-of-breed tools and integrate them into a configured, personalized toolchain (as discussed in chapter 1). You don’t need to stick to proprietary Microsoft tooling, such as Visual Studio, to build your .NET software; rather, you can use lightweight tools instead. Additionally, it’s not necessary to use Microsoft Team Foundation Server to manage your .NET sources; you can manage the sources in parallel with other project artifacts in the same tool, such as Subversion, in order to foster an Agile ALM approach. Finally, this section shows an example implementation of the service-oriented approach that you learned about in chapter 1. You’ll see how easy it is to temporarily add additional build machines to your CI system by running builds in the cloud.

Like many other tools and practices in .NET, CI originated in the world of Java. But despite its relative newcomer status, .NET has certainly gained maturity in terms of adopting best practices that evolved in the Java world. Much of this is driven by the number of tools in .NET that support the key elements required for successful CI.

 

Ted Neward on integrating .NET and Java systems

I once asked Ted Neward, .NET and Java expert, “What’s your opinion about adding .NET artifacts to continuous integration processes and systems that are based on open source or lightweight tools, in the case that projects don’t want to use TFS, Visual Studio, or other Microsoft products?” This was his answer:

“In general, as much as I spend time handling integration between .NET and Java systems, I’m not a huge fan of mixing the developer tools across those two platforms—trying to get Maven to build .NET artifacts, for example, can be a royal pain to get right. In general, the best success I’ve had with this is to fall back to MSBuild, and kick it off as you would any other command-line tool. The build results can be captured to a text file and examined later, or if a more fine-grained control is needed, a shell around MSBuild can be built (probably with PowerShell, depending on the complexity of the problem) to capture events during the build. This doesn’t mean I’m going to tell .NET developers to stick with plain-vanilla Visual Studio or TFS, mind you. Better tools definitely exist for handling continuous integration builds than what comes out of the box. Pick one of those CI tools, figure out how to invoke a command-line tool from the CI infrastructure, use that to kick off MSBuild, and call it a day.”

 

When talking about build tools in the .NET space, there are two main contenders. On one side there’s NAnt, which is a port of Java’s Ant, and on the other there’s MSBuild, which is the .NET framework’s native build system from Microsoft. The core principles behind MSBuild are the same as those for NAnt.

7.2.1. Using MSBuild to build .NET projects

MSBuild provides a series of targets, each of which defines one or more tasks to be carried out. Targets are a sequence of grouped steps; this is similar to Ant, where targets are a sequence of Ant tasks. The following listing shows a sample build script.

Listing 7.4. A simple build script for .NET with MSBuild
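A minimal MSBuild script with properties, a condition, and two targets might look like the following sketch; the solution and test assembly names are placeholders.

<!-- Sketch only: build the solution, then run the tests. -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003" DefaultTargets="Build">
   <PropertyGroup>
      <!-- Default to Release unless a configuration is passed on the command line. -->
      <Configuration Condition=" '$(Configuration)' == '' ">Release</Configuration>
   </PropertyGroup>
   <Target Name="Build">
      <MSBuild Projects="MySolution.sln" Properties="Configuration=$(Configuration)" />
   </Target>
   <Target Name="Test" DependsOnTargets="Build">
      <Exec Command="nunit-console.exe MyTests\bin\$(Configuration)\MyTests.dll" />
   </Target>
</Project>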

As you can see, there’s support for concepts such as multiple targets (that is, one script being able to run different tasks and operations), property definitions, and conditions. Like NAnt, MSBuild is also extensible. You can create new tasks by implementing an interface and referencing it as an external assembly.[4]

4 A complete reference on all possible commands can be found at Microsoft’s “.NET Development” site (http://msdn.microsoft.com/en-us/library/aa139615.aspx).

TeamCity, from JetBrains, supports CI for both Java and .NET projects and has become one of the most popular CI servers.

7.2.2. Using TeamCity to trigger .NET builds

TeamCity (see http://www.jetbrains.com/teamcity/) is available in two flavors: Professional, which is free, and Enterprise. It has quickly gained popularity over other tools, such as CruiseControl, due to its ease of use and rich feature set. In this section, we’re going to look at some of the features that TeamCity provides, starting with visual configuration of the environment.

Visual Configuration Environment

One of the more painful issues with CruiseControl is the requirement to set up the configuration through XML files and to make these changes on the production CI server. With TeamCity, rather than requiring users to have permission to access folders on the server, all access control is handled via a web interface, allowing different levels of permissions. All project configurations are carried out using this interface, making setup easier and less error-prone.

Figure 7.3 shows how you can configure a build runner to choose between Ant, Maven2, MSBuild, NAnt, and many others.

Figure 7.3. Configuration in TeamCity: selecting a build runner (such as MSBuild or NAnt)

Ease of configuration is a main feature of TeamCity, although its ability to facilitate integration with other tools is also essential.

Integration is Core

TeamCity was built with the goal of being integrated with other tools and frameworks. Each developer or company has its own policies and ways of working. Some prefer to use tools such as MSTest for testing and MSBuild for build automation, whereas others prefer to use open source tools such as NUnit and NAnt. TeamCity tries to accommodate as many frameworks as possible by providing support for a variety of them. This is a key feature when it comes to having a productive CI environment.

One of the core benefits of CI is immediate feedback. When you break the build, you need to know why, to see test results, and so on. All this needs to be easily accessible and viewable without requiring a lot of effort. By supporting different testing and code coverage frameworks, TeamCity allows this seamless integration.

Figure 7.4 shows a sample output screen of build results with relevant information that allows you to investigate further if required.

Figure 7.4. Example TeamCity output screen showing test results

Apart from the more traditional build tools, TeamCity also supports some newer tools that are starting to gain interest among developers, such as Ruby’s Rake for build automation or Cucumber for testing. In addition to integrating with unit testing, code coverage, and automation tools, TeamCity also works with various types of source control management, including Team Foundation Server, Subversion, and some of the more popular distributed VCSs, such as Git and Mercurial, as shown in figure 7.5.

Figure 7.5. Selecting a VCS for the .NET project build (with Subversion)

Another important aspect when it comes to integration is issue tracking systems. Again, TeamCity allows integration with common tools such as JIRA as well as JetBrains’ own issue tracker, YouTrack.

 

CI with .NET, but without Microsoft

Although you can use Microsoft tools (like the Team Foundation Server) to store your artifacts and run builds, this isn’t necessary. Tools like TeamCity or Jenkins are popular for building .NET projects (using MSBuild or NAnt), and you can store the artifacts in a common VCS, like Subversion.

What’s different when comparing CI with .NET to CI with Java is that you must use the specific .NET tools, such as MSBuild or NAnt, for building (and testing) the .NET components in what’s known as a managed environment. The most important point here is that you can host your .NET projects on the same VCS where you host projects for other languages and platforms (such as Java) and integrate them on the same build server where you also integrate other projects (such as Java projects).

There are also other approaches to integrating different platforms and languages and using a common, unified tool infrastructure, such as hosting your sources in TFS and building them with Jenkins (with the help of Jenkins’s TFS plug-in).

 

The flexibility in building and storing sources makes Agile ALM effective. Using remote build agents and cloud computing is also a popular practice.

Build Agents and Cloud Computing

Many CI tools, including Jenkins, support the concept of build agents. The idea is to have one machine that handles the process and delegates the computing to other machines. As such, the main CI server would handle the configuring, reporting, and other non–CPU intensive processes, and one or more machines (called agents) would handle the compiling, building, and testing of the code. Figure 7.6 shows a build matrix that indicates the status of all agents and their utilization.

Figure 7.6. TeamCity showing agents

TeamCity has supported the concept of build agents from the beginning, but what’s new in the recent releases is its integration with cloud computing. Amazon’s EC2 cloud computing infrastructure is a pay-per-use offering where you pay for machines based on the number of hours they’re on—if your machine is running for, say, five hours, you would be charged for five hours of use. TeamCity uses EC2 via virtual build agents, which are similar to standard ones except that they run on virtual instances on Amazon EC2. This means that TeamCity can dynamically start as many agent instances as needed in order to keep the build queue under control during high loads. Additionally, TeamCity can shut down virtual build agents when they aren’t needed anymore, which minimizes the EC2 uptime you pay for.

Figure 7.7 shows how to create an EC2 cloud profile in TeamCity.

Figure 7.7. Server configuration, defining a cloud profile for EC2

CI is the same, whether it’s in Java, Ruby, or .NET. What’s important when it comes to implementing CI is having the correct tools to make the whole process efficient and fast. Spending time to integrate multiple products for every project is cumbersome and a waste of resources, and that’s why it’s important to have tools that can seamlessly work with multiple frameworks, platforms, and tools, such as Jenkins and TeamCity. This leads to an effective toolchain that consists of one central CI server that integrates and works with different platforms and tools, such as Java and .NET.

7.3. Configure: building (web) apps for multiple environments

This section contributed by Max Antoni

Java applications consist of artifacts such as EAR, WAR, or JAR packages. When developing an application, you might want to do some integration tests and then deploy the application onto a test environment and into production. Deploying the application on one specific machine may require that you configure environment-specific application properties. But you often won’t want to run the build separately for each environment, because it takes too long and because you want to rely on one specific version of the software.

How can you support multiple environments and cope with runtime configurations? Generally, you have many options:

  • Use the artifacts and manually configure them for each environment (bad!).
  • Aggregate environment-specific data in Java properties, replacing them manually for each environment.
  • Write a script that scans configuration files, automatically replacing specific entries with other values.
  • Write scripts that check on which server the build is running, the target environment, and where tools are installed, to detect the configuration parameters for the output.
  • Use dependency injection, based on the technology you’re using.[5]

    5 One example of using dependency injection with Java EE 6 can be found in Juliano Viana’s “Application configuration in Java EE 6 using CDI—a simple example” blog entry on java.net: http://weblogs.java.net/blog/jjviana/archive/2010/05/18/applicaction-configuration-java-ee-6-using-cdi-simple-example.

It’s more elegant to use the features your build tool offers. If you use Ant, you can apply its filter-chaining feature. For instance, if you embed a filter chain with an expandproperties filter in a copy task, Ant will replace placeholders with values from property files. A basic template looks like this:

<copy todir="targetDir">
   <fileset dir="templates" />
   <filterchain>
      <expandproperties/>
   </filterchain>
</copy>

This section will show an elegant solution that uses Maven.

When people start using Maven, one of the first things they discover is that it always produces a single artifact per POM. This is good for identifying dependencies and avoiding circular dependency issues or duplicated class files in the classpath. In Maven, a specific artifact (that is, a specific version of your artifact) is defined by its groupId, artifactId, and version number (its GAV coordinates, in short). These coordinates shouldn’t be changed, even when the artifact is deployed to a different target environment. Let’s look at how Maven helps with this configuration using profiles.

A common solution to this configuration problem is to run the build with a different profile for each environment (profiles are discussed in section 7.3.3). This has the advantage of keeping the build simple. But the disadvantage is that you can’t use the results of the same build in multiple environments. This might be a critical issue if you have to meet common compliance or audit requirements. Besides governance issues, there may be technical reasons to avoid this approach. If you’re using the Maven Release plug-in to establish a full-fledged release based on Maven, it becomes a critical limitation because you create the release (that is, the output of the Release plug-in) only once. You would have to check out the sources from the created tag, and then build and deploy it again with the corresponding profiles activated. It’s much easier to have a single build that produces all artifacts for all environments.

You should try to keep your configuration data separated from your binaries. With Maven, you can create separate artifacts for each environment by using classifiers, different profiles, or projects.[6] In your build process, by default, you should build all artifacts for all possible target environments.

6 See the official documentation on the Maven web page: http://maven.apache.org/guides/mini/guide-building-for-different-environments.html.

Another typical way is using assemblies, discussed next.

7.3.1. Multiple artifacts with assemblies

Maven has assemblies that let you create multiple distributions of your application. Assemblies may produce a zip file with the source code, a JAR with all dependencies, and much more.

Let’s look at an example of an approach that fits a web application well. In this case, we want to produce a WAR file, which we do in a standard POM by specifying war in the packaging tag: <packaging>war</packaging>. You can add an assembly to your build’s plugin section, as shown in the following listing.

Listing 7.5. Maven assembly plug-ins
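The plug-in configuration might look roughly like the following sketch; the descriptor paths follow the conventions described below, and the second (test) descriptor is an assumption mirroring the test WAR produced later in this section.

<plugin>
   <groupId>org.apache.maven.plugins</groupId>
   <artifactId>maven-assembly-plugin</artifactId>
   <configuration>
      <descriptors>
         <descriptor>src/main/assembly/prod.xml</descriptor>
         <descriptor>src/main/assembly/test.xml</descriptor>
      </descriptors>
   </configuration>
   <executions>
      <execution>
         <id>environment-wars</id>
         <phase>package</phase>
         <goals>
            <goal>single</goal>
         </goals>
      </execution>
   </executions>
</plugin>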

This configuration produces the assembly as part of the package phase. By convention, assembly descriptors are placed in src/main/assembly. The assembly descriptor will produce another WAR file next to the one that the POM creates by default. The assembly ID that’s specified gets attached to the filename, so you end up with something like this in your target directory:

webapp-1.0-SNAPSHOT.war
webapp-1.0-SNAPSHOT-prod.war

The referenced assembly descriptor is shown in the following listing.

Listing 7.6. Assembly for production (prod.xml)
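A production descriptor might look roughly like this sketch; the directory names follow the structure shown in figure 7.8, and the exact overlay details are assumptions.

<!-- Sketch only: repackage the default (dev) WAR and overlay the production configuration. -->
<assembly>
   <id>prod</id>
   <formats>
      <format>war</format>
   </formats>
   <includeBaseDirectory>false</includeBaseDirectory>
   <dependencySets>
      <dependencySet>
         <!-- Include and unpack the project's own (dev) WAR. -->
         <useProjectArtifact>true</useProjectArtifact>
         <unpack>true</unpack>
         <outputDirectory>/</outputDirectory>
      </dependencySet>
   </dependencySets>
   <fileSets>
      <fileSet>
         <!-- Patch the WAR with the production properties. -->
         <directory>src/main/config/prod</directory>
         <outputDirectory>WEB-INF/classes</outputDirectory>
      </fileSet>
   </fileSets>
</assembly>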

We configure the assembly to produce the same WAR as the normal development version. To do so, we add a dependency set that unpacks the dev WAR file. The production WAR file gets patched with the properties file in the dist directory by configuring a file set in the prod assembly.

7.3.2. Applying different configurations

You don’t want to hardcode specific data values in your application. Instead, it’s good practice to have environment-specific information grouped together in one or two configuration files. Assuming the configuration of our web application lives in Java property files, we can now have a basic configuration for development and additional configurations for each assembly being created (for example, production). In addition to that, a development team might need different configurations for their environments.

To achieve this, we’ll first separate the configuration files from other resources and place them into src/main/config/dev and src/main/config/prod. The resulting directory structure is shown in figure 7.8.

Figure 7.8. Directory structure, including configuration for different environments (dev, prod, test)

The production version of the property files might contain information like database connections, but the development version could use placeholders that are filtered by Maven to contain values for the individual developers’ profiles. This can be done by overriding Maven’s default configuration for resources in the POM. The following listing shows how to configure the resources folder.

Listing 7.7. Overriding default configuration in the POM
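The resource configuration might look like this sketch; the dev folder is the environment-specific, filtered one, and src/main/resources holds the environment-independent content.

<build>
   <resources>
      <resource>
         <!-- Environment-specific resources, filtered with the developer's profile values. -->
         <directory>src/main/config/dev</directory>
         <filtering>true</filtering>
      </resource>
      <resource>
         <!-- Regular, environment-independent resources. -->
         <directory>src/main/resources</directory>
      </resource>
   </resources>
</build>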

First we define the platform-specific resource folder and configure it to allow filtering. Next, we define the resource folder containing the platform-opaque content.

Once the resources are set up correctly, accessing the content is transparent. The next listing shows a major part of our example web application.

Listing 7.8. Accessing configuration properties
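A servlet along these lines might look like the following sketch; the class and properties file names are illustrative assumptions.

import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;

// Sketch only: loads one environment-specific and one environment-independent
// properties file from the classpath in the servlet's init method.
public class ConfigServlet extends HttpServlet {

    private final Properties environment = new Properties();
    private final Properties common = new Properties();

    @Override
    public void init() throws ServletException {
        try (InputStream env = getClass().getResourceAsStream("/environment.properties");
             InputStream app = getClass().getResourceAsStream("/application.properties")) {
            environment.load(env);   // filtered, environment-specific content
            common.load(app);        // environment-independent content
        } catch (IOException e) {
            throw new ServletException("Could not load configuration", e);
        }
    }
}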

We read the properties content in the init method of our servlet. To demonstrate the two ways of accessing content (content that’s dependent on the environment and content that’s platform-independent), we specify two properties files. In the Java class, the approach is the same: in both cases we load the resource as a stream. Maven puts the properties into the correct folders.

Running mvn clean package now produces two WAR files with different configurations. But we might not want to create all versions for all environments on each build. In the next section, I’ll show you how to use a distribution profile to handle multiple environments.

7.3.3. Using a distribution profile and executing the example

To optimize the current POM, the entire assembly plug-in configuration in the build section can be wrapped into a dist profile. This has the advantage that a developer can choose when to produce the WAR files for all target environments. You can also use this approach to automate this step when producing a release. The following listing shows the profile.

Listing 7.9. Profiles
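The dist profile might look roughly like this sketch, simply wrapping the assembly configuration from listing 7.5 (descriptor names as assumed there).

<profiles>
   <profile>
      <id>dist</id>
      <build>
         <plugins>
            <plugin>
               <groupId>org.apache.maven.plugins</groupId>
               <artifactId>maven-assembly-plugin</artifactId>
               <configuration>
                  <descriptors>
                     <descriptor>src/main/assembly/prod.xml</descriptor>
                     <descriptor>src/main/assembly/test.xml</descriptor>
                  </descriptors>
               </configuration>
               <executions>
                  <execution>
                     <phase>package</phase>
                     <goals>
                        <goal>single</goal>
                     </goals>
                  </execution>
               </executions>
            </plugin>
         </plugins>
      </build>
   </profile>
</profiles>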

Maven’s Release plug-in offers a convenient way to work with profiles during releasing. By configuring the releaseProfiles element, the defined profiles are activated automatically. Here’s an example:

<plugin>
   <artifactId>maven-release-plugin</artifactId>
   <configuration>
      <releaseProfiles>dist</releaseProfiles>
      <goals>install</goals>
   </configuration>
</plugin>

You can now run the example with mvn clean package -Pdist. Maven will compile and package the web application, putting three WARs into your target folder:

webapp-1.0-SNAPSHOT.war
webapp-1.0-SNAPSHOT-prod.war
webapp-1.0-SNAPSHOT-test.war

 

Accelerating development with the Maven Jetty plug-in

Sooner or later, you’ll want to deploy your coded and packaged web application to a servlet container for development testing purposes. Usually, you would download a servlet container such as Tomcat, and then copy and unpack your WAR file to the container’s webapps folder. But with Maven, you profit from a much faster feedback loop by using the Maven Jetty plug-in to run your web application within Maven. You have to configure the Jetty plug-in in your POM and run special Maven Jetty goals to deploy and run the WAR file.

 

Using the Maven Jetty plug-in, the dev build can be created and tested in one single step. Add this to the plug-ins section of your POM:

<plugin>
   <groupId>org.mortbay.jetty</groupId>
   <artifactId>maven-jetty-plugin</artifactId>
</plugin>

Now run mvn clean jetty:run and then open http://localhost:8080/webapp/ in your browser. The test and prod environments can be tested by deploying the respective WAR files into a servlet container and then opening http://localhost:8080/webapp-1.0-SNAPSHOT-test/ and http://localhost:8080/webapp-1.0-SNAPSHOT-prod/ in your browser.

In summary, Maven offers functionality to configure artifacts for different target environments. By using assemblies, a single build produces multiple artifacts for your specified environments. This makes the application easy to maintain and encourages a clean separation of environment-specific configuration files. Combined with Maven’s releasing facilities, creating a release for multiple environments is done in a single step.

7.4. Building, auditing, and staging with Jenkins

Kohsuke Kawaguchi is currently lead developer on the Jenkins project, an open source CI server that was originally released in February 2005 (under the name Hudson). Many developers find Jenkins easy to install and configure thanks to its Ajax-based web interface (instead of the XML configuration files required by CruiseControl). With Jenkins, many different artifact types can be built, such as .NET and Java.[7]

7 For further details on Jenkins, see John Ferguson Smart, Jenkins: The Definitive Guide (O’Reilly, 2011).

 

Jenkins and Hudson

Hudson split into two different products: Hudson (http://hudson-ci.org) and Jenkins (http://jenkins-ci.org). The original founder and core contributor, Kawaguchi, along with others, works on the open source product Jenkins. Hudson development is led by Oracle, together with Sonatype and others. Jenkins and Hudson follow different strategies regarding releasing and licensing—for details, please refer to the respective product sites. I used Jenkins in writing this book, but all discussions apply to both products. It’s possible, though, that divergent development may lead to incompatibilities in the future.

 

It’s recommended that you run Jenkins in a servlet container such as Tomcat or JBoss, but you can also start by executing java -jar jenkins.war on the command line. Jenkins has a component architecture, so it supports a large library of plug-ins that extend its functionality. Using plug-ins, you can, for example, add different build types to your system, like .NET projects; add further reporting or auditing facilities; or add different communication channels, like posting build results on Twitter.

Discussing Jenkins could fill a whole book. Here, we’ll focus on introducing Jenkins and looking at it in the context of audits and its Artifactory integration.

7.4.1. Jenkins and triggering jobs

A Jenkins job specifies what you want Jenkins to build and when it should be built, which artifacts should be kept as a result of the build, on which channels to report, and what to do after the build has run.

Jenkins is a build server, so it doesn’t know how to build your project. You need build scripts (Maven, Ant, Ivy, MSBuild, and so on) to build your project. Jenkins lets you manage these build scripts (inside a Jenkins job) and run builds depending on specific criteria.

These are the typical approaches to triggering a build:

  • A developer checks something into your VCS, and Jenkins, which monitors the VCS for changes, builds a new version of your software including these recent changes. This is a nice approach for setting up a continuous build, and you can align it with your individual needs in many ways—for example, by configuring when Jenkins checks for changes. You could require that Jenkins check the VCS once every hour by using a cron-like syntax in the Jenkins job configuration panel. Some projects use postcommit hooks inside their VCS to start a new build after a check-in. This is also a valid and effective approach: you can reference a shell script as a postcommit hook containing a single wget call to your build URL (see the hook sketch after this list). This fosters a task-based approach, but it also has drawbacks. If you have many local changes to be synchronized with your central VCS, starting a build directly after each check-in may build code in the VCS that’s in an inconsistent state. This makes it difficult to check in your changes frequently, even if the changes don’t have any interdependencies. In this case, you should rely on the Jenkins polling feature instead. Letting Jenkins check the VCS for changes is more elegant and is often done on an hourly basis. Don’t be afraid of performance bottlenecks here: Jenkins doesn’t check out the entire source tree—for Subversion, for example, Jenkins checks the revision number to detect changes.
  • Configure the job to start a scheduled build. This is often used to run nightly builds that may contain more complex logic, may do more testing, and so on. These builds are done once a day, typically overnight, to deliver the results in the morning when developers can then react to them.
  • Use dependency builds, which means a build job is dependent on the run of another job. This allows complex build chaining (often also called a staged build). One such scenario is to have a simple build job that compiles, packages, and does some basic sanity checks. Only if this is successful is a second downstream job run to complete the test coverage. This allows much better feedback loops than putting the complete build and test logic into one monolithic build job. On the success of the previous build, another Jenkins downstream job could be triggered to deploy the created artifact to a test environment. For instance, Jenkins’ Deploy plug-in can deploy artifacts to common application servers; build dependencies can also be configured by using Jenkins’ Build Pipeline plug-in.[8] You can also configure dependency builds across build tools. Using the Bamboo plug-in for Jenkins, for example, you can trigger a Bamboo build as the postbuild action in Jenkins. Besides that, Jenkins offers features that aggregate build results for dependent builds. As an example, you can configure Jenkins to aggregate test results across different jobs.

    8 Jenkins’ Deploy plug-in can be found at https://wiki.jenkins-ci.org/display/JENKINS/Deploy+Plugin. The Build Pipeline plug-in is at https://wiki.jenkins-ci.org/display/JENKINS/Build+Pipeline+Plugin.

  • Push the build start button manually.
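The postcommit hook mentioned above can be as small as the following sketch; the Jenkins URL, job name, and token are placeholders, and it assumes that remote triggering ("Trigger builds remotely") is enabled for the job.

#!/bin/sh
# Sketch of a Subversion post-commit hook that triggers a Jenkins job remotely.
REPOS="$1"
REV="$2"
wget -q -O /dev/null "http://ci.example.com/jenkins/job/myproject/build?token=SECRET"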

Jenkins organizes builds with build jobs that are aggregated on Jenkins’ dashboard.

7.4.2. Jenkins dashboard and Jenkins jobs

Jenkins provides intuitive navigation, starting with the Jenkins dashboard, which lists your build jobs. There, you can click on one job to see more details on a particular build job (for example, the workspace consisting of all sources checked out by Jenkins). Finally, you can go into one of the job runs to check its individual result.

What does this look like in detail? Figure 7.9 shows the dashboard listing a couple of jobs—you can see the status of the different jobs.

Figure 7.9. Jenkins dashboard listing the configured jobs and information about them (result of the last build, trend, duration). You can also start new builds by clicking the buttons at the far right.

The balls in the first column indicate whether the last job run was successful or not. Here, green indicates success (but the green must be configured by installing another plug-in; success in Jenkins is traditionally indicated with blue balls). Other possible indicators are red balls (for failed builds, due to compile errors, for example) and yellow balls (pointing to unstable builds, due to failed tests, for instance). You can configure what fails a build. For example, you can define when a build is considered to be unstable, based on the results of audits, tests, and test coverage.

The trend is displayed in the second (weather) column, which analyzes the last builds, and the subsequent columns provide pointers to the times of the last builds and their durations. You can also start new builds by clicking the button on the right. Rolling the mouse over a visual UI item delivers more detailed context information.

Proceeding from the dashboard to a specific build job, you’ll see the following (depending upon how you’ve configured Jenkins and what Jenkins plug-ins you’ve installed):

  • Links to a change history that shows the build history and the VCS changes new in this build (and identifying who did this change). This can be linked with a repository browser like FishEye and an issue tracker like JIRA.
  • Links to associated tools (like Trac or Sonar), if configured.
  • Links to the coverage report showing test coverage and its trend across builds—for example, measured by the Cobertura coverage tool (if you configure the Cobertura plug-in as part of your Maven build)—with code coverage displayed on package, file, class, method, line, and conditional levels.
  • Exposed and stored artifacts, like a JAR built by the job. This can’t replace more sophisticated storage, like a VCS or a component repository such as Artifactory. Depending on your context, you could provide a zip file that’s generated by your Maven build (via the Maven Assembly plug-in); for example, you could generate a target platform for your OSGi project or any similar package of artifacts you want to provide.
  • Links to the latest test results, which include pages containing a list of test modules and the test results (for instance, test failures new with this build).
  • Aggregated data about audits and tests and the trends across builds (if you configure the audit tools as part of your Maven build and reference the audit XML results in Jenkins).
  • Dedicated trend illustrations for tests and static code auditing (FindBugs, PMD, Checkstyle), as well as a handy overview of all violations detected by a specific tool (if you configure the audit tools as part of your Maven build).
  • The Javadocs delivered by the build (if they’re part of the Maven project description).
  • A link to the generated Maven site.
  • Links to Maven modules showing fine-grained information on the module level (such as audit violations).
  • A link to the Jenkins workspace showing the current sandbox checked out of VCS (by VCS checkout or update, according to what you’ve configured).
  • Links to configuration pages to configure what you see.

When a job runs, it adds the results of a specific build to the history. Whereas the job overview page aggregates information or shows the last results, each job build occurrence has its individual information. This is important, because with Jenkins you can also inspect older builds in addition to the most current one. Each build shows the information illustrated in figure 7.10, as part of the build detail page.

Figure 7.10. On the build detail pages, Jenkins provides more information, including build artifacts, details on why the build was triggered (here a change in Subversion, revision 255, detected by Jenkins), and an overview of static code analysis violations.

 

Jenkins and its Matrix Project Job Type

Jenkins comes with a job type called matrix project (see the official documentation: http://wiki.Jenkins-ci.org/display/JENKINS/Aboutncysa). This job type expands a freestyle software project to a large number of parameterized build configurations. A matrix project lets you set up a single configuration with user-defined parameters. When you tell Jenkins to build it, it will build all the possible combinations of parameters and then aggregate the results. In addition to testing, this job type can also be useful for building a project for multiple target platforms. The functionality is continuously extended.

 

Figure 7.11 shows another excerpt of the build detail page with links to test results, audits, Artifactory, and built modules.

Figure 7.11. On the build detail pages, Jenkins links to test results, dedicated reporting pages according to code violations (here Checkstyle, FindBugs, and PMD), an aggregation page of violations (static analysis warnings), and to Artifactory and individual modules of the Maven build.

In the next section, we’ll discuss using Sonar with Jenkins. Sonar is an open source code analysis tool that helps to improve code quality.

7.4.3. Auditing with Jenkins and Sonar

Jenkins and Sonar both deliver reporting and data-aggregation support for auditing your software development. Auditing with Jenkins relies on configured build scripts, whereas Sonar doesn’t require your build scripts to be changed. Sonar can be integrated with Jenkins. Let’s start auditing with Jenkins.

Auditing with Jenkins

As a precondition for using your chosen auditing tools, you must configure your build project appropriately. This way, you can run the build script without Jenkins, for instance locally in the developer’s workspace, and get the result of those audits as well. Jenkins triggers the builds and aggregates and visualizes audit results. The following POM snippet shows what this can look like.

Listing 7.10. POM with Cobertura, FindBugs, Checkstyle, and PMD configuration
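The reporting section might look roughly like the following sketch; plug-in versions are omitted, and the Checkstyle configLocation value is an assumption.

<reporting>
   <plugins>
      <plugin>
         <groupId>org.codehaus.mojo</groupId>
         <artifactId>cobertura-maven-plugin</artifactId>
      </plugin>
      <plugin>
         <groupId>org.codehaus.mojo</groupId>
         <artifactId>findbugs-maven-plugin</artifactId>
      </plugin>
      <plugin>
         <groupId>org.apache.maven.plugins</groupId>
         <artifactId>maven-checkstyle-plugin</artifactId>
         <configuration>
            <configLocation>checkstyle.xml</configLocation>
         </configuration>
      </plugin>
      <plugin>
         <groupId>org.apache.maven.plugins</groupId>
         <artifactId>maven-pmd-plugin</artifactId>
      </plugin>
   </plugins>
</reporting>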

You can place your auditing plug-ins into the POM’s build and reporting sections. In the reporting section, shown in listing 7.10, Maven inspects your code and reports the results. This has an informational character only, because the build itself isn’t influenced by the result of the audits. You can configure Jenkins to influence the reported project health, or even break the build. There are limitations, though, in handling the granularity and criticality of violations: Jenkins counts only the violations.

In the build section, auditing results can influence the build directly. This is handy for implementing quality gates, if, say, you want to break the build when defined requirements aren’t met. Another use case for including audits in the build section (and not in the site section) is if you don’t want or need to use Maven’s site lifecycle to generate these reports.

 

Continuous Inspection Versus One-Time Inspection

Some projects experience good results with a single one-time inspection instead of continuous inspections. These projects claim that it’s enough to include specific inspections for a particular instance, and then not run them again afterward. In these cases, this approach will point to possible design defects that result in a learning opportunity for the team. This is a task-based approach that allows developers to focus on their work instead of constantly focusing on passing audits.

 

Audit tools support the configuration of rules in dedicated configuration files. If you choose to use detailed files for configuring rules, you may place these files into a dedicated Maven project; you must do so if you run a multimodule Maven build. Then, in your parent POM’s build section, you add an extensions element, which adds the referenced artifact to the Maven compile classpath. This extends the compile classpath, adding the audit rules:

<extensions>
   <extension>
      <groupId>net.huettermann</groupId>
      <artifactId>resource</artifactId>
      <version>1.0.0</version>
   </extension>
</extensions>

The resource project consists of only the configuration files for the audit tools, placed in the default file structure where Maven expects general resources: src/main/resources/. Be aware that you can’t use the extensions construct in a POM’s reporting section.

Figure 7.10 showed an aggregation of static code violations that can be detected by various audit tools (such as Checkstyle, FindBugs, PMD).

 

False Positives

Remember that you have to align audits with your individual context. In part, this involves placing a value on the auditing rules you include. A valuable rule for one project can be misleading and of no value for another project. For example, consider FindBugs’ UWF_FIELD_NOT_INITIALIZED_IN_CONSTRUCTOR rule, which checks whether you’ve initialized fields in the constructor. If you use any kind of Java injection mechanism, this may lead to a lot of false positives, because the injection system takes care of initializing properties. Other rules are debatable; one person may state that it’s perfect and good design to have only one return statement at the end of a block, whereas another may state that using multiple return statements improves clarity and reduces overhead.

 

Figure 7.12 provides an example of using Checkstyle for code audits.

Figure 7.12. Checkstyle found an antipattern: this method isn’t designed for extension.

In figure 7.12, Checkstyle analyzed the code and found a design antipattern. According to Checkstyle, the method isn’t designed for extension—the method should be abstract or final. Checkstyle knows this rule as DesignForExtensionCheck. Jenkins reports this with a special colored background.

The example in figure 7.13 shows PMD in action. PMD detected an empty catch block (an unhandled exception), which is considered to be an antipattern.

Figure 7.13. PMD detects an empty catch block.

The last auditing example is shown in figure 7.14. FindBugs detected a code fragment that was obviously a coding defect. By accident, the developer repeated a conditional test. The developer wanted to add a different condition but made a mistake while doing so.

Figure 7.14. FindBugs points to a repeated conditional test, which is most likely a coding defect.

Discussing all tools or even all rules in detail is beyond the scope of this book. The lesson here is that you should include audits in your build and let Jenkins report and aggregate the results. For more information on the individual rules (and how to configure them), please consult the available documentation for each of these free tools.

 

What makes Checkstyle, PMD, and FindBugs complementary?

Although there’s some overlap, all three tools have different usage scopes and individual strengths.

Checkstyle focuses on conventions. For instance, does the code correspond to a defined format, are Javadocs set correctly, and are the Sun/Oracle naming conventions followed?

PMD focuses on bad practices, such as well-known antipatterns—code fragments that will lead to difficulties over time. Typical examples are having dead (unreached) code, too many complex methods, and direct use of implementations instead of interfaces.

Finally, FindBugs focuses on potential bugs. These are code statements or sequences that aren’t obviously wrong at first glance but that will lead to serious problems. Multiple factors must be taken into account to detect such a circumstance. Examples include a code change that uses a conditional statement twice or a method that returns references to mutable objects, exposing internal representations.

 

 

Sun/Oracle Code Conventions

The Sun/Oracle code conventions usually don’t target project requirements (they’re too restrictive and too fine-grained), but you can use them as a template to customize your own conventions. Conventions are important for a team to follow, in order to collaborate with the greatest efficiency and to maintain a common standard while sharing code.

 

As mentioned, you can configure a build to fail based on the results of your audits. Although you can configure the auditing tools and plug-ins directly in Maven (for example, to break the build automatically as soon as the test coverage turns out to be insufficient), configuring these checks in Jenkins instead helps to keep this control centralized and operationally efficient. Although developers can use Jenkins too, the build scripts usually aren’t maintained by the build manager: instead, project developers write and maintain build scripts as part of the application development effort. Depending on how you slice project roles, a central build manager may use Jenkins to apply centralized quality gates, but they probably won’t change the scripts themselves (because of organizational restrictions or because they don’t have the skills to do so).
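To illustrate the Maven side of this, the following is a minimal sketch of a coverage quality gate using the Cobertura plug-in’s check goal; the threshold values are purely illustrative and need to be aligned with your project:

<plugin>
   <groupId>org.codehaus.mojo</groupId>
   <artifactId>cobertura-maven-plugin</artifactId>
   <configuration>
      <check>
         <!-- break the build if overall coverage drops below these (illustrative) thresholds -->
         <haltOnFailure>true</haltOnFailure>
         <totalLineRate>80</totalLineRate>
         <totalBranchRate>70</totalBranchRate>
      </check>
   </configuration>
   <executions>
      <execution>
         <goals>
            <goal>clean</goal>
            <goal>check</goal>
         </goals>
      </execution>
   </executions>
</plugin>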

 

Jenkins, Audits, and IDEs

Jenkins integrations for IDEs are available, too, and there’s support for auditing in the IDE. For example, it can be wise to use the Checkstyle plug-in for Eclipse, but it should be integrated with the build stream. It’s more important to include audits (and only the audits that add value in your individual situation) in your build system than in your IDE. Many IDEs also provide auditing rules or apply conventions. You can configure Eclipse, for example, to apply rules, such as organizing imports, when you save. If you use such a configuration, check it in to your VCS and provide it to others, but remember that you can’t force colleagues to use it (which is a good reason for integrating audits with a build run on the central build server).

 

Finally, let’s look at test coverage. We added Cobertura to our Maven POM; figure 7.15 shows the test coverage of the project built via Jenkins, on the package level. You can navigate further into files, classes, and methods, inspecting which tests passed (and how often) and which didn’t. Please keep in mind that Jenkins is a tool for reporting and aggregation; it doesn’t measure or inspect code itself.

Figure 7.15. Code coverage breakdown by package, showing packages and their files, classes, and methods coverage

Auditing with Sonar

Although Jenkins provides a centralized view of your build results (including reporting of audits), there are other common tools for tracking code quality. Sonar (http://sonar.codehaus.org/) is one such application. It’s self-contained and isn’t dependent on build scripts or Jenkins. If you reference your Sonar installation in Jenkins, Sonar will examine the quality of the builds Jenkins performs.

Sonar can be configured to apply FindBugs, Checkstyle, and PMD, among other tools, and it can measure code coverage for your project, without your having to configure the tools in the Maven POMs. Because it doesn’t require any POM modifications, it can be run on every Maven project. The benefit of this is that it allows Maven to do its core job (building the project) and it keeps the POMs lean. But this can be a drawback too, because you lose early feedback on simple compliance errors in your builds. Another benefit is that you can easily analyze projects. SonarSource (the company behind Sonar, offering commercial support) has added reporting on many open source projects to its audit server, hosted under the nemo subdomain: http://nemo.sonarsource.org.
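Assuming a Sonar server is already running, analyzing a Maven project typically requires no POM changes at all: you run a single goal and, if the server isn’t local, point the analysis at it (the host URL below is a placeholder):

> mvn sonar:sonar -Dsonar.host.url=http://sonar.myorg:9000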

Figure 7.16 shows the results of a build in the Sonar dashboard.

Figure 7.16. A project inspected by Sonar, showing the results of FindBugs, Checkstyle, and PMD inspections, and the results of code coverage

 

Continuous inspection, by Simon Brandhof (SonarSource founder and technical lead)

More than ten years ago, the concept of continuous integration was introduced. Its ultimate goal was to become capable of shipping a release of any type at any time with minimal risk. To reach this objective, continuous integration introduced new quality requirements on projects:

  • Anybody must be able to build the project from any place and at any time.
  • All unit tests must be executed during the continuous integration build.
  • All unit tests must pass during the continuous integration build.
  • The output of the continuous integration build is a package ready to ship.
  • When one of the preceding requirements is violated, nothing is more important for the team than fixing it.

This is a good starting point, but it isn’t sufficient to ensure total quality. What about other source code quality requirements? Requirements could be

  • Any new code should come with corresponding unit tests (regardless of the previous state of code coverage).
  • New methods must not have a complexity higher than a defined threshold.
  • No new cycles between packages must be added.
  • No duplicated blocks must be added.
  • No new violations of the coding standard must be added.
  • No calls to deprecated methods should be added.
  • More generally, overall technical debt must be kept under control and allowed to increase only consciously: this is the concept of continuous inspection.

A continuous inspection process can be seen as an information radiator dedicated to making source code quality information available at any time to every stakeholder. Transparency is certainly one of the main reasons open source software is usually of better quality than closed source software. A developer writing a new piece of code should always think about the next person or team who will maintain it. Continuous inspection ensures this golden rule is not forgotten.

 

Sonar enables you to navigate through components and provides appealing visualizations. You can also zoom in on individual class statements. An in-depth discussion of Sonar is beyond the scope of this book, but it’s worth taking a closer look at this technology. This is particularly true if you’re interested in audits and are looking for a one-stop solution that eliminates the need to integrate different tools into your build scripts.

For projects based on Maven, Sonar permits you to visualize artifact dependencies (for example, a cartography of libraries that shows which projects use which libraries). Sonar supports analyzing project sources to identify which projects are using a specific version of a library. You can also use the functionality provided by Maven’s Dependency plug-in (http://maven.apache.org/plugins/maven-dependency-plugin) by running the dedicated goals on the console or by binding the functionality to a Maven phase, but Sonar’s reporting is much more convenient. Beyond convenience, cross-project questions like “Which projects are using commons-logging or another specific library?” can’t be answered by running Maven’s Dependency plug-in on a single project.
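If you only need the raw dependency data for a single project, the Dependency plug-in can be run from the console; two commonly used goals are shown here as a sketch (they report the resolved dependency tree and flag used-but-undeclared or declared-but-unused dependencies, respectively):

> mvn dependency:tree
> mvn dependency:analyze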

But keep in mind that Sonar isn’t a substitution for a repository manager, like Artifactory, and Maven’s site lifecycle can read POMs and visualize dependencies as well.

7.4.4. Running build fragments in Jenkins only

It can be handy to run parts of your build scripts only in Jenkins (that is, on a central integration server), while also having fast-running builds on developers’ machines, so developers can run their build scripts before code check-in. But build scripts should be the same regardless of where they run, so how can a build script know whether or not it’s running in Jenkins? To resolve this, we need a special configuration or another mechanism to detect the runtime environment.

 

Note

Many developers don’t want to wait until test coverage (or other audits) is measured before each VCS commit. They only want a smoke test consisting of compiling and packaging the artifacts. Other developers would like to run test coverage and other audits in their workspaces, but they value fast builds and fast feedback, and many builds aren’t quick enough. Audits (and other similarly advanced practices) can be run on the central build machine instead.

 

One solution is to configure Jenkins to inject specific Java properties into your build; while running, the build script detects whether those specific parameters are available, which isn’t the case when you start the build script on a developer’s desktop. Another solution is to use one of the implicit parameters that Jenkins automatically injects into every build (for instance, BUILD_NUMBER). In either case, you’ll need to use Maven profiles to handle the different build behaviors.

The next listing shows an example where we define a profile inside a parent POM. Using Maven profiles enables you to create different configurations, depending on where the script runs.

Listing 7.11. A Maven profile activated by Jenkins

Each profile has a unique identifier and contains information specifying when it will be activated. Many options for activating a profile are available, such as activating a default profile on particular operating systems, or Java versions, or if properties are set. In this case, we activate the profile when the property BUILD_NUMBER is set. The value of the property isn’t important in this example, only that it exists. Next, we configure the Maven logic by adding the Cobertura plug-in to the build phase. Many areas of your POM can be customized with profiles.[9]

9 For a comprehensive discussion, see the Maven “Introduction to Build Profiles” (http://maven.apache.org/guides/introduction/introduction-to-profiles.html).
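A minimal sketch of such a profile is shown here; the profile id is arbitrary, and depending on how Jenkins passes its variables to your Maven build, you may need to activate on env.BUILD_NUMBER instead:

<profiles>
   <profile>
      <id>jenkins</id>
      <activation>
         <property>
            <!-- activated whenever a build number is injected by Jenkins -->
            <name>BUILD_NUMBER</name>
         </property>
      </activation>
      <build>
         <plugins>
            <plugin>
               <groupId>org.codehaus.mojo</groupId>
               <artifactId>cobertura-maven-plugin</artifactId>
               <executions>
                  <execution>
                     <phase>verify</phase>
                     <goals>
                        <goal>cobertura</goal>
                     </goals>
                  </execution>
               </executions>
            </plugin>
         </plugins>
      </build>
   </profile>
</profiles>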

Because <property file="${user.name}.properties"/> is a commonly seen pattern, it can be useful to provide a properties file named jenkins.properties. If Jenkins runs as the user jenkins, the script will pick it up without further intervention. In other cases, you have to start your build script with -Duser.name=jenkins, which is equivalent to setting user.name=jenkins somewhere in the properties, although you probably don’t want to do this.

 

Advanced Maven pluginManagement

Besides adding a Maven plug-in to your build or reporting section, you can also use Maven’s pluginManagement section. There, you can define and fully configure a plug-in inside a parent POM for further flexible reuse. Once a plug-in has been defined and configured in the pluginManagement section, child POMs can reference the plug-in for usage without repeating the full configuration. This enables you to centrally provide default plug-in settings so that child projects don’t have to repeat the configuration settings again and again.

This feature allows you to configure project builds that inherit from the current one, but this configures only those plug-ins that are referenced within the plugins element in the children. The children have every right to override pluginManagement definitions.
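As a brief, hedged sketch (the plug-in and version are only examples), the parent defines the configuration once and each child merely references the plug-in:

<!-- parent POM -->
<build>
   <pluginManagement>
      <plugins>
         <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-checkstyle-plugin</artifactId>
            <version>2.6</version>
            <configuration>
               <configLocation>checkstyle-rules.xml</configLocation>
            </configuration>
         </plugin>
      </plugins>
   </pluginManagement>
</build>

<!-- child POM: inherits version and configuration from the parent -->
<build>
   <plugins>
      <plugin>
         <groupId>org.apache.maven.plugins</groupId>
         <artifactId>maven-checkstyle-plugin</artifactId>
      </plugin>
   </plugins>
</build>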

 

Jenkins allows you to pass arbitrary user data as key/value pairs to the build script. You can evaluate this data to activate a Maven profile, for example by checking whether the parameter (the key) is set and passed to the build script, or whether the key has a specific value. Another use case for processing passed data is using the build number inside the build script.
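For example, a profile can be activated only when a passed key carries a specific value; the property name and value here are purely illustrative:

<profiles>
   <profile>
      <id>deploy-to-test</id>
      <activation>
         <property>
            <!-- active only when -DdeployTarget=test is passed to the build -->
            <name>deployTarget</name>
            <value>test</value>
         </property>
      </activation>
      <!-- profile-specific build settings or properties go here -->
   </profile>
</profiles>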

7.4.5. Injecting build numbers into applications

A developed and delivered application should have a visible version number. Often, two different version numbers are used: one primarily used by domain experts and users of the application, and one used by the development team. A shared version number improves communication between the stakeholders (developers, testers, and users) by linking bug fixes and features to delivered versions of the software.

Agile ALM encourages communication between stakeholders, so many Agile ALM projects inject a technical version number into the applications. One commonly used version number is the build number that is incremented by Jenkins because this build number is unique for each build project.

A typical setup to inject the build number into an application built with Maven may look like this: In your Jenkins build project, you can configure parameters to be passed to your build script. Parameters are key/value pairs, and in your build script you can access the key and read the passed value. In this case, we’ll assign Jenkins’s implicit variable $BUILD_NUMBER to the key jenkins.build_number. An implicit variable is not a user-defined variable; rather, it has a specific meaning for Jenkins: Jenkins replaces the variable with the concrete build number whenever the build job runs. The resulting key/value pair looks like this:

jenkins.build_number=$BUILD_NUMBER

Jenkins triggers the project’s Maven build system. In the Maven POM, you can access the parameter and further process it, as shown in the following listing.

Listing 7.12. A Maven profile activated by Jenkins, writing the build number to a file

The Maven POM uses Maven’s Ant plug-in to call Ant’s echo task. The echo task writes the build number to a file. The name and the location of the file are parameterized: in the central parent POM (our master POM, not shown in this example), we configured the file location as a Maven property, so we can access the property later wherever we want to use it.
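A minimal sketch of this step, assuming the maven-antrun-plugin and an illustrative version.file property (which the parent POM could point at ${project.build.outputDirectory}/version.properties so the file ends up inside the packaged artifact), might look like this:

<plugin>
   <groupId>org.apache.maven.plugins</groupId>
   <artifactId>maven-antrun-plugin</artifactId>
   <executions>
      <execution>
         <phase>prepare-package</phase>
         <goals>
            <goal>run</goal>
         </goals>
         <configuration>
            <tasks>
               <!-- write the build number passed by Jenkins into the parameterized version file -->
               <echo file="${version.file}">build.number=${jenkins.build_number}</echo>
            </tasks>
         </configuration>
      </execution>
   </executions>
</plugin>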

The remaining aspect of integrating the version number into the application is to read the previously written version number file in the developed application, as you can see in this listing.

Listing 7.13. Reading the version file

In the application, the file can be read and the retrieved version number can be put into a semantic context, such as displaying it in the user interface. The convenience method encapsulates the reading of the version file and returns the version number.
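A convenience method along these lines, assuming the file was packaged onto the classpath as version.properties (class, file, and property names are illustrative), might look like this:

import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

public final class VersionInfo {

   private static final String VERSION_FILE = "/version.properties";

   private VersionInfo() {
   }

   /** Returns the build number that Jenkins injected at build time, or "unknown". */
   public static String getBuildNumber() {
      InputStream in = VersionInfo.class.getResourceAsStream(VERSION_FILE);
      if (in == null) {
         return "unknown"; // for example, running in the IDE without a packaged version file
      }
      try {
         Properties props = new Properties();
         props.load(in);
         return props.getProperty("build.number", "unknown");
      } catch (IOException e) {
         return "unknown";
      } finally {
         try {
            in.close();
         } catch (IOException ignored) {
            // nothing sensible to do here
         }
      }
   }
}

The user interface can then call VersionInfo.getBuildNumber() and display the value, for example in an About dialog.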

Now that we’ve discussed different approaches to integrating Jenkins with your build system, we’ll talk about integrating Jenkins with the component repository, Artifactory.

7.4.6. Jenkins, Artifactory, staging, and atomic deployment of Maven artifacts

Integrating Jenkins with Artifactory has many appealing benefits. The most obvious one is that your CI server should deploy artifacts to a component repository, such as Artifactory. But there are more advantages.

By integrating those two tools, you profit from atomic deployments of build artifacts. Additionally, the integration better links built and deployed artifacts to build jobs. Linking artifacts to builds enables Artifactory to semantically group binaries so that you can operate commands on the semantic group of artifacts, instead of operating commands on single, loosely coupled artifacts. A common command that you’ll want to apply on a group is staging the group of artifacts.

But first, let’s start with setting up the integration of Jenkins with Artifactory.

Installing and Configuring the Jenkins/Artifactory Bridge

The Jenkins/Artifactory communication is done with Artifactory’s REST API, and the Artifactory UI shows convenient views of this data as part of the Power Pack commercial add-ons. To install and configure the integration with Jenkins, here is what you need to do:[10]

10 For further information about the Jenkins/Artifactory integration, see the plug-in’s web page: https://wiki.jenkins-ci.org/display/JENKINS/Artifactory+Plugin.

  1. Get Jenkins running.
  2. Get Artifactory running (with the commercial Build Integration Jenkins add-on).
  3. Install the Jenkins Artifactory plug-in (available through the Jenkins plug-in manager).
  4. In Jenkins, configure your Artifactory server and user credentials in the Jenkins configuration panel.
  5. In your build job in Jenkins, configure your build section to run mvn clean install. This installs the artifacts into a local repository used by Jenkins, from which Jenkins then deploys the artifacts to Artifactory.
  6. In your build job in Jenkins, configure a postbuild action to deploy artifacts to Artifactory (after the full build goes through successfully).
  7. In your build job in Jenkins, in the Artifactory Configuration section, select the Artifactory server, the target repositories for releases and snapshots, and check boxes to deploy Maven artifacts and capture and publish build info (see figure 7.17). Jenkins offers a drop-down list for scanning all available repositories. Double-check that you have the deployment permissions on the target repository (in Artifactory) and that the credentials are set correctly (in Jenkins). The Artifactory plug-in for Jenkins also includes a link on the Jenkins user interface to redeploy artifacts to Artifactory at any time.
    Figure 7.17. Configuring the integration of Jenkins with Artifactory. Jenkins resolves the central settings that you’ve set on the Jenkins configuration page and suggests valid entries for Artifactory server and target repositories. In this example, Jenkins will deploy artifacts to Artifactory after all single Maven modules are built successfully. It will also capture build information and pass it to the Artifactory server. Help texts are available on demand for all configuration settings (by clicking the question marks).

Atomic Deployments with Jenkins/Artifactory

Jenkins is a build server that can trigger Maven goals. If you configure Jenkins to deploy a Maven multimodule build project, the deployment is executed on all POMs, including parent and children.

If one Maven module deployment fails due to a build error, this may leave the build in an inconsistent state, with some artifacts being deployed into your component repository and others not, as shown in figure 7.18.

Figure 7.18. The default way of deploying Maven artifacts in a multimodule project: All single Maven projects are deployed one by one. If the multimodule build fails, some artifacts are deployed to the component repository, and others aren’t. The result is an inconsistent state.

Using Artifactory, together with Jenkins, the deployment is done as one atomic operation only at the end, after every single module has been processed successfully.

This is a big step forward and solves the problem with Maven deploying each module separately as part of the deploy phase, which can leave your repository with partial deployments and inconsistent builds (see figure 7.19).

Figure 7.19. Deploying Maven artifacts in a multimodule project, with Artifactory and Jenkins: All single Maven projects are installed locally. If the complete multimodule build succeeds, all artifacts are deployed to the component repository. In the case of a build failure, no single module is deployed. This result is a consistent state.

Based on this atomic deployment feature, you can set up a sophisticated build system to ensure high-quality and state-of-the-art releasing. For instance, one Jenkins job does some essential tasks like compiling and packaging, and a second downstream job does comprehensive testing. Only if all the tests pass are the artifacts published to Artifactory.

Staging Artifacts in Artifactory

Another common requirement is identifying (and semantically grouping) artifacts deployed to your component repository. The process of staging software involves picking artifacts and promoting them as a single unit. For staging, you must identify and reference all artifacts built by a Jenkins build job. And identifying them isn’t enough—you must change their locations (or their visibility) in one atomic step. These are common challenges, and Jenkins, in conjunction with Artifactory, provides a viable solution.

 

Staging Versus Promoting in Artifactory

This book doesn’t distinguish between the staging and promoting of artifacts. Both terms refer to moving artifacts from one rung of the staging ladder to a higher rung. Artifactory does distinguish between staging and promoting. Given a Maven-based build, staging in Artifactory involves replacing the snapshot version numbers in the sources of the modules with release version numbers before the released modules are built again and put into a different target repository. Promoting artifacts doesn’t change the sources; it moves (or copies) the artifacts to another logical repository, without rebuilding.

 

 

Staging artifacts

Regardless of which scripts you use (such as Ant, Maven, or shell build scripts), you’ll probably want some kind of promotion build, and there are just as many ways to implement a staging build. In a staging build, you specify all of the individual requirements of the environment that will be used to run the release.

Some general pieces of advice, already discussed in this book, apply to many projects:

  • All your sources should be put into a VCS.
  • Releasing means applying tags or labels to the baselined version (the release).
  • Promoting has the precondition of being able to access the results of former builds again.

The artifacts that are built during releasing should be stored in such a way that you can promote them, so you should store the generated artifacts (at least the artifacts identified as being part of a release) in a component repository. This can be a VCS (like Subversion), a tool like Artifactory, or the build archive where Jenkins stores its build results. Once this is done, promoting means pulling these stored artifacts and putting them, by hand or by script, into another context (often deploying to another environment, without rebuilding). A script can pick up the artifacts from your component repository and deploy them accordingly.
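As a minimal illustration (the server name, repository layout, and paths are hypothetical), such a script might simply pull a released artifact from the component repository and copy it to the next environment, without rebuilding:

> wget http://artifactory.myorg/libs-release-local/com/myorg/app/1.0.0/app-1.0.0.war
> scp app-1.0.0.war deploy@test-host:/opt/tomcat/webapps/app.war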

This process can have many variations and can be supported by additional tools. One example is using Artifactory’s features to stage artifacts, as discussed in this chapter. Another option is using Jenkins’s build promotion plug-in. Once you install it, you can configure your individual promotion as part of the build job description. Part of this description is the criteria for when a build qualifies to be promoted and what happens when it’s promoted. Qualification criteria in Jenkins can be manual (you mark builds as promoted by hand) or automatic (for example, whether downstream jobs, such as those running special tests, completed successfully). A promoted build is marked with a star in the build job history. The promotion action triggered by Jenkins afterward could run a different script.

 

In a nutshell, the principle is simple: Your CI server is the entity having the most complete knowledge about the project. This information is captured during build time and is sent to Artifactory upon deployment at the end of the build process (as a JSON object). The information contains the produced modules, their published artifacts and dependencies, and data about the build environment (server version, JVM version, properties, and so on). Once you have all this information inside Artifactory, you can do the following:

  • You can collect the data required to reproduce the build.
  • You can see the builds’ artifacts and dependencies.
  • You can see the builds each artifact belongs to.
  • You can get warnings when you try to delete artifacts used in builds.
  • You can export the whole set of artifacts or dependencies for a build as an archive to deploy or reuse elsewhere.
  • You can operate on the whole set of artifacts or dependencies for a build as one unit (promote, remove, and so on).
  • You can navigate to the build information in Jenkins and from the Jenkins build page to the Artifactory build info.

In Artifactory you can click on the Builds panel. This will display all build projects (corresponding to the name of the build job in Jenkins) that deployed artifacts to Artifactory (see figure 7.20).

Figure 7.20. Artifactory’s Build Browser lists all builds for a specific build name (in this case, Task-based). The build name corresponds with the name of the Jenkins job that produced the builds. You can click on one specific build to get more information about it.

You can select the build project of interest, and on the next page you’ll see a list of all builds sorted by the Jenkins build number. For each build item, you can use the context menu to jump directly to Jenkins (to the corresponding job detail page) or to go to Artifactory’s build detail page.

Artifactory’s build detail page provides many pieces of information about the build, including general build information (for example, who performed the deployment and when) and what the published modules are. In figure 7.21, two Maven modules belonging to build #14 are listed: a JAR and the corresponding POM.

Figure 7.21. Artifactory shows published modules for all builds, including in which repositories the artifacts are located (in the Repo Path column).

Clicking an artifact in the Published Modules tab (see figure 7.21) opens the repository browser showing that artifact (this works for build artifacts and their dependencies). You can get there as well by opening Artifactory’s repository browser (where you can browse all repositories and their artifacts in all versions) and navigating to the artifact of interest.

When you display an artifact in Artifactory’s repository browser, you can see which build produced the artifact (the Produced By section in figure 7.22), with the build name and build number referencing the information from Jenkins. Artifactory also lists artifacts built using this artifact (the Used By section in figure 7.22). In this case, another Jenkins build job named Multi project references the artifact.

Figure 7.22. Artifactory shows the producers and consumers of artifacts built by Jenkins.

Promoting all artifacts belonging to a specific build is easy. Jenkins injects the two parameters (or properties) into Artifactory: build.name and build.number. You already learned in chapter 5 how to proceed from here. You perform a property search to find all artifacts belonging to this build, and save the search. Alternatively, you can save the search in the General Build Info tab in the Builds browser. Finally, you perform a bulk operation (such as copying artifacts to a special staging repository), as shown in chapter 5.

 

Injecting arbitrary properties into Artifactory

You can submit arbitrary properties from Jenkins to Artifactory through your Maven build. Your POM’s deployment sections are ignored when you deploy via Jenkins, so you must specify the properties in the Jenkins Artifactory plug-in, configuring the target Artifactory server.

As of this writing, this approach is less than completely reliable. A workaround is to edit the config.xml file (in .jenkins/jobs/<jobName>/config.xml) of your Jenkins job and extend the repositoryKey with the property in the publishers section:

<repositoryKey>path;myProperty=${myProperty}</repositoryKey>

In Jenkins, you must reload the configuration from disk afterward.

 

You now have traceable information about all jobs that deployed artifacts. This is good, because you now know how each artifact ended up in Artifactory, who put it there, and when. You have information about the job that created it, and you know what else was published by each job. This is a great deal more information than you have with the traditional Maven approach.

Staging/Promoting Artifacts in Jenkins

Using the Jenkins/Artifactory integration, it’s possible to conveniently trigger both staging and promotion from Jenkins.[11]

11 You’ll need the commercial Artifactory Pro in order to use all the features.

To use this integration feature, you need to activate it in the respective Jenkins build job and do some simple configuration, as shown in figure 7.23. You need to first enable the release management feature by clicking the check box, and then configure a VCS tag base URL. You don’t need to configure any VCS credentials, because the Jenkins/Artifactory integration will take the settings that you’ve already configured in the dedicated VCS configuration section of Jenkins. Optionally, you can force Jenkins to resolve all artifacts from Artifactory, which can further improve the quality of your builds, because you ensure that all artifacts (such as compile dependencies) are pulled from Artifactory, not from any other location. As a result, this quality gate overwrites any other settings developers may use in their individual workspaces.

Figure 7.23. Configuring the Jenkins build job to use the Jenkins/Artifactory release management functionality. The VCS base URL for this Jenkins build job must be specified. Among other options, you can force Jenkins to resolve all artifacts from Artifactory during builds.

For Maven projects, the Jenkins/Artifactory combination performs the following steps to stage a project:

  1. Change the POM version from snapshot to release (this also applies to multi-module builds).
  2. Trigger the Maven build.
  3. Commit the changed sources to VCS. This will trigger a new build in Jenkins, the release build, which will deploy the released modules to Artifactory.
  4. Change the POM version to the next development version (which is again a snapshot version).
  5. Commit the changes to VCS. This will trigger a new build in Jenkins, the next development build, which will deploy the new snapshot versions to Artifactory.

After activating and configuring the release management facility, an Artifactory release management staging link appears in the left panel of your Jenkins build job page. Clicking the link opens a new page in Jenkins to configure and trigger the staging process, as shown in figure 7.24. To stage the artifacts that were produced by a past Jenkins build, specify the last built version that will be the base for staging, configure the new versions for your Maven modules, and optionally create a tag in VCS. Finally, configure the target repository where you want to stage the release to. The target repository is a logical repository inside Artifactory. Clicking the Build and Release to Artifactory button starts the staging process.

Figure 7.24. Staging the artifacts that were produced by a past Jenkins build. Before starting the staging process, you must configure versions and a target repository.

Staging is wrapped in Jenkins builds, so you can open the Jenkins console for these builds and read the output of these jobs interactively and after the fact. In Jenkins, a successful release build is marked with a special icon beside the job in the job history.

After the staging is done and a release build has been finished successfully, you can promote the build. Promoting the build means that the build is moved (or copied, depending on how you configure the staging process) to another logical repository in Artifactory, without rebuilding (see figure 7.25).

Figure 7.25. Promoting a build from inside Jenkins requires selecting a target promotion repository. You can configure it to include dependencies and specify whether you want to copy the artifacts in Artifactory or move them.

Promoting built artifacts from inside Jenkins is a convenient way to put artifacts into a different repository location while still gaining from traceability. As part of this traceability, the Build Browser (see figure 7.20) is updated to show staged or released as the release status of Jenkins builds that were deployed as artifacts to Artifactory. Additional information can be found in the Release History tab of Artifactory’s build detail page (see figure 7.21).

Jenkins, Artifactory, and Maven provide considerable functionality. Another approach is to use Git and the git-svn bridge for feature branch–driven CI.

7.5. Using Git and git-svn bridge for feature branch–driven CI

This section contributed by René Gielen

The mainline of a feature development phase—also called trunk or head if the development isn’t taking place on a branch of the source code—is the unique line of development in the VCS. The mainline often also contains the latest revisions of the software’s features.

 

Using Both Subversion and Git

The git-svn bridge is often used to enable the use of both the Git and Subversion VCSs in parallel. This can be a good approach if you want to softly migrate from one tool to the other, or if your organizational structure requires you to use one tool for managing source code centrally (for instance, Subversion) but where developers have the freedom to use other tools in addition (such as Git).

 

In almost every case, a central CI job is set up to be triggered by check-ins to the mainline. Developers work on the mainline; they check out code, change code, and commit changes. They are notified by the central CI job about the results of their and their colleagues’ commits. Figure 7.26 illustrates the standard workflow (without feature branching).

Figure 7.26. Mainline CI without feature branching

This is a good approach, because you always want to ensure that the mainline integrates properly, unless you’re willing to sacrifice the benefit of having a continuous line of release candidates. Nevertheless, a number of problems might arise when mainline CI is the only build and test automation taking place:

  • Although a feature that a developer is currently working on might be far from complete and still subject to heavy changes and refactoring, their solution steps related to commits will continuously be integrated against other developers’ work. There’s a good chance that more integration problems will have to be resolved, as compared to an approach where the developer’s work wouldn’t have to be integrated against the team’s work until their feature is completed.
  • If the team embraces the “commit early, commit often” policy, the triggered CI builds will often include changes from more than a single commit, given the common case where the next build job isn’t allowed to fire unless the previous CI job is completed. Therefore, the features of the various change-sets that are committed, while the previous build blocks the following CI’s turn, might bleed into each other. If a build fails, each developer who committed during the previous build will be notified by the CI system as potentially having caused the problem. Even though only one or two of them would be to blame, each of these developers—maybe most of the team, for complex and long-running CI cycles—will have to interrupt their work to check whether they’re at fault for the broken build. This approach will often require updating and merging the local working copy of the code with the code in the VCS mainline, and this loss of focus might decrease the team’s productivity.
  • Forcing the CI system to fork build jobs unconditionally on any commit doesn’t solve the previous problem either. It will make it harder to investigate the last successful build and determine whether it has been completed, and it will be harder to determine the appropriate cumulated change-set that’s the target for investigation to find the problem and create a solution. In addition, if Developer A is to blame (or partly to blame) for an integration problem, which a later commit of Developer B reveals by breaking a CI build, then Developer B—as the only team member being notified—has to investigate the full problem and the cumulated change-set in doubt. This may potentially result in them having to notify developer A to check whether their commit may have broken the build. This process foils the idea of automatic detection of problematic change-sets and automatic and well-targeted developer notification.
  • Until the developers in charge of fixing a broken CI build have succeeded, the CI system will be useless to the rest of the team because their commits will always result in broken builds due to previous errors. Meanwhile, the change-sets that need to be investigated for a possible consecutive CI break might pile up significantly.
  • Being blamed for breaking the team’s CI is something a developer will try to avoid, particularly when their commit causes the project build to break; this means that each team member fetching their changes won’t be able to build the project locally until the fix is applied. Consequently, developers might be tempted to double-check that their upcoming commits won’t break the build, resulting in a process of updating and merging the local repository with the latest mainline revision followed by issuing a full project build. Doing this immediately wipes out the advantage of shortened turnaround cycles by shifting long-running tasks to the build server.
  • A common side effect of the team dealing with the previously described problems is that the individual developer will tend to pile up their work results and not commit to the VCS until they regard the feature they are focusing on as being completed. This clearly violates the “commit early, commit often” policy, leading to implications such as change-sets that are huge, poorly documented, and lacking safe rollback points during the feature development process.

This is an impressive list of possible problems. Teams need to see positive results in the form of improved build processes or enhanced productivity in order to maintain the acceptance and support for implementing improved practices such as CI. Without visible results, support for CI will drop dramatically over time.

7.5.1. Feature branching comes to the rescue

The concept of feature branching addresses most of the problems previously mentioned. The idea is pretty simple: Each developer is given an isolated branch in the VCS to use for their changes for as long as is necessary to implement a specific feature. Reaching this milestone, they would then merge the cumulated changes on their feature branch back to the project’s VCS mainline, which will then trigger the mainline CI job to check for proper integration.

 

Feature branching and CI

Some people claim that feature branching strictly conflicts with CI, because CI suggests that you should focus on a single VCS mainline (the head) and you shouldn’t branch in VCS (or at least should minimize branching). Too many branches can lead to delays in the development flow, big merging efforts, and overall communication fragmenting; having a single code line in the VCS means that you have a central, single synchronization point.

Depending on your specific requirements, feature branches can be the best approach for a given problem. If your task is to migrate major parts of the software to another solution, Martin Fowler suggests applying an approach named branch by abstraction instead of feature branching. See his “Feature Branch” discussion at http://martinfowler.com/bliki/FeatureBranch.html.

 

In an environment that uses Subversion as the VCS, which I have found to be common, the process would be similar to the following. First the developer starts work on a feature by creating a feature branch. They then switch their working copy to that branch:

> svn copy http://svn.myorg/ourproject/trunk \
      http://svn.myorg/ourproject/branches/myfeature \
      -m "Starting work on feature myfeature"
> cd checkout/ourproject
> svn switch http://svn.myorg/ourproject/branches/myfeature .

The developer starts working on the feature, issuing commits early and often. When finished, they reintegrate their work back to the trunk:

> svn switch -r HEAD http://svn.myorg/ourproject/trunk .
> svn merge --reintegrate \
      http://svn.myorg/ourproject/branches/myfeature
> svn commit -m "Merged myfeature into trunk"

Although working with Subversion for feature branching is possible nowadays, it hasn’t always been ideal. The reintegration of the full commit history when merging a branch into the mainline wasn’t available before Subversion 1.5, and support for managing conflicting merges is historically not considered a forte of Subversion.

Here’s where one would argue that this is a perfect use case for a distributed VCS, such as Git. In contrast to a server-based VCS such as Subversion, the concept of a local working copy is replaced by cloning a central master repository into a fully featured local repository, on which changes are made directly. Commits always affect only the local repository. To reintegrate the changes, the Git user pushes a chosen change-set, which consists of various commits and their commit messages, back to the master repository.

The developer starts their work by cloning the master repository locally:

> git clone git://git.myorg/ourproject .

The developer starts working on the feature, issuing commits early and often. When finished, they reintegrate their work back to the master repository:

> git push

Given that this is the native and recommended way to work with Git, it should be used as a solution for establishing a feature branch–driven process, because Git has a good reputation for automatic conflict resolution—even in the case of complicated conflicting changes.

Regardless of whether the team decides to use Subversion or Git for a feature-branching process, in both cases, the CI setup can be configured to define additional build jobs for each feature branch. An ongoing discussion is occurring about whether these jobs should be called continuous building instead of continuous integration, which reflects the conviction that “real” integration checks can happen only against the mainline.[12] In my opinion, running feature-branch build jobs with a CI server can still be considered CI, because in such a setup, the change-sets of an individual developer are continuously integrated against a frozen state of the overall project, given that the feature-build job will incorporate automatic testing and validation. Combined with mainline CI, it might be seen as a staged CI. Figure 7.27 illustrates this setup.

12 See Martin Fowler on feature branching at http://martinfowler.com/bliki/FeatureBranch.html.

Figure 7.27. Feature-branching CI

The advantages of this approach are quite obvious:

  • The problem of having the integration of unstable code happen too early, causing more integration problems than necessary, is addressed by a process in which only complete features are integrated back into the mainline.
  • A commit-triggered feature-branch CI build will always cover minimal change-sets by having only the one developer working on that branch, making it easy to focus notifications and investigate problems in the case of breaking integration.
  • Breaking feature-branch CI won’t affect any other team members, even if the project build breaks. The developer won’t be blamed for holding up the team, and they can be truly confident in delegating full project builds and testing to the build server without causing harm. The individual developer can take full advantage of partial build and isolated testing features in their development environment and increase productivity and focus.
  • The “commit early, commit often” policy won’t negatively impact team productivity and developer reputation as described earlier, highly motivating the individual team member to embrace this policy.
  • A well-organized feature-branching setup allows for cherry-picking features in the deliverable product.
  • Build server job setups can easily be extended to do automatic deployments to, say, a testing environment, after successfully accomplished CI builds. This enables the developer to manually test the outcome without having to wait for a local build and deployment.

7.5.2. The Lone Ranger problem and how the git-svn bridge can fix it

Having a convincing concept isn’t enough—it has to stand up to a reality check. If you try to move to a feature branch–driven process that includes CI, you might encounter some unexpected obstacles.

Let’s imagine a senior developer with a rather progressive mindset, maybe even a contractor working on a rather traditional company’s project as part of the team. Let’s call him the Lone Ranger, fighting for a better world where software craftsmanship is regarded highly. He wants to take advantage of the described benefits but suddenly must face one or more of the following problems:

  • The company refuses to provide a CI infrastructure. Whether this is reasonable or not, the contractor has to accept it as a political decision.
  • The company, the infrastructure team, or the development team refuses to establish a Subversion branching policy, maybe because they fear the increased complexity of the VCS setup and handling.
  • Switching to Git isn’t an option. Maybe the infrastructure team doesn’t have the knowledge or resources to set up a suitable Git infrastructure, or maybe the company recently switched from CVS to Subversion and refuses to reverse this decision, preferring to switch to a better product when it comes along. Maybe the Lone Ranger was the one who convinced them to switch to Subversion, and his reputation would be damaged if he now suggested switching to yet another product so soon after making his initial recommendation.
  • The provided CI infrastructure isn’t able to deal with the increased load of many parallel builds when defining jobs per feature branch in addition to the mainline CI.

It seems like there’s no chance for a happy ending. But the Lone Ranger always has to win in the end, doesn’t he? Let’s see if we can manage to help him with that.

Both main problems—the lack of a CI infrastructure and the lack of a suitable VCS setup—can be addressed. Today’s software development workstations are, in most cases, comparable to small servers, with multicore CPUs and a good amount of main memory. Such a workstation is astonishingly well suited to running a background CI build server, an IDE, and whatever else a developer needs. In addition, with a CI system running in the background, the developer can drop time- and resource-consuming full project builds in the foreground in favor of partial builds and isolated tests. Setting up a local build server for CI takes only minutes using free products such as Jenkins or TeamCity. We then have to configure a proper build job, which requires a repository that the CI job can watch for commits and check out from for the CI build.

Here is where the git-svn bridge comes in handy. The bridge makes it possible to work with a local Git repository as if it were in a full Git-based working environment. The only change is that the role of the master repository is assigned to the conventional central Subversion repository. The git-svn bridge translates (transparently) most of the Git push and pull operations to the Subversion repository instead of to a Git master repository.

Let’s see how it works: Initialize a local Git repository to act as a Subversion repository clone, assuming that the repository has a standard layout as recommended by Subversion:

> git svn init -s http://svn.myorg/ourproject

Fetch the Subversion change history into the local Git repository:

> git svn fetch

For Subversion repositories containing a rather small number of changes, this will work quickly. For larger repositories, you’ll want to limit the number of historical changes to fetch, which can be accomplished with the poorly documented but functional --log-window-size option:

> git svn fetch --log-window-size 1000

Configure the local CI job to watch and check out from the local Git repository. Start working on the feature in focus, issuing commits early and often:

> git commit -a \
-m "Refactored foo to separate interface and implementation"

When the feature is completed, push the changes to the central Subversion repository:

> git svn dcommit

Pull the latest changes from the Subversion repository to start working on the next feature:

> git svn rebase

Although the cleanroom process doesn’t recommend fetching changes from the central repository during a local, uncompleted feature development cycle, you might nevertheless face the real-world requirement to do so. Because Git won’t rebase the local repository if the working tree contains uncommitted changes, you’ll have to decide how to move those changes out of the way. Obviously, you could commit and push the local changes to the master repository, but the developer wouldn’t want to do so unless they consider their feature to be completed. To perform a probably more desirable local merge, similar to the behavior of svn update, you can use Git’s stashing feature as follows:

  1. Move your uncommitted local changes to a safe hidden place to prepare for pulling changes from the central repository:
    > git stash
  2. Fetch the changes from Subversion into your local Git repository:
    > git svn rebase
  3. Apply the hidden local changes by merging them back into your local repository, which now represents the updated state from Subversion:
    > git stash pop

With this in place, the happy ending for the Lone Ranger’s mission is within reach.

This was a focused introduction to a set of Git’s features, enabling you to deal with a specific use case without diving too deeply into a complex tool and its implied workflows. Whenever the word branch is used in conjunction with Git, most users with Git experience will have Git’s extremely powerful native branching and merging features in mind, which are intentionally not used in the workflow this section describes. Nevertheless, if a developer is new to Git and starts to embrace it, we recommend that they learn about Git’s branching and other advanced features.

7.6. Summary

In this chapter, we discussed CI, tooling, and strategies. You learned how to integrate all artifact types even with legacy technologies such as Cobol. You saw how an approach using a platform or language, such as Java, can be used to drive and manage the build of other languages or platforms. Where available, you can follow the second approach: use what exists for this platform and integrate the native build scripts with common build servers.

We discussed using .NET without all of the associated Microsoft tools. Although it’s a proprietary platform, .NET can be handled with common tools. You don’t have to use a complete proprietary toolchain; you can use lightweight tools (such as Subversion) to store your artifacts and to manage your builds. In one case, we dropped an Ant script into a CI server. In a second case, we dropped the MSBuild script into TeamCity.

We also discussed advanced configuration and staging recipes. One strategy involved building applications for different environments, and you saw that you should build your artifacts once and promote them to higher environments by plain configuration without rebuilding. It’s also possible to use handmade scripts to scan your data-driven application configurations and automatically replace context-sensitive data, or to use standard approaches like Java properties. But these strategies have limits and aren’t always the most efficient. In the other strategy, you learned how to use Maven to generate builds that run on multiple environments.

Two major CI servers, Jenkins and TeamCity, were part of our discussion. In general, both tools are similar. With Jenkins, you saw how to build, audit, and stage your Maven-based software. Here, the bridge to Artifactory was of special interest. TeamCity can also manage build scripts in a sophisticated way. We briefly prototyped how to drop .NET builds into a CI server and visualized how a build farm and an EC2 profile are connected.

We also talked about using feature branches with Git and Subversion. With this approach, you can use Subversion for version control and can also profit (or softly migrate) by using Git for feature branches.

In the next chapter, I’ll describe strategies and tools for collaborative and barrier-free development and testing. Starting with a data-driven approach, we’ll prototype acceptance tests and move on to behavior-driven development.