
Tracking file usage across Jenkins jobs using fingerprints

When you have interdependent projects on Jenkins, it often becomes hard to keep track of which version of a file is used by which build of the projects that depend on it. Jenkins supports file fingerprinting to track these dependencies.

For example, suppose you have the TOP project that depends on the MIDDLE project, which in turn depends on the BOTTOM project. You are working on the BOTTOM project. The TOP team reports that the bottom.jar they are using causes a NullPointerException (NPE), which you (a member of the BOTTOM team) thought you fixed in BOTTOM #32. Jenkins can tell you which MIDDLE builds and TOP builds are using (or not using) the bottom.jar from BOTTOM #32.

How do I set it up?

To make this work, all the relevant projects need to be configured to Record fingerprints of the jar files (in the above case, bottom.jar).

For example, if you just want to track which BOTTOM builds are used by which TOP builds, configure TOP and BOTTOM to record the fingerprint of bottom.jar. If you also want to know which MIDDLE builds are using which bottom.jar, also configure MIDDLE.

Since recording fingerprints is a cheap operation, the simplest thing to do is to record fingerprints of all of the following:

  1. jar files that your project produces
  2. jar files that your project relies on

Disk usage is affected more by the number of files fingerprinted than by the size of the files or the number of builds in which they are used. So unless you have plenty of disk space, you don't want to fingerprint **/*.
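Because the file count is what drives disk usage, it can help to check how many files a pattern would match before configuring it. The following is a minimal sketch, not part of Jenkins: count_matches is a hypothetical helper, and it uses Python's pathlib globbing, which only approximates the Ant-style patterns Jenkins accepts.

```python
from pathlib import Path


def count_matches(workspace, pattern):
    """Count regular files under `workspace` that match a glob pattern.

    Hypothetical helper: pathlib globbing is used here as an
    approximation of the Ant-style patterns used in the job config.
    """
    return sum(1 for path in Path(workspace).glob(pattern) if path.is_file())
```

For instance, comparing count_matches(workspace, "**/*.jar") with count_matches(workspace, "**/*") on a real workspace gives a rough idea of how many more fingerprint records the broader pattern would create per build.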

Configure a job to Record Fingerprint(s) of a file or set of files

Go to your project, click Configure in the left navigation bar, then scroll down to the Post-build Actions section of the job.

Click on the button to add a Post-build action.

Select Record fingerprints of files to track usage.

The post-build action configuration provides a pattern field to match the files you want to fingerprint, as well as a couple of check boxes that control how fingerprinting is done.

The Maven job type does this automatically for its dependencies and artifacts.

How does it work?

The fingerprint of a file is simply its MD5 checksum. Jenkins maintains a database of MD5 checksums, and for each checksum it records which builds of which projects used the file. This database is updated every time a build runs and files are fingerprinted.

To avoid excessive disk usage, Jenkins does not store the actual files. Instead, it stores only the MD5 checksums and their usages. These records can be seen in

$JENKINS_HOME/fingerprints
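The idea can be illustrated with a small sketch. This is a toy in-memory model, not Jenkins' actual implementation or storage format; the names md5_of_file and FingerprintDB are hypothetical and exist only for this example.

```python
import hashlib
from collections import defaultdict


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


class FingerprintDB:
    """Toy stand-in for the fingerprint database (hypothetical)."""

    def __init__(self):
        # MD5 hex digest -> set of (project, build_number) that used the file.
        # Only the checksum is kept, never the file contents.
        self.usages = defaultdict(set)

    def record(self, path, project, build_number):
        """Fingerprint a file and note that the given build used it."""
        checksum = md5_of_file(path)
        self.usages[checksum].add((project, build_number))
        return checksum

    def used_by(self, path):
        """Return the sorted (project, build_number) pairs that used this file."""
        return sorted(self.usages.get(md5_of_file(path), set()))
```

Two files with identical content map to the same record regardless of their names or locations, which is what lets a checksum recorded in BOTTOM be matched against one recorded later in TOP.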

Plugins can store additional information in these records. For example, the Deployment Notification Plugin uses fingerprints to track files deployed on servers via Chef or Puppet.

How can I use it?

Here are a few typical scenarios that benefit from this feature:

You develop the BOTTOM project and you want to know which projects and builds are using BOTTOM #13.

  1. Go to BOTTOM #13 build page.
  2. Click the "fingerprint" icon of bottom.jar in the build artifacts.
  3. You'll see all the projects and builds that use it.

You develop the TOP project and you want to know which build of bottom.jar and middle.jar you are using in TOP #10.

  1. Go to TOP #10 build page.
  2. Click "see fingerprints".
  3. You'll see all the files fingerprinted in TOP #10, along with where they came from.
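The "where they came from" lookup in this scenario can be sketched as follows. This is an illustrative model only: the Usage record and the files_in_build function are hypothetical, and the sketch assumes each usage record notes whether the build produced the file or merely consumed it.

```python
from typing import NamedTuple


class Usage(NamedTuple):
    checksum: str    # MD5 hex digest of the file
    file_name: str
    project: str
    build: int
    produced: bool   # True if this build created the file (assumption)


def files_in_build(usages, project, build):
    """Map each file fingerprinted in a build to the build that produced it.

    Returns {file_name: (project, build)}; the value is None when no
    recorded build produced the file, i.e. it came from outside the
    tracked jobs.
    """
    origin = {}
    for usage in usages:
        if usage.produced:
            # Keep the first producer seen for each checksum.
            origin.setdefault(usage.checksum, (usage.project, usage.build))
    return {
        usage.file_name: origin.get(usage.checksum)
        for usage in usages
        if usage.project == project and usage.build == build
    }
```

With records for BOTTOM #32 producing bottom.jar and TOP #10 consuming the same checksum, files_in_build(usages, "TOP", 10) reports bottom.jar as originating from ("BOTTOM", 32), while a third-party jar that no recorded build produced maps to None.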

You have the TOP project that builds a jar. You also have the TOP-TEST project that runs after the TOP project and does extensive integration tests on the latest TOP bits. You want to know the test results of TOP #7.

  1. Go to TOP #7 build page.
  2. Click the "fingerprint" icon of top.jar in the build artifacts.
  3. You'll see all the TOP-TEST builds that used it.
  4. Click one of them and you'll be taken to the corresponding TOP-TEST build page, which will show you the test reports.
  5. If no TOP-TEST builds are displayed, TOP-TEST did not run against TOP #7. Maybe it skipped TOP #7 (this can happen if there are a lot of TOP builds in a short period of time), or maybe a new TOP-TEST build is still in progress.

11 Comments

  1. It still isn't obvious to me how the artifacts are promoted between dependent projects. To refer to your example, how does the artifact bottom.jar from project BOTTOM get into the build of project TOP? Is there any support from Hudson or any plugin? Or do I have to deal with it myself through a script action, e.g. by copying to a shared directory or checking in and out of my repository? Please detail the scenario, or at least give a best practice.

  2. What's the definition of "the file has been used in ..."? How does Hudson determine whether the files are used in a certain build or not?

    Another question: what does "original owner" mean? I see this on the "Recorded Fingerprints" page and it says "outside Hudson".

  3. Indeed it would be great to have more information about this fingerprints mechanism. I used it with a bunch of Maven jobs but it's not very clear and it does not seem to work as expected, at least as I can understand from this documentation...

    For Maven jobs it would be great to have the option to disable fingerprint recording, since it sometimes messes up job chains.

  4. The restriction with promotion using fingerprinting is that it doesn't work with multi-configuration projects.

    1. Dan, what is your source for this comment? Are there any alternatives or workarounds to using promotions in a multi-configuration project?

  5. This is a nice first post for file fingerprinting, but the natural question to ask is "how, specifically, do I do it?".  I Googled around and discovered a page (not on this Wiki) that details how to do it, but the instructions did not apply (a version issue?).  I was looking to troubleshoot why my unit tests do not aggregate.  I saw a forum post made by Mr Kawaguchi in which he suggests setting up file fingerprinting.  I was caught in an endless cycle, I thought.  I found the answer in a blog called thingsyoudidntknowaboutjenkins.  I will make an edit here to provide the answer.

  6. Attempting to fingerprint build artifacts will fail your build if you do not archive your build artifacts as well.  I discovered this by having my builds fail.  I added a note to my Wiki edit above on this matter.

    1. Does anyone have an explanation for why it fails?  I'm guessing it has something to do with when/where the md5's are generated.

      What I would like to do is have the files from my workspace fingerprinted and that's it.  I want to do my own archive management using my Artifactory server and have none of the artifacts stored by Jenkins.  I haven't actually tried it out yet, so maybe things have changed since September? 

      1. I created a small test job that fingerprints a file that is not archived and it is working.  Since I do not know enough about the internal workings I cannot say whether it will always work or not.

        I'm using Jenkins 1.514.

  7. Is there any downside to using fingerprinting? Is there a noticeable performance impact or something?

    Aka, why is it not just on all the time as default in Jenkins? 

    1. It will slow things down for every file that is archived, and every file that is used in another job.

      It will also use disk space to store them all, potentially unbounded.

      Unless only a few of your files change, you probably don't want to enable it for everything.