Jenkins : JENKINS-41661 Post-Mortem

On February 2nd, CloudBees released several plugins which required a synchronized release. To achieve this synchronization, we preemptively blocked those releases from the public update center. 

Unfortunately, there were typos in the preemptive block which allowed the github-branch-source 2.0.1 and cloudbees-bitbucket-branch-source 2.0.2 releases to still be visible in the update center. We didn't realize this until more than 18 hours after the release.

Bug

JENKINS-41661

Severity

Severe - Unable to use Jenkins

Executive Summary

GitHub Branch Source 2.0.1 and Bitbucket Branch Source 2.0.2 were available in the update center, but their dependencies were not. As a result, Jenkins would not completely start.

User Impact

The ticket has 25 watchers. Jenkins does not fully load.

How did we find out?

User reports

Incident Type

Regression

Mitigation efforts

We removed the plugins from the update center and proactively scrubbed the metadata from mirrors. We also provided a zip file of all the needed dependencies on the ticket.

How did this bug escape?

We failed to test the update center after releasing the plugins. In addition, the Jenkins update center does not protect against this problem, even though it has enough data to do so.

So while we should have tested this, Jenkins should do a better job of protecting users from this situation.
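As a rough illustration of the kind of protection meant here, the sketch below scans update center metadata for plugins whose required dependencies are missing or too old. It is not how Jenkins implements anything. It assumes the metadata has already been parsed into a dict with a "plugins" map whose entries carry "version" and "dependencies" fields. The function names, the simplified version comparison, and the sample dependency versions are all made up for the example.

```python
def parse_version(version):
    """Simplified comparison key: leading numeric components only.

    Assumption: this ignores qualifiers such as "-beta-1", which real
    Jenkins version comparison handles with more care.
    """
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def find_unsatisfied_dependencies(metadata):
    """Return (plugin, dependency, required, offered) for each broken edge.

    `offered` is None when the dependency is not published at all.
    """
    plugins = metadata.get("plugins", {})
    problems = []
    for name, plugin in sorted(plugins.items()):
        for dep in plugin.get("dependencies", []):
            if dep.get("optional", False):
                continue  # optional dependencies cannot block startup
            offered = plugins.get(dep["name"])
            if offered is None:
                problems.append((name, dep["name"], dep["version"], None))
            elif parse_version(offered["version"]) < parse_version(dep["version"]):
                problems.append(
                    (name, dep["name"], dep["version"], offered["version"])
                )
    return problems


# Hypothetical metadata: the dependency versions are illustrative only.
SAMPLE = {
    "plugins": {
        "github-branch-source": {
            "version": "2.0.1",
            "dependencies": [
                {"name": "scm-api", "version": "2.0.2", "optional": False}
            ],
        },
        "scm-api": {"version": "1.3", "dependencies": []},
    }
}

for plugin, dep, required, offered in find_unsatisfied_dependencies(SAMPLE):
    print(f"{plugin} needs {dep} >= {required}, update center offers {offered}")
```

A check like this could run either when the update center metadata is generated or on the client before an update is offered. In both places the needed data is already in the metadata.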

Actions to prevent this from happening in the future

If we put an update center block in place, we should test it once the releases are available. However, this generally seems to be a fragile approach and I think it should be avoided.
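A test of the block could look roughly like the sketch below. It compares the releases that are supposed to be held back against what the update center actually offers. It assumes the update center metadata has been parsed into a dict with a "plugins" map keyed by plugin ID, each entry carrying a "version" field. The function name, the held-back list format, and the scm-api versions are hypothetical.

```python
def find_leaked_releases(metadata, held_back):
    """Return (plugin, version) pairs that are visible but should be blocked.

    `held_back` maps plugin ID to the version that must not be offered yet.
    It is written independently of the block configuration, so a typo in
    the block itself still shows up as a leak here.
    """
    plugins = metadata.get("plugins", {})
    leaked = []
    for name, version in sorted(held_back.items()):
        offered = plugins.get(name, {}).get("version")
        if offered == version:
            leaked.append((name, version))
    return leaked


# Hypothetical data: the scm-api versions are illustrative only.
HELD_BACK = {
    "github-branch-source": "2.0.1",
    "cloudbees-bitbucket-branch-source": "2.0.2",
    "scm-api": "2.0.2",
}
METADATA = {
    "plugins": {
        "github-branch-source": {"version": "2.0.1"},
        "cloudbees-bitbucket-branch-source": {"version": "2.0.2"},
        "scm-api": {"version": "1.3"},
    }
}

for name, version in find_leaked_releases(METADATA, HELD_BACK):
    print(f"LEAK: {name} {version} is visible in the update center")
```

Keeping the held-back list separate from the block configuration is the point of the design: checking the block against itself would not have caught the typos.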

CloudBees should instead use a staging server like Nexus to perform coordinated releases (when necessary) so that they can be pushed to the Jenkins infrastructure as a whole.

Finally, we should fix the Jenkins update center so that it doesn't leave the user with a broken installation when the update center is in an inconsistent state like this (JENKINS-23757).