Goals
- Consolidate primary project servers to the OSUOSL
- WHY: Simplify server management and take advantage of the shared resources and services offered by the OSUOSL.
- Migrate to Puppetmaster-based configuration management infrastructure
- WHY: Gain inventory and status reporting from Puppet-managed nodes, make it easy to bootstrap new servers, and address the lack of orchestration and of a quality/delivery pipeline for Jenkins' own infrastructure
- Migrate key asset ownership to the SPI
- WHY: The bus factor is too low for domains and server ownership, which currently sit with Unknown User (rtyler)
- Upgrade core developer services, e.g. Confluence, JIRA.
- WHY: Some of the core developer services have reached end-of-life status with their vendors, making the continued secure operation of those services impossible.
- Revamp infrastructure monitoring
- WHY: The current Nagios-based monitoring infrastructure is unreliable and gives insight mostly into the state of machines rather than services. Downtime is more often noticed via Twitter/IRC than via alerts.
- Clean-up/prune mirroring system
- WHY: The primary mirrors (OSUOSL) and the secondaries are not intended to be the binary archive for all historical releases, and they need to save disk space for other projects
Projects
Migration to a modern Puppet infrastructure
Owner: Unknown User (rtyler)
Phases of work
- Set up Puppet Enterprise; this should be done in an automated fashion (no snowflakes!)
- Create a continuous delivery pipeline in Jenkins for our infra
- Define a good testing/deployment workflow that Jenkins and Puppet Labs can point to and say "this is how things should be done in 2014+"
- Add first production host to PE and start provisioning it via the CD pipeline
- Blog about progress thus far
- Migration strategy
- Developer testing and deployment pipeline
- ??
- Shift existing infrastructure over to the new model piece by piece
- Move primary/core services already managed by masterless Puppet, e.g. DNS, WWW, package repos
- Puppetize less-critical, more hacked-together services (e.g. butlerbot, pieces of MirrorBrain)
Assigning ownership of key assets to SPI
Owner: Unknown User (kohsuke)
Core user/developer Service Upgrades
Owner: Unknown User (rtyler)
Phases of work
- Puppetize JIRA installation
- Puppetize Confluence installation
- Migrate production JIRA
- Migrate production Confluence
Mirror cleaning
Owner: Unknown User (kohsuke)
*COMPLETED*
Monitoring Revamp
Owner: Unknown User (rtyler)
Phases of work
- Evaluate different monitoring solutions
- Sensu - overly complex for monitoring our small deployment
- collectd
- Nagios
- Deploy/monitor one node
- Roll out to rest of cluster
Jenkins JIRA: Components/workflow refactoring
Owners: Unknown User (danielbeck), Unknown User (oleg_nenashev)
More info: 2014 JIRA Components Refactoring
Jenkins IRC Bot Improvements
Owner: Unknown User (oleg_nenashev)
Phases of work
- Unknown User (rtyler) - Puppetize IRC Bot => DONE
- Unknown User (oleg_nenashev) - Describe the deployment flow on GitHub/Wiki => DONE
- Unknown User (oleg_nenashev) - Introduce new commands (component management, etc.) => IN_PROGRESS
- Unknown User (oleg_nenashev) - Improve IRC Bot's testability (local debug, test suites) => IN_PROGRESS