Banish (Okay, Reduce) Downtime with IT Automation

Unplanned outages are the bane of a sysadmin’s job. More than just a problem in itself, downtime is a telling symptom of existing issues in IT processes and architecture.

Many of the issues that lead to outages can be resolved with careful planning and automation tools. Here are some of the major ones.

Technical debt

Many organizations have layer after layer of one-off processes and routines. Some were created as temporary fixes, but were never dropped. Others are outdated processes that once had a legitimate purpose, but are now just clutter. This pile-up of technical debt eventually causes outages, and the causes of these outages are difficult to analyze — sometimes you don't even know where to start looking.

Automation gives you the ability to query your infrastructure, determine what you actually have, and start pruning and fixing in line with current business needs and goals. Automation also enables your system to remediate configuration drift regularly — automagically! — so you avoid building up so much technical debt to start with.
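To make the idea of drift remediation concrete, here is a minimal sketch in Python. It is not how any particular configuration management tool works internally. The `DESIRED` dictionary, the service names, and the `find_drift` and `remediate` helpers are all hypothetical, and "actual" state is faked with a dictionary rather than queried from real machines.

```python
# Hypothetical desired state, the kind of thing you would keep in version control.
DESIRED = {"ntp": "running", "sshd": "running", "telnet": "stopped"}


def find_drift(desired, actual):
    """Return {resource: (desired_value, actual_value)} for every mismatch."""
    drift = {}
    for name, want in desired.items():
        have = actual.get(name, "missing")
        if have != want:
            drift[name] = (want, have)
    return drift


def remediate(desired, actual):
    """Return a corrected copy of `actual`; unmanaged resources are left untouched."""
    fixed = dict(actual)
    for name in find_drift(desired, actual):
        fixed[name] = desired[name]
    return fixed


if __name__ == "__main__":
    # Assumed observed state of one server (illustrative values only).
    actual = {"ntp": "stopped", "sshd": "running",
              "telnet": "running", "cron": "running"}
    print("drift before:", find_drift(DESIRED, actual))
    print("drift after: ", find_drift(DESIRED, remediate(DESIRED, actual)))
```

Run on a schedule, a check like this reports drift before it piles up, and applying the remediation twice changes nothing the second time, which is the property that makes regular runs safe.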

Uncoordinated, uncontrolled changes

Whether you have a change control board or not, you must be able to control and inspect changes in a structured way. You should be able to stage changes before they are made, and automatically incorporate code review.

Managing change in a spreadsheet is both difficult and highly prone to error; you need a version control tool to do it right, especially as your infrastructure and software become increasingly complex.

Absence of reproducible server setup

Here’s where configuration management software comes in. By automating the setup of physical servers and virtual machines, you eliminate manual errors and get:

  • The ability to set up development, testing and staging environments that accurately reflect the environment where the software will actually run, resulting in code that's much more likely to work correctly when it's deployed to production.
  • The ability to manage your IT infrastructure with the same kinds of tools software engineers use to manage their complex workflows. As IT becomes more complex, due to innovations like virtualization and cloud, these same software management tools (such as version control) become as necessary for system administration as they are for software development. This approach to IT management is often called infrastructure as code.

Absence of automated testing and validation

Using automated testing and validation tools in combination with configuration management helps you ensure that your organization tests for the things that matter, at the right stages of development, and in environments that match production. Optimally, you should design your testing plan right into the development process.
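One small piece of this, checking that a test environment matches production, might look like the sketch below. The environment descriptions are made-up dictionaries and `parity_report` is a hypothetical helper. In practice the facts would come from your configuration management or inventory tooling rather than being typed in by hand.

```python
def parity_report(production, candidate):
    """List human-readable differences between two environment descriptions."""
    problems = []
    # Look at every key that appears in either environment.
    for key in sorted(set(production) | set(candidate)):
        prod = production.get(key)
        cand = candidate.get(key)
        if prod != cand:
            problems.append(f"{key}: production={prod!r}, candidate={cand!r}")
    return problems


if __name__ == "__main__":
    # Illustrative values only; not taken from any real system.
    production = {"os": "debian-7", "ruby": "1.9.3"}
    staging = {"os": "debian-7", "ruby": "2.0.0", "nginx": "1.4"}
    for line in parity_report(production, staging):
        print(line)
```

A check like this can run as an early step in a test pipeline, so a mismatched environment fails fast instead of producing test results you cannot trust.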

Avoiding downtime is just one of the benefits you get from automation. Others include:

  • Easier policy enforcement
  • Visibility, auditability and accountability
  • Consistency
  • Better code quality
  • Quicker recovery

You can read more about how automation frees you to do more interesting, strategic work and improves organizational performance in our new white paper: How IT Automation Will Make Your Organization More Successful (and Your Job More Interesting). If you already know these benefits and are trying to convince others in your organization, this may be just the piece you need to put into their hands.


Aliza Earnshaw is managing editor at Puppet Labs.
