25% Tech, 75% Culture: Build a Window in the Wall
As methodologies lead and former professional services engineer at Puppet Labs, Eric Shamow has stories. Like the one where the dev and ops teams were butting heads so hard that their manager thought it might be easier to just fire one group and start fresh.
"Dev and ops weren't talking to each other, not using the same toolset, really resenting each other. There was no way to get the teams on the same page," he remembers.
The manager's frustration was easy to understand. The organization was moving from a hidebound, proprietary infrastructure to something more nimble and open. Everyone seemed to understand how to behave in principle, but the silo walls were simply too high. It was a problem Eric had seen before, and the kind where everyone's instincts tend toward layering on process or tooling. "But most things like this are 25 percent tech, 75 percent culture."
This is familiar territory to people who have been trying to introduce DevOps culture to their organizations. Our own 2013 State of DevOps report ranks getting rid of cultural barriers up there with other, more technological changes that increase agility. Though tools and best practices are key, they won't make a difference if teams are walling themselves off. Eric says breaking down those silos allowed the manager to keep both groups.
Common Toolchain, Common Problems
The ops team was already eager to get started with software-defined infrastructure, but was just beginning to face the challenges that posed. For instance, once you've written the code that controls all your DNS servers, you should check it into version control. Version control is key to deploying a new configuration, but it's not always well understood by operations teams. The strategy this particular team had adopted wasn't working at all.
Eric told the ops team, "You should probably talk to your developers about their version control strategy, because all the things you're worrying about? They're worrying about that, too, and have to worry about it hundreds of times per day."
So with guarded permission from the ops team (Eric remembers a distinct mood of "on your head be it"), the dev team joined them for dinner. "I let ops lay out their problems, and the devs were immediately engaged. The best thing about that conversation is that I didn't even have to participate in it at all."
Once the ops team accepted the idea that the dev team might have some domain expertise in an area that mattered, a lot of things fell into place. The groups were talking again, and the ops team got some real benefit from the dev team's experience with version control.
"All I really did," says Eric, "was work really hard on hacking down the barrier between those two groups. Instead of letting them work in isolation, I told them to get the other people in the room, show them respect, and not ask for anything other than advice."
Within days of that conversation, says Eric, the dev group came back to ask the ops team for some advice about virtualization and functional testing. "It turns out ops had expertise dev could use too; somebody just needed to break the ice."
The key point, says Eric, is that while you want to get to "give and take," teams need to be comfortable sticking to "give" for a little while, until relations have thawed.
Sharing Information
Getting teams used to working in silos to share anything with each other is probably the hardest part, and technology is just a small piece of the puzzle. Once teams are communicating, says Eric, they have to start thinking about what DevOps practices are meant to accomplish, both in terms of improving agility and producing measurable business results.
"DevOps is about more than just development and operations. DevOps is about aligning IT with business needs," he says. "It's actually paying attention to what the business wants, not just what your group wants, and seeing the toolchain end to end."
That can be a hard process for ops teams, unless they learn to communicate their own requirements better.
"The common thing I have to point out to ops," says Eric, "is that you have to tell developers what information you need, and how to give it to you." When ops teams fail to specify clear deployment requirements, developer teams are left figuring out deployments on their own. Left to their own devices, developers don't always deliver something deployable. That slows the deployment process and leads to frustration and resentment for everyone.
"The key problem," says Eric, "starts with visibility." He remembers a situation where "everything was a giant mess," with app deployments taking up to three weeks. The developer and operations teams alike kept changing requirements and processes, creating more confusion and frustrating both teams.
Eric came face to face with the dysfunction when he asked a developer to reconfigure the application to run on the correct server, and the developer wrote back claiming he didn't know which servers his application even ran on.
"I wrote a flame e-mail, but before I hit 'send,' I decided I'd better go back and find where I told him which servers were his. Then I realized I'd never told him. There was no one place he could look and see what the infrastructure looked like. How was he supposed to know what to target his app at? It's not enough to just 'put a pager on a developer's belt.' Ops also has to give back."
The manual solution to problems like this is unpleasantly familiar to a lot of admins: You build an inventory spreadsheet or use a CMDB, and do your best to update it. It's never quite right when you need it to be, and keeping it updated joins the long list of tasks prioritized behind everything else you have to do just to survive another day.
That's the kind of thing automation can help with.
The Other 25 Percent: Automation Can Build a Window in the Wall
It turns out the answer was an automated "server subway map" that provided developers with instant insight into each of the environments they worked with. Operations didn't need a lot of the information the tool provided, but developers needed to understand how their application was performing before it hit production.
"The idea behind it was that developers should be able to go click on a particular app, then choose an environment, and see live information on that environment. In dev, for instance, developers could see peaks and spikes in memory usage, or memcache object retention rates."
Building that sort of tool might at first seem like an additional burden on operations, says Eric, but it has a powerful effect. It shifts some responsibility for the success or failure of a deployment to the dev team, because much of the mystery about what's going to happen to code when it hits production is removed. In the end, operations teams benefit from the work as well: An automated map can help sysadmins assess the impact of a downed server.
Eric says the next logical step in the process is to create self-serve deployment tools for developers that allow them not only to see how their code is running, but also to push code into dev and stage environments themselves. "If ops is out of the picture, and promoting builds into test is done automatically, dev can't get away with producing sub-standard build artifacts, because they rely on the output themselves. That leads developers to create better builds."
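As a rough illustration of the server map idea described in this section, here is a minimal Python sketch. It is not the tool Eric built. The app names, hostnames, and function names are all hypothetical, and the hard-coded dictionary stands in for data that would more plausibly be gathered automatically from configuration management or monitoring. It shows the two lookups the story calls for: a developer asking which servers an app runs on in a given environment, and a sysadmin asking what a downed server affects.

```python
from collections import defaultdict

# Hypothetical inventory: app -> environment -> hostnames.
# A real map would be generated automatically, not typed in by hand.
INVENTORY = {
    "storefront": {
        "dev": ["dev-web-01"],
        "stage": ["stage-web-01", "stage-web-02"],
        "prod": ["prod-web-01", "prod-web-02", "prod-cache-01"],
    },
    "billing": {
        "dev": ["dev-web-01"],
        "prod": ["prod-bill-01", "prod-cache-01"],
    },
}


def servers_for(inventory, app, env):
    """Developer view: servers an app runs on in one environment.

    Returns an empty list when the app or environment is unknown.
    """
    return list(inventory.get(app, {}).get(env, []))


def build_reverse_index(inventory):
    """Map each server to the sorted (app, env) pairs that run on it."""
    index = defaultdict(set)
    for app, envs in inventory.items():
        for env, hosts in envs.items():
            for host in hosts:
                index[host].add((app, env))
    return {host: sorted(pairs) for host, pairs in index.items()}


def impact_of_outage(inventory, host):
    """Ops view: (app, env) pairs affected if the given server goes down."""
    return build_reverse_index(inventory).get(host, [])


if __name__ == "__main__":
    print(servers_for(INVENTORY, "storefront", "stage"))
    print(impact_of_outage(INVENTORY, "prod-cache-01"))
```

Keeping a single data source behind both lookups is the point of the sketch: developers and sysadmins read the same map, so neither side depends on a spreadsheet someone forgot to update.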
Will We Lose the "Ops" in "DevOps"?
With ops out of the picture when it comes to managing deployments, where does that leave the sysadmins? Why should they even want to bring down silos, if they're ceding control over traditional ops concerns?
"We hire extremely smart ops people mostly because we get into tangled situations where infrastructures get messed up," Eric says. "But most of them aren't dealing with that 99 percent of the time. Most of them are doing boring, repetitive, fire-fighting tasks. They're getting woken up every night on pagers."
That's a waste of valuable ops talent. "The idea isn't to obsolete ops," says Eric. "The future for ops is strategic." Instead of fighting fires and chasing bugs, he says, DevOps practices and culture allow operations teams to think about the future.
"The future is in designing application architectures that can move seamlessly from a developer's laptop to an EC2 instance to a private cloud that can be autoscaled. This is what operations engineers know best, and where we should be leveraging their talent in organizations."
Considering ops in that light, Eric says, helps frame the question operations teams need to ask themselves as they wonder why they're bringing DevOps culture and practices to their organizations. "Is the work they're doing day-to-day really moving the organization forward to help provide value, or are they simply maintaining a broken status quo?"
Read the findings of our 2013 State of DevOps survey. Building better communications and solving cultural problems is just the beginning of improving an organization with DevOps practices.
Read about the value of having a single source of truth about your infrastructure for developers and sysadmins alike.
Learn how a common toolchain can help you promote DevOps cultural values in your organization by building a self-serve deployment tool for your dev team.