Season 1 — Episode 13

Puppet’s Nigel Kersten and CircleCI’s Mike Stahnke go behind-the-scenes of the 2019 report to talk about the shift to a security-focused report and where they see these challenges heading and evolving.

 

Download the 2019 State of DevOps Report

Transcript

AndrewHey, everyone. Thanks for joining today's episode of Pulling the Strings. Today, we're going to talk to two of the four authors of the 2019 State of DevOps Report. Quick shout-out to the two authors who couldn't make it today: Alanna Brown, Puppet's own DevOps expert and Andi Mann, the chief technology advocate of Splunk. We have two of the four authors of Puppet’s State of DevOps Report. They're in here today. Go ahead and introduce yourselves.

NigelHey, so I'm Nigel Kersten, field CTO at Puppet. I've been around for a pretty long time, done a bunch of things: Product, CTO, CIO, currently working mainly with our largest customers, working around services to get most successful with Puppet.

MikeI'm Mike Stahnke. I'm currently the vice president of platform at CircleCI. I also spent a long time at Puppet and did a whole bunch of things. So Nigel and I go back a ways. It's kind of fun to see him again.

NigelWe were actually talking about this, how, I think Mike and I definitely met at the first Puppet Camp. Yes. 2009, 2009, 10 years ago.

AndrewWe should definitely put some sort of slideshow that goes back through the past of all your photos.

NigelThey're a little terrifying. I look like I'm a lead singer of Linkin Park or some nu metal band.

MikeYeah. And Nigel sent me a picture of me like holding up something during the open spaces from 2009. It was like this week I think he sent it to me. I was like a baby.

NigelHe is literally holding up a suggestion for an unconference space called the Future of Infrastructure. So even 10 years ago, Mr. Mike Stahnke was pushing everyone forward to debate the future of infrastructure.

AndrewAnd now you're living in it.

MikeI mean, I do live in the future. I travel through time at the rate of one second per second.

AndrewRight on. For those of you who don't know what the annual State of DevOps Report is, Nigel will give a quick recap.

NigelSo what we try and do with the State of DevOps Report is survey people's practices and measure their outcomes - like their actual benchmarks for how they're doing. And the reason we do this is because we want to try and draw a link ultimately between the kinds of DevOps practices used to manage IT and the actual outcomes, i.e. does this stuff actually even work. And we spent quite a number of years sort of focusing on that. But last year we got a lot of feedback from people out in the field that, while they really understood that DevOps was beneficial and they understood what the sort of shining light on the hill they were heading for was, they had no idea how to take the first step. How do they go from what they have now on that sort of journey towards constant improvement? And so last year, what we did is we focused on surveying people about what practices they adopted and when they adopted those practices and then used that to come up with an evolutionary model. Mainly so that we could give people pretty prescriptive guidance: you're in a world of everything being done completely manually, maybe a couple of pockets of success. How do you go from that to a whole organization that's humming along and working really, really well? And so that was what we focused on last year. And then this year, what we decided to do was focus on how does security integrate into the software delivery lifecycle, which I know is a subject pretty near and dear to Mike Stahnke's heart.

MikeI mean, it is in that it's super important, but it's not in that it's not always that fun to talk about. So that was kind of the challenge of this year's survey: how are we going to talk about security and make it interesting? Because security, it's not a fun thing for people. They don't show up - most don't get up in the morning and think what I'm going to do is go secure that system and make it better off than what it was this morning. There are people that do. But there's not a lot of them. And most people that do, other people, well, they make fun of them and call them names.

NigelI mean, speaking personally, you know, security is boring when it's working. Right. And it's only fun when it's actually a disastrous, chaotic situation, which is not actually fun at all yet somehow fun.

MikeYeah. Yeah. I mean, I completely feel that. But I think realistically, the reason that security came up so much was there's a lot of integration across different teams that has to happen for DevOps to be successful. And security was one of those that people were always complaining about and saying that they're blocking and saying no and not letting us do the things we want to do. And so that was where we wanted to dig in and get a little more data this year. You know, it still might not be the most fun thing ever. You know, you can't put animated pictures in a PDF, unfortunately. So we couldn't just fill it with memes. But there is some good data in there as well.

AndrewYou just let every millennial listening to this episode down.

MikeBecause I can't put animated pictures in a PDF?

AndrewSpot on.

NigelWe should do it in HyperCard.

AndrewThere ya go. Oh, boy.

MikeYou've just confused every millennial listening to this podcast.

AndrewEveryone's Googling "what's HyperCard."

AndrewSo if you haven't read the report yet, you can get it at puppet.com/state-of-devops. But don't get it right now - finish listening to this episode first. So I'd love to hear about what y'all focused on in particular, and you all can dig in a little bit. I want to start with you Stahnke. You make a really strong point about the lack of cynicism surrounding audits and their usefulness. What do you prefer as an engineering leader - ya know, you're on the hook for maintaining a secure platform?

NigelI mean, Mike's a fan of cynicism-driven development, so.

MikeI think most of my authoring for this year's State of DevOps Report was: Mike rants about something, we have to work it into the report - oh, maybe it's a sidebar. I think that was most of my authoring contributions this year.

NigelIt turns out if you invite someone who has opinions to share opinions, you get opinions.

MikeYeah. So back to the question, though. I think it was, you know... My cynicism around security is basically that there's not a lot of cost to being bad at it. And I say this as a leader who is on the hook for security for a lot of things. But if you look at where big organizations have significant breaches or they have significant, you know, data exfiltrated out to attackers or bad third party actors or whatever. What really happens to them? They get a slap on the wrist and their stock prices recover fine. Their, you know, their forward progress is not prevented at all. It's like capitalism is way more concerned about the dollar than it is about the security, particularly in America, where privacy is, you know, third banana - it's not even second banana. And so there's just not a lot of incentive to really be that great and spend that much money on security. Bottom line, security doesn't pay the bills. That's where my cynicism comes from.

NigelYes, I concur.

AndrewYou also have a strong narrative that focuses on how empathy and trust aren't automatable. How, in your opinion, do teams get past this challenge?

NigelSo I think one of the things that came out through the report is that automation isn't enough, if there's no empathy there. And one of the things we've definitely seen is that there's a lot of talk around shifting left for security folks. And generally that's sort of talked about as if security folks tend to work in production validating production and they're moving closer and closer towards the part of the software delivery lifecycle where the developer's writing code. I think what we've seen from this report, though, is it's insufficient to just do the same activities that you were doing in production and sort of move left in the software delivery lifecycle. You've actually got to change how you work and work in a much more collaborative way. So it doesn't really matter if your security people are embedded in your teams or if they work centrally, but they need to be working very closely with the development teams and not have that sort of wall of responsibility they throw things over to be validated or not.

MikeI think it's a shared understanding of both sides and what the goals are. And most infrastructure people aren't like "man, what I want is really insecure infrastructure." That's never a goal of theirs. But it's also "wow, that slows me down" or "I can't move the speed I want to" and I think a lot of people get caught up in automation where you've automated your little section of a large process and you feel like you've won. And really, it doesn't matter if the value out to the end user hasn't gotten any faster. Because Section 1 of 5 is faster, it doesn't mean that, you know, 2 through 5 are fast. And so you start pulling in these other areas that are contributors to those processes and, if you can work with them, you can figure out how you deliver that value faster. How do you empathize with what their goals are? Do you have the same goals, do you have the same understanding, things like that.

AndrewRight on. I want to unpack a couple of nuggets from Nigel from the report. First of all, Nigel, tell us a bit more about what you've been up to this year. You've been doing a lot more customer face time, right?

NigelYeah. So I think one of the things we've realized over the last five or six years is that especially with our biggest customers, choosing to adopt technology is only part of the solution and to change how they actually work, they actually have to change the way they work to sound a little bit sort of tautological there. But it's more that you can't just deploy automation, config management, infrastructure as code without reevaluating your processes around security, audit compliance, change management, change windows, release management, application development, just about everything. And so for the last 6 months, I've been spending time mainly with some of our biggest customers, helping them through that sort of a journey, which I find super interesting.

MikeWhenever I talk to customers about this topic, the thing that's really interesting, is they all want to change without changing anything, or we can't change anything. We want to be better with no change is really what they're asking you to perform.

NigelI would like to be skinnier and fitter without changing my diet or exercising.

MikeExactly. You get it.

NigelIt's not so unreasonable.

AndrewSo update your DevOps toolchain. I don't know, Nigel, you tell me. So as you travel the world and talk to these customers, what are some big wins that you've noticed and perhaps what are some trends that you're seeing from these people?

NigelThe trend I'd say and the biggest win is this: if you've spent any time working in IT in a large, large enterprise you have change freeze windows, you have times when nothing can actually happen, you have days when you're not allowed to do releases, you have hours of the day you can't, you have regulatory windows imposed upon you. All of these things mean that you don't actually make very much progress. So if you think about things from a sort of kanban flow perspective, there's just lots of work starting and rotting and stopping and then being picked up. And it's just not a very efficient way to get stuff done. The general trend I'm seeing is that people who are rolling out strong automation, rethinking their processes, getting rid of manual steps where they're not needed, giving people more autonomy and trust and reporting and auditing rather than requiring lots of approval steps, they're the ones moving towards a world of sort of much more, I guess, continuous delivery, continuous change management, where it becomes a much lower drama process to actually create a change and propagate it through the infrastructure. And it sounds really simple to say, but changing these processes in a big enterprise can take years because there's so many different departments interacting with them and just, you know, detritus built up over the years where people rely upon that one little step of a manual process. And I guess to bring it back to the empathy situation, no one put these steps in because they were stupid or malicious. They seemed a perfectly reasonable response to a problem at some point in the past. But people haven't tended to rethink all of these processes. The biggest thing I would say blocking a lot of this is just sheer fear. Like when you've got a system around you that's protecting you and saying you can't push changes in, you can only push it now, you feel much safer. So if something goes wrong, you at least followed all of the rules. If you take a lot of those rules away and just give people more autonomy and accountability, it's actually kind of a terrifying step to take. Of going, huh? Now I get to make decisions of whether this is a risky move or not a risky move.

MikeI don't actually want to take away a lot of those guidelines and protection points, though. I mean, if you can have them that work and they work in an automated fashion, they work quickly or you have a set of guide rails. You know, when you're running on a railroad, you're only going one direction because you're on a train track, right?

NigelYeah. Yeah, I should totally clarify that. So when I say you don't - I think Mike's totally right - you're generally not getting rid of those processes as much as you're automating them and they become invisible in the background. I guess what I was getting at was that psychologically, if you've got a whole bunch of rules to follow and then it suddenly feels like they all go away and you could drive any way you want on the road, not just on the left or the right hand side, that's sort of a terrifying change for a lot of people. And particularly, I think I spend a lot of time working pretty closely with ops people who are pretty passionate about infrastructure, the industry, the market, how things are changing. There are way, way more people out there who are ops people, sysadmins, developers who just want to turn up at 9 o'clock, leave at 5 o'clock, have great health insurance and have a great life. And work is not the most important thing in their life. And I think that's absolutely fine. I think one of the big things for me over the last few years has been developing a lot more empathy for those people who are just like "you know what, I don't even like computers all that much, but it pays the bills" and that's OK.

MikeYeah, that was definitely a wake up call for me more than once in my career. I've been the person that lived to eat, sleep, breathe this stuff and not everybody does. And so that's always been a weird wake up call. But back to your point about the process stuff. To me, I guess the companies that do this well, the orgs that do this well, they basically have automated confidence, and velocity through that. And so they know that if something is gonna get through this system, it's because it's correct. And if it's not correct, it doesn't get through their system. And that's really what the goals are here. Like, how do you speed up your flow for your change and for your deployments, for your velocity, whatever it is that you're doing as an IT organization? How do you do it quickly and be sure it's correct?

AndrewSo if that's the proactive foot, what are your thoughts on the more reactive version of that where these same teams have to deal with potential vulnerabilities at risk, at scale?

MikeI mean, I think when you have a risk, you have to assess it and you have to see, you know, what does it look like from a risk management standpoint? And most organizations look at this very differently. Sometimes you say things that are exposed on the Internet have a different risk classification than things that are internal, or you could say things that house customer data or personally identifiable information have a different risk level than things that don't. Once you make those assessments, you kind of figure out how fast you deploy and respond to vulnerabilities. And then, you know, the great orgs can protect against a vulnerability within a couple hours usually, and then others take months.

NigelIs this Mike Stahnke? You were nowhere near cynical enough about the actual process just then. That didn't sound like your actual opinions. I think the way it actually works is...

MikeI think that's kinda how it works in places.

NigelPeople do a risk assessment. They describe it in such a way so that they don't have to do anything about it and that it fits into a category of previous risks that have been decided to not do anything about.

MikeOK, I will grant you most of that. The difference actually is, I think the orgs that I've run don't do that. So I've always been very proud of the security practices that the places that I've run have had. And I think that we do a really good job of those types of things, you know, and no one's perfect. Obviously, there are always new vulnerabilities, always new attacks or whatever. But, you know, you have to have a process for response and protection just as much as you do for assessment of risk, you know, and all that kind of stuff. And really it's about prioritization. The real reason security doesn't happen is because it doesn't get prioritized. If it's not in your nonfunctional requirements for delivery of a new feature, if it's not in the architecture reviews, if it's not in threat modeling ahead of time...

NigelIf it's not in the maintenance once it's actually been released.

MikeRight. I mean, it has to be plumbed throughout the entire system. And normally it's a couple of spot checks. And usually the spot checks are after it's live. You know, most companies run kind of a security vulnerability assessment tool, maybe it's an agent that runs on all the servers or some kind of web scanner, and they look at a report, they print it out, they make a spreadsheet of all the things that need to go get fixed. They hand it over to a team and they say, you got four weeks, and then you go through a change control process. You attempt to do it. You file an exception for the ones you didn't feel like doing. You turn it back over. And welcome to security. You've done something.

NigelAnd I think the thing that a lot of people who haven't worked in enterprise don't realize is, those people who are fixing it are almost never the people who wrote the code. So teams of developers tend to work inside enterprises building applications. There'll be one or two of them on maintenance for a period after release and then they all get taken away, basically. And then it becomes the operations people's job to fix this piece of code they didn't actually write, and then security, who also are people who didn't write the code, run external assessments against it and give the operations people the list of tasks they actually have to do. So everyone's sort of many degrees removed from the actual issue, which is why I think there's often a culture of band-aiding and ignoring.

MikeYeah. And in this case, you know, we talked about a DevOps divide for a long, long time where Dev and Ops have a wall and they're throwing things over. This is basically just a third wall where you have devs that write something, they throw it to Ops and then Ops has to attempt to go fix it. But then security has all the authority, but none of the responsibility to go fix it. They can tell you to go fix it. And so then Ops is on the hook for that too. And so it basically just feels like everything's running downhill, landing in an operations puddle. And as much as we want to say that DevOps has changed all of that, it has in some organizations, and in others, that's the model still today without question.

NigelI think this is a really interesting topic - for all the talk of sort of cloud native empowered developers owning the whole development and maintenance and production release and everything of an application. It just fundamentally doesn't fit how enterprises build software. There's no teams of developers who are going to be staying on an application maintaining it for five years. Your highest paid developers, who are the people who know your cloud-native tech, your Kubernetes, your containers, all that stuff. Enterprises are not going to pay them maintenance salaries or high paid cloud-native salaries just to do maintenance. They're going to move them on to the next project. So I think this is where a lot of the conversation around, you know, developers being the new kingmakers, displacing operations, I think ignores some fundamental economics inside enterprises.

MikeI completely agree.

NigelDevelopers are too expensive to just have them maintain applications.

MikeEven those great operators are too expensive to just have them maintain applications. You know, they're putting together new workflows to do maintenance for you or automate it or whatever. But you're right, the same team doesn't exist for three years in an enterprise in any incarnation - it doesn't matter if they're doing development or whatever. You're gonna have a reorg, you're going to have a realignment, you know, something's gonna get offshored or offboarded or whatever, somebody retires.

NigelYou're gonna find out that person was the gum developer is actually a contractor and left.

MikeRight. Right. And so there's just a bunch of those things that always have to be dealt with in one way or another. And I think the other thing is expecting a team to be able to go top to bottom in a really complicated infrastructure. There's just a ton to keep in your head. You know, I look at the systems that we build that are huge, you know, they're huge distributed systems, they're on cloud. There's not one person who can tell me how everything works inside of CircleCI. That would be crazy. So we have to go to different people to get different information about different parts of the system. And so, you know, when people say that they want to do all these things. It works when you're small, I'll say that.

NigelFull stack developer.

MikeYeah, I do the front end and the back end.

AndrewSo from the report, one of the hypotheses was that bridging the gap between IT Ops and InfoSec can unlock a great deal of agility and allow organizations to remediate vulnerabilities quicker. With both of your actual realistic perspectives on how enterprise-scale DevOps teams work, how accurate is this? How much do you agree with it? Where do you think that's all going?

NigelI think fundamentally like - and this is where I think DevSecOps, DevOps, all of these movements sort of share a bunch of things in common. If you have the processes, technology and people set up so that you can respond to change quickly with more change, that's sort of the most important thing as far as security goes, as far as infrastructure, as far as application development. It all comes down to the same thing. If you can actually define a change, however you define it, propagate it out through all of your infrastructure quickly, everything else is easy.

MikeWell, it's not even just infrastructure, it could be applications.

NigelYeah. Sure. If you can change your IT and know that the change happened at scale.

MikeRight. If you can make a change and be sure it worked and it happened quickly, you win. And not everybody can do that. It's because it's not easy. It's really easy to talk about, but it's really difficult to do. And I think, you know, you have to automate confidence, which is a really hard thing to do.

NigelYeah. And I think you've made a really good point there. And maybe let me dive into it a little bit more. It is really easy to say and it's really hard to do. So one of the customers I've been working with lately has three huge data centers spread all across the world. You know, well, well over 100,000 machines. There are 30,000 services. There's something like 9,000 service teams. And they all have dependency relationships to each other. Some of them are actively developed apps, some are not actively developed apps. Some are on one platform, some are on another. So none of this stuff is as easy once you're out there in the real world. But, you know, without sounding too pessimistic, you can actually increment your way out of this. You don't need to throw it all away.

MikeWell, yeah. I think there's a lot there to unpack. I think the 30,000 services - the thing that I would point out there is in an enterprise those are not like classical microservices.

NigelNo, no, no. These are full applications.

MikeThey can be full applications, they can be like big back office suites, they can be ERP systems. You know, they can just be the worst things to maintain.

NigelOne of them is online banking.

MikeRight. And so, you know, there's just a bunch of things that are very difficult to work with. And so you don't even get, I would say, the complications that come with running a world of microservices, because I think that's a world of complications that's emerging. A lot of companies right now are seeing how difficult this can be, and that actually keeping one thing up was easier than keeping up 400. Who knew? But, you know, there are things like that that are still going on. But in enterprises, it's 400 of those really, really complicated things or 30,000 of those really, really complicated things. Most of the services are not single responsibility, isolated data store, you know, things like that. That's just not common in the enterprise.

AndrewSo this State of DevOps report. This is the sixth one, correct?

MikeThis is the eighth.

AndrewAll right. There have been that many.

NigelIt predates containers. Linux containers. Sorry, that was for the Solaris people.

AndrewSo I gotta know, with eight of these in the books, ya gotta be honest with me, how much arguing was there behind the scenes - with the report, the survey questions, authoring the actual thing? I mean, or are you all at this point, a well oiled, well tuned machine and this stuff's just super easy?

NigelWe don't actually argue a lot.

MikeWe argued more about the topic than we did about the execution of it, I think. Nigel and I both really wanted to do DevOps antipatterns. We just wanted to have like a giant dumpster fire and have that be the entire report. I still think that's coming next year.

AndrewAll right. The ninth one: dumpster fire version.

NigelWell, part of the reason we kind of argued about it is - I'm going to be pretty honest here - like security doesn't interest me that much. I just prefer my infrastructure to be well-run and sort of secure in and of itself. But I think as I dove into it more, because I've never actually worked as an InfoSec or SecOps person, I started to get a much more visceral understanding of the actual challenges those folks face in the enterprise rather than just the sysadmin perspective, which was mine, I think.

MikeI had run InfoSec, I'd done security response. I've done a bunch of security stuff and I almost had the same point of view, though. It still wasn't that interesting to me, but it did turn out to be more interesting. Reading through, like, building the right question set was tough. Alanna and I spent a lot of time on those questions and went back and forth trying to figure out how we're going to pinpoint the hypotheses that we have and test them and things like that. I wouldn't say there was a lot of arguing, but man, there was a lot of wordsmithing. I will say that those questions went back and forth dozens and dozens of times and, you know, we're still not pleased with a few of them. It was like, man, I wish we would have worded that differently - that would've given us better data.

AndrewSo at what point will the two of you break off into your own spin-off report? Kind of like the way you'd pull maybe like a Fast and the Furious. I heard Hobbs & Shaw was a really good movie.

MikeI don't even know what's going on anymore. [laughter]

NigelI'm trying to think of a movie reference, actually.

AndrewI'm just trying to think of which one of you is The Rock and which one of you is Jason Statham.

MikeI think if you're gonna pull these movie questions, you need Deepak on the podcast.

AndrewAh, you're right. He is our resident expert.

NigelFast and Furious is definitely much more Deepak.

AndrewAll right, we'll go with an easier question then. So DevOps might be a mainstream thing now, but just because something has a spotlight on it doesn't mean it's fully adopted. Things that come to mind for me, design systems, most of them, a lot of health food trends. So in all honesty, where do you all think DevOps is as a discipline in 2019 and what's next for it?

NigelIf it's going to continue to succeed, it's going to fade into the background and it's just going to become IT. Let me be a little clearer about that. I think that's what happens, that's the smoothest possible path. I think it's entirely likely that that won't happen. So let me outline two scenarios. The best possible scenario is the word DevOps goes away and it just becomes how we deal with computers. The worst possible scenario is that in five years time we will just use DevOps as a synonym for CI.

MikeI mean, I think if you look back at what the developer operations kind of role was, it was to do the operations of developers to make them more successful, do things like pipelining, release engineering, a whole bunch of stuff that operators usually ended up doing because they were basically the ones sifting through the junk drawer in the tool chest to figure out how to make anything happen. I agree with you, though. I don't think that should be where it ends up. DevOps to me is more about working together than what you're working on, and I think that that's really important. Back to your health food kick though. I think it's really in the intermittent fasting phase of DevOps where people are really excited about things. They want to try it. They try it for a little while. It either works right away and they're really happy about it or it doesn't work and they think it's all bullcrap. So that's pretty much where we're fighting with DevOps right now.

NigelI don't know about this analogy. Maybe...

MikeI just wanna go back to the health food thing. It cracked me up.

NigelNo, no, I'm enjoying that one, too. Maybe containers are like a keto diet. You definitely slim down, but your breath smells. [laughter]

MikeHey, let's just call it there. It's good.

AndrewRight on. Well, I want to thank both of you for taking the time today to dig in a little more behind the State of DevOps Report, 2019.

NigelThanks, man.

MikeThanks for having us.

AndrewAll right. A huge thank you to everyone listening. If you haven't read it yet, go ahead and grab the State of DevOps Report 2019 at puppet.com/state-of-devops. And I'm sure at some point we'll have the survey out for the 2020 report. So when you see that, go ahead and take that too. Take it easy everyone. See ya.