2014 State of DevOps Report: Statistics Class Edition
In our last post, Improving the DevOps Survey: How and Why, we shared the improvements we made to the design of the 2013 DevOps Survey. Our goal was ambitious: to provide an academically rigorous analysis of IT performance and DevOps practices, and how they relate to and predict organizational performance. In this post, we’ll dive deeper into the statistical methods those survey design improvements allowed us to use on this dataset.
As discussed in our last post, one of the major improvements we made to the survey was using Likert-type questions, like the one you see below. Unlike yes-or-no questions, Likert-type questions capture shades of gray rather than black-and-white answers. This improvement made it possible to perform more advanced analysis.
Using Likert-type scales, we can measure what’s known in the stats world as a latent construct. A latent construct is something that can’t be measured directly, like happiness or job satisfaction.
To measure a latent construct, we need to ask several questions that capture the essence of the underlying construct. For example, some indicators of job satisfaction are a person’s willingness to provide a positive reference, or a general feeling of fulfillment.
You’ve probably seen the job satisfaction question before in other surveys. This is an example of a previously validated scale: one that has been thoroughly tested and refined in previous research. Climate for learning and organizational performance were two other previously validated scales we used in the 2013 DevOps Survey.
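To make the idea of measuring a latent construct with several questions a little more concrete, here is a minimal Python sketch. The responses are invented, the five-point coding is an assumption, and Cronbach's alpha is shown only as one common reliability check for a multi-item scale; this post does not say which specific tests were run on the survey data.

```python
from statistics import mean, pvariance

# Invented answers from five respondents to three job-satisfaction items,
# coded 1 (strongly disagree) to 5 (strongly agree). Not survey data.
responses = [
    [5, 4, 5],
    [2, 2, 1],
    [4, 4, 4],
    [3, 2, 3],
    [5, 5, 4],
]


def composite_scores(rows):
    """Average each respondent's item answers into one construct score."""
    return [mean(row) for row in rows]


def cronbach_alpha(rows):
    """Internal consistency of the items (closer to 1 = more consistent)."""
    k = len(rows[0])  # number of items in the scale
    item_variances = [pvariance([row[i] for row in rows]) for i in range(k)]
    total_variance = pvariance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


print(composite_scores(responses))
print(round(cronbach_alpha(responses), 2))
```

When the items move together across respondents, alpha approaches 1, which suggests the questions are tapping the same underlying construct.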
Defining IT Performance
One of the most exciting things about working on this project was creating a new latent construct for IT performance. Until now, there hasn’t been a highly reliable and valid way to measure IT performance. To get a latent construct for IT performance, we started with a set of related independent variables: deployment frequency, lead time for changes, mean time to recover and change fail rate. You’ll probably recognize these from last year’s survey.
After a lot of refining and statistical testing, we found that deployment frequency, lead time for changes, and mean time to recover captured the underlying construct of IT performance.
Surprisingly, change fail rate did not contribute to our IT performance construct. However, we did find significant differences between groups with high, medium and low change fail rates. As expected, high-performing IT organizations have the lowest failure rates when they roll out changes, and low-performing IT organizations have the highest failure rates when they make changes. In fact, high-performing IT organizations had 50% lower change fail rates than medium- and low-performing IT organizations.
After several rigorous statistical tests, we can now confidently say that we have a valid and reliable measure of IT performance in the context of DevOps.
Determining High, Medium and Low IT Performance Clusters
Now that we had a solid measure of IT performance to work with, and we knew what the essential components of IT performance were, we wanted to use these components to categorize organizations. But instead of using arbitrary cut-off points for categorizing companies or teams, we decided to use another method for determining these categories: cluster analysis.
Cluster analysis gives us groupings of items, in this case companies or teams, that are statistically similar to those in the same group, and different from those in other groups. The benefit of cluster analysis is that it allows us to categorize groups without specifying how many groups we think we should have. That is, we let the data tell us how many groups there should be. In this case, three groups gave the best statistical solution, with high, medium, and low performing organizations.
Using cluster analysis, we found distinct groupings and significant differences between companies. Companies with high IT performance are very similar to each other and different from low- and medium-performing counterparts. Low IT performers are similar to other low IT performers, but different from high IT performers, and so on. This confirmed that high, medium and low IT performing companies are statistically different from each other.
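For the curious, here is a rough Python sketch of how a cluster analysis can let the data suggest the number of groups. It uses hierarchical (agglomerative) clustering with average linkage and cuts the hierarchy at the largest jump in merge distance. That choice of algorithm and stopping rule is an assumption made for illustration, since this post does not name the exact procedure, and the scores are invented.

```python
from itertools import combinations
from math import dist

# Invented, already-standardized scores for nine teams on two
# IT performance measures. Not survey data.
scores = [
    (9.0, 9.2), (8.8, 9.0), (9.3, 8.9),
    (5.0, 5.2), (5.3, 4.9), (4.8, 5.1),
    (1.0, 1.2), (1.3, 0.9), (0.8, 1.1),
]


def average_linkage(a, b):
    """Mean distance between every point in cluster a and every point in b."""
    return sum(dist(p, q) for p in a for q in b) / (len(a) * len(b))


def agglomerate(points):
    """Repeatedly merge the two closest clusters, recording each step."""
    clusters = [[p] for p in points]
    history = []  # (merge distance, clustering just before that merge)
    while len(clusters) > 1:
        (i, j), d = min(
            (((i, j), average_linkage(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        history.append((d, [list(c) for c in clusters]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return history


def pick_clusters(points):
    """Cut the hierarchy just before the largest jump in merge distance."""
    history = agglomerate(points)
    distances = [d for d, _ in history]
    jumps = [later - earlier for earlier, later in zip(distances, distances[1:])]
    cut = jumps.index(max(jumps)) + 1
    return history[cut][1]


groups = pick_clusters(scores)
print(len(groups), "clusters found")
```

Nothing in the code asks for a particular number of groups. The cut falls wherever merging two clusters suddenly becomes much more expensive than the merges before it.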
Correlation vs. Causation
Another statistical method we used was correlation analysis. Correlations are a great way to begin looking at data, because they give us hints of how individual measures relate to each other. To put it simply, positively correlated things move together, while negatively correlated things move in opposite directions. Here’s a video that explains this concept through dance. Watch it. It’s awesome.
One of the findings that surprised and delighted us was how highly correlated organizational culture was with nearly every other variable. It seems obvious, but our analysis supports our hypothesis that organizational culture is the foundation for creating a good work environment, and we all know anecdotally that happy workers tend to make better products.
Below you’ll see the correlation tables for job satisfaction and climate for learning. The number in the right column is the correlation coefficient. You can see from the Job Satisfaction table below that our latent constructs, organizational culture and climate for learning, had the strongest correlation with job satisfaction. We considered anything above 0.2 to be significant. On the other hand, salary showed very little correlation with job satisfaction (more on salary in a future post).
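If you'd like to see how a correlation coefficient is computed, here is a small Python sketch of the Pearson coefficient. Pearson is assumed here as the most common choice; the post does not say which coefficient was used. The two score lists are invented, and only the 0.2 cutoff comes from the text above.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Invented scores for five respondents. Not survey data.
culture = [1, 2, 3, 4, 5]
job_satisfaction = [2, 1, 4, 3, 5]

r = pearson(culture, job_satisfaction)
THRESHOLD = 0.2  # the cutoff mentioned in this post
print(round(r, 2), "worth a closer look" if abs(r) > THRESHOLD else "weak")
```

A coefficient near +1 means the two measures rise together, a coefficient near -1 means one falls as the other rises, and a value near 0 means there is little linear relationship.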
As the old adage goes, correlation is not causation. But correlation can act as an indicator for predictive relationships, which helps narrow the scope of analysis. This xkcd webcomic says it best: “Correlation doesn't imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing 'look over there.'”
Multiple Linear Regression
We wanted to know what predicted organizational and IT performance levels, so we used directional signals from our correlation analysis to build linear regression models. Any time you see the word “predictor” in the 2014 State of DevOps Report, you’ll know that the data backing that assertion come from our linear regression models.
Regression analysis is a statistical method that uses the relationship between two or more quantitative variables so that one variable (the dependent variable) can be predicted from the others (the independent variables). If you know the relationship between X and Y, you can use regression analysis to predict Y once the level of X has been set.
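Here is a bare-bones Python sketch of multiple linear regression, fitted by ordinary least squares through the normal equations. The variable names and numbers are invented for illustration, and the outcome is generated from a known formula so you can see the fit recover it. Real survey data would be noisy and would call for a proper statistics package.

```python
def solve(a, b):
    """Solve the linear system a·x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit(rows, ys):
    """Least-squares coefficients: [intercept, b1, b2, ...]."""
    xs = [[1.0] + list(row) for row in rows]  # leading 1.0 is the intercept
    k = len(xs[0])
    xtx = [[sum(x[i] * x[j] for x in xs) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * y for x, y in zip(xs, ys)) for i in range(k)]
    return solve(xtx, xty)


def predict(coefficients, row):
    """Predicted value of the dependent variable for one set of inputs."""
    return coefficients[0] + sum(b * x for b, x in zip(coefficients[1:], row))


# Hypothetical independent variables (culture score, learning score) and
# a dependent variable built as 1 + 2*culture + 0.5*learning.
predictors = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
outcome = [4.0, 5.5, 9.0, 10.5, 13.5]

coefficients = fit(predictors, outcome)
print([round(c, 3) for c in coefficients])
print(round(predict(coefficients, (6, 2)), 3))
```

Once the coefficients are known, predicting the dependent variable for a new combination of independent variables is just a weighted sum.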
With this wealth of data, we can use segmentation to better understand behaviors and performance in targeted populations. For example, in the 2014 State of DevOps Report, 16 percent of respondents said they were part of a DevOps department. We isolated this segment and found that more than 90 percent are in companies with high to medium IT performance. In fact, the DevOps department cohort is 50 percent more likely to be in a company with high IT performance. Fascinating! How are other departments performing? Are salaries higher in DevOps departments? These are all questions we can answer through segmentation and the statistical methods outlined above.
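As a rough illustration of the arithmetic behind a statement like "50 percent more likely," here is a short Python sketch that isolates a segment and compares its share of high performers with everyone else's. The records, field names and counts are all invented to show the calculation and are not taken from the survey.

```python
# Invented respondent records. Not survey data.
respondents = (
    [{"department": "devops", "performance": "high"}] * 6
    + [{"department": "devops", "performance": "medium"}] * 3
    + [{"department": "devops", "performance": "low"}] * 1
    + [{"department": "other", "performance": "high"}] * 8
    + [{"department": "other", "performance": "medium"}] * 7
    + [{"department": "other", "performance": "low"}] * 5
)


def share_high(rows):
    """Fraction of the given respondents in high-performing companies."""
    return sum(r["performance"] == "high" for r in rows) / len(rows)


# Isolate the segment of interest and everyone outside it.
segment = [r for r in respondents if r["department"] == "devops"]
everyone_else = [r for r in respondents if r["department"] != "devops"]

relative_likelihood = share_high(segment) / share_high(everyone_else)
print(f"{(relative_likelihood - 1) * 100:.0f} percent more likely")
```

The same filter-then-compare pattern works for any segment you care about, such as company size, industry or role.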
Now we want to hear from you. If you’ve made it this far, please take a minute to drop a comment, and tell us what you’d like us to tell you about DevOps today.
About the author: Nicole Forsgren Velasquez is a professor at the Huntsman School of Business at Utah State University, with expertise in IT impacts, knowledge management and user experience, and a background in enterprise storage and system administration. Nicole holds a Ph.D. in management information systems and a master’s degree in accounting from the University of Arizona. You can follow her on Twitter.