If you are like me, you are probably tired of the endless articles talking about DevOps. Each day, you are guaranteed to see an onslaught of articles on the following topics:
- What Is DevOps?
- DevOps Is a Culture Change
- DevOps Requires Empathy
- DevOps Unicorns, or All Unicorns Started Out as Horses
- Buy My DevOps Tools (from many vendors)
- The Wall between Dev and Ops
Enough, already. We get the point. Can we shift the conversation from talking about DevOps to sharing lessons learned about trying to implement DevOps? Let me start.
Over the past several months, I have had numerous conversations with a variety of Fortune 500 companies about how to embrace DevOps. Most enterprises understand why DevOps is important but struggle to figure out how to get started and how to make progress. In fact, some of the large enterprises that are touted as the poster children for DevOps are really only in the early phases of adoption and have a lot of room for improvement. The important thing is that they stopped talking about DevOps and started doing something about it.
Here are some lessons that major Fortune 500 companies have learned from their early DevOps initiatives:
Lesson #1 – Moving Operations to Development
One really large client did an impressive job of implementing continuous integration and continuous delivery for a pilot project. I was amazed to see a company of this size and complexity make the necessary changes to enable daily deployments for its new application. The operations team also did an impressive job of creating hardened and secure golden images that enforced many of its security and compliance requirements.
The client's challenges, however, were in operations. It bought into the idea that developers should be responsible for operating the code they push to production, but it did not make the changes to its business support and operational support processes that developers needed to perform this role well. The monitoring solutions in place were infrastructure focused, since that was where the operations team was most skilled. The developers lacked application performance monitoring tools to see what was going on in the system, which created a bottleneck: they had to wait for information from the operations team before they could resolve problems.
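To make the distinction between infrastructure monitoring and application monitoring concrete, here is a minimal sketch of application-level instrumentation that a development team could own. It is not what this client used; the `monitored` decorator, the `call_stats` store, and the `checkout` operation are all hypothetical names invented for illustration, and a real service would send these numbers to a monitoring backend rather than keep them in memory.

```python
import time
from collections import defaultdict
from functools import wraps

# In-memory store for illustration only; a real service would export these
# numbers to whatever monitoring backend the team uses.
call_stats = defaultdict(lambda: {"calls": 0, "errors": 0, "total_seconds": 0.0})


def monitored(operation_name):
    """Record call count, error count, and cumulative latency for a function."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            stats = call_stats[operation_name]
            started = time.perf_counter()
            try:
                return func(*args, **kwargs)
            except Exception:
                stats["errors"] += 1
                raise
            finally:
                # Runs on both success and failure, so every call is counted.
                stats["calls"] += 1
                stats["total_seconds"] += time.perf_counter() - started
        return wrapper
    return decorator


@monitored("checkout")  # hypothetical business operation
def checkout(cart_total):
    if cart_total <= 0:
        raise ValueError("cart is empty")
    return round(cart_total * 1.08, 2)  # made-up tax rate for the example


checkout(50.0)
try:
    checkout(0)
except ValueError:
    pass

print(call_stats["checkout"]["calls"], call_stats["checkout"]["errors"])
```

Metrics like these are keyed by business operation rather than by server, which is the view developers need when they are the ones answering for production behavior.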
The end results were mixed. The client delighted its users by quickly implementing new business requests, but it decreased overall reliability by not focusing enough on operations and monitoring. The good news is that the client was smart to try this as a pilot on a non-mission-critical application and learn from it. Now, the client is addressing its shortcomings before rolling out DevOps to more projects.
The key takeaway here is that the company tried something, learned from it, improved it, and moved forward. Taking that first step and getting beyond talking about DevOps was the key to starting the transformation.
Lesson #2 – Build It and They Will Come (Or Will They?)
I have seen this pattern a number of times. An enterprise starts a grassroots “DevOps” project from within the infrastructure organization. Its focus is primarily on infrastructure automation, security, and governance. Too often, these teams work in a vacuum and don’t get any buy-in from the development teams. The end result is that they build a set of processes and services that nobody uses.
Grassroots initiatives like this can work, but they need to focus on being beneficial to the application development teams. It is great that the infrastructure team can automate secure golden images, but if those same images constrain the developers instead of enabling them, the developers won’t see their value. It is important to work with the developers, so that the developers can install their stacks and their tools on these images. This is an area where tools like Docker can help. The infrastructure owners can set up standardized and secure containers, allow the developers to add their software to the containers, and have them be portable across all environments.
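As a rough illustration of that split of responsibilities, the sketch below checks that a developer's Dockerfile builds on a base image the infrastructure team has approved. The registry name, the image names, and the `uses_approved_base` helper are all hypothetical, and a real pipeline would more likely rely on registry policies or image scanning than on a text check like this one.

```python
# Hypothetical allow-list of hardened base images published by the infrastructure team.
APPROVED_BASE_IMAGES = {
    "registry.example.com/golden/python:3.12",
    "registry.example.com/golden/node:20",
}


def final_base_image(dockerfile_text):
    """Return the image named by the last FROM instruction, or None if there is none."""
    base = None
    for line in dockerfile_text.splitlines():
        parts = line.strip().split()
        if len(parts) >= 2 and parts[0].upper() == "FROM":
            # Skip option flags such as --platform=linux/amd64.
            image_parts = [p for p in parts[1:] if not p.startswith("--")]
            if image_parts:
                base = image_parts[0]
    return base


def uses_approved_base(dockerfile_text):
    """True when the final build stage starts from an approved golden image.

    Deliberately conservative: a final stage that refers to an earlier stage
    by alias (for example "FROM builder") is rejected by this simple check.
    """
    return final_base_image(dockerfile_text) in APPROVED_BASE_IMAGES


# The developers own everything after the FROM line.
developer_dockerfile = """\
FROM registry.example.com/golden/python:3.12
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY app/ /app
CMD ["python", "/app/main.py"]
"""

print(uses_approved_base(developer_dockerfile))
```

The point of a check like this is that the infrastructure team controls only the starting image, while the developers remain free to install their own stacks and tools on top of it.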
The lesson learned here is: Don’t automate infrastructure in a silo. Understand that the developers are your customers and that automation should expedite the development process, not hinder it.
Lesson #3 – We Don’t Need No Stinkin’ Admins
I have also seen this pattern a few times. A development team embraces continuous integration and starts delivering software at a much improved rate. The team is able to manage all of the infrastructure itself on AWS and feels the need to exclude the admins in the name of speed to market. There is usually one person on the team who is its “full stack engineer,” which means that team member knows enough about both Dev and Ops to be dangerous.
The end result is that this team is able to deliver software quickly but exposes its company to huge risks. This practice often leads to gaping security holes, lack of patching, and numerous operational challenges. These issues are masked at first but become glaringly obvious as the number of virtual machines increases or the number of teams expands beyond the initial proof-of-concept team. It is easy to get code out the door in an application silo, but it does not scale.
The lesson learned here is: Don’t exclude the operations and security teams. To deliver anything at scale, you will need the expertise of all areas of IT.
Lesson #4 – There Are No Best Practices for DevOps Organizational Structure
One of the first questions I am asked by clients is, “What should the new organizational structure look like?” We have all heard the unicorns vs. horses argument before. Companies like Etsy, Netflix, Twitter, and Facebook are often referred to as unicorns because they don’t represent what most enterprises look like. DevOps evangelists will argue that all unicorns started out as horses, which implies that before they perfected DevOps, they had to go through a long transformational journey. I see that side of the argument; however, most of these companies are still web companies and do not have anywhere near the complexity of the multinational, multipurpose companies that many enterprises are.
Why is this important? Because most enterprises should not try to mimic the organizational structures of web companies. Enterprises have many different teams, building many different products, using many different technologies, in many different locations across the globe. In fact, many enterprises have large infrastructure teams in different locations from the developers. These enterprises struggle with the concept of building “DevOps” teams comprising a mixture of skill sets from development, operations, and security.
In organizations like this, it is perfectly acceptable to have distinct application and infrastructure teams, as long as the infrastructure team acts as a service provider to the developers. What I have seen work in larger organizations is the formation of a dedicated team within the infrastructure team with the charter of enabling the development teams. This team focuses on things like self-service provisioning, security, building out monitoring and logging frameworks, and a variety of other services, so that the developers can focus on building business features.
The key is to pick a direction and start. Expect that whatever organizational structure you put in place on day one will change over time as you learn.
Summary
As an industry, we are spending far too much time talking and adding to the hype and not enough time doing. Transforming your company to a DevOps model is a long journey. The key is to get started and start learning. Pick a pilot project, one that is not mission critical. Cycle through a number of short iterations, and perform a postmortem after each one. Make any necessary adjustments to your processes and organizational structures, and repeat. The end goal should be to deliver high-quality software to the market faster and with high reliability. If your DevOps initiative is not contributing to that goal, then it is time to make some adjustments.