We Need to Kill DevOps

DevOps has been a truly transformational approach, allowing teams to develop, deploy and innovate on software quickly, easily and without the traditional friction (or occasional fist fight) between the guardians of the pearly gates, the operations team, and the incoming hordes of developers. It’s spearheaded a wave of innovation in the automation, monitoring and software build landscape, promoted the almost ubiquitous use of virtualisation and containerisation, and has generally been a breath of fresh air in an industry whose last major change in approach was the Agile methodology. I’m proud of the work I’ve done within the DevOps sphere, and I have spent a long time eagerly introducing people to its wonders.

And it’s time for it to die.

Before you reach for the pitchfork, or worse, the comment button, hear me out. DevOps as a methodology is crucial to modern technology delivery. We should not return to the walled gardens of yesteryear, where siloed technology departments peered at each other through a haze of distrust and mutual suspicion of their respective skills, and occasionally lobbed releases at each other. The methodology of DevOps is excellent and is, indeed, now ripe for additional methodologies such as MetOps to join it. However, the use of DevOps, either as a job title or as an adjunct to a job title, needs to end.

Aside from the ridiculous idea of naming a job title after a methodology (you do not, for instance, hire an ‘agile’), it also runs against the original ethos of the DevOps movement: that is to say, bringing operations and developers together. By making someone ‘a DevOps’ or a ‘DevOps Engineer’ you are still making them ‘the other’, albeit an other that is currently squatting in the same bullpen as the developers.

For a great many enterprise developers, nothing has changed. They still have a gatekeeper; it’s just a gatekeeper that happens to be sitting closer. In many enterprise settings, DevOps engineers have drifted into the niche of being mini operations departments. We’re still lobbing releases around; it’s just that the distance you have to throw them is greatly reduced.

This can and will create friction within a team, with the traditional ‘Developers vs Operations’ conflict happening as an internal team skirmish rather than writ large. It also has the unpleasant side effect of convincing CTOs that they no longer need an operations department. Surely, with a profusion of people with the word ‘Ops’ in their job title, we don’t need an extra ops department?

The problem is that DevOps engineers are generalists. By getting rid of operations departments you lose deep and focused infrastructure skills, and you fragment functions that should be centralised by scattering them across many disparate teams. Every team I’ve worked with that has gone down this route has ended up with massive issues due to having generalists working in a specialised area.

So given the fanfare around DevOps and automation, how have we arrived at this situation? Generally it’s due to corporate culture and inertia. At some point developers were branded as too unsafe to be allowed unfettered access to infrastructure without supervision. This is much the same argument as saying that no one should be allowed to write and release application code without supervision, which is a development issue that has long been solved by the modern SDLC (Software Development Life Cycle). Didn’t automation promise to eliminate this problem and usher in an era of developer freedom? Wasn’t that the whole point of the DevOps methodology?

Done correctly, automated infrastructure and software releases can be treated the same as any other code base. Infrastructure code is specced by a Business Analyst, audited by an architect, developed by a developer with unit tests and monitoring, checked by QC, and finally, when it’s been through the various test environments, released. When a new feature needs to be added to the infrastructure, the code should be branched, the feature developed, and a pull request issued, reviewed and accepted. There is no reason that an App Developer should not be able to branch the infrastructure code, add an improvement and issue a pull request. That pull request can then be reviewed by the infrastructure developers. If it’s a good change, it’s accepted and deployed. If it needs work, the infrastructure developers can add some feedback, and the original developer can then amend it and resubmit it. And if it’s rejected, the infrastructure developers have to publicly state why.
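
As a rough illustration of treating infrastructure like any other code base, here is a minimal Python sketch. The `Host` type, the `validate_hosts` function and the policy limits are all hypothetical, invented for this example rather than taken from any real tool; the point is only that an infrastructure definition can carry unit-testable rules that run before a pull request ever reaches a reviewer.

```python
from dataclasses import dataclass

# Hypothetical policy values, assumed for illustration only.
ALLOWED_ENVIRONMENTS = {"dev", "test", "prod"}
MAX_CPUS = 16


@dataclass(frozen=True)
class Host:
    """A declarative description of one host (hypothetical schema)."""
    name: str
    environment: str
    cpus: int


def validate_hosts(hosts):
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []
    seen = set()
    for host in hosts:
        if host.name in seen:
            problems.append(f"{host.name}: duplicate host name")
        seen.add(host.name)
        if host.environment not in ALLOWED_ENVIRONMENTS:
            problems.append(f"{host.name}: unknown environment '{host.environment}'")
        if not 1 <= host.cpus <= MAX_CPUS:
            problems.append(f"{host.name}: cpus must be between 1 and {MAX_CPUS}")
    return problems


# The kind of check a CI job could run against every pull request.
proposed = [
    Host("web-01", "prod", 4),
    Host("web-01", "prod", 4),        # duplicate name, should be flagged
    Host("batch-01", "staging", 64),  # unknown environment and too many CPUs
]
for problem in validate_hosts(proposed):
    print(problem)
```

Because the rules live in ordinary functions, an application developer can run them locally on a branch, and a rejected change comes with a stated reason rather than a silent refusal.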

So if that’s now possible then why do many developers still feel that they are waiting on something to be delivered from the ‘DevOps team’? Generally speaking I think it comes down to two things.

First and foremost, many projects that claim they follow the DevOps methodology and have embraced automation simply haven’t. Quite often they’ve adopted a DevOps Engineer into the team (or, far worse, created a ‘DevOps team’) as a fig leaf to show that they are hip, cool and down with the Silicon Valley crowd. Dig below the surface though, and you still see the same old processes. The result is that instead of an Operations department you have a multitude of mini ops teams running around, gatekeeping parts of the infrastructure and being given exclusive access to create them manually. I have yet to work in a business that can truthfully claim to have fully automated everything. At some point, be it DNS entries, network management or something else, something is not automated, and you have to work through layers of bureaucracy to amend some part of the infrastructure. Despite this not being the fault of the ‘local’ DevOps engineer, it is still perceived as operational blocking.

Secondly, you have impatient developers. All good developers want to deliver code that is functional, secure, and above all else, cool and fun. It can be amazingly frustrating when they are blocked from doing so because of an ‘unnecessary’ wait for new compute, storage, networking and so on.

However, the DevOps engineers are often shielding these developers from a lot of the corporate nonsense. The devs don’t realise that the inherent complexities of enterprise accounting, security processes or just plain politics often make it painful just getting to the point where you can ‘spin up a few boxes’ on the AWS console.

Another issue is that application developers can be left waiting for the infrastructure developers to finish building new features that are needed to progress. On occasion, this can be more complex than implementing a new application feature. It may involve challenging and simplifying an existing business process before being able to automate it. The actual code is normally relatively trivial; challenging decades of received wisdom about how host creation is budgeted, or how DNS entries for the company are allocated and managed, is less so.

The answer to this is to involve the application developers. If an app developer is blocked awaiting a new infrastructure feature which is in turn blocked by process, then bring the app developer along to the meetings where the process is discussed. That way they have an appreciation for the non-technical blockers that might need to be dealt with, and an understanding of the process as well.

So how do we collaborate more effectively? I’d suggest a good start is the top-down adoption of the following rules:

  1. There are only developers. There are no ‘DevOps’, ‘QCs’ or other non-developers. There are ‘Infrastructure Developers’, ‘Application Developers’, ‘Test Developers’ and so on.
  2. The only way *anything* happens to infrastructure is via released code, and all infrastructure code is available for development.

The first rule eliminates the us-vs-them syndrome. There is no separate role, just developers with different specialisations. They are managed the same way, follow the same processes as other developers and are, in every way, just members of the dev team. If a developer is blocked by the lack of an infrastructure feature then this is simply a case of bad sprint management; the feature should have been identified, scoped and delivered before it became a blocker, the same as any dependent application feature. This process change should be relatively straightforward.

The second change is more far-reaching. When we say the only way anything happens to infrastructure is via released code, that applies across the enterprise. Want a new top-level DNS entry? It’s an amendment to the code. Want a new storage device allocated? Code. Want a new user? Code. And so on and so forth.

There is no reason why, if these elements are automated, you can’t allow any developer in the company to make the code change and issue a pull request. Why not let projects add a new user? Since it’s coming in via a PR you have built-in auditing and the ability to amend or reject it. Why not allow top-level resources to be added via code? There are no insurmountable or good reasons why not. If nothing else, suggesting this approach will immediately flush out the lack of automation within the company, as well as driving collaboration with the operational teams that look after the infrastructure. By using code as the lingua franca of change, everyone has visibility of what’s happening and why.
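
To make the idea of code as the lingua franca a little more concrete, the sketch below models DNS entries and users as plain data and works out what a pull request would change. Everything in it is an assumption made for illustration: the state layout, the `plan_changes` helper and the example names and addresses are hypothetical, and a real setup would more likely lean on an existing infrastructure-as-code tool than on hand-rolled Python.

```python
def plan_changes(current, proposed):
    """Compare two desired-state dicts and list additions, removals and modifications.

    Both arguments map a resource kind (for example "dns" or "users") to a
    dict of resource name -> settings. Hypothetical layout, for illustration.
    """
    changes = []
    for kind in sorted(set(current) | set(proposed)):
        before = current.get(kind, {})
        after = proposed.get(kind, {})
        for name in sorted(set(before) | set(after)):
            if name not in before:
                changes.append(("add", kind, name))
            elif name not in after:
                changes.append(("remove", kind, name))
            elif before[name] != after[name]:
                changes.append(("modify", kind, name))
    return changes


# State on the main branch (example values only).
main_branch = {
    "dns": {"www.example.com": {"type": "A", "value": "192.0.2.10"}},
    "users": {"alice": {"groups": ["developers"]}},
}

# State on a feature branch raised by an application developer.
feature_branch = {
    "dns": {
        "www.example.com": {"type": "A", "value": "192.0.2.10"},
        "api.example.com": {"type": "A", "value": "192.0.2.20"},
    },
    "users": {
        "alice": {"groups": ["developers", "dba"]},
        "bob": {"groups": ["developers"]},
    },
}

# Print a short summary a reviewer could read alongside the pull request.
for action, kind, name in plan_changes(main_branch, feature_branch):
    print(f"{action:6} {kind:5} {name}")
```

A summary like this, attached to the pull request, gives reviewers and auditors the same view of the change as the person who proposed it.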

So embrace DevOps the methodology and kill DevOps the job title. We’re all developers now; we just differ in our specialisations. We all want the same thing: to be able to concentrate on writing cool software that does amazing things.


Featured Photo by Simson Petrol on Unsplash
Bee Photo by Eric Ward on Unsplash

4 Comments

sp00nfeeder
23rd May 2018 at 4:28 pm

For your 2 top down rules, have you successfully implemented them in a mid to large company say, 500-1000 employees? How long did it take? What lessons learned can you share?

Michael Duffy
24th May 2018 at 12:30 pm

Hi Sp00nfeeder;

Short answer: not both rules. I’ve seen rule 2 (only communicate infrastructure as code) succeed well in larger companies, and I think that, of the two, it’s the more immediately achievable. Changing job titles is always tricky at larger places, both because there are issues with HR functions (how do you map remuneration onto the new job title before it gains industry traction, and how do you advertise the position?) and because a lot of people’s self-identity is wrapped up in a job title. Changes of that nature across large numbers of teams would have to be handled carefully, with an awful lot of buy-in from an awful lot of people.

It’s something I’d love to see adopted at scale, but I think it’ll be led by smaller companies first and, as generally happens with these things, the culture will then make its way into large orgs. I’m not saying it couldn’t be done on a large scale, but it would take the kind of coordination, from top-level executives right down to the trenches, that you rarely see.

Excellent comment by the way!
