Stuck Stacks, the 7 year itch and the DevOps dilemma

Itching to try something new


7 years in, where do we go from here?

It is 7 years since OpenStack came into being: 7 short years made up of long days and late nights bringing it all together, to the point that global businesses such as Walmart, AT&T and Bloomberg are now running on OpenStack.

Whilst most OpenStack implementations started less than 7 years ago, if we are to believe some of the commentary surrounding OpenStack today, many OpenStack users are starting to itch. They feel the frustration of difficult operations and expensive upgrades that have left large holes in a business case based on expectations of reduced cost and accelerated application delivery. As in the classic movie “The Seven Year Itch”, businesses view newer, younger, less complicated technologies with enthusiasm. Whilst it is tempting and easy to point the finger at OpenStack for complicating both the technical and community aspects of the project, the real problem lies within organisations that deploy OpenStack and fall into the same old traps that have plagued software technologies, especially open source technologies, for many years: just because you have the freedom to modify the technology doesn’t mean it is a good idea to do so.

This is the dilemma of DevOps: empowering developers to take control of applications and infrastructure whilst ensuring that they don’t deviate so far from the core that they create operational headaches. Many organisations got it wrong with OpenStack and now find themselves stuck with a stack, faced with the choice of managing an expensive upgrade or trying to scratch their DevOps itch another way by looking for alternatives to OpenStack. 73% of OpenStack operators are using versions of OpenStack that are no longer supported upstream. This means that they are running software with known security vulnerabilities and issues. There are 172 known vulnerabilities for OpenStack alone; add in dependencies such as Python, libvirt, QEMU and the Linux kernel that many older versions of OpenStack are tied to, and the potential attack surface for hackers is significant. The issue is not OpenStack itself. New technologies that are not accompanied by a change in organisational approach will fail, because organisations will fall into the same traps as they did with OpenStack.

Beware of the open DevOps trap: the power to do whatever you want.

Most organisations accept that they need to take software seriously or else smart software competitors will eat their proverbial lunch. Agile development methodologies, moving infrastructure to cloud and microservices, and continuous integration/deployment have spearheaded enterprise attempts to become less dependent on legacy enterprise systems and more responsive to changing business dynamics. The challenge is that doing these things well requires three pillars, agile technology, processes and people, to come together with a clear understanding of where innovative engineering resources should be focused. Let me explain.

Many of the early OpenStack adopters, seeing a platform with huge momentum and great promise, determined that OpenStack was the right way to go for scalable, open infrastructure that could deliver the much-needed agility and cost efficiencies required for next-generation applications. The problem was that OpenStack wouldn’t easily fit into their organisation due to technical challenges with authentication, scale, network management and storage control. These were known limitations. Some of them could have been addressed internally by changing well-established processes or organisational structures, but others could be addressed by customising OpenStack to fit into the organisation. Choosing customisation was an easy yet fundamental mistake to make. With consulting companies, smart internal engineers and a lot of goodwill, the primary objectives of using a technology like OpenStack became lost as more and more was spent on creating bespoke clouds. There are areas in a business where innovation is needed. There are other areas where commodity and cost minimisation are the way to go.

When free software became expensive

Often the mere mention of free software is enough to prompt open sorcerers to lecture on the concepts of freedom versus free of charge. I get that free software doesn’t have to be free of charge; however, I don’t get why it needs to be so expensive to operate. The nature of OpenStack and many of the newer distributed software projects means that services are made up of many software components that operators have the freedom to modify or adapt as they see fit. Of the first wave of OpenStack operators between 2012 and 2015, many chose to use upstream packaged OpenStack as a base and then not only modify the config, but also add their own patch sets to handle custom integrations with their own environments. It may be as innocuous as a custom driver to integrate a network card, or something as major as pulling later versions of Keystone, the OpenStack authentication service, from a development branch in order to comply with an operator-specific corporate security mandate. Either way, the resulting cloud can no longer be operated as a standardised set of services. It has deviated from the core, requiring customised tooling and custom packaging to manage the customised environment. The expertise needed to deliver and operate this new customised cloud comes with a price tag that puts OpenStack in the expensive bracket. For many, high on the adrenalin of software freedom, this felt like the very empowerment their business needed. Now they could finally design bespoke infrastructure exactly the way they wanted it, in a way that would work for the business. Wrong. This strategy took a standardised platform, customised it so that it lost most of the cost benefits of being standardised, and forced operators to become upstream engineering experts.

Any strategy that expects to reconcile the two opposing requirements of minimising costs by using a commodity, standardised platform whilst also having that platform highly customised to a very specific set of needs is likely to fail.

Stop the bus, you’re going too fast!

Agile technology cannot make up for sluggish processes or lethargic people. Again, many organisations struggled to keep up with the rapid release cadence of OpenStack, which saw 2 releases per year and a support cycle of 12 months. It was typical for a company to start evaluating a release 3 months after it appeared, test its integration with the other technologies they depended upon, and finally be in a position to roll it out for new projects some 12 months after the official release, just as it was approaching end of life. The reasons given were based on organisational inertia and learned helplessness. Some attempted to slow the OpenStack release cycle to 1 release per year. These attempts were unsuccessful, as others felt strongly that slowing OpenStack innovation was the last thing needed. There were also calls to extend the support cycle so that upstream would support stable branches for 2 years or longer. Again, the community was unwilling to trade innovation for a cycle that required enterprises to change less. The net result was that many organisations decided to run versions they knew to be already in the EOL period. Today, whilst organisations are getting faster at adopting new OpenStack releases, many are slowed by the baggage of customisations. Most are simply unable or unwilling to upgrade, with only 27% of OpenStack users running supported releases of Newton or later. This is not surprising. Nearly all implementations will require downtime, which can cost $100,000 USD per hour.
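
The timeline described above can be worked through as simple arithmetic. Below is a minimal sketch in Python that uses only the figures quoted in this post (2 releases per year, a 12 month support cycle, rollout some 12 months after release). The function names and parameters are illustrative assumptions, not part of any OpenStack tooling.

```python
def months_of_support_left(support_cycle_months, rollout_delay_months):
    """Months of upstream support remaining on the day a release goes live.

    Both arguments are measured from the upstream release date.
    A result of zero or less means the release is already end of life.
    """
    return support_cycle_months - rollout_delay_months


def releases_behind(rollout_delay_months, releases_per_year=2):
    """Newer upstream releases that have shipped by rollout day."""
    months_between_releases = 12 // releases_per_year
    return rollout_delay_months // months_between_releases


# Figures from the text: 12 month support cycle, rollout ~12 months after release.
print(months_of_support_left(12, 12))  # support remaining at rollout
print(releases_behind(12))             # newer releases already available
```

With those inputs the first call returns 0 and the second returns 2: the release goes live with no supported life left, and two newer releases have already shipped. Passing 24 as the support cycle shows what the proposed 2 year stable branch would have bought.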

In our upcoming webinar, we show how to upgrade your OpenStack cloud easily and without downtime. Register now.

Scream if you want to go faster

Not all companies are the same though. We are starting to see a next wave of OpenStack users who get it and are managing business change alongside the adoption of new, agile technologies. Being able to control software with a model and deliver new business services faster is vital to their success. These companies know that upgrading software in place without downtime is all a part of normal operations. They also know that it is something that needs to be automated, just a regular part of day-to-day operations. If something breaks, it is the definitions in the model that need updating, not a quick tactical fix on an individual server or process. This is where real DevOps comes in and where real economies can be made. OpenStack can then not only compare well to public cloud services and on-premises VMware; at scale, with the right approach, workload and commercial model, it can offer significant advantages. Marc Andreessen was right, software will eat the world, and, just as the husband in “The Seven Year Itch” figured out the right thing to do, organisations seeking agile infrastructure are also doing so.
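
To make the idea of "fixing the model, not the server" concrete, here is a minimal sketch in Python of comparing a declared model against what is actually running. It is an illustration under stated assumptions, not how any particular tool works: `plan_changes`, the dictionary layout, and the sample data (including the `hand-patched-agent` entry) are all hypothetical.

```python
def plan_changes(model, observed):
    """Return the actions needed to bring observed state in line with the model.

    model and observed both map a service name to a version string.
    """
    actions = []
    for service, wanted in sorted(model.items()):
        current = observed.get(service)
        if current is None:
            actions.append(("deploy", service, wanted))
        elif current != wanted:
            actions.append(("upgrade", service, wanted))
    # Anything running that the model does not describe is drift.
    for service in sorted(set(observed) - set(model)):
        actions.append(("remove", service, observed[service]))
    return actions


# Hypothetical model and observed state, for illustration only.
model = {"keystone": "ocata", "nova": "ocata", "neutron": "ocata"}
observed = {"keystone": "newton", "nova": "ocata", "hand-patched-agent": "custom"}

for action in plan_changes(model, observed):
    print(action)
```

The point of the design is that the only thing an operator edits is `model`. A one-off fix made directly on a server shows up as drift in `observed` and is flagged for removal rather than silently becoming part of the cloud.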

Stay tuned

In the next part of this blog we’ll start to explore the approaches you can take to avoid being eaten by the software yourself and provide a robust set of agile services upon which you can innovate.

Upcoming webinar

If you are stuck on an old version of OpenStack and want to upgrade your OpenStack cloud easily and without downtime, register for our upcoming live webinar with a live demo of an upgrade from Newton to Ocata.

About the author

Cloud Product Manager focused on Ubuntu OpenStack. Previously at MySQL and Red Hat. Likes motorcycles and meeting people who do interesting stuff with Ubuntu and OpenStack.