Say you found yourself as an operations engineer in a serverless environment one day. What do you think your day-to-day would be like? Can you imagine a world where there are no hosts to patch or maintain? No configuration management code to write? What about no services that need to be right-sized to meet scaling demand? What if developers could easily deploy services on demand without having to ask you?
Serverless and Ops
Serverless architecture has become an intriguing topic in the world of cloud computing. As the operations community debates what serverless is and what it isn’t, and the disruption it will cause to many tech environments, it’s time we also define what it will mean for us as a community.
The ops community has seen a great deal of change over the past decade. We’ve outsourced hardware maintenance, automated mundane and time-consuming tasks, and paid for SaaS products to focus on our core engineering value. With that change, we’ve needed to redefine ourselves and our work. We now face that familiar challenge again with serverless computing emerging.
Rather than let others redefine our roles for us, we need to step up and take the lead on our role in this emerging landscape. Historically, we’ve seen our skills misunderstood and ultimately devalued when we’ve let other areas in engineering define these roles for us.
"You make (ops) better by naming it and claiming it for what it is, and helping everyone understand how their role relates to your operational objectives."
As an operations engineer who has experienced this skills shift in the past, I see the opportunity that we as ops professionals have to not only keep pace, but get ahead of the curve and define what that even means. As a community, we need to begin redefining our role in this paradigm shift now.
What does serverless even mean?
You’re probably no stranger to the term “serverless”. Admittedly, there are varying definitions floating around on the web right now. And for the most part, people agree that serverless is a poor name. Yes, there are still servers involved, but they’re ones you don’t manage and thus don’t spend time thinking about.
To me, the best definition of what serverless means originates from Paul Johnston. The definition is simple and concise, and it’s easily applicable as you evaluate a solution:
“A Serverless solution is one that costs you nothing to run if nobody is using it (excluding data storage)”
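One way to apply that definition when evaluating a solution is to ask what the compute bill looks like in a month with zero traffic. The sketch below is only an illustration of that test: the function names and both prices are hypothetical placeholders, not any provider's actual rates.

```python
# Hypothetical prices, chosen only to illustrate the comparison.
HYPOTHETICAL_PRICE_PER_REQUEST = 0.000002  # usage-based billing
HYPOTHETICAL_PRICE_PER_HOUR = 0.05         # always-on host billing
HOURS_IN_MONTH = 730


def usage_based_cost(requests):
    """Compute cost when you pay only for requests actually served."""
    return requests * HYPOTHETICAL_PRICE_PER_REQUEST


def always_on_cost(hours=HOURS_IN_MONTH):
    """Compute cost for a host that runs whether or not anyone uses it."""
    return hours * HYPOTHETICAL_PRICE_PER_HOUR


def passes_idle_test(idle_cost):
    """The definition's test: an idle month should cost nothing to run
    (data storage is excluded, as in the definition)."""
    return idle_cost == 0


# An idle month: zero requests served.
serverless_idle = usage_based_cost(0)
host_idle = always_on_cost()

print("usage-based passes:", passes_idle_test(serverless_idle))
print("always-on passes:", passes_idle_test(host_idle))
```

The always-on host fails the test because its cost depends on elapsed hours rather than on use, which is the distinction the definition draws.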
This technical change brings new advantages with it, particularly in value delivery. Serverless enables us to spend less time on how we build and run our solution, and more time on how it will deliver value. By spending less time designing, building, and operating the infrastructure, we can spend more time delivering value to our users in the form of features. Best of all, we can deliver this value quicker.
What is operations?
Operations is simultaneously three separate but related things: it is a team, a role, and a responsibility. When discussing operations we often confuse these definitions, and this leads us to argue about fundamentally different things. The confusion lies largely in the fact that, for the longest time, these three things were all combined. But over time, particularly with the adoption of DevOps, we have started to see the three become independent.
Operations typically used to refer to a particular team within an organization. The operations team was a collection of operations engineers whose responsibility was the stability and reliability of the services within the infrastructure. With the adoption of public cloud and DevOps, team, role, and responsibility began to separate from one another.
The operations engineer role was no longer necessarily a part of an operations team. It may have instead moved to a hybrid team composed of people from both software development and operations roles. A great example of these blended roles can be seen in an “infrastructure” or “platform” team. From these teams, some organizations took it a step further: they dissolved the role entirely and handed its responsibilities to software developers.
It's very easy for someone to argue that operations will go away if you only look at the team and disregard the role and responsibilities. If the team goes away, the role transfers to a different team. If the role goes away, the responsibilities transfer to someone else. Operations never goes away; it simply reappears as something different.
Why did I question my place?
Instead of telling you how serverless will be disruptive to operations and expecting you to take my word for it, I’ll tell you how I came to start questioning my role. There was a time when I worked outside of product engineering and I had a project to build services that were intended to run in a customer’s environment. I started by building a Python Flask service with instructions for deploying to a Linux host. But gaining greater adoption from the community didn’t go so smoothly. I moved on to packaging the service with Docker and Chef Habitat, but still wasn’t seeing enough adoption.
One day I thought to myself, “Wouldn’t it be cool if you could just click a link and the service deployed? Kind of like how SaaS companies use CloudFormation to set up integrations.”
That’s when it clicked: do away with the EC2 host and most of the environment variability, and then refactor my services to run on Lambda. Once I deployed using Lambda, I was hooked on it. I didn’t have to ask permission to spin up a new service. I didn’t have to bother adding some new roles in config management. And ongoing tasks, like host maintenance or right-sizing, were pretty much handled for me.
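To give a sense of what that refactor can look like, here is a minimal sketch of a Python Lambda handler. It is not the original service: it assumes the function sits behind an API Gateway proxy integration, and the `name` query parameter and greeting are hypothetical stand-ins for real logic. What would have been a Flask route plus a host to run it on becomes a single function that the platform invokes per request.

```python
import json


def lambda_handler(event, context):
    """Entry point that Lambda calls once per invocation."""
    # With an API Gateway proxy integration, query string values arrive
    # under "queryStringParameters", which is None when there are none.
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")  # hypothetical parameter

    # The proxy integration expects a status code and a string body.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```

Because the handler is a plain function, it can be exercised locally by calling it with a dictionary shaped like the event, for example `lambda_handler({"queryStringParameters": {"name": "ops"}}, None)`.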
After the initial excitement, I started to have an existential crisis. Many of my day-to-day responsibilities were now managed for me with serverless. If this became the norm, what would I do? What would my job even be? I saw my team, and even my role, possibly diminish.
And then I remembered I’ve been down this road before.
“If you don’t learn to code then you’re just another IT person.”
A former VPoE to me
That’s an actual quote said to me years ago. It is easily one of the lowest moments in my career; the second worst is a distant runner-up. There is no happy ending where I proved them wrong. I wasn’t interested in picking up Java; that’s not what I took that job to do. I left a month later.
This experience was also around the time that the community began talking about NoOps. I found myself unprepared for this change. I thought the skill transformation I had done to embrace DevOps (this was still the early days) would carry me for several more years. And it definitely did… But not at the company mentioned above.
The lesson I carry with me is that I don’t want to experience that moment again. And I think this is a relatable experience for many of us in the operations community.
Serverless Is Not NoOps
The distinct advantages of serverless are realized through the reduction of infrastructure work, typically the domain of an operations team and role. This has caused people to (wrongly) state serverless is the second coming of NoOps. (The first coming being the adoption of public cloud.) However, as stated before, operations never actually goes away. In the face of change, it just becomes different.
Yes, much of the work that an operations team or role was responsible for does go away. The responsibilities for managing host-level needs are gone. That starts with the building and managing of golden images. The work of ensuring that hosts are properly configured to run the services they’re intended to run is gone as well. The time spent reviewing and applying OS vendor patches is freed up too.
However, systems will still need deployment, monitoring, and scaling, as well as oversight to manage the accumulation of technical debt. Additionally, serverless doesn’t decrease the operational complexity of systems; it alters the complexity that needs to be managed, and some even argue it increases it. When monoliths were broken down into microservices, the complexity of a large codebase that was hard to scale was traded for scalability and the complexity of managing the dependencies between microservices. New complexity also arose that needed to be managed: issues like eventual consistency, latency, load balancing, and so on.
Similar results will happen as microservices are decomposed into nanoservices. How do you handle the complexity of managing multiple independent functions that may be needed to handle a single web request?
People who claim serverless heralds the death of operations take too narrow a view of what operations is. The responsibilities are still present and need to be handled.
How does ops fit with serverless?
That’s the question I set out to answer with ServerlessOps. And to be fair, I don’t fully know the answer to this question yet. I have some ideas, but whether they’re in fact correct or possible on a wide scale is still to be determined. Discovering the answer will require time to experiment and evaluate the results.
But the question is highly intriguing and personal to me. If the ops community doesn’t define its future, then our future will be defined for us. It will end up being defined by people with a narrow definition of operations, primarily people who think of operations as a team and not a role or responsibility. That’s why I’ve set out on this journey.
My goal is to answer what the place of operations is and to ensure that we determine our place in what I see as the coming technical future.