This is a part of our Serverless DevOps series exploring the future role of operations when supporting a serverless infrastructure. Read the entire series to learn more about how the role of operations changes and the future work involved.
Serverless, when using a public cloud provider, means offloading certain security tasks lower in the application stack to the provider, which in turn frees you up for tasks higher in the stack. When dealing with security, we usually try to work from the bottom up when securing an application stack, and rightly so. You can't secure an application properly without its underlying layers being secured properly. There's often so much work at the lower layers, however, that we don't have the time to secure higher layers adequately.
What serverless provides us with is an opportunity to offload much of the work at the lower end of the stack and, in turn, free up time to concentrate on security issues higher in the stack. The less you have to worry about, the more you can focus on what's left. This new ability to focus is good, but it will also put many of you into unfamiliar territory. We understand things like OS patching and network controls because we've been doing them for years. But how many of us understand application security?
Let's discuss how security changes with serverless.
The Security Responsibilities You Don't Have
If you're familiar with the AWS shared responsibility model for security, then you're familiar with ceding responsibility, and control, over different layers of security to your public cloud provider. For example, physical security and hypervisor patching are the responsibility of the public cloud provider.
That's a good thing! Even in the most sophisticated companies, there is usually more work than a security team can handle. (That assumes you even have a security team; many organizations simply fold security into the operations role.) Cloud providers like AWS have a large, dedicated security team.
With serverless, the cloud provider assumes even more responsibility. Gone is the time spent worrying about host OS patching. Take the recent Meltdown and Spectre attacks: AWS Lambda required no customer intervention. AWS assumed responsibility for testing and rolling out patches.
Compare that with stories of organizations tracking patch announcements, testing, rolling out patches (and rolling back in some cases), and the overhead incurred as a result of the disclosure. A month after disclosure, a third of companies had patched only 25% of their hosts. Moving more work to the large, dedicated security teams who support the public cloud providers will enhance the security posture of most organizations.
The shared responsibility model lets you spend less time on these areas and more time focusing on other security layers.
The Security Responsibilities You Do Have
So what are the responsibilities of an operations person with regard to the security of serverless systems?
Preparing for the Unplanned
Your first responsibility with security is, more or less, a reliability responsibility. Any infrastructure change may result in system reliability issues, and the same goes for security updates.
AWS makes regular changes to their infrastructure; some announced, but most not. How do you handle those changes today? If you're coming from the on-premises world, remember, you can't ask AWS to stop patching because your system broke.
To start, you have alerts for system instability and performance degradation. When AWS announces security updates to the EC2 hypervisor layer, you watch dashboards more closely during the process.
"Team, someone has disclosed a vulnerability with a website and logo. Keep a closer eye on alerts and the reliability of your services. AWS has released a bulletin that they will be rolling out patches to their infrastructure."
If a system is critical to your organization, then you should be investigating a multi-region architecture and the ability to fail over or redirect more traffic to a different region. None of this changes with serverless, but it's worth reiterating: if you're using a public cloud provider, then the impact of security changes and updates you don't control is something you already deal with and handle.
Cloud Infrastructure Access and Auditing
As for securing serverless infrastructure and systems, let's discuss a few areas. This isn't an exhaustive list of what to secure but it provides a good overview of where to start.
Start with the most basic of low-hanging fruit. Are there S3 buckets that should not be open? If you're going serverless, you'll find yourself hosting frontend applications and static assets in S3, which means determining which buckets should and should not be open.
"Glad I caught that open S3 bucket or else we'd be this week's winner of the Last Week in AWS S3 Bucket Negligence Award."
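A quick audit like that can be automated. Below is a minimal sketch that flags buckets whose ACLs grant access to everyone; the ACL dicts mirror the shape returned by boto3's `get_bucket_acl`, but the bucket names and grants here are fabricated for illustration.

```python
# Sketch: flag S3 buckets whose ACLs grant access to a public group.
# The ACL dicts below follow the shape of boto3's s3.get_bucket_acl();
# bucket names and grants are made up for this example.

PUBLIC_GRANTEES = {
    "http://acs.amazonaws.com/groups/global/AllUsers",
    "http://acs.amazonaws.com/groups/global/AuthenticatedUsers",
}

def public_buckets(bucket_acls):
    """Return the names of buckets with a grant to a public group."""
    flagged = []
    for name, acl in bucket_acls.items():
        for grant in acl.get("Grants", []):
            if grant.get("Grantee", {}).get("URI") in PUBLIC_GRANTEES:
                flagged.append(name)
                break
    return flagged

acls = {
    "static-assets": {"Grants": [{
        "Grantee": {"Type": "Group",
                    "URI": "http://acs.amazonaws.com/groups/global/AllUsers"},
        "Permission": "READ"}]},
    "customer-exports": {"Grants": [{
        "Grantee": {"Type": "CanonicalUser", "ID": "abc123"},
        "Permission": "FULL_CONTROL"}]},
}

print(public_buckets(acls))  # ['static-assets']
```

In a real audit you'd feed this from `list_buckets` plus `get_bucket_acl` calls, and also check bucket policies and Public Access Block settings, which ACLs alone don't capture.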
Similar to S3 bucket permissions, spend more time auditing IAM roles and policies for least privileged access. It can be tempting for a developer to write IAM roles with widely permissive access. But if a Lambda function only needs to read from DynamoDB, then make sure it doesn't have permission to write to it.
Make sure the function can only read from the one table it's intended to read from too! This may sound obvious but the current state of cloud security and mishaps that occur make this all worth repeating.
Understand that not everyone has the same level of knowledge or sophistication when it comes to permissions and access. A developer may not know the difference between DynamoDB GetItem and BatchGetItem operations, but they know they can write dynamodb:* and be unblocked. The developer may not know how to get the DynamoDB table name from their CloudFormation stack, but they know they can use a wildcard and get unblocked. As a member of the team, you should be finding these issues, correcting them, and educating your team on best practices.
"I see you have a system that writes to a DynamoDB table. I went over the IAM role for it and it's too wide. Do you have time for me to show you how I fixed it and what you can do in the future?"
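To make that concrete, here's a sketch of what a least-privilege policy for a read-only Lambda function might look like, along with a simple check for wildcards. The account ID, region, and table name are placeholders, and the wildcard check is a deliberately simple illustration, not a full policy analyzer.

```python
# Sketch: a least-privilege IAM policy for a Lambda function that only
# reads from a single DynamoDB table. The ARN below is a placeholder.

READ_ONLY_POLICY = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["dynamodb:GetItem", "dynamodb:BatchGetItem",
                   "dynamodb:Query"],
        "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/orders",
    }],
}

def has_wildcards(policy):
    """Flag statements that use '*' in actions or resources."""
    for stmt in policy["Statement"]:
        actions = stmt["Action"] if isinstance(stmt["Action"], list) else [stmt["Action"]]
        resources = stmt["Resource"] if isinstance(stmt["Resource"], list) else [stmt["Resource"]]
        if any("*" in a for a in actions) or any(r == "*" for r in resources):
            return True
    return False

print(has_wildcards(READ_ONLY_POLICY))  # False
```

The same check run against a `dynamodb:*` policy would return `True`, which is exactly the conversation starter you want when reviewing a teammate's role.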
Finally, ensure that event and action auditing services, for example AWS CloudTrail, are set up. That will give you visibility into what's occurring in the environment. You want to know about repeated AccessDenied failures and activity in an unexpected AWS region.
"I don't know how, but someone is mining bitcoin in our account over in ap-southeast-2..."
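Catching that kind of activity can start with a simple scan over CloudTrail events. The sketch below works on fabricated, CloudTrail-shaped records; in practice you'd pull events from CloudTrail logs or `lookup_events`, and the "expected regions" set is an assumption about your own footprint.

```python
# Sketch: scan CloudTrail-style event records for repeated AccessDenied
# errors and for activity in unexpected regions. The records below are
# fabricated for illustration.

from collections import Counter

EXPECTED_REGIONS = {"us-east-1", "us-west-2"}  # assumption: where you operate

def suspicious_activity(events, denied_threshold=3):
    """Return (identities with repeated AccessDenied, unexpected regions)."""
    denied = Counter(e["userIdentity"] for e in events
                     if e.get("errorCode") == "AccessDenied")
    repeat_offenders = [u for u, n in denied.items() if n >= denied_threshold]
    odd_regions = sorted({e["awsRegion"] for e in events} - EXPECTED_REGIONS)
    return repeat_offenders, odd_regions

events = [
    {"userIdentity": "ci-deployer", "awsRegion": "us-east-1", "errorCode": "AccessDenied"},
    {"userIdentity": "ci-deployer", "awsRegion": "us-east-1", "errorCode": "AccessDenied"},
    {"userIdentity": "ci-deployer", "awsRegion": "us-east-1", "errorCode": "AccessDenied"},
    {"userIdentity": "unknown-key", "awsRegion": "ap-southeast-2"},
]

offenders, regions = suspicious_activity(events)
print(offenders, regions)  # ['ci-deployer'] ['ap-southeast-2']
```

A real setup would wire this into CloudWatch alarms or an events pipeline rather than a batch scan, but the signals are the same: repeated denials and regions you never intended to use.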
There's a variety of tools and products already available to help you audit your cloud infrastructure for best practices.
Application Security
This area will probably be the newest to most of us. It absolutely is for me. Topics like runtime application security, static analysis, and other AppSec areas aren't ones I've spent much time with. But there are still several areas of application security we can pick up quickly.
Start with application dependencies. You're so busy patching the host OS today; how much time does that leave you to ensure your application's third-party dependencies aren't vulnerable? With a public cloud provider patching the application runtime environment, many organizations can finally move up the stack. Just because you didn't write it doesn't mean you don't have to secure it.
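Dedicated tools like pip-audit or npm audit do this by querying a vulnerability database; the sketch below shows the core idea against a made-up advisory list. Both the advisory data and the pinned versions here are fabricated for illustration.

```python
# Sketch: check pinned dependencies against a known-vulnerable list.
# The ADVISORIES data is fabricated; real tools query a vulnerability
# database (e.g. pip-audit against the Python Packaging Advisory DB).

ADVISORIES = {  # package -> versions with known issues (made up)
    "requests": {"2.5.0"},
    "pyyaml": {"3.12"},
}

def vulnerable_pins(requirements):
    """Flag lines like 'requests==2.5.0' that match an advisory."""
    hits = []
    for line in requirements:
        name, _, version = line.partition("==")
        if version in ADVISORIES.get(name.lower(), set()):
            hits.append(line)
    return hits

print(vulnerable_pins(["requests==2.5.0", "boto3==1.9.0"]))  # ['requests==2.5.0']
```

The point isn't the tooling, it's the habit: dependency audits belong in CI, run on every build, now that host patching is no longer eating that time.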
Next, who or what is talking to your API endpoints? If you're using AWS API Gateway, you will probably have exposed public HTTP endpoints. (You can put them into a VPC, but then you've introduced the complexity of VPC management and increased latency during AWS Lambda cold starts.) You'll need to ensure that functions invoked by API Gateway have proper authentication and authorization controls. You as the operations person will need to ensure that your chosen auth provider, e.g. AWS Cognito or an identity SaaS provider, is properly integrated into API Gateway and your function code.
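One common integration point is a Lambda token authorizer sitting in front of API Gateway. Here's a minimal sketch of its shape; `check_token` is a stand-in, since a real authorizer would verify a signed token against your provider (Cognito, or whichever identity service you use).

```python
# Sketch: the skeleton of an API Gateway Lambda (token) authorizer.
# check_token is a stand-in; real code would validate a signed JWT
# against your auth provider.

VALID_TOKENS = {"demo-token"}  # placeholder for real token verification

def check_token(token):
    return token in VALID_TOKENS

def handler(event, context):
    effect = "Allow" if check_token(event.get("authorizationToken", "")) else "Deny"
    # Authorizers return an IAM policy telling API Gateway whether to
    # invoke the backing function.
    return {
        "principalId": "user",
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [{
                "Action": "execute-api:Invoke",
                "Effect": effect,
                "Resource": event.get("methodArn", "*"),
            }],
        },
    }

resp = handler({"authorizationToken": "demo-token",
                "methodArn": "arn:aws:execute-api:example"}, None)
print(resp["policyDocument"]["Statement"][0]["Effect"])  # Allow
```

The operational work is less about writing this code and more about verifying it exists on every public endpoint, and that "Deny" is the default path.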
Finally, another important attack vector will be "event injection". You're probably familiar with SQL injection attacks, a form of data injection, where unsanitized inputs lead to malicious or unexpected database activity.
In an event driven architecture, every function will be triggered by some sort of event input. You can't assume that the event's data is safe. You might believe that an earlier function in the processing chain has already validated its input and the data passed to your function is safe. You'd be wrong.
First, you're coupling your function to the behavior of another upstream function, which can lead to issues when the upstream function changes. Second, you're assuming that there is only one pathway to your function, which may not be true. This old issue of data injection is still present, and the threat surface has arguably increased due to the proliferation of small, event-triggered functions in your environment.
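The defense is the same as it's always been: validate at every function boundary, never on trust. Here's a minimal sketch, using a hypothetical order-event shape, of a function that rejects anything it wasn't expecting before doing any work.

```python
# Sketch: validate event input at every function boundary instead of
# trusting upstream functions. The order-event schema is hypothetical.

def validate_order_event(event):
    """Reject events that don't match the expected shape."""
    order_id = event.get("order_id")
    if not isinstance(order_id, str) or not order_id.isalnum():
        raise ValueError("bad order_id")
    qty = event.get("quantity")
    if not isinstance(qty, int) or not (0 < qty <= 1000):
        raise ValueError("bad quantity")
    return {"order_id": order_id, "quantity": qty}

def handler(event, context):
    order = validate_order_event(event)  # validate first, every time
    return f"processing {order['quantity']}x {order['order_id']}"

print(handler({"order_id": "A123", "quantity": 2}, None))  # processing 2x A123
```

Each function repeating this cheap check is what keeps a malformed or malicious event from propagating through the whole chain.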
If you're new to application security, as I am, then have a look at the Serverless Security Top 10. This gives a good primer on the threat and attack vectors of serverless architecture.
The Future Role Of Security
I want to close by pointing out that security is just like operations: it can be a responsibility, a role, a team, or some permutation of those. The security community will also have to redefine its role under serverless.
I've made the argument previously that with serverless infrastructure, a standalone operations team no longer makes sense. A monolithic operations team won't be as effective because it will be gradually minimized in the service delivery process. Ops teams should be dissolved and their members redistributed to cross-functional feature development teams. The same issue of being bypassed in the service delivery process will happen to security professionals. Arguably, it already happens to them in many organizations.
I'm really not sure where security professionals belong in the future. But this is an important topic for that community to address, just as I've been addressing the role of ops.
Read The Serverless DevOps Book!
But wait, there's more! We've also released the Serverless DevOps series as a free downloadable book. This comprehensive 80-page book describes the future of operations as more organizations go serverless.
Whether you're an individual operations engineer or managing an operations team, this book is meant for you. Get a copy, no form required.