26 August 2022

The Importance of Securing Serverless Infrastructure

In this article, I will talk about the possible security risks of serverless computing and how to mitigate them.

Introduction

More and more companies are adopting serverless computing. It removes the burden of maintaining servers, runtimes, scaling, patching, fault tolerance, and other operational concerns. Removing this burden simplifies the development process and lets developers build and deploy applications more efficiently. Examples of serverless computing in AWS include AWS Lambda and AWS Fargate, which runs containers for Amazon Elastic Container Service (ECS) and Amazon Elastic Kubernetes Service (EKS).

Shared Responsibility Model

An important part of using the compute services of any public cloud is understanding the shared responsibility model. The shared responsibility model describes who is accountable, the service provider or the customer, for the security and compliance of each layer of the cloud service.

More simply, a cloud provider isn’t accountable for everything you do in its cloud. For example, it doesn’t make sense for a cloud provider to be accountable for a vulnerability in a software package that you installed on a virtual machine. It’s important to note that the shared responsibility model is not the same for every service. As we move up the stack of abstractions, from IaaS to PaaS or SaaS, the model shifts. Returning to the previous example: when you use Amazon ECS on Fargate, it doesn’t make sense to hold you accountable for a vulnerability in packages installed on the virtual machine that hosts your containers under the hood.

Let’s look at the shared responsibility model in action and compare Amazon ECS on EC2 with the Fargate variant.

In the diagram below, you can see that AWS is responsible for the physical hardware, the foundational services, and the ECS control plane in both variants. But note that the underlying virtual machine that hosts your containers on ECS/EC2 is the full responsibility of the customer, even though AWS provides one-click solutions to manage it. This means that if anything happens, you are responsible! That is why you should ensure that the VM is well protected and continuously verify that it stays that way. This is in contrast to the Fargate variant, where the worker node is the complete responsibility of AWS.

[Diagram: ECS on EC2 compared with ECS on Fargate. From The ECS Shared Responsibility Model]

Simple Vulnerable Lambda Function

Let’s apply this theory and start with the fun part. The Lambda function below takes a name from a query parameter and renders it in a “Hello World” template using Jinja2.

Can you spot the issue?

from jinja2 import Template


def lambda_handler(event, context):
    name = event["queryStringParameters"]["name"]

    body = Template(f"Hello, {name}!").render()

    return {"statusCode": 200, "headers": {"Content-Type": "text/html"}, "body": body}

The issue is in the line where I am rendering the Hello template.

body = Template(f"Hello, {name}!").render()

You shouldn’t substitute variables into the template source. Instead, pass name as a parameter to the render function and reference it in the template with double curly braces (mustache-style syntax).

body = Template("Hello, {{ name }}!").render(name=name)

You may have noticed this exploit is almost the same as SQL injection, which can be mitigated by parameterizing the SQL query and passing the variable as a parameter instead of using the variable directly. Because the user input ends up in a template that is evaluated on the server, the exploit is called Server-Side Template Injection (SSTI). Let’s try exploiting this vulnerability 😉

First, we use a valid parameter, like “world” to get an expected outcome:

/hello?name=world

Lambda normal output

But if we act like a bad student and pass {{7*7}}, we get “Hello, 49!”:

/hello?name={{7*7}}

Lambda calc output

Jinja2 interpreted our input instead of printing it. 😱

To turn this into Remote Code Execution (RCE), you need to chain Python’s “magic” (double-underscore) attributes…
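The difference between the two variants can be reproduced locally, without deploying anything. The sketch below assumes Jinja2 is installed. The names vulnerable_handler, fixed_handler, and make_event are made up for this illustration, and the event only mimics the one field the function reads.

```python
from jinja2 import Template


def make_event(name):
    # Hypothetical helper: mimics only the part of the event the handler reads.
    return {"queryStringParameters": {"name": name}}


def vulnerable_handler(event, context):
    name = event["queryStringParameters"]["name"]
    # User input becomes part of the template source, so Jinja2 evaluates it.
    body = Template(f"Hello, {name}!").render()
    return {"statusCode": 200, "headers": {"Content-Type": "text/html"}, "body": body}


def fixed_handler(event, context):
    name = event["queryStringParameters"]["name"]
    # The template source is constant; user input is passed in as data only.
    body = Template("Hello, {{ name }}!").render(name=name)
    return {"statusCode": 200, "headers": {"Content-Type": "text/html"}, "body": body}


if __name__ == "__main__":
    payload = "{{7*7}}"
    print(vulnerable_handler(make_event(payload), None)["body"])  # Hello, 49!
    print(fixed_handler(make_event(payload), None)["body"])       # Hello, {{7*7}}!
```

In the fixed handler the payload comes back as literal text, because it never becomes part of the template source.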

/hello?name={{"foo".__class__.__base__.__subclasses__()[187].__init__.__globals__["sys"].modules["os"].popen("printenv").read()}}

In short, you start from a Python string, take its class (str) and then that class’s base class (object), and list all subclasses of object that are currently loaded. Then you take the class at index 187 (this number can differ per Python version and environment, and other classes work as well). That class is chosen because the globals of its __init__ method contain the sys module, and sys.modules gives access to the os module. With the os module, a lot is possible: for example, you can interact with the filesystem or spawn processes. Below, it’s used to execute the printenv command and print the output.
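To see why the index is not stable, the following sketch (standard library only) lists which subclasses of object expose a sys entry through __init__.__globals__ in the current interpreter. The function name classes_exposing_sys is made up for this illustration. The result depends on the Python version and on which modules have been imported, so no fixed output is shown.

```python
def classes_exposing_sys():
    """Return (index, class name) pairs reachable the same way as in the payload."""
    # "foo".__class__ is str, and str.__base__ is object.
    subclasses = "foo".__class__.__base__.__subclasses__()
    hits = []
    for index, cls in enumerate(subclasses):
        # Only __init__ methods written in Python carry a __globals__ dict,
        # which is the namespace of the module that defines them.
        module_globals = getattr(getattr(cls, "__init__", None), "__globals__", None)
        if isinstance(module_globals, dict) and "sys" in module_globals:
            hits.append((index, cls.__name__))
    return hits


if __name__ == "__main__":
    for index, name in classes_exposing_sys():
        print(index, name)
```

Running this on two different runtimes will usually print different indexes, which is why a payload tuned for one environment may need adjusting on another.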

Lambda exploit output

Impact

Next, let’s figure out the possible impact of this SSTI.

First, as shown in the screenshot above, the environment variables always include credentials. This is because the temporary credentials of the Lambda execution role (access key ID, secret access key, and session token) are injected as environment variables into the function runtime. If the Lambda function needs access to Amazon DynamoDB and you therefore grant the Lambda role full access to DynamoDB, a malicious user can use those credentials to gain full access to your data store.

Secondly, using this method, an attacker could read the source code of interpreted languages like Python and Node.js or decompile code from languages like Java and Go. With that source code, an attacker can go on to discover secrets, outdated dependencies, and software bugs. In the same way, an attacker can steal company information, attack other applications, or even start an advanced spear-phishing campaign.

Lastly, if the Lambda function is connected to a VPC, an attacker can attempt to scan the internal VPC, peered VPCs, VPCs routed through a Transit Gateway, or the on-premises network connected through AWS Direct Connect. It’s even possible to scan across a Site-to-Site (S2S) VPN if the access control lists aren’t strict enough. In this way, an attacker can look for other vulnerabilities and move laterally through the network.
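As a rough illustration of what such a dump gives away, the sketch below checks an environment mapping for the credential variables that Lambda sets for the execution role (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN). The function name and the sample values are invented for the example.

```python
import os

# Variables through which the execution role's temporary credentials are exposed.
CREDENTIAL_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_SESSION_TOKEN")


def exposed_credentials(env=None):
    """Return the names of credential variables present in the given environment."""
    env = os.environ if env is None else env
    return [name for name in CREDENTIAL_VARS if env.get(name)]


if __name__ == "__main__":
    # Made-up sample values; real values are issued by AWS at runtime.
    sample = {
        "AWS_ACCESS_KEY_ID": "ASIAEXAMPLE",
        "AWS_SECRET_ACCESS_KEY": "example-secret",
        "AWS_SESSION_TOKEN": "example-token",
        "PATH": "/usr/bin",
    }
    print(exposed_credentials(sample))
```

Anything that can read the process environment, including the printenv call from the exploit, sees the same three values.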

Countermeasures

To find countermeasures, we need to look at the shared responsibility model of AWS Lambda.

[Diagram: AWS Lambda. From The Shared Lambda Responsibility Model]

The model states that you, as a customer, are responsible for:

Customer Function Code and Libraries
Resource Configuration
Identity & Access Management

Based on this information, we will go through the countermeasures for each topic.

Customer Function Code and Libraries

To protect a function, you can use static application security testing (SAST). SAST tools scan your code locally, in Git, or in a build pipeline. The tool searches for patterns of prevalent bugs and notifies you when it finds them in your code. The advantages of this approach are its ease of implementation and its performance. The downside is that a bug is only found if a rule has been defined for it.

Another possibility is to add runtime security to your AWS Lambda function. An agent runs inside the function’s runtime and actively monitors everything the application does. Most of these tools will detect or block a remote code execution exploit, and they are likely to have features such as detecting or blocking malicious file uploads and payloads.
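To make the idea of a rule concrete, here is a deliberately naive sketch of one, written with Python’s standard ast module. It flags calls to anything named Template whose first argument is an f-string, which is the pattern from the vulnerable function above. The function name find_fstring_templates is made up, and a real SAST tool would use far more robust rules than this.

```python
import ast


def find_fstring_templates(source):
    """Return the line numbers where Template(...) receives an f-string."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Matches both Template(...) and jinja2.Template(...).
        name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
        # ast.JoinedStr is the syntax-tree node for an f-string.
        if name == "Template" and node.args and isinstance(node.args[0], ast.JoinedStr):
            findings.append(node.lineno)
    return sorted(findings)


SAMPLE = '''from jinja2 import Template
body = Template(f"Hello, {name}!").render()
body = Template("Hello, {{ name }}!").render(name=name)
'''

if __name__ == "__main__":
    print(find_fstring_templates(SAMPLE))  # [2]
```

The rule also shows the limitation mentioned above: building the same string with concatenation or str.format would slip past it until someone writes a rule for that pattern too.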

Resource Configuration

Another critical topic is securing the Lambda function configuration. More precisely, this means securing the networking options: which traffic is allowed to reach your function, and whether a VPC integration lets the function reach other servers and services over a private network.

If a function is publicly exposed, limiting the IP addresses that are allowed to reach it will make it harder to abuse. If function access is limited to a VPC, it is safer to use VPC endpoints and disable all public traffic to the function. This makes it impossible to exploit the function unless another entry point into the VPC is discovered.

For the egress traffic of the Lambda function, I advise making the access lists as strict as possible so that only the necessary traffic is allowed.

Identity & Access Management

The last and most important topic is IAM. You have two options to configure IAM: the resource policy and the execution role.

A resource policy determines which principals are allowed to invoke the Lambda function. If you define the AWS principal as a wildcard, any identity from any AWS account can invoke the function. So again, make the resource policy as strict as possible, and don’t hesitate to use conditions.

The other policy you can define belongs to the Lambda execution role. A function assumes this role to access other AWS services, such as DynamoDB, and other functions. Try to make this policy as specific as possible and avoid wildcards.
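A simple check for this can be sketched in a few lines. The function below walks a resource policy document (as a Python dict) and reports Allow statements that use a wildcard principal without any condition. The function name, the Sid values, the account ID 123456789012, and the ARNs are placeholders chosen for the example.

```python
def wildcard_principal_statements(policy):
    """Return the Sids of Allow statements with a wildcard principal and no condition."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be given as an object
        statements = [statements]
    flagged = []
    for statement in statements:
        principal = statement.get("Principal")
        aws_principal = principal.get("AWS") if isinstance(principal, dict) else principal
        is_wildcard = aws_principal == "*" or (
            isinstance(aws_principal, list) and "*" in aws_principal
        )
        # A wildcard combined with a condition can be intentional, so only
        # unconditional wildcards are reported here.
        if statement.get("Effect") == "Allow" and is_wildcard and "Condition" not in statement:
            flagged.append(statement.get("Sid", "<no Sid>"))
    return flagged


FUNCTION_ARN = "arn:aws:lambda:eu-west-1:123456789012:function:hello"  # placeholder

OPEN_POLICY = {
    "Version": "2012-10-17",
    "Statement": [{"Sid": "Open", "Effect": "Allow", "Principal": "*",
                   "Action": "lambda:InvokeFunction", "Resource": FUNCTION_ARN}],
}

SCOPED_POLICY = {
    "Version": "2012-10-17",
    "Statement": [{"Sid": "ApiOnly", "Effect": "Allow",
                   "Principal": {"Service": "apigateway.amazonaws.com"},
                   "Action": "lambda:InvokeFunction", "Resource": FUNCTION_ARN,
                   "Condition": {"ArnLike": {
                       "AWS:SourceArn": "arn:aws:execute-api:eu-west-1:123456789012:example/*"}}}],
}

if __name__ == "__main__":
    print(wildcard_principal_statements(OPEN_POLICY))    # ['Open']
    print(wildcard_principal_statements(SCOPED_POLICY))  # []
```

A check like this fits well in a build pipeline, next to the SAST scan of the function code.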
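A narrowly scoped execution-role policy could be generated along these lines. This is a minimal sketch: the function name scoped_dynamodb_policy, the default actions, and the table ARN are assumptions for the example, and the right set of actions depends on what your function actually does.

```python
import json


def scoped_dynamodb_policy(table_arn, actions=("dynamodb:GetItem", "dynamodb:Query")):
    """Build a policy document limited to named actions on a single table."""
    if "*" in table_arn or any("*" in action for action in actions):
        raise ValueError("wildcards defeat the purpose of a scoped policy")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": list(actions), "Resource": table_arn}
        ],
    }


if __name__ == "__main__":
    # Placeholder ARN; replace with the table your function really needs.
    arn = "arn:aws:dynamodb:eu-west-1:123456789012:table/Orders"
    print(json.dumps(scoped_dynamodb_policy(arn), indent=2))
```

With a policy like this, credentials stolen through the SSTI above would only allow reads on one table instead of full access to the data store.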

Conclusion

As shown in this post, exploiting vulnerable Lambda functions or other serverless services is possible and can have a significant impact on organizations. Be aware of what you are responsible for by checking the shared responsibility model, and try to cover each vector with at least one, but preferably multiple, security measures. The more security layers you use, the better protected you are and the less damage can be done.