Design Considerations for Secure Cloud Deployment
Whether you are migrating an on-premise deployment to a cloud provider, tasked with deploying a new cloud-hosted application, or looking to improve existing processes for your cloud deployment, this article provides guidance on how to design a secure cloud deployment.
First, we will describe the differences between configuration management providers (e.g., Chef and Ansible for Infrastructure as Code, or IAC) and deployment providers (e.g., Terraform and CloudFormation). Then we’ll look at how to simplify your attack surface by replacing back-end code with generic cloud services and migrating processing pipelines from application code to IAC. We’ll compare options for service orchestrators and provide resources for cloud and cluster visualization. Finally, we’ll discuss cloud services and tools to facilitate the storage of application secrets for your cloud deployment.
Following this how-to guide will help reduce your attack surface, simplify security maintenance, and make it easier to catch mistakes in future implementations. Even if these suggestions aren’t exactly the right fit for your product, they can hopefully still provide new design inspiration.
INFRASTRUCTURE AS CODE (IAC)
Deployment Providers Vs. Configuration Providers
When choosing how to record and rebuild your existing setup, there are two main approaches: deployment providers (e.g., Terraform and CloudFormation) and configuration providers (e.g., Chef and Ansible).
With deployment providers, you have guarantees about what is running on every machine; however, you may have less expressive power. The configuration files are effectively blueprints to describe and build your environment. Changes to configuration are executed by rebuilding a given node.
- Numerous static analyzers can be included in the CI/CD pipeline; it’s highly auditable
- No additional management nodes in your network
- Less opportunity for configuration drift
- Focus is on cloud-hosted deployments. Although there is no direct hardware support, providers are available for Citrix XenServer and Linux KVM (using libvirt) for on-premise deployments.
With configuration providers, some tools require each system to run a dedicated agent that executes commands delegated by a management node, while others leverage existing SSH services. Although this approach offers substantial flexibility, it can also introduce risk, as we will explore below.
- Can be used on a deployment that is a mix of on-premise and cloud-based
- Changes can be deployed to running instances rather than having to rebuild/restart instances
- Requires management nodes (note: agentless tools such as Ansible, or SaltStack in its Salt SSH mode, avoid dedicated agents by provisioning each node over SSH)
- Has a risk of configuration drift (i.e., out-of-sync configuration)
- More challenging to perform a security audit on the configuration
- More active management (e.g., restarting failing agents)
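To make the configuration drift risk concrete, here is a minimal sketch of a drift check. It assumes you can express both the declared configuration and the state observed on a node as flat key/value maps; the setting names and the `find_drift` helper are hypothetical and not part of any particular provider.

```python
from typing import Any, Dict, Tuple

ABSENT = "<absent>"  # marker for a setting that exists on only one side


def find_drift(declared: Dict[str, Any], observed: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return {setting: (declared_value, observed_value)} for every mismatch."""
    drift = {}
    for key in sorted(declared.keys() | observed.keys()):
        want = declared.get(key, ABSENT)
        have = observed.get(key, ABSENT)
        if want != have:
            drift[key] = (want, have)
    return drift


# Hypothetical example: what the configuration scripts say vs. what a node reports.
declared = {"sshd.PermitRootLogin": "no", "sshd.PasswordAuthentication": "no"}
observed = {
    "sshd.PermitRootLogin": "no",
    "sshd.PasswordAuthentication": "yes",  # changed by hand on the node
    "cron.backup": "enabled",              # never recorded in the configuration
}

for setting, (want, have) in find_drift(declared, observed).items():
    print(f"{setting}: declared={want!r} observed={have!r}")
```

With a deployment provider, this class of mismatch is less likely because changes are applied by rebuilding the node from the blueprint rather than by modifying it in place.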
Choosing your IAC Provider from a Security Perspective
From a security perspective, deployment providers with static analysis in the CI/CD pipeline are an excellent way to incorporate security checks in a low-impact manner. Terraform in particular has the most security-related linters available. When combined with Packer and Docker for building images and containers, a deployment provider offers a similar but more easily auditable configuration compared to a configuration provider.
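As a rough illustration of what a static check in the pipeline does, the sketch below scans a Terraform plan for security groups that allow ingress from anywhere. It assumes the plan has been exported as JSON (for example with `terraform show -json`) and loaded into a dictionary; the sample plan and the `find_open_ingress` helper are hypothetical, and real linters cover far more rules than this.

```python
from typing import Any, Dict, List


def find_open_ingress(plan: Dict[str, Any]) -> List[str]:
    """Flag aws_security_group resources whose planned ingress allows 0.0.0.0/0."""
    findings = []
    for change in plan.get("resource_changes", []):
        if change.get("type") != "aws_security_group":
            continue
        # "after" holds the planned attribute values; it is null for deletions.
        after = (change.get("change") or {}).get("after") or {}
        for rule in after.get("ingress") or []:
            if "0.0.0.0/0" in (rule.get("cidr_blocks") or []):
                findings.append(
                    f'{change["address"]}: ports {rule.get("from_port")}-'
                    f'{rule.get("to_port")} open to 0.0.0.0/0'
                )
    return findings


# Hypothetical, heavily trimmed plan document used only for illustration.
sample_plan = {
    "resource_changes": [
        {
            "address": "aws_security_group.admin",
            "type": "aws_security_group",
            "change": {
                "actions": ["create"],
                "after": {
                    "ingress": [
                        {"from_port": 22, "to_port": 22, "cidr_blocks": ["0.0.0.0/0"]}
                    ]
                },
            },
        },
        {
            "address": "aws_security_group.internal",
            "type": "aws_security_group",
            "change": {
                "actions": ["create"],
                "after": {
                    "ingress": [
                        {"from_port": 443, "to_port": 443, "cidr_blocks": ["10.0.0.0/8"]}
                    ]
                },
            },
        },
    ]
}

for finding in find_open_ingress(sample_plan):
    print("FAIL", finding)
```

In a CI/CD job, a non-empty findings list would fail the build before the change is ever deployed.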
There are fewer security linters available for configuration providers and they tend to be more superficial than the ones for deployment providers. Keep in mind that it’s likely that configuration provider scripts will need to rely solely on manual security review.
Whether you should use a single provider or multiple providers depends on your DevOps goals. Although more complex from a security perspective, choosing both a deployment provider and a configuration provider may have advantages. Here are a few resources from a developer perspective that may help you figure out what would be the best fit for your product:
- Why we use Terraform and not Chef, Puppet, Ansible, SaltStack, or CloudFormation
- Ansible and Terraform: Better Together
- The Terrors and Joys of Terraform
Static analyzers for configuration
Other configuration management
- GitHub: List of Static Analysis tools
- AWS Trusted Advisor
- Azure Advisor
- GCP Security Command Center
- CloudSploit [AWS, Azure, GCP, Oracle]
- ScoutSuite [AWS, Azure, GCP, Oracle/Alibaba with limited support]
- Prowler [AWS]
REPLACING CUSTOM BACK-END SERVICES WITH GENERIC CLOUD SERVICES
Now that we’ve explored ways to codify our deployments as IAC, let’s look for opportunities to migrate custom back-end services to generic cloud services that can be expressed in IAC.
Let’s consider the task of building a video transcoding service that allows users to upload, convert, and download media. From a security perspective, designing this as a service running on EC2 or as Lambda functions is littered with pitfalls.
Possible vulnerabilities might include:
- Path traversal via the upload or download of media
- Code execution through the unsafe processing of media (e.g., Ghostscript)
- Command injection to the subprocess invoking the transcoder command-line interface (CLI)
- Unsafely interpreting user media data as code (e.g., a local file inclusion [LFI] vulnerability or placing data in a web root)
- Denial of service (DoS) from a corrupt or oversized media file
- And more…
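To give a sense of how much care the custom design demands, here is a minimal sketch of guarding against just two of the issues above: path traversal and command injection. The upload directory is hypothetical, and `ffmpeg` is only a stand-in for whichever transcoder CLI your service would call.

```python
from pathlib import Path
from typing import List

UPLOAD_ROOT = Path("/srv/media/uploads")  # hypothetical storage location


def resolve_upload_path(user_filename: str, root: Path = UPLOAD_ROOT) -> Path:
    """Resolve a user-supplied name and reject anything that escapes the root."""
    root = root.resolve()
    candidate = (root / user_filename).resolve()
    # After resolution, the root must be an ancestor of the candidate path.
    if root not in candidate.parents:
        raise ValueError(f"rejected path outside upload root: {user_filename!r}")
    return candidate


def build_transcode_argv(src: Path, dst: Path) -> List[str]:
    """Build an argument list for subprocess.run(argv) without involving a shell.

    Each element is passed as a single argument, so shell metacharacters in a
    file name are never interpreted. Both paths are expected to come from
    resolve_upload_path, which makes them absolute.
    """
    return ["ffmpeg", "-nostdin", "-i", str(src), str(dst)]


if __name__ == "__main__":
    print(resolve_upload_path("clip.mp4"))
    try:
        resolve_upload_path("../../etc/passwd")
    except ValueError as err:
        print(err)
```

Even with these checks in place, the media parsing, resource limits, and storage handling still need their own review, which is the motivation for the alternative described next.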
However, instead of investing time and resources into extensive threat modeling and enumerating all the possible corner cases, consider replacing the entire workflow with managed cloud services, such as those from AWS: S3 for file storage, Elastic Transcoder for video transcoding, and Lambda triggers that move data through the pipeline (instead of performing the transcoding themselves).
With this design, your team can focus on the security of S3 bucket uploads and downloads, which is a far less daunting task than the original design. Even better, this service can be codified in Terraform and undergo static analysis during deployment.
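A Lambda trigger in this kind of pipeline can stay very small. The sketch below assumes an S3 upload notification as the event and an Elastic Transcoder pipeline that is already bound to the input and output buckets; the environment variable names, default IDs, and the `transcoded/` output prefix are hypothetical.

```python
import os
from urllib.parse import unquote_plus

# Hypothetical identifiers; in practice these would be set by your IAC.
PIPELINE_ID = os.environ.get("TRANSCODER_PIPELINE_ID", "example-pipeline-id")
PRESET_ID = os.environ.get("TRANSCODER_PRESET_ID", "example-preset-id")


def build_jobs(event):
    """Translate S3 upload notification records into transcoder job requests."""
    jobs = []
    for record in event.get("Records", []):
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        jobs.append(
            {
                "PipelineId": PIPELINE_ID,
                "Input": {"Key": key},
                "Outputs": [{"Key": f"transcoded/{key}", "PresetId": PRESET_ID}],
            }
        )
    return jobs


def handler(event, context, client=None):
    """Lambda entry point: submit jobs and do no media processing itself."""
    if client is None:
        import boto3  # available in the AWS Lambda Python runtime

        client = boto3.client("elastictranscoder")
    jobs = build_jobs(event)
    for job in jobs:
        client.create_job(**job)
    return {"submitted": len(jobs)}
```

Note that the function never opens or parses the uploaded file. It only passes an object key along, which is what keeps media handling out of your own code.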
By identifying opportunities to outsource your attack surface to a cloud provider, you will simplify the ongoing maintenance of your product. Outsourcing to a cloud provider may also come with performance benefits.
These services do come with an additional cost. Be sure to compare it with the cost of the vulnerabilities, remediation, and continued security maintenance of the alternative solutions.
Choosing an Orchestrator
Kubernetes has become the de facto solution for container orchestration. However, it is also the most hands-on choice and can make debugging complex.
Let’s discuss a few common Kubernetes deployment risks, and then we’ll talk about multi-cloud alternatives. After that, we’ll focus on tools that provide automated scanning of your Kubernetes environment and look at how to secure your container images.
Hosted vs. Self-deployed Kubernetes
Similar to our previous example (replacing custom back-end services with generic cloud services), outsourcing your attack surface and complexity to a cloud provider is often the best choice for lower maintenance and more secure designs. Although hosted Kubernetes may provide less customization, that can be a good thing. Deploying your own cluster across EC2 can come with a variety of unexpected risks.
For instance, deploying your own cluster on EC2 with Kubernetes Operations (kops) yields an insecure configuration right out of the box. Namely, it does not restrict pod access to the EC2 metadata URL, which, through a series of actions, allows for cluster takeover. Although these limitations and risks are highlighted in the security documentation, they are easy to overlook.
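One possible mitigation is an egress rule that blocks the metadata address for pods. The sketch below generates such a Kubernetes NetworkPolicy as JSON, which `kubectl apply -f` accepts. It assumes your cluster runs a network plugin that enforces NetworkPolicy objects; the policy name and namespace are hypothetical, and this is not necessarily the remedy your deployment tool's documentation recommends.

```python
import json

METADATA_CIDR = "169.254.169.254/32"  # link-local address of the EC2 metadata service


def deny_metadata_policy(namespace: str) -> dict:
    """Allow all pod egress in a namespace except to the metadata endpoint."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "deny-cloud-metadata", "namespace": namespace},
        "spec": {
            "podSelector": {},  # an empty selector matches every pod in the namespace
            "policyTypes": ["Egress"],
            "egress": [
                {"to": [{"ipBlock": {"cidr": "0.0.0.0/0", "except": [METADATA_CIDR]}}]}
            ],
        },
    }


if __name__ == "__main__":
    print(json.dumps(deny_metadata_policy("default"), indent=2))
```

Because a NetworkPolicy is namespaced, a manifest like this would need to be generated for each namespace that runs workloads.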
When to Choose Something Else
From a security perspective, it’s often advantageous to choose the solution with the most community support; however, it’s also about choosing something that is the best fit for your product.
Depending on your requirements and workloads, solutions with further infrastructure abstractions, such as AWS Fargate, might be a good fit. As described on its site:
“AWS Fargate is a serverless compute engine for containers that works with both Amazon Elastic Container Service (ECS) and Amazon Elastic Kubernetes Service (EKS). Fargate makes it easy for you to focus on building your applications. Fargate removes the need to provision and manage servers, lets you specify and pay for resources per application, and improves security through application isolation by design.”
When evaluating options, prefer abstractions and put the burden of the attack surface on your provider where possible. Abstractions often have highly auditable configurations, which makes it easier to maintain security at scale.
Other popular alternatives to Kubernetes include:
- HashiCorp’s Nomad for a multi-cloud, hands-off solution to orchestration.
- Mesos with Marathon for a mixed containerized and non-containerized deployment.
Scanning Kubernetes Deployments
Whatever method you choose for deployment, take advantage of publicly available tools to identify misconfigurations in your cluster. Although these tools will not be comprehensive, they will help you identify the most common and severe misconfigurations.
Also take advantage of any available security analyzers, and automate routine scanning of your cluster to identify regressions.
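Identifying regressions comes down to comparing each scan against an accepted baseline. Here is a minimal sketch, assuming each scanner finding can be reduced to a (resource, check ID) pair; the identifiers shown are hypothetical and do not come from any particular scanner.

```python
from typing import Iterable, List, Tuple

Finding = Tuple[str, str]  # (resource, check_id)


def new_findings(baseline: Iterable[Finding], current: Iterable[Finding]) -> List[Finding]:
    """Return findings present in the current scan but not in the baseline."""
    return sorted(set(current) - set(baseline))


# Hypothetical results from last week's scan and today's scan.
baseline = [("pod/web", "runs-as-root"), ("ns/default", "no-network-policy")]
current = [
    ("pod/web", "runs-as-root"),
    ("pod/worker", "privileged-container"),  # regression introduced since baseline
]

for resource, check in new_findings(baseline, current):
    print(f"REGRESSION {resource}: {check}")
```

A scheduled job could run the scanner, apply a comparison like this, and alert only on new entries so that known, accepted findings do not drown out regressions.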
Here are some popular open-source options:
Finally, aside from orchestrator security, be sure to review your container configurations. Common risks include outdated software, backdoors or crypto miners, and misconfigurations affecting container isolation.
For more on container security, check out the Docker Security Cheat Sheet guide from OWASP.
CLOUD AND CLUSTER VISUALIZATION
The more ways you can examine a design, the easier it is to spot problems. As a result, an important part of any deployment is visualization. A visual representation can not only help you spot misconfigurations but also provide helpful documentation for new employees.
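Even without a dedicated product, a simple diagram can be generated from data you already have. The sketch below emits Graphviz DOT text from a map of which component talks to which; the component names are hypothetical, and rendering the output requires Graphviz or any other DOT viewer.

```python
from typing import Dict, List


def to_dot(connections: Dict[str, List[str]]) -> str:
    """Render a {source: [destinations]} map as a Graphviz digraph."""
    lines = ["digraph deployment {"]
    for src in sorted(connections):
        for dst in sorted(connections[src]):
            lines.append(f'  "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical deployment: which component initiates connections to which.
connections = {
    "load-balancer": ["web"],
    "web": ["api"],
    "api": ["database", "secrets-manager"],
}

print(to_dot(connections))
```

An unexpected arrow in the rendered graph, such as a public-facing component connecting directly to a database, is often easier to notice than the same fact buried in configuration files.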
Here are a variety of self-hosted and commercial services:
SECRETS MANAGEMENT
When migrating to cloud-hosted secrets management, it’s important to first identify all the existing secrets in your source code. Leaks can occur in forgotten reverted commits in Git history, or when employees accidentally check company code into their personal GitHub accounts.
Let’s look at some methods to find secrets, and then ways to manage them in your deployment.
Finding Existing Secrets in Code
If you use the self-hosted or online version of GitHub, you can use the Bishop Fox–developed tool GitGot to search for secrets across your org or all public repos.
Additionally, to search through the commit history of a given set of repos, you can use TruffleHog to identify the commits that introduce high-entropy strings or strings matching predefined regex patterns.
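To show the idea behind high-entropy detection, here is a minimal sketch that scores candidate tokens by Shannon entropy. It is not TruffleHog's implementation; the token pattern, the 4.0 threshold, and the sample strings (including the fake key) are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter
from typing import List

# Candidate tokens: long runs of characters commonly found in keys and tokens.
TOKEN_RE = re.compile(r"[A-Za-z0-9+/=_\-]{20,}")


def shannon_entropy(value: str) -> float:
    """Entropy in bits per character; repetitive strings score low."""
    if not value:
        return 0.0
    total = len(value)
    counts = Counter(value)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def find_candidates(text: str, threshold: float = 4.0) -> List[str]:
    """Return tokens whose entropy meets the (assumed) threshold."""
    return [tok for tok in TOKEN_RE.findall(text) if shannon_entropy(tok) >= threshold]


# Fake content resembling a diff from an old commit.
sample = (
    'api_key = "q8Zx2LmP0vRt7KbY4nWc9HsD1fGj6AeU"\n'
    'padding = "aaaaaaaaaaaaaaaaaaaaaaaa"\n'
)

for token in find_candidates(sample):
    print("possible secret:", token)
```

Entropy alone produces false positives (hashes, encoded assets) and misses low-entropy passwords, which is why it is typically paired with regex patterns for known credential formats.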
Finally, remember to revoke and rotate all identified credentials before migrating to a secrets management solution.
Choosing a Secrets Manager Provider
Although you could use AWS Key Management Service (KMS) directly or other solutions provided by cloud platforms, using HashiCorp’s Vault or Mozilla SOPS as an abstraction layer may simplify integration and ease any future migration to another provider:
- HashiCorp Vault (free, or paid for additional enterprise features)
- Mozilla SOPS (free)
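The value of an abstraction layer is that application code depends on a small interface rather than on one provider. The sketch below illustrates that idea only; the `SecretStore` interface, the environment-backed implementation, and the connection string are hypothetical, and no Vault or SOPS API is shown.

```python
import os
from typing import Mapping, Optional, Protocol


class SecretStore(Protocol):
    """The only surface the application is allowed to depend on."""

    def get(self, name: str) -> str: ...


class EnvSecretStore:
    """Simple backend that reads secrets from a mapping (the process environment by default)."""

    def __init__(self, environ: Optional[Mapping[str, str]] = None) -> None:
        self._environ = os.environ if environ is None else environ

    def get(self, name: str) -> str:
        try:
            return self._environ[name]
        except KeyError:
            raise KeyError(f"secret {name!r} is not configured") from None


def database_url(store: SecretStore) -> str:
    """Application code: unaware of where the secret actually lives."""
    return f"postgres://app:{store.get('DB_PASSWORD')}@db.internal/app"


if __name__ == "__main__":
    store = EnvSecretStore({"DB_PASSWORD": "example-only"})
    print(database_url(store))
```

Moving to a different provider then means writing one new class with the same `get` method, rather than touching every place a secret is read.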
Hopefully, these ideas will help you in the design, maintenance, and deployment of your product. Take advantage of the numerous configuration scanners for each component of your cloud deployment and integrate them into your CI/CD pipelines where possible. Try to simplify your designs and use existing cloud services as replacement drop-in components when available.
Finally, given all the complexity of a cloud deployment, feel free to reach out to Bishop Fox and schedule a cloud security review if you are ready to put your configuration to the test.