Stay Secure on AWS: A Guide to Detecting and Remedying Security Misconfigurations

Amarpreet Singh
Aug 7, 2023

Photo by Scott Webb on Unsplash

With the ease of cloud computing, companies are moving their infrastructure and applications to the cloud. This shift brings a responsibility that is shared between the cloud service provider and the customer. Generally, cloud service providers are responsible for the security “of” the cloud, including measures such as physical security for data centers and network segregation. Customers are responsible for the security “in” the cloud, which encompasses tasks such as properly configuring security settings, securing access to resources with access policies, and implementing robust data protection measures such as encrypting data at rest and in transit. The shared responsibility model gives customers a balance of security and control, allowing them to adopt the cloud confidently while maintaining high levels of security for their operations.

Misconfigurations are the biggest cause of data breaches in the cloud, exposing more than 33 billion records and costing companies close to $5 trillion in 2018 and 2019

DivvyCloud

In recent years, high-profile security breaches have caused significant financial losses for companies, highlighting the importance of a robust security posture. Examples include the exposure of sensitive data on 150,000 patients due to a misconfigured AWS S3 bucket and the Capital One data breach caused by a misconfigured web application firewall (WAF), among many others. These incidents show that even mature companies can fall victim to misconfigurations, and they underscore the need to configure security settings correctly and to continuously monitor cloud infrastructure for any security misconfigurations.

Gartner predicts that, through 2025, 99% of cloud security failures will be the customer’s fault.

Implementing automatic detection and remediation of security misconfigurations on the cloud while ensuring compliance with security standards is crucial for the widespread adoption of the cloud. In an ideal scenario, you would be notified immediately when a security misconfiguration occurs, and it would be resolved automatically with an action that matches the type of misconfiguration.

AWS Native Detect & Remediate

Let’s look at how we can create an AWS native security solution that detects and remediates security misconfigurations in real time and sends alerts to the respective users.

Solution Architecture of Detect & Remediate

The solution architecture to detect & remediate security misconfigurations encompasses four primary workflows: data ingestion, detection, remediation, and logging.

Data Ingestion: This phase involves collecting data from a variety of AWS services, including Amazon GuardDuty, Amazon Inspector, Amazon Macie, and more.

AWS Config: A pivotal tool in this architecture, AWS Config continuously monitors your AWS resource configurations. It alerts you when there’s a deviation from the defined AWS Config Rules, which represent your ideal configurations. AWS Config offers both predefined managed rules for common security checks and the flexibility to create custom rules using Lambda functions. For instance, you can craft rules to check for specific resource tags or ensure an S3 bucket uses a Customer Managed Key (CMK) instead of the default AWS-managed KMS key.
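A custom rule Lambda reports its verdict back to AWS Config through the PutEvaluations API. The sketch below shows one way to shape that evaluation from the event Config passes to the function. The helper name build_evaluation is an assumption for illustration and is not part of the original solution.

```python
import json


def build_evaluation(event, compliance_type, annotation):
    """Shape one evaluation for config.put_evaluations from a custom-rule event.

    build_evaluation is a hypothetical helper used only for illustration.
    """
    # AWS Config delivers the invoking event to the Lambda as a JSON string.
    invoking_event = json.loads(event["invokingEvent"])
    item = invoking_event["configurationItem"]
    return {
        "ComplianceResourceType": item["resourceType"],
        "ComplianceResourceId": item["resourceId"],
        "ComplianceType": compliance_type,
        # Config limits annotations to 256 characters.
        "Annotation": annotation[:256],
        "OrderingTimestamp": item["configurationItemCaptureTime"],
    }
```

In a Lambda handler, the result would typically be submitted with config_client.put_evaluations(Evaluations=[evaluation], ResultToken=event["resultToken"]), where config_client is a boto3 Config client.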

Detection Workflow: When AWS Config identifies non-compliant resources, it triggers a Lambda function via an EventBridge rule. This function then forwards the findings to AWS Security Hub, which offers a consolidated view of the AWS security posture. AWS Security Hub evaluates these findings against industry standards and best practices. It aggregates data not just from AWS Config but also from other services like GuardDuty and Firewall Manager, and even from third-party services such as Prisma Cloud and Dome9.
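To make the forwarding step concrete, here is a minimal sketch of how the detection Lambda could translate a Config compliance change event into a finding in the AWS Security Finding Format (ASFF). The function name build_finding, the severity choice, and the title wording are assumptions for illustration. The event field names follow the "Config Rules Compliance Change" event as I understand it, so verify them against a real event before relying on this.

```python
from datetime import datetime, timezone

# Config compliance values mapped to ASFF Compliance.Status values.
STATUS_MAP = {"COMPLIANT": "PASSED", "NON_COMPLIANT": "FAILED"}


def build_finding(event, now=None):
    """Build an ASFF finding from a Config compliance change event (illustrative)."""
    detail = event["detail"]
    account = detail["awsAccountId"]
    region = detail["awsRegion"]
    rule = detail["configRuleName"]
    resource_id = detail["resourceId"]
    compliance = detail["newEvaluationResult"]["complianceType"]
    timestamp = (now or datetime.now(timezone.utc)).strftime("%Y-%m-%dT%H:%M:%SZ")
    is_bucket = detail["resourceType"] == "AWS::S3::Bucket"
    return {
        "SchemaVersion": "2018-10-08",
        "Id": f"{rule}/{resource_id}",
        # Custom findings are imported under the account's default product ARN.
        "ProductArn": f"arn:aws:securityhub:{region}:{account}:product/{account}/default",
        "GeneratorId": rule,
        "AwsAccountId": account,
        "Types": ["Software and Configuration Checks"],
        "CreatedAt": timestamp,
        "UpdatedAt": timestamp,
        # Severity is an assumption here; choose labels that fit your policy.
        "Severity": {"Label": "HIGH" if compliance == "NON_COMPLIANT" else "INFORMATIONAL"},
        "Title": f"Config rule {rule}: {compliance}",
        "Description": f"Resource {resource_id} was evaluated as {compliance}.",
        "Resources": [{
            "Type": "AwsS3Bucket" if is_bucket else "Other",
            "Id": f"arn:aws:s3:::{resource_id}" if is_bucket else resource_id,
            "Region": region,
        }],
        "Compliance": {"Status": STATUS_MAP.get(compliance, "NOT_AVAILABLE")},
    }
```

The finding would then be sent with securityhub_client.batch_import_findings(Findings=[finding]), where securityhub_client is a boto3 Security Hub client.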

Remediation Workflow: If AWS Security Hub determines a compliance status of FAILED, it triggers another EventBridge rule, which in turn activates a Lambda function. This function carries out a tailored remediation action based on the specific security concern. Remediation can also leverage AWS Systems Manager Automation for certain tasks. Once the Lambda function completes its task, it logs the details to CloudWatch and sends notifications via an SNS topic, to which users can subscribe.
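The remediation flow described above can be sketched as a small dispatcher: pick the remediation that matches each FAILED finding, run it, then log and notify. The names remediate_findings, remediators, and publish are assumptions for illustration. In a real Lambda, publish would wrap an SNS publish call, and anything printed ends up in CloudWatch Logs.

```python
def remediate_findings(event, remediators, publish, log=print):
    """Run the matching remediation for each FAILED finding and report the outcome.

    remediators maps a finding's GeneratorId to a callable that takes a resource ID.
    publish is any callable that delivers a message, for example an SNS wrapper.
    """
    outcomes = []
    for finding in event["detail"]["findings"]:
        if finding.get("Compliance", {}).get("Status") != "FAILED":
            continue
        generator = finding.get("GeneratorId")
        resource_ids = [resource["Id"] for resource in finding.get("Resources", [])]
        action = remediators.get(generator)
        if action is None:
            message = f"No remediation registered for {generator}: {resource_ids}"
        else:
            try:
                for resource_id in resource_ids:
                    action(resource_id)
                message = f"Remediated {resource_ids} for {generator}"
            except Exception as error:  # report the failure instead of hiding it
                message = f"Remediation failed for {generator}: {error}"
        log(message)      # stdout in Lambda is captured by CloudWatch Logs
        publish(message)  # notify subscribers
        outcomes.append(message)
    return outcomes
```

Keeping the dispatch table separate from the handler means a new misconfiguration type only needs a new entry in remediators.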

Use Cases for the Detect & Remediate Architecture:

Scenario 1: Unrestricted Security Groups

  • Detection: AWS Config identifies a security group that allows inbound traffic from 0.0.0.0/0 on port 22 (SSH).
  • Remediation: The Lambda function modifies the security group to restrict SSH access only to known IP addresses or removes the rule entirely.
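As a rough sketch of the "removes the rule entirely" option, the function below inspects a security group as returned by EC2's describe_security_groups and collects the ingress rules that expose SSH to the whole internet. The function name open_ssh_permissions is an assumption for illustration.

```python
def open_ssh_permissions(security_group, port=22):
    """Return the ingress permissions that open `port` to 0.0.0.0/0 or ::/0.

    The result uses the IpPermissions shape, so it can be passed to
    ec2_client.revoke_security_group_ingress(GroupId=..., IpPermissions=...).
    """
    offending = []
    for permission in security_group.get("IpPermissions", []):
        protocol = permission.get("IpProtocol")
        if protocol == "-1":
            covers_port = True  # "-1" means all protocols and all ports
        elif protocol == "tcp":
            covers_port = permission.get("FromPort", 0) <= port <= permission.get("ToPort", 65535)
        else:
            covers_port = False
        if not covers_port:
            continue
        open_v4 = [r["CidrIp"] for r in permission.get("IpRanges", []) if r.get("CidrIp") == "0.0.0.0/0"]
        open_v6 = [r["CidrIpv6"] for r in permission.get("Ipv6Ranges", []) if r.get("CidrIpv6") == "::/0"]
        if not (open_v4 or open_v6):
            continue
        # Keep only the world-open ranges so known IP addresses stay allowed.
        rule = {"IpProtocol": protocol}
        for key in ("FromPort", "ToPort"):
            if key in permission:
                rule[key] = permission[key]
        if open_v4:
            rule["IpRanges"] = [{"CidrIp": cidr} for cidr in open_v4]
        if open_v6:
            rule["Ipv6Ranges"] = [{"CidrIpv6": cidr} for cidr in open_v6]
        offending.append(rule)
    return offending
```

Note that a rule covering a port range that includes 22 is revoked for its whole range, because a revoke request has to match the existing rule's ports.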

Scenario 2: Unencrypted RDS Instances

  • Detection: AWS Config finds an RDS instance that doesn’t have encryption enabled.
  • Remediation: The Lambda function triggers a snapshot of the RDS instance, creates an encrypted copy, and then replaces the original with the encrypted version.
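The snapshot, encrypted copy, and restore steps above map onto a handful of RDS API calls. The following is a minimal sketch, assuming a boto3 RDS client is passed in as rds_client. The function name and the "-plain" and "-encrypted" identifier suffixes are assumptions for illustration, and the final cutover (renaming or repointing applications to the new instance) is deliberately left out.

```python
def encrypt_rds_instance(rds_client, instance_id, kms_key_id):
    """Create an encrypted copy of an unencrypted RDS instance (illustrative sketch)."""
    plain_snapshot = f"{instance_id}-plain"
    encrypted_snapshot = f"{instance_id}-encrypted"
    new_instance = f"{instance_id}-encrypted"
    snapshot_ready = rds_client.get_waiter("db_snapshot_available")

    # 1. Snapshot the unencrypted instance and wait until the snapshot is usable.
    rds_client.create_db_snapshot(
        DBSnapshotIdentifier=plain_snapshot,
        DBInstanceIdentifier=instance_id,
    )
    snapshot_ready.wait(DBSnapshotIdentifier=plain_snapshot)

    # 2. Copy the snapshot; supplying a KMS key makes the copy encrypted.
    rds_client.copy_db_snapshot(
        SourceDBSnapshotIdentifier=plain_snapshot,
        TargetDBSnapshotIdentifier=encrypted_snapshot,
        KmsKeyId=kms_key_id,
    )
    snapshot_ready.wait(DBSnapshotIdentifier=encrypted_snapshot)

    # 3. Restore a new, encrypted instance from the encrypted snapshot.
    rds_client.restore_db_instance_from_db_snapshot(
        DBInstanceIdentifier=new_instance,
        DBSnapshotIdentifier=encrypted_snapshot,
    )
    rds_client.get_waiter("db_instance_available").wait(DBInstanceIdentifier=new_instance)
    return new_instance
```

For large databases these waits can exceed Lambda's 15-minute maximum timeout, so a step-based orchestration such as Systems Manager Automation may be a better fit than a single function invocation.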

Scenario 3: Untagged EC2 Instances

  • Detection: AWS Config identifies an EC2 instance that is missing a tag required by your organization.
  • Remediation: The Lambda function sends an alert to the instance creator or the operations team to add the missing tags.
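The tag check itself is small. A minimal sketch, assuming tags arrive in the Key/Value list form that EC2 returns and that a tag with an empty value counts as missing (that second rule is my assumption, not something the original states):

```python
def missing_required_tags(tags, required_keys):
    """Return the required tag keys that are absent or empty, sorted by name."""
    present = {tag["Key"] for tag in (tags or []) if tag.get("Value", "").strip()}
    return sorted(set(required_keys) - present)
```

The returned list can be dropped straight into the alert text so the instance owner sees exactly which tags to add.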

Scenario 4: Unused Elastic Load Balancers (ELBs)

  • Detection: AWS Config spots an ELB with no registered instances for over 30 days.
  • Remediation: The Lambda function sends an alert to the respective team, suggesting a review and potential deletion of the unused ELB to save costs.

Scenario 5: S3 Buckets not Encrypted with Customer Managed Keys (CMK)

  • Detection: AWS Config identifies an S3 bucket where the encryption type is set to AWS-KMS, but the associated key ID matches one of the default AWS managed KMS key IDs, indicating it’s not using a CMK.
  • Remediation: The Lambda function triggered by AWS Config sends a notification about the deviation from the organization’s policy. The alert includes details of the S3 bucket and the AWS managed KMS key in use, and the function initiates a process to re-encrypt the bucket with the appropriate CMK.

These examples merely scratch the surface. The detect & remediate approach can be tailored to a myriad of scenarios, ensuring robust cloud security.

Detect & Remediate in action

In this section, we’ll dissect the code that powers our detection and remediation process, focusing on the scenario where S3 Buckets are not encrypted with Customer Managed Keys (CMK).

Infrastructure Setup

Our main class, S3CMKStack, extends the cdk.Stack class, which represents a CloudFormation stack:

export class S3CMKStack extends cdk.Stack {
  constructor(scope: cdk.Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

Setting up notifications

const nonCompliantNotificationTopic = new sns.Topic(this, 'NonCompliantNotificationTopic', {
  displayName: 'Non-compliant S3 Bucket detected',
});
nonCompliantNotificationTopic.addSubscription(new subscriptions.EmailSubscription('YOUR-EMAIL@EMAIL.com'));

Lambda functions for detection & remediation

const detectFunction = new lambda.Function(this, 'DetectFunction', {
  // ... properties ...
});

const remediateFunction = new lambda.Function(this, 'RemediateFunction', {
  // ... properties ...
});

Defining Config Rule

const s3EncryptionRule = new config.CfnConfigRule(this, 'S3EncryptionRule', {
  // ... properties ...
});

Setup EventBridge Rule

// EventBridge rule to trigger Detect Lambda when Config rule compliance changes
const eventRule = new events.Rule(this, 'ConfigRuleChange', {
  // ... properties ...
});

// The detection function forwards the Config result to Security Hub
eventRule.addTarget(new targets.LambdaFunction(detectFunction));

// EventBridge rule to trigger Remediation Lambda when Security Hub determines a compliance status of "FAILED"
const securityHubFailedRule = new events.Rule(this, 'SecurityHubFailedRule', {
  eventPattern: {
    source: ['aws.securityhub'],
    detailType: ['Security Hub Findings - Imported'],
    detail: {
      findings: {
        Compliance: {
          Status: ['FAILED'],
        },
      },
    },
  },
});

// Without a target, the rule would match events but invoke nothing
securityHubFailedRule.addTarget(new targets.LambdaFunction(remediateFunction));
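On the receiving side, the remediation Lambda has to work out which buckets a matched event refers to. The sketch below assumes the detection function recorded each bucket as an AwsS3Bucket resource whose Id is the bucket ARN. Both that convention and the function name failed_bucket_names are assumptions for illustration.

```python
S3_ARN_PREFIX = "arn:aws:s3:::"


def failed_bucket_names(event):
    """Extract bucket names from FAILED findings in a Security Hub event."""
    names = []
    for finding in event.get("detail", {}).get("findings", []):
        # The EventBridge rule matches if any finding failed, so filter again here.
        if finding.get("Compliance", {}).get("Status") != "FAILED":
            continue
        for resource in finding.get("Resources", []):
            resource_id = resource.get("Id", "")
            if resource.get("Type") == "AwsS3Bucket" and resource_id.startswith(S3_ARN_PREFIX):
                name = resource_id[len(S3_ARN_PREFIX):]
                if name not in names:  # avoid remediating the same bucket twice
                    names.append(name)
    return names
```

Each returned name can then be passed to the remediation routine for that bucket.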

Detection

The detection mechanism is an AWS Lambda function whose core logic lives in the evaluate_bucket_encryption function. This function checks whether an S3 bucket is encrypted with an AWS managed key or a Customer Managed Key (CMK). It then records the result in AWS Config as COMPLIANT or NON_COMPLIANT and sends the finding to Security Hub.

def evaluate_bucket_encryption(bucket_name):
    """Evaluate the encryption status of an S3 bucket."""
    try:
        encryption_response = s3_client.get_bucket_encryption(Bucket=bucket_name)
        # KMSMasterKeyID is absent for SSE-S3 (AES256); that KeyError ends up in the generic handler below
        master_key_id = encryption_response["ServerSideEncryptionConfiguration"]["Rules"][0]["ApplyServerSideEncryptionByDefault"]["KMSMasterKeyID"]
        # The bucket may store a key ARN, while aliases report the bare key ID
        key_id = master_key_id.split("/")[-1]

        # AWS managed keys are the targets of aliases that start with "alias/aws/"
        aliases = kms_client.list_aliases()["Aliases"]
        for alias in aliases:
            if alias.get("TargetKeyId") == key_id and alias["AliasName"].startswith("alias/aws/"):
                message = f"Bucket {bucket_name} is encrypted with AWS-managed key."
                logger.info(message)
                notify_non_compliance(message)
                return "NON_COMPLIANT", message
        # No AWS managed alias points at this key, so treat it as a Customer Managed Key
        return "COMPLIANT", "S3 bucket encrypted with Customer Managed Key"
    except s3_client.exceptions.NoSuchBucket:
        message = f"Bucket {bucket_name} does not exist."
        logger.warning(message)
        return 'NOT_APPLICABLE', message
    except s3_client.exceptions.ClientError as e:
        if e.response['Error']['Code'] == "AccessDenied":
            message = f"Access denied for bucket {bucket_name}."
            logger.warning(message)
            return 'NOT_APPLICABLE', message
        else:
            message = f"S3 bucket {bucket_name} does NOT have encryption enabled."
            notify_non_compliance(message)
            return "NON_COMPLIANT", message
    except Exception as e:
        message = f"Unhandled exception for bucket {bucket_name}: {e}"
        logger.error(message)
        return 'NOT_APPLICABLE', message
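As an alternative to scanning aliases, KMS reports who manages a key directly: describe_key returns a KeyManager field that is either "AWS" or "CUSTOMER". The sketch below uses that field instead of the alias lookup shown above, so it is a different technique from the author's. The function name classify_bucket_encryption is an assumption, and describe_key is passed in (for example kms_client.describe_key) to keep the logic easy to test.

```python
def classify_bucket_encryption(encryption_config, describe_key):
    """Classify a bucket's default encryption as COMPLIANT or NON_COMPLIANT.

    encryption_config is the "ServerSideEncryptionConfiguration" value returned
    by s3_client.get_bucket_encryption. describe_key is a callable with the same
    keyword interface as kms_client.describe_key.
    """
    rule = encryption_config["Rules"][0]["ApplyServerSideEncryptionByDefault"]
    if not rule["SSEAlgorithm"].startswith("aws:kms"):
        return "NON_COMPLIANT", "Bucket uses SSE-S3 (AES256), not a KMS key."

    key_id = rule.get("KMSMasterKeyID")
    if not key_id:
        # With aws:kms and no key ID, S3 uses the AWS managed key aws/s3.
        return "NON_COMPLIANT", "No KMS key specified, so the AWS managed key is used."

    # describe_key accepts a key ID, key ARN, alias name, or alias ARN.
    manager = describe_key(KeyId=key_id)["KeyMetadata"]["KeyManager"]
    if manager == "CUSTOMER":
        return "COMPLIANT", "Bucket is encrypted with a Customer Managed Key."
    return "NON_COMPLIANT", "Bucket is encrypted with an AWS managed key."
```

This avoids paginating through list_aliases and does not depend on whether the bucket stores a bare key ID or a full ARN.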

Remediation

Once a non-compliant bucket is detected, another Lambda function is triggered to remediate it by encrypting the bucket with a CMK. Here’s how it works:

def encrypt_bucket_with_CMK(bucket_name):
    """Encrypt the S3 bucket with a Customer Managed Key."""
    s3_client = boto3.client('s3')
    kms_client = boto3.client('kms')

    key_id = get_or_create_key()

    # Apply the CMK to the S3 bucket
    s3_client.put_bucket_encryption(
        Bucket=bucket_name,
        ServerSideEncryptionConfiguration={
            'Rules': [{
                'ApplyServerSideEncryptionByDefault': {
                    'SSEAlgorithm': 'aws:kms',
                    'KMSMasterKeyID': key_id
                }
            }]
        }
    )
    logger.info(f"Bucket {bucket_name} encrypted with CMK {key_id}")
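The get_or_create_key helper is not shown in the article. One plausible shape, offered only as a hypothetical sketch, is to look the key up by a well-known alias and create it when the alias does not exist yet. The alias name and description below are made up, and unlike the call above this version takes the KMS client as a parameter.

```python
def get_or_create_key(kms_client, alias_name="alias/s3-remediation-cmk"):
    """Return the ARN of the CMK behind alias_name, creating key and alias if needed.

    Hypothetical sketch: the alias name and description are illustrative only.
    """
    try:
        # describe_key accepts an alias name in place of a key ID.
        return kms_client.describe_key(KeyId=alias_name)["KeyMetadata"]["Arn"]
    except kms_client.exceptions.NotFoundException:
        metadata = kms_client.create_key(
            Description="CMK for remediated S3 buckets"
        )["KeyMetadata"]
        kms_client.create_alias(AliasName=alias_name, TargetKeyId=metadata["KeyId"])
        return metadata["Arn"]
```

Keep in mind that changing a bucket's default encryption only affects objects written afterwards. Objects that already exist keep their previous encryption until they are copied or rewritten.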

Notification

To keep stakeholders informed, we’ve also integrated an SNS topic. Whenever a non-compliant bucket is detected, a notification is sent to the subscribed email.
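The detection code calls notify_non_compliance(message) without showing its body. A minimal sketch of what it might look like follows, assuming a boto3 SNS client and the topic ARN are supplied by the caller (for example through an environment variable set by the CDK stack). The factory name make_notifier and the default subject are assumptions for illustration.

```python
def make_notifier(sns_client, topic_arn, subject="Non-compliant S3 bucket detected"):
    """Return a notify_non_compliance(message) function bound to one SNS topic."""

    def notify_non_compliance(message):
        response = sns_client.publish(
            TopicArn=topic_arn,
            Subject=subject[:100],  # SNS subjects are limited to 100 characters
            Message=message,
        )
        return response["MessageId"]

    return notify_non_compliance
```

Binding the client and topic once keeps the call sites in the detection logic down to a single argument.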

This was just one use case. Because the solution relies on custom Lambda functions, you can implement whatever remediation your organization needs.

Testing

After deploying the stack, create an S3 bucket that is deliberately non-compliant by encrypting it with an AWS managed key. AWS Config will mark this bucket as NON_COMPLIANT. The detection code also sends the finding to Security Hub, where you can see the flagged item with a severity level such as CRITICAL, HIGH, or MEDIUM depending on the context. When Security Hub registers a FAILED compliance status for the finding, EventBridge triggers the remediation Lambda, which corrects the encryption configuration of the bucket. Together, these steps identify and quickly resolve the security misconfiguration.

Comparing AWS Native Detect & Remediate with Third-Party CSPM Tools

While AWS provides a robust set of tools for cloud security, many third-party Cloud Security Posture Management (CSPM) tools have emerged in the market. These tools, like Prisma Cloud, Dome9, and Check Point CloudGuard, offer their own sets of features and advantages. Let’s compare AWS’s native solutions with these third-party tools:

  1. Integration & Compatibility: AWS’s native tools are deeply integrated with its ecosystem, ensuring seamless compatibility. Third-party tools, on the other hand, often provide broader support across multiple cloud platforms, making them suitable for multi-cloud environments.
  2. Customization: AWS offers a high degree of customization, especially with AWS Config, where you can define custom rules using Lambda functions. Third-party tools might offer more user-friendly interfaces and pre-defined templates for common use cases.
  3. Cost: Using AWS’s native tools might be more cost-effective for organizations already heavily invested in the AWS ecosystem. However, third-party tools might provide more features in their pricing tiers, which could be beneficial for comprehensive security needs.
  4. Alerting & Reporting: Both AWS and third-party tools provide alerting mechanisms. However, third-party tools might offer more advanced reporting dashboards, integrating findings from multiple cloud platforms.
  5. Ease of Use: While AWS provides a comprehensive set of tools, there might be a steeper learning curve involved. Third-party tools often prioritize user experience, offering intuitive interfaces and guided workflows.

Conclusion

As the adoption of cloud services continues to grow, so does the importance of maintaining a robust security posture within these environments. AWS provides a comprehensive set of tools to help organizations detect and remediate security misconfigurations. However, as we’ve seen, the responsibility of security “in” the cloud lies with the customer. By leveraging AWS’s native tools, organizations can proactively address potential security threats, ensuring their data remains secure and compliant.

While third-party CSPM tools offer their own sets of advantages, especially for multi-cloud environments, AWS’s native solutions provide deep integration and customization capabilities. The choice between them will largely depend on an organization’s specific needs, infrastructure, and the complexity of its cloud deployments.

In this guide, we’ve walked through a practical example of how to set up a detect and remediate solution using AWS. By understanding and implementing such mechanisms, organizations can significantly reduce the risk of data breaches due to misconfigurations and ensure they are making the most of their cloud investments securely.

Remember, in the world of cloud security, being proactive rather than reactive can make all the difference. Stay informed, stay updated, and most importantly, stay secure.

Full source code can be accessed here!

Amarpreet Singh

I'm a Solution Architect and engineering leader based in San Francisco, passionate about exploring new technologies and tackling interesting challenges.