Cloud security anti-pattern: Log storm - part 1

This post presents a problematic pattern that I encountered in the infrastructure of one of my clients. I am calling this pattern "log storm", simply because I haven't come across a more appropriate name. In practice, this architectural anti-pattern creates an unnecessary number of log entries in a security system, which in turn incurs unnecessary usage costs. It can probably be found on various platforms and in various setups, but in this post I focus on an AWS-based setup.

Context

The two important components of the setup that I encountered are a multi-account cloud environment and enabled CloudTrail logging. Both are very common, even in small companies and startups.

Multi-account environment

A lot of businesses, even at an early stage, find it very useful to segregate various classes of their assets. Putting assets into different accounts is one way to do that. The segregation might be introduced between departments (e.g., marketing resources vs. product resources) or between functions (e.g., application environment vs. shared services, or production vs. staging environments). The goal is to increase the overall security of the organization by limiting the impact of a potential data breach.

CloudTrail logging

CloudTrail is one of the most important security services available in AWS. It lets you log every interaction with your management plane. It provides forensic material for future investigations and allows your organization to enforce the accountability of its employees and contractors.

These two components help bring some risks down to acceptable levels, and they are often regulatory and/or contractual requirements, especially in high-impact businesses such as finance or health.

Anti-pattern description

A good Information Security Management System (ISMS) is unfortunately a complex structure. It has many moving parts of its own, and naturally it also needs to accommodate every change in the business processes. After all, it plays a supporting and enabling role for the core business of an organization, yet it is very sensitive to the specifics and details of that business.

For a person governing the system, a natural response to this complexity is an attempt at simplification, which, among other things, means centralization. It's convenient to centralize identity management, asset management, access management, risk management, etc. It's only right to work on centralizing our log management as well.

Centralized log management offers a wide range of benefits: easier retention management, faster investigation and incident response, the ability to cross-analyze forensic data, enforceable log protections, and so on.

In a multi-account environment based on AWS, it makes sense to create a separate security account that holds the logs delivered from the other environments. For example, development and production applications can log events to local CloudWatch log groups with a reduced retention time, and these entries can in turn be transported to the central log-archive account for further processing.

It's tempting to use Lambda functions for the log transformation and routing. After all, a lot of people are familiar with serverless code, and they will almost certainly feel comfortable with one of the several programming languages available. On top of that, CloudWatch Logs provides subscription filters that can deliver log events directly to a Lambda function.

What could be easier? We subscribe the log group to a Lambda function, and the function transforms each log entry and writes it to the centralized log archive.
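To make this setup concrete, here is a minimal sketch of such a function in Python. It relies on the documented delivery format of CloudWatch Logs subscriptions (a base64-encoded, gzip-compressed JSON document under `awslogs.data`). The shape of the output records and the function names are my own assumptions, and the actual write to the archive is left out because the destination depends on your setup.

```python
import base64
import gzip
import json


def decode_subscription_event(event):
    """Unpack the base64-encoded, gzip-compressed payload sent by CloudWatch Logs."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    return json.loads(gzip.decompress(compressed))


def transform(payload):
    """Flatten one delivery into archive records (hypothetical record shape)."""
    return [
        {
            "account": payload["owner"],
            "log_group": payload["logGroup"],
            "log_stream": payload["logStream"],
            "timestamp": entry["timestamp"],
            "message": entry["message"],
        }
        for entry in payload["logEvents"]
    ]


def handler(event, context):
    """Lambda entry point: decode the delivery and transform its log events."""
    records = transform(decode_subscription_event(event))
    # A real function would write `records` to the central archive here.
    # That step is omitted because the destination is not specified.
    return records
```

Note that every delivery from the subscription results in one invocation of `handler`, which is exactly the detail that matters for the rest of this post.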

The problem with this solution arises when CloudTrail is also configured to record Lambda invocations (data events for Lambda functions).

In this solution, we are creating a feedback loop. The monitored application or service creates a log entry, the entry is transformed by a Lambda function, and the execution of that function creates yet another log entry, and so on ad infinitum. Soon we may discover that 90% of our CloudTrail logs consist of Lambda functions being executed in separate infinite loops.

The problem is even more severe if we decide to use several Lambda functions, e.g., for additional processing and for feeding a SIEM of some type.
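A rough way to see the scale of the problem is a toy model like the one below. It assumes that every delivery from a log group invokes each function in the processing chain once and that every invocation produces one CloudTrail entry. The function names and the sample numbers are purely illustrative and are not measurements from the client's environment.

```python
def cloudtrail_invoke_events(deliveries, functions_in_chain=1):
    """Entries caused by log processing: one per delivery per function (assumed)."""
    return deliveries * functions_in_chain


def share_of_trail(invoke_events, other_events):
    """Fraction of all CloudTrail entries that comes from log-processing invocations."""
    total = invoke_events + other_events
    return invoke_events / total if total else 0.0


# Illustrative numbers only: 90,000 deliveries and 10,000 unrelated entries.
single = cloudtrail_invoke_events(90_000)
chained = cloudtrail_invoke_events(90_000, functions_in_chain=3)
print(f"one function:    {share_of_trail(single, 10_000):.0%} of the trail")
print(f"three functions: {share_of_trail(chained, 10_000):.0%} of the trail")
```

With these sample numbers, a single function already accounts for nine out of ten entries, and each extra function in the chain pushes the share higher.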

All of these entries are CloudTrail entries, and CloudTrail can be relatively expensive once we cross the free tier threshold. On the other hand, we don't want to turn off logging of Lambda function executions entirely, because the functions may contain potentially damaging code and we want to maintain the log trail and accountability.
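To get a feeling for the cost side, a back-of-the-envelope estimate can be sketched as follows. The per-100,000-events price is deliberately a parameter: the value used in the example is an assumption for illustration only, so check the current CloudTrail pricing page for your region before relying on any figure.

```python
def monthly_data_event_cost(events_per_month, price_per_100k_usd):
    """Estimate the monthly charge for recorded events at a given unit price."""
    return events_per_month / 100_000 * price_per_100k_usd


# Hypothetical inputs: 50 million invocation entries per month and an
# assumed price of 0.10 USD per 100,000 events.
estimate = monthly_data_event_cost(50_000_000, price_per_100k_usd=0.10)
print(f"estimated monthly cost: {estimate:.2f} USD")
```

The estimate grows linearly with the number of invocations, so every additional function in the processing chain multiplies the bill accordingly.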

Summary

I hope that I was able to explain the problematic pattern in an understandable way. Please see the next blog entry for a description of a proper replacement for this anti-pattern. Also, make sure you check out our Secure Architecture assessment service and Security Controls consulting package to find out how we can help you improve your security operations!
