A Data-Driven Blueprint to Scaling Cloud Operations Security (Part 2 of 2)

(This is part 2 of a two-part series.)

In my previous post, I discussed the foundational building blocks that enable you to make the best use of security data in your operational security strategy. In this post, I will dive deeper into how we are making better use of the data we collect to constantly evolve and strengthen our approach.

The Data Plane

Security data is an immense source of security intelligence. If it is collected diligently, the answers you are looking for may already be present in the logs; the trick is to ask the right questions. If you compare the data against a security standard, you might find potential security gaps that teams need to remediate. If you compare the security data with the risk burn-down velocity, you will notice which teams need a nudge to raise their security levels.

In the end, security engagements and tooling are both rich sources of data, and processing that data is what turns it into prioritized, actionable findings.
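As an illustration of the burn-down comparison described above, here is a minimal sketch. It assumes a hypothetical export of findings as (team, opened, closed) tuples and an arbitrary 50% threshold; neither the data shape nor the threshold comes from any particular tool.

```python
from collections import defaultdict
from datetime import date

# Hypothetical findings export: (team, opened_on, closed_on or None if still open).
FINDINGS = [
    ("payments", date(2019, 5, 1), date(2019, 5, 10)),
    ("payments", date(2019, 5, 3), date(2019, 5, 20)),
    ("search", date(2019, 5, 2), None),
    ("search", date(2019, 5, 4), None),
    ("search", date(2019, 5, 6), date(2019, 5, 30)),
]


def burn_down_rate(findings, start, end):
    """Per team: fraction of findings opened in [start, end] that were closed by end."""
    opened = defaultdict(int)
    closed = defaultdict(int)
    for team, opened_on, closed_on in findings:
        if start <= opened_on <= end:
            opened[team] += 1
            if closed_on is not None and closed_on <= end:
                closed[team] += 1
    return {team: closed[team] / opened[team] for team in opened}


def teams_to_nudge(rates, threshold=0.5):
    """Teams whose burn-down rate falls below the (assumed) threshold."""
    return sorted(team for team, rate in rates.items() if rate < threshold)


rates = burn_down_rate(FINDINGS, date(2019, 5, 1), date(2019, 5, 31))
slow_teams = teams_to_nudge(rates)
```

In practice the window and threshold would be tuned to each team's size and the severity of the findings involved.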

Because scaling security is the art of balancing human engagement and security tooling, it is important to look to data when setting a company’s prioritization strategy. The way data is viewed can help security teams scale in terms of both business criticality and risk. Driving the security program by analyzing the data is the essence of data-driven security.

The only data that matters is the one you choose to see.

Effective security has a lot to do with prioritization and choosing your battles. Forcing action on every piece of data can lead to security fatigue. It is therefore important to arm your team with the data that matters. To do that, you need to create views of the data that allow the team to scale and thereby adopt a data-driven security strategy. Below, I outline a few methods that can help any security team trying to tap into data to scale its security program.

An adoption first approach – removing the blind spots

When your security tooling cannot monitor systems, or cannot communicate potential security issues to the teams responsible for fixing them, you have a security blind spot. It is therefore very important to push adoption of the security tool chain across your company. Measuring deviation from the security standard is difficult where the tool chain has not been adopted, so adoption improves visibility and helps enumerate the potential risks in a product team’s environment.

Security blind spots are not limited to adoption of security tools alone. They can also arise from poor asset and service management. Asset management requires the company to know about all of its assets. Service management requires the ability to know about all services within the company and to attribute every system and service to an individual team.

By doing a deep dive into the data, you can answer questions such as how effective your security monitoring of your organization’s cloud assets is, whether the tool’s view of the operational environment matches the reality of your company’s cloud usage, and whether there are teams that do not take advantage of the default security offered by the security tool chain.
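The questions above largely reduce to set comparisons between what the inventory says exists and what the tooling actually sees. The sketch below assumes a hypothetical inventory mapping asset IDs to owning teams and a hypothetical set of asset IDs reported by the security tool; all names are illustrative.

```python
# Hypothetical asset inventory: asset ID -> owning team (None if unattributed).
inventory = {
    "vm-1": "payments",
    "vm-2": "payments",
    "vm-3": "search",
    "db-1": None,
}

# Hypothetical set of asset IDs the security tool chain currently reports on.
monitored = {"vm-1", "vm-3", "vm-9"}


def coverage_report(inventory, monitored):
    """Compare the inventory's view of the environment with the tool's view."""
    known = set(inventory)
    return {
        # Blind spots: assets that exist but that the tool does not see.
        "unmonitored": sorted(known - monitored),
        # Assets the tool sees but the inventory does not know about.
        "unknown_to_inventory": sorted(monitored - known),
        # Assets that cannot be attributed to a team.
        "unowned": sorted(a for a, owner in inventory.items() if owner is None),
        # Share of inventoried assets covered by the tool.
        "coverage": len(known & monitored) / len(known) if known else 0.0,
    }


report = coverage_report(inventory, monitored)
```

Each key in the report corresponds to a different kind of blind spot: missing tool adoption, incomplete asset management, or missing ownership attribution.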

A risk first approach – prioritizing appropriately and broadly.

Because human interactions cannot scale across every service, and tooling can produce a wide variety of issues, it is sometimes important to take a risk-first approach to security. In a risk-first approach, the definition of risk is dynamic and changes over time, but the idea is to use the data plane to tackle the current high-risk items across your company first. This also lets us open urgent conversations with teams we have not yet engaged. For example, going after end-of-life software, mitigating all remote code execution (RCE) issues urgently, or reviewing high-risk software running across the company can reduce risk quickly. Peeking into the collected data in this way provides an opportunity to scale by prioritizing risks.

Similarly, addressing the high-risk issues (blockers or critical issues) identified by security researchers is a risk-first approach outside the tooling space.
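One way to picture a risk-first view is a simple score that pushes RCE and end-of-life findings to the top of the queue regardless of which team owns them. The sketch below is illustrative only: the `Finding` fields and the weights are assumptions, and a real definition of risk would change over time as described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical finding record; the fields are assumptions for illustration."""
    team: str
    title: str
    severity: str  # "critical", "high", "medium" or "low"
    rce: bool = False
    end_of_life: bool = False


# Assumed weights; tune these as the definition of risk evolves.
SEVERITY_WEIGHT = {"critical": 8, "high": 5, "medium": 2, "low": 1}


def risk_score(finding):
    """Score a finding, boosting RCE and end-of-life software."""
    score = SEVERITY_WEIGHT[finding.severity]
    if finding.rce:
        score += 10
    if finding.end_of_life:
        score += 4
    return score


def prioritize(findings, top_n=3):
    """Return the top_n findings across all teams, highest risk first."""
    return sorted(findings, key=risk_score, reverse=True)[:top_n]


findings = [
    Finding("payments", "Outdated TLS configuration", "medium"),
    Finding("search", "RCE in image parser", "critical", rce=True),
    Finding("billing", "End-of-life OS on build hosts", "high", end_of_life=True),
    Finding("search", "Verbose error pages", "low"),
]
top = prioritize(findings, top_n=2)
```

Because the ranking ignores team boundaries, the highest-risk items surface company-wide instead of per product.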

Security insights – extract security intelligence from the data.

Security data is not always about security bugs or security issues. It is also about the intelligence that sits within the data and may be untapped by security teams. For example, the data tells you the scale of infrastructure that a team uses, which security-sensitive operations a team performs that are of interest to the security team, and whether a team is using an odd service that other teams do not usually use. If intelligence is mined from the security data, security researchers can get better insights into the product, ask the right questions, and prioritize their reviews appropriately. The data is there; the challenge is to bubble up the interesting tidbits of security-relevant information that security teams can use. So, while it may seem interesting to extract security issues from the data one collects, it is also important to extract an abstract understanding of the environment that helps one make good security decisions.
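The "odd service" example can be sketched as a count of how many teams use each service. The snippet assumes hypothetical (team, service) pairs distilled from cloud audit logs; the service names and the one-team cutoff are illustrative.

```python
from collections import defaultdict

# Hypothetical (team, service) pairs extracted from cloud audit logs.
usage = [
    ("payments", "object-storage"),
    ("search", "object-storage"),
    ("billing", "object-storage"),
    ("payments", "key-management"),
    ("search", "key-management"),
    ("billing", "video-transcoding"),
]


def rare_services(usage, max_teams=1):
    """Return services used by at most max_teams teams, with the teams using them."""
    teams_by_service = defaultdict(set)
    for team, service in usage:
        teams_by_service[service].add(team)
    return {
        service: sorted(teams)
        for service, teams in teams_by_service.items()
        if len(teams) <= max_teams
    }


rare = rare_services(usage)
```

A rare service is not a finding in itself; it is a prompt for a security researcher to ask the owning team what it is used for.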

Better data = better decisions.

Because a data-driven approach to security is built on top of a data architecture and product metadata, it is important to give this piece a lot of attention. Any discrepancy between two data sources, stale data, or integrity problem that leads to bad analysis should be dealt with urgently. Otherwise, security teams can lose credibility when they generate false positives or cannot reach the right teams because proper metadata is missing. Keeping the data fresh and relevant is therefore an unstated but important requirement in any data-driven approach. Like code, data should be unit tested, so that its integrity is monitored automatically with the goal of maintaining high data quality.
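A minimal sketch of what unit testing data could look like follows. It assumes hypothetical asset records with `asset_id`, `owner_team`, and `last_seen` fields and a one-day freshness limit; these names and limits are illustrative, not taken from any specific pipeline.

```python
from datetime import datetime, timedelta


def check_records(records, now, max_age=timedelta(days=1)):
    """Return (asset_id, problem) pairs for duplicate, unowned, or stale records."""
    problems = []
    seen = set()
    for record in records:
        asset_id = record["asset_id"]
        if asset_id in seen:
            problems.append((asset_id, "duplicate"))
        seen.add(asset_id)
        if not record.get("owner_team"):
            problems.append((asset_id, "missing owner"))
        if now - record["last_seen"] > max_age:
            problems.append((asset_id, "stale"))
    return problems


# Hypothetical records; the timestamp is fixed so the check is repeatable.
now = datetime(2019, 6, 18, 12, 0)
records = [
    {"asset_id": "vm-1", "owner_team": "payments", "last_seen": now - timedelta(hours=2)},
    {"asset_id": "vm-2", "owner_team": "", "last_seen": now - timedelta(hours=3)},
    {"asset_id": "vm-3", "owner_team": "search", "last_seen": now - timedelta(days=3)},
    {"asset_id": "vm-1", "owner_team": "payments", "last_seen": now - timedelta(hours=1)},
]
problems = check_records(records, now)
```

Run on a schedule, a check like this can fail loudly before bad data reaches an analysis or a notification sent to a product team.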


Scaling a cloud operational security team is a fine balance between automation, which finds potential security gaps across the breadth of products, and product deep dives, where the potential security issues are more contextual. As outlined above, data acts as a catalyst for scaling cloud operations security if used correctly, and it is a powerful third plane that security teams should not ignore. All of this needs to be backed by mature security on-boarding and engagement processes, with tooling that supports them.

Scaling security is a balancing act between high-touch engagement and automation, and the trick is finding the right mix. Log data is readily available from cloud providers and tools and can be useful for both product and security teams. If a security team already has the data, it can start scaling by tapping into it and asking the right set of questions. A data-driven approach to security helps teams make good, quick prioritization decisions, pushing the security program further towards high impact, scalability, and a high return on investment.

Mohit Kalra
Director, Secure Software Engineering, Operational Security

Major Initiatives, Secure Product Lifecycle (SPLC), Security Automation

Posted on 06-18-2019