This is the first of a multi-part series on security automation. This blog will focus on high-level design considerations. The next blog will focus on technical design considerations for building security automation. The third blog will dive even deeper with specific examples as the series continues to get more technical.
There are many possible approaches for adding automation to your security process. For many security engineers, it is an opportunity to get away from reviewing other engineers’ code and write some of their own. One key difference between a successful automation project and “abandonware” is whether the project produces meaningful results for the organization. To accomplish that, it is critical to have a clear idea of what problem you are trying to solve at the outset of the project.
When choosing the right path for designing security automation, you need to decide what the primary goals of the automation will be. Most automation projects can be grouped into four common themes: scalability, consistency, efficiency, and metrics.
Scalability is something that most engineers instinctively go to first because the cloud has empowered researchers to do so much more. Security tools designed for scalability often focus on the penetration testing phase of the software development lifecycle. They often involve running black or grey box security tests against the infrastructure in order to confirm that the service was deployed correctly. Scalability is necessary if you are going to measure every host in a large production environment. While scalability is definitely a powerful tool that is necessary in the testing phase of your security development lifecycle, it sometimes overshadows other important options that can occur earlier in the development process and that may be less expensive in terms of resources.
There is a lot of value in being able to say that “Every single release has ___.” Consistency isn’t considered as challenging a problem as scalability because it is often taken for granted. Consistency is necessary for compliance requirements where public attestations need to have clear, simple statements that customers and auditors can understand. In addition, “special snowflakes” in the environment can drown an organization’s response when issues arise. Consistency automation projects frequently focus on the development or build phase of the software development lifecycle. They include integrating security tasks into build tools like Jenkins, Chef, Git, or Maven. By adding security controls into these central tools in the development phase, you can have reasonable confidence that machines in production are correctly configured without scanning each and every one individually.
Efficiency projects typically focus on improving operational processes that currently involve too much human interaction. The goal is to refocus the team’s manual resources on more important tasks. For instance, many efficiency projects have the word “tracking” somewhere in their definition and involve better leveraging tools like JIRA or SharePoint. Often, efficiency automation is purchased rather than built because you aren’t particularly concerned with how the problem gets solved, so long as it gets solved and you aren’t the one who has to maintain the code for it. That said, Salesforce’s open-source VulnReport.io project (http://vulnreport.io) is an example of a custom-built efficiency tool, which they claim improved operational efficiency and essentially resulted “in a ‘free’ extra engineer for our team.”
Metrics gathering can be a project in itself or it can be the byproduct of a security automation tool. Metrics help inform and influence management decisions with concrete data. That said, it is important to pick metrics that can guide management and development teams towards solving critical issues. For instance, development teams will interpret the metrics gathered as the key performance indicator (KPI) that they are being measured against by management.
In addition, collecting data almost always leads to requests for more detailed data. This can be useful in helping to understand a complex problem or it can be a distraction that leads the team down a rabbit hole. If you take time to select the proper metrics, then you can help keep the team focused on digging deeper into the critical issues.
If your scalable automation project aims to run a web application penetration tool (WAPT) across your entire enterprise, then you are basically creating an “enterprise edition” for that tool. If you have used enterprise edition WAPTs in the past and you did not achieve the success that you wanted, then recreating the same concept with a slightly different tool will most likely not produce significantly different results when it comes to action within the enterprise. The success or failure of a tool typically hinges on the operational process surrounding the tool more than on the tool itself. If there is no planning for handling the output from the tool, then increasing the scale at which the tool is run doesn’t really matter. When you are designing your automation project, consider operational questions such as:
Are you enumerating a problem that you can fix?
Enumerating an issue that the organization doesn’t have the resources to address can sometimes help justify getting the funding for solving the problem. On the other hand, if you are enumerating a problem that isn’t a priority for an organization, then perhaps you should refocus the automation on issues that are more critical. If no change occurs as the result of the effort, then the project will stop iterating because there is no need to re-measure the issue.
In some situations, it may be better to tackle the technical debt of basic issues before tackling larger issues. Tests for basic technical debt issues are often easier to create and they are easier for the dev team to address. As the dev team addresses the issues, the project will also iterate in response. While technical debt isn’t as exciting as the larger issues, addressing it may be a reasonable first step towards getting immediate ROI.
Are you producing “noise at scale”?
Running a tool that is known for creating a high level of false positives at scale will produce “noise at scale”. Unless you have an “at scale” triage team to eliminate the false positives, you are just generating more work for everyone. Teams are less likely to take action on metrics that they believe are debatable due to the fear that their time might be wasted. A good security tool will empower the team to be more efficient rather than drown them in reports.
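A quick back-of-the-envelope calculation can make “noise at scale” concrete before you commit to a tool. The sketch below is only an illustration: the function name and every input value (host count, findings per host, false-positive rate, minutes per finding) are hypothetical placeholders that you would replace with numbers from your own environment.

```python
def wasted_triage_hours(hosts: int,
                        findings_per_host: float,
                        false_positive_rate: float,
                        minutes_per_finding: float) -> float:
    """Estimate the hours spent dismissing false positives in one scan cycle.

    All inputs are assumptions supplied by the caller; nothing here is
    measured from a real tool.
    """
    total_findings = hosts * findings_per_host
    false_positives = total_findings * false_positive_rate
    return false_positives * minutes_per_finding / 60


# Hypothetical example: 1,000 hosts, 5 findings each, 40% false positives,
# 10 minutes to triage each finding.
hours = wasted_triage_hours(1000, 5, 0.4, 10)
print(f"Estimated triage time lost per scan: {hours:.0f} hours")
```

If the estimate exceeds the triage capacity you actually have, that is a sign to tune the tool or narrow its scope before scaling it up.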
How will metrics guide the development team?
As mentioned earlier, the metric will be interpreted as a KPI for the team and they will focus their strategy around what is reported to management. Therefore, it makes sense not to bother measuring non-critical issues since you don’t want the team to get distracted by minor issues. You will want to make sure that you are collecting metrics on the top issues in a way that will encourage teams towards the desired approach.
Often there are multiple ways to solve an issue and therefore multiple ways to measure the problem. Let’s assume that you wanted to create a project to tackle cross-site scripting (XSS). Creating metrics that count the number of XSS bugs will focus a development team on a bug fixing approach to the problem. Alternatively, counting the number of sites with Content Security Policy headers deployed will focus the development team on security mitigations for XSS. In some cases, focusing the team on security mitigations has more immediate value than focusing on bug fixing.
What metrics does management need to see?
One method to determine how your metrics will drive development teams and inform management is to phrase them in terms of an assertion. For instance, “I assert that the HSTS header is returned by all our sites in order to ensure our data is always encrypted.” By phrasing it as an assertion, you are phrasing the test in simple true/false terms that can be reliably measured at scale. You are also phrasing the test in terms of its desired outcome using simple language. This makes it easier to determine if the goal implied by the metric’s assertion meets with management’s goals.
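As a rough illustration of how the HSTS assertion above could become a true/false test, here is a minimal Python sketch. The function names are hypothetical, and treating “a Strict-Transport-Security header with a positive max-age” as a pass is an assumption about what your organization would accept, not a definitive rule.

```python
import urllib.request
from typing import Mapping


def asserts_hsts(headers: Mapping[str, str]) -> bool:
    """Return True if the headers include HSTS with a positive max-age."""
    normalized = {name.lower(): value for name, value in headers.items()}
    value = normalized.get("strict-transport-security", "")
    for directive in value.split(";"):
        key, _, arg = directive.strip().partition("=")
        if key.lower() == "max-age":
            arg = arg.strip().strip('"')
            return arg.isdigit() and int(arg) > 0
    return False


def fetch_headers(url: str, timeout: float = 10.0) -> dict:
    """Fetch response headers with a HEAD request (requires network access)."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return dict(response.headers.items())


def assertion_report(results: Mapping[str, bool]) -> str:
    """Summarize per-site true/false results as a single statement."""
    passing = sum(results.values())
    return f"{passing}/{len(results)} sites satisfy the HSTS assertion"
```

Because each site yields a single boolean, the summary line maps directly onto the wording of the assertion, which is what makes it easy to report upward.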
From a management perspective, it is also important to understand whether you are taking a measurement of change or a measurement of adoption. Measuring the number of bugs in an application is often measuring an ongoing wave. If security programs are working, the height of the waves will trend down over time. However, that means you have to watch the wave through multiple software releases before you can reliably see a trend in its change. If you measure security mitigations, then you are measuring the adoption rate of a project that will end with a state of relative “completeness.” Tracking wave metrics over time is valuable because you need to see when things are getting better or worse. However, since it is easy to procrastinate on issues that are open ended, adoption-style projects that can be expressed as having a definitive “end” may get more immediate traction from the development team because you can set a deadline that needs to be met.
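The difference between the two kinds of measurement shows up in how you would compute them. The following sketch is one possible way to express it, with hypothetical function names and made-up sample numbers: an adoption metric is a single ratio with a natural end state of 100%, while a wave metric only becomes meaningful as a trend across several releases (here, a simple least-squares slope).

```python
from typing import Sequence


def adoption_rate(sites_with_mitigation: int, total_sites: int) -> float:
    """Fraction of sites that have adopted a mitigation (1.0 means 'done')."""
    if total_sites <= 0:
        raise ValueError("total_sites must be positive")
    return sites_with_mitigation / total_sites


def wave_trend(bug_counts_per_release: Sequence[int]) -> float:
    """Least-squares slope of bug counts across releases.

    A negative slope suggests the wave is trending down. At least two
    releases are needed before any trend can be computed.
    """
    n = len(bug_counts_per_release)
    if n < 2:
        raise ValueError("need at least two releases to see a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(bug_counts_per_release) / n
    numerator = sum((x - mean_x) * (y - mean_y)
                    for x, y in enumerate(bug_counts_per_release))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


# Hypothetical sample data for illustration only.
print(adoption_rate(45, 60))
print(wave_trend([10, 8, 6, 4]))
```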
Putting it all together
With these ideas and questions in mind, you can mentally weigh which types of projects to start with for immediate ROI and the different tools for deploying them.
For instance, tests that count XSS and blind SQL injection bugs are hard to set up (authentication to the application, crawling the site, etc.), they frequently produce false positives, and they typically result in the team focusing on bug fixing, which requires in-depth monitoring over time because it is a wave metric. In contrast, tests that measure security headers, such as X-Frame-Options or HSTS, are simple to write, have low false-positive rates, can be defined as “(mostly) done” once the headers are set, and focus the team on mitigations. Another easy project might be writing scalable tests that confirm the SSL configuration meets the company standards. Therefore, if you are working on a scalability project, starting with a simple SSL or security header project can be a quick win that demonstrates the value of the tool. From there, you can then progress to measuring the more complex issues.
However, let’s say you don’t have the compute resources for a scalability project. An alternative might be to turn the projects into consistency-style projects earlier in the lifecycle. You could create Git or Jenkins extensions that search the source code for indicators that the team has deployed security headers or proper SSL configurations. You would then measure how many teams are using the build platform extensions and review the collected results from the extension. It would have a similar effect to testing the deployed machines without as much implementation overhead. Whether this will work better overall for your organization will depend on where you are with your security program and its compliance requirements.
While the technical details of how to build security automation are an exciting challenge for any engineer, it is critical to build a system that will empower an organization. Your project will have a better chance of success if you spend time considering how the output of your tool will help guide progress. You can also manage the scope of the project, in terms of development effort and project coverage, by carefully considering where in the development process you will deploy the automation. By spending time on defining how the tool can best serve the team’s security goals, you can help ensure you are building a successful platform for the company.
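To give a sense of how small an SSL configuration test can be, here is a minimal sketch using Python’s standard `ssl` module. It checks only one property, the negotiated protocol version, and the “TLSv1.2” minimum is a stand-in for whatever your company standard actually requires. The function names are hypothetical.

```python
import socket
import ssl

# Protocol names as reported by ssl.SSLSocket.version(), oldest first.
PROTOCOL_ORDER = ["SSLv2", "SSLv3", "TLSv1", "TLSv1.1", "TLSv1.2", "TLSv1.3"]


def meets_minimum(negotiated: str, minimum: str = "TLSv1.2") -> bool:
    """True if the negotiated protocol is at least the assumed minimum."""
    if negotiated not in PROTOCOL_ORDER:
        return False
    return PROTOCOL_ORDER.index(negotiated) >= PROTOCOL_ORDER.index(minimum)


def negotiated_protocol(host: str, port: int = 443,
                        timeout: float = 10.0) -> str:
    """Connect to the host and return the negotiated protocol name.

    Requires network access. Handshake and certificate errors are not
    caught here; a real tool would need to decide how to report them.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version() or ""
```

One caveat with this approach: a default client context may itself refuse older protocols, so a server that only speaks an outdated version could surface as a handshake error rather than a failed comparison.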
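The core of such an extension could be little more than a search over the checked-out source tree. The sketch below is a simplified illustration, not a real Git or Jenkins plugin: the indicator strings, the file suffixes, and the function name are all assumptions, and a plain substring match will miss headers that are set indirectly (for example, through a framework default).

```python
from pathlib import Path
from typing import Dict, Sequence

# Hypothetical indicator strings and file types to search.
INDICATORS = ("Strict-Transport-Security", "X-Frame-Options")
SUFFIXES = (".conf", ".py", ".java", ".xml", ".yml", ".yaml")


def find_indicators(root: Path,
                    indicators: Sequence[str] = INDICATORS,
                    suffixes: Sequence[str] = SUFFIXES) -> Dict[str, bool]:
    """Report which indicator strings appear anywhere under the source tree."""
    found = {name: False for name in indicators}
    for path in root.rglob("*"):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore").lower()
        for name in indicators:
            if name.lower() in text:
                found[name] = True
    return found
```

A build step could call `find_indicators(Path("."))` on each repository and send the resulting dictionary to a central collector, which gives you the per-team results described above.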
The next blog will focus on the technical design considerations for building security automation tools.
Principal Scientist, Security