Posts tagged "Security automation"

Lessons Learned from Improving Transport Layer Security (TLS) at Adobe

Transport Layer Security (TLS) is the foundation of security on the internet. As our team evolved from a primarily consultative role to solving problems for the entire company, we chose TLS as one of the areas to improve. The goal of this blog post is to share the lessons we’ve learned from this project.

TLS primer

TLS is a commonly used protocol to secure communications between two entities. If a client is talking to a server over TLS, it expects the following:

  1. Confidentiality – The data between the client and the server is encrypted and a network eavesdropper should not be able to decipher the communication.
  2. Integrity – The data between the client and the server should not be modifiable by a network attacker.
  3. Authentication – In the most common case, the identity of the server is authenticated by the client during the establishment of the connection via certificates. You can also have 2-way authentication, but that is not commonly used.

Lessons learned

Here are the main lessons we learned:

Have a clearly defined scope

Instead of trying to boil the ocean, we decided to focus on around 100 domains belonging to our Creative Cloud, Document Cloud and Experience Cloud solutions. This helped us focus on these services first versus being drowned by the thousands of other Adobe domains.

Have clearly defined goals

TLS is a complicated protocol and the definition of a “good” TLS configuration keeps changing over time. We wanted simple, easy-to-test, pass/fail criteria for all requirements on the endpoints in scope. We ended up choosing the following:

SSL Labs grade

SSL Labs does a great job of testing a TLS configuration and boiling it down to a grade. Grade ‘A’ was viewed as a pass and anything else was considered a fail. There might be some endpoints that had valid reasons to support certain ciphers that resulted in a lower grade. I will talk about that later in this post.

Apple Transport Security

Apple has a minimum bar for TLS configuration, App Transport Security (ATS), that all endpoints must pass if iOS apps are to connect to them. We reviewed these criteria and all the requirements were deemed sensible. We decided to make ATS a requirement for all endpoints, regardless of whether an endpoint was being accessed from an iOS app or not. We found a few corner cases where a configuration would get SSL Labs grade A and fail ATS (and vice versa), which we resolved on a case-by-case basis.

HTTP Strict Transport Security

HSTS (HTTP Strict Transport Security) is an HTTP response header that instructs compliant clients to always use HTTPS to connect to a website. It helps solve the problem of the initial request being made over plain HTTP when a user types in the site without specifying the protocol, and it helps prevent the hijacking of connections. When a compliant client receives this header, it only uses HTTPS to make connections to this website for the max-age value set by the header. The max-age count is reset every time the client receives this header. You can read the details about HSTS in RFC 6797.
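
To make the max-age behavior concrete, here is a minimal sketch of how a header value might be interpreted. The function name `parse_hsts` and the shape of the returned dictionary are hypothetical, chosen only for illustration; the directive names `max-age` and `includeSubDomains` come from RFC 6797. A real client implements the full set of rules in the RFC.

```python
def parse_hsts(header_value):
    """Parse a Strict-Transport-Security header value into a small dict.

    Illustrative only: this handles the two most common directives and
    ignores everything else.
    """
    policy = {"max_age": None, "include_subdomains": False}
    for directive in header_value.split(";"):
        directive = directive.strip()
        if not directive:
            continue
        name, _, value = directive.partition("=")
        name = name.strip().lower()  # directive names are case-insensitive
        value = value.strip().strip('"')  # the value may be a quoted string
        if name == "max-age" and value.isdigit():
            policy["max_age"] = int(value)  # lifetime of the policy, in seconds
        elif name == "includesubdomains":
            policy["include_subdomains"] = True
    return policy


print(parse_hsts("max-age=31536000; includeSubDomains"))
# {'max_age': 31536000, 'include_subdomains': True}
```

A max-age of 31536000 seconds is one year. Because the count is reset on every response that carries the header, a regularly visited site effectively never falls back to plain HTTP.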

Complete automation of testing workflow

We wanted to have minimal human cost for these tests on an ongoing basis. This project allowed us to utilize our Security Automation Framework (SAF). Once the scans are set up and scheduled, they keep running daily and the results are passed on to us via email, Slack, etc. After the initial push to get all the endpoints to pass all the tests, it was very easy to catch any drift when we saw a failed test. Here is what these results look like in the SAF UI:

Devil is in the Detail

From a high level it seems fairly straightforward to go about improving TLS configurations. However, it is a little more complicated when you get into the details. I wanted to talk a little bit about how we went about removing ciphers that were hampering the SSL Labs grade.

To understand the issues, you have to know a little bit about the TLS handshake. During the handshake, the client and the server decide on which cipher to use for the connection. The client sends the list of ciphers it supports in the client “hello” message of the handshake to the server. If server-side preference is enabled, the cipher that is listed highest in the server preference and also supported by the client is picked. In our case, the cipher that was causing the grade degradation was listed fairly high on the list. As a result, when we looked at the ciphers used for connections, this cipher was used in a significant percentage of the traffic. We didn’t want to just remove it because of the potential risk of dropping support for some customers without any notification. Therefore, we initially moved it to the bottom of the supported cipher list. This reduced the percentage of traffic using that cipher to a very small value. We were then able to identify that a partner integration was responsible for all the traffic using this cipher. We reached out to that partner and notified them to make appropriate changes before we disabled that cipher. If you found this interesting, you might want to consider working for us on these projects.
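
The effect of reordering the list can be shown with a small sketch of server-side preference. This is a simplified model, not a real TLS implementation: `select_cipher` is a hypothetical helper, and the cipher names are placeholders rather than real cipher suites. For simplicity the weak cipher is placed first in the "before" list.

```python
def select_cipher(server_preference, client_ciphers):
    """Return the first cipher in the server's ordered list that the client also offers."""
    offered = set(client_ciphers)
    for cipher in server_preference:
        if cipher in offered:
            return cipher
    return None  # no shared cipher: the handshake would fail


# Placeholder names, not real cipher suites.
MODERN_CLIENT = ["STRONG_A", "STRONG_B", "WEAK_X"]
LEGACY_PARTNER = ["WEAK_X"]

before = ["WEAK_X", "STRONG_A", "STRONG_B"]  # weak cipher listed high
after = ["STRONG_A", "STRONG_B", "WEAK_X"]   # weak cipher moved to the bottom

print(select_cipher(before, MODERN_CLIENT))   # WEAK_X
print(select_cipher(after, MODERN_CLIENT))    # STRONG_A
print(select_cipher(after, LEGACY_PARTNER))   # WEAK_X
```

After the reordering, only clients that support nothing better still negotiate the weak cipher, which is what makes the remaining traffic easy to attribute.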

Future work

In the future, we want to expand the scope of this project. We also want to expand the requirements for services that have achieved the requirements described in this post. One of the near-term goals is to get some of our domains added to the HSTS preload list. Another goal is to do more thorough monitoring of certificate transparency logs for better alerting for new certificates issued for Adobe domains. We have also been experimenting with HPKP. However, as with all new technologies, there are issues we must tackle to continue to ensure the best balance of security and experience for our customers.

Gurpartap Sandhu
Security Researcher

Developing an Amazon Web Services (AWS) Security Standard

Adobe has an established footprint on Amazon Web Services (AWS).  It started in 2008 with Managed Services, and expanded greatly with the launch of Creative Cloud in 2012 and the migration of Business Catalyst to AWS in 2013. In this time, we found challenges in keeping up with AWS security review needs.  In order to increase scalability, it was clear we needed a defined set of minimum AWS security requirements and tooling automation for auditing AWS environments against it.  This might sound simple, but like many things, the devil was in the details. It took focused effort to ensure the result was a success.  So how did we get here?  Let’s start from the top.

First, the optimal output format needed to be decided upon.  Adobe consists of multiple Business Units (BUs) and there are many teams within those BUs.  We needed security requirements that could be broadly applied across the company as well as to acquisitions. They had to apply to both existing and new services across BUs, and they also had to be future-proof. Given these constraints, creating a formal standard for our teams to follow was the best choice.

Second, we needed to build a community of stakeholders in the project. For projects with broad impact such as this, it’s best to have equally broad stakeholder engagement.  I made sure we had multiple key representatives from all the BUs (leads, architects, & engineers) and that various security roles were represented (application security, operational security, incident response, and our security operations center).  This led to many strong opinions about direction. Thus, it was important to be an active communication facilitator for all teams to ensure their needs were met.

Third, we reviewed other efforts in the security industry to see what information we could learn.  There are many AWS security-related whitepapers from various members of the security community.  There have been multiple security-focused AWS re:Invent presentations over the years.  There’s also AWS’s Trusted Advisor and Config Rules, plus open source AWS security assessment tools like Security Monkey from Netflix and Scout2 from NCC Group.  These are all good places to glean information from.

While all of these varied information sources are fine and dandy, is their security guidance relevant to Adobe?  Does it address Adobe’s highest risk areas in AWS?  Uncritically following public guidance could result in the existence of a standard for the sake of having a standard – not one that delivered benefits for Adobe.

A combination of security community input, internally and externally documented best practices, and looking for patterns and possible areas of improvement was used to define an initial scope to the standard.  At the time the requirements were being drafted, AWS had over 30 services. It was unreasonable (and unnecessary) to create security guidance covering all of them.  The initial scope for the draft minimum security requirements was AWS account management, Identity & Access Management (IAM), and Compute (Amazon Elastic Compute Cloud (EC2) and Virtual Private Cloud (VPC)).

We worked with AWS stakeholders within Adobe through monthly one-hour meetings to get agreement on the minimum bar security requirements for AWS and which were to be applied to all of Adobe’s AWS accounts (dev, stage, prod, testing, QA, R&D, personal projects, etc).  We knew we’d want a higher security bar for environments that handle more sensitive classes of data or were customer facing. We held a two-day AWS security summit that was purely focused on defining these higher bar security requirements to ensure all stakeholders had their voices heard and avoid any contention as the standard was finalized.

As a result of the summit, the teams were able to define higher security requirements that covered account management/IAM and compute (spanning architecture, fleet management, data handling, and even requirements beyond EC2/VPC including expansion into AWS-managed services such as S3, DynamoDB, SQS, etc.).

I then worked with Adobe’s Information Systems Security Policies & Standards team to publish an Adobe-wide standard.  I transformed the technical requirements into an appropriate standard.  This was then submitted to Adobe’s broader standards’ teams to review.  After this review, it was ready for formal approval.

The necessary teams agreed to the standard and it was officially published internally in August 2016.  I then created documentation to help teams use the AWS CLI to audit for and correct issues from the minimum bar requirements. We also communicated the availability of the standard and began assisting teams towards meeting compliance with it.

Overall the standard has been well received by teams.  They understand the value of the standard and its requirements in helping Adobe ensure better security across our AWS deployments.  We have also developed timelines with various teams to help them achieve compliance with the standard. And, since our AWS Security Standard was released we have seen noted scalability improvements and fewer reported security issues.  This effort continues to help us in delivering the security and reliability our customers expect from our products and services.

Cynthia Spiess
Web Security Researcher

Evolving an Application Security Team

A centralized application security team, similar to ours here at Adobe, can be the key to driving the security vision of the company. It helps implement the Secure Product Lifecycle (SPLC) and provide security expertise within the organization.  To stay current and have impact within the organization, a security team also needs to be in the mode of continuous evolution and learning. At inception of such a team, impact is usually localized more simply to applications that the team reviews.  As the team matures, the team can start to influence the security posture across the whole organization. I lead the team of security researchers at Adobe. Our team’s charter is to provide security expertise to all application teams in the company.  At Adobe, we have seen our team mature over time. As we look back, we would like to share the various phases of evolution that we have gone through along the way.

Stage 1:  Dig Deeper

In the first stage, the team is in the phase of forming and acquires the required security skills through hiring and organic learning. The individuals on the team bring varied security expertise, experience, and a desired skillset to the team. During this stage, the team looks for applicability of security knowledge to the applications that the engineering teams develop.  The security team starts this by doing deep dives into the application architecture and understanding why the products are being created in the first place. Here the team understands the organization’s application portfolio, observes common design patterns, and then starts to build the bigger picture on how applications come together as a solution.   Sharing learnings within the team is key to accelerating to the next stage.

By reviewing applications individually, the security team starts to understand the “elephants in the room” better and is also able to prioritize based on risk profile. A security team will primarily witness this stage during inception. But, it could witness it again if it undergoes major course changes, takes on new areas such as an acquisition, or must take on a new technical direction.

Stage 2: Research

In the second stage, the security team is already able to perform security reviews for most applications, or at least a thematically related group of them, with relative ease.  The security team may then start to observe gaps in their security knowhow due to things such as changes in broader industry or company engineering practices or adoption of new technology stacks.

During this phase, the security team starts to invest time in researching any necessary security tradeoffs and relative security strength of newer technologies being explored or adopted by application engineering teams. This research and its practical application within the organization has the benefit of helping to establish security experts on a wider range of topics within the team.

This stage helps security teams stay ahead of the curve, grow security subject matter expertise, update any training materials, and give more meaningful security advice to other engineering teams. For example, Adobe’s application security team was initially skilled in desktop security best practices. It evolved its skillset as the company launched products centered around the cloud and mobile platforms. This newly acquired skillset required further specialization when the company started exploring more “bleeding edge” cloud technologies such as containerization for building micro-services.

Stage 3: Security Impact

As security teams become efficient in reviewing solutions and can keep up with technological advances, they can then start looking at homogeneous security improvements across their entire organization.  This has the potential of a much broader impact on the organization. Therefore, this requires the security team to be appropriately scalable to match possible increasing demands upon it.

If a security team member wants to make this broader impact, the first step is identification of a problem that can be applied to a larger set of applications.  In other words, you must ask members of a security team to pick and own a particularly interesting security problem and try to solve it across a larger section of the company.

Within Adobe, we were able to identify a handful of key projects that fit the above criteria for our security team to tackle. Some examples include:

  1. Defining the Amazon Web Services (AWS) minimum security bar for the entire company
  2. Implementing appropriate transport layer security (TLS) configurations on Adobe domains
  3. Validating that product teams did not store secrets or credentials in their source code
  4. Forcing use of browser-supported security headers (e.g., X-XSS-Protection, X-Frame-Options, etc.) to help protect web applications.

The scope of these solutions varied from just business critical applications to the entire company.

Some guidelines that we set within our own team to achieve this were as follows:

  1. The problem statement, like an elevator pitch, should be simple and easily understandable by all levels within the engineering teams – including senior management.
  2. The security researcher was free to define the scope and choose how the problem could be solved.
  3. The improvements made by engineering teams should be measurable in a repeatable way. This would allow for easier identification of any security regressions.
  4. Existing channels for reporting security backlog items to engineering teams must be utilized versus spinning up new processes.
  5. Though automation is generally viewed as a key to scalability for these types of solutions, the team also had flexibility to adopt any method deemed most appropriate. For example, a researcher could front-load code analysis and only provide precise security flaws uncovered to the application engineering team.  Similarly, a researcher could establish required “minimum bars” for application engineering teams helping to set new company security standards. The onus is then placed on the application engineering teams to achieve compliance against the new or updated standards.

For projects that required running tests repeatedly, we leveraged our Security Automation Framework. This helped automate tasks such as validation. For others, clear standards were established for application security teams. Once a defined confidence goal is reached within the security team about compliance against those standards, automated validation could be introduced.

Pivoting Around an Application versus a Problem

When applications undergo a penetration test, threat modeling, or a tool-based scan, teams must first address critical issues before resolving lower priority issues. Such an effort probes an application from many directions attempting to extract all known security issues.  In this case, the focus is on the application and its security issues are not known when the engagement starts.  Once problems are found, the application team owns fixing them.

On the other hand, if you choose to tackle one of the broader security problems for the organization, you test against a range of applications, mitigate it as quickly as possible for those applications, and make a goal to eventually eradicate the issue entirely from the organization.  Today, teams are often forced into reactively resolving such big problems as critical issues – often due to broader industry security vulnerabilities that affect multiple companies all at once.  Heartbleed and other similarly named vulnerabilities are good examples of this.  The Adobe security team attempts to resolve as many known issues as possible proactively to help lessen the organizational disruption when big industry-wide issues come along. This approach is our recipe for having a positive security impact across the organization.

It is worth noting that security teams will move in and out of the above stages and the stages will tend to repeat themselves over time.  For example, a new acquisition or a new platform might require deeper dives to understand.  Similarly, new technology trends will require investment in research.  Going after specific, broad security problems complements the deep dives and helps improve the security posture for the entire company.

We have found it very useful to have members of the security team take ownership of these “really big” security trends we see and help improve results across the company around them. These efforts are ongoing and we will share more insights in future blog posts.

Mohit Kalra
Sr. Manager, Secure Software Engineering

Security Automation Part III: The Adobe Security Automation Framework

In previous blogs [1],[2], we discussed alternatives for creating a large-scale automation framework if you don’t have the resources for a multi-month development project. This blog assumes that you are ready to jump in with both feet on designing your own internal automation solution.

Step 1: Research

While we particularly like our design, that doesn’t mean that our design is the best for your organization. Take time to look at designs, such as Salesforce’s Chimera, Mozilla’s Minion, ThreadFix, Twitter SADB, Gauntlt, and other approaches. These projects tackle large scale security implementations in distinctly different ways. It is important to understand the full range of approaches in order to select the one that is the best fit for your organization. Projects like Mozilla Minion and Gauntlt are open-source if you are looking for code in addition to ideas. Some tools follow specific development processes, such as Gauntlt’s adherence to Behavior Driven Development, which need to be considered. The OWASP AppSec Pipeline project provides information on how to architect around solutions such as ThreadFix.

The tools often break down into aggregators or scanners. Aggregators focus on consolidating information from your diverse deployment of existing tools and then trying to make sense of the information. ThreadFix is an example of this type of project. Scanners look to deploy either existing analysis tools or custom analysis tools at a massive scale. Chimera, Minion, and Adobe’s Security Automation Framework take this approach.  This blog focuses on our scanner approach using our Security Automation Framework but we are in the midst of designing an aggregator, as well.

Step 2: Put together a strong team

The design of this solution was not the result of any one person’s genius. You need a group of strong people who can help point out when your idea is not as clever as you might think. This project involved several core people including Mohit Kalra, Kriti Aggarwal, Vijay Kumar Sahu, and Mayank Goyal. Even with a strong team, there was a re-architecting after the version 1 beta as our approach evolved. For the project to be a success in your organization, you will want many perspectives including management, product teams, and fellow researchers.

Step 3: Designing scanning automation

A well thought out design is critical to any tool’s long term success. This blog will provide a technical overview of Adobe’s implementation and our reasoning behind each decision.

The Adobe Security Automation Framework:

The Adobe Security Automation Framework (SAF) is designed around a few core principles that dictate the downstream implementation decisions. They are as follows:

  1. The “framework” is, in fact, a framework. It is designed to facilitate security automation but it does not try to be more than that. This means:
    1. SAF does not care what security assessment tool is being run. It just needs the tool to communicate progress and results via a specified API. This allows us to run any tool, based on any language, without adding hard coded support to SAF for each tool.
    2. SAF provides access to the results data but it is not the primary UI for results data. Each team will want their data viewed in a team specific manner. The SAF APIs allow teams to pull the data and render it as best fits their needs. This also allows the SAF development team to focus their time on the core engine.
  2. The “framework” is based on Docker. SAF is designed to be multi-cloud and Docker allows portability. The justifications for using a Docker based approach include:
    1. SAF can be run in cloud environments, in our internal corporate network, or run from our laptops for debugging.
    2. Development teams can instantiate their own mini-copies of SAF for testing.
    3. Using Docker allows us to put security assessment tools in separate containers where their dependencies won’t interfere with each other.
    4. Docker allows us to scale the number of instances of each security assessment tool dynamically with respect to their respective job size.
  3. SAF is modularized with each service (UI, scheduler, tool instance, etc.) in its own Docker container. This allows for the following advantages:
    1. The UI is separated from the front-end API allowing the web interface to be just another client of the front-end API. While people will initiate scans from the UI, SAF also allows for API driven scan requests.
    2. The scanning environments are independent. The security assessment tools may need to be run from various locations depending on their target. For instance, the scan may need to run within an internal network, external network, a specific geographic location, or just within a team’s dedicated test environments. With loose-coupling and a modular design, the security assessment tools can be run globally while still having a local main controller.
    3. Docker modularity allows for choosing the language and technology stack that is appropriate for that module’s function.
  4. By having each security test encapsulated within its own Docker container, anyone in the company can have their security assessment tools included in SAF by providing an appropriately configured Docker image. Volunteers can write a simple Python driver based on the SAF SDK that translates a security testing tool’s output into compatible messages for the SAF API and provide that to the SAF team as a Docker image. We do this because:
    1. The SAF team does not want to be the bottleneck for innovation. By allowing external contributions to SAF, the number of tools that it can run increases at a far faster rate. Given the wide array of technology stacks deployed at Adobe, this allows development teams to contribute tools that are best suited for their environments.
    2. In certain incident response scenarios, it may be necessary to widely deploy a quick script to analyze your exposure to a situation. With SAF, you could get a quick measurement by adding the script to a Docker image template and uploading it to the framework.
  5. The “security assertion”, or the test that you want to run, should test a specific vulnerability and provide a true/false result that can be used for accurate measurements of the environment. This is similar to the Behavior Driven Development approaches seen in tools like Gauntlt. SAF is not designed to run a generic, catch-all web application penetration tool that will return a slew of results for human triage. Instead it is designed for analysis of specific issues. This has the following advantages:
    1. If you run an individual test, then you can file an individual bug for tracking the issue.
    2. You create tests specifically around the security issues that are critical to your organization. The specific tests can then be accurately measured at scale.
    3. Development teams do not feel that their time is being wasted by being flooded with false positives.
    4. Since it is an individual test, the developer in charge of fixing that one issue can reproduce the results using the same tool as was deployed by the framework. They could also add the test to their existing automation testing suite.
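
The last principle, a narrowly scoped test with a true/false outcome, can be sketched as follows. This is not the SAF SDK, which is not shown in this post; `Result`, `run_assertion`, `summarize`, and `fake_check` are hypothetical names used only to illustrate how pass/fail/error results per host can be rolled up into a measurement.

```python
from collections import Counter
from enum import Enum


class Result(Enum):
    PASS = "pass"
    FAIL = "fail"
    ERROR = "error"


def run_assertion(check, hosts):
    """Run one narrowly scoped check against each host and record the outcome."""
    results = {}
    for host in hosts:
        try:
            results[host] = Result.PASS if check(host) else Result.FAIL
        except Exception:
            # The check could not be completed (timeout, DNS failure, ...).
            results[host] = Result.ERROR
    return results


def summarize(results):
    """Count outcomes so the environment can be measured over time."""
    counts = Counter(result.value for result in results.values())
    return {name: counts.get(name, 0) for name in ("pass", "fail", "error")}


def fake_check(host):
    """Stand-in for a real test; hosts and behavior are invented for the demo."""
    if host == "down.example.com":
        raise ConnectionError("unreachable")
    return host.startswith("secure")


results = run_assertion(
    fake_check,
    ["secure.example.com", "legacy.example.com", "down.example.com"],
)
print(summarize(results))  # {'pass': 1, 'fail': 1, 'error': 1}
```

Keeping "error" separate from "fail" matters for measurement: a host that could not be tested is not evidence of a vulnerability, and it should not generate a bug for the development team.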

Mayank Goyal on our team took the above principles and re-architected our version 1 implementation into a design depicted in the following architecture diagram:

The SAF UI

The SAF UI is simplistic in nature since it was not designed to be the analysis or reporting suite. The UI is a single page based web application which works with SAF’s APIs. The UI focuses on allowing researchers to configure their security assertions with the appropriate parameters. The core components of the UI are:

  • Defining the assertion: Assertions (tests) are saved within an internal GitHub instance, built via Jenkins and posted to a Docker repository. SAF pulls them as required. The GitHub repository contains the Dockerfile, the code for the test, and the code that acts as a bridge between the tool and the SAF APIs using the SAF SDK. Tests can be shared with other team members or kept private.
  • Defining the scan: It is possible that the same assertion (test) may be run with different configurations for different situations. The scan page is where you define the parameters for different runs.
  • Results: The results page provides access to the raw results. The results are broken down into pass, fail, or error for each host tested. It is accompanied by a simple blob of text that is associated with each result.
  • Scans can be set to run at specific intervals.

This screenshot demonstrates an assertion that can identify whether the given URL serves login forms over HTTP. This assertion is stored in Git, is initiated by /src/startup.sh, and it will use version 4 of the configuration parameters.

A Scan is then configured for the assertion which says when the test will be run and which input list of URLs to test. A scan can run more than one assertion for the purposes of batching results.
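
The relationship between assertions and scans can be modeled roughly as below. This is only a sketch of the data involved, not SAF’s actual schema: the class names, field names, repository URL, and target URLs are all hypothetical. Only the entry point /src/startup.sh and configuration version 4 come from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Assertion:
    name: str
    repository: str      # where the test code lives (hypothetical field)
    entry_point: str     # script that starts the test
    config_version: int  # which version of the parameters to use


@dataclass
class Scan:
    name: str
    schedule: str  # when the test runs, e.g. "daily" (hypothetical format)
    target_urls: list = field(default_factory=list)
    assertions: list = field(default_factory=list)

    def jobs(self):
        """Expand the scan into one (assertion name, url) pair per test to run."""
        return [(a.name, url) for a in self.assertions for url in self.target_urls]


login_over_http = Assertion(
    name="login-form-over-http",
    repository="git://internal.example/assertions/login-form",  # hypothetical
    entry_point="/src/startup.sh",
    config_version=4,
)
scan = Scan(
    name="daily-login-check",
    schedule="daily",
    target_urls=["http://a.example.com/", "http://b.example.com/"],
    assertions=[login_over_http],
)
print(scan.jobs())
```

Because a scan holds a list of assertions, adding a second assertion to the same scan batches its results with the first without redefining the target list.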

The SAF API Server

This API server is responsible for preparing the information for the slave testing environments in the next stage. It receives the information for the scan from the UI, an API client, or based on the saved schedule. The tool details and parameters are packaged and uploaded to the job queue in a slave environment for processing. It assembles all the meta information for testing for the task/configuration executor. The master controller also listens for the responses from the queuing system and stores the results in the database. Everything downstream from the master controller is loosely coupled so that we can deploy work out to multiple locations and different geographies.

The Job Queueing system

The Queueing system is responsible for basic queuing and allows the scheduler to schedule tasks based on resource availability and defer them when needed. While cloud providers offer queuing systems, ours is based on RabbitMQ because we wanted to have deployment mobility.
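
As an illustration of scheduling against resource availability, here is a minimal in-memory sketch. It uses Python’s `collections.deque` as a stand-in for a real message broker such as RabbitMQ. The `schedule` function, the job names, and the notion of "slots" are assumptions made for the example, not part of SAF.

```python
from collections import deque


def schedule(jobs, free_slots):
    """Start jobs in FIFO order while capacity allows and defer the rest.

    jobs: iterable of (name, slots_needed) tuples.
    Returns (started, deferred) lists of job names.
    """
    queue = deque(jobs)
    started, deferred = [], []
    while queue:
        name, needed = queue.popleft()
        if needed <= free_slots:
            free_slots -= needed
            started.append(name)
        else:
            deferred.append(name)  # retried later, once resources free up
    return started, deferred


started, deferred = schedule(
    [("tls-scan", 2), ("big-scan", 5), ("hsts-scan", 1)],
    free_slots=3,
)
print(started)   # ['tls-scan', 'hsts-scan']
print(deferred)  # ['big-scan']
```

In this sketch a small job may run ahead of a larger one that does not fit. Whether that is acceptable, or whether strict ordering is required, is a design decision for the scheduler.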

The Task Scheduler

This is the brains of running the tool. It is responsible for monitoring all the Docker containers, scaling, resource scheduling, and killing rogue tasks. It has the API that receives the status and result messages from the Docker containers. That information is then relayed back to the API server.

The Docker images

The Docker images are based on a micro-service architecture approach. The default baseline is based on Alpine Linux to keep the image footprint small. SAF assertions can also be quite small. For instance, the test can be a small Python script which makes a request to the homepage of a web server and verifies whether an HSTS header was included in the response. This micro-service approach allows the environment to run multiple instances with minimum overhead. The assertion script communicates its status (e.g. keep-alives) and results (pass/fail/error) back to the task executor using the SAF SDK.
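
A small assertion of the kind described above might look like the following sketch. It uses only the Python standard library and does not use the SAF SDK, which is not shown in this post; the function names `has_hsts` and `check_homepage` are hypothetical. In a real assertion, the returned status would be reported to the task executor through the SDK rather than returned to the caller.

```python
import urllib.request


def has_hsts(headers):
    """Return True if a Strict-Transport-Security header with a max-age is present."""
    for name, value in headers.items():
        if name.lower() == "strict-transport-security":
            return "max-age" in value.lower()
    return False


def check_homepage(url, timeout=10):
    """Request the homepage and return 'pass', 'fail', or 'error'."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return "pass" if has_hsts(response.headers) else "fail"
    except Exception:
        # Network failures, malformed URLs, and HTTP error statuses all land
        # here; a production test would likely distinguish between them.
        return "error"


# The header check itself needs no network access:
print(has_hsts({"Strict-Transport-Security": "max-age=31536000"}))  # True
print(has_hsts({"Content-Type": "text/html"}))                      # False
```

Calling `check_homepage("https://example.com/")` would perform the live request. Separating the header check from the network call keeps the logic easy to test on its own.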

Conclusion

While this overview still leaves a lot of the specific details unanswered, it should provide a basic description of our security automation framework approach at the architectural and philosophical level. For a security automation project to be a success at the detailed implementation level, it must be customized to the organization’s technology stack and operational flow. As we progress with our implementation, we will continue to post the lessons that we learn along the way.

Peleus Uhley
Principal Scientist

Security Automation Part II: Defining Requirements

Every security engineer wants to build the big security automation framework for the challenge of designing something with complexity. Building those big projects has its own set of challenges. Like any good coding project, you want to have a plan before setting out on the adventure.

In the last blog, we dealt with some of the high level business concerns that were necessary to consider in order to design a project that would produce the right results for the organization. In this blog we will look at the high level design considerations from the software architect’s perspective. In the next blog, we will look at the implementer’s concerns. For now, most architects are concerned with the following:

Maintainability

This is a concern for both the implementer and architect, but they often have different perspectives. If you are designing a tool that the organization is going to use as a foundation of its security program, then you need to design the tool such that the team can maintain it over time.

Maintainability through project scope

There are already automation and scalability projects that are deployed by the development team. These may include tools such as Jenkins, Git, Chef, or Maven. All of these frameworks are extensible. If all you want to do is run code with each build, then you might consider integrating into these existing frameworks rather than building your own automation. They will handle things such as logging, alerting, scheduling, and interacting with the target environment. Your team just has to write code to tell them what you want done with each build.

If you are attempting a larger project, do you have a roadmap of smaller deliverables to validate the design as you progress? The roadmap should prioritize the key elements of success for the project in order to get an early sense if you are heading in the right direction with your design. In addition, while it is important to define everything that your project will do, it is also important to define all the things that your tool will not perform. Think ahead to all of the potential tangential use cases that your framework could be asked to perform by management and customers. By establishing what is out of scope for your project, you can set proper expectations earlier in the process and those restrictions will become guardrails to keep you on track when requests for tangential features come in.

Maintainability through function delegation

Can you leverage third-party services for operational issues?  Can you use the cloud so that baseline network and machine uptime is maintained by someone else? Can you leverage tools such as Splunk so that log management is handled by someone else? What third-party libraries already exist so that you are only inventing the wheels that need to be specific to your organization? For instance, tools like RabbitMQ are sufficient to handle most queueing needs.  The more of the “busy work” that can be delegated to third-party services or code, the more time that the internal developers can spend on perfecting the framework’s core mission.

Deployment

It is important to know where your large scale security framework may be deployed. Do you need to scan staging environments that are located on an internal network in order to verify security features before shipping? Do you need to scan production systems on an external network to verify proper deployment? Do you need to scan the production instances from outside the corporate network because internal security controls would interfere with the scan? Do you want to have multiple scanning nodes in both the internal and external network? Should you decouple the job runner from the scanning nodes so that the job runner can be on the internal network even if the scanning node is external?  Do you want to allow teams to be able to deploy their own instances so that they can run tests themselves? For instance, it may be faster if an India based team can conduct the scan locally than to run the scan from US based hosts. In addition, geographical load balancers will direct traffic to the nearest hosts which may cause scanning blind spots. Do you care if the scanners get deployed to multiple  geographic locations so long as they all report back to the same database?

Tool selection

It is important to spend time thinking about the tools that you will want your large security automation framework to run because security testing tools change. You do not want your massive project to die just because the tool it was initially built to execute falls out of fashion and is no longer maintained. If there is a loose coupling between the scanning tool and the framework that runs it, then you will be able to run alternative tools once the ROI on the initial scanning tool diminishes. If you are not doing a large scale framework and are instead just modifying existing automation frameworks, the same principles will apply even if they are at a smaller scale.
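One way to picture this loose coupling is a framework that only knows about a small adapter signature, not about any particular tool. The sketch below is an assumption for illustration and not a description of an actual implementation: the `Framework` class, the `Result` record, and the tool names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Result:
    target: str
    tool: str
    status: str  # "pass", "fail", or "error"


# Every adapter wraps one scanning tool behind the same signature:
# it takes a target hostname and returns a status string.
ToolAdapter = Callable[[str], str]


class Framework:
    """Runs whichever adapter is registered under a name (hypothetical design)."""

    def __init__(self):
        self._tools: Dict[str, ToolAdapter] = {}

    def register(self, name, adapter):
        # Registering under an existing name swaps the tool without touching
        # the scheduling or reporting code.
        self._tools[name] = adapter

    def run(self, tool_name, target):
        adapter = self._tools[tool_name]
        try:
            status = adapter(target)
        except Exception:
            # A crashing tool should not take the framework down with it.
            status = "error"
        return Result(target, tool_name, status)
```

With this shape, retiring a scanner means writing one new adapter and registering it under the old name. Everything downstream keeps consuming the same `Result` records.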

Tool dependencies

While the robustness of test results is an important criterion for tool selection, complex tools often have complex dependencies. Some tools only require the targeted URL and some tools need complex configuration files. Do you just need to run a few tools or do you want to spend the time to make your framework security tool agnostic? Can you use a Docker image for each tool in order to avoid dependency collisions between security assessment tools? When the testing tool conducts the attack on the remote host, does the attack presume that code injected into the remote host’s environment can send a message back to the testing tool? If you are building a scalable scanning system with dynamically allocated, short-lived hosts that live behind a NAT server, then it may be tricky for the remote attack code to send a message back to the original security assessment tool.

Inputs and outputs

Do the tools require a complex, custom configuration file per target or do you just need to provide the hostname? If you want to scale across a large number of sites, tools that require complex, per-site configuration files may slow the speed at which you can scale and require more maintenance over time. Does the tool provide a single clear response that is easy to record or does it provide detailed, nuanced responses that require intelligent parsing? Complex results with many different findings may make it more difficult to add alerting around specific issues to the tool. They could also make metrics more challenging depending on what and how you measure.

Tool scalability

How many instances of the security assessment tool can be run on a single host? For instance, tools that listen on ports limit the number of instances per server. If so, you may need Docker or a similar auto-scaling solution. Complex tools take longer to run, which may cause issues with detecting timeouts. How will the tool handle re-tests for identified issues? Does the tool have enough granularity that the dev team can test their proposed patch against the specific issue? Or does the entire test suite need to be re-run every time the developer wants to verify their patch?
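The time out concern can be made concrete with a small sketch. It assumes the tool is launched as a child process and signals success with exit code 0. Both are assumptions for illustration, and `run_with_timeout` is a hypothetical helper rather than part of any framework described here.

```python
import subprocess
import sys


def run_with_timeout(command, timeout_seconds):
    """Run a tool as a child process and map the outcome to pass/fail/error."""
    try:
        completed = subprocess.run(
            command, capture_output=True, timeout=timeout_seconds
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child when the time limit is exceeded.
        return "error"
    except OSError:
        # The tool binary is missing or could not be started.
        return "error"
    # Assumption: the tool exits with 0 on success and non-zero otherwise.
    return "pass" if completed.returncode == 0 else "fail"


# Example: a stand-in "tool" that exits with 0, launched with the current interpreter.
demo_status = run_with_timeout([sys.executable, "-c", "raise SystemExit(0)"], 30)
```

The longer a tool legitimately runs, the harder it is to pick a `timeout_seconds` value that separates a slow scan from a hung one, which is one argument for small, fast assertions.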

Focus and roll out

If you are tackling a large project, it is important to understand what the minimum viable product is. What is the one thing that makes this tool different from just buying the enterprise version of the equivalent commercial tool? Could the entire project be replaced with a few Python scripts and crontab? If you can’t articulate what extra value your approach will bring over the alternative commercial or crontab approach, then the project will not succeed. The people who would leverage the platform may get impatient waiting for your development to be done. They could instead opt for a quicker solution, such as buying a service, so that they can move on to the next problem. As you design your project, always ask yourself, “Why not cron?” This will help you focus on the key elements of the project that will bring unique value to the organization. Your roadmap should focus on delivering those first.

Team adoption

Just because you are building a tool to empower the security team, doesn’t mean that your software won’t have other customers. This tool will need to interact with the development teams’ environments. This security tool will produce outputs that will eventually need to be processed by the development team. The development teams should not be an afterthought in your design. You will be holding them accountable for the results and they need methods for understanding the context of what your team has found and being able to independently retest.

For instance, one argument for integrating into something like Jenkins or Git is that you are using a tool the development team already understands. When you try to explain how your project will affect their environment, using a tool that they know means that the discussion will be in language that they understand. They will still have concerns that your code might have negative impacts on their environment. However, they may have more faith in the project if they can mentally quantify the risk based on known systems. When you create standalone frameworks, it is harder for them to understand the scale of the risk because it is completely foreign to them.

At Adobe, we have been able to work directly with the development teams for building security automation. In a previous blog, an Adobe developer described the tools that he built as part of his pursuit of an internal black belt security training certification. There are several advantages to having the security champions on development teams build the development tools rather than the core security team. One is that full time developers are often better coders than the security teams and the developers better understand the framework integration. Also, in the event of an issue with the tool, the development team has the knowledge to take emergency action. Often times, a security team just needs the tool to meet specific requirements and the implementation and operational management of the tool can be handled by the team responsible for the environment. This can make the development team more at ease with having the tool in their environment and it frees up the core security team to focus on larger issues.

Conclusion

While jumping right into the challenges of the implementation is always tempting, thinking through the complete data flow for the proposed tools can help save you a lot of rewriting. It is also important that you avoid trying to boil the ocean by scoping more than your team can manage. Most importantly, always keep focus on the unique value of your approach and the customers that you need to buy into the tool once it is launched. The next blog will focus on an implementer’s concerns around platform selection, queuing, scheduling, and scaling by looking at example implementations.

Adobe @ BlackHat USA 2016

We are headed to BlackHat USA 2016 in Las Vegas this week with members of our Adobe security teams. We are looking forward to connecting with the security community throughout the week. We also hope to meet up with some of you at the parties, at the craps tables, or just mingling outside the session rooms during the week.

This year Peleus Uhley, our Lead Security Strategist, will be speaking on Wednesday, August 3rd, at 4:20 p.m. He will be talking about “Design Approaches for Security Automation.” DarkReading says his talk is one of the “10 Hottest Talks” at the conference this year, so you do not want to miss it.

This year we are again proud to sponsor the r00tz Kids Conference @ DefCon. If you are going to DefCon and bringing your kids, we hope you take the time out to take them to this great event for future security pros. There will be educational sessions and hands-on workshops throughout the event to challenge their creativity and skills.

Make sure to follow our team on Twitter @AdobeSecurity. Feel free to follow me as well @BradArkin. We’ll be tweeting info as to our observations and happenings during the week. Look for the hashtag #AdobeBH2016.

We are looking forward to a great week in Vegas.

Brad Arkin
VP and Chief Security Officer

Fingerprinting a Security Team

The central security team in a product development organization plays a vital role in implementing a secure product lifecycle process.  It is the team that drives the central security vision for the organization and works with individual teams on their proactive security needs.   I lead the technical team of proactive security researchers in Adobe. They are all recognized security experts and are able to help the company adapt to the ever changing threat landscape.  Apart from being on top of the latest security issues and potential mitigations that may need to be in place, the security team also faces challenges of constant skill evolution and remaining closely aligned to the business.

This post focuses on the challenges faced by the security team and potential ways to overcome them.

Increase in technologies as a function of time.

A company’s product portfolio is a combination of its existing products, new product launches, and acquisitions intended to help bridge product functionality gaps or expand into new business areas.  Over time, this brings a wide variety of technologies and architectures into the company.  Moreover, the pace of adoption of new technologies is much higher than the pace of retiring older technologies.  Therefore, the central security team needs to keep up with the newer technology stacks and architectures being adopted while also maintaining a manageable state with existing ones. An acquisition can further complicate this due to an influx of new technologies into the development environment in a very short period of time.

Security is not immune to business evolution.

The cloud and mobile space have forced companies to rethink how they should offer products and services to their customers.  Adobe went through a similar transformation from being a company that offers desktop products to one that attempts to strike the right balance between desktop, cloud, and mobile.  A security team needs to also quickly align with such business changes.

Multi-platform comes with a multiplication factor.

When the same product is offered on multiple operating systems, on multiple form factors (such as mobile and desktop), or deployed on multiple infrastructures, security considerations can increase due to the unique qualities of each platform. The central security team needs to be aware of and fluent in these considerations to provide effective proactive advice.

Subject matter expertise has limitations.

Strong subject matter expertise helps security teams’ credibility in imparting sound security advice to teams.  For security sensitive areas, experts in the team are essential to providing much deeper advice.  That said, any one individual cannot be an expert on every security topic.  Expertise is something that needs to be uniformly distributed through a team.

These challenges can be addressed by growing the team organically and through hiring.  Hiring to acquire new skills alone is not the best strategy – the skills required today will soon be outdated tomorrow.  A security team therefore needs to adopt strategies that allow it to constantly evolve and stay current. A few such strategies are discussed below.

T-Shaped skills.

Security researchers in a security team should aim for a T-Shaped skill set.  This allows for a fine balance between breadth and depth in security. The breadth is useful to help cover baseline security reviews.  The depth helps researchers become specific security subject matter experts. Having many subject experts strengthens the overall team’s skills because other team members learn from them and they are also available to provide guidance when there is a requirement in their area of expertise.

Strong Computer Science foundations.

Product security is an extension of engineering work.  Security requires understanding good design patterns, architecture, code, testing strategies, etc. Writing good software requires strong foundations in computer science irrespective of the layer of technology stack one ends up working on. Strong computer science skills can also help make security skills language and platform agnostic.  With strong computer science skills, a security researcher can learn new security concepts once and then apply to different platforms as and when needed.  With such strong fundamentals, the cost of finding out the “how” on new platforms is relatively small.

Hire for your gaps but also focus on ability to learn quickly.

A working product has many pieces and processes that make it work. If you can make a mental image of what it takes to make software, you can more clearly see strengths and weaknesses in your security team. For example, engineering a service requires a good understanding of code (and the languages of choice), frameworks, technology stacks (such as queues, web server, backend database, third party libraries), infrastructure used for deploying, TLS configurations, testing methodologies, the source control system, the overall design and architecture, the REST interface, interconnection with various other services, and the tool chain involved. The list is extensive. When hiring, one facet to evaluate in a candidate is whether he or she brings security strengths to the team through passion and past job experience that can fill the team’s existing gaps. However, it can be even more important to evaluate the candidate’s willingness to learn new skills. The ability to learn, adapt, and not be held captive to one existing skill set is an important factor to look for in candidates during hiring. The secondary goal is to add a variety of security skills to the team and to avoid duplicating skill sets that already exist in the team.

“Skate where the puck’s going, not where it’s been.”

To stay current with the business needs and where engineering teams are headed, it is important for a security team to spend a portion of their time investigating the security implications of newer technologies being adopted by the product teams. As Wayne Gretzky famously said, “you want to skate where the puck’s going, not where it’s been.” However, security teams need to cover larger ground. You do have to stay current with new technologies being adopted. Older technologies still get used in the company as only some teams may move away from them. So it would be wise not to ignore those older technologies by maintaining expertise in those areas, while aiming to move teams away from those technologies as they become more difficult to effectively secure. Predicting future areas of investment is difficult. Security teams can make that task easier by looking at the industry trends and by talking to engineering teams to find out where they are headed. The managers of a security team also have a responsibility to stay informed about new technologies, as well as future directions their respective companies may go in, in order to invest in newer areas to grow the team.

Go with the flow.

If a business has taken a decision to invest in cloud or mobile or change the way it does business, a security team should be among the first in the company to detect this change and make plans to adapt early. If the business moves in a certain direction and the security team does not, the security team can unfortunately end up labeled as one that only knows the older technology stack. Moreover, it is vital for the security team to show alignment with a changing business. It is primarily the responsibility of the security team’s leadership to detect such changes and start planning for them early.

Automate and create time.

If a task is performed multiple times, the security team should evaluate if the task can be automated or if a tool can do it more efficiently.  The time reduced through automation and tooling can help free up time and resources which can then be used to invest in newer areas that are a priority for the security team.

Growing a security team can have many underlying challenges that are not always obvious to an external observer.  The industry’s primary focus is on the new threat landscapes being faced by the business.  A healthy mix of organic growth and hiring will help a security team adapt and evolve continuously to the changes being introduced by factors not in their direct control.  It is the responsibility of both security researchers and the management team to keep learning and to spend time detecting any undercurrents of change in the security space.

Mohit Kalra
Sr. Manager, Secure Software Engineering

Better Security Through Automation

Automation Strategies

“Automate all the things!” is a popular meme in the cloud community. Many of today’s talks at security conferences discuss the latest, sophisticated automation tool developed by a particular organization. However, adding “automation” to a project does not magically make things better by itself. Any idea can be automated, including the bad ones. For instance, delivering false positives “at scale” is not going to help your teams. This blog will discuss some of the projects that we are currently working on and the reasoning behind their goals.

Computer science has been focused on automation since its inception. The advent of the cloud only frees our ideas from being resource bound by hardware. However, that doesn’t necessarily mean that automation must take up 100 scalable machines. Sometimes simple automation projects can have large impacts. Within Adobe, we have several types of automation projects underway to help us with security. The goals range from business-level dashboards and compliance projects to low level security testing projects.

 

Defining goals

One large project that we are currently building is a security automation framework focused on security assertions. When you run a traditional web security scanner against a site, it will try to tell you everything about everything on the site. In order to do that effectively, you have to do a lot of pre-configuration (authentication, excluded directories, etc.). Working with Mohit Kalra, the Sr. Security Manager for the ASSET security team, we experimented with the idea of security assertions. Basically, could a scanner answer one true/false question about the site with a high degree of accuracy? Then we would ask that one simple question across all of our properties in order to get a meaningful measurement.

For instance, let’s compare the following two possible automation goals for combating XSS:

(a) Traditional automation: Give me the location of every XSS vulnerability for this site.

(b) Security assertion: Does the site return a Content Security Policy (CSP) header?

A web application testing tool like ZAP can be used to automate either goal. Both of these tests can be conducted across all of your properties for testing at scale. Which goal you choose will decide the direction of your project:

Effort to implement:

(a) Potentially requires effort towards tuning and configuration with a robust scanner in order to get solid results. There is a potential risk to the tested environment (excessive DB entries, high traffic, etc.)

(b) A straightforward measurement with a simple scanner or script. There is a low risk to the tested environment.

Summarizing the result for management:

(a) This approach provides a complex measurement of risk that can involve several variables (reflected vs. persistent, potential value of the site, cookie strategy, etc.). The risk that is measured is a point-in-time assessment since new XSS bugs might be introduced later with new code.

(b) This approach provides a simple measurement of best practice adoption across the organization. A risk measurement can be inferred but it is not absolute. If CSP adoption is already high, then more fine grained tests targeting individual rules will be necessary. However, if CSP adoption is still in the early stages, then just measuring who has started the adoption process can be useful.

Developer interpretation of the result:

(a) Development teams will think in terms of immediate bugs filed.

(b) Development teams will focus on the long term goal of defining a basic CSP.

Both (a) and (b) have merits depending on the needs of the organization. The traditional strategy (a) can give you very specific data about how prevalent XSS bugs are across the organization. However, tuning the tools to effectively find and report all that data is a significant time investment. The security assertion strategy (b) focuses more on long term XSS mitigations by measuring CSP adoption within the organization. The test is simpler to implement with less risk to the target environments. Tackling smaller automation projects has the added value of providing experience that may be necessary when designing larger automation projects.
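To show how little code strategy (b) needs, here is a minimal sketch that turns the one true/false question into an adoption measurement. It assumes the response headers for each property have already been collected into a dictionary. The function names and the example hostnames are hypothetical, and the sketch deliberately counts only the enforcing `Content-Security-Policy` header, not the report-only variant.

```python
def has_csp(headers):
    """Answer the assertion: does the response carry a CSP header?"""
    # Header names are case-insensitive, so compare in lowercase.
    return "content-security-policy" in {name.lower() for name in headers}


def csp_adoption(responses):
    """Take a mapping of site -> response headers; return (adopters, adoption rate)."""
    if not responses:
        return [], 0.0
    adopters = sorted(site for site, headers in responses.items() if has_csp(headers))
    return adopters, len(adopters) / len(responses)


# Hypothetical data: two of four properties return an enforcing CSP header.
example_responses = {
    "a.example": {"Content-Security-Policy": "default-src 'self'"},
    "b.example": {"Content-Type": "text/html"},
    "c.example": {"content-security-policy": "default-src 'none'"},
    "d.example": {"Content-Security-Policy-Report-Only": "default-src 'self'"},
}
adopters, rate = csp_adoption(example_responses)
```

The single number that comes out is the kind of simple best practice measurement that is easy to summarize for management, while the list of adopters tells you which teams to talk to next.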

Which goal is a higher priority will depend on your organization’s current needs. We found that, in playing with the true/false approach of security assertions, we focused more of our energy on what data was necessary versus just what data was possible. In addition, since security assertions are assumed to be simple tests, we focused more of our design efforts on perfecting the architecture of a scalable testing environment rather than the idiosyncrasies of the tools that the environment would be running. Many automation projects try to achieve depth and breadth at the same time by running complex tools at scale. We decided to take an intermediate step by using security assertions to focus on breadth first and then to layer on depth as we proceed.

 

Focused automation within continual deployment

Creating automation environments to scan entire organizations can be a long term project. Smaller automation projects can often provide quick wins and valuable experience on building automation. For instance, continuous build systems are often a single chokepoint through which a large portion of your cloud must pass before deployment. Many of today’s continuous build environments allow for extensions that can be used to automate processes.

As an example, PCI requires that code check-ins are reviewed. Verifying this process is followed consistently requires significant human labor. One of our Creative Cloud security champions, Jed Glazner, developed a Jenkins plugin which can verify each check-in was reviewed. The plugin monitors the specified branch and ensures that all commits belong to a pull request, and that the pull requests were not self merged. This allows for daily, automatic verification of the process for compliance.
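The two rules the plugin enforces can be sketched in a few lines. This is not the plugin’s code. It is an illustrative sketch in Python, and the `Commit` and `PullRequest` records are hypothetical stand-ins for whatever the source control system reports. Fetching that data is omitted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PullRequest:
    number: int
    author: str
    merged_by: str


@dataclass
class Commit:
    sha: str
    pull_request: Optional[PullRequest]  # None if pushed directly to the branch


def find_violations(commits):
    """Return (sha, reason) pairs for commits that bypassed review."""
    violations = []
    for commit in commits:
        pr = commit.pull_request
        if pr is None:
            # Rule 1: every commit on the monitored branch must belong to a pull request.
            violations.append((commit.sha, "no pull request"))
        elif pr.author == pr.merged_by:
            # Rule 2: the pull request must not be merged by its own author.
            violations.append((commit.sha, "self-merged pull request"))
    return violations
```

An empty list from `find_violations` is the pass condition, which makes the daily compliance check another simple true/false assertion.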

Jed worked on a similar project where he created a Maven plug-in that lists all third-party Java libraries and their versions within the application. The plugin would then upload that information to our third-party library tracker so that we can immediately identify libraries that need updates. Since the plug-in was integrated into the Maven build system, the data provided to the third-party library tracker was always based on the latest nightly build and it was always a complete list.
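The inventory step can be pictured with a short sketch. It assumes dependency coordinates in the colon-separated form that Maven’s dependency reporting commonly prints (`group:artifact:type:version:scope`, with an optional classifier before the version). It is written in Python for illustration rather than as a Maven plug-in, the function names are hypothetical, and the upload to the third-party library tracker is left out.

```python
def parse_coordinate(coordinate):
    """Split one Maven-style coordinate into group, artifact, and version."""
    parts = coordinate.strip().split(":")
    if len(parts) == 5:
        group, artifact, _type, version, _scope = parts
    elif len(parts) == 6:
        # Six fields means a classifier sits between the type and the version.
        group, artifact, _type, _classifier, version, _scope = parts
    else:
        raise ValueError(f"unrecognized coordinate: {coordinate!r}")
    return {"group": group, "artifact": artifact, "version": version}


def inventory(coordinates):
    """Return a de-duplicated, sorted list of libraries and their versions."""
    unique = {}
    for coordinate in coordinates:
        library = parse_coordinate(coordinate)
        key = (library["group"], library["artifact"], library["version"])
        unique[key] = library
    return [unique[key] for key in sorted(unique)]
```

Because the list is generated from the build itself, it stays complete and current without anyone maintaining it by hand.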

 

Your organization will eventually build, buy or borrow a large scale automation tool that scales out the enumeration of immediate risk issues. However, before you jump head first into trying to build a robust scanning environment from scratch, be sure to first identify what core questions you need the tools to answer in order to support your program. You might find that starting with smaller automation tasks that track long term objectives or operational best practices can be just as useful to an organization. Deploying these smaller projects can also provide experience that can help your plans for larger automation projects.

 

Peleus Uhley
Principal Scientist