Posts tagged "cloud security"

Evolving an Application Security Team

A centralized application security team, like ours here at Adobe, can be the key to driving a company's security vision. It helps implement the Secure Product Lifecycle (SPLC) and provides security expertise within the organization. To stay current and have impact, a security team also needs to be in a mode of continuous evolution and learning. At the inception of such a team, its impact is usually localized to the applications the team reviews. As the team matures, it can start to influence the security posture of the whole organization. I lead the team of security researchers at Adobe. Our team's charter is to provide security expertise to all application teams in the company. At Adobe, we have seen our team mature over time. As we look back, we would like to share the phases of evolution we have gone through along the way.

Stage 1:  Dig Deeper

In the first stage, the team is forming and acquires the required security skills through hiring and organic learning. The individuals on the team bring varied security expertise, experience, and desired skillsets. During this stage, the team looks for ways to apply its security knowledge to the applications that the engineering teams develop. It starts by doing deep dives into application architecture and understanding why the products are being created in the first place. Here the team learns the organization's application portfolio, observes common design patterns, and starts to build the bigger picture of how applications come together as a solution. Sharing learnings within the team is key to accelerating to the next stage.

By reviewing applications individually, the security team starts to understand the "elephants in the room" better and is able to prioritize based on risk profile. A security team will primarily go through this stage at inception, but it may return to it after a major course change, such as taking on an acquisition or a new technical direction.

Stage 2: Research

In the second stage, the security team is already able to perform security reviews for most applications, or at least a thematically related group of them, with relative ease. The security team may then start to observe gaps in its security know-how due to changes in broader industry or company engineering practices, or the adoption of new technology stacks.

During this phase, the security team starts to invest time in researching any necessary security tradeoffs and relative security strength of newer technologies being explored or adopted by application engineering teams. This research and its practical application within the organization has the benefit of helping to establish security experts on a wider range of topics within the team.

This stage helps security teams stay ahead of the curve, grow security subject matter expertise, update training materials, and give more meaningful security advice to other engineering teams. For example, Adobe's application security team was initially skilled in desktop security best practices. It evolved its skillset as the company launched products centered around cloud and mobile platforms. This newly acquired skillset required further specialization when the company started exploring more "bleeding edge" cloud technologies such as containerization for building micro-services.

Stage 3: Security Impact

As security teams become efficient at reviewing solutions and can keep up with technological advances, they can start looking at consistent security improvements across their entire organization. This has the potential for much broader impact, and therefore requires the security team to scale to match the increased demands upon it.

If a security team member wants to make this broader impact, the first step is to identify a problem that applies to a larger set of applications. In other words, ask members of the security team to pick and own a particularly interesting security problem and try to solve it across a larger section of the company.

Within Adobe, we were able to identify a handful of key projects that fit the above criteria for our security team to tackle. Some examples include:

  1. Defining the Amazon Web Services (AWS) minimum security bar for the entire company
  2. Implementing appropriate transport layer security (TLS) configurations on Adobe domains
  3. Validating that product teams did not store secrets or credentials in their source code
  4. Requiring use of browser-supported security headers (e.g., X-XSS-Protection, X-Frame-Options) to help protect web applications.

The scope of these solutions varied from just business critical applications to the entire company.
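As a rough illustration of how checks like the security-headers project above can be made repeatable and measurable, here is a minimal sketch of such a test. The header list and logic are illustrative only, not Adobe's actual bar:

```python
# Hypothetical sketch: a repeatable check for recommended browser
# security headers. The list below is illustrative, not Adobe's
# actual minimum bar.

RECOMMENDED_HEADERS = {
    "X-Frame-Options",
    "X-XSS-Protection",
    "X-Content-Type-Options",
    "Strict-Transport-Security",
}

def missing_security_headers(response_headers):
    """Return the recommended headers absent from an HTTP response."""
    present = {name.title() for name in response_headers}
    return sorted(h for h in RECOMMENDED_HEADERS if h.title() not in present)

# Example: a response that only sets X-Frame-Options
print(missing_security_headers({"x-frame-options": "DENY",
                                "Content-Type": "text/html"}))
```

Because the check returns the same structured result on every run, regressions show up as a simple diff between runs.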

Some guidelines that we set within our own team to achieve this were as follows:

  1. The problem statement, like an elevator pitch, should be simple and easily understandable by all levels within the engineering teams – including senior management.
  2. The security researcher was free to define the scope and choose how the problem could be solved.
  3. The improvements made by engineering teams should be measurable in a repeatable way. This would allow for easier identification of any security regressions.
  4. Existing channels for reporting security backlog items to engineering teams must be utilized versus spinning up new processes.
  5. Though automation is generally viewed as key to scaling these types of solutions, the team also had the flexibility to adopt whatever method was most appropriate. For example, a researcher could front-load code analysis and provide only the precise security flaws uncovered to the application engineering team. Similarly, a researcher could establish required "minimum bars" for application engineering teams, helping to set new company security standards. The onus is then placed on the application engineering teams to achieve compliance against the new or updated standards.

For projects that required running tests repeatedly, we leveraged our Security Automation Framework, which helped automate tasks such as validation. For others, clear standards were established for application engineering teams. Once the security team reached a defined level of confidence about compliance against those standards, automated validation could be introduced.
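To make the "minimum bar" idea concrete, here is a hedged sketch of what a repeatable TLS-configuration compliance check might look like. The banned protocols and cipher keywords below are illustrative examples, not Adobe's actual standard:

```python
# Illustrative sketch of a "minimum bar" compliance check for TLS
# configurations (the values below are examples, not Adobe's standard).

BANNED_PROTOCOLS = {"SSLv2", "SSLv3", "TLSv1.0"}
BANNED_CIPHER_KEYWORDS = ("RC4", "DES", "NULL", "EXPORT")

def tls_config_compliant(enabled_protocols, enabled_ciphers):
    """Return (compliant, findings) for a scanned endpoint configuration."""
    findings = [f"banned protocol enabled: {p}"
                for p in enabled_protocols if p in BANNED_PROTOCOLS]
    findings += [f"weak cipher enabled: {c}"
                 for c in enabled_ciphers
                 if any(k in c.upper() for k in BANNED_CIPHER_KEYWORDS)]
    return (not findings), findings

ok, findings = tls_config_compliant(["TLSv1.2", "SSLv3"],
                                    ["ECDHE-RSA-AES128-GCM-SHA256", "RC4-SHA"])
print(ok, findings)
```

A check structured this way yields both a pass/fail verdict for measurement at scale and a findings list that can be filed directly as backlog items.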

Pivoting Around an Application versus a Problem

When an application undergoes a penetration test, threat modeling, or a tool-based scan, the team must address critical issues before resolving lower-priority ones. Such an effort probes the application from many directions, attempting to extract all discoverable security issues. In this case the focus is on the application, and its security issues are not known when the engagement starts. Once problems are found, the application team owns fixing them.

On the other hand, if you choose to tackle one of the broader security problems for the organization, you test against a range of applications, mitigate the issue as quickly as possible for those applications, and set a goal to eventually eradicate it from the organization entirely. Today, teams are often forced to reactively resolve such big problems as critical issues, frequently due to broad industry security vulnerabilities that affect multiple companies at once; Heartbleed and other similarly named vulnerabilities are good examples. The Adobe security team attempts to resolve as many known issues as possible proactively, to help lessen the organizational disruption when big industry-wide issues come along. This approach is our recipe for having a positive security impact across the organization.

It is worth noting that security teams will move in and out of the above stages and the stages will tend to repeat themselves over time.  For example, a new acquisition or a new platform might require deeper dives to understand.  Similarly, new technology trends will require investment in research.  Going after specific, broad security problems complements the deep dives and helps improve the security posture for the entire company.

We have found it very useful to have members of the security team take ownership of these “really big” security trends we see and help improve results across the company around it. These efforts are ongoing and we will share more insights in future blog posts.

Mohit Kalra
Sr. Manager, Secure Software Engineering

The Adobe Security Team at RSA Conference 2017

It feels like we just got through the last “world’s largest security conference,” but here we are again. While the weather is not looking to be the best this year (although this is our rainy season, so we Bay Area folks do consider this “normal”), the Adobe security team would again like to welcome all of you descending on our home turf here in San Francisco next week, February 13 – 17, 2017.

This year, I will be emceeing the Executive Security Action Forum (ESAF) taking place on Monday, February 13th, to kick off the conference. I hope to see many of you there.

On Thursday, February 16th, from 9:15 – 10:00 a.m in Moscone South Room 301, our own Mike Mellor and Bryce Kunz will also be speaking in the “Cloud Security and Virtualization” track on the topic of “Orchestration Ownage: Exploiting Container-Centric Data Center Platforms.” This session will be a live coaching session illustrating how to hack the popular DC/OS container operating environment. We hope the information you learn from this live demo will give you the ammunition you need to take home and better protect your own container environments. This year you are able to pre-register for conference sessions. We expect this one to be popular given the live hacking demo, so, please try and grab a seat if you have not already.

As always, members of our security teams and myself will be attending the conference to network, learn about the latest trends in the security industry, and share our knowledge. Looking forward to seeing you.

Brad Arkin
Chief Security Officer

Security Automation Part III: The Adobe Security Automation Framework

In previous blogs [1],[2], we discussed alternatives for creating a large-scale automation framework if you don’t have the resources for a multi-month development project. This blog assumes that you are ready to jump in with both feet on designing your own internal automation solution.

Step 1: Research

While we particularly like our design, that doesn't mean it is the best for your organization. Take time to look at other designs, such as Salesforce's Chimera, Mozilla's Minion, ThreadFix, Twitter's SADB, Gauntlt, and other approaches. These projects tackle large-scale security implementations in distinctly different ways. It is important to understand the full range of approaches in order to select the one that is the best fit for your organization. Projects like Mozilla's Minion and Gauntlt are open source if you are looking for code in addition to ideas. Some tools follow specific development processes, such as Gauntlt's adherence to Behavior Driven Development, which need to be considered. The OWASP AppSec Pipeline project provides information on how to architect around solutions such as ThreadFix.

The tools often break down into aggregators or scanners. Aggregators focus on consolidating information from your diverse deployment of existing tools and then trying to make sense of the information; ThreadFix is an example of this type of project. Scanners look to deploy either existing or custom analysis tools at massive scale; Chimera, Minion, and Adobe's Security Automation Framework take this approach. This blog focuses on our scanner approach using our Security Automation Framework, but we are in the midst of designing an aggregator as well.

Step 2: Put together a strong team

The design of this solution was not the result of any one person’s genius. You need a group of strong people who can help point out when your idea is not as clever as you might think. This project involved several core people including Mohit Kalra, Kriti Aggarwal, Vijay Kumar Sahu, and Mayank Goyal. Even with a strong team, there was a re-architecting after the version 1 beta as our approach evolved. For the project to be a success in your organization, you will want many perspectives including management, product teams, and fellow researchers.

Step 3: Designing scanning automation

A well thought out design is critical to any tool’s long term success. This blog will provide a technical overview of Adobe’s implementation and our reasoning behind each decision.

The Adobe Security Automation Framework:

The Adobe Security Automation Framework (SAF) is designed around a few core principles that dictate the downstream implementation decisions. They are as follows:

  1. The “framework” is, in fact, a framework. It is designed to facilitate security automation but it does not try to be more than that. This means:
    1. SAF does not care what security assessment tool is being run. It just needs the tool to communicate progress and results via a specified API. This allows us to run any tool, based on any language, without adding hard coded support to SAF for each tool.
    2. SAF provides access to the results data but it is not the primary UI for results data. Each team will want their data viewed in a team specific manner. The SAF APIs allow teams to pull the data and render it as best fits their needs. This also allows the SAF development team to focus their time on the core engine.
  2. The “framework” is based on Docker. SAF is designed to be multi-cloud and Docker allows portability. The justifications for using a Docker based approach include:
    1. SAF can be run in cloud environments, in our internal corporate network, or run from our laptops for debugging.
    2. Development teams can instantiate their own mini-copies of SAF for testing.
    3. Using Docker allows us to put security assessment tools in separate containers where their dependencies won’t interfere with each other.
    4. Docker allows us to scale the number of instances of each security assessment tool dynamically with respect to their respective job size.
  3. SAF is modularized with each service (UI, scheduler, tool instance, etc.) in its own Docker container. This allows for the following advantages:
    1. The UI is separated from the front-end API allowing the web interface to be just another client of the front-end API. While people will initiate scans from the UI, SAF also allows for API driven scan requests.
    2. The scanning environments are independent. The security assessment tools may need to be run from various locations depending on their target. For instance, the scan may need to run within an internal network, external network, a specific geographic location, or just within a team’s dedicated test environments. With loose-coupling and a modular design, the security assessment tools can be run globally while still having a local main controller.
    3. Docker modularity allows for choosing the language and technology stack that is appropriate for that module’s function.
  4. By having each security test encapsulated within its own Docker container, anyone in the company can have their security assessment tools included in SAF by providing an appropriately configured Docker image. Volunteers can write a simple Python driver based on the SAF SDK that translates a security testing tool’s output into compatible messages for the SAF API and provide that to the SAF team as a Docker image. We do this because:
    1. The SAF team does not want to be the bottleneck for innovation. By allowing external contributions to SAF, the number of tools that it can run increases at a far faster rate. Given the wide array of technology stacks deployed at Adobe, this allows development teams to contribute tools that are best suited for their environments.
    2. In certain incident response scenarios, it may be necessary to widely deploy a quick script to analyze your exposure to a situation. With SAF, you could get a quick measurement by adding the script to a Docker image template and uploading it to the framework.
  5. The "security assertion", or the test that you want to run, should test a specific vulnerability and provide a true/false result that can be used for accurate measurements of the environment. This is similar to the Behavior Driven Development approach seen in tools like Gauntlt. SAF is not designed to run a generic, catch-all web application penetration tool that will return a slew of results for human triage. Instead, it is designed for analysis of specific issues. This has the following advantages:
    1. If you run an individual test, then you can file an individual bug for tracking the issue.
    2. You create tests specifically around the security issues that are critical to your organization. The specific tests can then be accurately measured at scale.
    3. Development teams do not feel that their time is being wasted by being flooded with false positives.
    4. Since it is an individual test, the developer in charge of fixing that one issue can reproduce the results using the same tool as was deployed by the framework. They could also add the test to their existing automation testing suite.

Mayank Goyal on our team took the above principles and re-architected our version 1 implementation into a design depicted in the following architecture diagram:

The SAF UI

The SAF UI is deliberately simple, since it is not meant to be the analysis or reporting suite. It is a single-page web application that works with SAF's APIs, focused on allowing researchers to configure their security assertions with the appropriate parameters. The core components of the UI are:

  • Defining the assertion: Assertions (tests) are saved in an internal GitHub instance, built via Jenkins, and posted to a Docker repository; SAF pulls them as required. The GitHub repository contains the Dockerfile, the code for the test, and the code that acts as a bridge between the tool and the SAF APIs using the SAF SDK. Tests can be shared with other team members or kept private.
  • Defining the scan: The same assertion (test) may be run with different configurations for different situations. The scan page is where you define the parameters for different runs.
  • Results: The results page provides access to the raw results, broken down into pass, fail, or error for each host tested. Each result is accompanied by a simple blob of explanatory text.
  • Scheduling: Scans can be set to run at specific intervals.

This screenshot demonstrates an assertion that identifies whether login forms at the given URL are served over HTTP. This assertion is stored in Git, is initiated by /src/startup.sh, and uses version 4 of the configuration parameters.

A Scan is then configured for the assertion which says when the test will be run and which input list of URLs to test. A scan can run more than one assertion for the purposes of batching results.
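A minimal sketch of what the pass/fail core of such a login-form assertion could look like follows. The real implementation is Adobe-internal; the regex-based form detection here is purely illustrative:

```python
# Minimal sketch of a login-form-over-HTTP assertion. The regex-based
# detection is illustrative only; a real test would parse the page
# properly and fetch the URL itself.

import re

def login_form_over_http(page_url, html):
    """Return True (assertion fails) if an insecure page has a password field."""
    has_password_field = re.search(r'<input[^>]+type=["\']?password', html,
                                   re.IGNORECASE) is not None
    return page_url.startswith("http://") and has_password_field

html = '<form action="/login"><input type="password" name="pw"></form>'
print(login_form_over_http("http://www.example.com/login", html))
print(login_form_over_http("https://www.example.com/login", html))
```

Each URL yields a single true/false answer, which is what lets the framework measure the issue accurately across thousands of hosts.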

The SAF API Server

The API server is responsible for preparing the information for the downstream testing environments. It receives the information for a scan from the UI, from an API client, or based on the saved schedule. The tool details and parameters are packaged, along with all the meta-information the task executor needs, and uploaded to the job queue of a testing environment for processing. The API server also listens for responses from the queuing system and stores the results in the database. Everything downstream of it is loosely coupled, so work can be deployed out to multiple locations and geographies.

The Job Queueing system

The queueing system provides basic queuing and allows the scheduler to schedule tasks based on resource availability and defer them when needed. While cloud providers offer queuing services, ours is based on RabbitMQ because we wanted deployment mobility.

The Task Scheduler

This is the brains of running the tools. It is responsible for monitoring all the Docker containers, scaling, resource scheduling, and killing rogue tasks. It exposes the API that receives the status and result messages from the Docker containers; that information is then relayed back to the API server.

The Docker images

The Docker images follow a micro-service architecture approach. The default baseline is Alpine Linux, to keep the image footprint small. SAF assertions can also be quite small: for instance, a test can be a small Python script that requests the homepage of a web server and verifies whether an HSTS header was included in the response. This micro-service approach allows the environment to run multiple instances with minimal overhead. The assertion script communicates its status (e.g., keep-alives) and results (pass/fail/error) back to the task executor using the SAF SDK.
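As a rough illustration (not the actual SAF code), the HSTS check at the heart of such an assertion might look like this; `check_site` shows the fetch for context, while `has_hsts` is the pure pass/fail logic:

```python
# Sketch of the HSTS assertion described above. The network fetch is
# shown for context; the header check itself is a pure function.

from urllib.request import urlopen

def has_hsts(headers):
    """True if a Strict-Transport-Security header is present."""
    return any(name.lower() == "strict-transport-security" for name in headers)

def check_site(url):
    with urlopen(url) as response:          # e.g. "https://example.com/"
        return has_hsts(dict(response.getheaders()))

print(has_hsts({"Strict-Transport-Security": "max-age=31536000"}))
print(has_hsts({"Content-Type": "text/html"}))
```

A script this small fits comfortably in a minimal Alpine-based image, which is what keeps per-instance overhead low.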

Conclusion

While this overview still leaves a lot of the specific details unanswered, it should provide a basic description of our security automation framework approach at the architectural and philosophical level. For a security automation project to be a success at the detailed implementation level, it must be customized to the organization’s technology stack and operational flow. As we progress with our implementation, we will continue to post the lessons that we learn along the way.

Peleus Uhley
Principal Scientist

Tips for Sandboxing Docker Containers

In the world of virtualization, there are two main approaches: virtual machines and containers. Both provide sandboxing: virtual machines through hardware-level abstraction, and containers through process-level isolation on a shared kernel. Docker containers are reasonably secure by default, but do they provide complete isolation? Let us look at the various ways sandboxing can be achieved in containers and what we need to do to approach complete isolation.

Namespaces

One of the building blocks of containers, and the first level of sandboxing, is namespaces. Namespaces give processes their own view of the system and limit the effect processes can have on other processes in the container environment or on the host system. Today there are six namespaces available in Linux, and all of them are supported by Docker.

  • PID namespace: Provides isolation such that a process belonging to a particular PID namespace can only see other processes in the same namespace. Processes in one PID namespace cannot know of the existence of processes in other PID namespaces, and hence cannot inspect or kill them.
  • User namespace: Provides isolation such that a process can be root within its user namespace while being mapped to a non-privileged user on the host system. This is a significant security improvement in a Docker environment.
  • Mount namespace: Isolates the host filesystem from the new filesystem created for the process, allowing processes in different namespaces to change mount points without affecting each other.
  • Network namespace: Gives a process its own network stack, including routing tables, iptables rules, sockets, and interfaces. Additionally, Ethernet bridges are required to allow networking between hosts and namespaces.
  • UTS namespace: Isolates two system identifiers, nodename and domainname. This allows each container to have its own hostname and NIS domain name, which is helpful during initialization.
  • IPC namespace: Isolates inter-process communication resources, including IPC message queues, semaphores, etc.

Although namespaces provide a great level of isolation, there are resources a container can access that are not namespaced. These resources are common to all containers on the host machine, which raises security concerns and may present a risk of attack or information exposure. Resources that are not sandboxed include the following:

  • The kernel keyring: The kernel keyring separates keys by UID. Since users in different containers may share the same UID, all of those users can access the same keys in the keyring. Applications that use the kernel keyring to handle secrets are therefore much less secure due to this lack of sandboxing.
  • /proc and system time: Due to the "one size fits all" nature of Docker, a number of Linux capabilities remain enabled. With certain capabilities enabled, the exposure of /proc offers a source of information leakage and a large attack surface, since /proc includes files containing kernel configuration information and details about host system resources. Capabilities such as SYS_TIME and SYS_ADMIN allow changes to the system time not just inside the container, but also for the host and other containers.
  • Kernel modules: If an application loads a kernel module, the newly added module becomes available to all containers in the environment and to the host system. Some modules enforce security policies; access to such modules would allow applications to change those policies, which is a big concern.
  • Hardware: The underlying hardware of the host system is shared among all containers running on it. A proper cgroup configuration and access control are required for a fair distribution of resources. In other words, namespaces divide a larger area into smaller areas, and cgroups govern how those areas consume resources such as memory, CPU, and disk I/O. A well-defined cgroup configuration helps prevent denial-of-service attacks.

Capabilities

Capabilities split the privileged operations traditionally reserved for the root user into distinct units. An individual non-root process cannot perform any privileged operation; by dividing those operations into capabilities, we can grant them to individual processes without elevating their overall privilege level. This way we can sandbox a container with a restricted set of actions, so that if it is compromised, an attacker can do less damage than with full root access. Be careful when using capabilities:

  • Defaults: As mentioned earlier, with the "one size fits all" nature of Docker, a number of capabilities remain enabled, and the default set given to a container does not provide complete isolation. A better approach is to remove all capabilities for the container and then add back only those required by the application process running in it. Determining which capabilities to add back is a trial-and-error process using various test scenarios for the application.
  • SYS_ADMIN capability: Even capabilities are not fine-grained. The most talked-about example is the SYS_ADMIN capability, which bundles a lot of functionality, some of which should only ever be available to a privileged user. This is another reason for concern.
  • SETUID binaries: The setuid bit gives full root permission to a process using it. Many Linux distributions set the setuid bit on several binaries even though capabilities are a safer alternative, presenting an attack surface in case of a breakout from a non-privileged container. Defang setuid binaries by removing the setuid bit or by mounting filesystems with nosuid.
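The "drop everything, then add back" approach can be sketched as follows, mirroring the semantics of `docker run --cap-drop ALL --cap-add ...`. The default capability list below is abbreviated for illustration:

```python
# Sketch of the "drop everything, add back only what's needed" approach,
# mirroring `docker run --cap-drop ALL --cap-add NET_BIND_SERVICE ...`.
# The default list below is abbreviated for illustration.

DOCKER_DEFAULT_CAPS = {
    "CHOWN", "DAC_OVERRIDE", "FOWNER", "KILL", "MKNOD",
    "NET_BIND_SERVICE", "NET_RAW", "SETGID", "SETUID", "SYS_CHROOT",
}

def effective_caps(drops, adds):
    """Apply --cap-drop / --cap-add semantics to the default set."""
    caps = set() if "ALL" in drops else DOCKER_DEFAULT_CAPS - set(drops)
    return caps | set(adds)

# A web server that only needs to bind to a privileged port:
print(sorted(effective_caps(drops={"ALL"}, adds={"NET_BIND_SERVICE"})))
```

Starting from an empty set and adding back capabilities one test run at a time is exactly the trial-and-error process described above.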

Seccomp

Seccomp (secure computing mode) is a sandboxing feature in the Linux kernel that filters incoming system calls. It allows a process to declare which system calls it may make, and action is taken when a call is not allowed by the filter. Thus, if attackers gain access to the container, they have a limited number of system calls in their arsenal. The seccomp filter uses the Berkeley Packet Filter (BPF) system, similar to the one used for socket filters. In other words, seccomp lets a user catch a syscall and "allow", "deny", "trap", "kill", or "trace" it based on the syscall number and the arguments passed. This adds another layer of granularity for locking down the processes in one's containers to only what is needed.

Docker provides a default seccomp profile for containers that is effectively a whitelist of allowed calls. This profile disables only 44 system calls out of the 300+ available, a consequence of the vast range of container use cases in current deployments: making it stricter would break many applications running in Docker. For example, the reboot system call is disabled because there is no situation in which a container should ever need to reboot the host machine.

Another good example is keyctl, a system call in which a vulnerability was recently found (CVE-2016-0728); keyctl is now also disabled by default. The most secure approach is to create a custom seccomp profile that blocks these 44 system calls plus any others not required by the app running in the container. This can be done with the help of DockerSlim (http://dockersl.im), which auto-generates seccomp profiles.
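A custom profile of this kind is just JSON in Docker's seccomp format: deny by default, then allow only the syscalls the application needs. The sketch below shows the shape; a real profile, such as one generated by DockerSlim, would list many more calls:

```python
# Sketch of generating a stricter custom seccomp profile in Docker's
# JSON format: deny by default, allow only the syscalls the app needs.
# (A real profile would list many more calls than this toy allowlist.)

import json

def build_profile(allowed_syscalls):
    return {
        "defaultAction": "SCMP_ACT_ERRNO",        # deny everything else
        "syscalls": [{
            "names": sorted(allowed_syscalls),
            "action": "SCMP_ACT_ALLOW",
        }],
    }

profile = build_profile({"read", "write", "exit_group", "futex", "nanosleep"})
print(json.dumps(profile, indent=2))
# Apply with: docker run --security-opt seccomp=profile.json ...
```

Note that an allowlist profile like this inverts Docker's default denylist stance, which is why it must be built per-application rather than shipped as a one-size-fits-all default.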

The good part about seccomp is that it makes the attack surface very narrow. However, around 250+ calls remain available, leaving room for attack. For example, CVE-2014-2153 is a vulnerability found in the futex system call that enables privilege escalation through a kernel exploit. This system call is still enabled, and unavoidably so, since it has legitimate use in implementing basic resource locking for synchronization. Although seccomp makes containers more secure than earlier versions of Docker, it provides only moderate security in the container environment by default. It needs to be hardened, especially for enterprises, to match the applications running in the containers.

Conclusion

Through the hardening methods for namespaces and cgroups and the use of seccomp profiles, we are able to sandbox our containers to a great extent. By following the various benchmarks and applying least privilege, we can make our container environment more secure. However, this only scratches the surface, and there is plenty more to take care of.

Rahul Gajria
Cloud Security Researcher Intern


References

1. https://www.toptal.com/linux/separation-anxiety-isolating-your-system-with-linux-namespaces
2. http://www.slideshare.net/jpetazzo/anatomy-of-a-container-namespaces-cgroups-some-filesystem-magic-linuxcon
3. https://www.oreilly.com/ideas/docker-security?intcmp=il-webops-free-article-lgen_five_security_concerns_when_using_docker
4. https://www.nccgroup.trust/globalassets/our-research/us/whitepapers/2016/april/ncc_group_understanding_hardening_linux_containers-10pdf/
5. https://docs.docker.com/engine/security/security/

New Security Framework for Amazon Web Services Released

The Center for Internet Security, of which Adobe is a corporate supporter, recently released its "Amazon Web Services Foundations Benchmark." This document provides prescriptive guidance for configuring security options for a subset of Amazon Web Services, with an emphasis on foundational, testable, and architecture-agnostic settings. It is designed to provide baseline suggestions for ensuring more secure deployments of applications that utilize Amazon Web Services. Our own Cindy Spiess, web security researcher for our cloud services, is a contributor to the current version of this framework. Adobe is a major user of Amazon Web Services, and efforts like this further our goal of educating the broader community on security best practices. You can download the framework document directly from the Center for Internet Security.

Liz McQuarrie
Principal Scientist & Director, Cloud Security Operations