Getting Secrets Out of Source Code


Secrets are valuable information that attackers target to gain access to your systems and data. Secrets can be encryption keys, passwords, private keys, AWS secrets, OAuth tokens, JWT tokens, Slack tokens, API secrets, and so on. Unfortunately, developers sometimes hardcode secrets or store them alongside source code. Even though the source code may be kept securely in source control management (SCM) tools, that is not a suitable place to store secrets. For instance, it is rarely practical to tightly restrict access to source code repositories, because engineering teams need to collaborate to write and review code. Any secrets in source code will also be copied to clones and forks of your repository, making them hard to track and remove. If a secret is ever committed to code stored in SCM tools, you should consider it potentially compromised.

There are other risks in storing secrets in source code. Source code could be accidentally exposed on public networks through a simple misconfiguration, or released software could be reverse engineered by attackers. For all of these reasons, you should make sure secrets are never stored in source code or SCM tools.

Security teams should take a holistic approach to tackle this problem. First and foremost, educate developers to not hardcode or store secrets in source code. Next, look for secrets while doing security code reviews. If you are using static analysis tools, then consider writing custom rules to automate this process. You could also have automated tests that look for secrets and will fail the code audit if they are found. Lastly, evaluate existing source code and enumerate secrets that are already hardcoded or stored along with source code and migrate them to password management vaults.

However, finding all of the secrets potentially hiding in source code can be challenging, depending on the size of the organization and the number of code repositories. Fortunately, there are a few tools available to help find secrets in source code and SCM tools. Gitrob is an open source tool that helps organizations identify sensitive information lingering in GitHub repositories. Gitrob iterates over all of the organization's repositories to find files that might contain sensitive information, using a range of known patterns. Gitrob can be an efficient way to quickly identify files that are likely to contain sensitive information (e.g. a private key file matching *.key).
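To make the pattern-matching idea concrete, here is a small Python sketch that flags file paths whose names match a list of suspicious patterns. The pattern list is hypothetical and chosen for illustration only; it is not Gitrob's actual signature set, and this is not how Gitrob itself is implemented.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical patterns for illustration; not Gitrob's actual signatures.
SUSPICIOUS_NAME_PATTERNS = ["*.key", "*.pem", "*.p12", "*.pfx", "id_rsa", ".env"]


def flag_suspicious_paths(paths):
    """Return the paths whose file name matches a suspicious pattern.

    Matching is done on the lower-cased file name, so "ID_RSA" and
    "Server.KEY" are flagged as well.
    """
    flagged = []
    for p in paths:
        name = PurePosixPath(p).name.lower()
        if any(fnmatchcase(name, pattern) for pattern in SUSPICIOUS_NAME_PATTERNS):
            flagged.append(p)
    return flagged
```

Feeding this function the file listing of each repository gives a quick first pass. Because it never looks inside the files, it says nothing about secrets embedded in otherwise ordinary source files.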

Gitrob can, however, generate thousands of findings, which may include many false positives, and it can still miss some secrets (false negatives). I recommend complementing Gitrob with other tools such as ‘truffleHog’, developed by Dylan Ayrey, or ‘git-secrets’ from AWS Labs. These tools can search code more deeply and may help you cut down on some of the false reports.

Our team chose to complement Gitrob with custom Python scripts that looked into the file contents. The scripts identified secrets based on regular expression patterns and entropy. The patterns were based on the secrets found through Gitrob and on our understanding of how secrets are structured in our code. For example, to find AWS access key IDs and secret access keys, I used regular expressions suggested by Amazon in one of their blog posts:

Pattern for access key IDs: (?<![A-Z0-9])[A-Z0-9]{20}(?![A-Z0-9])
Pattern for secret access keys: 
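The following Python sketch shows how a regular expression plus entropy check might be combined. It is not our team's actual script. The access key ID pattern is the one quoted above. The secret access key pattern is not given in the text, so the sketch assumes the commonly cited companion pattern for 40-character base64-style strings. The entropy threshold of 3.0 bits per character is also an assumption and should be tuned against your own code.

```python
import math
import re
from collections import Counter

# Pattern quoted in the text above.
ACCESS_KEY_ID = re.compile(r"(?<![A-Z0-9])[A-Z0-9]{20}(?![A-Z0-9])")
# Assumption: the commonly cited companion pattern for secret access keys;
# the original text does not include it.
SECRET_ACCESS_KEY = re.compile(
    r"(?<![A-Za-z0-9/+=])[A-Za-z0-9/+=]{40}(?![A-Za-z0-9/+=])"
)

PATTERNS = (
    ("access_key_id", ACCESS_KEY_ID),
    ("secret_access_key", SECRET_ACCESS_KEY),
)


def shannon_entropy(s):
    """Shannon entropy of s in bits per character."""
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def scan_text(text, min_entropy=3.0):
    """Return (line number, kind, match) for pattern hits above the entropy threshold.

    min_entropy is an assumed cut-off that drops repetitive strings
    such as padding or placeholder values.
    """
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for kind, pattern in PATTERNS:
            for match in pattern.finditer(line):
                candidate = match.group()
                if shannon_entropy(candidate) >= min_entropy:
                    findings.append((lineno, kind, candidate))
    return findings
```

The regular expressions narrow the search to strings with the right shape, and the entropy filter discards matches that look too regular to be randomly generated keys. Both steps still produce false positives, so the findings need manual triage.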

In order to scale, share these tools and guidance with product teams so that they can run the tools and triage the findings themselves. You should also create clear guidelines for product teams on how to store secrets and how to move them securely to established secret management tools (e.g. password vaults) for your production environment. The basic principle here is to never store passwords or secrets in clear text and to make sure they are encrypted at rest and in transit until they reach the production environment. Secrets that have been stored insecurely must be invalidated or rotated. Password management tools might also provide features such as audit logs, access control, and secret rotation, which can further help keep your production environment secure.
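On the application side, one common way to keep secrets out of source code is to read them at runtime from the environment. The Python sketch below assumes that your deployment tooling or secret management tool injects the secret as an environment variable. The helper name get_secret and the variable name DB_PASSWORD are hypothetical, and a real vault client would replace the environment lookup.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret has not been provided at runtime."""


def get_secret(name, environ=None):
    """Return the secret called name from the environment.

    Assumption: the secret is injected at deploy time (for example by a
    vault agent or the deployment pipeline), never committed to source.
    The environ parameter exists only to make the function easy to test.
    """
    env = os.environ if environ is None else environ
    value = env.get(name)
    if not value:
        # Fail loudly instead of falling back to a hardcoded default.
        raise MissingSecretError(f"secret {name!r} is not set in the environment")
    return value
```

A call such as get_secret("DB_PASSWORD") then replaces the hardcoded value in the code. Failing when the secret is missing, rather than falling back to a default, avoids quietly reintroducing a hardcoded credential.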

Given how valuable secrets are – and how much harm they can cause your organization if they were to unwittingly get out – security teams must proactively tackle this problem. No team wants to be the one responsible for leaking secrets through their source code.

Karthik Thotta Ganesh
Security Researcher

DYK?, Ongoing Research, Secure Product Lifecycle (SPLC)

Posted on 05-25-2017