A Basic Distributed Fuzzing Framework for FOE

Last week, CERT released a Python-based file format fuzzer for Windows called Failure Observation Engine (FOE). It is a Windows port of their Linux-based fuzzer, Basic Fuzzing Framework (BFF). CERT provided Adobe with an advance copy of FOE for internal testing, and we have found it to be very useful. One of the key features of FOE is its simplicity. The configuration file is very straightforward, which makes it easy to introduce to new teams. We have also used the “copy” mode of FOE to help automate triaging large sets of external reports. It is a great tool to have for dumb fuzzing. For this blog, I am going to discuss a simple Python wrapper I created during my initial testing of the tool, which helped to coordinate running FOE across multiple machines. This approach allows you to pull seed files from a centralized location. You can also view the status of all of the fuzzing runs and their results from the same location. If you are not interested in writing a distributed fuzzing framework, then you might want to stop reading because the rest of this blog is all about code. :-)

The goal of this distributed fuzzing framework design was to create something simple and lightweight, since I was experimenting with a new tool. I set a personal limit of keeping the project to around 1,000 lines of code in order to scope my time investment. That said, I also wanted to build something that I could easily scale later in the event that I liked it enough to invest more time. For the client-side code, I used Python since that was already required for FOE. On the server side, I had a Linux/Apache/MySQL/Perl (LAMP) server. Knowing that everyone has their own preference for server-side authoring, I am only going to describe the server-side architecture rather than providing the Perl source. Nothing in the server-side code is so complicated that a Web developer couldn’t figure out how to do an implementation in the language of their choice from this description. While I designed this for testing the FOE fuzzer, only one file in the entire system is FOE-specific, which makes the infrastructure reusable for other fuzzers. The current name of the main script is “dffiac.py” because I thought of this project as a “Distributed Fuzzing Framework in a Can”.

For this design, all of the tracking logic is consolidated on the centralized server. The Python script will issue requests for data using simple GETs and POSTs over HTTP. The server will respond to the requests with basic XML. The fuzzing seed files are hosted on the server in a public web server directory from which they can be downloaded. Identified crashes will be uploaded to the server and placed in a public web server directory. Both the client-side and server-side code are agnostic with regard to the format of the seed files and the targeted application. Therefore, this should be relatively easy to set up in any infrastructure.

 

The database design

In this design, the MySQL server coordinates the runs across all the different machines. You first need a table containing all the files that you want to fuzz. At a bare minimum, it needs a unique primary key (fid), the name of the file and its location on the web server. I currently have a database of more than 60,000 SWF files that are sub-categorized based on type so that I can focus fuzzing on specific types of SWF files. However, name and location will get you started with fuzzing.

 

seed_files

Field | Type | Description
fid | Integer (primary key, autoincrement) | The unique File ID for this entry
name | VARCHAR | The filename
location | VARCHAR | The relative web directory for the file (e.g. “/fuzzing/files/”)

 

The next thing that you will need is a table to track all of the fuzzing runs. A “run” is defined as one or more servers testing with the same FOE configuration file against a defined set of seed files. There are multiple ways in which you can define the selected seed files for the run. For instance, you may want to use FOE against multiple types of applications, in which case you might have a different seed_files table for each file type. To support multiple seed_files tables, the design of run_records requires that you provide the “table_name” that will be used for this run. Once a seed_files table is selected, you may want to restrict the run further to a subset of the files within that table. Therefore, the design requires that you provide a “type” parameter which denotes the method for selecting files from the seed_files table. The value of type can be “all”, “range” or any other sub-category you want to define. As an example, a run may be a “range” type that starts at start_fid and stops at end_fid.

 

run_records

Field | Type | Description
rid | Integer (primary key, autoincrement) | The unique ID for this run
name | VARCHAR | The human readable name for the run
description | VARCHAR | A description for the run (e.g. config or mutation used, # of iterations, etc.)
type | VARCHAR | Values can include (all, range, etc)
table_name | VARCHAR | The name of the seed_files table that will be used for testing
start_fid | Integer | The first fid from seed_files to be fuzzed in this run
end_fid | Integer | The last fid from seed_files to be fuzzed in this run
current_fid | Integer | This tracks the next fid to be tested during the run

 

For every run, you will have multiple servers running FOE. For each server instance, it will be necessary to track the server name, when it started, the current status of the server, and when it last provided an update. The status will include values such as “running” and “complete.” You can infer that a machine has died if too much time has passed since its last_update timestamp was modified.

 

server_instances

Field | Type | Description
siid | Integer (primary key, autoincrement) | The unique server instance ID
server_name | VARCHAR | The name of the server (e.g. hostname + IP address)
status | VARCHAR | Whether the instance is running or has completed
start_time | timestamp | When did this instance start?
last_update | timestamp | When was the last request from this instance?
rid | Integer | What run_record is this instance associated with?
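
As a rough illustration of the dead-machine check described above, the following sketch flags instances that are still marked “running” but have been silent for too long. It is written for Python 3 (the client code in this post targets Python 2), and the function name find_stale_instances and the two-hour threshold are assumptions for illustration, not part of the original status page.

```python
from datetime import datetime, timedelta

def find_stale_instances(instances, now, max_silence=timedelta(hours=2)):
    """Return the siids that look dead.

    instances is an iterable of (siid, status, last_update) tuples, as they
    might be read from the server_instances table. The two-hour default is
    an arbitrary assumption; pick a value longer than one FOE pass takes.
    """
    stale = []
    for siid, status, last_update in instances:
        # Completed instances are expected to stop reporting
        if status == "running" and now - last_update > max_silence:
            stale.append(siid)
    return stale
```

A reasonable threshold depends on how long FOE spends on a single seed file, since an instance only reports back between files.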

 

Lastly, you will need a table to record the results. The script will record the server instance ID (siid) where the crash was found in case there are issues with reproducing the crash. This will allow a QA engineer to retest on the original machine where the crash occurred. It is also necessary to track which run identified the crash. The rid is not stored in this table because it can be derived from the siid, and database normalization rules discourage storing redundant information. In this design, the script will record a result in fuzz_records regardless of whether a crash was identified. This allows you to track which files have been tested against which FOE configurations. If a crash is identified, the web server directory where the crash result was stored is also recorded.

 

fuzz_records

Field | Type | Description
frid | Integer (primary key, autoincrement) | The unique fuzz record ID
fid | Integer | The seed_files ID for this entry
siid | Integer | The server instance ID for this entry
crash | Boolean | Whether a crash was recorded during this test
location | VARCHAR | Where the crash result was stored (e.g. /results/run_id/)
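
To make the four tables concrete, here is a minimal schema sketch. It uses Python 3's built-in sqlite3 module only so that the example is self-contained; the setup described in this post runs on MySQL, and the column sizes, the SQLite type names and the create_schema helper are assumptions rather than the actual DDL.

```python
import sqlite3

# Assumed DDL mirroring the field lists above (SQLite dialect, not MySQL)
SCHEMA = """
CREATE TABLE seed_files (
    fid INTEGER PRIMARY KEY AUTOINCREMENT,
    name VARCHAR(255) NOT NULL,
    location VARCHAR(255) NOT NULL
);
CREATE TABLE run_records (
    rid INTEGER PRIMARY KEY AUTOINCREMENT,
    name VARCHAR(255),
    description VARCHAR(1024),
    type VARCHAR(32),
    table_name VARCHAR(64),
    start_fid INTEGER,
    end_fid INTEGER,
    current_fid INTEGER
);
CREATE TABLE server_instances (
    siid INTEGER PRIMARY KEY AUTOINCREMENT,
    server_name VARCHAR(255),
    status VARCHAR(32),
    start_time TIMESTAMP,
    last_update TIMESTAMP,
    rid INTEGER REFERENCES run_records(rid)
);
CREATE TABLE fuzz_records (
    frid INTEGER PRIMARY KEY AUTOINCREMENT,
    fid INTEGER,
    siid INTEGER REFERENCES server_instances(siid),
    crash BOOLEAN,
    location VARCHAR(255)
);
"""

def create_schema(conn):
    """Create the four tracking tables on an open sqlite3 connection."""
    conn.executescript(SCHEMA)
```

Because fuzz_records stores only the siid, finding the run for a given crash means joining through server_instances, which is the normalization trade-off described above.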

 

The config file

You will start the Python script by providing a simple configuration file on the command line: “python dffiac.py dffiac.cfg”. The configuration file is in the same format as the FOE configuration file and contains the following:

 

dffiac.cfg

[foeoptions]
python_location=C:\Python26\python.exe
config_location=C:\FOE\configs\my_foe_config.cfg

[runoptions]
run_id=1
web_server=http://my.internal.server.com
upload_cgi=/fuzzers/crash_uploader.cgi
action_cgi=/fuzzers/action_handler.cgi

[logoptions]
log_dir=C:\dffiac\logs\

 

The foeoptions section tells the script where to find the Python executable and the location of the FOE configuration file you will use for this run. The runoptions section provides the run id (rid) the database is using to track this run along with the location of the web server, the path to the action_handler.cgi and the path to the CGI that will handle the file uploads. The logoptions section allows you to specify where the script will log local information regarding the run. The logs directory needs to exist prior to starting the script. The config_location and run_id are likely the only two elements that will change from run to run.
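
Since a missing entry would otherwise only surface as a KeyError partway through a run, it can help to validate the file up front. The sketch below is one way to do that in Python 3 with configparser (the script in this post uses the Python 2 ConfigParser module); the REQUIRED mapping and the load_dffiac_config name are assumptions based on the sample file above.

```python
import configparser

# Keys taken from the sample dffiac.cfg above
REQUIRED = {
    "foeoptions": ["python_location", "config_location"],
    "runoptions": ["run_id", "web_server", "upload_cgi", "action_cgi"],
    "logoptions": ["log_dir"],
}

def load_dffiac_config(text):
    """Parse config text and return {section: {option: value}}.

    Raises ValueError naming every required entry that is absent.
    """
    # interpolation=None keeps Windows paths and URLs exactly as written
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    missing = []
    for section, keys in REQUIRED.items():
        for key in keys:
            if not parser.has_option(section, key):
                missing.append(section + "." + key)
    if missing:
        raise ValueError("missing config entries: " + ", ".join(missing))
    return {section: dict(parser.items(section)) for section in parser.sections()}
```

The returned structure has the same nested-dictionary shape that the main script builds for itself, so a check like this could sit in front of it without other changes.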

 

The transaction flow

For this next section, we will review the transactions between the dffiac.py script and the web server. The web server will read in the GET parameters, execute the relevant SQL query and return the results as XML. All but one of the requests are handled by the action_handler CGI defined by action_cgi in the dffiac.cfg config file. The upload of the crash results is handled by the upload_cgi defined in the same file.

Once dffiac.py has started and been initialized by the config file, the script will begin sending requests to the server. An “action” parameter informs the action_handler CGI which query to perform. The server will always respond to the Python script with the relevant information for the request in a simple XML format.
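
As a small illustration of how such a request could be assembled on the client, the sketch below builds the URL from the web_server and action_cgi settings plus the action and its parameters. It is Python 3 (urllib.parse rather than the Python 2 urllib used later in this post), and build_action_url is a hypothetical helper, not a function from dffiac.py.

```python
from urllib.parse import urlencode

def build_action_url(web_server, action_cgi, action, **params):
    """Return the GET URL for one action_handler request.

    The action parameter always comes first; the remaining parameters are
    sorted by name so that the resulting URL is predictable.
    """
    query = [("action", action)] + sorted(params.items())
    return web_server + action_cgi + "?" + urlencode(query)
```

For example, build_action_url("http://my.internal.server.com", "/fuzzers/action_handler.cgi", "getRunInfo", rid=1) yields http://my.internal.server.com/fuzzers/action_handler.cgi?action=getRunInfo&rid=1.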

 

The first HTTP request from the Python code will be to gather all the information regarding the run_id provided in the config file:

GET /fuzzers/action_handler.cgi?action=getRunInfo&rid=1

 

The web server will then perform this SQL query with the rid that was provided:

select type,start_fid,end_fid from run_records where rid = ?

 

The results from the query will be used to return the following XML (assuming the run is defined as the range of fids from 1-25):

<xml>
  <run_type>range</run_type>
  <start_fid>1</start_fid>
  <end_fid>25</end_fid>
</xml>
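
Because every response uses the same flat layout, the server side only needs one small helper to produce it. The Perl source is not part of this post, so the following is a Python 3 sketch of what such a helper might look like; xml_response is an assumed name, and escaping the values is a precaution added here rather than something the post describes.

```python
from xml.sax.saxutils import escape

def xml_response(fields):
    """Render (tag, value) pairs in the flat <xml>...</xml> layout used above."""
    lines = ["<xml>"]
    for tag, value in fields:
        # Escape &, < and > so file names or error text cannot break the XML
        lines.append("  <%s>%s</%s>" % (tag, escape(str(value)), tag))
    lines.append("</xml>")
    return "\n".join(lines)
```

Calling xml_response([("run_type", "range"), ("start_fid", 1), ("end_fid", 25)]) reproduces the response shown above.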

 

Now that dffiac.py has the information for the run, it will then inform the web server that the run is starting:

GET /fuzzers/action_handler.cgi?action=recordServerStart&rid=1&serverName=server1

 

This HTTP request will result in the following SQL query:

insert into server_instances (server_name,status,start_time,rid) values (?,'running',NOW(),?)

 

The insert_id from this query (siid) becomes the unique identifier for this instance and is returned for use in later queries:

<xml>
  <siid>1</siid>
</xml>

 

Now that this instance has officially registered to contribute to this run, the Python script will begin requesting individual files to test:

GET /fuzzers/action_handler.cgi?action=getNextFid&rid=1&run_type=range

 

The corresponding SQL query will vary depending on how you have defined your run. For this example, we will assume that this is a basic run that will incrementally walk through the file IDs in the seed_files table. To accomplish this, we create a SQL user variable called “@value” and assign it the current_fid. By capturing the current fid in “@value” and incrementing current_fid in a single statement, we can avoid a race condition when multiple servers are running.

update run_records set current_fid = current_fid + 1 where rid = ? and @value := current_fid;

 

At this point, “@value” is set to 1, which is the fid the Python script will test, and the current_fid in the database table has been incremented to 2. The web server can then fetch “@value” with the following SQL command:

select @value;

 

Since the process of asking for the next fid will automatically increment the value of current_fid, the value of current_fid will eventually exceed the value of the end_fid in the database table. While it may seem weird, it doesn’t hurt the process. This can be allowed to occur, or you can add a little more server-side logic to have the server return -1 as the current_fid to stop the run when end_fid is reached. Note that the client loop shown later only compares current_fid against end_fid, so it would also need to check for -1 if you choose that approach.
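
The optional -1 behavior can be sketched as follows. This is not the MySQL “@value” statement from this post: it is a Python 3 stand-in using sqlite3, where the update and the read-back happen inside one transaction so that two clients cannot be handed the same fid. The function name get_next_fid and the use of SQLite are assumptions for illustration.

```python
import sqlite3

def get_next_fid(conn, rid):
    """Claim the next fid for run rid, or return -1 once the run is exhausted."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE run_records SET current_fid = current_fid + 1 "
            "WHERE rid = ? AND current_fid <= end_fid",
            (rid,),
        )
        if cur.rowcount == 0:
            # Unknown rid, or every fid up to end_fid has been handed out
            return -1
        # The value this caller claimed is the one from before the increment
        row = conn.execute(
            "SELECT current_fid - 1 FROM run_records WHERE rid = ?", (rid,)
        ).fetchone()
        return row[0]
```

With this variant current_fid stops at end_fid + 1 instead of growing with every extra request, which makes the run_records row easier to read on a status page.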

 

The “select @value” result will be returned to the Python script as the current_fid available for testing:

<xml>
  <current_fid>1</current_fid>
</xml>

 

The Python script will then compare the current_fid with the end_fid that it received earlier to determine whether to stop testing.

 

Once we have the fid of the file that we will test, we can then fetch the information for that specific file:

GET /fuzzers/action_handler.cgi?action=getFileInfo&rid=1&fid=1

 

Using the rid, the web server can query the run_records table to find the table_name that contains the seed files.

select table_name from run_records where rid = ?

 

Assuming the result of that query is saved in the variable “$table_name”, the web server can construct the query to retrieve the file name and the directory location that corresponds to the file id. Note the spaces around the table name, which keep the concatenated statement valid:

"select name, location from " . $table_name . " where fid = ?"

 

The web server will return the file name and location with the following XML:

<xml>
  <name>seed.txt</name>
  <location>/fuzzers/files/</location>
</xml>
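
One detail of the getFileInfo step deserves care: the table name is concatenated into the SQL string because it cannot be bound as a “?” placeholder. A cautious implementation might check it against a known list before building the statement. The following Python 3 sketch shows the idea; build_file_query and the allowed_tables argument are hypothetical and not part of the Perl handler described here.

```python
import re

# Plain SQL identifier: letters, digits and underscores, not starting with a digit
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def build_file_query(table_name, allowed_tables):
    """Return the seed file lookup query for table_name.

    Raises ValueError unless table_name is a plain identifier that also
    appears in allowed_tables.
    """
    if table_name not in allowed_tables or not _IDENTIFIER.fullmatch(table_name):
        raise ValueError("unexpected seed table: %r" % (table_name,))
    # The fid is still passed separately as a bound parameter
    return "select name, location from " + table_name + " where fid = ?"
```

Even though table_name comes from run_records rather than directly from the client, a check like this keeps a mistyped or malicious run definition from turning into arbitrary SQL.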

 

Now that the location of the seed file is known, it can be downloaded by dffiac.py and saved in the FOE seeds directory. The FOE fuzzer is then started, and dffiac.py waits for FOE to finish testing that seed file. Once FOE testing has completed, the result will need to be recorded by sending the fid and a boolean value indicating whether a crash was identified with that test:

GET /fuzzers/action_handler.cgi?action=recordResult&siid=1&fid=1&crash=1

 

This will result in the following query:

insert into fuzz_records (siid,fid,crash) values (?,?,?)

 

The web server will also record that it has received an update from this fuzzing server instance in the server_instances table to let us know that it is still alive and processing:

update server_instances set last_update = NOW() where siid = ?

 

The result is recorded regardless of success or failure so that you can track which files have been successfully tested with which configs. You could infer this from the run_records, but if a machine dies, a file might be skipped. The server-side code will take the insert_id from the fuzz_records statement (frid) and return the following XML:

<xml>
  <frid>1</frid>
</xml>

 

If there was a crash, the Python script will zip up the crash directory, base64 encode the file and POST it to the upload_cgi identified in the dffiac configuration file. The script will leave the zip file on the fuzzing server if an error is detected during the upload. Along with the zip file, it will send the rid and frid. The rid is used to store files in a web server directory unique to that run. The frid is sent so that the action_handler can update the fuzz_records entry with the location of the uploaded crash file (e.g. “/results/1/zip_file_name.zip”) in the following SQL query:

update fuzz_records set location = ? where frid = ?

 

A successful upload will result in the following XML:

<xml>
  <success>1</success>
</xml>

 

A failed upload can return the description of the error to the client with the following XML:

<xml>
  <error>Replace me with the actual error description</error>
</xml>
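
The upload step can be sketched on the server side as well. The Perl upload CGI is not shown in this post, so the Python 3 function below is only an illustration of the flow described above: decode the base64 body, store it in a directory named after the rid, and answer with the success or error XML. The name store_crash_upload, the results_root argument and the file name sanitizing are assumptions.

```python
import base64
import os
import re
from xml.sax.saxutils import escape

def store_crash_upload(results_root, rid, filename, b64_body):
    """Decode an uploaded zip and store it under results_root/<rid>/.

    Returns (xml, location). location is the value that could be written to
    fuzz_records.location, or None when the upload failed.
    """
    try:
        if not re.fullmatch(r"[0-9]+", str(rid)):
            raise ValueError("rid must be numeric")
        # Keep only the final path component so uploads stay inside the run directory
        safe_name = os.path.basename(filename.replace("\\", "/"))
        if safe_name in ("", ".", ".."):
            raise ValueError("bad file name")
        data = base64.b64decode(b64_body, validate=True)
        run_dir = os.path.join(results_root, str(rid))
        os.makedirs(run_dir, exist_ok=True)
        with open(os.path.join(run_dir, safe_name), "wb") as out:
            out.write(data)
    except (ValueError, OSError) as exc:
        # binascii.Error from a bad base64 body is a subclass of ValueError
        return "<xml>\n  <error>%s</error>\n</xml>" % escape(str(exc)), None
    return "<xml>\n  <success>1</success>\n</xml>", "/results/%s/%s" % (rid, safe_name)
```

The returned location has the same shape as the “/results/1/zip_file_name.zip” example above.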

 

The dffiac.py script will then continue retrieving new files and testing them with FOE until the end_fid is reached. Then the final call to the web server will record that this fuzzing server instance has completed its run and has stopped:

GET /fuzzers/action_handler.cgi?action=recordRunComplete&siid=1

 

The web server will record the completion with the following SQL query:

update server_instances set status='complete', last_update=NOW() where siid = ?

 

The web server will respond to this last request with the following XML:

<xml>
  <success>1</success>
</xml>

 

The last XML response is currently ignored by the Python script but a more robust implementation could double-check for errors.
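
A more defensive client could route every response through one check before using it. The sketch below is Python 3 and uses ElementTree instead of the minidom calls in the libraries that follow; ServerError and check_response are hypothetical names introduced for this example.

```python
import xml.etree.ElementTree as ET

class ServerError(Exception):
    """Raised when the server reports an error or sends unparsable XML."""

def check_response(xml_text):
    """Return {tag: text} for a flat <xml> response.

    Raises ServerError if the response is malformed or contains <error>.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        raise ServerError("malformed response: %s" % exc)
    fields = {child.tag: (child.text or "") for child in root}
    if "error" in fields:
        raise ServerError(fields["error"])
    return fields
```

With a helper like this, a failed recordRunComplete call could be logged locally instead of being silently ignored.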

 

The Python code

The logic for the distributed fuzzing framework is split into one main file (dffiac.py) and three libraries that are contained in a /libs directory. We’ll start with the three libraries in the /libs directory. The code below is the library that contains the utilities for creating the zip file of the crash result.

 

ZipUtil.py (30 lines)

import zipfile
import os

class ZipUtil:

  #Create a zip file and add everything in path_ref
  def createZipFile(self, path_ref, filename):
    zip_file = zipfile.ZipFile(filename, 'w')

    #Check to see if path_ref is a file or folder
    if os.path.isfile(path_ref):
      zip_file.write(path_ref)
    else:
      self.addFolder(zip_file, path_ref)

    zip_file.close()

  #Recursively add folder contents to the zip file
  def addFolder(self, zip_file, folder):
    for entry in os.listdir(folder):

      #Get path of child element
      child_path = os.path.join(folder, entry)

      #Check to see if the child is a file or folder
      if os.path.isfile(child_path):
        zip_file.write(child_path)
      elif os.path.isdir(child_path):
        self.addFolder(zip_file, child_path)

 

The second library will base64 encode the zip file prior to uploading it to the web server via a POST request. On the server side, you will need to base64 decode the file before writing it to disk.

 

PostHandler.py (77 lines)

import mimetools
import mimetypes
import urllib
import urllib2
import base64

class PostHandler(object):

  def __init__(self,webServer,uploadCGI):
    self.web_server = webServer
    self.upload_cgi = uploadCGI
    self.form_vars = []
    self.file_attachments = []
    self.mime_boundary = mimetools.choose_boundary()
    return

  #Add a form field to the request
  def add_form_vars(self, name, value):
    self.form_vars.append((name, value))
    return

  #Get the mimetype for the attachment
  def get_mimetype(self,filename):
    mimetype = mimetypes.guess_type(filename)[0] or 'application/octet-stream'
    return(mimetype)

  #Add a base64 encoded file attachment
  def append_file(self, var_name, filename, file_ref, mimetype=None):
    raw = file_ref.read()
    body = base64.standard_b64encode(raw)
    if mimetype is None:
      mimetype = self.get_mimetype(filename)
    self.file_attachments.append((var_name, filename, mimetype, body))

  #Get the body of the request as a string
  def get_request_body(self):
    lines = []
    section_boundary = '--' + self.mime_boundary

    # Add the form fields
    for (name, value) in self.form_vars:
      lines.append(section_boundary)
      lines.append('Content-Disposition: form-data; name="%s"' % name)
      lines.append('')
      lines.append(value)

    # Add the files to upload
    # (parts of a multipart/form-data body use the "form-data" disposition)
    for var_name, filename, content_type, data in self.file_attachments:
      lines.append(section_boundary)
      lines.append('Content-Disposition: form-data; name="%s"; filename="%s"' % \
        (var_name, filename))
      lines.append('Content-Type: %s' % content_type)
      lines.append('Content-Transfer-Encoding: Base64')
      lines.append('')
      lines.append(data)

    #Add the final boundary
    lines.append('--' + self.mime_boundary + '--')
    lines.append('')

    #Combine the list into one long string
    CRLF = '\r\n'
    return CRLF.join(lines)

  #Send the final request
  def send_request(self):
    request = urllib2.Request(self.web_server + self.upload_cgi)
    content_type = 'multipart/form-data; boundary=%s' % self.mime_boundary
    request.add_header('Content-type',content_type)

    form_data = self.get_request_body()
    request.add_header('Content-length',len(form_data))
    request.add_data(form_data)

    result = urllib2.urlopen(request).read()
    return result

 

 

The third library handles the communication between the client and server. It will generate the GET requests and parse the XML responses.

 

ActionHandler.py (94 lines)

import urllib
import urllib2
from xml.dom.minidom import parseString

class ActionHandler:

  #Initialize with the information from the config file
  def __init__(self,options,localLog):
    self.webServer = options['runoptions']['web_server']
    self.uploadCGI = options['runoptions']['upload_cgi']
    self.actionCGI = options['runoptions']['action_cgi']
    localLog.write("Configured web server\n")

  #Parse the XML for the requested text value
  def getText(self,nodelist):
    rc = []
    for node in nodelist:
      if node.nodeType == node.TEXT_NODE:
        rc.append(node.data)
    return ''.join(rc)

  #Make a web request to the server with the provided GET parameters
  def retrieveInfo(self,values):
    #Put the parameters in the query string so that this is a GET request;
    #passing them as the data argument of urlopen would send a POST instead
    url = self.webServer + self.actionCGI + '?' + urllib.urlencode(values)
    response = urllib2.urlopen(url)
    xml = response.read()
    response.close()
    return(xml)

  #Get the information for the rid provided in the config file
  def getRunInfo(self,rid):
    values = {'action':'getRunInfo',
      'rid': rid}
    xml = self.retrieveInfo(values)
    dom = parseString(xml)
    run_type = self.getText(dom.getElementsByTagName("run_type")[0].childNodes)
    start_fid = self.getText(dom.getElementsByTagName("start_fid")[0].childNodes)
    end_fid = self.getText(dom.getElementsByTagName("end_fid")[0].childNodes)
    return (run_type,start_fid,end_fid)

  #Record that this server instance is starting a run
  def recordServerStart(self,rid,serverName):
    values = {'action':'recordServerStart',
      'rid': rid,
      'serverName':serverName}
    xml = self.retrieveInfo(values)
    dom = parseString(xml)
    lastrowid = self.getText(dom.getElementsByTagName("siid")[0].childNodes)
    return (lastrowid)

  #Record that the server is now complete with its tests
  def recordRunComplete(self,siid):
    values = {'action':'recordRunComplete',
      'siid': siid}
    xml = self.retrieveInfo(values)

  #Get the fid for the next file to be fuzzed
  def getNextFid(self,rid,fid,run_type):
    values = {'action':'getNextFid',
      'fid':fid,
      'run_type':run_type,
      'rid':rid}
    xml = self.retrieveInfo(values)
    dom = parseString(xml)
    current_id = self.getText(dom.getElementsByTagName("current_fid")[0].childNodes)
    return current_id

  #Get the file name and location for the selected fid
  def getFileInfo(self,rid,fInfo):
    values = {'action':'getFileInfo',
      'rid':rid,
      'fid':fInfo.fid}
    xml = self.retrieveInfo(values)
    dom = parseString(xml)
    fInfo.name = self.getText(dom.getElementsByTagName("name")[0].childNodes)
    fInfo.location = self.getText(dom.getElementsByTagName("location")[0].childNodes)

  #Record the result from the fuzzing test
  def recordResult(self,siid,fid,result):
    values = {'action':'recordResult',
      'siid':siid,
      'fid':fid,
      'crash':result}
    xml = self.retrieveInfo(values)
    dom = parseString(xml)
    frid = self.getText(dom.getElementsByTagName("frid")[0].childNodes)
    return frid

 

Finally, we get to the main file which is responsible for reading the config file and driving the fuzzing run. This is the only file that is specific to the FOE fuzzer.

 

dffiac.py (177 lines)

import os
import shutil
import socket
import subprocess
import sys
import urllib2
import ConfigParser
import time

sys.path.append("libs")

from ZipUtil import ZipUtil
from PostHandler import PostHandler
from ActionHandler import ActionHandler

#This will track the fid, name and location of the file
class FileInfo:
  pass

#Convert the options in the config file to nested dictionaries
def parse_options(config):
  options = {}
  for section in config.sections():
    options[section] = {}
    for (option, value) in config.items(section):
      options[section][option] = value
  return options

#Create a local text file for logging
def openLog(options):
  localLogDir = options['logoptions']['log_dir']
  runName = options['runoptions']['run_id']
  timestamp = int(time.time())
  localLog = open(localLogDir + runName + '_' + str(timestamp) + '.txt', 'w')
  localLog.write("Starting run: " + runName + " at " + str(timestamp) + "\n")
  return localLog

#Close the local text file log
def closeLog(localLog):
  localLog.write("COMPLETE\n")
  localLog.close()

#Download the next file to be fuzzed
def getNextFile(fInfo, options, foe_options, localLog):
  u = urllib2.urlopen(options['runoptions']['web_server'] + fInfo.location + fInfo.name)
  localFile = open(foe_options['runoptions']['seedsdir'] + "\\" + fInfo.name, 'wb')
  localFile.write(u.read())
  localFile.close()
  localLog.write ('Created file: ' + foe_options['runoptions']['seedsdir'] + "\\" + fInfo.name + '\n')

#Store the results in a zip file
def createZip(outputDir,filename):
  zipTool = ZipUtil()
  zipTool.createZipFile(outputDir,filename)
  zipFile = open(filename,'rb')
  return zipFile

#Post the zip file to the server
def postZip(options,frid,rid,filename,zipFile):
  form = PostHandler(options['runoptions']['web_server'], options['runoptions']['upload_cgi'])
  form.add_form_vars('frid',frid)
  form.add_form_vars('rid',rid)
  form.append_file('fname',filename,zipFile)
  result = form.send_request()
  return result

if __name__ == "__main__":
  if (len(sys.argv) < 2):
    print "usage: %s <runconfig.cfg>" % sys.argv[0]
    sys.exit(1)

  #Read the dffiac config file
  configFile = sys.argv[1]
  if not os.path.exists(configFile):
    print "config file doesn't exist: %s" % configFile
    sys.exit(1)
  config = ConfigParser.SafeConfigParser()
  config.read(configFile)

  #Read the foe config file
  options = parse_options(config)
  config2 = ConfigParser.SafeConfigParser()
  config2.read (options['foeoptions']['config_location'])
  foe_options = parse_options(config2)

  #Set up logging
  localLog = openLog(options)

  #Configure the web server
  aHandler = ActionHandler(options, localLog)

  #Get the information for this run
  rid = options['runoptions']['run_id']
  (run_type,start_fid,end_fid) = aHandler.getRunInfo(rid)

  #Record server start
  hostName = socket.gethostname()
  hostIP = socket.gethostbyname(hostName)
  serverName = hostName + "_" + hostIP
  siid = aHandler.recordServerStart(rid, serverName)
  localLog.write("Starting as server instance: " + siid + "\n")

  #Get the first file to be processed
  fInfo = FileInfo()
  fInfo.fid = aHandler.getNextFid(rid,start_fid,run_type)
  localLog.flush()

  #loop until done
  while (int(fInfo.fid) <= int(end_fid)):
    #Get the location information for the current file
    aHandler.getFileInfo(rid,fInfo)

    #Download and store the file
    getNextFile(fInfo,options,foe_options,localLog)

    outputDir = foe_options['runoptions']['outputdir'] + "\\" + foe_options['runoptions']['runid']

    #Run fuzzer
    #Note: this reads a foe_location entry from [foeoptions], which the sample
    #dffiac.cfg shown earlier does not define, so add one before running
    exitCode = subprocess.call(options['foeoptions']['python_location'] + " " + options['foeoptions']['foe_location'] + " " + options['foeoptions']['config_location'], shell=True)

    #Check for completion of a successful run
    if exitCode != 0:
      localLog.write("Error running foe on fid " + fInfo.fid + "\n")
    else:
      dirList = os.listdir(outputDir)

      #Detect whether bugs were found
      if len(dirList) > 2:

        #Record the result in fuzz_records
        frid = aHandler.recordResult(siid,fInfo.fid,1)
        localLog.write("Recording frid: " + frid + "\n")

        #Store the results in a zip file
        filename = frid + "-" + fInfo.name + ".zip"
        file_path = os.path.join(os.getcwd(), filename)
        zipFile = createZip(outputDir,file_path)

        #Post the zip file back to the server
        result = postZip(options,frid,rid,filename,zipFile)
        zipFile.close()

        #Make sure the file got there OK
        if result.find("error") == -1:
          localLog.write("Results successfully uploaded.\n")
          os.remove(file_path)
        else:
          localLog.write("There was an error in the upload: " + result + "\n")

        localLog.write("Found bugs with " + fInfo.fid + "\n")
      else:
        #Record no bugs found in the directory
        aHandler.recordResult(siid,fInfo.fid,0)
        localLog.write("No bugs found with " + fInfo.fid + "\n")

    #The if len(dirlist) check on the results is complete
    #Erase files so that FOE starts clean on the next run
    os.remove(foe_options['runoptions']['seedsdir'] + "\\" + fInfo.name)
    shutil.rmtree(outputDir)
    localLog.flush()

    #Get the next FID
    fInfo.fid = aHandler.getNextFid(rid,fInfo.fid,run_type)

  #The while loop is complete
  #Record this run instance as being complete
  aHandler.recordRunComplete(siid)

  #Close the local file log
  closeLog(localLog)

 

This blog is only meant to describe how you can stand up a basic distributed fuzzing framework based on FOE fairly quickly in approximately 1,000 lines of code. The client-side code turned out to be 378 lines, my server-side action_handler CGI was 150 lines and the upload CGI was 72 lines of Perl. That is enough to get the script to run based on information from a database. With the remaining 400 lines, I created a CGI to display the status of my runs and a CGI to generate a run. You will also want to write a script to mirror the dffiac.cfg and FOE configuration file across machines. Over time, I expect that you would make this design more robust for your particular infrastructure and needs. You can also expand this infrastructure for your other fuzzers with some modifications to the main file. What I provide here is just enough to help you get started performing distributed fuzzing with a small amount of coding and the FOE fuzzer.

 

Permission for this blog entry is granted as CCplus, http://www.adobe.com/communities/guidelines/ccplus/commercialcode_plus_permission.html

 

Straight from the Source: SOURCE Boston

Karthik here from Adobe PSIRT. My colleague from the Adobe Acrobat team, Manish Pali, and I will be speaking next week at the SOURCE Boston conference. In our talk, we’ll cover some of the processes behind incident response at Adobe, including our security community outreach via the Microsoft Active Protections Program (MAPP), and automation strategies and solutions from the trenches for new and known vulnerability reports.

Demo alert! Manish is going to demo one of his tools for incident-triage automation—we’re hoping this and other aspects of the talk will benefit our friends on other incident response teams.

Please swing by our talk, if you’ll be at SOURCE Boston. We look forward to catching up in hallway conversations.

See you in Boston,

Karthik

Background on Security Bulletin APSB12-08

Today we released Security Bulletin APSB12-08 along with corresponding updates for Adobe Reader and Acrobat. We’d like to highlight a few changes we are making with today’s releases.

Rendering Flash (SWF) Content in Adobe Reader and Acrobat 9.5.1

First off, starting with the Adobe Reader and Acrobat 9.5.1 updates, Adobe Reader and Acrobat 9.x on Windows and Macintosh will use the Adobe Flash Player plugin version installed on the user’s system (rather than the Authplay component that ships with Adobe Reader and Acrobat) to render any Flash (SWF) content contained in PDF files. We added an Application Programming Interface (API) to both Adobe Reader/Acrobat and Flash Player to allow Adobe Reader/Acrobat to communicate directly with a Netscape Plugin Application Programming Interface (NPAPI) version of Flash Player installed on the user’s system. From a security perspective, this means that Adobe Reader/Acrobat 9.x users will no longer have to update Adobe Reader/Acrobat each time we make available an update for Flash Player. This will be particularly beneficial to customers in managed environments because fewer updates help reduce the overhead for IT administration.

If Adobe Reader or Acrobat 9.5.1 is installed on a system that does not have the NPAPI version of Flash Player installed and the user opens a PDF file that includes Flash (SWF) content, a dialog will prompt the user to download and install the latest Flash Player. (Browsers such as Firefox, Opera and Safari use the NPAPI version of Flash Player as opposed to the ActiveX version of Flash Player used by Internet Explorer. Chrome uses a bundled version of Flash Player, even if there is an NPAPI version of Flash Player installed on the system.)

We are currently working on integrating the same API into Adobe Reader and Acrobat X, and will follow up with another blog post once this functionality is available in version X.

Rendering 3D Content in PDF Files

We also changed the default behavior in Adobe Reader and Acrobat 9.5.1 to disable the rendering of 3D content. Since the majority of consumers do not typically open PDF files that include 3D content, and 3D content in untrusted documents has been a previous vector of attack, we have disabled this functionality by default starting with version 9.5.1. Users have the option to enable 3D content, but a Yellow Message Bar will flag potentially harmful documents in the event that untrusted documents attempt to render 3D content. IT administrators in managed environments will also have the option of turning this behavior off for trusted documents.

More information on the two changes to content rendering described above is available in the Adobe Reader and Acrobat 9.5.1 release notes.

Further Alignment of the Adobe Reader/Acrobat Update Cycle with Microsoft’s Model

In June 2009, we shipped our first quarterly security update for Adobe Reader and Acrobat. Since then, we have come a long way in putting mitigations into place that make Adobe Reader and Acrobat a less attractive attack target. Sandboxing Adobe Reader and Acrobat X, in particular, has led to better-than-expected results. Attackers have indicated through their target selection thus far that the extra effort required to attack version X is not currently worth it. Additionally, we have seen a lower volume of vulnerability reports overall against Adobe Reader and Adobe Acrobat. Given the shift in the threat landscape and the lower volume of vulnerability reports, we have revisited the decision to follow a strict quarterly release cycle.

After three years of shipping a security update once a quarter and announcing the date of the next update the same day we ship the current update, we are making a change. We are shifting to a model that more closely aligns with the familiar “Microsoft Patch Tuesday” cadence. We will continue to publish a prenotification three business days before we release a security update to Adobe Reader and Acrobat. We will continue to publish security updates on the second Tuesday of the month. We will continue to be flexible and respond “out of cycle” to urgent needs such as a zero-day attack. What we are discontinuing is the quarterly cadence and the pre-announcement of the next scheduled release date in the security bulletin for the previous release. We will publish updates to Adobe Reader and Acrobat as needed throughout the year to best address customer requirements and keep all of our users safe.
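The two dates in this cadence, the second Tuesday of a month and the prenotification three business days earlier, are easy to compute. The following is a minimal Python sketch for illustration only: the function names are hypothetical, it treats Monday through Friday as business days, and it ignores public holidays. Since updates will be published as needed, a given month may have no release at all.

```python
from datetime import date, timedelta


def second_tuesday(year: int, month: int) -> date:
    """Return the second Tuesday of the given month."""
    first = date(year, month, 1)
    # date.weekday(): Monday is 0, Tuesday is 1, ..., Sunday is 6.
    days_to_first_tuesday = (1 - first.weekday()) % 7
    return first + timedelta(days=days_to_first_tuesday + 7)


def business_days_before(day: date, count: int) -> date:
    """Step back `count` business days (Mon-Fri) from `day`.

    Assumption: public holidays are not taken into account.
    """
    current = day
    while count > 0:
        current -= timedelta(days=1)
        if current.weekday() < 5:  # skip Saturday (5) and Sunday (6)
            count -= 1
    return current


if __name__ == "__main__":
    release = second_tuesday(2012, 4)
    print("Release:", release)
    print("Prenotification:", business_days_before(release, 3))
```

For April 2012, this yields a release date of April 10 and a prenotification date of April 5.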

A Note on the Update Priority Ratings in APSB12-08

Finally, in today’s Security Bulletin, we rated Adobe Reader and Acrobat 9.5.1 for Windows as a “Priority 1” update, while Adobe Reader and Acrobat X (10.1.2) was rated a “Priority 2” update. This was an interesting decision, and we thought we would provide some background information: Although there are no exploits in the wild targeting any of the vulnerabilities addressed in Adobe Reader 9.5.1, Adobe Reader 9.x continues to be a target for attackers, so, for users who cannot update to Adobe Reader X, we feel that urgently updating Adobe Reader 9.x remains a must to stay ahead of potential attacks.

Since the release of Adobe Reader X, Protected Mode mitigations (or the Protected View mitigations in Adobe Acrobat X version 10.1 and later) continue to be the best way to block potentially malicious behavior in PDF files. Therefore, a “Priority 2” designation is appropriate for the Adobe Reader X and Acrobat X 10.1.2 updates. Adobe Reader and Acrobat for Macintosh and Linux have not historically been a target of attacks, and therefore are also assigned a “Priority 2.”

Presenting “Malware Classifier” Tool

Hi folks,

Karthik here from Adobe PSIRT. Part of what we do at PSIRT is respond to security incidents. Sometimes this involves analyzing malware.  To make life easier, I wrote a Python tool for quick malware triage for our team. I’ve since decided to make this tool, called “Adobe Malware Classifier,” available to other first responders (malware analysts, IT admins and security researchers of any stripe) as an open-source tool, since you might find it equally helpful.

Malware Classifier uses machine learning algorithms to classify Win32 binaries – EXEs and DLLs – into three classes: 0 for “clean,” 1 for “malicious,” or “UNKNOWN.” The tool extracts seven key features from a binary, feeds them to one or all of the four classifiers, and presents its classification results.

The tool was developed using models resulting from running the J48, J48 Graft, PART, and Ridor machine-learning algorithms on a data set of approximately 100,000 malicious programs and 16,000 clean programs.
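To illustrate how several classifiers can produce a three-way result, here is a minimal Python sketch. It is not the tool's source code. The rule that a unanimous vote yields 0 or 1 and any disagreement yields "UNKNOWN" is an assumption made for this example, and the feature names and the two stand-in classifiers are hypothetical placeholders rather than the real features or models.

```python
from typing import Callable, Dict, List, Union

Features = Dict[str, int]
# A classifier maps extracted features to 0 (clean) or 1 (malicious).
Classifier = Callable[[Features], int]


def combine_verdicts(features: Features,
                     classifiers: List[Classifier]) -> Union[int, str]:
    """Return 0 or 1 if every classifier agrees, otherwise "UNKNOWN".

    Assumption: unanimous agreement is required for a definite class.
    """
    verdicts = {classify(features) for classify in classifiers}
    if len(verdicts) == 1:
        return verdicts.pop()
    return "UNKNOWN"


# Hypothetical stand-ins for trained models; real models would be
# decision trees or rule sets learned from labeled binaries.
def rule_a(features: Features) -> int:
    return 1 if features["feature_a"] == 0 else 0


def rule_b(features: Features) -> int:
    return 1 if features["feature_a"] == 0 and features["feature_b"] == 0 else 0


if __name__ == "__main__":
    sample = {"feature_a": 0, "feature_b": 7}
    print(combine_verdicts(sample, [rule_a, rule_b]))
```

Requiring agreement trades coverage for confidence: a file the models disagree on is reported as "UNKNOWN" and left for manual analysis instead of being given a possibly wrong label.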

Malware Classifier is available at Open @ Adobe.

I will be speaking about the research behind the tool at Infosec Southwest 2012 in Austin, TX, on April 1. If you’re going to be there, I look forward to meeting up and discussing product security and secure engineering at Adobe.

An Update for the Flash Player Updater

Peleus here with the second major 2012 security announcement for Flash Player. Today’s release of Flash Player contains a new background updater. This new background updater will allow Windows users to choose an automatic update option for future Flash Player updates.

If you read this September 2011 CSIS report, then you saw that 99.8 percent of malware installs through exploit kits are targeting out-of-date software installations. This point was reiterated recently in volume 11 of the Microsoft Security Intelligence Report. Also, attackers have been taking advantage of users trying to manually search for Flash Player updates by buying ads on search engines pretending to be legitimate Flash Player download sites. Improving the update process is probably the single most important challenge we can tackle for our customers at this time.

Overview of the background updater design

A full technical description of the new background updater design is available on DevNet, but here are the highlights:

After a successful installation of Adobe Flash Player 11.2, users will be presented with a dialog box to choose an update method. The following three update options are available to users:

  • Install updates automatically when available (recommended)
  • Notify me when updates are available
  • Never check for updates (not recommended)

For our initial release, we have set the new background updater to check for updates once an hour until it gets a response from Adobe. If the response says there is no new update, then it will wait 24 hours before checking again. We accomplish this through the Windows Task Scheduler to avoid running a background service on the system. If you are running multiple browsers on your system, the background updater will update every browser. This will solve the problem of end-users having to update Flash Player for Internet Explorer separately from Flash Player for their other open-source browsers. Google Chrome users, who have the integrated Flash Player, will still be updated through the Chrome update system.
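The check schedule described above (retry hourly until Adobe responds, then wait 24 hours if there is nothing new) can be sketched as a small Python function. This is only an illustration of the timing rule, not the updater's actual code. The function and constant names are hypothetical, and returning None when an update is available is an assumption, since what gets scheduled after an install is not covered here.

```python
from datetime import datetime, timedelta
from typing import Optional

RETRY_INTERVAL = timedelta(hours=1)   # no response from the server yet
IDLE_INTERVAL = timedelta(hours=24)   # server answered: nothing new


def next_check(now: datetime,
               got_response: bool,
               update_available: bool = False) -> Optional[datetime]:
    """Return when the next update check should run.

    Returns None when an update is available (assumption: the caller
    installs the update instead of scheduling another check).
    """
    if not got_response:
        return now + RETRY_INTERVAL
    if update_available:
        return None
    return now + IDLE_INTERVAL


if __name__ == "__main__":
    now = datetime(2012, 3, 28, 9, 0)
    print(next_check(now, got_response=False))  # one hour later
    print(next_check(now, got_response=True))   # 24 hours later
```

In a task-scheduler-based design, a value like this would be used to register the next run, so no process has to stay resident between checks.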

Additionally, the user can change their update preferences at any time via the Flash Player Settings Manager, which for Windows users can be accessed via the Control Panel > Flash Player. In the Flash Player Settings Manager, the update preferences can be found and selected in the “Advanced” tab under “Updates.”

Organizations with managed environments do have the capability to disable the background updater feature through the Flash Player mms.cfg file. Also, those users who want to be notified of updates and do not want to be silently updated can continue to use the existing update mechanism. Lastly, the background updater feature is currently Windows-only for Windows XP and newer operating systems. A Mac version is currently under development.

I do want to note that we are not promising that all Flash Player updates going forward will be completely silent. We will be making the decision to silently install on a case-by-case basis. For instance, any update that changes the default settings of Flash Player will require confirmation from end-users even if they have already agreed to allowing background updates. Today’s update is an example of where confirmation would be required since we are changing how updates get applied to the user’s machine. However, we could apply a zero-day patch without requiring end-user confirmation, so long as the user has agreed to receiving background updates. Adobe will also continue to release feature-bearing releases that will trigger an update notification to users that highlights new and exciting features in Flash Player.

The new background updater will provide a better experience for our customers, and it will allow us to more rapidly respond to zero-day attacks. This model for updating users is similar to the Google Chrome update experience, and Google has had great success with this approach. We are hoping to have similar success.

One last note

Since Flash Player 11 was first released in September 2011, we have continued to maintain Flash Player 10.3 with security updates for users who cannot update to the current version of Flash Player. In support of Microsoft’s initiative to get the world to drop Internet Explorer 6 and upgrade to a newer version of Internet Explorer for a safer browsing experience, Adobe will be dropping support for Internet Explorer 6 starting with today’s release of Flash Player 10.3.

While we will no longer include testing on Internet Explorer 6 in our certification process and strongly encourage users to upgrade to the newest version of Internet Explorer, we will not block the installation of newer versions of Flash Player 10.3 on systems running Internet Explorer 6 and expect functionality on those systems to remain unchanged.

CanSecWest 2012

The team and I are about to head off to CanSecWest. While I have been attending CanSecWest for several years, this year will be a unique experience for me. During my talk, I will demo an open-source tool I just released, called Adobe SWF Investigator. The tool can be useful for developers, quality engineers and security professionals for analyzing SWF applications. It has been a pet project of mine for some time, and I decided to share it with a broader audience.

Within my current role, I have to look at all aspects of SWF applications from cross-site scripting issues to binary analysis. Therefore, the tool includes capabilities to perform everything from testing cross-site scripting to viewing the individual SWF tags within the file format. I am hoping that by releasing the tool as an open-source ActionScript application, it will encourage all ActionScript developers to learn more about security. The tool is designed to be an extensible framework everyone can build upon or modify. More information on the tool can be found in my DevNet article.

In addition to demonstrating the tool, I will also be talking about Advanced Persistent Response. Adobe has been the focus of hackers for some time, and I plan to discuss what we have learned and observed in the process of responding to those threats. My talk will be on Wednesday at 3:30pm, if you are interested. When I am not speaking, you can probably find me and the Adobe team either at the Adobe table or milling around the pwn2own contest for no particular reason. Please feel free to come by and talk with us. See you there!

When Do I Need to Apply This Update – Adding Priority Ratings to Adobe Security Bulletins

How urgently do I need to apply this update? That’s the most common question we get from customers in managed environments when we release a security bulletin. Our current severity ratings do a good job of objectively describing the worst-case scenario involved with a security issue, but they do not necessarily tell a customer all they need to know about the risk and priority of a particular security update. Not all critical security updates are created equal. For example, if a Flash Player issue is being exploited in the wild, the update to resolve the vulnerability deserves a much higher priority than, say, a patch for a critical vulnerability in Photoshop. After all, Flash Player is a browser-based plugin with hundreds of millions of customers. Photoshop, on the other hand, has a much smaller customer base, and exploiting it successfully would require significant social engineering. So we started to wonder, how can we communicate the priority of our security updates more effectively?

We want to be as simple and direct as possible about the real-world risk associated with the vulnerabilities addressed in any given security update, and we decided that adopting a separate priority ranking scheme was the best way to accomplish this. Here is the priority scheme we are planning to use to rank security updates in the future:

  • Priority 1: This update resolves vulnerabilities being targeted, or which have a higher risk of being targeted, by exploit(s) in the wild for a given product version and platform. Adobe recommends administrators install the update as soon as possible (for instance, within 72 hours).
  • Priority 2: This update resolves vulnerabilities in a product that has historically been at elevated risk. There are currently no known exploits. Based on previous experience, we do not anticipate exploits are imminent. As a best practice, Adobe recommends administrators install the update soon (for instance, within 30 days).
  • Priority 3: This update resolves vulnerabilities in a product that has historically not been a target for attackers. Adobe recommends administrators install the update at their discretion.
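For administrators who track patch deadlines in a script, the example install windows in the scheme above can be turned into dates. This is a minimal Python sketch for illustration. The names are hypothetical, and the 72-hour and 30-day windows are the "for instance" figures from the scheme, not hard requirements.

```python
from datetime import date, timedelta
from typing import Optional

# Example install windows from the priority scheme; None means
# "at the administrator's discretion" (no suggested deadline).
INSTALL_WINDOW = {
    1: timedelta(days=3),    # "within 72 hours"
    2: timedelta(days=30),   # "within 30 days"
    3: None,
}


def install_by(priority: int, released: date) -> Optional[date]:
    """Return the suggested install-by date, or None for Priority 3.

    Raises KeyError for a priority outside 1-3.
    """
    window = INSTALL_WINDOW[priority]
    if window is None:
        return None
    return released + window


if __name__ == "__main__":
    released = date(2012, 4, 10)
    for priority in (1, 2, 3):
        print(priority, install_by(priority, released))
```

The result is only a suggested date. The priority rating complements the severity rating and does not replace it.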

We’re going to base our priority ranking on historical attack patterns for the relevant product, the type of vulnerability, the platform(s) affected, and any potential mitigations that may be in place. This is a new system, so we may find that adjustments will need to be made. We also believe that continuing to use the current severity ratings makes sense, since this information has been helpful to many customers, so you can expect to see both ratings being used in future security bulletins.

We look forward to your feedback. Our goal is to help our customers in managed environments prioritize updates, so we’ll see if this new priority ranking scheme works to accomplish that! As we have been emphasizing a lot recently, the majority of attacks we are seeing are exploiting software installations that are not up-to-date with the latest security updates, so as always we recommend that users keep their software installations updated with the latest version of Adobe software.

RSA Conference Schedule

Brad Arkin here. RSA Conference is upon us once again. There are some exciting talks and events on the calendar, but I’m looking forward to the informal “hallway track” the most.

In the days leading up to RSA Conference, everyone in the industry seems to be reminding each other of the sessions you “absolutely should not miss.” Here’s my pitch—and a summary of where you can find me and members of the Adobe Secure Software Engineering Team at RSA Conference:

MONDAY, FEBRUARY 27, 2012

On Monday, February 27, you’ll find me at the “Improving Application Security Seminar” (SEM-002), along with experts from Symantec, Cigital, Fortify Software, HP, Microsoft, and Veracode. This full-day seminar for delegates will kick off at 8:30 a.m. in Room 305 at the Moscone Center.

In the evening, please join the Adobe Security Team from 6:30 to 9:30 p.m. at Roe Restaurant (10 Hawthorne Street, two blocks from the Moscone Center) for food, drinks, and a lively discussion on the current challenges facing the security industry. This is a limited-capacity event, so please register as soon as possible to save your spot.

TUESDAY, FEBRUARY 28, 2012

Join Adobe’s Kyle Randolph and other participants from EMC, Cigital, Symantec and Microsoft for a panel discussion titled “Making Sense of Software Security Advice: Best vs. Practiced Practices” (ASEC-106) at 1:10 p.m. on Tuesday, February 28, in Room 302. The panel, moderated by EMC’s Reeny Sondhi, will help you make sense of the different software security advice available and discuss how to apply it to your work.

WEDNESDAY, FEBRUARY 29, 2012

If you are an early riser, join me at 8:00 a.m. on Wednesday, February 29, in Room 302 for a panel discussion moderated by Chenxi Wang from Forrester, titled “War Stories: The Good, Bad and the Ugly of Application Security Programs” (ASEC-201). I’ll be participating on the panel along with Doug Cavit from Microsoft and James Routh from JPMorgan Chase & Co. We look forward to your questions and comments!

Afterwards, don’t miss my talk “Never Waste a Crisis – Necessity Drives Software Security Improvements” (ASEC-203), which will take place from 10:40-11:30 a.m. in Room 302. I’ll share some general lessons on both how to prepare for a crisis and what to do once it arrives. And I’ll provide step-by-step instruction on what to do through every phase of a crisis with an eye towards promoting the priority of software security activities throughout.

THURSDAY, MARCH 1, 2012

On Thursday, March 1, I’ll be moderating a SAFECode panel discussion titled “What Motivated My Company to Invest in a Secure Development Program?” (ASEC-301). Other panelists include Steven Lipner from Microsoft, Gunter Bitz from SAP, Janne Uusilehto from Nokia, and Gary Phillips from Symantec. Don’t miss what promises to be a lively discussion from 8:00-9:10 a.m. in Room 302!

We hope to see you at RSA Conference!

Buzz from Kaspersky SAS 2012

Hello world! Karthik here from Adobe Product Security Incident Response Team (PSIRT) engineering. Last week, I got to attend the Kaspersky Security Analyst Summit 2012 in Cancun, which was a melting pot of great security research and ideas. It was wonderful to meet researchers from industry and government and discuss Adobe’s security activities, such as product security incident response and product vulnerability sharing in the Microsoft Active Protections Program (MAPP). Thanks for listening and sharing your ideas. Let’s keep the conversation going.

On a lighter note, Team Adobe—consisting of Brad Arkin, Domingo Montanaro (general manager at iSIGHT Partners Brazil) and me—bagged the “Security Jeopardy” competition at the event on Friday evening. The winning answer only our team could come up with, ironically: “What is ‘zero knowledge.’”

[Photo: SAS 2012 Security Jeopardy Winners]

Until the next conference!

Karthik

Flash Player Sandboxing is Coming to Firefox

Peleus here. In December of 2010, I wrote a blog post describing the first steps towards sandboxing Flash Player within Google Chrome. In the blog, I stated that the Flash Player team would explore bringing sandboxing technology to other browsers. We then spent 2011 buried deep within Adobe laying the groundwork for several new security innovations.

Today, Adobe has launched a public beta of our new Flash Player sandbox (aka “Protected Mode”) for the Firefox browser. The design of this sandbox is similar to what Adobe delivered with Adobe Reader X Protected Mode and follows the same Practical Windows Sandboxing approach. Like the Adobe Reader X sandbox, Flash Player will establish a low integrity, highly restricted process that must communicate through a broker to limit its privileged activities. The sandboxed process is restricted with the same job limits and privilege restrictions as the Adobe Reader Protected Mode implementation. Adobe Flash Player Protected Mode for Firefox 4.0 or later will be supported on both Windows Vista and Windows 7. We would like to thank the Mozilla team for assisting us with some of the more challenging browser integration bugs. For Flash Player, this is the next evolutionary step in protecting our customers.

Sandboxing technology has proven very effective in protecting users by increasing the cost and complexity of authoring effective exploits. For example, since Adobe Reader X launched in November 2010, we have not seen a single successful exploit in the wild against it. We hope to see similar results with the Flash Player sandbox for Firefox once the final version is released later this year. In the meantime, please help us get these protections out to end-users as fast as possible by volunteering to download our beta and help test. Information on known bugs, configuration options and other details can be found on Adobe Labs in the “Getting Started” section.

P.S.: I will be speaking at CanSecWest on this and other exciting topics. I hope to see everyone there!