Brave malware analysts at Yelp have spent a lot of time looking at the digital forensics from potentially infected macOS systems, gathered using our open source project, OSXCollector.

Early on, we automated parts of the analysis process, augmenting the initial set of digital forensics collected from the machines with information gathered from threat intelligence APIs and internal blacklists. This involved identifying potentially suspicious domains, URLs, and file hashes. However, our approach to the analysis still required a certain degree of configuration and manual maintenance, which was tedious for the malware response team.

In this blog post I will explain how we identified and automated the manual effort needed to analyze digital forensics collected from potentially infected macOS machines. I will also introduce AMIRA: Automated Malware Incident Response and Analysis, which we have open sourced on GitHub.

AMIRA logo

(Don’t) Repeat Yourself

Our traditional malware incident response process started with taking a machine we suspected to be infected off the network and collecting digital forensics by running OSXCollector. This task was performed by our incredible HelpDesk ninjas, as they had the best reach to users across our different office locations.

Once the OSXCollector output was obtained, the HelpDesk engineer had to attach it to the malware case in our incident response platform so that the analysts from the security team could take a look and assess the risk, e.g. confirm that the machine is infected. The analysts would execute the Analyze Filter to get a full overview of what happened on the machine and determine how the malware possibly got there. The Analyze Filter checks various threat intel sources, like VirusTotal and OpenDNS Investigate, as well as internal blacklists and whitelists for known bad domains and file hashes (see our previous blog post for more information).

The time it takes to run the whole suite of Output Filters depends on the size of the original OSXCollector output file and can range from a few minutes to several hours. During this time our engineers usually switched to other tasks and periodically checked the status of the analysis. If there were several malware cases to investigate, the analyses had to be run sequentially so as not to exhaust the quota limits on the threat intel APIs. Worse still, the analysis was easily interrupted if the machine the analyst used to run the Output Filters went to sleep or was otherwise disconnected from the network. This was not an efficient way of analyzing malware alerts.

Automating repetitive tasks

Enter automation: we enabled faster forensic collection and analysis by scripting repetitive tasks, which also leaves fewer opportunities to make a mistake. We went ahead and turned OSXCollector and its awesome Output Filters into AMIRA: Automated Malware Incident Response and Analysis.

AMIRA is a service that turns the forensic information gathered by OSXCollector into an actionable response plan, suggesting the infection source as well as suspicious files and domains that require a closer look. Furthermore, we integrated AMIRA with our incident response platform, so that the analyst needs as little interaction as possible to follow the investigation.

Our malware responders do not need to spend any more time configuring and running the analysis filters on their own. AMIRA takes care of it and publishes a neat report summarizing the findings, so they can simply grab a cup of coffee and read it like a morning newspaper.

AMIRA inner workings

The service uses the S3 event notifications to trigger the analysis. We have configured a bucket for the OSXCollector output files so that when a file is added there the notification is sent to an SQS queue (AmiraS3EventNotifications in the picture below). AMIRA periodically checks the queue for new messages, and upon receiving one it will fetch the file from the S3 bucket. Then, it will run the Analyze Filter on the OSXCollector output file.
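The step where a queue message is turned into a file to fetch can be sketched as follows. This is a minimal illustration that uses only the Python standard library, not AMIRA's actual code: the function name, the bucket name, and the object key are made up, and the sample message is trimmed to the fields the sketch reads. It assumes the SQS message body carries the S3 event notification JSON directly.

```python
import json
from urllib.parse import unquote_plus


def extract_created_objects(message_body):
    """Return (bucket, key) pairs for every object-created record found in
    an S3 event notification delivered as an SQS message body."""
    event = json.loads(message_body)
    created = []
    # Messages without "Records" (for example the test message S3 sends
    # when notifications are configured) yield an empty result.
    for record in event.get("Records", []):
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Object keys are URL-encoded in the notification payload.
        key = unquote_plus(record["s3"]["object"]["key"])
        created.append((bucket, key))
    return created


# Hypothetical message, trimmed to the fields used above.
sample_body = json.dumps({
    "Records": [{
        "eventSource": "aws:s3",
        "eventName": "ObjectCreated:Put",
        "s3": {
            "bucket": {"name": "example-osxcollector-output"},
            "object": {"key": "example-osxcollector-output.tar.gz"},
        },
    }]
})
print(extract_created_objects(sample_body))
```

Each returned pair identifies one OSXCollector output file that would then be downloaded and passed to the Analyze Filter.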

The Analyze Filter runs all the filters contained in the OSXCollector Output Filters package sequentially. Some of them communicate with external resources, like domain and hash blacklists (or whitelists) and threat intel APIs, e.g. VirusTotal, OpenDNS Investigate or ShadowServer. The original OSXCollector output is extended with all of this information, and the very last filter run by the Analyze Filter summarizes all of the findings in a human-readable form. After the filter finishes running, the results of the analysis are uploaded to the Analysis Results S3 bucket.
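The idea of running filters one after another, each extending the records it passes along, can be illustrated with a small sketch. It is not the OSXCollector Output Filters implementation: the filter factory, the `flagged_by` field, and the `domain` and `sha2` field names are hypothetical, and the only assumption about the input is that it consists of one JSON object per line.

```python
import json


def parse_lines(lines):
    """Decode line-delimited JSON: one record per non-empty line."""
    for line in lines:
        if line.strip():
            yield json.loads(line)


def make_blacklist_filter(field, bad_values):
    """Build a hypothetical filter that tags records whose `field` value
    appears in `bad_values`, leaving all other data untouched."""
    def apply_filter(records):
        for record in records:
            if record.get(field) in bad_values:
                # Extend the record instead of replacing it.
                flagged_by = record.get("flagged_by", []) + [field]
                record = dict(record, flagged_by=flagged_by)
            yield record
    return apply_filter


def run_filters(records, filters):
    """Feed the output of each filter into the next one, in order."""
    for apply_filter in filters:
        records = apply_filter(records)
    return list(records)


def summarize(records):
    """Stand-in for the final, human-readable summary stage."""
    flagged = [r for r in records if r.get("flagged_by")]
    return "{} of {} records flagged".format(len(flagged), len(records))


sample_lines = [
    '{"domain": "example.com"}',
    '{"domain": "evil.example", "sha2": "abc123"}',
    '{"file_path": "/tmp/a", "sha2": "def456"}',
]
filters = [
    make_blacklist_filter("domain", {"evil.example"}),
    make_blacklist_filter("sha2", {"abc123", "def456"}),
]
records = run_filters(parse_lines(sample_lines), filters)
print(summarize(records))
```

Because every stage consumes and yields records lazily, adding another lookup only means appending one more entry to `filters`.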

The overview of the whole process and the system components involved in it are depicted below:

Running AMIRA

AMIRA is written in Python, so the only prerequisites you will need to run it on your system are Python 2.7 and pip. You will also need an S3 bucket and an SQS queue configured to receive notifications about new objects created in the bucket. A second S3 bucket is also necessary to store the results of the analysis. Check out the AMIRA GitHub repository to read more about the necessary configuration and how to run the automated analysis.
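As a rough illustration of the bucket-to-queue wiring, the sketch below builds the notification configuration document that S3 expects when object-created events should go to an SQS queue. The queue ARN, account ID, and key suffix are placeholders, and the helper name is invented for this example; the AMIRA repository remains the reference for the real setup.

```python
import json


def build_notification_config(queue_arn, suffix=None):
    """Build an S3 bucket notification configuration that sends every
    object-created event to the given SQS queue."""
    queue_config = {
        "QueueArn": queue_arn,
        "Events": ["s3:ObjectCreated:*"],
    }
    if suffix:
        # Optionally restrict notifications to keys ending with `suffix`.
        queue_config["Filter"] = {
            "Key": {"FilterRules": [{"Name": "suffix", "Value": suffix}]}
        }
    return {"QueueConfigurations": [queue_config]}


# Placeholder ARN; substitute the ARN of your own queue.
config = build_notification_config(
    "arn:aws:sqs:us-west-2:123456789012:AmiraS3EventNotifications",
    suffix=".tar.gz",
)
print(json.dumps(config, indent=2))
```

The printed JSON could be saved to a file and applied with `aws s3api put-bucket-notification-configuration`. The queue's access policy also has to allow S3 to send messages to it, otherwise the configuration is rejected.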

Integrating AMIRA with remote forensics collection

AMIRA relieves analysts of all responsibility for configuring the Output Filters, but there is still room to streamline the whole malware response process. The biggest delays and the most manual effort occur during the initial part of the response process, when we need to physically collect the machine, run OSXCollector, and then upload the results to the S3 bucket to trigger AMIRA via the S3 event notifications mechanism.

To cut back on that time we have written a small shell script that executes OSXCollector and uploads its output to the S3 bucket:

#!/bin/bash
# based on http://tmont.com/blargh/2014/1/uploading-to-s3-in-bash

# set "bash strict mode" from:
# http://redsymbol.net/articles/unofficial-bash-strict-mode/
set -euo pipefail

# Arguments: FILE BUCKET S3_KEY S3_SECRET
file="$1"
bucket="$2"
s3Key="$3"
s3Secret="$4"
echo "Uploading file $file to bucket $bucket"
resource="/${bucket}/${file}"
contentType="application/x-compressed-tar"
# Force the C locale so day and month names are always in English
dateValue=$(LC_ALL=C date -u +"%a, %d %b %Y %T GMT")
# Build the string to sign with real newlines instead of relying on `echo -e`
stringToSign=$(printf 'PUT\n\n%s\n%s\n%s' "$contentType" "$dateValue" "$resource")
# HMAC-SHA1 signature (AWS Signature Version 2), base64-encoded
signature=$(printf '%s' "$stringToSign" | openssl sha1 -hmac "$s3Secret" -binary | base64)

curl -X PUT -T "${file}" \
    -H "Host: ${bucket}.s3.amazonaws.com" \
    -H "Date: ${dateValue}" \
    -H "Content-Type: ${contentType}" \
    -H "Authorization: AWS ${s3Key}:${signature}" \
    "https://${bucket}.s3.amazonaws.com/${file}" | cat

We deploy this script via our enterprise asset management software on any individual machine that is under investigation. This saves malware responders and HelpDesk engineers a lot of hassle compared to the previous process, which required chasing down the employees and acquiring the machine to launch OSXCollector (or, for remote employees, even shipping the laptop in from overseas!).
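For readers who want to see what the `openssl` pipeline in the upload script computes, here is the same signature calculation as a Python sketch. It mirrors the string-to-sign layout used by the script (HTTP verb, empty Content-MD5, content type, date, resource path); the function name and all argument values are made up for illustration.

```python
import base64
import hashlib
import hmac


def sign_s3_put(secret, bucket, key, content_type, date_value):
    """Return the base64-encoded HMAC-SHA1 signature for a PUT request,
    using the same string-to-sign layout as the shell script."""
    string_to_sign = "PUT\n\n{}\n{}\n/{}/{}".format(
        content_type, date_value, bucket, key
    )
    digest = hmac.new(
        secret.encode("utf-8"),
        string_to_sign.encode("utf-8"),
        hashlib.sha1,
    ).digest()
    return base64.b64encode(digest).decode("ascii")


# Placeholder values; never hard-code real credentials.
signature = sign_s3_put(
    secret="example-secret",
    bucket="example-osxcollector-output",
    key="example-osxcollector-output.tar.gz",
    content_type="application/x-compressed-tar",
    date_value="Wed, 21 Dec 2016 18:49:39 GMT",
)
print("Authorization: AWS EXAMPLEKEYID:" + signature)
```

The date passed in has to match the `Date` header of the request exactly, since it is part of the signed string.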

Integrating with the incident response platform

On the other hand, we didn’t want to require our analysts to fetch the analysis results from the S3 bucket on their own each time an analysis completed. First of all, they would have to monitor the bucket for new files themselves, or set up S3 event notifications for the Analysis Results bucket as well. Second, they would need to get the actual results summary file using something like the AWS Command Line Interface, which means they would need to manage AWS credentials on their machines - an additional setup step we wanted to avoid by introducing AMIRA.

We integrated AMIRA with our incident response platform. Each time an analysis finishes, the original OSXCollector output as well as the analysis results are attached to the related malware incident response case. This keeps things neat and clean, as the malware analysts do not have to leave the incident response environment to fetch forensics analysis results. Another few minutes saved and another win for automation.

Adding up all these precious seconds saved

AMIRA helped us to save a lot of time during our malware incident response. In some cases it cut our time in responding to an incident from several hours to just minutes. It also offloaded much of the work from our HelpDesk team related to physical acquisition of the machine for forensics collection. Additionally, we can proactively run OSXCollector on the machines we suspect to be infected more often, as the cost of collecting and analyzing digital forensics decreased so dramatically.

Thanks to all this automation, the incident response team members can focus on what they excel at: finding unusual patterns and novel ways that malware sneaks onto our corporate infrastructure.

Become an Information Security Engineer at Yelp

If you're interested in building tools like AMIRA that help us secure Yelp and its users, apply to become an Information Security Engineer.
