draft: Introduction to writing capa rules

Writing capa rules should be fun! Here's a great way to get started.

Getting Started

Prerequisites

Only two things are required to write a strong capa rule: a local installation of capa and a text editor of your choice.

The capa repo includes an installation instructions document.

Regardless of installation method, it is recommended to have the Python scripts in the scripts directory of the capa repo handy. This is the case by default if you follow the third installation method described in the documentation.

Although not a requirement, you may wish to test your new rules against your test sample or a corpus of malware samples. There is a sister repository that houses test files for capa. You can download these files for testing locally using the bulk-process.py script. WARNING: These files are live malware samples, treat them with care.

Capa Basics

Capa rules are structured YAML files. Each rule must be placed in its own file. The contents of a rule are nested under the rule key at the beginning of the file, where the meta and features keys hold the rule metadata and rule logic respectively.
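As a rough illustration of the layout described above, the following Python sketch models a rule as nested dictionaries that mirror the YAML nesting. The rule-logic key is spelled features in capa rule files. Every value shown (rule name, author, hash placeholder, and the two features) is a hypothetical placeholder, and the dictionary is only a stand-in for a real YAML file.

```python
# Sketch of the nesting used by a capa rule file, modelled as Python dicts.
# All values are hypothetical placeholders, not a real rule.
example_rule = {
    "rule": {
        "meta": {
            "name": "example rule name",
            "namespace": "c2/shell",
            "author": "analyst@example.com",
            "scope": "function",
            "examples": ["<md5 of a sample>"],
        },
        "features": [
            # Logical declaration with the features it applies to nested below.
            {"and": [{"api": "CreateProcess"}, {"string": "cmd.exe"}]},
        ],
    }
}


def top_level_sections(rule_doc):
    """Return the keys nested directly under the `rule` key, sorted."""
    return sorted(rule_doc["rule"].keys())


print(top_level_sections(example_rule))
```

The point of the sketch is only the shape: one rule key at the top, with metadata and logic kept in separate sections beneath it.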

Capa starts by extracting a number of features from a given file. Capa rules are then applied to the list of features generated for the file. Extracted features are split into 3 distinct groups.

Features describing the file as a whole (PE import/exports, PE sections, raw bytes in file, etc.).
Features describing a given function (api calls, instruction mnemonics, strings, offsets, etc.).
Features describing a basic block of code, similar to function features but more granular.

The rule format is described in depth here.

Writing a Rule

Rule Structure

The following keys are required under meta for each rule:

name: The name of the rule that will be displayed in capa output
namespace: This should essentially be the file path from the root of the capa-rules repo to your file. If you created a new rule that will be placed in the capa-rules/c2/shell/ directory, then the namespace of the rule is c2/shell.
author: Your name or email address.
scope: This should be either file, function, or basic block. It tells capa where within the extracted features to apply the rule logic: over the entire file, within any given function, or within a single basic block of code.
examples: One or more MD5 hashes of a sample that this rule should fire on. HINT: You can reference specific function addresses or reference files in the capa-testfiles repo by name. View the documentation linked above for details.
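The required keys and the namespace convention above can be sketched as two small checks. This is a minimal illustration and not part of capa: the helper names, the default repo_root value, and the file name my-rule.yml are assumptions made for the example.

```python
from pathlib import PurePosixPath

# Required meta keys, taken from the list above.
REQUIRED_META_KEYS = ("name", "namespace", "author", "scope", "examples")


def missing_meta_keys(meta):
    """Return the required keys that are absent from a rule's meta mapping."""
    return [key for key in REQUIRED_META_KEYS if key not in meta]


def namespace_from_path(rule_path, repo_root="capa-rules"):
    """Derive the namespace from a rule file's path relative to the repo root.

    Hypothetical helper: assumes POSIX-style paths that start with repo_root.
    """
    relative = PurePosixPath(rule_path).relative_to(repo_root)
    return str(relative.parent)


print(missing_meta_keys({"name": "example rule", "scope": "function"}))
print(namespace_from_path("capa-rules/c2/shell/my-rule.yml"))
```

For a rule stored at capa-rules/c2/shell/my-rule.yml, the second helper yields c2/shell, matching the example given for the namespace key.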

There are numerous other optional keys that can be placed in meta. A few popular ones are below:

att&ck: A mapping to a MITRE ATT&CK technique.
references: Any URLs or other resources that can provide further information on the technique this capa rule detects.
mbc: A mapping to a Malware Behavior Catalogue technique.

The features section of the capa rule contains the rule logic. This section starts with logical declarations about the features nested under them. The logic format is described well here; to avoid duplication, we won't discuss it further in this document. The next portion of the documentation describes each available feature and how to use it.

Be sure to only use features that apply to the scope set for the rule. For example, don't use the api feature if the scope is set to file.
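One way to picture this constraint is a lookup from scope to the feature names allowed there. The mapping below is a partial, illustrative assumption built only from the examples in this document; it is not capa's authoritative list, so consult the rule format documentation for the real one.

```python
# Illustrative, partial mapping from scope to feature names, based only on the
# examples given in this document. This is NOT capa's authoritative list.
FEATURES_BY_SCOPE = {
    "file": {"import", "export", "section"},
    "function": {"api", "mnemonic", "string", "offset"},
}
# Basic block features are described as similar to function features.
FEATURES_BY_SCOPE["basic block"] = set(FEATURES_BY_SCOPE["function"])


def out_of_scope_features(scope, used_features):
    """Return the used feature names that the sketch does not allow at `scope`."""
    allowed = FEATURES_BY_SCOPE[scope]
    return sorted(name for name in used_features if name not in allowed)


# The example from the text: an api feature in a file-scoped rule is flagged.
print(out_of_scope_features("file", ["api", "import"]))
```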

Rule Logic

You may ask yourself, "but how do I know what features my file has?" This is where the show-features.py script comes in handy. You can provide your sample to this script and it will output a long list of features extracted by capa. Say you have a function of interest already picked out from your analysis efforts. You can search for the address of that function within the show-features output to understand what features capa was able to extract. Rule writing should be straightforward at this point: pick the features you'd like to detect and put them into your new rule!
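Searching the output for a function address can be done with any text tool. As a minimal Python sketch, assuming the show-features output has been saved as text and that lines belonging to a function mention its address in hexadecimal, a filter might look like this. The sample lines are invented for illustration and do not reproduce the script's real output format.

```python
def features_for_address(output_lines, address):
    """Keep the lines that mention the given address (case-insensitive)."""
    needle = address.lower()
    return [line.rstrip("\n") for line in output_lines if needle in line.lower()]


# Hypothetical sample lines; the real output format may differ.
sample_output = [
    "func: 0x401000",
    "0x401000: api(CreateFileA)",
    "func: 0x402000",
    "0x402000: string(cmd.exe)",
]

for line in features_for_address(sample_output, "0x401000"):
    print(line)
```

Because this is a plain substring match, a short address can also match longer ones that contain it, so it is worth checking the hits by eye.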

Testing the Rule

capafmt

It is recommended to run the capafmt.py script against a rule before submitting it to capa. As the name suggests, this script rewrites capa rules to fit a consistent format. A rule that has not been run through capafmt.py may fail the linter later on, especially in the CI/CD pipeline, and the changes will have to be made anyway.

linter

There is a lint.py script provided with capa that checks rule syntax. It runs as part of the CI/CD pipeline, so running it locally before submission is a matter of personal preference. The script can also help confirm that a linter issue is truly fixed before you commit the change.
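If you prefer to run both checks in one step, a small wrapper can assemble the two commands. The sketch below only builds the command lines and does not execute them. It assumes that both scripts live in the scripts directory of the capa repo and accept the rule path as a positional argument; that calling convention is an assumption, so check each script's help output before relying on it.

```python
import sys
from pathlib import Path


def build_check_commands(capa_repo, rule_path):
    """Build command lines for capafmt.py and lint.py without running them.

    Assumption: each script takes the rule path as a positional argument.
    """
    scripts = Path(capa_repo) / "scripts"
    return [
        [sys.executable, str(scripts / "capafmt.py"), str(rule_path)],
        [sys.executable, str(scripts / "lint.py"), str(rule_path)],
    ]


# Hypothetical paths, for illustration only.
for command in build_check_commands("capa", "rules/c2/shell/my-rule.yml"):
    print(" ".join(command))
```

Each command list can then be passed to subprocess.run(command, check=True) once the arguments have been confirmed against the scripts themselves.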

Submitting the Rule

Opening a PR

Capa manages rules in its own GitHub repository, capa-rules. This repo is a submodule mapped to the rules directory in the capa repo. To add a rule or rules to the capa-rules repository, create a fork containing the new rule(s) and open a pull request in the fireeye/capa-rules repo to merge the changes.

After this has been done, please do the same with the capa-testfiles repo to add the rule's example file(s). The new rule will fail CI/CD if the capa-testfiles repo does not have the example(s) listed in the rule.

Reviewing Continuous Integration (CI) Results

After a few minutes, you should have CI/CD pipeline results. If you have any errors, you need to resolve these (or comment on the pull request that you're working on it). See Testing the Rule for pro tips on how to avoid most common CI/CD errors.

Efforts have been made to make the output detailed, but this also means it is verbose. If it is not clear why a particular failure occurred, search for the string FAIL in the CI/CD output to get more information.

As already mentioned, a rule will not pass CI/CD if the example(s) aren't in the capa-testfiles repo. If a CI/CD error is thrown that begins with referenced example doesn't exist:... then this is likely the issue. Please be patient: once the capa-testfiles merge is complete, the rule can be merged as long as no other issues are present.
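When the CI/CD log is long, the two searches described in this section can be scripted. The following is a minimal sketch that assumes the log has been copied into a string; the sample log text is invented for illustration and does not reproduce the pipeline's real output.

```python
def failure_lines(log_text):
    """Return the log lines that contain the string FAIL."""
    return [line for line in log_text.splitlines() if "FAIL" in line]


def has_missing_example(log_text):
    """Report whether the log mentions a missing example file."""
    return "referenced example doesn't exist" in log_text


# Hypothetical log excerpt; real CI/CD output will look different.
sample_log = "\n".join(
    [
        "checking rule: example rule name",
        "FAIL: referenced example doesn't exist: <md5 of a sample>",
        "checking rule: another rule",
    ]
)

print(failure_lines(sample_log))
print(has_missing_example(sample_log))
```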
