draft: Introduction to writing capa rules
Writing capa rules should be fun! Here's a great way to get started.
Only two things are required to write a strong capa rule: a local installation of capa and a text editor of your choice.
The capa repo includes an installation instructions document.
Regardless of installation method, it is recommended to keep the Python files in the scripts directory of the capa repo handy. This is done by default when following the third installation method described in the documentation.
Although not a requirement, you may wish to test your new rules against your test sample or a corpus of malware samples. There is a sister repository that houses test files for capa. You can download these files for testing locally using the bulk-process.py script. WARNING: These files are live malware samples, treat them with care.
Capa rules are structured YAML files. Each rule needs to be placed in its own file. The contents of a rule are nested under the rule key at the beginning of the file, where the meta and features keys hold the rule metadata and rule logic respectively.
Capa starts by extracting a number of features from a given file. Capa rules are then applied to the list of features generated for the file. Extracted features are split into 3 distinct groups.
- Features describing the file as a whole (PE import/exports, PE sections, raw bytes in file, etc.).
- Features describing a given function (api calls, instruction mnemonics, strings, offsets, etc.).
- Features describing a basic block of code, similar to function features but more granular.
The rule format is described in depth here.
The following keys are required under meta for each rule:
- name: The name of the rule that will be displayed in capa output.
- namespace: This should essentially be the file path from the root of the capa-rules repo to your file. If you created a new rule that will be placed in the capa-rules/c2/shell/ directory, then the namespace of the rule is c2/shell.
- author: Your name or email address.
- scope: This should be either file, function, or basic block. This tells capa where within the extracted features to apply the rule logic: over the entire file, within any given function, or within a single basic block of code.
- examples: One or more MD5 hashes of samples that this rule should fire on. HINT: You can reference specific function addresses or reference files in the capa-testfiles repo by name. View the documentation linked above for details.
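The required keys above can be checked mechanically before running the real linter. The following is a minimal sketch, not capa's own validation. It assumes the rule has already been parsed into a plain Python dict, the rule values are made-up placeholders, and `missing_meta_keys` and `expected_namespace` are hypothetical helper names.

```python
from pathlib import PurePosixPath

# Required meta keys, as listed in this guide.
REQUIRED_META_KEYS = ("name", "namespace", "author", "scope", "examples")


def missing_meta_keys(rule_doc):
    """Return the required meta keys that are absent from a parsed rule."""
    meta = rule_doc.get("rule", {}).get("meta", {})
    return [key for key in REQUIRED_META_KEYS if key not in meta]


def expected_namespace(rule_path, repo_root="capa-rules"):
    """Derive the namespace from the rule's directory relative to the repo root."""
    parent = PurePosixPath(rule_path).parent
    return str(parent.relative_to(repo_root))


# Placeholder rule, shaped like the YAML file once parsed (values are illustrative).
rule_doc = {
    "rule": {
        "meta": {
            "name": "example rule",
            "namespace": "c2/shell",
            "author": "you@example.com",
            "scope": "function",
        },
        "features": [],
    }
}

print(missing_meta_keys(rule_doc))  # ['examples']
print(expected_namespace("capa-rules/c2/shell/example-rule.yml"))  # c2/shell
```

The linter shipped with capa remains the authority on what a valid rule looks like. A check like this only catches the most common omissions early.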
There are numerous other optional keys that can be placed in meta. A few popular ones are below:
- att&ck: A mapping to a MITRE ATT&CK technique.
- references: Any URLs or other resources that can provide further information on the technique this capa rule detects.
- mbc: A mapping to a Malware Behavior Catalogue technique.
The features section of the capa rule contains the rule logic. This section starts with logical declarations about the features nested under them. The logic format is described well here; to avoid duplication, we won't discuss it further in this document. The next portion of the documentation describes each available feature and how to use it.
Be sure to only use features that apply to the scope set for the rule. For example, don't use the api feature if the scope is set to file.
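One way to catch a scope mismatch early is to list every feature name a rule uses and compare that list against the scope. The sketch below is a simplified assumption of how the parsed rule logic is shaped (a list of single-key dicts, with `and`, `or`, `not`, and `optional` holding nested lists). The helper names are hypothetical, and the `DISALLOWED` table contains only the one restriction mentioned above, so extend it from the format documentation.

```python
# Keys that hold nested logic rather than a feature (simplified assumption).
LOGIC_KEYS = {"and", "or", "not", "optional"}


def feature_names(nodes):
    """Collect the feature names used anywhere in a features tree."""
    names = set()
    for node in nodes:
        for key, value in node.items():
            if key in LOGIC_KEYS:
                # Logic keys wrap a nested list of further nodes.
                names |= feature_names(value)
            elif key != "description":
                names.add(key)
    return names


def out_of_scope(scope, features, disallowed):
    """Return features used by the rule that are disallowed for its scope."""
    return sorted(feature_names(features) & disallowed.get(scope, set()))


# Only the one restriction named in this guide; extend from the format docs.
DISALLOWED = {"file": {"api"}}

# Illustrative rule logic, not taken from a real rule.
features = [{"and": [{"api": "CreateProcess"}, {"string": "cmd.exe"}]}]
print(out_of_scope("file", features, DISALLOWED))      # ['api']
print(out_of_scope("function", features, DISALLOWED))  # []
```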
You may ask yourself, "but how do I know what features my file has?" This is where the show-features.py script comes in handy. You can provide your sample to this script and it will output a long list of features extracted by capa. Say you have a function of interest already picked out from your analysis efforts. You can search for the address of that function within the show-features output to understand what features capa was able to extract. Rule writing should be straightforward at this point: simply pick which features you'd like to detect and put them into your new rule!
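Searching a long feature listing for one address is easy to script. The sketch below assumes only that the show-features output has been saved as text and that each relevant line contains the address written in hexadecimal (for example `0x401000`). The exact output layout is not assumed, and `lines_for_address` is a hypothetical helper name.

```python
import re


def lines_for_address(output_text, address):
    """Return the output lines that mention the given address in hex form."""
    needle = re.escape(f"{address:#x}")
    # The lookahead rejects longer addresses that merely start with the same digits.
    pattern = re.compile(needle + r"(?![0-9a-f])", re.IGNORECASE)
    return [line for line in output_text.splitlines() if pattern.search(line)]
```

For example, after redirecting the script's output to a file, read the file into a string and call `lines_for_address(text, 0x401000)` to see only the lines for that function.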
It is recommended to run the capafmt.py script against a rule before submitting it to capa. As its name suggests, this script rewrites capa rules to fit a consistent format. If a rule has not had capafmt.py run against it, it may fail the linter later on, especially in the CI/CD pipeline, and the changes will have to be made anyway.
There is a lint.py script provided with capa that checks rule syntax. It runs as part of the CI/CD pipeline, so running it locally before submission is a matter of personal preference. The script can also help confirm that a linter issue was truly fixed before committing the change.
Capa manages rules in its own GitHub repository, capa-rules. This repo is a submodule mapped to the rules directory in the capa repo. To add a rule or rules to the capa-rules repository, create a fork containing the new rule(s) and open a pull request in the fireeye/capa-rules repo to merge the changes.
After this has been done, please do the same with the capa-testfiles repo to add the rule's example file(s). The new rule will fail CI/CD if the capa-testfiles repo does not have the example(s) listed in the rule.
After a few minutes, you should have CI/CD pipeline results. If you have any errors, you need to resolve these (or comment on the pull request that you're working on it). See Testing the Rule for pro tips on how to avoid most common CI/CD errors.
Efforts have been made to make the output detailed, but this also means it is verbose. If it is not clear why a particular failure occurred, search for the string FAIL in the CI/CD output to get more information.
As already mentioned, a rule will not pass CI/CD if the example(s) aren't in the capa-testfiles repo. If a CI/CD error is thrown that begins with referenced example doesn't exist:... then this is likely the issue. Please be patient: once the capa-testfiles merge is complete, the rule can be merged as long as no other issues are present.
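When the CI/CD output is long, the two searches described above can be scripted. This is a minimal sketch that assumes the pipeline log has been saved as plain text. It relies only on the two strings mentioned in this guide, and `triage_ci_log` is a hypothetical helper name.

```python
# The two strings this guide suggests searching for.
FAIL_MARKER = "FAIL"
MISSING_EXAMPLE = "referenced example doesn't exist"


def triage_ci_log(log_text):
    """Return (lines containing FAIL, lines reporting a missing example)."""
    fail_lines = []
    missing_examples = []
    for line in log_text.splitlines():
        if FAIL_MARKER in line:
            fail_lines.append(line.strip())
        if MISSING_EXAMPLE in line:
            missing_examples.append(line.strip())
    return fail_lines, missing_examples
```

If the second list is not empty, the likely cause is that the capa-testfiles pull request has not been merged yet.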