How to use

1. Import the dependency

Ways to import the project:

Import the NuGet package.
Build and reference the OWASP.AntiSamy DLL in your project.

2. Choosing a base policy file

Chances are that your site's use case for AntiSamy is at least roughly comparable to one of the predefined policy files. They each represent a "typical" scenario for allowing users to provide HTML (and possibly CSS) formatting information. Let's look into the different policy files:

1) antisamy-slashdot.xml

Slashdot is a techie news site that allows users to respond anonymously to news posts with very limited HTML markup. Now, Slashdot is not only one of the coolest sites around, it's also one that's been subject to many different successful attacks. The rules for Slashdot are fairly strict: users can only submit the following HTML tags and no CSS: <b>, <u>, <i>, <a>, <blockquote>.

Accordingly, we've built a policy file that allows fairly similar functionality. All text-formatting tags that operate directly on the font, color or emphasis have been allowed.

2) antisamy-ebay.xml

eBay is the most popular online auction site in the universe, as far as I can tell. It is a public site, so anyone is allowed to post listings with rich HTML content. Given the attractiveness of eBay as a target, it's not surprising that it has been subject to a few complex XSS attacks. Listings are allowed to contain much richer content than, say, Slashdot, so its attack surface is considerably larger.

3) antisamy-myspace.xml

MySpace was, at the time this project was born, the most popular social networking site. Users could submit pretty much all the HTML and CSS they wanted, as long as it didn't contain JavaScript. MySpace was using a word blacklist to validate users' HTML, which is why they were subject to the infamous Samy worm. The Samy worm, which used fragmentation attacks combined with a word that should have been blacklisted (eval), was the inspiration for this project.

4) antisamy-anythinggoes.xml

I don't know of a possible use case for this policy file. If you want to allow every single valid HTML and CSS element (but without JavaScript or blatant CSS-related phishing attacks), you can use this policy file. Not even MySpace was this crazy. However, it does serve as a good reference because it contains base rules for every element, so you can use it as a knowledge base when tailoring the other policy files.

3. Tailoring the policy file

You may want to deploy OWASP AntiSamy .NET in a default configuration, but it's equally likely that a site may want to have strict, business-driven rules for what users are allowed to submit. The discussion that decides the tailoring should also consider attack surface, which grows in proportion to the size of the policy file.

Since v1.2.0, policy examples are NOT included in your project. A policy file must be added manually and tailored, based on the example policies present in this repository.

You may also want to enable or modify some "directives", which are basically advanced user options. The supported directives are specified in the wiki.

More directives are supported in the Java-based project. For more detailed information on directive declaration and general policy format, inspect the example policies. Note: every input policy is validated by AntiSamy against a single defined schema.

4. Calling the OWASP AntiSamy .NET API

Using OWASP AntiSamy .NET is easy. Here is an example of invoking AntiSamy with a policy file:

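using OWASP.AntiSamy.Html;

// POLICY_FILE_LOCATION is the path to your tailored policy file.
Policy policy = Policy.GetInstance(POLICY_FILE_LOCATION);

var antiSamy = new AntiSamy();
// dirtyInput is the untrusted HTML submitted by the user.
CleanResults results = antiSamy.Scan(dirtyInput, policy);

MyUser.StoreHtmlProfile(results.GetCleanHTML()); // Some custom function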

There are a few ways to create a Policy object. The GetInstance() method can take any of the following:

A string filename.
A FileInfo object.
A Stream object.

Policy files can also be referenced by filename by passing a second argument to the AntiSamy.Scan() method, as the following example shows:

var antiSamy = new AntiSamy();
CleanResults results = antiSamy.Scan(dirtyInput, policyFilePath);
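
The three ways of creating a Policy object listed above might look roughly like the sketch below. The class name PolicyLoadingSketch and its method names are hypothetical and exist only for illustration. Only the three GetInstance() argument types come from this document. Disposing the stream right after loading assumes that GetInstance() parses the whole policy before it returns.

```csharp
using System.IO;
using OWASP.AntiSamy.Html;

// Hypothetical helper class, not part of the library.
public static class PolicyLoadingSketch
{
    // Load a policy from a string filename.
    public static Policy FromFileName(string policyPath)
    {
        return Policy.GetInstance(policyPath);
    }

    // Load a policy from a FileInfo object.
    public static Policy FromFileInfo(string policyPath)
    {
        return Policy.GetInstance(new FileInfo(policyPath));
    }

    // Load a policy from a Stream object.
    // Assumption: the policy is fully parsed inside GetInstance(),
    // so the stream can be disposed as soon as the call returns.
    public static Policy FromStream(string policyPath)
    {
        using (Stream stream = File.OpenRead(policyPath))
        {
            return Policy.GetInstance(stream);
        }
    }
}
```

A Stream is handy when the policy is not a loose file on disk, for example when it is shipped as an embedded resource.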

5. Analyzing CleanResults

The CleanResults object exposes the following methods:

GetCleanHTML() - the clean, safe HTML output.
GetErrorMessages() - a list of string error messages. An empty list does not mean there were no attacks!
GetNumberOfErrors() - the error message count, in case the list itself is not needed. A count of 0 does not mean there were no attacks!
GetScanTime() - the scan time in seconds.
GetStartOfScan() and GetEndOfScan() - the start and end of the scan as DateTime objects, in case they are needed.

Important note: There has been much confusion about the GetErrorMessages() method. Neither GetErrorMessages() nor GetNumberOfErrors() answers the question "is this safe input?" in the affirmative when it returns an empty list or zero. You must always use the sanitized output, and there is no way to be sure that the input passed in contained no attacks.

The serialization and deserialization process that is critical to the effectiveness of the sanitizer is purposefully lossy and will filter out attacks via a number of attack vectors. Unfortunately, one of the tradeoffs of this strategy is that AntiSamy doesn't always know in retrospect that an attack was seen. Thus, the GetErrorMessages() and GetNumberOfErrors() APIs are there to help users understand whether their well-intentioned input meets the requirements of the system, not to help a developer detect whether an attack was present.
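
One way to apply this advice is sketched below: the sanitized output is always what gets returned, and the error messages are only forwarded as feedback to the user. ProfileSanitizerSketch, Sanitize and the notifyUser callback are hypothetical names introduced for this illustration. The AntiSamy calls are the ones described in this document.

```csharp
using System;
using OWASP.AntiSamy.Html;

// Hypothetical helper class, not part of the library.
public static class ProfileSanitizerSketch
{
    public static string Sanitize(string dirtyInput, Policy policy, Action<string> notifyUser)
    {
        var antiSamy = new AntiSamy();
        CleanResults results = antiSamy.Scan(dirtyInput, policy);

        // Error messages are user feedback about markup the policy rejected.
        // They are not an attack detector, so no security decision is based on them.
        if (results.GetNumberOfErrors() > 0)
        {
            foreach (string message in results.GetErrorMessages())
            {
                notifyUser(message);
            }
        }

        // Always return the sanitized output, even when no errors were reported.
        return results.GetCleanHTML();
    }
}
```

The point of the design is that there is no code path that falls back to dirtyInput, whatever the error count says.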
