Scheming

In the realm of security, “trust” is generally defined as “something acting the way you expect it to act.” By that definition, URI schemes (http://en.wikipedia.org/wiki/URI_scheme) have a checkered history at best.

Many developers assume that a URL simply refers to a network data loading protocol such as HTTP, HTTPS, or FTP. However, the set of URI schemes has been extended over time to include:
– local file I/O (file:)
– network file I/O (smb:, nfs:)
– local application integration (mailto:, various IM & streaming media schemes)
– local operating system integration (generally very dangerous stuff like shell:, vshelp:, local:)
– actual code generation/injection schemes that can create JavaScript within the browser (javascript:, data:).

Flash in particular also supports the asfunction: scheme, which can be used to call local ActionScript functions within your SWF.

If you are building an application that handles URLs, you are responsible for ensuring that any URL passed to the application and acted upon, whether the application loads it directly or the user clicks on it, is handled appropriately.


The best way to do so is to enforce a whitelist. That is, identify which schemes your application should permit and deny everything else. For example, if your goal is to permit people to link back to their blogs or other websites, you may want to only permit URLs that begin with “http://” and “https://”. If you’d also like to let people link to streaming music or videos, or click on email addresses, you can add the relevant schemes to the whitelist.

The other alternative is to implement a blacklist, which means identifying only the bad schemes you want to block and permitting everything else by default. This is problematic for a few reasons. First of all, it can be difficult to catch all permutations of escaping and encoding of scheme names, so it’s often possible to fool blacklists. This can be as simple as URL-encoding one or more characters in the scheme name, for example using “javascript%3A” instead of “javascript:”. These attacks get dramatically more complex from there, so I won’t try to exhaustively list them here.

But even worse is the problem of actually identifying all of the untrustworthy schemes (untrustworthy being defined as acting in a manner that is undesirable or otherwise unexpected given your application). You can start by looking at the lists of known schemes here: http://en.wikipedia.org/wiki/URI_scheme#Official_IANA-registered_schemes and http://esw.w3.org/topic/UriSchemes/.

But suffice it to say, be careful when processing any data that contains URLs, and don’t just rely on the browser to protect your applications.
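To make the whitelist idea concrete, here is a minimal sketch in Python using the standard library’s urllib.parse. The names ALLOWED_SCHEMES, is_allowed_url, and naive_blacklist_allows are hypothetical and chosen only for illustration, and the http/https whitelist is just the blog-link example from above. The blacklist function is deliberately flawed to show how easily a prefix check is sidestepped. Treat this as a starting point under those assumptions, not a complete URL validator.

```python
from urllib.parse import urlsplit

# Hypothetical whitelist: only the schemes this application chooses to permit.
ALLOWED_SCHEMES = frozenset({"http", "https"})


def is_allowed_url(url: str, allowed=ALLOWED_SCHEMES) -> bool:
    """Return True only if the URL's parsed scheme is on the whitelist."""
    try:
        parts = urlsplit(url.strip())
    except ValueError:
        # urlsplit raises ValueError for some malformed URLs; deny those.
        return False
    # Anything without a recognized, permitted scheme is denied by default.
    return parts.scheme.lower() in allowed


def naive_blacklist_allows(url: str) -> bool:
    """Deliberately flawed blacklist: blocks two known-bad prefixes only."""
    return not url.startswith(("javascript:", "data:"))


if __name__ == "__main__":
    samples = [
        "http://example.com/blog",
        "JavaScript:alert(1)",
        "javascript%3Aalert(1)",
        "file:///etc/passwd",
    ]
    for sample in samples:
        print(sample, is_allowed_url(sample), naive_blacklist_allows(sample))
```

The whitelist denies everything it does not recognize, so the mixed-case and URL-encoded variants are rejected without having been anticipated, while the naive blacklist lets both through. To permit another scheme, pass a larger set, for example is_allowed_url(url, allowed=ALLOWED_SCHEMES | {"mailto"}).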