bengillies.net

a blog by Ben Gillies

Validating Comments with reCAPTCHA

Since I enabled comments on my site, I've been getting quite a lot of spam. This is a problem that has existed all over the web for a long time. What's more, it has been solved many times before, usually by requiring a login or some sort of CAPTCHA validation to ensure that whoever is posting a comment is actually a human being.

While TiddlyWeb supports login out of the box, I'm not a huge fan of requiring it just to post a comment. Logging in seems, to me at least, a bit unnecessary when all you want to do is leave one comment on a site.

With that in mind, I've gone down the CAPTCHA route and implemented a validator that uses the popular reCAPTCHA service, which you've probably seen all over the rest of the web.

To use it, you'll need to grab it from my GitHub repository here. Then you can add it to system_plugins in your tiddlywebconfig.py file. As it's an external service, you'll then need to sign up for it at http://recaptcha.net/api/getkey to get the requisite private and public keys.

Once you've signed up, add your private key to the tiddlywebconfig.py file like so:
    'recaptcha_private_key': '<private_key>'
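
For context, here is a rough sketch of how the two settings might sit together in tiddlywebconfig.py. The module name recaptcha_validator is a placeholder I've made up for illustration, so use whatever the plugin file is actually called, and merge the entries into any config you already have.

```python
# tiddlywebconfig.py (sketch)
config = {
    # 'recaptcha_validator' is a hypothetical module name, not necessarily
    # the real name of the plugin; substitute the actual one.
    'system_plugins': ['recaptcha_validator'],
    # The private key you were given when you signed up with reCAPTCHA.
    'recaptcha_private_key': '<private_key>',
}
```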

Then, wherever you need to use it, add the following HTML:
<script type="text/javascript"
src="http://api.recaptcha.net/challenge?k=<your_public_key>">
</script>
 
<noscript>
<iframe src="http://api.recaptcha.net/noscript?k=<your_public_key>"
height="300" width="500" frameborder="0"></iframe><br>
<textarea name="recaptcha_challenge_field" rows="3" cols="40">
</textarea>
<input type="hidden" name="recaptcha_response_field"
value="manual_challenge">
</noscript>

Make sure to replace <your_public_key> with the public key you got when you signed up (it's in there twice).
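
To give an idea of what happens on the server once the form is submitted, here is a minimal sketch of the verification step. It is not the plugin's actual code. It assumes the legacy reCAPTCHA verify API, which takes privatekey, remoteip, challenge and response parameters and replies with "true" or "false" on the first line, followed by an error code on the second. The endpoint URL and both function names are assumptions for illustration, and the sketch deliberately stops short of making the HTTP request.

```python
from urllib.parse import urlencode

# Assumed legacy verification endpoint; check the reCAPTCHA docs for the real one.
VERIFY_URL = "http://api-verify.recaptcha.net/verify"


def build_verify_payload(private_key, remote_ip, form):
    """Build the POST body from the two fields the HTML snippet submits."""
    return urlencode({
        "privatekey": private_key,
        "remoteip": remote_ip,
        "challenge": form.get("recaptcha_challenge_field", ""),
        "response": form.get("recaptcha_response_field", ""),
    })


def parse_verify_response(body):
    """Return (passed, error_code) from the plain-text reply."""
    lines = body.splitlines()
    passed = bool(lines) and lines[0].strip() == "true"
    # The second line is only treated as an error code when the check failed.
    error = lines[1].strip() if not passed and len(lines) > 1 else None
    return passed, error
```

A validator built along these lines would POST the payload to the verify URL and reject the comment whenever the first element of the result is False.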

Finally, it's a validator, so don't forget to set your accept policy.

Validating TiddlyWeb

Recently, I've been looking at TiddlyWeb with a view to securing content and protecting against cross-site scripting attacks. Luckily, this is all pretty simple in TiddlyWeb, which has the concept of Validators built right in and ready to extend at will.

Validators are essentially run on any tiddler that has been PUT to a bag by a user without "accept" permission. By "accept" permission, I mean that the accept field in the policy file for that bag has been set, but does not contain their username or role. You can have any number of validators running, and load in more by using the standard TiddlyWeb plugin model.
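
The rule for when validators run can be sketched as a small function. This is a simplified model rather than TiddlyWeb's own code: needs_validation is a name I've invented, and it only handles plain usernames and roles written with an "R:" prefix in the accept list, which is the policy convention as I understand it.

```python
def needs_validation(accept, username, roles=()):
    """Return True if a tiddler PUT by this user should be validated.

    accept   -- the accept list from the bag's policy
    username -- the name of the user making the request
    roles    -- any roles that user holds
    """
    if not accept:
        # accept has not been set, so nobody's tiddlers are validated.
        return False
    if username in accept:
        return False
    if any("R:" + role in accept for role in roles):
        return False
    # accept is set and names neither the user nor one of their roles.
    return True
```

So with an accept list of ["ben"], anything ben PUTs goes straight in, while a tiddler from any other user is passed through every loaded validator first.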

So, on to my Validators. I have written two validators: one for generic HTML, and one specifically for TiddlyWebWiki.

The HTML Validator is based on the Beautiful Soup package for Python. It works by providing a whitelist of allowed HTML tags and attributes, and simply removing anything that doesn't match it. The code is rather simple (thanks to Beautiful Soup), and has been taken and modified slightly from here. To use it, add or remove any tags/attributes as required from the two lists in the plugin, and add the plugin to system_plugins in tiddlywebconfig.py as usual.
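
To illustrate the whitelist idea, here is a minimal sketch. Note that it uses Python's standard library html.parser instead of Beautiful Soup, purely so that it runs without any extra packages, so it is not the plugin's code. The two lists and all the names below are examples I've chosen for illustration.

```python
from html import escape
from html.parser import HTMLParser

# Example whitelists only; the plugin ships its own lists.
ALLOWED_TAGS = {"a", "b", "em", "i", "p", "strong"}
ALLOWED_ATTRS = {"href", "title"}
# Tags whose contents are dropped along with the tag itself.
DROP_CONTENT = {"script", "style"}


class WhitelistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0  # > 0 while inside a script/style element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # drop the tag but keep any text inside it
        kept = [(k, v) for k, v in attrs if k in ALLOWED_ATTRS and v is not None]
        rendered = "".join(' %s="%s"' % (k, escape(v, quote=True)) for k, v in kept)
        self.parts.append("<%s%s>" % (tag, rendered))

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if not self.skip_depth and tag in ALLOWED_TAGS:
            self.parts.append("</%s>" % tag)

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(escape(data, quote=False))


def sanitize(html):
    """Return html with non-whitelisted tags and attributes removed."""
    parser = WhitelistSanitizer()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)
```

Calling sanitize on a paragraph that carries an onclick attribute and an embedded script returns the paragraph with both stripped out. The sketch does not vet URL schemes in href values, so treat it as an illustration of the approach rather than a complete defence.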

The TiddlyWebWiki validator, as you might expect, is even simpler. All it does is remove the systemConfig tag (if present) and reject any tiddlers whose name is in a blacklist (MarkupPreHead for example). Again, you can add to this list as necessary.
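
In outline, the logic looks something like the sketch below. It is self-contained, so the Tiddler class and InvalidTiddlerError here are stand-ins I've defined for illustration rather than TiddlyWeb's own, and validate_tiddlywiki is a hypothetical name. The blacklist contains only the one title mentioned above.

```python
class InvalidTiddlerError(Exception):
    """Stand-in for the error a validator raises to reject a tiddler."""


class Tiddler:
    """Minimal stand-in with only the fields this sketch needs."""

    def __init__(self, title, tags=None):
        self.title = title
        self.tags = list(tags or [])


# Titles that are rejected outright; extend as necessary.
BLACKLISTED_TITLES = {"MarkupPreHead"}


def validate_tiddlywiki(tiddler, environ=None):
    """Reject blacklisted titles and strip the systemConfig tag."""
    if tiddler.title in BLACKLISTED_TITLES:
        raise InvalidTiddlerError("tiddler title %r is not allowed" % tiddler.title)
    # systemConfig marks a tiddler as a plugin to be executed, so remove it.
    tiddler.tags = [tag for tag in tiddler.tags if tag != "systemConfig"]
    return tiddler
```

A tiddler tagged systemConfig and blog comes out tagged only blog, while one titled MarkupPreHead is refused with an error.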

That's it. Hopefully you'll find one or the other useful.

Both plugins are available on my GitHub account.

html_validator.py - validate incoming tiddlers against a whitelist of allowed tags/attributes
tiddlywiki_validator.py - validate incoming tiddlers against a blacklist of tiddler titles and remove systemConfig tags