Archive for August, 2009

Yes, I'm metablogging again

Monday, August 24th, 2009

Recently, I got a piece of blogspam imploring me to turn off "no follow" on my comment links in order to increase my blog's readership. The "rel='nofollow'" attribute on a link (for those who don't know) means "I do not endorse this link - it was added by a user of the site, and may be spammy in nature", and is used by search engines (Google in particular) to mean that it should not contribute to the linked page's pagerank.

My first reaction was revulsion that spammers would use such a downright dirty tactic. But then I realised that if spammers are doing this, it means that nofollow is working - it's getting harder to get non-nofollow links to your site via spam, and therefore harder to fraudulently increase your pagerank. This is a very good thing for bloggers because if spamming blogs is no longer effective, it will stop. So (in order to counteract any damage that this "disable nofollow" spam might do) I encourage bloggers and operators of other sites where users can add links to ensure that the "rel='nofollow'" attribute is placed on all links that you haven't personally checked and endorsed. It has no effect either way on your readership (only producing quality content that people want to read can do that) but it makes the efforts of spammers useless. Also beware of anyone coming out of the blue and wanting you to link to their site (even if they're offering reciprocal linking) - chances are they're SEO spammers.
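
As a rough illustration of the advice above, here is a minimal Python sketch of rendering a user-submitted link so that it carries rel="nofollow" unless it has been personally checked. The function name render_user_link and the endorsed flag are made up for this example; they are not part of any blogging software.

```python
from html import escape


def render_user_link(url: str, text: str, endorsed: bool = False) -> str:
    """Render an anchor tag for a link supplied by a site user.

    Links that have not been personally checked (endorsed=False) get
    rel="nofollow" so that search engines do not count them as endorsements.
    """
    rel = "" if endorsed else ' rel="nofollow"'
    # Escape both the URL and the link text so user input cannot break the markup.
    return f'<a href="{escape(url, quote=True)}"{rel}>{escape(text)}</a>'


if __name__ == "__main__":
    print(render_user_link("http://example.com/?a=1&b=2", "a commenter's site"))
    print(render_user_link("http://example.com/", "a site I checked", endorsed=True))
```

Defaulting endorsed to False means a link only loses its nofollow when someone deliberately vouches for it.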

Another Argulator feature that never was

Sunday, August 23rd, 2009

One Argulator feature I was, at one point, absolutely convinced that I needed was a web page cache/archive. The trouble is that one often wants to link to other sources of information when making statements (for example to cite a source or back up some assertion). On the web the natural way to do that is to use a link to a URL elsewhere on the web. But URLs are fickle things - the resource they point to can change with time or disappear altogether (something I've always tried to avoid happening with this site - I believe pretty much every URL that was ever valid is still valid today and points to the same information, even if it has been tweaked a bit over the years).

If you're referring to a URL in an Argulator statement, though, that's a problem because a statement's meaning is never supposed to change once it's been created. If it could change, you're putting words into the mouths of all the people who have expressed an opinion on that statement. So I wanted a way to make sure that web pages mentioned in Argulator statements never changed or disappeared, which meant bringing them under the control of the Argulator server. The idea was that when someone mentioned a URL, the Argulator server would fetch that URL, along with the graphics and CSS required to make the page display properly, just as if it were a web browser. Then it would store these files on the server and show its cached copy (much like Google's cache or the Internet Archive do) when a link is clicked.

But that turns out to be surprisingly difficult, and opens up a whole cannery of worms. Security is a big problem. To ensure nothing bad is ever served from the Argulator site we need to strip out all executable content (script, Java, ActiveX, flash, XUL, Silverlight etc.) from the pages. Which means parsing all the CSS and tag attributes. After that treatment the page might not render at all (especially if it contains things like protection from ad-blockers). Such a cache should also respect the directive and refuse to cache pages whose authors do not want them cached.

Then what do we do if someone requests we take down a cached page (maybe it contains illegal content, or the content owner complains on copyright grounds)? We don't want to create work for ourselves, but on the other hand we don't want to make it too easy for people to sabotage statements they disagree with by getting that statement's sources removed.

I considered the possibility of just linking to pages on the Internet Archive but that may stifle discussions of current events since the IA doesn't serve any pages less than 6 months old. But the idea of linking to an external site instead of keeping the pages on Argulator got me thinking. Such a cache may be useful for other things besides Argulator, so maybe the cache should be a separate (but associated) site. Realizing that such a site would be useful in its own right got me to wondering why nobody had done that before, which got me to realizing that maybe they had. A few Google searches later I found several sites which did exactly that. I then realized that Argulator should stick to its core competency and leave the web caching to the experts. Statements can link to any URL anywhere on the internet, but it is recommended for the sake of stability to link to one of these caching services. I hope Argulatists will realize that URLs pointing elsewhere might not have the same contents as they had when the statement was originally created, and take such statements with the appropriate amount of salt.

Why Argulator requires users to create an account before creating statements

Saturday, August 22nd, 2009

One thing I wanted to do with Argulator but didn't quite manage was to eliminate the login barrier - to allow people to use any part of the site without creating an account first. The idea was to essentially put all the site's data under multi-headed version control, and then put each temporary user's changes in its own branch, so that they wouldn't affect logged-in users or other temporary users. When a temporary user did decide to create an account, then those changes would be merged into the main branch.

The trouble with that is that arbitrary merges are very difficult. What happens if two temporary users happen to create identical statements (or statements which are similar enough that they would be duplicates if they had both been created by logged-in users)? What happens if a statement is deleted in one branch and created in another? Adding UI for resolving arbitrary merge conflicts would be difficult enough, but forcing users to go through it when they create an account would be downright user-hostile.

I've therefore come to the conclusion that the principle of "you can do anything without creating an account" only works for sites where users' actions don't really affect other users (such as a shopping site).

Error handling in Argulator

Friday, August 21st, 2009

One interesting thing I had to think about for Argulator was what to do with error messages. There are three types of errors that can occur in applications such as these when classified by the action that needs to be taken.

The first type of errors are things that real users can run into, and which are not problems with the program itself. For example, visiting a statement's page and finding that it has been deleted. These deserve user-friendly error messages and are usually pretty straightforward.

The second type are validation errors, for example someone attempting SQL injection. There's no point giving a user-friendly error message for these as long as we're sure real users can't run into it, for example if we're doing the same validation on the client side, so these are generally handled just with a "die" call.

The third type are programming errors on my part (assertion failures, essentially). These also deserve user-friendly error messages (without details, just a generic "uh oh, something went wrong, we're on it") but they also send me email with some details so that I can try to figure out what's going wrong.

The trouble is it's not always easy to tell what is the second type and what is the third type. Some things seem like they would have to be programming errors but maybe there's some validation missing which means that they could be triggered by invalid input. Some things seem like they could only be triggered by invalid input but perhaps there's a programming error which could cause them to be exposed to real users.

I hope I haven't made too many mistakes classifying these error messages. When choosing whether a particular error is a type 2 or a type 3, I've tried to err on the side of type 3 because the consequences of getting it wrong are less severe (a type 2 misclassified as a type 3 just results in me getting spam, a type 3 misclassified as a type 2 results in me not getting notified, and the user getting an unfriendly error message). Please let me know if you see any error messages in Argulator that don't seem particularly friendly (for example, just text on an empty page instead of a popup div with a button on top of a normal page).
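
The three-way classification can be sketched roughly as follows. This is a Python illustration only, not Argulator's actual code: the exception class names, the handle_request function and the notify callback (standing in for "send me an email") are all assumptions invented for the example.

```python
class UserFacingError(Exception):
    """Type 1: something a real user can run into, e.g. a deleted statement."""


class ValidationError(Exception):
    """Type 2: input that only a tampered or broken client should produce."""


GENERIC_MESSAGE = "Uh oh, something went wrong, we're on it."


def handle_request(action, notify):
    """Run action() and turn any failure into the appropriate response text.

    notify is a callable that receives error details; it is a hypothetical
    stand-in for emailing the site owner.
    """
    try:
        return action()
    except UserFacingError as error:
        # Type 1: friendly, specific message; nobody needs to be notified.
        return f"Sorry: {error}"
    except ValidationError:
        # Type 2: terse response, no details and no notification.
        return "Invalid request."
    except Exception as error:
        # Type 3: anything unexpected is treated as a programming error.
        notify(repr(error))
        return GENERIC_MESSAGE


if __name__ == "__main__":
    def deleted():
        raise UserFacingError("that statement has been deleted")

    print(handle_request(deleted, print))
```

Making type 3 the catch-all branch matches the preference described above: an unclassified error produces a notification rather than silence.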

Other principles and innovations in Argulator

Thursday, August 20th, 2009

Some other little bits of Argulator that I'm quite proud of:

  • Pieces of UI that would be represented as a modal dialog in a desktop application (such as the search results, or the create user dialog) use an HTML div with absolute positioning, that "floats" above the rest of the page. Behind this is a window-filling div with a semi-transparent background, which causes the rest of the page to darken, focussing the user's attention on the dialog. The popup "window" can be dragged with the mouse and scrolling still works, so the rest of the page can still be read if one needs to refer to it during the dialog interaction.
  • When there is a script error, it is caught and transmitted to the server with an AJAX request. The server then emails it to me so that I can fix the bug. If this fails, we do nothing in order to avoid an infinite loop. I wish more sites would do this - there is so much broken Javascript out there it's pretty much impossible to surf normally with Javascript debugging switched on. This means I either need to switch my browser between "debugging mode" and "surfing mode" when starting/stopping work on Argulator, or use a different browser for debugging than I use for surfing.
  • Argulator uses 96-bit random character strings (encoded as 16 characters of 6 bits each) all over the place, for cookies, pseudo-cookies (for preventing cross-site request forgery attacks), salts and email authentication. These use few enough characters that a URL containing one easily fits on one line of an email message, but have enough entropy to make them unguessable for security purposes. They are generated using /dev/urandom if available, otherwise they are generated with a cryptographic hash, an entropy pool, and the current time.
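
To make the arithmetic in the last bullet concrete, here is a short Python sketch of a 96-bit token encoded as 16 characters of 6 bits each. The 64-character alphabet and the name random_token are assumptions for illustration, and the sketch covers only the operating-system randomness path (os.urandom), not the hash-and-entropy-pool fallback.

```python
import os
import string

# 64 URL-safe symbols, so each character carries exactly 6 bits.
# This particular alphabet is an assumption, not taken from Argulator.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "-_"


def random_token(num_chars: int = 16) -> str:
    """Return num_chars random characters, 6 bits of entropy per character."""
    num_bits = 6 * num_chars                # 16 characters -> 96 bits
    num_bytes = (num_bits + 7) // 8         # 96 bits -> 12 bytes
    value = int.from_bytes(os.urandom(num_bytes), "big")
    chars = []
    for _ in range(num_chars):
        chars.append(ALPHABET[value & 0x3F])  # take the low 6 bits
        value >>= 6
    return "".join(chars)


if __name__ == "__main__":
    print(random_token())
```

In current Python, secrets.token_urlsafe(12) gives an equivalent 16-character result; the loop is spelled out here only to show where the 6 bits per character come from.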

Speaking of cryptography, Argulator's main cryptographic shortcoming is that the user's password is sent across the wire in plaintext (in a POST request, so it doesn't show up in server logs or caches). I thought seriously about hashing it on the client side (implementing a cryptographic hash function in Javascript). But in the end I decided that this was too much effort for too little gain. The danger of having your password stolen comes mostly from malware and keystroke loggers on the client side, which this does nothing to protect, or having your password stored in plain text in the database (it isn't - we salt it and hash it before storing it). Lots of other sites also send passwords in plaintext. And even if I did implement this, most users would not even be able to tell - really the way to solve this is to use https instead of http for authentication, which I will do if Argulator gets sufficiently popular that I can justify the expense. Then you'll get the little padlock icon in your web browser and you know you're secure. Until then, just don't use the same password for Argulator that you use for important things like your bank account.
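
For readers unfamiliar with salting and hashing, a minimal Python sketch follows. The post does not say which hash function Argulator uses, so PBKDF2 with SHA-256, the iteration count and the function names here are all assumptions chosen for illustration.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 100_000  # assumed work factor, not a value from the post


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest); only these are stored, never the password itself."""
    if salt is None:
        salt = os.urandom(12)  # a fresh 96-bit random salt per user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash the attempt with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


if __name__ == "__main__":
    salt, digest = hash_password("correct horse")
    print(verify_password("correct horse", salt, digest))
    print(verify_password("wrong guess", salt, digest))
```

Because each user gets a different salt, two users with the same password end up with different stored digests.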

How Argulator presents statements

Wednesday, August 19th, 2009

Even before you get around to creating any arguments there is the problem of how to display a complex network of statements in a user-friendly way. Part of the trouble here is that in general there may be a lot of statements in play at any given time, and a lot of different relationships between different statements. How do we present these relationships in a way that makes intuitive sense? And how can you be sure that the statement you're looking at is the one you think it is?

Programmers deal with structured information like this all the time. They have two major techniques for doing so - one is using nested sets of parentheses (or other brackets) and the other is naming some substructure and then using that name elsewhere. Both of these techniques are rather unintuitive to non-programmers and require some thought to use - if you're using brackets you have the problem of mismatching them, and with naming you have the problem of "okay, so c is a+b, where is c defined again?"

My "aha" moment was coming across the inline-block display style in HTML. This clever device allows one to have a piece of HTML which acts like an inline element from the outside and block from the inside. This means that 2 such blocks can be composed in (at least) 2 different ways: side-by-side and one-inside-the-other, and the web browser does the hard work of figuring out where to put each block and how big to make it. These relationships can be composed arbitrarily, and the blocks scale very well from tiny (just enough to hold a single identifying number) to so large you have to scroll the web page to see the entire contents. The only downside is that they don't work properly in Firefox 2 - I think this is now sufficiently rare to not be worth bothering with, but there may be a hack to get inline-block working in this browser if there is demand.

So a rectangular block with a thin raised border became my "UI cue" that says "hey, this is a statement - you can go and give your opinion about this or argue for/against it". The block's border also acts as a set of parentheses of a sort - containing everything within it. An argument block, for example, encapsulates subblocks showing the argument it is arguing for or against and the substatements that make up the argument - these statements appear right there in the block rather than being incorporated by reference (avoiding naming problems) and the parentheses are automatic, implicit and intuitive.

The next problem is making the statements visually distinctive, so you can see at a glance which statements are which, while keeping them similar enough that anyone can see that they're all variations on the same basic object. There are several ways this is done. The most obvious is the background/border colour. Most statements have a white background and a grey border, but "special" statements (those that have some meaning to Argulator itself) are coloured - green for "for" arguments, red for "against" arguments and blue for deletion statements. This colour scheme reflects a colour scheme that can be found throughout Argulator: red also means "disagree" and "refuted", green also means "agree" and "corroborated" and blue also means "neither agree nor disagree". Colour vision is not a requirement for being able to use Argulator, but the extra visual clues in the colours should be helpful for those who can see them.

Since becoming a professional software engineer, one thing I have become very familiar with is bug databases. Bug databases are great because every bug is given a unique number, and if one wants to refer to a particular bug, one can just refer to it by its number. The person you're communicating with might have to look the bug up in the database, but it's still much more concise and accurate to say "bug 123456" than "that crashing bug Bob found last week". One often gets to know the numbers of the bugs that are currently important (like the phone numbers of close friends) - when this happens, one doesn't even need to look in the bug database to see which bug is being referred to. However, as an aide memoire when I'm writing an email about a bug I always try to remember to paste in the title as well as the bug number.

Argulator has a similar principle - every statement block has a number inside it, towards its left-hand side (or top-left corner if it spans multiple lines). This number uniquely identifies the statement. If you know a statement's number N you can visit its page just by typing the URL:, refer to it in a statement you're creating by typing {N} or use it in an argument by using its number as the substatement text. You can also negate a statement's meaning by negating its number, so anywhere you can use a statement you can also use the negation of that statement.
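
Here is a small Python sketch of how {N} references might be picked out of a statement's text. The brace syntax comes from the description above, but writing a negated reference as a negative number inside the braces, such as {-7}, is an assumption, as are the names REFERENCE and find_references.

```python
import re
from typing import List, Tuple

# Matches {123} and, by assumption, {-123} for the negation of statement 123.
REFERENCE = re.compile(r"\{(-?\d+)\}")


def find_references(text: str) -> List[Tuple[int, bool]]:
    """Return (statement_number, is_negated) for each {N} reference in text."""
    references = []
    for match in REFERENCE.finditer(text):
        number = int(match.group(1))
        references.append((abs(number), number < 0))
    return references


if __name__ == "__main__":
    print(find_references("If {12} holds then {-7} follows."))
```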

In a statement's block, the number serves two other purposes as well as uniquely identifying which statement you're looking at:

  1. The colour that is used to render the number gives a quick indication of the "average" opinion that people have expressed on that statement. As well as red, green and blue mentioned above, the colours can combine - red and green combine to form yellow meaning "polarizing", green and blue combine to form cyan meaning "uncontroversial" and red and blue combine to form magenta meaning "unpopular". Other hues and saturations represent different proportions of "agree", "disagree" and "neither agree nor disagree" opinions. Lighter colours mean more opinions, darker colours mean fewer. This is another visual hint that the statement is the one you're looking at.
  2. The number itself is a link that takes you to the statement's page, where you can see much more information about it, express your opinion on it and argue for or against it.
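
The colour mixing described in the first item can be approximated with a sketch like the following Python function. Only the mapping (disagree to red, agree to green, neither to blue, brighter for more opinions) comes from the description; the exact formula, the full_brightness_at parameter and the function name are hypothetical.

```python
from typing import Tuple


def opinion_colour(agree: int, disagree: int, neither: int,
                   full_brightness_at: int = 20) -> Tuple[int, int, int]:
    """Map opinion counts to an (R, G, B) colour.

    The hue comes from the proportions of the three kinds of opinion, and
    the brightness grows with the total number of opinions until it reaches
    full_brightness_at (an assumed cut-off).
    """
    total = agree + disagree + neither
    if total == 0:
        return (0, 0, 0)  # no opinions yet: darkest possible colour
    peak = max(agree, disagree, neither)
    brightness = min(1.0, total / full_brightness_at)
    scale = 255 * brightness / peak
    # red = disagree, green = agree, blue = neither agree nor disagree
    return (round(disagree * scale), round(agree * scale), round(neither * scale))


if __name__ == "__main__":
    print(opinion_colour(10, 10, 0))  # equal agree and disagree
    print(opinion_colour(5, 0, 0))    # a few "agree" opinions only
```

With equal numbers of "agree" and "disagree" opinions the red and green channels are equal, giving the yellow "polarizing" colour.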

There are several other mnemonic cues in a statement's block next to the statement's number - a coloured dot showing your opinion, an eye icon if it's a statement you're watching, and a red R or a green C to show if it's been refuted or corroborated. Together, these cues are a dense source of information about a statement but their deeper purpose is to give each statement a unique visual "personality" richer than just its statement number alone. I hope that these cues will make it easy to tell at a glance that a given statement is the one you think it is, even if you don't remember the number exactly.

There is one more aspect of statement block display, which is expanding and collapsing. Each statement block has a small black triangle in it after the number. The triangle points left if the statement is expanded (and its text is visible) and right if it is collapsed to just the number and icons. Clicking on this triangle toggles between the two states. This allows you to see the big picture at a glance (when statements are collapsed) or drill down to see the details if you want to. This bears some similarities to the "zooming user interface", another concept from "The Humane Interface".

Argulator's user interface

Tuesday, August 18th, 2009

One of the key problems I identified when first planning Argulator was how to make an interface which was sufficiently easy to use given the fundamental complexity of the tasks involved in manipulating a complex network of statements. In particular, UI for creating an argument was a big sticking point. An argument has several parts (at least one but as many as you like) each of which is a statement. When creating an argument, you're actually creating a pile of statements at once. However, some of those statements may already exist (once Argulator's database is sufficiently well-populated, I expect most arguments will recycle at least one statement). Chances are, though, that they use slightly different phrasing or spelling, so just checking for identical text is a bit of a non-starter. And if the database gets full of nearly-duplicate statements, Argulating will get very annoying - one would end up refuting the same thing 20 different times with slightly different phrasing each time.

To minimize duplicates, I realized that I wanted to force users to search for existing statements before creating a new one. This means doing some things on the server for each statement. If I used classic CGI this would be quite an annoying, unresponsive user experience:

  • Type in statement. Click to search. Wait for page to reload.
  • Click on search result to go back to argument creator. Wait for page to reload.
  • Once all statements are in place, click to create the argument. Wait for page to reload.

I almost implemented this but I thought the site would suffer greatly from being so clunky. On the other hand, I didn't want to make it a desktop application either as that would make it that much more difficult for people to try out.

The answer came in the form of AJAX. This was just starting to take off, but it was a while before it seemed to be something practical for mere mortals like me to implement, rather than deep magic that could only be done in something as complex as Outlook Web Access or Google Maps. Once I realized I could implement an AJAX application myself, things started to pick up. The sequence would be the same but with no page reloads the user experience would be much smoother and more intuitive.

Having decided to use AJAX (and make it a requirement for using the site) it made sense to AJAXify everything in sight - page reloads hardly ever happen in Argulator (except when you go to a different URL like the page for a different statement, user or definition). The main manipulations (setting an opinion, creating an argument, and creating a deletion statement) all happen in-page. Even logging in and out doesn't cause a reload. Whenever I thought of a way to use Javascript to make the interface more interactive, I did so. I think and hope the resulting interface is a pleasure to use. Whatever you're trying to achieve with the site, the next step should always be obvious - please let me know if it isn't.

One other UI principle I've tried to stick with is one from Jef Raskin's book "The Humane Interface" - there are no "are you sure" confirmation dialogs in Argulator, and just about every action is easy to undo (even deleting one's account can be undone for a year).

Introducing Argulator

Monday, August 17th, 2009

A while back I wrote about an idea I had for a website which allows arguing over the internet.

Well, it's taken almost 4 years but it's finally alive! The site is at It's a little different from how I originally envisioned it in some ways (having evolved somewhat over the years) but has many of the features I originally described.

Over the next week I'll be posting a series of articles about the development of the site.

Why is TV static in black and white?

Sunday, August 16th, 2009

Have you ever looked at the static that appears on a TV screen when it isn't tuned into anything? If so, you might have noticed that it's in black and white. That fact always used to puzzle me - the patterns are random so surely all the colours that can appear on the TV should be equally likely, right?

It wasn't until fairly recently that I learned why this is. Colour TV signals are a little bit different than black-and-white TV signals - a certain frequency band within the signal is used to transmit colour information. That band corresponds to high frequency horizontal detail (patterns about 1/200th of the width of the screen). In a colour TV signal, those details are elided and the information used to carry hue and saturation information instead.

However, if you're watching a black and white programme you can get a sharper picture by using those frequencies for horizontal detail. So colour TV sets were designed to have two "modes" - colour mode and black-and-white mode. A "colour burst" signal is broadcast in an otherwise unused part of the signal which has the dual purposes of signalling that colour information is available, and calibrating the correct hue phase offset (the "colour burst" signal, if it were on screen and within gamut, would be a very dark olive green colour). This signal has to be present for about half a field before the TV will switch to colour mode. This is an imperceptibly short time but stops the TV flickering in and out of colour mode if the signal is marginal.

Having a signal of the correct frequency at the correct time for that period of time is extremely unlikely to occur by chance (and even if it did, it would disappear again before you had the chance to notice it). So when the TV is showing static, it thinks it's showing an old black-and-white movie and turns off the colour interpretation circuitry, leading to black-and-white static.

3D circuit ball

Saturday, August 15th, 2009

I'm sure this has been done before, but I can't seem to find any examples. I think it would be cool to make an electronic circuit with no PCB, breadboard, wire-wrapping board or anything like that just by soldering together the legs of components in 3D space, causing it to form a ball or tangle - a piece of functional art.

Edit 9th July 2010: Made one!

Here is the schematic.

Edit 14th July 2013:

Now this is what I'm talking about!