Stringy identifiers

In most computer languages, identifiers can only contain alphanumeric characters (plus, often, the underscore '_') and can't start with a number. There are very good reasons for this, but there are occasions when it would be desirable to be able to use arbitrary strings as identifiers. For example, suppose that we have a language that has a parser generator built in. Suppose also that we give it a parsing rule like this:

WhileStatement := ("while" | "until") "(" Expression ")" Statement

This is really just another way of writing a parsing class:

class WhileStatement { ... }

We would like the members of this class to be automatically generated by the parser generator in a predictable way. With string-identifiers, the generated class might look something like this:

  class WhileStatement
  {
    Enum { `"while"`, `"until"` } `"while" | "until"`;
    Expression expression;
    Statement statement;
  }

i.e. the name of the member variable which holds the piece of data about whether this is a "while" loop or an "until" loop is called `"while" | "until"`. The type of this member is

Enum { `"while"`, `"until"` }

which means that it can take only the values `"while"` and `"until"`.

This class can be used as follows:

  WhileStatement whileStatement = WhileStatement::Parse(file);
  if (whileStatement.`"while" | "until"` == `"while"`)
    ...

This reads very naturally, and avoids the need for anyone (human or compiler) to make up any variable names. It's good to avoid making the programmer make up names for things when not absolutely necessary.

One Response to “Stringy identifiers”

  1. [...] internally. All you do with an Atom is compare it to another Atom and see if they are the same. See Stringy Identifiers for another perspective on these. This is not quite the same as the lisp usage of the word, where [...]

Leave a Reply