26. How verbs are parsed


Grammar, which can govern even kings.

...Molière (1622--1673), Les Femmes savantes

The parser's fundamental method is simple. Given a stream of text like

saint / peter / , / take / the / keys / from / paul
it first calls the entry point BeforeParsing (in case you want to meddle with the text stream before it gets underway). It then works out who is being addressed, if anyone, by looking for a comma, and trying out the text up to there as a noun (anyone animate or anything talkable will do): in this case St Peter. This person is called the "actor", since he is going to perform the action, and is usually the player himself (thus, typing "myself, go north" is equivalent to typing "go north"). The next word, in this case 'take', is the "verb word". An Inform verb usually has several English verb words attached, which are called synonyms of each other: for instance, the library is set up with
"take" = "carry" = "hold"
all referring to the same Inform verb.

/\ The parser sets up global variables actor and verb_word while working. (In the example above, their values would be the St Peter object and 'take', respectively.)

/\/\ It isn't quite that simple: names of direction objects are treated as implicit "go" commands, so that "n" is acceptable as an alternative to "go north". There are also "again", "oops" and "undo" to grapple with.

/\ Also, a major feature (the grammar property for the person being addressed) has been missed out of this description: see the latter half of Section 16 for details.

Teaching the parser a new synonym is easy. Like all of the directives in this section, the following must appear after the inclusion of the library file Grammar:

    Verb "steal" "acquire" "grab" = "take";
This creates another three synonyms for "take".

/\ One can also prise synonyms apart, as will appear later.

The parser is now up to word 5; i.e., it has "the keys from paul" left to understand. Apart from a list of English verb-words which refer to it, an Inform verb also has a "grammar". This is a list of one or more "lines", each a pattern which the rest of the text might match. The parser tries the first, then the second and so on, and accepts the earliest one that matches, without ever considering later ones.

A line is itself a row of "tokens". Typical tokens might mean 'the name of a nearby object', 'the word from' or 'somebody's name'. To match a line, the parser must match against each token in sequence. For instance, the line of 3 tokens

<a noun> <the word from> <a noun>
matches the text. Each line has an action attached, which in this case is Remove: so the parser has ground up the original text into just four numbers, ending up with
    actor = st_peter
    action = Remove   noun = gold_keys   second = st_paul
What happens then is that St Peter's orders routine (if any) is sent the action, and may if it wishes cooperate. If the actor had been the player, then the action would have been processed in the usual way.

/\ The action for the line which is currently being worked through is stored in the variable action_to_be; or, at earlier stages when the verb hasn't been deciphered yet, it holds the value NULL.

The Verb directive creates Inform verbs, giving them some English verb words and a grammar. The library's Grammar file consists almost exclusively of Verb directives: here is an example simplified from one of them.

Verb "take" "get" "carry" "hold"
                * "out"                          -> Exit
                * multi                          -> Take
                * multiinside "from" noun        -> Remove
                * "in" noun                      -> Enter
                * multiinside "off" noun         -> Remove
                * "off" held                     -> Disrobe
                * "inventory"                    -> Inv;
(You can look at the grammar being used in a game with the debugging verb "showverb": see Section 30 for details.) Each line of grammar begins with a *, gives a list of tokens as far as -> and then the action which the line produces. The first line can only be matched by something like "get out", the second might be matched by
take the banana
get all the fruit except the apple
and so on. A full list of tokens will be given later: briefly, "out" means the literal word "out", multi means one or more objects nearby, noun means just one and multiinside means one or more objects inside the second noun. In this book, grammar tokens are written in the style noun to prevent confusion (as there is also a variable called noun).

/\/\ Since this book was first written, the library has been improved so that "take" and "get" each have their own independent grammars. But for the sake of example, suppose they share the grammar written out above. Sometimes this has odd results: "get in bed" is correctly understood as a request to enter the bed, but "take in washing" is misunderstood as a request to enter the washing. You might avoid this by using Extend only to separate them into different grammars, or you could fix the Enter action to check whether the variable verb_word is 'take' or 'get'.
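
The second suggestion might be sketched as follows, assuming the shared grammar written out above. This is only one reading of it: the check is placed in a before rule on a single object rather than in the Enter action itself, and the washing object and its name words are hypothetical. Since verb_word holds the word actually typed, other synonyms such as "carry" would need the same test.

```inform6
! Hypothetical object: "take in washing" is redirected to Take,
! while "get in washing" still reaches Enter as usual.
Object washing "washing"
  with name "washing" "laundry",
       before [;
         Enter:
           ! verb_word is the verb word the player typed
           if (verb_word == 'take') <<Take self>>;
       ];
```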

/\ Some verbs are meta - they are not really part of the game: for example, "save", "score" and "quit". These are declared using Verb meta, as in
Verb meta "score"
                *                                -> Score;
and any debugging verbs you create would probably work better this way, since meta-verbs are protected from interference by the game and take up no game time.
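
As an illustration, a home-made debugging verb declared as meta might look like the sketch below. The verb word "whereami" and the action WhereAmI are invented for the example; location is the library variable holding the player's current room.

```inform6
! Hypothetical debugging verb: reports the current location
! without using up any game time.
[ WhereAmISub;
  print "(You are in ", (name) location, ".)^";
];
Verb meta "whereami"
                *                                -> WhereAmI;
```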

After the -> in each line is the name of an action. Giving a name in this way is what creates an action, and if you give the name of one which doesn't already exist then you must also write a routine to execute the action, even if it's one which doesn't do very much. The name of the routine is always the name of the action with Sub appended. For instance:

[ XyzzySub; "Nothing happens."; ];
Verb "xyzzy"    *                                -> Xyzzy;
will make a new magic-word verb "xyzzy", which always says "Nothing happens" -- always, that is, unless some before rule gets there first, as it might do in certain magic places. Xyzzy is now an action just as good as all the standard ones: ##Xyzzy gives its action number, and you can write before and after rules for it in Xyzzy: fields just as you would for, say, Take.
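
A magic place of this kind could be sketched as a room whose before rule intercepts the new action. The room and its text are hypothetical; the only assumption is that the Xyzzy action has been created as above.

```inform6
! Hypothetical room in which the magic word does something.
Object Crypt "Crypt"
  with description "A low vault, thick with the smell of incense.",
       before [;
         ! Printing a string here returns true, so XyzzySub is never reached.
         Xyzzy: "The flagstones tremble beneath your feet.";
       ],
  has  light;
```

Anywhere else, no before rule intervenes and XyzzySub prints its usual message.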

/\ Finally, the line can end with the word reverse. This is only useful if there are objects and numbers in the line which occur in the wrong order. An example from the library's grammar:
Verb "show" "present" "display"
                * creature held                  -> Show reverse
                * held "to" creature             -> Show;
The point is that the Show action expects the first parameter to be an item, and the second to be a person. When the text "show him the shield" is typed in, the parser must reverse the two parameters "him" and "the shield" before causing a Show action. On the other hand, in "show the shield to him" the parameters are in the right order already.

The library defines grammars for the 100 or so English verbs most often used by adventure games. However, in practice you very often need to alter these, usually to add extra lines of grammar but sometimes to remove existing ones. For example, consider an array of 676 labelled buttons, any of which could be pushed: it's hardly convenient to define 676 button objects. It would be more sensible to create a grammar line which understands things like

"button j16", "d11", "a5 button"
(it's easy enough to write code for a token to do this), and then to add it to the grammar for the "press" verb, which the library treats as a synonym of "push". The Extend directive is provided for exactly this purpose:
Extend "push"   * Button                    -> PushButton;   
The point of Extend is that it is against the spirit of the Library to alter the standard library files -- including the grammar table -- unless absolutely necessary.

/\/\ Another method would be to create a single button object with a parse_name routine which carefully remembers what it was last called, so that the object always knows which button it represents. See 'Balances' for an example.

Normally, extra lines of grammar are added at the bottom of those already there. This may not be what you want. For instance, "take" has a grammar line

                * multi                     -> Take
quite early on. So if you want to add a grammar line which diverts "take something-edible" to a different action, like so:
                * edible                    -> Eat
(edible being a token matching anything which has the attribute edible) then it's no good adding this at the bottom of the Take grammar, because the earlier line will always be matched first. Thus, you really want to insert your line at the top, not the bottom, in this case. The right command is
Extend "take" first
                * edible                    -> Eat;
You might even want to over-ride the old grammar completely, not just add a line or two. For this, use
Extend "push" replace
                * Button                    -> PushButton;
and now "push" can be used only in this way. To sum up, Extend can take three keywords:
    replace    completely replace the old grammar with this one;
    first      insert the new grammar at the top of the old one;
    last       insert the new grammar at the bottom of the old one;

with last being the default (which doesn't need to be said explicitly).

/\ In library grammar, some verbs have many synonyms: for instance,
"attack" "break" "smash" "hit" "fight" "wreck" "crack"
"destroy" "murder" "kill" "torture" "punch" "thump"
are all treated as identical. But you might want to distinguish between murder and lesser crimes. For this, try
Extend only "murder" "kill" replace * animate -> Murder;
The keyword only tells Inform to extract the two verbs "murder" and "kill". These then become a new verb which is initially an identical copy of the old one, but then replace tells Inform to throw that away in favour of an entirely new grammar. Similarly,
Extend only "get" * "with" "it" -> Sing;
makes "get" behave exactly like "take" (as usual) except that it also recognises "with it", so that "get with it" makes the player sing but "take with it" doesn't. Other good pairs to separate might be "cross" and "enter", "drop" and "throw", "give" and "feed", "swim" and "dive", "kiss" and "hug", "cut" and "prune".

/\/\ Bear in mind that once a pair has been split apart like this, any subsequent extension made to one will not be made to the other.

/\/\ There are (a few) times when verb definition commands are not enough. For example, in the original 'Advent' (or 'Colossal Cave'), the player could type the name of a not-too-distant place which had previously been visited, and be taken there. There are several ways to code this -- say, with 60 rather similar verb definitions, or with a single "travel" verb which has 60 synonyms, whose action routine looks at the parser's verb_word variable to see which one was typed, or even by restocking the compass object with new directions in each room -- but here's another. The library will call the UnknownVerb routine (if you provide one) when the parser can't even get past the first word. This has two options: it can return false, in which case the parser just goes on to complain as it would have done anyway. Otherwise, it can return a verb word which is substituted for what the player actually typed. Here is a foolish example:
[ UnknownVerb w;
  if (w=='shazam') { print "Shazam!^"; return 'inventory'; }
  rfalse;
];
which responds to the magic word "shazam" by printing Shazam! and then, rather disappointingly, taking the player's inventory. But in the example above, it could be used to look for the word w through the locations of the game, store the place away in some global variable, and then return 'go'. The GoSub routine could then be fixed to look at this variable.

/\/\ EXERCISE 65: Why is it usually a bad idea to print text out in an UnknownVerb routine?

/\/\ If you allow a flexible collection of verbs (say, names of spells or places) then you may want a single 'dummy' verb to stand for whichever is being typed. This may make the parser produce strange questions because it is unable to sensibly print the verb back at the player, but you can fix this using the PrintVerb entry point.

/\/\ EXERCISE 66: Implement the Crowther and Woods feature of moving from one room to another by typing its name, using a dummy verb.

/\ EXERCISE 67: Implement a lamp which, when rubbed, produces a genie who casts a spell over the player to make him confuse the words "white" and "black".

*REFERENCES:
'Advent' makes a string of simple Verb definitions; 'Alice Through The Looking-Glass' uses Extend a little.
'Balances' has a large extra grammar and also uses the UnknownVerb and PrintVerb entry points.

Mechanically translated to HTML from third edition as revised 16 May 1997. Copyright © Graham Nelson 1993, 1994, 1995, 1996, 1997: all rights reserved.