=head1 NAME opfunc - create operator precedence functions =head1 SYNOPSIS opfunc =head1 DESCRIPTION B is a Perl program which constructs operator precedence functions for an operator grammar. In particular, it constructs a pair of functions I and I which, for any grammar symbols I and I, satisfy the following properties: =over 4 =item - f(m) < f(n) whenever I yields precedence to I =item - f(m) = f(n) whenever I has the same precedence as I =item - f(m) > f(n) whenever I takes precedence over I =back The input to B is a plain text file containing an operator grammar, and the output is a table mapping each grammar symbol I to values f(x) and g(x) satisfying the above relationships. See below for a description of the input and output formats for B. =head1 INPUT FORMAT The input file is a plain text file, in which each line but the first describes the precedence relations for one symbol of the grammar. The first line is considered a "header", and should contain a sequence of token names, delimited by whitespace and/or commas. For example, the header for a simplified arithmetic grammar might look like this: id + * $ This means that there are four distinct terminal symbols in the grammar -- I, I<+>, I<*>, and I<$>. This last is typically used to denote the end of input in operator precedence parsing; however, note that there is no especial significance to any of these symbols -- you could as easily have written: name plus times end ...with equivalent meaning. It is merely important that each grammar symbol have a distinct representation, and that the header line should reflect the ordering of relations in the table which follows. Note that leading whitespace on the header line is ignored, as a formatting convenience. The remaining lines of the file should contain a grammar symbol, followed by a list of its precedence relations to the other grammar symbols. This list should give the relationships in the same linear order that was specified in the header line. Here, there are special tokens to indicate precedence relationships: =over 4 =item < yields precedence The symbol named at the beginning of this line yields precedence to the symbol in the corresponding column. =item = equal precedence The symbol named at the beginning of this line has the same precedence as the symbole in the corresponding column. =item > takes precedence The symbol named at the beginning of this line takes precedence over the symbol in the corresponding column. =back All other values are meaningless to B, and are ignored. If you wish to indicate that no precedence relation exists between a pair of symbols, you may use any non-blank text you wish. Here is an example of a complete operator grammar file suitable for input to the B program: id + * $ id x > > > + < > < > * < > > > $ < < < x It is left as an exercise to the reader to verify that this operator grammar actually encodes the correct precedence relations for infix addition and multiplication. =head1 OUTPUT FORMAT B prints its results to the standard output. The first line of the output is essentially the same as the header line of the input file, that is, the names of the grammar symbols delimited by tabs. The second line begins with I, and is followed by a tab-separated list of integer values which give the I function values for each corresponding grammar symbol. The third line does the same for I. For example, the output of B given the above example grammar file should be something like this: id + * $ f 4 2 4 0 g 5 1 3 0 This format should be comparatively easy to parse and render into a data structure for whatever language you're using to implement the operator precedence parser. =head1 RESTRICTIONS Because of trichotomy, operator precedence functions do not adequately express the case when no relation exists between two grammar symbols. This is an inherent limitation in the definition of precedence functions, rather than an outright bug, but you will need to address this problem when constructing your parser. Also, the program is fairly slow. The functional relationships are constructed by building a graph structure that reflects the interaction of the various functions, and then searching for longest paths in the transitive closure of each symbol. Doing this in Perl simplified the programming greatly, but at great penalty to performance. Presumably if you had a large grammar, you wouldn't be trying to use operator relations to parse it anyway, right? =head1 SEE ALSO I assume you understand operator precedence parsing already, if you have read this far already, but one good reference is Aho, Sethi and Ullman's I, otherwise known as "the Dragon book". This has a good explanation of the basics, and contains several other useful references to more fundamental literature. =head1 AUTHOR Michael J. Fromberger Department of Computer Science, Dartmouth College 6211 Sudikoff Laboratory, Hanover, New Hampshire, USA