zondag 16 augustus 2015

Tiler Update

In my last blog I talked a bit about a problem I had with generating all possible sets of rules from the tile 'grammar'. I've since solved that problem, and thought it'd be nice to share my results. (NB: I use the word 'grammar' loosely today and on other days. In compiler jargon a grammar is usually a thing that matches a linear stream of lexical tokens to a tree structure. In this case, it's really the other way around, because it's used to match a tree and output a linear sequence. However, these things are quite similar, so I hope you'll forgive me for using the term).

It may help if I state the problem as clearly as I can. The input is a list of rules mapping a head and up to two child nodes to a tile and a nonterminal symbol. A tile, to reiterate, is ultimately a piece of code that emits machine code. The nonterminal symbol (from now on: symbol) is what takes the place of a node after a tile has been selected for it. This abstraction is what allows the tiler to determine the rule(set) for a node by looking only at it's direct children.  Note that tile rules that refer to nested structures are 'flattened' by replacing nested children by special one-off symbols.

Many rules can be matched together. Take for instance these tiles starting with an add node. For x64, we can compile a register-to-register addition, a constant-to-register addition, and a memory-to-register addition. (x86 can also add from register to memory, but we ignore that).


Rule 1: Adding register to register


Rule 2: A tile that adds a constant value to a register

Rule 3: A tile that adds a register to a value loaded from memory


Given that both a (load mem) and a (const) node can be tiled to reg symbols, we can always match (add reg reg) when matching an add node. But it's not always possible to match (add reg (const)) or (add reg (load mem)). And the last two certainly cannot be combined. So when the tiler inspects the add node, it can emit the following tree combinations of tiles: {1}, {1,2} and {1,3}. The problem I had was given the complete set of rules, how can we find all permissible combinations that the tiler can generate?

(As an aside, I don't think generating only the necessary rulesets is an optimization, but very much necessary for the feasibility of the tiler. The tiler tables otherwise become either prohibitively large, or incomplete).

Why can we combine 1 and 3? Because the (load mem) node can generate a reg symbol as well as the 'placeholder' symbol generated in rule 3 (let's call that the 3a symbol). Similarly, the (const) node can generate a reg as well as the 2a symbol. Indeed, rules can be combined whenever the symbols they refer to can be combined. And they are separate whenever the symbols they refer to can occur separately. So the key to finding all permissible combinations is finding all possible combinations of symbols. That is something of a simpler problem.

First of all, all rules that refer to the same combination-of-terminals can be combined themselves, generating new combinations of terminals. Second, some terminal combinations may not be generated at all (3a will never occur in isolation, for example). Finally, when we do have all possible sets of terminals, applying all rules to each set will continue to generate these combinations. We can find rules which combine in such a way easily using a trie. So the solution to finding all permissible combinations of rules is this:
  1. Apply each rule to a trie of symbol combinations and store the rule number and symbol in each leaf.
  2. Extract all combinations of symbols that resolve to a single trie leaf.
  3. Compare the symbol combinations to those used in step 1. If there are any changes, repeat from step 1.
  4. The trie leafs also now also contain all possible rule combination, which simply need to be given a number.
The idea for this algorithm I derived from the table generation algorithm for parser generators, to which it is also quite similar.

A further change I had to make was to split the finding of rulesets - a bottom-up procedure - to the selection of the right rules, which is a top-down procedure. It is top-down, because the rule selected at a node determines the rules that can be applied for it's child nodes. All that is needed is a table that stores the minimum-cost rule to generate a given symbol for a given tile set, and a root rule to start with. (NB: I'm not 100% sure that's true, but it's my working hypothesis, and it can't generate wrong code, only suboptimal code.)

So that was my update. See you next time!

Geen opmerkingen:

Een reactie posten