It may help if I state the problem as clearly as I can. The input is a list of

*rules*mapping a

*head*and up to two

*child nodes*to a

*tile*and a

*nonterminal symbol*. A

*tile,*to reiterate, is ultimately a piece of code that emits machine code. The

*nonterminal symbol*(from now on:

*symbol*) is what takes the place of a node after a tile has been selected for it. This abstraction is what allows the tiler to determine the rule(set) for a node by looking only at it's direct children. Note that tile rules that refer to nested structures are 'flattened' by replacing nested children by special one-off symbols.

Many rules can be matched together. Take for instance these tiles starting with an

`add`

node. For x64, we can compile a register-to-register addition, a constant-to-register addition, and a memory-to-register addition. (x86 can also add from register to memory, but we ignore that).Rule 1: Adding register to register |

Rule 2: A tile that adds a constant value to a register |

Rule 3: A tile that adds a register to a value loaded from memory |

Given that both a

`(load mem)`

and a `(const)`

node can be tiled to *reg*symbols, we can always match

`(add reg reg)`

when matching an `add`

node. But it's not always possible to match `(add reg (const))`

or `(add reg (load mem))`

. And the last two certainly cannot be combined. So when the tiler inspects the `add`

node, it can emit the following tree combinations of tiles: `{1}, {1,2} and {1,3}`

. The problem I had was *given the complete set of rules, how can we find all permissible combinations that the tiler can generate?*

(As an aside, I don't think generating only the necessary rulesets is an optimization, but very much necessary for the feasibility of the tiler. The tiler tables otherwise become either prohibitively large, or incomplete).

Why can we combine 1 and 3? Because the

`(load mem)`

node can generate a *reg*symbol as well as the 'placeholder' symbol generated in rule 3 (let's call that the

*3a*symbol).

*Similarly, the*

`(const)`

node can generate a *reg*as well as the

*2a*symbol. Indeed,

**rules can be combined whenever the symbols they refer to can be combined.**And they are separate whenever the symbols they refer to can occur separately. So the key to finding all permissible combinations is finding all possible combinations of symbols. That is something of a simpler problem.

First of all, all rules that refer to the same combination-of-terminals can be combined themselves, generating new combinations of terminals. Second, some terminal combinations may not be generated at all (

*3a*will never occur in isolation, for example). Finally, when we

*do*have all possible sets of terminals, applying all rules to each set will continue to generate these combinations. We can find rules which combine in such a way easily using a trie. So the solution to finding all permissible combinations of rules is this:

- Apply each rule to a trie of symbol combinations and store the rule number and symbol in each leaf.
- Extract all combinations of symbols that resolve to a single trie leaf.
- Compare the symbol combinations to those used in step
*1*. If there are any changes, repeat from step*1*. - The trie leafs also now also contain all possible rule combination, which simply need to be given a number.

A further change I had to make was to split the finding of rulesets - a bottom-up procedure - to the selection of the right rules, which is a top-down procedure. It is top-down, because the rule selected at a node determines the rules that can be applied for it's child nodes. All that is needed is a table that stores the minimum-cost rule to generate a given symbol for a given tile set, and a root rule to start with. (NB: I'm not 100% sure that's true, but it's my working hypothesis, and it can't generate wrong code, only suboptimal code.)

So that was my update. See you next time!