zondag 29 november 2015

Moar JIT news

Hello there, I thought it high time to write to you again and update you on the world of JITs. Since last I wrote, PyPy 4.0 was released. Also in python-land, Pyston 0.4 was released, and finally Guile 2.1.1 was released and Andy Wingo wrote a nice piece about that, as is his custom. I present these links not only to give these projects the attention they deserve, but also because I think they are relevant to our own project.

In chronological order, the release of PyPy 4.0 marked the first 'production' release of that projects' autovectorizer, which was developed over the course of this years Google Summer of Code. I'd like to take this opportunity to publicly congratulate the PyPy team on this achievement. So called 'vector' or SIMD operations perform a computation on multiple values in a single step and are an essential component of high-performance numerical computations. Autovectorizing refers to the compiler capability to automatically use such operations without explicit work by the programmer. This is not of great importance for the average web application, but it is very significant for scientific and deep learning applictions.

More recently, the Pyston project released version 0.4. Pyston is another attempt at an efficient implementation of Python, funded by Dropbox. Pyston is, or I should rather say, started out based on llvm. Most of my readers know of LLVM; for those who don't, it is a project which has somewhat revolutionised compiler development in the last few years. Its strengths are its high-quality cross-platform code generation with a permissive license. LLVM is also the basis for such languages as rust and julia. Notable weaknesses are size, speed, and complexity. To make a long story short, many people have high expectations of LLVM for code generation, and not without reason.

There are a few things that called my attention in the release post linked above. The first thing is that the Pyston project introduced a 'baseline' JIT compiler that skips the LLVM compilation step, so that JIT compiled code is available faster. They claim that this provides hardly a slowdown compared to the LLVM backend. The second thing is that they have stopped working on implementing LLVM-based optimisation. The third thing is that to support more esoteric python feature, Pyston now resorts to calling the Python C API directly, becoming sort of a hybrid interpreter. I would not be entirely surprised if the end point for Pyston would be life as a CPython extension module, although Conways law will probably prohibit that.

Pyston is not the first, nor the only current JIT implementation based on LLVM. It might be important to say here that there are many projects which do obtain awesome results from using LLVM; julia being a prime example. (Julia is also an excellent counterexample to the recent elitist celebration of self-declared victory by static typing enthusiasts assertion that 'static types have won', being very dynamic indeed). But Julia was designed to use the LLVM JIT, which probably means that tradeoffs have been made to assure performance; and it is also new, so it doesn't have to run as much weird legacy code; the team is probably highly competent. I don't know why some mileages vary so much (JavascriptCore also uses LLVM successfully, albeit as it's fourth and last tier). But it seems clear that far from being a golden gun, using LLVM for dynamic language implementations is a subtle and complex affair.

Anybody willing to try building an LLVM-backed JIT compiler for MoarVM, NQP, or perl6 in general, will of course receive my full (moral) support, for whatever that may be worth.

The posts by Andy Wingo, about the future of the Guile Scheme interpreter, are also well worth reading. The second post is especially relevant as it discusses the future of the guile interpreter and ways to construct a high-performance implementation of a dynamic language; it generalizes well to other dynamic languages. To summarise, there are roughly two ways of implementing a high-performance high-level programming language, dynamic or not, and the approach of tracing JIT compilers is the correct one, but incredibly complex and expensive and - up until now - mostly suitable for big corporations with megabucks to spend.
 Of course, we aim to challenge this; but for perl6 in the immediate future correctness far outranks performance in priority (as it should).

That all said, I also have some news on the front of the MoarVM JIT. I've recently fixed a very longstanding and complex bug that presented itself during the compilation of returns with named arguments by rakudo. This ultimately fell out as a missing inlined frame in the JIT compiler, which ultimately was caused by MoarVM trying to look for a variable using the JIT-compiler 'current location', while the actual frame was running under the interpreter,and  - this is the largest mystery - it was not deoptimized. I still do not know why that actually happened, but a very simple check fixed the bug.

I also achieved the goal of running the NQP and rakudo test suites under the new JIT compiler, albeit in a limited way; to achieve this I had to remove the templates of many complex operations, i.e. ops that call a C function or that have internal branches. The reason is that computing the flow of values beyond calls and branches is more complex, and trying to do it inline with a bunch of other things - as the new JIT has tried to so far - is prone to bugs. This is true especially during tree traversal, since it may not be obvious that computations relying on values may live in another context as the computations that generate these values.

In order to compile these more complex trees correctly, and understandably, I aim to disentangle the final  phases of compilation, that is, the stages of instruction selection, register allocation, and bytecode generation. Next to that I want to make the tiler internals and interface much simpler and user-friendly, and solve the 'implied costs problem'. The benefit of having the NQP test suite working means I can demonstrate the effects of changes much more directly, and more importantly, demonstrate whether individual changes work or not. I hope to report some progress on these issues soon, hopefully before christmas.

If you want to check out the progress of this work, checkout the even-moar-jit branch of MoarVM. I try, but not always successfully, to keep it up-to-date with the rapid pace of the MoarVM master branch. The new JIT only runs if you set the environment variable MVM_JIT_EXPR_ENABLE to a non-empty value. If you run into problems, please don't hesitate to report on github or on the #moarvm or #perl6 channels on freenode. See you next time!