If you are among the many people annoyed by Rakudo’s somewhat mediocre performance, last week’s changes may get you quite excited.
Let’s dig into the performance changes first:
- Rakudo used to box a fresh instance of 1 for each ++ or -- you put in your code. Now it just uses the ones found in the constant pool.
- In the new and shiny “spesh” branch of MoarVM, jnthn has published his work on building a Control Flow Graph and a Static Single Assignment (SSA) representation of MoarVM bytecode, which can then be optimized based on a “facts database”. In particular, the following optimizations for this “bytecode specializer” have been created so far:
  - Method calls on known types will, if possible, get resolved at “specialize-time”. If the type is not known, we add a little cache for the cases where the type turns out to be the same many times in a row.
  - The hllize operation (which is used to ensure things like boxed strings or integers, or arrays and such, have been transferred into the right High Level Language, for example from NQP to Perl 6) gets turned into a set operation if the type is known and it’s already in the right HLL.
  - The decont operation can likewise turn into a set operation if the objects that come in are known to already be decontainered.
  - The operations that fetch the arguments passed to a sub or method usually check beforehand that the proper number of arguments was passed. At the time we specialize, we already know exactly how many arguments got passed, so these operations all get replaced with quicker, specialized operations.
  - Likewise, the operations that fetch optional parameters are all conditional jumps that set default values if no values were passed. These jumps are now made unconditional at specialize-time, and the code that becomes unreachable gets removed from the specialized bytecode.
  - Operations that belong to the “if” or “unless” category and operate on known values will now be turned into unconditional jumps (or removed completely) at specialize-time.
  - If the type of something is known at specialize-time, the “istype” operation will turn into a “load a constant 0 or 1 into the register” operation, if possible.
  - A bunch of operations that usually would have to go indirectly through the REPR of the given type can now be inlined (when the type is known) and generate code that is faster and has less indirection. Currently, creating objects, getting and setting attributes, boxing and unboxing values, and finding out the number of elements an object has can be optimized by each REPR, though it has to be implemented for each REPR + operation individually. In particular, no boxing/unboxing or getattr speshes are implemented yet, but they likely take only a few minutes each to write.
- There are a few things still missing. For example, information gathered from turning operations into “constants” (like istype) is not yet aggressively propagated. The specializer also currently specializes anything that gets called more than 10 times, so it wastes a lot of time on things that are not actually “hot”. On the upside, that low threshold helps find out early if the specializer does anything wrong.
- The spesh branch will likely be merged this week in order for it to show up in the next MoarVM release. It currently regresses no spectests in Rakudo’s test suite, so that’s a good sign already!
- After I had tried, unsuccessfully, for a long time to get this particular optimization off the ground, jnthn implemented a much cleaner approach to it in an evening. The optimization in question turns lexical variables into locals if they are not used in nested blocks, and then turns nested blocks into inlined blocks if they don’t define any lexical variables. These two optimizations harmonize perfectly, and since the specializer doesn’t know how to operate on lexicals yet, it’s worth twice as much. Sadly, this optimization is only possible in NQP so far, as the analysis and care needed to make the same thing work in Perl 6 are much more complicated. On the other hand, since the compiler itself is largely written in NQP, every piece of compilation you do now is a bit faster.
- The work done in the spesh branch will serve as the basis for the JIT project that has been proposed for the GSoC.
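
To make the “istype” and “if”/“unless” items above a bit more concrete, here is a minimal sketch in Python of that kind of folding at specialize-time. It is not MoarVM’s spesh code: the tuple-based instruction format, the `specialize` function, the `known_types` facts dictionary, and the register and label names are all invented for illustration, and type checks are reduced to exact name matches (no subtyping).

```python
def specialize(code, known_types):
    """Toy specializer: fold istype and conditional jumps on known facts.

    code        -- list of instruction tuples (hypothetical format)
    known_types -- facts: register name -> type name known at specialize-time
    """
    # Register -> constant value. Safe to keep in a flat dict only because
    # we assume SSA form: every register is assigned exactly once.
    known_values = {}
    folded = []
    for ins in code:
        op = ins[0]
        if op == "istype" and ins[2] in known_types:
            # The answer is already known, so load a constant 0 or 1 instead.
            _, dst, src, wanted = ins
            value = 1 if known_types[src] == wanted else 0
            known_values[dst] = value
            folded.append(("const", dst, value))
        elif op == "const":
            known_values[ins[1]] = ins[2]
            folded.append(ins)
        elif op in ("if_i", "unless_i") and ins[1] in known_values:
            truthy = known_values[ins[1]] != 0
            taken = truthy if op == "if_i" else not truthy
            if taken:
                # The jump always happens: make it unconditional.
                folded.append(("goto", ins[2]))
            # Otherwise the jump never happens and is simply dropped.
        else:
            folded.append(ins)

    # Drop code that became unreachable: anything after a goto, up to the
    # next label that some remaining jump still refers to. This is a
    # conservative single pass, not a full control-flow analysis.
    targets = {ins[-1] for ins in folded
               if ins[0] in ("goto", "if_i", "unless_i")}
    result = []
    reachable = True
    for ins in folded:
        if ins[0] == "label" and ins[1] in targets:
            reachable = True
        if reachable:
            result.append(ins)
        if ins[0] == "goto":
            reachable = False
    return result


program = [
    ("istype", "r1", "r0", "Int"),   # r1 = 1 if r0 is an Int, else 0
    ("unless_i", "r1", "not_int"),   # jump to not_int if r1 is zero
    ("say", "fast Int path"),
    ("goto", "done"),
    ("label", "not_int"),
    ("say", "slow generic path"),
    ("label", "done"),
]

# Specialize for the case where r0 is known to hold an Int.
specialized = specialize(program, {"r0": "Int"})
for ins in specialized:
    print(ins)
```

With the fact that r0 is an Int, the type check collapses to a constant, the conditional jump disappears, and the slow path is removed as unreachable. Without any facts, the sketch leaves the program unchanged.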
And here are the non-performance-related changes:
- FROGGS introduced a variable, $*EXECUTABLE, which is an IO::Path object that points to the executable used in the given process.
- FROGGS also worked on a tool to build actual executables that carry MoarVM bytecode and command-line parameters inside them. These can then be statically linked to libmoar and be completely stand-alone.
- lizmat worked on “winner” quite a lot and through her work found out that the construct is in desperate need of a re-think and re-spec.
- dwarring has steadily been creating test cases for the advent calendar posts of the past years.
- retupmoca has merged patches to Rakudo and panda to support handling multiple state files, for example if you install some modules system-wide and other modules in your home directory.
- vendethiel helped me flesh out the tests for the “orelse” operator. Then we found out that I had misunderstood how orelse is supposed to handle exceptions. Oh well, lesson learned!
- Last week I pointed out that the I/O subsystem of MoarVM was lacking locks. This has been addressed and the I/O stuff should now be in very good shape for concurrency.