Prexonite 1.2.2 – Power to the macro

TL;DR The new Prexonite release 1.2.2 is here!

In the nick of time…

When I announced my 3-month milestone cycle last time, I was sure that three months were waaay too long. Well, my gut feeling hasn’t betrayed me. The 1.2.2 release is late by over a week. Why? Two reasons: I had solved all the interesting problems at the beginning of the iteration; towards the end, all that was left was grunt work (re-implementing all call\* commands to support partial application). So my free time was increasingly taken over by other interesting activities, like StarCraft 2, exploring C++ and a super secret compiler-related tool/library. I only managed to finish the milestone by cutting one high-profile feature: automatic partial application of macros. But more on that later.

The Prexonite compiler architecture is really starting to show its age (read: less-than-ideal design). It is currently very easy to have the compiler generate invalid byte code (evaluation stack underflow and/or overflow) and there is nothing I can do to prevent it that isn’t a giant, ugly hack. I’d really love to completely rewrite the entire compiler of Prexonite (starting with the AST and code generation). But this is not something for the near future.

Macro Commands

Up until now, macros were in a very bizarre situation. For the first time, Prexonite Script code offered a feature with no counterpart in the managed (C#) world. Instead of functions, there are commands; Prexonite structs are mirrored by plain CLR types with IObject; compiler hooks and custom resolvers can both be implemented in either Prexonite Script or managed languages. But for macros, there was no equivalent – until now.

Let’s face it, the first implementation of Prexonite Script macros was a hack. I mean, injecting a fixed set of ‘magic variables’ into the function smelled from the very beginning. Obviously, the new managed macros and the interpreted ones should share as much architecture as possible, have the same capabilities and protections, etc. Could you have imagined the method signature of the managed mechanism had I kept those 6 variables around? And the versioning hell?!

Ever wondered why it is common practice to pack all your .NET event arguments into an EventArgs class? Same reason: you want to be able to evolve the interface without breaking the event (changing its signature). Correspondingly, the macro system now operates with a single value being passed to the macro implementation (whether it is written in Prexonite Script or managed code).

That one object, the macro context, provides access to the information previously available through those 6 variables. Mostly. The new interface is much more restrictive. It doesn’t give you direct access to the loader or the compiler target. This is a first step in turning macros from a compiler feature into a language/library feature. Ultimately, macros should have their own AST/internal representation, to be completely independent of the compiler.

Partial Application for macros

When designing programming language features, one should always strive for orthogonality. If there is a sensible way in which two features could be used together, it should probably be possible to use them together. This is not always easy, of course, and sometimes it is the one reason why you should not include a certain language feature.

In Prexonite Script, compile-time macros and partial application are kind of at odds with one another. Macros generate new code to be substituted, while partial application creates function values without generating new functions. These two features are not easily combined: in general, macros can expand to unique and completely unrelated code, and partially applying an arbitrary block of code is not possible (control flow into and out of the block cannot be captured by a function object, not in a transparent fashion).

Consequently, macro authors have to implement partial applications manually. Let’s see how you could implement a very simple logging macro that also supports partial application. It should accept a .NET format string, followed by an arbitrary number of arguments. When expanded, it should print the name and signature of the function from which it was called (expanded), along with the containing script file name, line and column, followed by the formatted message. It should obviously be possible to disable logging, so the macro should expand to null when the meta switch “debugging” is not set.

Now this just happens to be one of those macros that can be partially applied. It only uses the macro expansion mechanism to gather additional information (function name, source file, line and column). It doesn’t generate return statements, nor does it need to expand other macros in the same context. This means that you can implement the bulk of your macro in an ordinary function that takes two additional arguments: the source position and a reference to the calling function. The macro then only needs to gather this information and assemble a call to the implementing function.

function log\impl(pos, func, msg)
{
    // Format msg with all arguments after pos, func and msg.
    var finalMsg = call(msg.Format(?), var args >> skip(3));
    println(pos, ", ", func, ": ", finalMsg);
}

Generating this code via the AST wouldn’t be very fun anyway. Now on to the interesting part, the actual log macro.

macro log(msg) [is partial\macro;]
{
    if(context.Application.Meta["debugging"].Switch)
    {
        var funcStr = ast\const(context.Function.ToString); 
        var posStr = call\macro([__POS__]);
        
        var impl = ast\func(->log\impl.Id);
        impl.Call = context.Call;    
        impl.Arguments.Add(posStr);
        impl.Arguments.Add(funcStr);
        impl.Arguments.AddRange(var args);
        
        context.Block.Expression = impl;
    }
    
    return true;
}

On line 3, we check whether the debugging switch is enabled and skip the entire code generation if it isn’t. If we don’t generate any code, the macro system will supply an expression that evaluates to null.

The variable funcStr is initialized to a string literal containing the function’s name and signature (line 5). Calling ToString on the function object is necessary, because the constant node constructor ast\const expects its argument to be either a boolean, an integer, a double or a string. No implicit conversion is performed in order to avoid ambiguity.

On line 6, the source code position (file, line, column) is handled. We use a shorthand form of call\macro, where we can specify the macro call we’d like to perform in a list literal, and the implementation of call\macro (a macro command since this release) will translate it into a proper invocation. Written out in full, that would look somewhat like this:

var posStr = call\macro(macro\reference(__POS__));

Having to type macro\reference every time you call a macro is no fun, especially not if the callee macro is actually statically known (which is quite often the case). You can still supply argument lists as additional arguments, just like with all other call\* commands.
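To illustrate, here is a sketch of the two forms side by side. The macro name some\macro and the arguments x, y, z are invented for this example; only the list shorthand and the trailing argument lists are the point:

// The shorthand from above, written out in full:
var posStr = call\macro(macro\reference(__POS__));

// A hypothetical macro some\macro, with additional argument lists
// appended exactly as with the other call\* commands:
var r = call\macro([some\macro], [x], [y, z]);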

On line 8, we start assembling the invocation of our implementation function. We could have just supplied the string "log\\impl" to the function call node constructor ast\func, but the proper way to do this is to acquire a reference to that function and retrieve its Id. This way, if we change the physical name of the function (but retain the log\impl symbol), our macro will still work.

Copying the call type of our macro to the implementation function call on line 9 is very important for maintaining the illusion of log being an ordinary function. If you write log = "hello", you expect the "value" of this expression to be "hello", with the side effect of logging that message. If we don’t propagate the call type of the macro expansion to our function call expression, it will default to Get, which is not what the user expects.

Arguably, a macro should have no control over arguments outside of its argument list, but the way Prexonite understands the assignment operator, the right hand side is just another argument and is thus fair game as far as the macro system is concerned. It’s also not easy to detect a forgotten propagation of context.Call: even though most macros that are Set-called will expand to a Set-call expression, this is not a necessity for a correct implementation.

On lines 10 through 12, we add our arguments to the implementation function call: the source position, the function name, as well as all arguments passed to the log macro. Don’t forget to actually assign the function call to the expanded code block; I can’t remember how many times I’ve forgotten this. The result would be the macro-system-supplied null expression. So, if your macros ever produce null, double-check whether you have actually assigned (or returned) the generated AST nodes.
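Putting the walkthrough together: conceptually, an expansion of log substitutes an ordinary call to log\impl at the call site. The position and signature strings below are invented placeholders for what the macro would actually compute:

// What log("opening {0}", file) roughly expands to
// while the "debugging" switch is set:
log\impl("demo.pxs, line 12, column 5", "function process(file)", "opening {0}", file);

// ...and with the switch unset, the macro generates nothing, so the
// macro system substitutes an expression that evaluates to null.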

Almost done. Up until now, we have just written a normal macro with the Prexonite 1.2.2 macro system. To make it partially applicable, we need two more details. Firstly, we need to mark our macro with the macro switch partial\macro. This signals that our macro can attempt to expand a partial application. Because there are some situations where a macro cannot be partially applied (when the one thing that needs to be known at compile-time is missing), a macro needs to be able to reject expansion.

A partial macro (short for partially applicable macro) must return true; otherwise the macro system reports this particular expansion of the macro as "not partially applicable". Unlike normal macros, partial macros cannot return their generated AST; they must assign it to context.Block and/or context.Block.Expression. Also, the same macro code is used for both partial and total applications of the macro. You can use context.IsPartialApplication to distinguish the two cases.

However, if you build your macro carefully, you might not need to make that distinction. Our log macro doesn’t use this property at all, because the code we generate is the same in both scenarios. By building an ordinary invocation of log\impl, we don’t have to worry about partial application ourselves and can instead let the rest of the compiler worry about the details.
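To round things off, here is a hypothetical usage of log, both totally and partially applied (the function, the variable names and the messages are invented for this example):

function process(file)
{
    log("opening {0}", file);          // total application

    // Partial application: the macro expands here, so the recorded
    // position and function refer to process; the message argument
    // is supplied later through the ?-placeholder.
    var warn = log("warning: {0}", ?);
    warn.("file is empty");
}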

Calling Conventions

Prexonite now provides a whole family of very similar commands: call, call\member, call\async and call\macro, collectively known as call\*. They allow you to perform calls with a greater degree of control or different semantics. With the exception of call\async, these are also the "calling conventions" found in Prexonite: call uses the IIndirectCall interface (used by ref variables and the expr.(args) syntax), call\member is used for object member calls, and call\macro allows you to invoke other macros rather than expanding them.
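As a quick sketch of the first two members of the family (the function f, the object obj and the argument values are assumptions for this example):

// call: indirect call via IIndirectCall; the trailing argument
// lists are flattened into one sequence, i.e. f(1, 2, 3).
call(->f, [1, 2], [3]);

// call\member: invoke a member whose name is only known at runtime.
call\member(obj, "ToString", []);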

Unfortunately, these call\* commands didn’t play nicely with partial application. They were partially applicable, but only in a subset of useful cases. Take this usage of call\member, for instance:

call\member(obj,?,[arg,?])

You might think that it creates a partial application with two mapped arguments: the name of the member to invoke as well as the second argument. All other arguments to that partial application would be treated as argument lists. Prior to Prexonite 1.2.2, this was not the case, because partial application doesn’t look for placeholders in nested expressions. The list constructor would be partially applied and passed to call\member as an argument. And since partial applications are not argument lists, this would result in an exception at runtime.

Prexonite 1.2.2 contains a new calling convention, call\star, which, well, handles calling call\*-family commands/functions. While that alone isn’t very exciting, it also scans its arguments for partially applied lists and adopts their placeholders. It enables you to do this:

call\star(call\member(?),[obj,?,arg,?])

This expression will yield a partial application (even though none of call\star‘s direct arguments is a placeholder) that does what we expected our previous example to do. Of course, this is kind of unwieldy, which is why all other call\* commands have been rewritten as macros that take advantage of call\star. This means that in Prexonite 1.2.2, our example from above will actually work the way we expected it to.

Automatic partial application compatibility

One of the features I had planned for this release was a mechanism for automatically making some macros partially applicable. The user would announce (via a meta switch) that their macro conformed to certain constraints, and the macro system would automatically make it partially applicable. My idea was to take the code generated by the partially applied macro and "extra-line" it into a separate function, sharing variables where necessary.

Unfortunately, reality is somewhat more complicated. This kind of transplantation cannot be done in a completely transparent way. I’m also not sure if the Prexonite compiler is equipped to perform the necessary static analysis. (It would certainly be possible, but at the cost of maintainability; a hack, in other words.) Finally, this is not really what partial application stands for: the promise of not generating new functions for every call site would be broken.

I had a number of ideas for dealing with this issue: comparing generated functions with existing ones, trusting the user to generate the same code, and re-using the "extra-lined" function after the first macro expansion. But you have seen how straightforward partial macros are when you follow the pattern of implementing the bulk of the code in an implementation function. At this point, automatic partial application compatibility doesn’t seem worth pursuing. There are more important issues at hand, such as improving code re-use in macros.

The next milestone: Prexonite 1.2.3

The next milestone will focus on usability, compatibility and reliability. All very important topics (if a bit dull). I would like to

  • compile, run and test Prexonite on Ubuntu via Mono 2.8+
  • support Unix #! script directives
  • build a unit-testing "framework" for Prexonite
  • provide automated tests for the Prexonite standard repository (psr)
  • integrate those tests into the automated testing of Prexonite
  • make the language a little bit more friendly
  • automate testing of round-tripping (compile & load should be equivalent to interpreting directly)

While this doesn’t seem like a lot of new features for Prexonite itself, it will actually involve quite a lot of work. I don’t know, for instance, how I will address the issue of directory separator characters in statements like build does require("psr\\macro.pxs");. Also, for #! compatibility, I might have to completely change the command-line interface of Prx.exe. We’ll see how far I get. The fact that Ubuntu still doesn’t ship with a .NET 4.0-compatible version of Mono doesn’t help.

After that? That depends. I have some neat ideas for allowing scripts to specify minimum and maximum compiler and runtime versions. The macro system still needs some love; reusing code is difficult, to say the least. Block expressions are a possibility, though in its current form the compiler can’t handle them. Eventually I want to overcome this limitation, but that requires a rewrite of the entire compiler (or at least large parts of it).
