Prexonite 1.2.3 – Best Friends Forever

I was tempted to just skip this silly version number, but here we go: I humbly present to you

Prexonite v1.2.3

(No, I couldn’t resist :3 )

As always, Prexonite release 1.2.3 comes with

Sadly, Linux compatibility didn’t make the cut, so Prexonite still only builds on Windows. The binaries might or might not work with Mono.

Stability

In spite of the silly version number, v1.2.3 is a very serious release. A lot of the work of the past couple of months went into making sure that you could depend on what I ship as part of Prexonite. I extended the automated test suite to cover the PSR, while also running existing test cases in multiple configurations.

I don’t really want to call it a long-term-support version, because unless there actually is a need for support (in the form of backported patches), it would be an empty phrase. The idea is more that, starting with the next version, I will be bold and start extending the language in a way that will change how you write Prexonite Script programs.

Breaking changes

This release of Prexonite Script comes with two breaking changes that affect a number of existing programs, both compiled and in script form.

var args fallback was removed

In Prexonite, all local variables are initialized to null, with the notable exception of var args, which holds the list of arguments passed to the function. However, in previous versions of Prexonite, there was a rule that allowed you to have a function parameter called “args” that would not get overwritten with the argument list. Instead the “magic variable” for that particular function would be “\args”. If there was a parameter called “\args”, the magic variable would be called “\\args” and so on.

This behavior is unintuitive, to say the least. It also makes macros that access the argument list hard to write. So, starting with this release, the var args fallback mechanism is no longer part of Prexonite. If you have a parameter called args, tough luck. This change affects the Prexonite execution engines, which means that both compiled programs and scripts are affected.
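To illustrate (a minimal sketch; `shout` is a made-up function name):

// Hypothetical illustration of the removed fallback.
function shout(args)
{
    // before v1.2.3: the magic argument-list variable was renamed to \args
    //                so that the parameter `args` kept its value
    // since  v1.2.3: there is no fallback; `args` refers to the argument-list
    //                variable, so avoid parameters with that name
    println(args);
}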

psr\ast.pxs SI is now implemented as a function

SI from psr\ast.pxs is a very handy shortcut for Prexonite compiler constants and utility functions. SI stands for “SymbolInterpretation”, but has grown to also provide call and return mode constants. In short, it appears very often in macro code. Previously it was implemented as a global variable that was initialized by some code in a build block. While this works perfectly for both the interactive interpreter and just running the script, it doesn’t support serialization (storing the in-memory representation of an application in a *.c.pxs file). While a serialized application can be loaded and executed just fine, it can’t be used to continue compilation, because the build-code that initialized the SI variable at compile-time is absent.

The solution is to implement SI as a function that lazily initializes the global variable. Due to the principle of information hiding underlying Prexonite Script’s function invocation syntax (parentheses are optional if the argument list is empty), this doesn’t pose a problem for existing script code. Existing compiled code doesn’t matter because of static linking. The one area where the change breaks code is with meta macros (macros that write macro code). Code generated by macro code often refers to SI, because it is simpler to generate than an access to the corresponding static CLR fields (and more performant).
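The pattern is roughly the following (a sketch only; `si\backing` and `create\SI` are made-up names, not the actual psr\ast.pxs internals):

// Sketch of lazy initialization behind a function.
var si\backing;

function SI()
{
    if(si\backing == null)
        si\backing = create\SI;  // runs on first use, also after deserialization
    return si\backing;
}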

So: If your code refers to ast\gvar(“SI”), you’re in trouble. Now, you could just use ast\func(“SI”), but psr\macro.pxs comes with a better option: ast\getSI gives you a node referring to SI, whatever SI is in your distribution. Just Ctrl+F your code, searching for “SI” (with the quotes) and replace the ast\gvar expression with ast\getSI.
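In macro code, the migration looks roughly like this (a sketch; `siNode` is a made-up variable name):

// inside a macro, before (breaks once SI becomes a function):
var siNode = ast\gvar("SI");

// after (works no matter how SI is implemented in your distribution):
var siNode = ast\getSI;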

Improved round-tripping

Up to this point, storing a compiled application into a *.c.pxs file altered that application in a significant way: in memory, not all functions and global variables are represented by a symbol. Inner functions, for instance, have no global name that you could refer to. When you store such an application, all of these “hidden” functions are written out as top-level functions. When the file is loaded again, the compiler naturally can’t differentiate between functions that were “originally” top-level and “hidden” functions. True, the `ParentFunction` meta key could be consulted as a heuristic, but there are other examples of functions that have no global symbol (e.g., functions generated by a macro).

Starting with Prexonite v1.2.3, there is a new meta switch called ‘\sps’, which is short for “Suppress Primary Symbol”. It instructs the compiler not to create a global symbol for the defined name of the function (the id that immediately follows the function keyword). Instead, all symbols are declared explicitly in the symbols section at the end of the file.

If you previously relied on all functions getting global symbols after round-tripping (storing-then-loading an application), you’ll be in trouble: The storage algorithms now add \sps to all functions (even originally top-level ones) and declare all global symbols explicitly.
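As a rough sketch (the exact serialized form may well differ; `anonymous\0` is a made-up defined name):

// a stored function carrying the new switch; no primary symbol
// `anonymous\0` is created for it
function anonymous\0(x) [\sps;]
{
    return x + 1;
}
// all symbols, including those of originally top-level functions,
// are then declared explicitly in the symbols section of the stored file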

Safety and coroutines

Prexonite comes with two execution engines: a bytecode interpreter and a CIL compiler. The former is used by default. You have to explicitly request CIL compilation by issuing the CompileToCil command at runtime (at this time, you cannot compile Prexonite byte code ahead of time). From that point on, the CIL implementation of a function is always the preferred way of executing it. The interpreter only acts as a fallback mechanism. And there is a good reason for that besides performance: safety.

Exception handling in the bytecode interpreter is really just “best-effort”. After an exception occurs, I cannot guarantee that I will be able to execute the corresponding catch or finally block. This is mostly due to the open nature of the interpreter API. It is possible for a stack context to remove itself from the stack, preventing the interpreter from applying the handlers in that stack frame to exceptions. Even worse, from the interpreter’s view, functions can just stop executing at any time for no apparent reason, for instance when they are used as coroutines.

While exception handling itself will usually work as advertised, finally-blocks in coroutines are fundamentally broken. They only work when an exception is thrown or when control gets transferred naturally (i.e., reaches the end of the try block). But especially for coroutines there is a very important third scenario: The sequence stops being evaluated at some point (code might just look at the first couple of elements and then abandon the sequence). If your sequence is holding onto an opened file, you’ll be in trouble.
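A sketch of the problematic pattern (the resource and its members `HasMore`, `Next` and `Dispose` are made-up placeholders; the point is only the placement of yield):

function read_all(res)
{
    try
    {
        while(res.HasMore)
            yield res.Next;  // yield inside a protected block
    }
    finally
    {
        res.Dispose;         // not guaranteed to run if the caller abandons
                             // the sequence after the first few elements
    }
}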

Exceptions are a cool concept but they are also hard to implement correctly (at least in relation to any particular language feature). Coroutine support was easy to add to the interpreter for precisely the same reason it is not safe: The interpreter does not have tight control over the code it executes. The ability to abandon a computation at any point made coroutines almost trivial to implement, while making correct exception handling almost impossible.

For the nearer future this has the following consequences:

  • You should use the CIL execution engine whenever possible. Prexonite v1.2.3 still does not compile to CIL by default, however!
  • The interpreter will remain part of Prexonite, mostly as a “quick and dirty” way of executing code (build blocks, initialization code, REP-loops)
  • The syntactic support for coroutines will remain part of the Prexonite language. I might one day implement a coroutine transformation targeting CIL.
  • The compiler tries to detect and warn about usages of yield inside protected blocks. Yield statements outside protected blocks (i.e., in a naked for-loop) remain perfectly safe.
  • You can use the syntax `new enumerator(fMoveNext, fCurrent, fDispose)` to create native Prexonite sequences (no boxing of CLR objects involved). This can be more performant than coroutines, since the individual functions can be compiled to CIL; see the sketch after this list.
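As a rough sketch of that last point (assuming inner functions close over locals the way closures do elsewhere in Prexonite Script; `count_to` is a made-up example):

// a sequence of the numbers 1 through n, built without a coroutine
function count_to(n)
{
    var i = 0;
    function moveNext()
    {
        i++;
        return i <= n;
    }
    function current() = i;  // the element the sequence currently points at
    function dispose() { }   // nothing to clean up in this example
    return new enumerator(->moveNext, ->current, ->dispose);
}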

Making the language more friendly

Prexonite Script has never been a simple language, but I have used this release to make some aspects of it a bit less verbose and more programmer-friendly:

Lists allow for trailing comma (‘,’)

Whenever you encounter a comma-separated list in Prexonite Script, it is now legal to add an additional comma at the end. This even goes for argument lists, making `max(a,b,)` a perfectly valid invocation of the max function. Similar features can be found in many of the more pragmatic languages out in the wild. It makes manipulating lists that represent data (often found in embedded domain-specific languages) much easier and simplifies Prexonite Script code generators (they no longer have to worry about suppressing the comma after the last entry).
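For instance (a trivial illustration; the function and variable names are made up):

function example_days()
{
    // every entry, including the last one, may now be followed by a comma
    var weekdays = [
        "Monday",
        "Tuesday",
        "Wednesday",
    ];
    return weekdays;
}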

More sane meta information blocks

In a similar move, meta entries in a meta block no longer need to be terminated by a semicolon (‘;’); they are instead separated by semicolons, while trailing semicolons remain allowed. Existing code (where there is always a trailing semicolon) thus remains valid, while the common single-entry meta blocks look less silly.

[is test;] vs. [test]

But wait! Where did the ‘is’ keyword go? Well, that is now optional too. Essentially, whenever you omit the value of a meta entry, it is interpreted as a boolean switch. This also works on the global level, but there the (terminating) semicolon remains mandatory.
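Put together, and as a sketch (the function names t1 through t3 are made up):

// all three attach the boolean switch `test` to the function
function t1() [is test;] { }
function t2() [is test]  { }  // the trailing semicolon is now optional
function t3() [test]     { }  // `is` can be dropped when the value is omitted

// on the global level the value can be omitted as well,
// but the terminating semicolon remains mandatory:
debugging;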

Reliability

Up to this point, the scripts found in the PSR (please don’t ask what this acronym stands for) were never really considered a part of Prexonite/Prexonite Script. I just shipped them with every release and made them available to all scripts loaded via Prx.exe. Their interfaces changed frequently and they were often quite buggy.

There never was a need to consider them as such. The PSR isn’t really a standard or runtime library. It’s just a collection of scripts that I found useful for day-to-day hacking in Prexonite Script. Prexonite itself doesn’t depend on the PSR in any way (which makes writing the command line tool a bit awkward at times).

Things have changed a bit with the inclusion of psr\macro.pxs, which is pretty much the only sane way to write macros in Prexonite Script and has de facto attained the status of a proper standard library. As a consequence, I have started writing automated tests (of the unit/integration sort) for the most widely used PSR files.

I don’t have an automated testing framework for Prexonite Script files yet, but you can use psr\test.pxs today to start implementing your own tests. In the spirit of xUnit, it features a meta tag “test” and an “assert” function. Running all tests in an application is accomplished via “run_tests”. You can also specify additional tags that need to be present for a test to be included in the suite. By default, “run_tests” renders its results to standard output via what is called the “basic_ui”. You can provide your own “ui” (a structure that supports certain operations). For more details, have a look at the implementation of basic_ui. You can of course use psr\struct.pxs to create the structure; I just didn’t want psr\test.pxs to depend on psr\struct.pxs and, through that, on psr\macro.pxs.
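A minimal sketch of what such a test could look like, based on the description above (assuming assert takes a condition and a message; exact details may differ):

// a test is an ordinary function tagged with the `test` meta switch
function addition_works() [test]
{
    assert(1 + 2 == 3, "1 + 2 should equal 3");
}

// somewhere in the application, run the whole suite via the basic_ui
function main() = run_tests;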

Through some arcane trickery (a bit of T4 and a Prexonite Script file), the automated tests written using psr\test.pxs are executed as part of the NUnit test suite for Prexonite itself. One test case (a function) in Prexonite Script is mapped to a set of test cases in the NUnit world.

The way forward

The next release, v1.2.4, will not be an ordinary version. After this rather uninteresting iteration, I will be a bit more bold in the next one, as hinted at in the first section. While staying largely backwards compatible, v1.2.4 will include a number of experimental features that might be disabled by default. It will also break the Prx.exe command line interface. Look for more details in future posts.

Prexonite 1.2.2 – Power to the macro

TL;DR The new Prexonite release 1.2.2 is here!

In the nick of time…

When I announced my 3-month milestone cycle last time, I was sure that three months were waaay too long. Well, my gut feeling hasn’t betrayed me: the 1.2.2 release is late by over a week. Why? Two reasons: I had solved all the interesting problems at the beginning of the iteration; towards the end, all that was left was grunt work (re-implementing all call\* commands to support partial application). So my free time got more and more taken over by other, more interesting activities, like StarCraft 2, exploring C++ and a super secret compiler-related tool/library. I only managed to finish the milestone by cutting one high-profile feature, automatic partial application of macros, but more on that later.

The Prexonite compiler architecture is really starting to show its age (read: less-than-ideal design). It is currently very easy to have the compiler generate invalid byte code (evaluation stack underflow and/or overflow), and there is nothing I can do to prevent it that isn’t a giant, ugly hack. I’d really love to completely rewrite the entire compiler (starting with the AST and code generation), but this is not something for the near future.

Macro Commands

Up until now, macros were in a very bizarre situation. For the first time, Prexonite Script code offered a feature with no equal counterpart in the managed (C#) world. Where Prexonite Script has functions, the managed world has commands; Prexonite structs are mirrored by plain CLR types with IObject; compiler hooks and custom resolvers can both be implemented in either Prexonite Script or managed languages. But for macros, there was no equivalent – until now.

Let’s face it, the first implementation of Prexonite Script macros was a hack. I mean, injecting a fixed set of ‘magic variables’ into the function smelled right from the beginning. Obviously, the new managed macros and the interpreted ones should share as much architecture as possible, have the same capabilities and protections etc. Could you have imagined the method signature of the managed mechanism had I kept those 6 variables around? And the versioning hell?!

Ever wondered why it is common practice to pack all your .NET event arguments in an EventArgs class? Same reason: you want to be able to evolve the interface without breaking the event (changing its signature). Likewise, the macro system now operates with a single value being passed to the macro implementation (whether it is written in Prexonite Script or managed code).

That one object, the macro context, provides access to the information previously available through those 6 variables. Mostly. The new interface is much more restrictive. It doesn’t give you direct access to the loader or the compiler target. This is a first step in turning macros from a compiler feature into a language/library feature. Ultimately, macros should have their own AST/internal representation, to be completely independent of the compiler.

Partial Application for macros

When designing programming language features, one should always strive for orthogonality. If there is a sensible way in which two features could be used together, it should probably be possible to use them together. This is not always easy, of course, and sometimes it is the one reason why you should not include a certain language feature.

In Prexonite Script, compile-time macros and partial application are kind of at odds with one another. Macros generate new code to be substituted, while partial application creates function values without generating new functions. These two features are not easily combined: in general, macros can expand to unique and completely unrelated code. Partially applying an arbitrary block of code is not possible (control flow into and out of the block cannot be captured by a function object, not in a transparent fashion).

Consequently, macro authors have to implement partial applications manually. Let’s see how you could implement a very simple logging macro that also supports partial application. It should accept a .NET format string, followed by an arbitrary number of arguments. When expanded, it should print the name and signature of the function from which it was called (expanded), along with the containing script file name, line and column, followed by the formatted message. It should obviously be possible to disable logging; thus, the macro should expand to null when the meta switch “debugging” is not set.

Now, this just happens to be one of those macros that can be partially applied. It only uses the macro expansion mechanism to gather additional information (function name, source file, line and column). It doesn’t generate return statements or need to expand other macros in the same context. This means that you can implement the bulk of your macro in an ordinary function that takes two additional arguments: the source position and a reference to the calling function. The macro then only needs to gather this information and assemble a call to the implementing function.

function log\impl(pos, func, msg) 
{ 
    var finalMsg = call(msg.Format(?),var args >> skip(3)); 
    println(pos, ", ", func, ": ", finalMsg); 
}

Generating this code via the AST wouldn’t be very fun anyway. Now on to the interesting part, the actual log macro.

macro log(msg) [is partial\macro;]
{
    if(context.Application.Meta["debugging"].Switch)
    {
        var funcStr = ast\const(context.Function.ToString); 
        var posStr = call\macro([__POS__]);
        
        var impl = ast\func(->log\impl.Id);
        impl.Call = context.Call;    
        impl.Arguments.Add(posStr);
        impl.Arguments.Add(funcStr);
        impl.Arguments.AddRange(var args);
        
        context.Block.Expression = impl;
    }
    
    return true;
}

On line 3, we check whether the debugging switch is enabled and skip the entire code generation if it isn’t. If we don’t generate any code, the macro system will supply an expression that evaluates to null.

The variable funcStr is initialized to a string literal containing the function’s name and signature (line 5). Calling ToString on the function object is necessary, because the constant node constructor ast\const expects its argument to be either a boolean, an integer, a double or a string. No implicit conversion is performed in order to avoid ambiguity.

On line 6, the source code position (file, line, column) is handled. We use a shorthand form of call\macro, where we can specify the macro call we’d like to perform in a list literal and the implementation of call\macro (a macro command since this release) will translate it into a proper invocation of call\macro. That would look somewhat like this:

var posStr = call\macro(macro\reference(__POS__));

Having to type macro\reference every time you call a macro is no fun, especially not if the callee macro is statically known (which is quite often the case). You can still supply argument lists as additional arguments, just like with all other call\* commands.

On line 8, we start assembling the invocation of our implementation function. We could have just supplied the string "log\\impl" to the function call node constructor ast\func, but the proper way to do this is to acquire a reference to that function and retrieve its Id. This way, if we change the physical name of the function (but retain the log\impl symbol), our macro will still work.

Copying the call type of our macro to the implementation function call on line 9 is very important for maintaining the illusion that log is an ordinary function. If you write log = "hello", you will expect the "value" of this expression to be "hello", with the side effect of logging that message. If we don’t propagate the call type of the macro expansion to our function call expression, it will default to Get, which is not what the user expects.

Arguably, a macro should have no control over arguments outside of its argument list, but the way Prexonite understands the assignment operator, the right-hand side is just another argument and is thus fair game as far as the macro system is concerned. It’s also not easy to detect a forgotten propagation of context.Call: even though most macros that are Set-called will expand to a Set-call expression, this is not a necessity for a correct implementation.

On lines 10 through 12, we add our arguments to the implementation function call: the source position, the function name, as well as all arguments passed to the log macro. Don’t forget to actually assign the function call to the expanded code block; I can’t remember how many times I’ve forgotten this. The result would be the macro-system-supplied null expression. So, if your macros ever produce null, double-check whether you have actually assigned (or returned) the generated AST nodes.

Almost done. Up until now, we just wrote a normal macro with the Prexonite 1.2.2 macro system. To make it partially applicable, we need two more details. Firstly, we need to mark our macro with the partial\macro switch. This signals that our macro can attempt to expand a partial application. Because there are situations where a macro cannot be partially applied (when the one thing that needs to be known at compile time is missing), a macro needs to be able to reject the expansion.

A partial macro (short for partially applicable macro) must return true; otherwise, the macro system reports this particular expansion of the macro as "not partially applicable". Unlike normal macros, partial macros cannot return their generated AST; they must assign it to context.Block and/or context.Block.Expression. Also, the same macro code is used for both partial and total applications of the macro. You can use context.IsPartialApplication to distinguish the two cases.

However, if you build your macro carefully, you might not need to make that distinction. Our log macro doesn’t use this property at all, because the code we generate is the same in both scenarios. By building an ordinary invocation of log\impl, we don’t have to worry about partial application ourselves and can instead let the rest of the compiler worry about the details.
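For completeness, a hypothetical usage of the macro, both totally and partially applied (assuming the application has the debugging switch set):

function main()
{
    var answer = 42;
    log("the answer is {0}", answer);  // expands to a call to log\impl

    // partial application: fix the format string now, supply the value later
    var logAnswer = log("the answer is {0}", ?);
    logAnswer(answer);
}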

Calling Conventions

Prexonite now provides a whole family of very similar commands: call, call\member, call\async and call\macro, collectively known as call\*. They allow you to perform calls with a larger degree of control or different semantics. With the exception of call\async, these are also the "calling conventions" found in Prexonite. call uses the IIndirectCall interface (used by ref variables and the expr.(args) syntax), call\member is used for object member calls, and call\macro allows you to invoke other macros rather than expanding them.
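Very roughly, and only as a sketch (assuming references behave as in the examples elsewhere in this post; `greet` and `demo` are made-up names):

function greet(who, greeting)
{
    println(greeting, ", ", who, "!");
}

function demo()
{
    // call: indirect call through a reference; argument lists are flattened
    call(->greet, ["World"], ["Hello"]);  // same as greet("World", "Hello")

    // call\member: object member call by name
    println(call\member("prexonite", "ToUpper"));  // roughly "prexonite".ToUpper
}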

Unfortunately, these call\* commands didn’t play nicely with partial application. They were partially applicable, but only in a subset of useful cases. Take this usage of call\member, for instance:

call\member(obj,?,[arg,?])

You might think that it creates a partial application with two mapped arguments: the name of the member to invoke and the second argument. All other arguments to that partial application would be treated as argument lists. Prior to Prexonite 1.2.2, this was not the case, because partial application doesn’t look for placeholders in nested expressions. The list constructor would be partially applied and passed to call\member as an argument. And since partial applications are not argument lists, this results in an exception at runtime.

Prexonite 1.2.2 contains a new calling convention, call\star, which, well, handles calling call\*-family commands/functions. While that alone isn’t very exciting, it also scans its arguments for partially applied lists and adopts their placeholders. It enables you to do this:

call\star(call\member(?),[obj,?,arg,?])

This expression will yield a partial application (even though none of call\star’s direct arguments is a placeholder) that does what we expected our previous example to do. Of course, this is kind of unwieldy, which is why all other call\* commands have been rewritten as macros to take advantage of call\star. This means that in Prexonite 1.2.2, our example from above will actually work the way we expected it to.
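A made-up usage sketch of the now-working form (`account`, `mainAccountId` and the `Transfer` member are purely illustrative):

function demo(account, mainAccountId)
{
    var transferFromMain = call\member(account, ?, [mainAccountId, ?]);

    // the two placeholders map to the member name and the second argument;
    // additional arguments would be treated as argument lists
    transferFromMain("Transfer", 100);  // roughly account.Transfer(mainAccountId, 100)
}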

Automatic partial application compatibility

One of the features I had planned for this release was a mechanism for automatically making some macros partially applicable. The user would have to announce (via a meta switch) that their macro conformed to certain constraints, and the macro system would automatically have made it partially applicable. My idea was to take the code generated by the partially applied macro and "extra-line" it into a separate function, sharing variables where necessary.

Unfortunately, the reality is somewhat more complicated. This kind of transplantation cannot be done in a completely transparent way. I’m also not sure if the Prexonite compiler is equipped to perform the necessary static analysis (it would certainly be possible, but at the cost of maintainability; a hack, in other words). Finally, this is not really what partial application stands for: the promise of not generating new functions for every call site would be broken.

I had a number of ideas for dealing with this issue: comparing generated functions with existing ones, or trusting the user to generate the same code and re-using the "extra-lined" function after the first macro expansion. But you have seen how straightforward partial macros are when you follow the pattern of implementing the bulk of the code in an implementation function. At this point, automatic partial application compatibility doesn’t seem worth pursuing. There are more important issues at hand, such as improving code re-use in macros.

The next milestone: Prexonite 1.2.3

The next milestone will focus on usability, compatibility and reliability. All very important topics (if a bit dull). I would like to

  • compile, run and test Prexonite on Ubuntu via Mono 2.8+
  • support Unix #! script directives
  • have a unit-testing "framework" for Prexonite
  • provide automated tests for the Prexonite standard repository (PSR)
  • integrate those tests into the automated testing of Prexonite itself
  • make the language a little bit more friendly
  • automate testing of round-tripping (compile & load should be equivalent to interpreting directly)

While this doesn’t seem like a lot of new features for Prexonite itself, it will actually involve quite a lot of work. I don’t know, for instance, how I will address the issue of directory separator characters in `build does require("psr\\macro.pxs");` and similar statements. Also, for #! compatibility, I might have to completely change the command-line interface of Prx.exe. We’ll see how far I get. The fact that Ubuntu still doesn’t ship with a .NET 4.0-compatible version of Mono doesn’t help.

After that? That depends. I have some neat ideas for allowing scripts to specify minimum and maximum compiler and runtime versions. The macro system still needs some love; reusing code is difficult, to say the least. Block expressions are a possibility, though in its current form the compiler can’t handle those. Eventually I want to overcome this limitation, but that requires a rewrite of the entire compiler (or at least large parts of it).

Project Announcement: TinCal

I’d like to announce a project of mine that has been in planning since the end of January 2010: “TinCal”. It’s a project I’ve been working on and off (mostly off) for the past ten months, besides school, Prexonite and another secret project. At this point, TinCal could become anything from vaporware to my next “successful” (i.e., finished) programming language (and maybe bachelor’s thesis). So yes, it is a programming language and thus the successor project to Prexonite, and as before, I’m aiming much higher this time. TinCal is definitely going to be statically typed and to compile directly to Common Intermediate Language (.NET byte code). But I don’t plan to stop there: true to my love for functional programming in general and Haskell in particular, TinCal is going to be a language extremely similar to the most successful lazy programming language to date.


Multithreaded Prexonite

Yesterday, I watched the Google TechTalk presentation of the programming language Go (see golang.org). Even though I don’t believe Go will become very popular, there was one aspect of the language that “inspired” me. One of Go’s major advantages is the fact that concurrency (and the coordination thereof) is built deep into the language. Every function has the potential to become what they call a “goroutine” that works in parallel to all the other code. Communication and coordination are achieved through an equally simple system of synchronous channels (a buffer of size 0).

The code samples all looked so incredibly simple to me that I thought: I can do that too. I started by writing a case study (a code sample) and a sketched implementation of the synchronization mechanisms. It all looked nice on paper (in the Prexonite Script editor) but there was one major problem: None of the Prexonite execution engines is really capable of running on multiple threads. Even though the CIL compiled functions behave nicely, they don’t cover the whole language. Coroutines and other dynamic features are just too important to be ignored.

Fortunately, the design of the interpreter is centered around one abstract class: the StackContext. StackContexts come in a variety of forms and provide complete access to the current state of the interpreter. For the most part, these objects are passed around in the call hierarchy. And that’s good, because functions/methods that call each other run on the same thread, so stack contexts stay thread-local; at least the ones obtained via method arguments. But there is also another way to get your hands on stack contexts, and that’s the interpreter’s stack itself.

Of course, a multithreaded Prexonite will have multiple stacks, but just providing those isn’t enough. I had to ensure that direct queries to and manipulations of the stack are only directed at the stack of the currently executing thread. Since stack contexts are sometimes “captured” by other objects (coroutines, for instance), they can on occasion slip into other threads and wreak havoc with the stack of their original thread.

The only solution I saw was to use thread-local storage via unnamed data slots. This has disadvantages, of course:

  • data slots are rather slow (not Reflection slow, but also not virtual call fast)
  • it is difficult to manipulate a different stack if you want to

but

  • it works!
//name~String, source~Channel
function work(name, source)
{
    println("\tWhat did $name say again?");
    var a = source.receive;
    println("\t$name said \"$(a)\"");
    return name: a;
}

function main()
{
    var source = chan; //Creates a new channel
    println("Get to Work!");
    //`call\async` invokes `work` in the background
    //`resp` is a channel that will receive the return value of `work`
    var resp = call\async(->work, ["Chris", source]);
    
    println("Oh, I forgot to tell you this: Chris said Hello!");
    source.send("Hello!"); //supply the missing value
    
    println("I'm waiting for you to finish...");
    var res = resp.receive;
    println("So your answer is $res then...");
    
    return 0;
}

results (sometimes) in

Get to Work!
        What did Chris say again?
Oh, I forgot to tell you this: Chris said Hello!
I'm waiting for you to finish...
        Chris said "Hello!"
So your answer is Chris: Hello! then...

Lists, coroutines and all other kinds of sequences appear very often in Prexonite. It is only logical to combine multithreading with the IEnumerable interface. The result is the `async_seq` command that takes an existing sequence and pushes its computation into a background thread.

The following example applies the two fantasy functions `sumtorial` and `divtorial` to the numbers 1 through 100. The ordinary sequential version can only use one processor core, while `async_seq` pushes the computation of `sumtorial` into the background and onto a second core. Even though the management overhead is gigantic, a performance improvement can be measured (a modest factor of ~1.3).

function sumtorial(n) =
    if(n <= 1) 1
    else       n+sumtorial(n-1);
function divtorial(n) =
    if(n <= 1) 1
    else       n/divtorial(n-n/38-1);

function main()
{    
    var n = 100;
            
    var seq = 
        1.To(n) 
        >> map(->sumtorial)
        >> map(->divtorial);
   
    println("Sequential");
    println("\t",seq >> sum);

    
    var par = 
        1.To(n) 
        >> map(->sumtorial)
        >> async_seq
        >> map(->divtorial);
   
    println("Parallel");
    println("\t",par >> sum);
}

Multithreaded Prexonite currently lives in its own SVN branch as I am still experimenting with concrete implementations. A very nice improvement would be the move away from CLR threads as they don’t scale well. Communicating sequential processes spend a lot of time waiting for messages, something that does not require its own thread. Go can use 100’000 goroutines and more. I don’t even want to try this with CLR threads…

ACFactory Day 5: Welcome to Paper World

This is the fifth part of a series documenting the development of ‘ACFactory’ (ACFabrik in German), an application that generates printable character sheets for the Pen & Paper role playing game Arcane Codex (English page). You might also want to read ACFactory Day 4: Command & Control.

[Image: ACFactory prototype showing two almost entirely empty pages]

Pixels in WPF are device-independent and set at 96 pixels per inch. This means that one WPF pixel corresponds to one device pixel if and only if the resolution of the device is 96 dpi. Despite sounding complicated, this is a good thing, because WPF pixels have fixed dimensions. One pixel equals 1/96th of an inch, or 3.77952755905512 pixels are equal to one millimetre. (I’m sorry, but whoever invented inches should be slapped.) Naturally, it would not be practical to author the talent sheet in WPF pixels, constantly having to convert back and forth using a calculator or a table (or both: Excel).

[Image: ACFactory zoom control]

It would be nice to let someone else worry about my coordinate system of choice, someone like WPF. Unlike GDI+, WPF does not provide explicit coordinate system transformations, probably for a good reason. What you can do are layout transforms. The scale transform looks particularly promising. Initialised with my magic number, I could author my sheets in millimetres and WPF would convert them into pixels. There is one problem there: ScaleTransform not only transforms the coordinate system, but everything inside the affected control, including font sizes. While in 99 out of 100 cases this is the expected behaviour, in my exact scenario it’s not. Cambria 12pt should render like Cambria 12pt would render on paper. However, these normal-looking 12 points are scaled to giant 45.3354… points by the ScaleTransform.

Maybe the StackOverflow.com community knows an answer. Until then, I created the hopefully temporary markup extension PaperFontSize, which automatically reverses the effects of such a ScaleTransform. Having everything set up, I can finally start implementing the talent sheet.

ACFactory Day 4: Command&Control

This is the fourth part of a series documenting the development of ‘ACFactory’ (ACFabrik in German), an application that generates printable character sheets for the Pen & Paper role playing game Arcane Codex (English page).

You might also want to read Day 3: Authoring XAML.

Today, I was mostly lost in the vast unknown jungle that WPF is, even after having read “Windows Presentation Foundation Unleashed” by Adam Nathan (I really recommend it!).

Control

First, there was the difference between a UserControl and a custom control. (No, I am not very familiar with any other UI framework.) UserControls are little more than include-on-steroids (you get a code-behind file); if you *really* want to create something new or abstract over a composition of controls, then creating a custom control is your only option.

But custom controls are just C# (or VB.NET) code files. There is no XAML involved. How can that be the preferred way to author new controls? Remember that WPF controls are supposed to be look-less, platonic ideas of what they represent.
I wanted to create a zoom view control. What are the abstract properties of such a zoom view control?

  1. It has content
  2. It can magnify its content

Number 1 tells us to derive from ContentControl, the type that defines the Content property. Number 2 is a bit trickier. I decided that my control has a ZoomFactor property (type double, 1.0 == 100%) to which a ScaleTransform is bound. Whether or not this works, I am not exactly sure as the control is not working yet.

But how does the control look? Well, that’s not the control’s concern. A look is provided by the Themes/Generic.xaml resource dictionary, the default fallback in the absence of local definitions and system-specific themes. In my case, there is going to be a neat little zoom control (combo box + slider) hovering in the top-left corner.

Command

To establish communication between the ZoomViewer template and the ZoomViewer control, there is really only one good mechanism: Commands. Commands are yet another abstraction that makes event handling more modular. Controls like buttons, hyperlinks and menu items can be “bound” to a certain command. That command determines whether they are enabled and what happens when they are clicked. You could, for instance, bind the menu item Edit > Paste, a toolbar button and Ctrl+V to the Application.Paste command, and they would all automatically be activated/deactivated depending on the state of the clipboard.

Even better, the default implementation, RoutedCommand, works just like routed events and bubbles up the tree of your XAML interface. You can then define different command bindings at different locations in your UI. Best of all: via the command target property, you can tell the command routing where to look for command bindings. I could have two buttons that both invoke the Navigation.Zoom command, but on two different ZoomViewers.

My ZoomViewer does support the Navigation.IncreaseZoom, .DecreaseZoom and .Zoom commands. This is how the default control template can communicate with the ZoomViewer, by invoking those commands.

There is, however, one thing I found very irritating: neither the slider nor the combo box implements commands by default. MSDN contains a sample that shows how to do this. It turns out to come with quite a few things to watch out for:

  • You must differentiate between routed and ordinary commands as only the former can react to command bindings and can be set to originate from different InputElements.
  • You should pass a reference to the invoking control rather than null as the command target.
  • You must be careful with the CanExecuteChanged event handler. It must be correctly unset when the command is changed/removed.

How well this all turns out, we will hopefully see soon. Development right now is a bit sluggish, as I keep switching over to MSDN and/or my book for reference, since Visual Studio’s XAML editor is not very sophisticated, even with basic ReSharper support. This must get much better in VS10. Up until now, I have observed VS08SP1 crash only 3 times (twice due to a recursive binding *blush*) and once with an HRESULT of E_FAIL (whatever that exactly was). But at least I lost no code.

Oh, and why exactly the Microsoft Blend XAML editor does not provide any support is totally beyond me. I mean free (that means it costs nothing) tools provide better code completion than Blend: Kaxaml, the “better XamlPad”. Even though it takes some time to load, I can definitely recommend it.

ACFactory Day 2: ExtendableEnum and XAML Serialization

This is the second part of a series documenting the development of ‘ACFactory’ (ACFabrik in German), an application that generates printable character sheets for the Pen & Paper role playing game Arcane Codex (English page). You might also want to read Day 1: SealedSun goes WPF.

ExtendableEnum

Talents in Arcane Codex have a number of properties that are best described as enumerations. Unfortunately, plain old enumerations are very flat. Translating them (when displayed in a user interface) requires you to wrap each and every appearance in the system. This is why I need a richer enumeration type.

Enter ExtendableEnum, an abstract base class that handles comparison and parsing of enumeration values. An existing enumeration could even be extended by a plug-in, should my application ever implement a plug-in system.

I was, however, confronted with a very annoying problem: type initialisation is lazy. The CLR employs certain “heuristics” to find out when to initialise a type. By default, a type is marked with “beforeFieldInit”, which means that the type is usable before its static fields have been initialised. Of course, static fields are always initialised “just in time” when they are accessed. The attribute is not applied when the class contains a static constructor (or class constructor or type initializer); in that case, the type is initialised when one of its members is first accessed.

While this is a pretty good strategy for the “normal” use of types, it might be a problem in XAML-based applications, since the ExtendableEnum parser is used before any of the enumeration values is referenced in code, which in turn means that the corresponding enumeration types had no chance to register their enumeration values with the corresponding registry.

One solution I have found is the System.Runtime.CompilerServices.RuntimeHelpers.RunClassConstructor method which, well, runs type initializers. I can only hope that the method (which points right into the heart of the CLR) is smart enough not to initialise a type twice. My approach is not perfect, as extensions of enumerations are not necessarily included. For a future plug-in system, some sort of InitializeOnLoad attribute could save the plug-in writer’s day. But my hack works 100% for all non-extended enumerations, and that’s enough for milestone 1.

XAML Serialization

Step 1: Make your objects “expressable” in XAML

The “read” aspect of XAML serialization is much more important, as it is an absolute requirement for milestone 1. The tricky thing with XAML is that you cannot express circular references for plain CLR objects (no DependencyObject). My object model, however, requires two-way relationships in some places. For instance: talents need access to the attributes of “their” hero in order to compute effective talent levels. Now, since there has to be a default constructor, the hero reference will be initialised with null, resulting in an invalid state. There is no way to ensure that your object is initialised correctly, as you don’t know when WPF/XAML is “done” with its manipulations.

The only option is to propagate the hero reference down the hero graph once the hero is created, which means that even collections have references to the hero they belong to.

Step 2: Make your objects serialize correctly

Contrary to what people might tell you, XAML Serialization does not come for free. There are severe limitations and not that many customisation options. Here is how I wished I could store my heroes:

<Hero ShortName="Kyle" FullName="Kyle MacDuncan">
  <Hero.Attributes>
    <Attribute Level="8">Strength</Attribute>
    ...
  </Hero.Attributes>
  <Hero.Talents>
    <Talent Level="5">Sword</Talent>
    ...
  </Hero.Talents>
</Hero>

This would automatically convert the hero-less Attribute and Talent to their bound equivalents, HeroAttribute and HeroTalent respectively. Interpreting this is one thing: a bit of voodoo magic and a couple of virgins (read: TypeConverters and the like) would make it work. But as I said, there is no way to tell the XAML serializer to first convert certain values to more serializable equivalents.

So eventually I gave up and implemented XAML serialization using a pretty nasty hack: all the properties that need processing prior to assignment are off-loaded into an xData class (HeroData, TalentData, and so on). This essentially means that I have to implement each non-trivial property at least twice and that extension via plug-ins has become at least twice as difficult. My serialised hero looks like this:

<Hero x:Key="codeHero"
      xmlns="clr-namespace:ACFabrik.Model"
      xmlns:wpf="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
      ShortName="Kyle" FullName="Kyle Mac Duncan"
      ExperienceTotal="14" ExperienceUsed="12"
      FameTotal="15" FameUsed="10" Encumberment="0" >
  <HeroData LocalLibrary="{wpf:StaticResource defaultLibrary}">
    <HeroData.Attributes>
      <HeroAttribute Level="8" Attribute="Strength" />
      <HeroAttribute Level="7" Attribute="Constitution" />
      ...
    </HeroData.Attributes>
    <HeroData.Talents>
      <HeroTalent Level="7">Alchemy</HeroTalent>
      <HeroTalent Level="8">Attention</HeroTalent>
      ...
    </HeroData.Talents>
  </HeroData>
</Hero>

The HeroData node defines a new attribute: “LocalLibrary”. It is used to map the talent names (and possibly others) to their definitions in an external library. This way, multiple heroes can share the same talents. LocalLibrary is only required when heroes are loaded as part of XAML resources, i.e. when you don’t have control over the XamlReader.Load method.

The result is a bit more verbose and not that beautiful. Well, at least it will yield good compression ratios (the “<Hero” prefix…).

Next Steps

On the data side, the next question to answer is how the file formats look exactly. While I now have the basic capability to serialize my objects to streams, I need to come up with concrete formats for heroes and libraries.

Also, I need to start thinking about the general UI concept. Printing in WPF works via Visuals, so I literally get previewing for free. I therefore include a very basic UI (preview + print button) in the requirements for milestone 1.

ACFactory Day 1: SealedSun goes WPF

This is the first part of a series documenting the development of ‘ACFactory’ (ACFabrik in German), an application that generates printable character sheets for the Pen & Paper role playing game Arcane Codex (English page).

Before I explain why I chose to implement ACFactory in WPF, let me quickly sketch the application’s functionality planned for the first iteration:

  • Read a hero definition from a human-writeable text file
  • Print a sheet that lists all the talents known by a hero

Note that having a graphical UI is explicitly not a requirement for the first iteration. This is not the first character sheet generator I’m writing. The Heldenfabrik (hero factory, not released) project, for instance, uses the Prexonite scripting engine to define a DSL for specifying heroes, and the same GDI+-based rendering framework (read: bunch-a-classes) as the DSA calendar generator (DSA Kalendergenerator) to generate an almost-A4-sized PNG. Needless to say, it was very painful to hard-code all the spacings and layout characteristics. I needed something that supported me in designing fixed-page layouts. One solution would have been to write a custom rendering engine tailored to the special needs of fitting content of variable size onto a fixed page. I even went so far as to prototype a possible architecture in F#, only to find out that

  1. It is really hard to express complex object models in F# due to the lack of forward declarations.
  2. It is a lot of work to implement automated layout.

So what are the alternatives? What publicly available layout engines exist out there? One obvious answer is XHTML+CSS. XHTML documents are relatively easy to generate (System.Xml.Linq aka LINQ to XML or XLINQ), well understood and can be printed by modern browsers. The problem: XHTML documents are inherently digital and thus optimised for variable sizes. Also, the final look depends on the layout engine (read: browser) used to display the document. I don’t even want to talk about the horror that printing from a browser is.

Living in a .NET 3.5 environment, I could no longer ignore our little newcomer to the graphical user interface world: Windows Presentation Foundation (WPF). We all suspected that this user interface framework has more to offer than a ridiculously unpronounceable name and fancy 3D graphics. It comes with layout features web designers can only dream about (one word: Grid). So what’s the catch, apart from acknowledging that WPF might be cool?

A friggin’ steep learning curve. You can learn C# without a book if you’ve already worked with the .NET framework (via, say, VB.NET). You can learn CSS, XML, XPath, XSLT without a book, given good online tutorials. But WPF is so complex that it would take you months to harvest even basic knowledge of WPF from blogs and tutorials all over the net. Since I want to finish the project within 7 days, this was not an option (also, I’m a very impatient person). I needed a book. There were two books recommended all over the internet: Applications = Code + Markup by Charles Petzold and Windows Presentation Foundation Unleashed by Adam Nathan. Based on the customer reviews, I decided to buy the latter (the fact that this one was printed in colour made the decision much easier :-)).

So here I am, roughly in the middle of the book, on page 355 at the beginning of chapter 11 about 2D graphics. One thing I noticed was the alarming number of pitfalls or things-to-remember when coding WPF interfaces. Although the majority of the framework builds on predictable and consistent patterns, there are odd edges here and there, such as having to use {Binding RelativeSource={RelativeSource …}} for relative sources as opposed to the uniform {Binding Source={RelativeSource …}}. Overall I have the feeling that there will be a lot of reinventing-the-wheel involved, as WPF is not as complete as it could (should?) be. Sortable ListView out of the box, anyone?

But wait! Didn’t I forget something? Right: like XHTML, WPF is tailored towards flexible layouts in containers of variable size. Doesn’t this rule WPF out as the layout engine of choice? Nope: WPF was designed to be resolution independent, operating on “virtual” pixels the size of 1/96th of an inch (or device pixels at DPI=96). I can therefore calculate the size of an A4 page at 96 DPI, which is roughly 793.701 pixels times 1122.52 pixels, or 3.77953 pixels per millimetre (beautiful number, isn’t it? 🙁 ).

Next Steps

Before I start designing the character sheet, I want my Business Objects (the hero + his/her talents) to be up and running. I need a format that is human-writeable and can be loaded by my application. I’m currently looking into the object serialization aspect of XAML, which I have already successfully used to define a serializable library of talents available to heroes.

XAML (aka System.Windows.Markup) knows how to serialize objects that

  • Have a public default constructor (no arguments)
  • Expose their representation in writeable properties

It’s not that adhering to those rules is especially difficult, but it is extremely tedious, as immutable objects are essentially ruled out. I like to build my objects (classes) like fortresses, using the type system, readonly modifiers and immutability to reduce the number of ways an object can enter an invalid state. Designing a business model that serializes to XAML therefore means opening all doors to malicious (and/or stupid) users (developers that use my code). I practically live in the library-writer world, and having to write classes that expose their representation the way WPF likes it makes me twitch.

But seeing how elegantly XAML maps XML elements to the familiar .NET objects really makes up for the mental pain I am about to go through. You’ll hear from me.

Continue reading on ACFactory Day 2: ExtendableEnum and XAML Serialization

Creating a programming language

On October 31, I handed in a paper I have been working on for the past few months. It briefly outlines the process of creating a programming language, with a focus on compiler construction:

Compilers

Since computers only process instructions that are part of their instruction set, programs written in programming languages have to be translated into functionally equivalent programs in machine language prior to their execution. This process is called compilation and is performed by compilers.

To cope with this task, the translation is commonly split up into multiple steps, also referred to as phases. A compiler starts by reading the input file byte by byte, character by character. Just as our eye splits up a text into individual words, the second step is to group meaningful characters together and remove those without meaning (e.g., whitespace). This step is called lexical analysis and results in a stream of tokens: short, categorized strings of characters. In the next phase, the compiler determines the relationships between the tokens. It applies the syntax of the programming language and is therefore called syntactical analysis. It results in a tree structure that represents the program. At this stage, some compilers apply additional transformations to the program to increase performance before finally generating code in the language of the target machine.

So, if you want to know how your favorite compiler works, have a look at “Creating a programming language” (PDF, 400KiB).