Prexonite May 2008 Update

Downloads

Update

There have not been any significant changes since the introduction of the CIL compiler into the Prexonite, yet the current version comes with a number of performance optimizations regarding the generated CIL byte code.The majority of the built-in commands and types now use the ICilCompilerAware interface, which is used by the CIL compiler to let commands and types emit highly customized code. Calling println with no arguments for instance, results in a static call to void System::Console.WriteLine() directly in the compiled method.Similarly, type expressions in CIL functions are no longer implemented via type expression parsing but by directly referencing the corresponding singleton PType objects.But the most important improvement is the possibility to statically link Prexonite function calls in CIL compiled methods, which makes yet another hashtable lookup redundant at the cost of additional memory: A dynamically generated class has static fields for each and every function used by the compiled application. This can be a problem if you plan to re-compile your CIL-implementations, as dynamic type, unlike dynamic functions, cannot be garbage collected by design. It is, however, possible to disable the generation of such a class by passing false to CompileToCil.And on a side node: The often used library function struct has been implemented as a compiler hook for improved performance. By resolving the members at compile time one does not only save run time, but also removes the need for dynamic lookups, which in turn enables the use of CIL compilation for struct-functions. This is especially helpful for immutable structs.

A functional touch

The last days, I've been working on two things: The reorganization of built-in commands and the improvement of the "Functional Experience".Why do commands need reordering? Because it gets difficult to find the right file among over 40 commands.Why the sudden increase in numbers? I added proxies for System.Math methods for both easy and fast access to mathematical functions such as Sqrt and Sin, but also Pi.Additionally, the most important coroutines from the Prexonite Standard Repository for list processing have been implemented in managed code, again for performance reasons. Map, Where, Limit, Skip and friends now inject managed coroutines into the stack.The commands are now organized in the namespaces Core, List, Math and Text. The latter currently contains the fixed layout functions SetCenter, SetLeft and SetRight, which fill a given string with some character sequence until it has a certain length and is aligned correctly.Now what the hell do you mean by "Functional Experience"?I haven't told anyone but the Prexonite VM is absolutely terrible when it comes to recursion. Unfortunately, recursion happens to be one of the key elements in functional programming and, as you might have noticed, Prexonite Script comes with a lot of syntactic sugar that makes it look like a functional programming language.Ok, lambda expressions and closures are "true" functional features but the lack of a sophisticated type system makes it almost impossible to reason about a program in the way functional compilers do. Nonetheless, I added two features with the last commit, that make PXS a tiny little bit more functional.

First of all: Tail Call Optimization
Yes, the thing that helps with recursion.
function fac(n,r) =
    if(n == 1)
        r
    else
        fac(n-1,n*r);

I benchmarked this function three times, with different tail call optimization strategies. The difference is huge. See for yourself (10'000 computations of 16!):Comparison of different tail call optimization strategies.Two strategies are employed: An implementation of tail call optimization for directly recursive functions inside the compiler, that turns recursive calls into direct iterations (jumps to the beginning of the function with different arguments). What I call "virtual machine optimization" is a special tail call instruction that removes the current stack frame after having called the function or closure.Now apparently the virtual machine "optimization" is not particularly fast but uses far less memory than the normal invocation.Prexonite will never be able to recognize indirect recursion due to the lack of control flow analysis. This, however, does not mean that return statements inside conditions or calls in tail position are not recognized. I'm not sure if Prexonite will ever handle simple recursive return expressions like the normal definition of the factorial:
function fac n =
    if (n==1)
        1
    else
        n*fac(n-1);

Also in the repository is an experimental and partial implementation of the famous call-with-current-continuation from Scheme. In PXS it is known as callcc.I must admit that I don't really know much about call/cc and how it works, especially regarding the stack. Creating a callable object from the current state of a function invocation is no problem. I just don't understand some of the scheme samples, I've been looking at (terribly difficult to read...)The following snippet stores a continuation of the function two in the global variable plusone. Invoking this continuation with, say, 6 returns 7 as the name suggests.
var plusone;

function two =
    1 + call\cc(->one);
       
function one(continuation)
{
    plusone = continuation;
    return 1;
}

Creating a programming language

On October 31, I handed in a paper I have been working on for the past few months. It shortly outlines the process of creating a programming language with a focus on compiler construction:

Compilers

Since computers only process instructions that are part of their instruction set, programs written in programming languages have to be translated into functionally equivalent programs in machine language prior to their execution. This process is called compilation and is performed by compilers.To cope with this task, the translation is commonly split up into multiple steps, also referred to as phases. A compiler starts with reading the input file byte by byte, character by character.Like our eye splits up a text into individual words, the second step is to group meaningful characters together and remove those without meaning (e.g.,~whites paces). This step is called lexical analysis and results in a stream of tokens: short, categorized strings of characters.In the next phase, the compiler determines the relationship between the tokens. It applies the syntax of the programming language and is therefor called syntactical analysis. It results in a tree structure that represents the program.At this stage, some compilers apply additional transformations to the program to increase performance before finally generating code in the language of the target machine.

So, if you want to know how your favorite compiler works, have a look at "Creating a programming language" (PDF, 400KiB).

“Reference-To”Logic

One of the reasons I never liked C/++ is it's illogical pointer syntax.

int v0=5,v1=6;int* p0;int *p1;p0 = &v0;*p0 = v1;

The first pointer declaration is ok: The star turns the type integer into a pointer-to-integer.The second one is a bit confusing. To the readers (at least mine) eye it looks like the star belongs to the identifier, modifying it somehow. Ok, it's just the result of C/++ being a freeform language so let's go on.The star turns any data type T into a pointer-to-T. So the same must be true for values; turning values into pointers to those values, right? Bad luck. The inventor of C chose to use a different symbol for returning the address of a value. Fine with me, as long as the star has no other appearances in the indirection business. [Read more →]