→ Heliotrope Pajamas

Joy in Prolog

Wherein is discussed the lexing, parsing, semantics, etc. of Joy language implemented in Prolog.

Compare and contrast with similar work for a small FP language:

→ Programming Language Semantics in Prolog

Syntax

Symbols can be made of any non-blank characters except ``[`` and ``]`` which are fully reserved for list literals (aka "quotes"). ``true`` and ``false`` would be valid symbols but they are reserved for Boolean literals.

Lexer

First lex a stream of character codes into tokens separated by square brackets or whitespace. We keep the brackets and throw away the blanks.

joy_lex([tok(Token)|Ls]) --> chars(Token), !, joy_lex(Ls).
joy_lex([  lbracket|Ls]) --> "[",          !, joy_lex(Ls).
joy_lex([  rbracket|Ls]) --> "]",          !, joy_lex(Ls).

joy_lex(Ls) --> [Space], {code_type(Space, space)}, !, joy_lex(Ls).

joy_lex([]) --> [].

The `chars//1` predicate is just a greedy non-blank, non-bracket character code recognizer that scoops them into a list

chars([Ch|Rest]) --> char(Ch), chars(Rest).
chars([Ch])      --> char(Ch).

char(Ch) --> [Ch],
    { Ch \== 0'[,
      Ch \== 0'],
      code_type(Ch, graph) }.

Parser

Then parse the tokens converting them to Prolog values and building up the list structures (if any.) Integers are converted to Prolog integers, symbols and bools to Prolog atoms, and list literals to Prolog lists. Then the values are wrapped in unary "type" predicates:

int/1, symbol/1, bool/1, list/1

The parser just recognizes streams of terms:

joy_parse([J|Js]) --> joy_term(J), joy_parse(Js).
joy_parse([]) --> [].

Terms are either tokens or lists of terms:

joy_term(list(J)) --> [lbracket], joy_parse(J), [rbracket].
joy_term(Atomic) --> [tok(Codes)], {joy_token(Atomic, Codes)}.

Tokens are integers, Boolean values, or symbols (names of functions):

joy_token(int(I), Codes) :- number(I, Codes, []), !.  % See dcg/basics.
joy_token(bool(true), "true") :- !.
joy_token(bool(false), "false") :- !.
joy_token(symbol(S), Codes) :- atom_codes(S, Codes).

Text to Expression

All that remains is to put the lexer and parser together in a single relation like so

text_to_expression(Text, Expression) :-
    phrase(joy_lex(Tokens), Text), !,
    phrase(joy_parse(Expression), Tokens).

Semantics

The fundamental Joy relation involves an expression and two stacks. One stack serves as input and the other as output.

thun(Expression, InputStack, OutputStack)

Base Case

The null expression (denoted by an empty Prolog list) is effectively an identity function and serves as the end-of-processing marker.

thun([], S, S).

Literals

Literals are put on the stack.

thun([Lit|E], Si, So) :- literal(Lit), !, thun(E, [Lit|Si], So).

literal(int(_)).
literal(bool(_)).
literal(list(_)).

Symbols

Symbols are the names of either functions, combinators, or definitions, and are looked up (as Prolog clauses ``func/3`` or ``combo/5`` or ``def/2``) and evaluated

thun([Func|E], Si, So) :-
    func(Func, Si, S),
    thun(E, S, So).

thun([Combo|E], Si, So) :-
    combo(Combo, Si, S, E, Eo),
    thun(Eo, S, So).

thun([Def|E], Si, So) :-
    def(Def, Body),
    append(Body, E, Eo),
    thun(Eo, Si, So).

Note that functions cannot affect the expression while combinators can, that's the semantic difference between them. Technically combinators are functions.

Definitions simply append their body expressions to the current expression and carry on evaluation from there.

(These Prolog relations are all tail-recursive, which is nice.)

Functions

Functions are relations ``func/3`` that relate a symbol name to some operation from stacks to stacks.

Stack Manipulation

Basic stack manipulation use logic variables as a kind of "any type" as they do not inspect or otherwise use the values on the stack:

func(swap, [A, B|S], [B, A|S]).

func(dup, [A|S], [A, A|S]).

func(pop, [_|S], S).

Math, Logic, Comparison

Can be described in several ways. Symbolic:

func(add, [int(A), int(B)|S], [int(A+B)|S]).

Note that, in Prolog, the above code does not actually perform arithmetic. The term ``A+B`` is syntactic sugar for ``'+'(A, B)``, it's a plain ol' Prolog term.

We can use CLP(FD) constraints instead:

func(add, [int(A), int(B)|S], [int(C)|S]) :- C #= A + B.

This gets into "Categorical" programing (using different semantics is akin to evaluating over different categories, I think.)

Combinators

The difference between regular functions and combinators is that functions (as we are using the word here) cannot change the pending expression (continuation) but combinators can, typically by prepending quoted programs to the pending expression. As currently written, there are no combinators that can alter the pending expression in ways other than prepending code.

combo(i, [list(P)|S], S, Ei, Eo) :- append(P, Ei, Eo).

combo(branch, [list(T), list(_),  bool(true)|S], S, Ei, Eo) :- append(T, Ei, Eo).
combo(branch, [list(_), list(F), bool(false)|S], S, Ei, Eo) :- append(F, Ei, Eo).

combo(loop, [list(_), bool(false)|S], S, E,  E ).
combo(loop, [list(B),  bool(true)|S], S, Ei, Eo) :-
    append(B, [list(B), symbol(loop)|Ei], Eo).

As you can see, there is nothing to it. The branch and loop combinators are prefectly straightforward, the loop combinator mentions itself showing the underlying unity of looping and recursion.

Definitions

New functions are just named Joy expressions, and work by prepending their expressions to the pending expression.

swapd :=  [swap] dip
swons := swap cons
swoncat :=  swap concat
etc...

And that's it. That's all there is to Joy. It's extremely simple and elegant.