Andreas Rozek
[ Imprint ]   [ Data Privacy Statement ]   [ Contact ]       [ deutsche Fassung ]   english version

PEG.js

The well-known parser generator PEG.js originates from David Majda and is now being continued by Futago-za Ryuu. The generator can also be operated purely online without prior installation.

In the lecture "Fundamentals of Computer Science" at the University of Applied Sciences in Stuttgart, students can use PEG.js to actively familiarise themselves with the syntax (not only) of programming languages and the necessary steps in analysing and processing texts.

A guide explains how to use the pages in this website, their behaviour may be adjusted in the settings.

Converter from XML to JSON

In a first step, an XML file is analysed and converted into an equivalent JSON file.

Simplified XML Grammar for PEG.js

In order to serve as a teaching example in the limited time available, the grammar used covers only the most important aspects of XML.

XML           = Declaration _ Element _
Declaration = '<?xml' (__ Attribute)* _ '?>'
Element = emptyElement / filledElement
emptyElement = '<' _ Tag (__ Attribute)* _ '/>'
filledElement = '<' _ Tag (__ Attribute)* _ '>' (_ Content)* _ '</' _ Tag _ '>'
Content = Element / Text / CDATA
Attribute = Name (_ '=' _ '"' Value '"')?
CDATA = '<![CDATA[' [^\]]* ']]>'
Tag = [a-zA-Z]+
Name = [a-zA-Z]+
Text = [^<]+
Value = [^"]*
_ = [ \t\n\r]*
__ = [ \t\n\r]+

EBNF version for Syntax Diagrams

Syntax diagrams are sometimes easier to understand (especially for beginners) than an abstract (E)BNF or PEG notation.

The well-known Railroad Diagram Generator generates the corresponding syntax diagrams for a given grammar (in a BNF-like notation).

XML           ::= Declaration Element
Declaration ::= '<?xml' Attribute* '?>'
Element ::= emptyElement | filledElement
emptyElement ::= '<' Tag Attribute* '/>'
filledElement ::= '<' Tag Attribute* '>' Content* '</' Tag '>'
Content ::= Element | Text | CDATA
Attribute ::= Name ('=' '"' Value '"')?
CDATA ::= '<![CDATA[' [^#x5D]* ']]>'
Tag ::= [a-zA-Z]+
Name ::= [a-zA-Z]+
Text ::= [^<]+
Value ::= [^"]*

Converter Definition for PEG.js

Decorated with appropriate JavaScript instructions, the pure XML grammar for PEG.js becomes a converter that generates the corresponding JSON equivalent from a given XML file.

{ function indented (Text) {
return Text.split('\n').map(Line => ' ' + Line).join('\n')
} }

XML = Declaration _ Element:Element _
{ return Element }
Declaration = '<?xml' (__ Attribute)* _ '?>'
Element = emptyElement / filledElement
emptyElement = '<' _ Tag:Tag Attributes:(__ Attribute:Attribute { return Attribute })* _ '/>'
{ return '{\n' + indented(
' "Tag":"' + Tag + '"' + (
Attributes.length > 0
? ',\n "Attributes": {' + Attributes.join(', ') + '}'
: ''
) + '\n}')
}
filledElement = '<' _ StartTag:Tag Attributes:(__ Attribute:Attribute { return Attribute })*
_ '>' Contents:(_ Content:Content {return Content})*
_ '</' _ EndTag:Tag _ '>'
{ if (StartTag !== EndTag) {
error('end tag differs from start tag')
}

return '{\n' + indented(
' "Tag":"' + StartTag + '"' + (
Attributes.length > 0
? ',\n "Attributes": {' + Attributes.join(', ') + '}'
: ''
) + (
Contents.length > 0
? ',\n "Contents": [' + Contents.join(', ') + ']'
: ''
) + '\n}')
}
Content = Element / Text / CDATA
Attribute = Name:Name Value:(_ '=' _ '"' inner:Value '"' { return inner })?
{ return (
Value == null
? Name + ':true'
: Name + ':' + Value
) }
CDATA = '<![CDATA[' Content:[^\]]* ']]>'
{ return '"' + Content.join('').replace(/"/g,'\\"') + '"' }
Tag = [a-zA-Z]+ { return text() }
Name = [a-zA-Z]+ { return '"' + text() + '"' }
Text = [^<]+ { return '"' + text().replace(/"/g,'\\"') + '"' }
Value = [^"]* { return '"' + text() + '"' }
_ = [ \t\n\r]* { return '' }
__ = [ \t\n\r]+ { return '' }

The example file is an article for a fictitious blog.

<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<title>Mustermann&#39;s Blog</title>
<link href="/atom.xml" rel="self"/>

<author>
<name>Mustermann</name>
<email>Max.Mustermann@gmx.de</email>
</author>

<entry>
<title>Just a simple Blog Entry</title>
<content type="html"><![CDATA[<p>...</p>]]></content>
</entry>
</feed>

PEG.js as a Script Interpreter

A file converter is already quite practical, but computer science students actually thirst for their own programming language.

Thanks to PEG.js, the effort required is very manageable, even for a beginner:

  1. as soon as you have an idea of what the language should look like and what it should be able to do,
  2. first create a few theoretical examples (covering all syntactic aspects of the language).
  3. next, define the syntax of the programming language and
  4. test it using the previously developed examples. These tests should be carried out quite carefully, because later changes can quickly become very costly).
  5. as soon as the pure grammar works, you can decorate it with JavaScript statements that create an "abstract syntax tree" (AST) - if possible in a form that can be easily read in the PEG.js output.
  6. This grammar should be tested again - this time, however, using examples that can be executed later (i.e. that contain semantic meaning).
  7. last but not least, the output of the primary rule (i.e. the rule for the entire script) is changed so that the AST is not output, but interpreted.

Et voilà, you have specified and implemented your own scripting language!

slang - the "simple Language"

As an example, let's use "slang", a very simple scripting language, which is primarily intended to demonstrate the basic mechanisms of an interpreter (and less for practical use).

Data Types

slang only knows the following data types

  • nil - represents "nothing", i.e. an undefined value (JavaScript undefined)
  • logical values ('true' and 'false')
  • numeric values (floating point numbers in double-precision IEEE 754 format)
  • strings
  • functions and blocks
  • "native" functions (JavaScript)

In particular, slang does not know any data structures (this makes the grammar a bit simpler)

Functions and blocks are "first-class values", i.e. they can be assigned to variables and used as arguments for function calls.

Literals

Values of the above data types are defined in the program text as follows

  • nil
  • true or false
  • numbers
    • in binary format (only for unsigned integers) preceded by 0b (e.g. 0b01101100) with up to 32 digits/bits
    • in hexadecimal format (only for unsigned integers) preceded by 0x (i.e. 0x0123AFFE) with up to 8 digits (32 bits)
    • as decimal number with opt. sign, digits before and after the decimal point and exponents
  • character strings
    • single-line in single or double inverted commas ('...' or "...")
    • multiple lines in single or double inverted commas (''...'' or """..."")
    • with support of the usual (under JavaScript) escape sequences (\b, \f, \n, \r, \t, \v, \0, \', \", \\ as well as \x## and \u####)
  • Blocks {...}
  • lambdas (script functions) (...) => {...}.
  • native functions native (...) => string

Identifiers

"Identifiers" are the names of context entries, i.e. global, local or lexical variables and functions.

They begin with one of the letters a-z or A-Z or one of the two characters $ or _, optionally followed by one or more further letters, $ or _ or decimal digits (0-9).

Variables and functions, however, must not be named like a reserved word. Reserved words are global, local, it, native, nil, true, false, nan, e, pi, div, mod, and, or, not, if, then, else, select, when, otherwise, for, from, down, to, by, repeat, while, until, continue, break and return.

The case of the identifiers is significant.

Contexts

Contexts group values and functions together so that they can be addressed from functions and blocks by their name.

There are "outer", "inner" and "lexical" contexts.

The outermost context is the "global context" with all global values and functions.

When a function is started, it creates an "inner" context for call parameters and local variables. When reading, context entries are first searched locally and then globally. When writing, entries are always searched for in the local context only (and created there if necessary).

When a block is started, it also creates a local context, but this time in the "lexical" context of the place where it was defined. When reading, context entries are therefore first searched for in the local context, then (recursively if necessary) in the associated lexical contexts and finally also in the global context. When writing, entries are also searched in the local and lexical contexts - but never in the global context. Found entries are changed where they were found, entries not found are created locally.

Blocks

Blocks contain sequences of instructions that can (but do not have to) be separated from each other in the program text by one or more semicolons, if desired.

Stand-alone expressions are also considered to be instructions.

Each executed instruction returns a result, this is stored in the local variable it.

The result of a block is the result of the last executed instruction.

Lambdas (Script Functions)

Lambdas are (anonymous) functions with 0, 1 or more "parameters". They are defined by expressions of the form

  • () => {...} for parameterless lambdas,
  • (parameter) => {...} for lambdas with one parameter,
  • (parameter, parameter, ...) => {...} for lambdas with multiple parameters.

The parameters are available within the function as local variables and are initially preassigned with the values of the arguments passed during the call.

Lambdas can be called immediately, passed as arguments to other functions or assigned to a variable.

The result of a function is

  • the result of the last statement in the function block or
  • the result of a return statement.
Nota bene: currently it is not yet ensured that all parameters have different names

Native Functions

"Native functions" are functions programmed in JavaScript and callable from slang and are used to extend the scripting language. They are defined by expressions of the form

  • native () => string for parameterless functions,
  • native (parameter) => string for functions with a parameter,
  • native (parameter, parameter, ...) => string for functions with multiple parameters.

The source text of the native function is given (within slang) as a single or multi-line slang string.

The parameters of the JavaScript function carry the names specified in the definition; the lexical context in which the native function was called is available as a "this" object.

Expressions and Operators

Expressions consist of

  • assignments,
  • operator links,
  • calls,
  • literals, variables, parentheses or
  • lambdas or native functions

slang supports the following operators (with decreasing precedence)

  • (...) (parenthesis)
  • ^ (exponentiation)
  • *, /, div (integer division), mod (modulo division)
  • +, -
  • <, <=, =, >=, > and <> (unequal)
  • not (logical negation)
  • or (logical disjunction)
  • and (logical conjunction)
  • := (assignment)

The proper order of mathematical operations (PEMDAS) is taken into account.

Most operators are "left-associative", only assignments, exponentiations and logical negation are evaluated "right-associative".

Assignments

Assignments change the value of an existing context entry (usually a variable) or create it.

A "generic assignment" has the form

identifier := expression

and changes an existing entry in the local or lexical context or creates a new entry in the local context.

A "local assignment" has the form

local identifier := expression

and always writes an entry into the local context.

A "global assignment" has the form

global identifier := expression

and writes an entry into the global context. Global assignments are the only way to create or change entries in the global context.

Assignments also return a result (and can thus be used in expressions), namely the value that was assigned.

Control Statements

The following statements are available to control program flow:

  • if (...) then {...}
    if (...) then {...} else {...}
    if the expression specified after if is true, the block noted after then is executed. If the expression is false, the block noted after else is executed instead - if such a block exists, otherwise the statement is simply ignored. The result of the statement is the result of the executed block or nil if no block was executed.
  • select {
      when (...) then {...}
      ...
      otherwise {...}
    }
    evaluates all expressions listed after when in order, stops as soon as the first expression with the value true is found and then executes the block after the corresponding then. If no when expression returns the value true, the block after otherwise is executed instead - if one exists, otherwise the instruction is ignored. The result of the statement is the result of the executed block or nil if no block was executed.
  • for identifier from ... to ... by ... repeat {...}
    for identifier from ... down to ... by ... repeat {...}
    executes the loop block noted after repeat repeatedly and with different values for the given identifier, where these values start with the value of the expression after from and are changed by the value of the expression after by on each pass. In the version with the keyword down, the loop ends when the value of the expression after to falls below it, otherwise when it exceeds it. If the expression after by has the value nil, +1 (in the version without down) or -1 (in the version with down) is assumed instead. The identifier may denote an already existing variable from the lexical context, otherwise the corresponding variable is created in the context of the loop. Local variables created in the loop block retain their value from one pass to the next. The result of the loop is the result of the last execution of the loop block
  • while (...) repeat {...}
    executes the loop block noted after repeat repeatedly as long as the expression noted after while returns the value true. Local variables created in the loop block retain their value from one run to the next, and the termination condition is also executed in the context of the loop block. The result of the loop is the result of the last execution of the loop block.
  • repeat {...} until (...)
    executes the loop block noted after repeat repeatedly until the expression noted after until returns the value true. Local variables created in the loop block retain their value from one run to the next, and the termination condition is also executed in the context of the loop block. The result of the loop is the result of the last execution of the loop block.
  • break
    breaks the innermost loop running in the (script) function just executed.
  • continue
    requests a new iteration of the innermost loop running in the currently executing (script) function - or terminates it if the termination condition of the loop is met.
  • return ...
    aborts the currently running function and returns the value of the (optional) expression noted after return - if such an expression is missing, nil is returned

Comments

slang supports single-line and (non-nesting) multi-line comments:

  • single-line // ... to end of line
  • multiline /* ...single or multiline, non-nestable */

slang Runtime Environment

Just like "real" programming languages, slang also comes with a basic set of "intrinsic" (i.e. built-in) values and functions. These are defined in the global context and thus accessible from everywhere - as long as they are not "shadowed" by functions with the same name in the local context.

intrinsic Values

  • nil - represents "nothing", i.e. an undefined value (JavaScript undefined)
  • true - represents the logical value for "true"
  • false - represents the logical value for "false"
  • pi - contains the value for the circle number pi
  • e - contains the value for the Euler constant

Operators as Functions

The logical, mathematical and comparative operators can also be invoked as functions - by overwriting these functions, the behaviour of the associated operators can thus be changed. Incidentally, the parser calls operators in expressions directly from the global context, so that local functions of the same name cannot override operators.

  • neg(a) - returns number a with reversed sign (i.e. -a)
  • plus(a,b) - adds two numbers a and b or appends strings to each other
  • minus(a,b) - subtracts number b from number a.
  • times(a,b) - calculates the product of two numbers a and b.
  • through(a,b) - divides number a by number b.
  • div(a,b) - divides number a by number b and returns the integer part of the result
  • mod(a,b) - returns the remainder of the division of number a by number b.
  • raised(a,b) - returns the bth power of number a.
     
  • lt(a,b) - returns true if and only if a < b holds true
  • le(a,b) - returns true if and only if a <= b holds true
  • eq(a,b) - returns true if and only if a = b holds true
  • ge(a,b) - returns true if and only if a >= b holds true
  • gt(a,b) - returns true if and only if a > b holds true
  • ne(a,b) - returns true if and only if a <> b holds true
     
  • not(a) - returns the logical negation of the boolean value a.
  • and(a,b) - returns the logical conjunction of boolean values a and b.
  • or(a,b) - returns the logical disjunction of the boolean values a and b.

Intrinsic mathematical functions

The following mathematical functions are built into slang:

  • is_nan(a) - returns true if and only if a has the value NaN.
     
  • sqrt(a) - returns the square root of a
  • sin(a) - returns the sine of a
  • cos(a) - gives the cosine of a
  • tan(a) - returns the tangent of a
     
  • rnd(a) - returns a pseudo-random number in the range 0...a, where a itself is excluded

Control Statements as Functions

The instructions for controlling the program flow are also present as global functions and can consequently be changed. Incidentally, the parser calls statements directly from the global context, so that local functions of the same name cannot override statements.

  • if_then_else (condition, then_clause, else_clause)
    if the boolean value condition is true, block then_clause is executed; if it is false, block else_clause is executed - if such a block exists, otherwise the statement is simply ignored. The result of the function is the result of the executed block or nil if no block was executed.
  • select_when (otherwise_clause, condition, block,...)
    evaluates all condition blocks or (logical) values of the given condition-block pairs in order, stops as soon as the first condition with the result true is found and then executes the corresponding block. If no condition returns the value true, the otherwise_clause is executed instead. The result of the function is the result of the executed block
  • for_repeat (identifier, downwards, start_value, stop_value, step_value, loop_body)
    executes block loop_body repeatedly with different values for the given identifier, these values start with start_value and are incremented by step_value on each pass. If the boolean argument downwards has the value true, the loop ends when the value of identifier falls below stop_value, otherwise when it exceeds that limit. If step_value has the value nil, +1 (if downwards = false) or -1 (if downwards = true) is assumed instead. The identifier may denote an already existing variable from the lexical context, otherwise the variable is created in the context of the loop. Local variables created in loop_body retain their value from one run to the next. The result of the loop is the result of the last execution of loop_body.
  • while_repeat (condition, loop_body)
    executes the block loop_body repeatedly as long as an execution of the block condition returns the value true. Local variables created in loop_body retain their value from one run to the next. The result of the loop is the result of the last execution of loop_body.
  • repeat_until (loop_body, condition)
    executes the block loop_body repeatedly until an execution of the block condition returns the value true. Local variables created in loop_body retain their value from one run to the next. The result of the loop is the result of the last execution of loop_body.
  • break (value)
    breaks the innermost loop running in the (script) function just executed and returns the given value.
  • continue
    requests a new iteration of the innermost loop running in the currently executing (script) function - or terminates it if the termination condition of the loop is met.
  • return (value)
    aborts the currently running function and returns the given value.
  • assert (value)
    checks whether value has the value true - if so, the program execution will continue, otherwise it will be aborted with an error message
  • assert_failure (block)
    checks if an error occurs during the execution of block block - if so, the error is ignored and program execution is continued, otherwise it is aborted with an error message
  • log (value)
    prints value to the browser console

slang Grammar

The grammar for slang in PEG.js notation does not require a separate "lexer" (as usual) and is nevertheless surprisingly compact.

script               = statement_list? _
block = '{' statement_list _ '}'
statement_list = (_ statement / _ ';' )*
statement = if_statement / select_statement
/ for_statement / while_statement / until_statement
/ continue_statement / break_statement / return_statement
/ expression

if_statement = 'if' _ '(' _ expression _ ')' _ 'then' _ block (_ 'else' _ block)?
select_statement = 'select' _ '{' _ when_clause* otherwise_clause? '}'
when_clause = 'when' _ '(' _ expression _ ')' _ 'then' _ block _
otherwise_clause = 'otherwise' (__ 'do')? _ block _

for_statement = 'for' __ identifier _ 'from' alone _ expression _
('down' _)? 'to' alone _ expression _ ('by' alone _ expression)? _
'repeat' _ block
while_statement = 'while' _ '(' _ expression _ ')' _ 'repeat' _ block
until_statement = 'repeat' _ block _ 'until' _ '(' _ expression _ ')'

continue_statement = 'continue' alone
break_statement = 'break' alone
return_statement = 'return' alone _ expression?

expression = assignment / or_term
or_term = (and_term _ 'or' alone _)* and_term
and_term = (not_term _ 'and' alone _)* not_term
not_term = ('not' alone _)* comparison
comparison = (additive_term _ ('<' ![=>] / '<=' / '=' !'>' / '>=' / '>' !'=' / '<>') _)? additive_term
additive_term = (multiplicative_term _ ('+' / '-') _)* multiplicative_term
multiplicative_term = (exponential_term _ ('*' / '/' / 'div' alone / 'mod' alone) _)* exponential_term
exponential_term = invocation (_ '^' _ exponential_term)?
invocation = primary ( _ '(' _ argument_list? _ ')')?
primary = literal / identifier / '(' _ expression _ ')' ! (_ '=>')
/ lambda_definition / native_definition

assignment = global_assignment / local_assignment / generic_assignment
global_assignment = 'global' __ identifier _ ':=' _ expression
local_assignment = 'local' __ identifier _ ':=' _ expression
generic_assignment = identifier _ ':=' _ expression

lambda_definition = '(' _ parameter_list? _ ')' _ '=>' _ block
native_definition = 'native' _ '(' _ parameter_list? _ ')' _ '=>' _ string

parameter_list = identifier (_ ',' _ identifier)*
argument_list = expression (_ ',' _ expression)*

literal = 'nil' alone / boolean / number / string
boolean = 'true' alone / 'false' alone
number = integer / floating_point / 'pi' alone / 'e' alone / 'nan' alone
integer = binary / decimal / hexadecimal
binary = '0b' [01]+
decimal = [+-]? digit+ ! '.'
hexadecimal = '0x' hex_digit+
floating_point = mantissa exponent?
mantissa = [+-]? (digit+ '.' digit* / '.' digit+)
exponent = [eE] [+-]? digit+
string = single_quoted_text / double_quoted_text /
single_quoted_string / double_quoted_string
single_quoted_string = "'" (escape_sequence / & no_control_char [^\'])* "'"
double_quoted_string = '"' (escape_sequence / & no_control_char [^\"])* '"'
single_quoted_text = "'''" (escape_sequence / !"'''" [\'] / [^\'])* "'''"
double_quoted_text = '"""' (escape_sequence / !'"""' [\"] / [^\"])* '"""'
escape_sequence = ('\\' [bfnrtv0'"\\]) / ('\\x' hex_digit{2}) / ('\\u' hex_digit{4})
no_control_char = [^\x00-\x1F\x7F-\x9F\u200B\u2028\u2029\u2060]
digit = [0-9]
hex_digit = [0-9a-fA-F]

identifier = ! (reserved_word alone) identifier_start identifier_part*
identifier_start = [$_a-zA-Z]
identifier_part = [$_a-zA-Z0-9]

alone = ! identifier_part

reserved_word = 'global' / 'local' / 'it' / 'native'
/ 'nil' / 'true' / 'false' / 'nan' / 'e' / 'pi'
/ 'div' / 'mod' / 'and' / 'or' / 'not'
/ 'if' / 'then' / 'else' / 'select' / 'when' / 'otherwise'
/ 'for' / 'from' / 'down' / 'to' / 'by' / 'repeat'
/ 'while' / 'until' / 'continue' / 'break' / 'return'

comment = (line_comment / block_comment)
line_comment = '//' [^\n\r]* [\n\r]*
block_comment = '/*' (!'*/' .)* '*/'

_ = ([ \t\n\r] / comment)*
__ = ([ \t\n\r] / comment)+
/**** syntax smoke tests ****/

/* block comment */
// line comment

/**** literals ****/

nil
true false
e pi nan; +123; -123. 123.456 .456 .456e+78 .456e-78 .456e78
'' 'single-quoted string' 'escape sequences \b\f\n\r\t\v\0\'\"\\ \x12 \u1234 '
"" "double-quoted string" "escape sequences \b\f\n\r\t\v\0\'\"\\ \x12 \u1234 "
'''
single-quoted text
with ' " and escape sequences \b\f\n\r\t\v\0\'\"\\ \x12 \u1234
'''
"""
double-quoted text
with ' " and escape sequences \b\f\n\r\t\v\0\'\"\\ \x12 \u1234
"""

/**** variable definition and access ****/

a b := true local c := nil global d := pi

a := () => {}
b := (a) => { a }
c := (a,b,c) => { a+b+c }

a := native () => ''
b := native (a) => "return a"
c := native (a,b,c) => "return a+b+c"

/**** statements ****/

if (a = b) then { true }
if (a = b) then { true } else { false }

select {
when (a > b) then { 'greater than' }
when (a = b) then { 'equal' }
otherwise { 'smaller than' }
}

for i from 0 to 9 repeat { i }
for i from 9 down to 0 by -1 repeat { i }

i := 0
while (i < 9) repeat { i := i+1 }
repeat { i := i-1 } until (i = 0)

break continue return

/**** expressions ****/

a + b - c * d / e div f mod g ^ h
a < b b <= c c = d d >= e e > f f <> g
not a or b and not c

slang Grammar with AST Generation

With a few additional JavaScript decorations, this grammar is able to output an "abstract syntax tree" (AST).

{
let whitespace = {}

function without_whitespace (list) {
let result = []
for (let i = 0, l = list.length; i < l; i++) {
switch (true) {
case (list[i] === whitespace):
continue
case Array.isArray(list[i]):
result.push(without_whitespace(list[i]))
break
default:
result.push(list[i])
}
}
return result
}

let name_of = Object.assign(Object.create(null),{
'and':'and', 'or':'or', 'not':'not',
'<':'lt', '<=':'le', '=':'eq', '>=':'ge', '>':'gt',
'+':'plus', '-':'minus', '*':'times', '/':'through', 'div':'div', 'mod':'mod',
'^':'raised'
})

function prefixed (operators, operand) {
return (
operators == null
? operand
: operators.reduceRight(
(result,operator) => ['#call',name_of[operator],result], operand
)
)
}

function left_associative (left_operand, operations) {
return (
operations == null
? left_operand
: without_whitespace(operations).reduce(
(result,operation) => ['#call',name_of[operation[0]],result,operation[1]],
left_operand
)
)
}

function right_associative (left_operand, operation) {
return (
operation == null
? left_operand
: (
operation = without_whitespace(operation),
['#call',name_of[operation[0]],left_operand,operation[1]]
)
)
}

function unescaped (char_list) {
return char_list.join('').replace(/\\[bfnrtv0'"\\]|\\x[0-9]{2}|\\u[0-9a-f]{4}/gi,(match) => {
switch (match.charAt(1)) {
case 'b': return '\b'; case 'f': return '\\f'
case 'n': return '\n'; case 'r': return '\\r'
case 't': return '\t'; case 'v': return '\\v'
case '0': return '\0'; case "'": return "'"
case '"': return '"'; case '\\': return '\\'
case 'x':
case 'u': return String.fromCharCode(parseInt(match.slice(2),16))
}
})
}
}

script = statements:statement_list? _ { return statements }
block = '{' statements:statement_list _ '}'
{ return ['#block',statements] }
statement_list = statements:(_ statement / _ ';' )*
{ return statements.filter((stmt) => stmt[1] !== ';').map((stmt) => stmt[1]) }
statement = if_statement / select_statement
/ for_statement / while_statement / until_statement
/ continue_statement / break_statement / return_statement
/ expression

if_statement = 'if' _ '(' _ condition:expression _ ')' _ 'then' _ then_clause:block
else_clause:(_ 'else' _ block:block { return block })?
{ return ['#if_then_else', condition, then_clause, else_clause] }
select_statement = 'select' _ '{' _ when_clauses:when_clause* otherwise_clause:otherwise_clause? '}'
{ return ['#select_when', otherwise_clause].concat(when_clauses) }
when_clause = 'when' _ '(' _ condition:expression _ ')' _ 'then' _ block:block _
{ return [condition,block] }
otherwise_clause = 'otherwise' (__ 'do')? _ block:block _ { return block }

for_statement = 'for' __ identifier:identifier _ 'from' alone _ start_value:expression _
downwards:('down' _ { return true })? 'to' alone _ stop_value:expression _
step_value:('by' alone _ value:expression { return value })? _
'repeat' _ loop_body:block
{ return ['#for_repeat', identifier, ['#value',!!downwards], start_value, stop_value, step_value, loop_body] }
while_statement = 'while' _ '(' _ expression:expression _ ')' _ 'repeat' _ block:block
{ return ['#while_repeat',['#block',expression],block] }
until_statement = 'repeat' _ block:block _ 'until' _ '(' _ expression:expression _ ')'
{ return ['#repeat_until',block,['#block',expression]] }

continue_statement = 'continue' alone { return ['#continue'] }
break_statement = 'break' alone { return ['#break'] }
return_statement = 'return' alone _ argument:expression? { return ['#return'].concat(argument || []) }

expression = assignment / or_term
or_term = left_operand:and_term operations:(_ 'or' alone _ and_term)*
{ return left_associative(left_operand,operations) }
and_term = left_operand:not_term operations:(_ 'and' alone _ not_term)*
{ return left_associative(left_operand,operations) }
not_term = operators:('not' alone _)* operand:comparison
{ return prefixed(operators,operand) }
comparison = left_operand:additive_term operations:(_ ('<' ![=>] / '<=' / '=' !'>' / '>=' / '>' !'=' / '<>') _ additive_term)?
{ return left_associative(left_operand,operations) }
additive_term = left_operand:multiplicative_term operations:(_ ('+' / '-') _ multiplicative_term)*
{ return left_associative(left_operand,operations) }
multiplicative_term = left_operand:exponential_term operations:(_ ('*' / '/' / 'div' alone / 'mod' alone) _ exponential_term)*
{ return left_associative(left_operand,operations) }
exponential_term = left_operand:invocation operation:(_ '^' _ exponential_term)?
{ return right_associative(left_operand,operation) }
invocation = callee:primary args:( _ '(' _ argument_list? _ ')')?
{ return (args == null ? callee : ['#call',callee].concat(args[3] || [])) }
primary = literal
/ 'global' __ identifier:identifier { return ['#get-global',identifier] }
/ 'local' __ identifier:identifier { return ['#get-local',identifier] }
/ identifier:identifier { return ['#get-var',identifier] }
/ '(' _ expression:expression _ ')' ! (_ '=>') { return expression }
/ lambda_definition / native_definition

assignment = global_assignment / local_assignment / generic_assignment
global_assignment = 'global' __ key:identifier _ ':=' _ value:expression
{ return ['#set-global',key,value] }
local_assignment = 'local' __ key:identifier _ ':=' _ value:expression
{ return ['#set-local',key,value] }
generic_assignment = key:identifier _ ':=' _ value:expression
{ return ['#set-var',key,value] }

lambda_definition = '(' _ parameters:parameter_list? _ ')' _ '=>' _ body:block
{ return ['#lambda',parameters,body] }
native_definition = 'native' _ '(' _ parameters:parameter_list? _ ')' _ '=>' _ body:string
{ return ['#native',parameters,body] }

parameter_list = identifier:identifier identifiers:(_ ',' _ identifier)*
{ return [identifier].concat((identifiers || []).map((list) => list[3])) }
argument_list = expression:expression expressions:(_ ',' _ expression)*
{ return [expression].concat((expressions || []).map((list) => list[3])) }

literal = value:('nil' alone { return undefined } / boolean / number / string)
{ return ['#value',value] }
boolean = 'true' alone { return true }
/ 'false' alone { return false }
number = integer / floating_point / 'nan' alone { return NaN }
integer = binary { return parseInt(text().slice(2),2) }
/ decimal { return parseFloat(text()) }
/ hexadecimal { return parseInt(text().slice(2),16) }
binary = '0b' [01]+
decimal = [+-]? digit+ ! '.'
hexadecimal = '0x' hex_digit+
floating_point = mantissa exponent?
mantissa = [+-]? (digit+ '.' digit* / '.' digit+)
exponent = [eE] [+-]? digit+
string = single_quoted_text / double_quoted_text /
single_quoted_string / double_quoted_string
single_quoted_string = "'" content:(escape_sequence / & no_control_char [^\'])* "'" { return unescaped(content) }
double_quoted_string = '"' content:(escape_sequence / & no_control_char [^\"])* '"' { return unescaped(content) }
single_quoted_text = "'''" content:(escape_sequence / !"'''" [\'] / [^\'])* "'''" { return unescaped(content) }
double_quoted_text = '"""' content:(escape_sequence / !'"""' [\"] / [^\"])* '"""' { return unescaped(content) }
escape_sequence = ('\\' [bfnrtv0'"\\]) / ('\\x' hex_digit{2}) / ('\\u' hex_digit{4})
no_control_char = [^\x00-\x1F\x7F-\x9F\u200B\u2028\u2029\u2060]
digit = [0-9]
hex_digit = [0-9a-fA-F]

identifier = ! (reserved_word alone) identifier_start identifier_part*
{ return ['#value',text()] } /* makes control statements callable */
identifier_start = [$_a-zA-Z]
identifier_part = [$_a-zA-Z0-9]

alone = ! identifier_part

reserved_word = 'global' / 'local' / 'native'
/ 'nil' / 'true' / 'false' / 'nan'
/ 'div' / 'mod' / 'and' / 'or' / 'not'
/ 'if' / 'then' / 'else' / 'select' / 'when' / 'otherwise'
/ 'for' / 'from' / 'down' / 'to' / 'by' / 'repeat'
/ 'while' / 'until' / 'continue' / 'break' / 'return'

comment = (line_comment / block_comment) { return whitespace }
line_comment = '//' [^\n\r]* [\n\r]*
block_comment = '/*' (!'*/' .)* '*/'

_ = ([ \t\n\r] / comment)* { return whitespace }
__ = ([ \t\n\r] / comment)+ { return whitespace }

Complete slang Interpreter

It is not far from the AST output to a full-fledged interpreter - however, some JavaScript code must be written for this, so the domain of pure syntax specification is left behind.

{
let whitespace = {}

function without_whitespace (list) {
let result = []
for (let i = 0, l = list.length; i < l; i++) {
switch (true) {
case (list[i] === whitespace):
case (list[i] === undefined):
continue
case Array.isArray(list[i]):
result.push(without_whitespace(list[i]))
break
default:
result.push(list[i])
}
}
return result
}

/**** constructor functions ****/

function form (type, items) {
Object.assign(this,{ type,items })
}

function block (statement_list, context) {
Object.assign(this,{ statement_list,context })
}

function lambda (parameter_list, statement_list) {
Object.assign(this,{ parameter_list, statement_list })
}

function new_native (parameter_list, native_body) {
return (
parameter_list.length === 0
? new Function(native_body)
: new Function(parameter_list.join(),native_body)
)
}

/**** operator handling ****/

let name_of = Object.assign(Object.create(null),{
'and':'and', 'or':'or', 'not':'not',
'<':'lt', '<=':'le', '=':'eq', '>=':'ge', '>':'gt', '<>':'ne',
'+':'plus', '-':'minus', '*':'times', '/':'through', 'div':'div', 'mod':'mod',
'neg':'neg', '^':'raised'
})

function prefixed (operators, operand) {
return (
operators.length === 0
? operand
: without_whitespace(operators).reduceRight(
(result,operator) => new form('#call-global',[name_of[operator],result]), operand
)
)
}

function left_associative (left_operand, operations) {
return (
operations.length === 0
? left_operand
: without_whitespace(operations).reduce(
(result,operation) => new form('#call-global',[name_of[operation[0]],result,operation[1]]),
left_operand
)
)
}

function right_associative (left_operand, operation) {
return (
operation == null
? left_operand
: (
operation = without_whitespace(operation),
new form('#call-global',[name_of[operation[0]],left_operand,operation[1]])
)
)
}

function unescaped (text) {
return text.replace(/\\[bfnrtv0'"\\]|\\x[0-9]{2}|\\u[0-9a-f]{4}/gi,(match) => {
switch (match.charAt(1)) {
case 'b': return '\b'; case 'f': return '\\f'
case 'n': return '\n'; case 'r': return '\\r'
case 't': return '\t'; case 'v': return '\\v'
case '0': return '\0'; case "'": return "'"
case '"': return '"'; case '\\': return '\\'
case 'x':
case 'u': return String.fromCharCode(parseInt(match.slice(2),16))
}
})
}

function flattened (list) {
return list.reduce((result,sublist) => result.concat(sublist),[])
}

/**** script evaluation ****/

function evaluated_script (statement_list) {
let context = Object.create(global_context)
let callee = new block(statement_list,context)

activation_stack.push({ callee,context })
let result
try {
result = executed_block(callee,[],context)
} catch (signal) {
switch (true) {
case (signal instanceof loop_continuation):
throw new Error('no loop to be continued')
case (signal instanceof loop_termination):
throw new Error('no loop to be terminated')
case (signal instanceof function_termination):
throw new Error('cannot "return" from script')
case (signal instanceof function_replacement):
throw new Error('cannot "return" from script')
default:
throw signal
}
}
activation_stack.pop()

console.log('script result',result)
return result
}

/**** runtime system ****/

let global_context = Object.create(null)
let activation_stack = []

function context_contains (context, var_name) {
return Object.prototype.hasOwnProperty.call(context,var_name)
}

function evaluated (value, context) {
return (value instanceof form ? evaluated_form(value,context) : value)
}

function evaluated_form (form, context) {
let callee, value_list
switch (form.type) {
case '#block': return new block(form.items,context)
case '#get-var': return context[form.items[0]]
case '#set-var': return set_var(context,form.items[0],evaluated(form.items[1],context))
case '#get-global': return global_context[form.items[0]]
case '#set-global': return global_context[form.items[0]] = evaluated(form.items[1],context)
case '#get-local': return (context_contains(context,form.items[0]) ? context[form.items[0]] : undefined)
case '#set-local': return context[form.items[0]] = evaluated(form.items[1],context)
case '#call':
value_list = form.items.map((value) => evaluated(value,context))
return executed(value_list[0],value_list.slice(1))
case '#call_tce':
value_list = form.items.map((value) => evaluated(value,context))
throw new function_replacement(value_list[0],value_list.slice(1))
case '#call-global':
callee = global_context[form.items[0]]
value_list = form.items.slice(1).map((value) => evaluated(value,context))
return executed(callee,value_list)
default: throw new Error('unforeseen form type "' + form.type + '"')
}
}

function executed (callee, argument_list) {
for (;;) {
let context, stack_depth, result
try {
switch (true) {
case (typeof callee === 'function'):
return callee.apply(activation_stack,argument_list)
case (callee instanceof block):
context = Object.create(callee.context)
stack_depth = activation_stack.length
activation_stack.push({ callee,context })
result = executed_block(callee,argument_list,context)
activation_stack.pop()
return result
case (callee instanceof lambda):
context = Object.create(global_context)
stack_depth = activation_stack.length
activation_stack.push({ callee, context })
result = executed_block(callee,argument_list,context)
activation_stack.pop()
return result
default:
console.log('cannot execute',callee,'(',argument_list,')')
throw new Error('cannot execute value of type "' + typeof callee + '"')
}
} catch (signal) {
switch (true) {
case (signal instanceof loop_continuation):
if (callee instanceof lambda) {
throw new Error('no loop to be continued')
}
throw signal
case (signal instanceof loop_termination):
if (callee instanceof lambda) {
throw new Error('no loop to be terminated')
}
throw signal
case (signal instanceof function_termination):
if (callee instanceof lambda) {
activation_stack.length = stack_depth
return signal.value
}
throw signal
case (signal instanceof function_replacement):
if (callee instanceof lambda) {
activation_stack.length = stack_depth
callee = signal.callee; argument_list = signal.argument_list
continue
}
throw signal
default: throw signal
}
}
}
}

function executed_block (callee, argument_list, context) {
if (callee.parameter_list != null) {
callee.parameter_list.forEach((parameter,index) => {
context[parameter] = argument_list[index]
})
}

let statement_list = callee.statement_list
for (let i = 0, l = statement_list.length; i < l; i++) {
context.it = evaluated(statement_list[i],context)
}

return context.it
}

function set_var (context, var_name, var_value) {
let original_context = context
while (context !== global_context) {
if (context_contains(context,var_name)) { return context[var_name] = var_value }
context = Object.getPrototypeOf(context)
}
return original_context[var_name] = var_value
}

/**** special exceptions ****/

function loop_continuation () { /* nop */ }
function loop_termination (value) { this.value = value }

function function_termination (value) { this.value = value }
function function_replacement (callee, argument_list) {
Object.assign(this,{ callee,argument_list })
}

/**** auxiliary functions ****/

function active_context () {
return activation_stack[activation_stack.length-1].context
}/**** runtime environment ****/

function throwError (message) { throw new Error(message) }

function boolean_type (a) {
return (typeof a === 'boolean') ||
throwError('boolean value expected (got ' + typeof a + ')')
}

function boolean_types (a,b) {
return boolean_type(a) || boolean_type(b)
}

function numeric_type (a) {
return (typeof a === 'number') ||
throwError('numeric value expected (got ' + typeof a + ')')
}

function numeric_types (a,b) {
return numeric_type(a) || numeric_type(b)
}

function block_type (a) {
return (a instanceof block) ||
throwError('statement block expected')
}

function identifier_type (a) {
return (typeof a === 'string') && /^[$_a-z][$_a-z0-9]*$/i.test(a) ||
throwError('valid identifier expected (got ' + typeof a + ')')
}

function same_types (a,b) {
return (typeof a === typeof b) ||
throwError('values are of different types (' + typeof a + ' <> ' + typeof b + ')')
}

function addable_types (a,b) {
return (
((typeof a === 'number') || (typeof a === 'string')) &&
((typeof b === 'number') || (typeof b === 'string')) ||
throwError('numeric or literal values expected')
)
}

let assertion_counter = 0

Object.assign(global_context,{
nil:undefined,
true:true, false:false,

plus: (a,b) => addable_types(a,b) && (a+b),
minus: (a,b) => numeric_types(a,b) && (a-b),
times: (a,b) => numeric_types(a,b) && (a*b),
through: (a,b) => numeric_types(a,b) && (a/b),
div: (a,b) => numeric_types(a,b) && Math.trunc(a/b),
mod: (a,b) => numeric_types(a,b) && (a%b),
raised: (a,b) => numeric_types(a,b) && Math.pow(a,b),

neg: (a) => numeric_type(a) && (-a),

lt: (a,b) => same_types(a,b) && (a < b),
le: (a,b) => same_types(a,b) && (a <= b),
eq: (a,b) => (a === b),
ge: (a,b) => same_types(a,b) && (a >= b),
gt: (a,b) => same_types(a,b) && (a > b),
ne: (a,b) => (a !== b),

and: (a,b) => boolean_types(a,b) && (a && b),
or: (a,b) => boolean_types(a,b) && (a || b),
not: (a) => boolean_type(a) && (! a),

pi: Math.PI,
e: Math.E,

is_nan: isNaN,
sqrt: (a) => numeric_type(a) && Math.sqrt(a),
sin: (a) => numeric_type(a) && Math.sin(a),
cos: (a) => numeric_type(a) && Math.cos(a),
tan: (a) => numeric_type(a) && Math.tan(a),
rnd: (a) => numeric_type(a) && (Math.random() * a),

if_then_else: (condition, then_clause, else_clause) => {
switch (condition) {
case true: return block_type(then_clause) && executed(then_clause,[])
case false: return (
else_clause == null
? undefined
: block_type(else_clause) && executed(else_clause,[])
)
default: throw new Error('boolean value expected')
}
},

select_when: function (otherwise_clause /* condition,block... */) {
let clause_list = Array.prototype.slice.call(arguments,1)

for (let i = 0, l = clause_list.length; i < l; i += 2) {
let condition = clause_list[i]
if (
boolean_type(
condition instanceof block
? condition = executed(condition,[])
: condition
) && (condition == true)
) {
return executed(clause_list[i+1],[])
}
}

return otherwise_clause == null ? undefined : executed(otherwise_clause,[])
},

for_repeat: (identifier, downwards, start_value, stop_value, step_value, loop_body) => {
identifier_type(identifier) && boolean_type(downwards) &&
numeric_type(start_value) && numeric_type(stop_value) &&
((step_value == null) || numeric_type(step_value)) &&
block_type(loop_body)

if (step_value === 0) { throw new Error('step value must not be 0') }
if (step_value == null) { step_value = (downwards ? -1 : 1) }

let callee = global_context.for_repeat
let context = Object.create(active_context())
activation_stack.push({ callee,context }); let stack_depth = activation_stack.length
let i, to_be_continued = () => (downwards ? i >= stop_value : i <= stop_value)

for (i = start_value; to_be_continued(); i += step_value) {
context[identifier] = i

try {
executed_block(loop_body,[],context)
} catch (signal) {
switch (true) {
case (signal instanceof loop_continuation):
activation_stack.length = stack_depth
break
case (signal instanceof loop_termination):
activation_stack.length = stack_depth-1
return signal.value
default: throw signal
}
}
}
activation_stack.pop()

return context.it
},

while_repeat: (condition, loop_body) => {
block_type(condition) && block_type(loop_body)

let callee = global_context.while_repeat
let context = Object.create(active_context())
activation_stack.push({ callee,context }); let stack_depth = activation_stack.length
for (;;) {
try {
let condition_value = executed_block(condition,[],context)
if (boolean_type(condition_value) && (condition_value == false)) { break }
} catch (signal) {
switch (true) {
case (signal instanceof loop_continuation):
throw new Error('must not "continue" in loop condition')
case (signal instanceof loop_termination):
throw new Error('must not "break" in loop condition')
default: throw signal
}
}

try {
executed_block(loop_body,[],context)
} catch (signal) {
switch (true) {
case (signal instanceof loop_continuation):
activation_stack.length = stack_depth
continue
case (signal instanceof loop_termination):
activation_stack.length = stack_depth-1
return signal.value
default: throw signal
}
}
}
activation_stack.pop()

return context.it
},
repeat_until: (loop_body, condition) => {
block_type(loop_body)

let callee = global_context.while_repeat
let context = Object.create(active_context())
activation_stack.push({ callee,context }); let stack_depth = activation_stack.length
for (;;) {
try {
executed_block(loop_body,[],context)
} catch (signal) {
switch (true) {
case (signal instanceof loop_continuation):
activation_stack.length = stack_depth
break
case (signal instanceof loop_termination):
activation_stack.length = stack_depth-1
return signal.value
default: throw signal
}
}

try {
let condition_value = executed_block(condition,[],context)
if (boolean_type(condition_value) && (condition_value == false)) { break }
} catch (signal) {
switch (true) {
case (signal instanceof loop_continuation):
throw new Error('must not "continue" in loop condition')
case (signal instanceof loop_termination):
throw new Error('must not "break" in loop condition')
default: throw signal
}
}
}
activation_stack.pop()

return context.it
},

'break': (value) => {
throw new loop_termination( value == null ? undefined : value )
},

'continue': () => {
throw new loop_continuation()
},

'return': (value) => {
throw new function_termination( value == null ? undefined : value )
},

assert: (value) => {
assertion_counter++
if (value === true) {
return true
} else {
throw new Error('assertion #' + assertion_counter + ' failed')
}
},

assert_failure: (block) => {
block_type(block)

assertion_counter++
try {
executed(block,[])
} catch (signal) { return true }

throw new Error(
'assertion #' + assertion_counter + ' failed (block did not fail)'
)
},

log: (value) => {
console.log(value)
}
})
}

script = statements:statement_list _
{ console.log('statements',statements); return evaluated_script(statements) }
block = '{' statements:statement_list _ '}'
{ return new form('#block',statements) }
statement_list = statements:(_ statement / _ ';' )*
{ return statements.filter((stmt) => stmt[1] !== ';').map((stmt) => stmt[1]) }
statement = if_statement / select_statement
/ for_statement / while_statement / until_statement
/ continue_statement / break_statement / return_statement
/ expression

if_statement = 'if' _ '(' _ condition:expression _ ')' _ 'then' _ then_clause:block
else_clause:(_ 'else' _ block:block { return block })?
{ return new form('#call-global',['if_then_else',condition,then_clause,else_clause]) }
select_statement = 'select' _ '{' _ when_clauses:when_clause* otherwise_clause:otherwise_clause? '}'
{ return new form('#call-global',['select_when',otherwise_clause].concat(flattened(when_clauses))) }
when_clause = 'when' _ '(' _ condition:expression _ ')' _ 'then' _ block:block _
{ return [condition,block] }
otherwise_clause = 'otherwise' (__ 'do')? _ block:block _ { return block }

for_statement = 'for' __ identifier:identifier _ 'from' alone _ start_value:expression _
downwards:('down' _ { return true })? 'to' alone _ stop_value:expression _
step_value:('by' alone _ value:expression { return value })? _
'repeat' _ loop_body:block
{ return new form('#call-global',['for_repeat',identifier,!!downwards,start_value,stop_value,step_value,loop_body]) }
while_statement = 'while' _ '(' _ expression:expression _ ')' _ 'repeat' _ loop_body:block
{ return new form('#call-global',['while_repeat',new form('#block',[expression]),loop_body]) }
until_statement = 'repeat' _ loop_body:block _ 'until' _ '(' _ expression:expression _ ')'
{ return new form('#call-global',['repeat_until',loop_body,new form('#block',[expression])]) }

continue_statement = 'continue' alone { return new form('#call-global',['continue']) }
break_statement = 'break' alone (_ result:expression { return result })?
{ return new form('#call-global',['break',result]) }
return_statement = 'return' alone _ result:expression?
{
if ((result instanceof form) && (result.type === '#call')) {
result.type = '#call_tce'
}
return new form('#call-global',['return',result])
}

expression = assignment / or_term
or_term = left_operand:and_term operations:(_ 'or' alone _ and_term)*
{ return left_associative(left_operand,operations) }
and_term = left_operand:not_term operations:(_ 'and' alone _ not_term)*
{ return left_associative(left_operand,operations) }
not_term = operators:('not' alone _)* operand:comparison
{ return prefixed(operators,operand) }
comparison = left_operand:additive_term operation:(_ ('<' ![=>] { return '<' } / '<=' / '=' !'>' { return '=' } / '>=' / '>' !'=' { return '>' } / '<>') _ additive_term)?
{ return left_associative(left_operand,operation == null ? [] : [operation]) }
additive_term = left_operand:multiplicative_term operations:(_ ('+' / '-') _ multiplicative_term)*
{ return left_associative(left_operand,operations) }
multiplicative_term = left_operand:negation_term operations:(_ ('*' / '/' / 'div' alone { return 'div' } / 'mod' alone { return 'mod' }) _ negation_term)*
{ return left_associative(left_operand,operations) }
negation_term = operators:('-' ! (_ digit) _ { return 'neg' })* operand:exponential_term
{ return prefixed(operators,operand) }
exponential_term = left_operand:invocation operation:(_ '^' _ exponential_term)?
{ return right_associative(left_operand,operation) }
invocation = callee:primary args:( _ '(' _ argument_list? _ ')')?
{ return (args == null ? callee : new form('#call',[callee].concat(args[3] || []))) }
primary = literal
/ 'global' __ identifier:identifier { return new form('#get-global',[identifier]) }
/ 'local' __ identifier:identifier { return new form('#get-local',[identifier]) }
/ identifier:identifier { return new form('#get-var',[identifier]) }
/ '(' _ expression:expression _ ')' ! (_ '=>') { return expression }
/ lambda_definition / native_definition

assignment = global_assignment / local_assignment / generic_assignment
global_assignment = 'global' __ key:identifier _ ':=' _ value:expression
{ return new form('#set-global',[key,value]) }
local_assignment = 'local' __ key:identifier _ ':=' _ value:expression
{ return new form('#set-local',[key,value]) }
generic_assignment = key:identifier _ ':=' _ value:expression
{ return new form('#set-var',[key,value]) }

lambda_definition = '(' _ parameters:parameter_list? _ ')' _ '=>' _ body:block
{ return new lambda(parameters || [],body.items) }
native_definition = 'native' _ '(' _ parameters:parameter_list? _ ')' _ '=>' _ body:string
{ return new_native(parameters || [],body) }

parameter_list = identifier:identifier identifiers:(_ ',' _ identifier)*
{ return [identifier].concat((identifiers || []).map((list) => list[3])) }
argument_list = expression:expression expressions:(_ ',' _ expression)*
{ return [expression].concat((expressions || []).map((list) => list[3])) }

literal = 'nil' alone { return undefined }
/ boolean / number / string
/ block
boolean = 'true' alone { return true }
/ 'false' alone { return false }
number = binary / hexadecimal / decimal / 'nan' alone { return NaN }
binary = '0b' [01]+ { return parseInt(text().slice(2),2) }
hexadecimal = '0x' hex_digit+ { return parseInt(text().slice(2),16) }
decimal = mantissa exponent? { return parseFloat(text()) }
mantissa = [+-]? (digit+ ('.' digit*)? / '.' digit+)
exponent = [eE] [+-]? digit+
string = single_quoted_text / double_quoted_text /
single_quoted_string / double_quoted_string
single_quoted_string = "'" content:(escape_sequence / & no_control_char char:[^\'] { return char })* "'" { return unescaped(content.join('')) }
double_quoted_string = '"' content:(escape_sequence / & no_control_char char:[^\"] { return char })* '"' { return unescaped(content.join('')) }
single_quoted_text = "'''" content:(escape_sequence / !"'''" char:[\'] { return char } / [^\'])* "'''" { return unescaped(content.join('')) }
double_quoted_text = '"""' content:(escape_sequence / !'"""' char:[\"] { return char } / [^\"])* '"""' { return unescaped(content.join('')) }
escape_sequence = (('\\' [bfnrtv0'"\\]) / ('\\x' hex_digit{2}) / ('\\u' hex_digit{4}))
{ return text() }
no_control_char = [^\x00-\x1F\x7F-\x9F\u200B\u2028\u2029\u2060]
digit = [0-9]
hex_digit = [0-9a-fA-F]

identifier = ! (reserved_word alone) identifier_start identifier_part*
{ return text() }
identifier_start = [$_a-zA-Z]
identifier_part = [$_a-zA-Z0-9]

alone = ! identifier_part

reserved_word = 'global' / 'local' / 'native'
/ 'nil' / 'true' / 'false' / 'nan'
/ 'div' / 'mod' / 'and' / 'or' / 'not'
/ 'if' / 'then' / 'else' / 'select' / 'when' / 'otherwise'
/ 'for' / 'from' / 'down' / 'to' / 'by' / 'repeat'
/ 'while' / 'until' / 'continue' / 'break' / 'return'

comment = (line_comment / block_comment) { return whitespace }
line_comment = '//' [^\n\r]* [\n\r]*
block_comment = '/*' (!'*/' .)* '*/'

_ = ([ \t\n\r] / comment)* { return whitespace }
__ = ([ \t\n\r] / comment)+ { return whitespace }

Examples

The following examples show the use of different values, operators and statements - and at the same time serve as a kind of "smoke test" for the interpreter

  assert(nil = nil)

assert(true)
assert(not false)

assert(pi > 3)
assert(e > 2)
assert(is_nan(nan))

assert(0b1001 = 9)
assert(0x11 = 17)

assert(-123 < 0)
assert(+123.456 > 123)
assert(.123e3 = 123)

assert('Test' = "Test")
assert('''
Test
''' = """
Test
""")
assert('''
Test
''' = '\n Test\n ')

global test := 'Test ' + (0b0001+2*0x03)
assert(test = 'Test 7')
  assert((true and false) = false)
assert((true or false) = true)
assert((not true) = false)

assert(0 < 1)
assert(0 <= 0)
assert(1 = 1)
assert(1 >= 1)
assert(1 > 0)
assert(1 <> 0)

assert(1 + 1 = 2)
assert(1 - 1 = 0)
assert(1 * 1 = 1)
assert(1 / 1 = 1)
assert(3 div 2 = 1)
assert(3 mod 2 = 1)
assert(-(1) < 0)
assert(2 ^ 3 = 8)
  global a := 'global a'
a := 'a'
{
assert(a = 'a')
assert(global a = 'global a')

a := 'context a'
assert(a = 'context a')
assert(global a = 'global a')

local a := 'local a'
assert(a = 'local a')
assert(global a = 'global a')
}()

assert(a = 'context a')
  a := 'test'
block := { b := 'function ' b + a }
assert (block() = 'function test')

fn := (a,b) => { return a + 'function ' + b }
assert (fn('this is a ',a) = 'this is a function test')

fn := native (a,b) => "return a + 'function ' + b"
assert (fn('this is a ',a) = 'this is a function test')
  a := 0
if (a = 0) then { 'zero' } else { 'not zero' }
assert(it = 'zero')

if (a < 0) then { 'negative' } else { 'not negative' }
assert(it = 'not negative')

select {
when (a < 0) then { 'negative' }
when (a = 0) then { 'zero' }
otherwise { 'positive' }
}
assert(it = 'zero')

test := ''
for i from 1 to 10 by 2 repeat { test := test + i }
assert(test = '13579')

test := ''
for i from 9 down to 1 by -2 repeat { test := test + i }
assert(test = '97531')

test := '' i := 1
while (i < 10) repeat {
if (i mod 2 = 0) then { test := test + i }
i := i + 1
}
assert(test = '2468')

test := '' i := 1
repeat {
if (i mod 2 = 0) then { test := test + i }
i := i + 1
} until (i < 10)
assert(test = '2468')

test := ''
for i from 1 to 10 repeat {
if (i mod 2 = 0) then { continue }
test := test + i
}
assert(test = '13579')

test := '' i := 1
while (i < 100) repeat {
if (i mod 2 = 0) then { test := test + i }
i := i + 1
if (i > 9) then { break } else { continue }
}
assert(test = '2468')

Test of "Tail Call Elimination" in slang

One of the special features of "slang" is the elimination of "tail calls" and especially "tail recursions" - in this way, loops can be safely replaced by recursions (similar to some strictly functional programming languages).

Among the classic examples of functions that can be rewritten for tail recursion elimination are the mathematical factorial and the calculation of the Fibonacci series - even though modern languages (such as JavaScript) are nowadays more likely to reach the limit of the numerical value range on powerful computers than to encounter problems with recursion.

  native_factorial := native (n) => '''
function factorial (n) {
return (n <= 1 ? 1 : n * factorial(n-1))
}
return factorial(n)
'''

global factorial := (n) => {
if (n <= 1) then { 1 } else { n * factorial(n-1) }
}
assert(native_factorial(170) = factorial(170))
  native_fibonacci := native (n) => '''
function fibonacci (n, aux1, aux2) {
if (n <= 1) {
return aux1 + aux2
} else {
return fibonacci(n-1, aux1 + aux2, aux1)
}
}
return fibonacci(n-1,1,0)
'''

global fibonacci := (n, aux1, aux2) => {
if ((aux1 = nil) or (aux2 = nil)) then {
return fibonacci(n-1,1,0)
} else {
if (n <= 1) then {
return aux1 + aux2
} else {
return fibonacci(n-1, aux1 + aux2, aux1)
}
}
}

assert(native_fibonacci(1476) = fibonacci(1476))

If you want to bring a modern computer to its knees, you have to resort to more "violent" methods - e.g. using recursions instead of loops.

  native_sum_up := native (n, sum) => '''
function sum_up (n, sum) {
return (n <= 0 ? sum : sum_up(n-1,sum+n))
}
return sum_up(n,sum)
'''
native_sum_up(100000,0)
// Maximum call stack size exceeded
  global sum_up := (n, sum)  => {
if (n <= 0) then { sum } else { return sum_up(n-1,sum+n) }
}
sum_up(100000,0)

This web page uses the following third-party libraries, assets or StackOverflow answers:

The author would like to thank the developers and authors of the above-mentioned contributions for their effort and willingness to make their works available to the general public.