Posts Tagged ‘booter’

BooTer Reduced

Friday, November 30th, 2007

Edit: Apparently the way to prove a new esoteric language is turing-complete is to implement a BF interpreter. This technique has been used to show that LOLCODE is turing complete, so I suppose I could target that as the first non-trivial program to write as soon as I get a working BooTer interpreter :D Or, I suppose I could implement a LOLCODE interpreter in BooTer in which I could run a BF interpreter…

Thinking about BooTer, I decide that I wasn’t making it esoteric enough..so I’ve slimmed down my specification for simpler implementation, and much more difficulty doing anything useful with it :)

The only thing allowed is simple expressions and boolean ternary expressions. No comma-separated lists of expressions, no assigning expressions to variables, no symbols, no quoting of expressions.

Maybe in the future I’ll re-expand it out into a more “real” language, but for now I just want to do something that I can do simply and easily without accidentally building an entire broken LISP.

So, with all that stuff gone..what’s left? What’s different now?

  • The boolean-ternary operator is in an order which won’t drive me insane to write programs in. The boolean expression comes first. If true, the expressions evaluates to the second expression, if not, it evaluates to the third. This now works basically just like the ternary operator in most other languages…but with a different syntax and the possibility of using the re-eval operator.

  • A BooTer program is now a single boolean-ternary (booter) expression, with optional nested booter expressions.

  • The center portion of a booter expression (the “boo” part), must evaluate to either true or false. False is represented by the integer 0, or by null. Anything else is true.

  • null is represented by the lack of a value. This is useful for instances where the left or right portion of an expression will never be used, for example:

( *…do something here…* : x = 5 : )
  • There are only numbers and strings as primitive types.

  • Variable names can start with upper or lowercase letters.

  • There’s only a single array type. Array items are assigned and accessed using the standard [] subscript operator. Creation of an array is implicit with the assignment of the first item to that array. If you use the subscript operator to make an assignment to a previously existing variable, the old variable is overwritten with the new array. The previous array literal syntax still applies. Only primitive values and variable names can be used in the array literal syntax.

  • Math and boolean order of operations are the same as in most programming languages these days.

  • The re-eval operator, ^, still works as previously stated.

  • Comments start with a semicolon and continue to the end of a line.

The previous example programs are both a bit more complex than they were previously (but hey, the parser and interpreter will be much easier to implement!):

Infinite NOOP loop:

( ^ : : )

100 bottles of beer:

(n = 101 :
  (
    (n = n-1 : 
      print [n, "bottles of beer on the wall!\n"] : 
    ) : 
    ^ : 
  ) : 
)

In the beer example, assume that print prints the items passed into its array to the stdout and then evaluates to the same string it printed out.

  • While n is greater than 0, the sub-expression will evaluate to the string returned by print, which by the rules above equates to true, thus causing the re-eval operator to be evaluated, looping back to the expression (n = n-1).

  • When n finally gets assigned 0, the expression will evaluate to the third, empty, sub-expression, which equates to false, which means the center of the outer expression now evaluates to false as well, and thus the third expression of the outer expression is evaluated. It evaluates to null, which is the return of the program.

Here’s a BooTer “for loop” with more of an explanation of what’s going on:

(i = -1 :              ; initialize loop variable
  (
    (100 > i = i + 1 : ; increment loop variable, check it against condition
      ...              ; do things, must evaluate to true
    :
      ...              ; this one must evaluate to false (leaving it empty works)
    ) 
  :
    ^                  ; re-evaluate this expression, causing the loop
  :
    ...                ; whatever can go here, this will be evaluated after the loop is done
  )
: )                    ; done, this last sub-expression will never be run

Even with this new “simplified” syntax, it’s enough to make your head hurt. Now I just need to actually write a reference implementation with some standard library calls, then try to write something non-trivial with it. Who wants to try their hand at writing a BooTer web server?

An Introduction to BooTer

Friday, November 23rd, 2007

Intro

For a long time I’ve had lots of ideas for a programming language with all sorts of advanced features. Of course, I’ve never written a programming language before, so I’d like to get a bit of experience before attempting to create the next killer language. So what I needed was a simpler language to implement first. BooTer (Boolean-Ternary) is the result of some brainstorming of a “simple” first language.

I’m calling it an Esoteric Programming Language, since it probably won’t be that useful for any sort of real programming. It will be Turing Complete, if a bit difficult to write anything non-trivial in.

I don’t have the whole language figured out yet, so any of the information here will be subject to change at any time.

Syntax

A BooTer program is a series of nested expressions. Every expression evaluates to a value. An expression can be one of the following:

  • A simple value.

    These are your standard primitive values, such as integers, floating point numbers, strings, symbols, or variables:

      1
      1.1
      "blah blah blah"
      fooSymbol
      BarVar

    A symbol is a symbolic constant, with no specified value. Symbols start with a lowercase letter, while variables start with an uppercase letter.

  • An array or hash.

    These are similar to javascript array and object literals:

      [1, 2, "blah blah", 4.2]
      {blah: "blah blah blah", foo: "foo foo foo", aNumber: 124}
    

    Both arrays and hashes can hold any expressions, including more arrays and hashes, of course. Hash keys can be any of the simple types, numbers, strings, or symbols.

  • A comma-separated list of expressions.

    This will evaluate to the value of the last expression in the list:

     1, 2, "blah blah", 4.2
    

    Each expression will be evaluated in order. This is useful for doing things like processing input and output or changing the value of a variable before evaluating another expression using it.

  • A variable assignment.

    This will evaluate to the value of the variable after it is assigned:

     Avariable = "something"
    

    The right side of the assignment can be any expression, optionally wrapped in a set of parenthesis for disambiguation, in case the variable is being assigned the value of a comma-separated list of expressions.

    Avariable = 1, 2, 3, 4

    Will assign the Avariable to the interger 1 and then go on to evaluate the rest of the expression, whereas

    Avariable = (1, 2, 3, 4)

    will assign the value of Avariable to be the integer 4 after evaluating the rest of the expression.

  • A boolean expression.

    These are expressions with the standard ==, <, > <=. >=, !=, !, &&, ||. The result of any boolean expression is either “true” or “false”, and in the case of !, &&, and ||, those are the only valid values to pass in to them. This means that !Avar is not a valid boolean expression unless Avar’s value is currently one of the two boolean symbols.

  • A boolean-ternary expression.

    ("yes" : X < 42 : "no")
    

    This expression was the inspiration for the BooTer language (hence the name), and is the sole form of flow control that the language has. In the center is a boolean expression. If the boolean expression evaluates to true, then the expression evaluates to the evaluation of the subexpression on the left. Likewise, if the center expression evaluates to false, then the expression evaluates to the evaluation of the subexpression to the right.

    Why the emphasis on “the evaluation”? Because the subexpressions are not evaluated at all unless the center expression evaluates in their favor, at which point the value of the whole boolean-ternary expression becomes the value of the evaluated subexpression.

  • Special forms. For lack of a better name, everything else that is needed for a complete language will be called a special form.

    • built-in functions.

      These are functions built into the language which allow for things such as IO and Math. I haven’t fully made up my mind about the syntax of calling one of these built-in functions, but I’m thinking it will be the name of the function followed by a single expression, allowing for single values to be passed in for simple functions, or arrays or hashes to be passed into functions requiring more parameters. Passing in a single expression keeps the syntax simple.

    • ^ - The Re-eval operator.

      This operator causes the parent expression of the current expression to be re-evaluated, or the parent of a parent, etc.. This, combined with the boolter operator, provides the language with a means for looping. An endless loop could be acheived with the following complete BoolTer program:

      print "Hello World!", ^
      

    The simplest possible BoolTer program could consist of a single ^, thus looping forever doing nothing. A (slightly) more useful example would be a program to print the song 100 Bottles of Beer on the Wall:

      N = 100,
      (N = N - 1, ^ : print [N, " bottle(s) of beer on the wall!\n"], N > 0 : done)
      

    “done” has no meaning, it’s just used as something to evaluate to, since the expression has to evaluate to something. For going back up more than one parent-expression, simply put the appropriate number of re-eval operators next to eachother, without a comma. For example, changing “done” to “^^” in the previous example would cause the program to loop endlessly, counting down from 100 beers again and again.

    • ‘ - The quote operator.

      This operator, causes the following expression to evaluate to the expression itself, rather than to the evaluation of the expression. In other words, it defers the evaluation to a later time. This can be used to save expressions into variables or pass expressions into built-in functions. Most importanty, it allows expressions to be used as something like functions.

    • Temporary bind.

      I haven’t figured out how I want to do this yet, but I want to allow temporary binding of variables for a certain expression, something like let in lisp.

There are still a few details that I need to work out in regards to how everything is going to work, but I do have a working lexer and a mostly-complete parser, thanks to Flex and Lemon. All I need to do now is decide how I want to represent the parse tree, build it, and then get a working interpreter going. Due to the way the language is set up, it seems like it wouldn’t be too hard to generate assembler from it and make compilable programs…but first I want to see the language running at all.

What I'm Listening to

Loading...