Shift of focus

Usual ways

 

 

Problems

 


Two sides of the same coin

 

So why is reading parse nodes so much more complicated than writing them? It is curious: reading and writing are two sides of the same coin, the two directions of a bijection.

 

Ideally, it would be possible to simply "state" the bijection between syntax and internal representation. Then, read/write functions would be built upon this declaration.

 

This is exactly what we will do. A syntax declaration and associated read/write functions could look like this:

 

 

syntax-def s

FunDef ::= "function" name args "=" expr

args  ::= List "(" Variable "," ")"

 

 

read = readSyntax(s)

write = writeSyntax(s)

 

syntax-def is a customized sub-language that produces a Syntax value. readSyntax and writeSyntax are then simply functions that build a parsing or display function from that Syntax value.
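As a rough illustration of the idea, here is a minimal Python sketch in which one rule declaration drives both directions. The rule encoding (tagged `lit`/`field` pairs), the names `read_syntax` and `write_syntax`, and the restriction to whitespace-separated tokens are all assumptions made for this sketch, not part of the language described here.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical encoding: an item is ("lit", text) for a fixed keyword
# or ("field", name) for a value captured into the node.
Item = Tuple[str, str]


def write_syntax(rule: List[Item]) -> Callable[[Dict[str, str]], str]:
    """Build a printer from the rule."""
    def write(node: Dict[str, str]) -> str:
        return " ".join(text if kind == "lit" else node[text]
                        for kind, text in rule)
    return write


def read_syntax(rule: List[Item]) -> Callable[[str], Dict[str, str]]:
    """Build a parser from the same rule (whitespace-separated tokens only)."""
    def read(source: str) -> Dict[str, str]:
        tokens = source.split()
        if len(tokens) != len(rule):
            raise ValueError("wrong number of tokens")
        node: Dict[str, str] = {}
        for (kind, text), token in zip(rule, tokens):
            if kind == "lit":
                if token != text:
                    raise ValueError(f"expected {text!r}, got {token!r}")
            else:
                node[text] = token
        return node
    return read


# One declaration, two derived functions.
fun_def = [("lit", "function"), ("field", "name"), ("field", "args"),
           ("lit", "="), ("field", "expr")]
read = read_syntax(fun_def)
write = write_syntax(fun_def)
```

Because both functions are derived from the same data, `write(read(text))` gives back the input text for any input in this restricted format that uses single spaces between tokens.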

 


In practice

 

To make this possible, the language needs to support several concepts.

 

The first question is how the Syntax entity is represented. After all, it is just declarative data made of:

  • Result data
  • rules
    • strings ("function", "=",...)
    • symbols (FunDef, name, args,...)
    • special constructs (List, Either, Maybe,...)

Moreover, it is important to notice that the symbols used in the rules are either attributes of the result data or the name of another rule.

 

So what do we need?

  • Symbols themselves as valid values
  • Full introspection:
    • Access to the list of attributes of a type
    • Given a constructor symbol and attribute symbols, build an instance of the target type (*)
  • Ability to have "build" functions based on data, i.e. functions that return new functions.

 

The last point is not a problem, as it is a basic concept in functional programming. The first point is a little subtle, but it depends mainly on early design choices of the language, as in Scheme. Getting the list of attributes of a type is commonplace as well. The point that causes difficulty is (*): handling symbol manipulation to this extent while providing type safety is a delicate matter. See the manual pages on symbolic data manipulation for more information.
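To make the introspection requirements more concrete, here is a small Python sketch. The `CONSTRUCTORS` registry and the helper names `attributes_of` and `build` are hypothetical, and the check happens at run time, so this only illustrates the shape of (*), not the static type safety the language is aiming for.

```python
from dataclasses import dataclass, fields


@dataclass
class FunDef:
    name: str
    args: list
    expr: str


# Hypothetical registry: maps constructor symbols to the types in scope.
CONSTRUCTORS = {"FunDef": FunDef}


def attributes_of(constructor: str) -> list:
    """List the attribute symbols of a type through introspection."""
    return [f.name for f in fields(CONSTRUCTORS[constructor])]


def build(constructor: str, values: dict):
    """Build an instance from a constructor symbol and attribute symbols."""
    cls = CONSTRUCTORS[constructor]
    expected = set(attributes_of(constructor))
    if set(values) != expected:
        raise TypeError(f"{constructor} expects attributes {sorted(expected)}")
    return cls(**values)
```

Here an unknown or missing attribute symbol is only rejected when `build` runs; the delicate part discussed above is rejecting it earlier.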

 


Structured example

 

Below is an example of how syntax definition data could be structured:

 

data Syntax

result :: Data

content :: Content

rules :: {} of (Symbol -> Content)

 

type Content = [] of (String | Symbol | Construct)

type Construct = List | Either | Maybe

 

data List

left :: String

elem :: Data

sep :: String

right :: String

 

Using these definitions, the initial declaration:

 

syntax-def s

FunDef ::= "function" name args "=" expr

args  ::= List "(" Variable "," ")"

 

...would be transformed into:

 

Syntax s

result = 'FunDef'

content = ["function", 'name', 'args', "=", 'expr']

rules = {

'args' -> [List("(", 'Variable', ",", ")")]

}
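For comparison, the same structure can be approximated with Python dataclasses. This is only a sketch: the `Symbol` wrapper, the name `ListOf` (chosen to avoid clashing with Python's own list types), and the choice to store the element as a `Symbol` rather than general data are assumptions, and only the List construct is modelled.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass(frozen=True)
class Symbol:
    """A bare name, distinct from the string literal with the same text."""
    name: str


@dataclass(frozen=True)
class ListOf:
    """The List construct: left delimiter, element, separator, right delimiter."""
    left: str
    elem: Symbol
    sep: str
    right: str


# A rule body is a sequence of literals, symbols and constructs.
Content = List[Union[str, Symbol, ListOf]]


@dataclass
class Syntax:
    result: Symbol
    content: Content
    rules: Dict[Symbol, Content] = field(default_factory=dict)


s = Syntax(
    result=Symbol("FunDef"),
    content=["function", Symbol("name"), Symbol("args"), "=", Symbol("expr")],
    rules={Symbol("args"): [ListOf("(", Symbol("Variable"), ",", ")")]},
)
```

Making `Symbol` frozen keeps it hashable, so symbols can be used directly as keys of the rules table.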

 

 

Doesn't this break type safety?

 

Yes, as stated it does. Symbols are typeless: they are just symbols, not connected to any identifier.

 

The following argument therefore does not hold up:

 

It must be clear that one can only create valid symbols, that is, symbols that are defined in the scope. Since they are defined, the symbols are also well typed. Therefore, any manipulation of them is well typed too.

 

This brings in a kind of meta-programming, in the sense that code is basically data and can be manipulated just like any other data.

 

The last question is how to distinguish what can be preprocessed at compile time from what has to be run at runtime.

 

 

 

 

 

--------------------------------------------------------------------------

 

Examples

  • It should be able to store the type 'FunDef' (the type itself, not an instance of it)
  • It should be able to store the symbol 'name' (the symbol itself, and not the value it stands for)

 

So it should be possible to write:

 

Person joe

name = "Joe"

age = 45

 

x = joe      # the person defined above

x° = 'joe'   # the symbol 'joe'

y = age     # error: no age defined in scope

y° = 'age'  # the symbol 'age'

 

x == $x°     # True

 

joe.age  # 45

x.y         # error: x is a Person and has no member attribute named 'y'

x°.y°     # error: x° is a symbol and has no member named 'y°'

$x°.$y° # 45
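The distinction between a value and the symbol naming it can be imitated in Python. In this sketch the scope is an explicit dictionary, and `deref` and `get_attr` are hypothetical helpers standing in for `$` and for `$x°.$y°`; nothing here is checked statically.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    name: str


@dataclass
class Person:
    name: str
    age: int


def deref(sym: Symbol, scope: dict):
    """Play the role of `$`: look up the value a symbol stands for."""
    if sym.name not in scope:
        raise NameError(f"no {sym.name} defined in scope")
    return scope[sym.name]


def get_attr(obj_sym: Symbol, attr_sym: Symbol, scope: dict):
    """Play the role of `$x°.$y°`: dereference the object, then select the attribute."""
    return getattr(deref(obj_sym, scope), attr_sym.name)


scope = {"joe": Person(name="Joe", age=45)}
x = scope["joe"]        # the person itself
x_sym = Symbol("joe")   # the symbol 'joe'
y_sym = Symbol("age")   # the symbol 'age'
```

As in the listing above, the symbol itself has no `age` member; only the dereferenced person does.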

 

 

 

macros:

 

macro getAttr x y

x.y

 

getAttr joe age     # 45

getAttr 'joe' 'age'  # error: symbol 'joe' has no member named 'age'

 

function getAttr2 (x,y) -> z

$x.$y

 

getAttr2 joe age     # error: "joe" is not a symbol, "age" is not a symbol

getAttr2 'joe' 'age'  # 45

 

 

function getAttr (x :: Symbol, y :: Symbol) -> z

z = x.y

 

 

It works as follows:

 

syntax syntax-def syn

syn = Syntax

 

readSyntax node (e:es)

function (str :: Stream) -> (

 

 

 

 

 

No: sub-syntaxes must be compile-time transformations.

 

# defines a struct (read, write)

syntax syntax-def = (read, write)

 

assert

syntax-def.header == []

size(syntax-def.body) > 0

for line in syntax-def.body

size(line) >= 3

line.(1) is Symbol

line.(2) == '::='

 

rules = map line->rule syntax-def.body
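The assertions above amount to a small validation pass over the rule lines. A possible Python rendering is sketched below; it assumes each line has already been split into string tokens, and the names `parse_rule_line` and `parse_rules` are made up for this example.

```python
from typing import Dict, List, Tuple


def parse_rule_line(line: List[str]) -> Tuple[str, List[str]]:
    """Check one tokenized rule line and split it into (head, body)."""
    if len(line) < 3:
        raise ValueError("a rule needs a head, '::=' and at least one item")
    head, arrow, body = line[0], line[1], line[2:]
    if not head.isidentifier():
        raise ValueError(f"rule head {head!r} is not a symbol")
    if arrow != "::=":
        raise ValueError(f"expected '::=', got {arrow!r}")
    return head, body


def parse_rules(body: List[List[str]]) -> Dict[str, List[str]]:
    """Validate every line and collect the rules by head symbol."""
    if not body:
        raise ValueError("empty syntax definition")
    return dict(parse_rule_line(line) for line in body)
```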

 

 

write = function (node) -> ans

ans = intersperse ++ (map toStr rule.(result))

 

function toStr (x) -> str

if x is String

str = x

if x is Symbol

if x in rules

str = write(rules.(x))

else

str = write(x)

if x is Construct

...
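The writer outlined above could look roughly like this in Python. It is a sketch under several assumptions: nodes are plain dictionaries, rules are keyed by symbol name, items are joined with single spaces, and a List construct may only appear as the body of a named rule whose node value is a Python list. The names `make_writer`, `render` and `ListOf` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class Symbol:
    name: str


@dataclass(frozen=True)
class ListOf:
    left: str
    sep: str
    right: str


Item = Union[str, Symbol, ListOf]


def make_writer(content: List[Item], rules: Dict[str, List[Item]]):
    """Return a function that prints a node according to the rule data."""
    def render(items: List[Item], node: dict, value=None) -> str:
        parts = []
        for item in items:
            if isinstance(item, str):
                parts.append(item)                    # literal text
            elif isinstance(item, ListOf):
                inner = item.sep.join(map(str, value))
                parts.append(item.left + inner + item.right)
            elif item.name in rules:                  # symbol naming another rule
                parts.append(render(rules[item.name], node, node[item.name]))
            else:                                     # symbol naming an attribute
                parts.append(str(node[item.name]))
        return " ".join(parts)

    return lambda node: render(content, node)


write = make_writer(
    ["function", Symbol("name"), Symbol("args"), "=", Symbol("expr")],
    {"args": [ListOf("(", ",", ")")]},
)
```

The three branches mirror the String, Symbol and Construct cases of toStr; a matching reader would walk the same data in the opposite direction.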