tw=80
In programming language theory, semantics is the field concerned with the
rigorous mathematical study of the meaning of programming languages. It does so
by evaluating the meaning of syntactically legal strings defined by a specific
programming language, showing the computation involved.
-- Wikipedia Semantics (computer science)
Lever syntax is defined by a context-free grammar file, lever-0.8.0.grammar.
That file changes over time, which allows the language to evolve. People can
fork the whole language and adjust it to their needs. Later the mainline can
pick and merge the adjustments that prove valuable.
The main purpose of this file, semantics_documentation.txt, is to give the
reader a basis for reasoning about the behavior of Lever source code.
In summary, Lever is a lexically scoped, dynamically typed programming
language, resembling Python, Ruby, Perl, Scheme and JavaScript.
Lever source code is compiled into bytecode. The runtime can load a bytecode
object. The loaded bytecode forms a program. The program can be invoked, and
invocation requires a module as an argument.
compilation unit
Since compiling takes place, we define a compilation unit. A compilation unit
is a dictionary that holds:
'version' = 0
'sources' - List of references to source files the compilation unit used as
sources.
'constants' - List of constants referenced by this compilation unit.
'functions' - List of function declarations in this compilation unit.
The function declarations consist of bytecode blobs, annotated with everything
required to construct a function when loading the unit in the runtime.
The bytecode files are treated casually. They are ordinary objects in this
programming language, so we can load and save them in any JSON-like format.
Currently the Lever runtime uses binon, an experimental serialization format
meant to evolve along with the language into something universally useful.
'sources' is a list, rather than a single reference, for forwards
compatibility.
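As a rough illustration of the layout above, here is a Python sketch of a
compilation unit as a plain dictionary, saved and loaded through JSON. The
source file name, the constant values and the fields inside the function
declaration ('argc', 'code') are invented for the example. The real runtime
uses binon, and the real declaration fields are not listed in this file.

```python
import json

# A compilation unit modelled as an ordinary dictionary.
unit = {
    "version": 0,
    "sources": ["hello.lc"],              # hypothetical source file name
    "constants": ["hello world", 1234],
    "functions": [
        # Hypothetical declaration fields; bytecode shown as a list of ints.
        {"argc": 0, "code": [0, 1, 2]},
    ],
}

# Being plain data, the unit survives a save/load cycle unchanged.
restored = json.loads(json.dumps(unit))
```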
source files
A person reading Lever source files should treat them as programs that run with
a module associated with the file. A source file runs in sequence from top to
bottom, with the module attached to it.
modules
A module works as a "global scope" for a program. Variables defined at the
top level of a source file end up in the module. For example, the following
program sets a variable:
greet = "hello world"
When you load this program and run it in some module, that module obtains a
.greet variable if it didn't already have one.
Most Lever modules are derived by extending existing modules, such as 'base',
which contains the default runtime environment for Lever.
When you define a new module scope, you can also define a base_module from which
the module scope should derive new modules.
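To make the module idea concrete, here is a small Python model of it. It is
only a sketch: the Module class, run_program and the rule that a lookup falls
back to the base module are assumptions made for the example, not the runtime's
actual implementation.

```python
class Module:
    """Toy model of a module: a variable table plus an optional base."""

    def __init__(self, base=None):
        self.vars = {}
        self.base = base

    def lookup(self, name):
        # Assumption: a name missing from this module is looked up in
        # the module it extends.
        module = self
        while module is not None:
            if name in module.vars:
                return module.vars[name]
            module = module.base
        raise KeyError(name)


def run_program(module):
    # Stands in for running the source line:  greet = "hello world"
    module.vars["greet"] = "hello world"


base = Module()
base.vars["print"] = print      # pretend 'base' provides print
main = Module(base)
run_program(main)
```

After the run, main holds 'greet' itself, while 'print' is still found through
the base module.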
statements
A Lever source file is a list of statements. Every Lever statement evaluates to
some value when run.
The value of the last statement is obtained and returned implicitly. This
feature is used in read-eval-print loops.
Statements appear as objects inside the compiler, but otherwise they never
appear explicitly as objects. Every statement is compiled into bytecode, and the
concept of an individual statement is erased at that point.
Lists of statements form the functions and blocks in a compilation unit.
In the Lever grammar you see a lot of things that are understood as
'statements'. You will find 'block statements', ordinary 'statements' and
'expressions'. This grouping resembles that of most popular programming
languages at the moment. I find it comfortable most of the time.
The grammar binds every rule to its semantic meaning, and the semantic
meaning is documented in this file.
constants
Constants evaluate to themselves. Here's how they appear in the grammar:
int {int}
hex {hex}
float {float}
string {string}
Here are examples of constants:
1234
0x400
1.23
"terve"
'hei'
Each constant fills an entry in the 'constants' table of the compilation unit.
composite values
Lever has lists and dictionaries. Both have literal notation in the language.
list {"[" arguments "]"}
dict {"{" pairs "}"}
dict {"{" nl_pairs "}"}
Here are some examples of each:
[1, 2, 3, 4]
{"hello":4, fair=5}
{
    a = 1
    b = 2
    c = 3
}
Lists and dictionaries first evaluate the statements they contain, then they
evaluate to the value they represent.
variables
A Lever variable lookup evaluates to the value of a variable-slot visible in
the current scope.
lookup {symbol}
lookup {"%" string}
Examples of lookup statements:
%"import"
%"+"
print
Lever has a lexical scope. The behavior of this scope is slightly different from
other languages.
'scopegrabber', 'class' and 'function' statements create a scope. A scope is
active within the statement that creates it, and a scope contains
variable-slots. Every variable-slot in a scope has a name, and in ordinary
resolution the name is fetched from a higher scope if it is not present in the
currently active scope.
A 'function' scope considers variables defined both above and below it in the
higher scope, but most other scopes only consider the variables in the higher
scope that are locally defined above them.
Lever has several forms of assignments:
local_assign {local_symbol "=" block_statement}
upvalue_assign {symbol ":=" block_statement}
op_assign {slot op "=" block_statement}
slot =>
    lookup_slot {symbol}
    attr_slot {expr "." symbol}
    item_slot {expr "[" expr "]"}
op => ["|", "^", "&", "<<", ">>", "++", "+", "-", "%", "/", "*"]
Example of local_assign:
hello = "world"
local_assign always creates a variable-slot in the currently active, or
'local', scope.
Example of upvalue_assign and op_assign:
test := 4
tryout += 2
upvalue_assign and op_assign do an ordinary scope lookup for a variable-slot.
When a slot with the matching name is found, they use that slot.
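The difference between local_assign and upvalue_assign can be sketched with a
small Python model of scopes. The Scope class is hypothetical, and raising
NameError when no slot is found is an assumption, since this file does not say
what happens in that case.

```python
class Scope:
    """Toy scope: named variable-slots plus a link to the higher scope."""

    def __init__(self, parent=None):
        self.slots = {}
        self.parent = parent

    def find(self, name):
        # Ordinary resolution: walk from the active scope to higher scopes.
        scope = self
        while scope is not None:
            if name in scope.slots:
                return scope
            scope = scope.parent
        return None

    def lookup(self, name):
        owner = self.find(name)
        if owner is None:
            raise NameError(name)
        return owner.slots[name]

    def local_assign(self, name, value):
        # Always creates (or overwrites) the slot in the local scope.
        self.slots[name] = value
        return value

    def upvalue_assign(self, name, value):
        # Reuses whichever slot ordinary lookup finds.
        owner = self.find(name)
        if owner is None:
            raise NameError(name)   # assumption, not documented behavior
        owner.slots[name] = value
        return value


outer = Scope()
inner = Scope(outer)
outer.local_assign("test", 1)
inner.upvalue_assign("test", 4)        # test := 4 reuses the outer slot
inner.local_assign("hello", "world")   # hello = "world" stays local
```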
attr_slot and item_slot virtualize a getattr/setattr and getitem/setitem, so
they can be used in op-associated assignments. Examples of those:
expr.attribute += 11
expr[4] *= 2
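One way to picture this virtualization is as a read, an operator call and a
write. The Python sketch below models the two examples that way. The helper
names and the Thing class are made up for the illustration, and only two of the
operators are included.

```python
import operator

# Two of the operators from the 'op' list, as plain functions.
OPS = {"+": operator.add, "*": operator.mul}


def op_assign_attr(obj, name, op, value):
    # getattr, apply the operator, then setattr the result.
    result = OPS[op](getattr(obj, name), value)
    setattr(obj, name, result)
    return result


def op_assign_item(obj, index, op, value):
    # getitem, apply the operator, then setitem the result.
    result = OPS[op](obj[index], value)
    obj[index] = result
    return result


class Thing:
    attribute = 1


expr = Thing()
a = op_assign_attr(expr, "attribute", "+", 11)   # expr.attribute += 11
items = [0, 1, 2, 3, 5]
b = op_assign_item(items, 4, "*", 2)             # items[4] *= 2
```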
While going through those virtualizations, it is worth mentioning the
setattr and setitem statements, which resemble assignments:
setitem {expr "[" expr "]" "=" block_statement}
setattr {expr "." symbol "=" block_statement}
Examples:
expr[5] = 1
expr.test = 2
All assignment-like statements evaluate to the value they set.
Internally the setattr/setitem functions also return a value. They are meant to
return the value that previously occupied the slot they wrote to. This
feature is currently unused.
There are, of course, also getitem and getattr:
getitem {postfix "[" expr "]"}
getattr {postfix "." symbol}
They have the same form as the setattr/setitem statements, and evaluate to
whatever the corresponding getattr/getitem action on the object returns.
Currently there is one case where the lexical scoping of lever can cause
significant confusion:
confuser = 4
func = ():
    if X
        confuser = 10
    else
        print(confuser) # prints 'null' and not '4'
The problem is that the scope of these statements is built linearly, rather
than following the logical flow of the program.
This is potentially something that should be fixed later. For now, it is
documented here that this problematic case exists.
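Python has a closely related pitfall, which may help in recognizing the pattern.
There an assignment anywhere in a function makes the name local to the whole
function, so the untaken branch fails instead of seeing the outer value. The
sketch below shows Python's behavior only. Lever prints 'null' in the
corresponding case, as described above.

```python
confuser = 4


def func(x):
    if x:
        confuser = 10       # makes 'confuser' local to the whole function
        return confuser
    else:
        return confuser     # the local was never assigned on this path


try:
    func(False)
    outcome = "returned"
except UnboundLocalError:
    outcome = "unbound local"
```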
function calls
Function calls have come up a lot above; here we define what they are.
call {postfix "(" arguments ")"}
callv {postfix "(" arguments "..." ")"}
Examples:
print("cabbage", "rolls")
print("cabbage", folio_patty...)
First these statements evaluate the statements they contain, from left to
right. The call gets a value to call and a list of arguments to call it with.
'callv' differs in that the last argument is used to extend the given list of
arguments.
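The difference between the two forms can be modelled in a few lines of Python.
The functions call, callv and collect are names invented for this sketch. The
arguments are assumed to be evaluated already.

```python
def call(func, args):
    # 'call': invoke the value with the evaluated argument list.
    return func(*args)


def callv(func, args, extra):
    # 'callv': the last argument extends the argument list.
    return func(*(list(args) + list(extra)))


def collect(*args):
    # Hypothetical callee that reports the arguments it received.
    return list(args)


folio_patty = ["with", "sauce"]
plain = call(collect, ["cabbage", "rolls"])
spread = callv(collect, ["cabbage"], folio_patty)
```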
When a function is called, it runs through a list of statements. If it doesn't
return before that list of statements has run, it returns a 'null' value at
the end.
How the arguments affect the call of a function is described in the 'functions'
section.
Additionally, 'prefix' and 'binary' statements are also function calls. They
behave similarly to ordinary function calls.
For example, let's look at these binary and prefix statements:
binary {expr100 ^"+" expr200}
binary {expr100 ^"-" expr200}
prefix {^"+" postfix}
prefix {^"-" postfix}
Here's what they look like in a source file when you meet them:
1+2
1-2
+3
-3
They are functionally equivalent to:
%"+"(1, 2)
%"-"(1, 2)
%"+expr"(3)
%"-expr"(3)
You see that the prefix statement adds "expr" to the name to differentiate
between one-arity and two-arity functions. This is done to simplify the
implementation of those functions.
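The naming scheme can be sketched in Python with a table of operator functions.
The scope dictionary and the binary and prefix helpers are assumptions made for
the example. Python's operator module stands in for the Lever functions.

```python
import operator

# Operators stored as ordinary functions, keyed by the names used above.
scope = {
    "+": operator.add,
    "-": operator.sub,
    "+expr": operator.pos,
    "-expr": operator.neg,
}


def binary(op, lhs, rhs):
    # 1+2 becomes a lookup of "+" followed by a two-argument call.
    return scope[op](lhs, rhs)


def prefix(op, operand):
    # +3 becomes a lookup of "+expr" followed by a one-argument call.
    return scope[op + "expr"](operand)
```

Because the one-argument and two-argument versions live under different names,
neither function has to inspect how many arguments it was given.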
The grammar has binary and prefix statements in hierarchies to establish
precedence rules within these statements. The precedence rules should match
those of Python or C.
Nearly every operator is a function, but there are some exceptions:
in {expr10 "in" expr10}
not_in {expr10 "not" "in" expr10}
not {"not" expr8}
Examples:
"foo" in [foo, bar]
"foo" not in [foo, bar]
not false
The 'in' statement invokes '+contains' in the interface of the right-hand
object, and evaluates to 'true' or 'false', depending on whether the object
contains the other value.
The 'not_in' statement forms the inversion. It may later be described in the
grammar with a different rule, and would then be omitted from the semantics.
The 'not' statement inverts a boolean value. It always evaluates to the
inverted boolean value of the statement it contains.
Lever has implicit conversion to booleans, which follows this rule:
boolean(null) == false
boolean(false) == false
boolean(anything else) == true
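This rule is narrower than Python's own truthiness, so a Python model of it
has to be explicit. In the sketch below, None stands in for 'null', and
lever_boolean and lever_not are names invented for the example.

```python
def lever_boolean(value):
    # Only 'null' (None here) and 'false' convert to false.
    return not (value is None or value is False)


def lever_not(value):
    # 'not' evaluates to the inverted boolean value.
    return not lever_boolean(value)
```

Under this rule 0 and the empty string convert to true, unlike in Python.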
import
Import evaluates to null?
While it does so, it also calls the 'import' function in the current scope
for each name, and then assigns the imported module to a variable of that name.
import {"import" symbols_list}
Example:
import blub, bar
functions
function {"(" bindings ")" ":" block}
Examples:
():
    print("hello")
(foo):
    print(foo)
(foo, bar=2):
    print(foo, bar)
(foo, bar=2, rest...):
    print(rest)
    return foo + bar
A function consists of bindings associated with a list of statements. A
function statement evaluates to a function. The statements inside are evaluated
in a new variable scope when the function is called.
Scope resolution inside a function considers the whole higher scope, including
the local_assign operations before and after the function statement.
Arguments given to a function during a call are bound using the binding rules.
The bindings consist of mandatory arguments, followed by optional arguments,
followed by a variadic argument.
Arguments are consumed and assigned from the argument list in the order they
are given. If there is a variadic argument, the list of remaining arguments is
assigned to it.
Every optional argument has a default statement that is evaluated if the
function doesn't get enough arguments or if the given argument is 'null'.
The 'null' behavior of optional arguments is arguably weird, but also very
logical. Whether it is worthwhile to keep it this way is to be considered later.
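The binding rules above can be sketched as a small Python function. The name
bind and its parameter layout are invented for the example, None stands in for
'null', and the two TypeError cases are assumptions, since this file does not
say what happens with too few or too many arguments.

```python
def bind(mandatory, optional, variadic, args):
    """Bind call arguments to names.

    mandatory: list of names
    optional:  list of (name, default) pairs, default being a callable
    variadic:  name for the remaining arguments, or None
    """
    args = list(args)
    if len(args) < len(mandatory):
        raise TypeError("too few arguments")      # assumption
    env = {}
    for name in mandatory:
        env[name] = args.pop(0)
    for name, default in optional:
        value = args.pop(0) if args else None
        # The default is evaluated when the argument is missing or null.
        env[name] = default() if value is None else value
    if variadic is not None:
        env[variadic] = args
    elif args:
        raise TypeError("too many arguments")     # assumption
    return env


# Bindings of the example (foo, bar=2, rest...):
env1 = bind(["foo"], [("bar", lambda: 2)], "rest", [1])
env2 = bind(["foo"], [("bar", lambda: 2)], "rest", [1, None, 7, 8])
```

In the second call the explicit null still triggers the default for bar, which
is the behavior the paragraph above calls arguably weird.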
control flow statements
Control flow statements form a large part of Lever. Everything defined below,
down to the pseudoscopes, is a form of control flow statement.
Unless otherwise described, a control flow statement always evaluates to what
the last statement evaluated in it evaluates to.
The simplest control flow statement is return:
return {"return" statement}
Example:
return 5
When return is evaluated, it returns from the function or program that is
running. The given statement is evaluated and its value is returned as the
return value of that function or program.
Return doesn't evaluate to anything, because nothing after it is evaluated,
including the rest of the statements that contain the 'return'.
The next simplest control flow statements are 'or' and 'and':
or {expr3 "or" expr}
and {expr5 "and" expr3}
Examples:
true or false
true and false
If the first item in 'or' evaluates to 'null' or 'false', then the second item
is evaluated and the statement evaluates to it instead.
Otherwise the statement evaluates to what the first item evaluates to.
If the first item in 'and' evaluates to 'null' or 'false', then the statement
evaluates to the first item.
Otherwise the statement evaluates to what the second item evaluates to.
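These two rules can be modelled in Python as below. The second item is passed
as a callable so that it is only evaluated when the rule asks for it. The names
are invented for the sketch and None stands in for 'null'.

```python
def is_false(value):
    # Only 'null' (None here) and 'false' count as false.
    return value is None or value is False


def lever_or(first, second):
    # 'second' is a zero-argument callable, evaluated only if needed.
    return second() if is_false(first) else first


def lever_and(first, second):
    return first if is_false(first) else second()
```

Note that both statements evaluate to one of their items, not necessarily to a
boolean.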
For more serious control flow, Lever has if/elif/else, just like Python!
if {"if" statement block otherwise}
otherwise =>
    done {}
    elif {%newline "elif" statement block otherwise}
    else {%newline "else" block}
Examples:
if holiday
    rage_party()
elif tuesday
    lazy_mode()
else
    churn()
An if-block goes through its conditions in order. When a condition evaluates to
a true value, the statement list below it is evaluated.
Otherwise, if an else is present, the else-block is evaluated.
Otherwise the if-block evaluates to 'null'.
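As a sketch of this evaluation order, the Python function below takes the
conditions and blocks as callables and returns the value the if-block would
evaluate to. The name eval_if and the data layout are assumptions made for the
example. None stands in for 'null'.

```python
def eval_if(branches, otherwise=None):
    """branches: list of (condition, block) pairs of callables."""
    for condition, block in branches:
        value = condition()
        # Anything except 'null' and 'false' counts as a true value.
        if not (value is None or value is False):
            return block()
    if otherwise is not None:
        return otherwise()
    return None


holiday, tuesday = False, True
mood = eval_if(
    [(lambda: holiday, lambda: "rage_party"),
     (lambda: tuesday, lambda: "lazy_mode")],
    lambda: "churn",
)
```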
loops
while {"while" statement block}
for {"for" symbol "in" statement block}
break {"break"}
continue {"continue"}
Examples:
while tuesday
    keep_working()
for x in days
    print(x)
while true
    if lazy()
        break
while tuesday
    if eager()
        continue
The while statement is the simplest way to form a loop in Lever. If the
condition evaluates to a true value, the statement list is evaluated. This is
repeated until the condition evaluates to a false value.
The 'for' statement evaluates the first statement given to it and retrieves an
iterator from it. It calls the '.next()' function of the iterator until the
function signals to stop iteration.
For every value the 'for' statement gets, it local_assigns the value to the
symbol and evaluates the statement list with the value.
If the statement list contained in a loop is never evaluated, the loop
statement evaluates to 'null'.
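The 'for' protocol can be sketched in Python as follows. Countdown, eval_for
and the use of a dictionary as the local scope are made up for the example, and
signalling the stop with a StopIteration exception is an assumption, since the
signal is not specified here. The sketch starts from an iterator that has
already been retrieved.

```python
class Countdown:
    """Hypothetical iterator with a .next() method."""

    def __init__(self, start):
        self.current = start

    def next(self):
        if self.current <= 0:
            raise StopIteration     # assumption: stop signalled by exception
        self.current -= 1
        return self.current + 1


def eval_for(iterator, scope, symbol, body):
    result = None                   # the loop's value if the body never runs
    while True:
        try:
            value = iterator.next()
        except StopIteration:
            return result
        scope[symbol] = value       # local_assign to the loop symbol
        result = body(scope)


seen = []


def body(scope):
    seen.append(scope["x"])
    return scope["x"] * 2


scope = {}
last = eval_for(Countdown(3), scope, "x", body)
```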
error reporting and handling
Lever has tracebacks and exceptions. The exception handling is intended to be
used for handling errors and releasing resources in Lever.
assert {"assert" statement "," statement}
assert {"assert" statement block}
Examples:
assert win, "window not opened"
assert win
    result = run_error_diagnostic()
    "window not opened, reason: " ++ format_result(result)
Assert is the entry-level error reporting tool in Lever. It is the first thing
you should spam when you have an error condition in your program.
If the argument given to assert evaluates to a false value, the assert
evaluates its block and raises AssertionError with the evaluated value.
When the program gets sophisticated and assert is not enough, you can start
using raise with an exception object of your choice:
raise {"raise" statement}
Examples:
raise Exception("bad error")
raise BadThingsHappened()
Preferably the value given to 'raise' should extend 'Exception'. The value also
needs a '.traceback' attribute. This attribute collects a list of traceback
entries while the exception travels through the call graph of the program.
Exceptions can be caught too!
try {"try" block excepts}
except =>
    except {"except" expr "as" symbol block}
Example:
try
    catastrophy()
except Collapse as c
    print_traceback(c)
    return GoldenParachute()
The 'try' block evaluates the block it contains normally and evaluates to the
value of that inner block. In normal operation the except clauses are ignored.
When an exception happens inside the 'try' block, the program proceeds through
the except clauses, treating them as a condition block similar to 'if'.
'isinstance(x, Exception)' is used to test whether a clause should catch the
exception.
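The clause matching can be modelled in Python like this. eval_try and the
(class, handler) layout are invented for the sketch. Returning the handler's
value and re-raising when no clause matches are assumptions based on how the
other control flow statements behave.

```python
class Collapse(Exception):
    pass


def eval_try(block, excepts):
    """excepts: list of (exception class, handler) pairs, tried in order."""
    try:
        return block()
    except Exception as exc:
        for kind, handler in excepts:
            if isinstance(exc, kind):   # the test described above
                return handler(exc)
        raise                           # assumption: unmatched errors propagate


def catastrophy():
    raise Collapse("roof")


result = eval_try(catastrophy, [(Collapse, lambda c: "GoldenParachute")])
```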
The current implementation cannot build a traceback without traversing and
returning. This matters for debuggers and tracers so it should be fixed
eventually.
pseudoscopes
Pseudoscopes are on the borderline between being yet another cool feature of
Lever and a nuisance in reasoning about the program.
This form of scope is used for forming classes and populating objects in Lever.
A pseudoscope in Lever works just like an ordinary scope, except that scope
resolution inside a pseudoscope considers only the local_assign operations that
come logically before the statement.
Class definitions and scopegrabbers form pseudoscopes:
class {"class" class_header}
class {"class" class_header block}
class_header =>
    class_header {symbol}
    class_header {symbol "extends" expr}
scopegrabber {":" expr block}
Examples:
class Protocol
class Protocol
    print("hello")
    +init = (self):
        null
class Protocol extends Convention
rogue = :Droid()
    callsign = "rogue"
The class statement forms a custom object for capturing the scope variables,
then it evaluates every statement inside itself in the new pseudoscope.
Afterwards it constructs a class that matches with the class header and
evaluates to it.
As a side effect, the class is also locally assigned.
Classes are a tool for producing custom objects in Lever. There will be
separate documentation for the behavior of classes in the future.
Pseudoscopes allow doing the same trick with any object. The scopegrabber
example above is equivalent to:
rogue = Droid()
rogue.callsign = "rogue"
Just note that variables are not fed back from the object into the scope.
Inside the pseudoscope you can only access the variables that you assign
there.
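A scopegrabber can be pictured as collecting the assignments made in its block
and writing them onto the object. The Python sketch below models that under the
assumption that each captured variable becomes an attribute. The scopegrab
helper is hypothetical.

```python
class Droid:
    pass


def scopegrab(obj, assignments):
    # Each local_assign inside the block becomes an attribute on obj.
    for name, value in assignments:
        setattr(obj, name, value)
    return obj


# Models:  rogue = :Droid()
#              callsign = "rogue"
rogue = scopegrab(Droid(), [("callsign", "rogue")])
```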
The ability to backfeed from objects would potentially be really complicated to
implement, but it would be possible given the powerful approach Lever takes to
compiling. It might be implemented if it is ever proven to be very useful for
some purpose.