-
Notifications
You must be signed in to change notification settings - Fork 105
Lazy Evaluation
There are three major evaluation strategies used by programming languages today:
- Strict or call-by-value evaluation, where all arguments in a function application are evaluated before the function is applied,
- Call-by-name evaluation, where arguments in a function application are evaluated each time they are needed in the body of the function being applied, and
- Lazy or call-by-need evaluation, where all arguments in a function application are evaluated the first time they are needed in the body of the function being applied. The result of this evaluation is memoized and used again if the argument is needed later in the function body.
The regular Source language, like most programming languages, uses strict evaluation. However, only minor changes are required in the implementation of the language to introduce lazy evaluation as discussed in section 4.2 of SICPJS. In this document, we give a brief overview of the specification of "Lazy Source," and discuss our implementation in the interpreter and transpiler of js-slang
. We conclude with some example programs demonstrating the expressive power afforded by lazy evaluation to the programmer.
Because Lazy evaluation makes the exact point at which arguments to a function will be evaluated difficult to reason about, lazy evaluation is only useful in the context of pure functional programming, where there are no side-effects and variable mutability. For this reason, "Lazy Source" comprises only lazy versions of Source§1 and Source§2. The full specification of Source §1 Lazy is available here, and the full specification of Source §2 Lazy is available here. Here we give a brief overview of these specifications.
We say that a function is strict if it is a built-in primitive, and non-strict otherwise. Then, in Source§1 Lazy,
- The arguments of strict functions are evaluated before function application,
- The evaluation of non-strict function arguments is delayed until they are needed for the first time. Thereafter, the result of this initial evaluation is used wherever the argument appears in the function body,
- The arguments of binary and unary operators are evaluated before the operators are applied (i.e., operators are strict),
- The conditions of conditional expressions must be evaluated to a boolean value to determine whether to return the consequent or alternate expression, and
- Constant declarations are immediately evaluated (i.e., constant declarations are strict).
We say that a function is strict if it s a built-in primitive other than pair
, and non-strict otherwise. Then, in Source§2 Lazy,
- The arguments of strict functions are evaluated before function application,
- The evaluation of non-strict function arguments is delayed until they are needed for the first time. Thereafter, the result of this initial evaluation is used wherever the argument appears in the function body,
- The arguments of binary and unary operators are evaluated before the operators are applied (i.e., operators are strict),
- The conditions of conditional expressions must be evaluated to a boolean value to determine whether to return the consequent or alternate expression, and
- Constant declarations are immediately evaluated (i.e., constant declarations are strict).
Making pair
non-strict allows for the construction of lazy lists, potentially infinite data structures that can be used to implement streams and afford programmers new expressive power.
Haskell is perhaps the most prominent language using lazy evaluation today. In Haskell, function and operator applications, as well as constant declarations are all non-strict. This differs from Lazy Source, where only some function applications are non-strict, with the rest remaining strict as in regular Source. There are merits to both approaches, but we adopt the latter to match the presentation of lazy evaluation in SICPJS.
We modified js-slang
to support lazy evaluation as a toggle-able feature through the variant
property of the execution context (setting this property to lazy
enables lazy evaluation). The interpreter and transpiler both support optional lazy evaluation. For both implementations, computations are delayed through the creation of thunk objects, which contain:
- The computation to be delayed as an expression,
- The environment the thunk was created in, to be used if the delayed computation is "forced," and
- Space for storing the memoized result of evaluation.
Each implementation contains a forceIt
operator used to force the evaluation of a thunk and yield the final result.
We begin with the interpreter. Here, thunks are defined as a Typescript class. For convenience and alignment with SICPJS, we define a delayIt
operator that constructs thunks:
class Thunk {
public value: Value
public isMemoized: boolean
constructor(public exp: es.Node, public env: Environment) {
this.isMemoized = false
this.value = null
}
}
const delayIt = (exp: es.Node, env: Environment): Thunk => new Thunk(exp, env)
Forcing thunks is achieved through mutual recursion of forceIt
and actualValue
. We take care to evaluate the expression of a thunk in its associated environment:
function* forceIt(val: any, context: Context): Value {
if (val instanceof Thunk) {
if (val.isMemoized) return val.value
pushEnvironment(context, val.env)
const evalRes = yield* actualValue(val.exp, context)
popEnvironment(context)
val.value = evalRes
val.isMemoized = true
return evalRes
} else return val
}
export function* actualValue(exp: es.Node, context: Context): Value {
const evalResult = yield* evaluate(exp, context)
const forced = yield* forceIt(evalResult, context)
return forced
}
This mutual recursion is necessary to ensure that actualValue
doesn't return any thunks. If evalRes
were computed by simply evaluating val.exp
in forceIt
, it could itself be a thunk. Calling actualValue
within forceIt
ensures that any thunks within the original thunk are also forced.
The heart of the toggle-ability of lazy evaluation is in the helper function getArgs
used in function application. Here, we delay arguments if lazy evaluation is enabled, and force their evaluation if not:
function* getArgs(context: Context, call: es.CallExpression) {
const args = []
for (const arg of call.arguments) {
if (context.variant !== 'lazy') {
args.push(yield* actualValue(arg, context))
} else {
args.push(delayIt(arg, currentEnvironment(context)))
}
}
return args
}
We use getArgs
to evaluate function call expressions. Note that we must compute the actual value of node.callee
, because we must have an actual function to apply, and not a thunk.
export const evaluators: { [nodeType: string]: Evaluator<es.Node> } = {
...
CallExpression: function*(node: es.CallExpression, context: Context) {
const callee = yield* actualValue(node.callee, context)
const args = yield* getArgs(context, node)
...
const result = yield* apply(context, callee, args, node, thisContext)
return result
},
...
}
In apply
, we must only take care to force the evaluation of function arguments if the function being applied is a non-lazy built-in:
export function* apply(
context: Context,
fun: Closure | Value,
args: (Thunk | Value)[],
node: es.CallExpression,
thisContext?: Value
) {
let result: Value
let total = 0
while (!(result instanceof ReturnValue)) {
if (fun instanceof Closure) {
... // User-defined functions
if (result instanceof TailCallReturnValue) {
...
} else if (fun instanceof LazyBuiltIn) {
...
} else if (typeof fun === 'function') { // Strict built-ins
try {
const forcedArgs = []
for (const arg of args) {
forcedArgs.push(yield* forceIt(arg, context))
}
result = fun.apply(thisContext, forcedArgs)
break
} catch (e) {
...
}
} else {
return handleRuntimeError(context, new errors.CallingNonFunctionValue(fun, node))
}
}
...
}
We finish by forcing the evaluation of expressions in operator applications, conditional expressions, and constant declarations. Although the specification of Lazy Source does not specify the behavior of imperative programming constructs like variable declarations, assignment, and loops, we are forced to specify their behavior if lazy evaluation is enabled because the interpreter is designed to simultaneously support all chapters of the Source language. Here, we adopt the decision of preserving the strictness of these constructs. Although lazy versions of Source§3 and Source§4 are not intended to be made available to the users of the Source Academy frontend, we do take advantage of their availability for testing purposes.
Ensuring that an expression is forced instead of simply evaluated requires replacing the call to evaluate
in question with a call to actualValue
. For example, for binary operators:
export const evaluators: { [nodeType: string]: Evaluator<es.Node> } = {
...
BinaryExpression: function*(node: es.BinaryExpression, context: Context) {
const left = yield* actualValue(node.left, context)
const right = yield* actualValue(node.right, context)
const error = rttc.checkBinaryExpression(node, node.operator, left, right)
if (error) {
return handleRuntimeError(context, error)
}
return evaluateBinaryExpression(node.operator, left, right)
},
...
}
To ensure that the final result of evaluating a program is a non-thunk value, we also make sure to force the evaluation of Program
statements.
In order to implement lazy pairs, we needed to differentiate between normal built-ins (which are strict) and lazy built-ins. We use a wrapper class LazyBuiltIn
for this purpose:
class LazyBuiltIn {
func: (...arg0: any) => any
evaluateArgs: boolean
constructor(func: (...arg0: any) => any, evaluateArgs: boolean) {
this.func = func
this.evaluateArgs = evaluateArgs
}
}
The importBuiltins
and importPrelude
methods are then modified in order to take into account adding built-ins (and list prelude) based on the variant
property of the context.
For the list prelude, we add lazyLists.prelude.ts
in the stdlib folder. The "normal" lists prelude can be used, however it may give rise to surprising infinite loops errors. We modify the relevant methods in order to stay consistent with the non-lazy lists errors. It would maybe make more sense to have better and stricter argument checking for these methods, instead of relying on errors raised by the lists methods. An example of this is:
function list_ref(xs, n) {
return n === 0 ? head(xs) : list_ref(tail(xs), n - 1);
}
If n is negative, then this would result in infinite recursion, because the tail
call is never executed, and therefore we never reach the tail(null)
call (which is where the error is reported).
For the Built-ins, we made the choice of making only the pair constructors list
and pair
lazy, while head
, tail
and is_pair
would evaluate their arguments (which would result in a list). For head
and tail
, unevaluated thunks of the first and second element respectively are returned.
Finally we change the apply
method in interpreter.ts
to take into account instances of LazyBuiltIns
:
export function* apply(
context: Context,
fun: Closure | Value,
args: (Thunk | Value)[],
node: es.CallExpression,
thisContext?: Value
) {
let result: Value
let total = 0
while (!(result instanceof ReturnValue)) {
...
else if (fun instanceof LazyBuiltIn) {
try {
let finalArgs = args
if (fun.evaluateArgs) {
finalArgs = []
for (const arg of args) {
finalArgs.push(yield* forceIt(arg, context))
}
}
result = fun.func.apply(thisContext, finalArgs)
break
} ...
} else if (typeof fun === 'function') {
...
}
}
...
}
In modifying the transpiler to support lazy evaluation, we take a two-pronged approach. First, we modify the transpiler to generate native code containing thunks where appropriate if lazy evaluation is enabled. Then, we alter native runtime operators to handle thunks if they are contained in transpiled code.
Given a JavaScript expression E
, the easiest way to thunk E
is to make it the body of a nullary arrow function: () => E
. This function will automatically capture the current environment, and its evaluation can be forced by simply applying it. Since we also want thunks to support memoization, we represent them as JavaScript objects with the computation in question delayed using the arrow-function technique, and a property set aside to store the memoized value:
{ isThunk: true,
memoizedValue: 0,
isMemoized: false,
expr: () => E }
We implement a delayIt
operator for the transpiler to create thunks:
function delayIt(expr: es.Expression) {
const exprThunked = create.blockArrowFunction(
[],
[create.returnStatement(expr)],
expr.loc === null ? undefined : expr.loc
)
const obj = create.objectExpression([
create.property('isThunk', create.literal(true)),
create.property('memoizedValue', create.literal(0)),
create.property('isMemoized', create.literal(false)),
create.property('expr', exprThunked)
])
return obj
}
The corresponding operator forceIt
is implemented in the native runtime, which we will discuss later. The bulk of the changes to the transpiler itself involve altering the transformation transformCallExpressionsToCheckIfFunction
. This transformation wraps function calls in Source code with a call to the native runtime operator callIfFuncAndRightArgs
, which applies the function in question after verifying that it is, in fact, a function, and that it is being applied to the correct number of arguments. So, a function application in Source like:
f(a,b,c);
is transformed to the following native JavaScript code:
callIfFuncAndRightArgs(f, lineno, colno, a, b, c);
If lazy evaluation is enabled, we also thunk the arguments during this transformation:
callIfFuncAndRightArgs(f,
lineno,
colno,
{isThunk: true, expr: () => a, ...},
{isThunk: true, expr: () => b, ...},
{isThunk: true, expr: () => c, ...});
This requires a relatively small change:
function transformCallExpressionsToCheckIfFunction(program: es.Program, variant: Variant) {
simple(program, {
CallExpression(node: es.CallExpression) {
const { line, column } = node.loc!.start
const args = variant !== 'lazy' ? node.arguments : ([] as es.Expression[])
if (variant === 'lazy') {
for (const arg of node.arguments) {
args.push(delayIt(arg as es.Expression))
}
}
...
}
})
}
We also make a similar modification to the transformation transformReturnStatementsToAllowProperTailCalls
, because this makes alterations to return statements which are function calls:
function transformReturnStatementsToAllowProperTailCalls(program: es.Program, variant: Variant) {
function transformLogicalExpression(expression: es.Expression): es.Expression {
switch (expression.type) {
case 'LogicalExpression':
...
case 'ConditionalExpression':
...
case 'CallExpression':
expression = expression as es.CallExpression
const { line, column } = expression.loc!.start
const functionName =
expression.callee.type === 'Identifier' ? expression.callee.name : '<anonymous>'
const args = variant !== 'lazy' ? expression.arguments : ([] as es.Expression[])
if (variant === 'lazy') {
for (const arg of expression.arguments) {
args.push(delayIt(arg as es.Expression))
}
}
...
default:
...
}
}
...
}
Since this results in the creation of additional arrow functions in transpiled code, we must alter the transformation wrapArrowFunctionsToAllowNormalCallsAndNiceToString
, which runs after transformCallExpressionsToCheckIfFunction
, to ignore the arrow functions used to delay computations within thunks. This is easy to achieve, since these functions will not be in the functions-to-string map:
function wrapArrowFunctionsToAllowNormalCallsAndNiceToString(
program: es.Program,
functionsToStringMap: Map<es.Node, string>
) {
simple(program, {
ArrowFunctionExpression(node: es.ArrowFunctionExpression) {
// If it's undefined then we're dealing with a thunk
if (functionsToStringMap.get(node)! !== undefined) {
create.mutateToCallExpression(node, globalIds.wrap, [
{ ...node },
create.literal(functionsToStringMap.get(node)!)
])
}
}
})
}
In our last change to the transpiler, we make sure that the evaluation of the last statement is forced. We do this by returning the forceIt
operator applied to the final result:
export function transpile(
program: es.Program,
id: number,
skipUndefinedVariableErrors = false,
variant: Variant = 'default'
) {
...
const wrapped = wrapInAnonymousFunctionToBlockExternalGlobals([
wrapWithPreviouslyDeclaredGlobals([
...statements,
lastStatementStoredInResult,
...statementsToSaveDeclaredGlobals
]),
create.returnStatement(
create.callExpression(globalIds.forceIt, [globalIds.lastStatementResult])
)
])
program.body = [...getDeclarationsToAccessTranspilerInternals(), wrapped]
...
}
Now we move on to the native runtime. We start by defining the new forceIt
operator:
export function forceIt(val: any): any {
if (val !== undefined && val !== null && val.isThunk === true) {
if (val.isMemoized) {
return val.memoizedValue
}
const evaluatedValue = forceIt(val.expr())
val.isMemoized = true
val.memoizedValue = evaluatedValue
return evaluatedValue
} else {
return val
}
}
We invoke forceIt
recursively on the result of applying the thunk's arrow function for the same reason we make forceIt
and actualValue
mutually recursive in the interpreter: this ensures that if val.expr()
evaluates to a thunk that it is also forced. Notice as well that if lazy evaluation is not enabled, the transpiled code will contain no thunks, and forceIt
becomes a noöp that returns the value it was applied to unmodified.
Now we modify the runtime operator callIfFuncAndRightArgs
so that it explicitly forces the evaluation of arguments to strict (i.e., built-in other than pair
) functions. In all other cases, arguments are passed to the function being applied unmodified. If lazy evaluation is not enabled, then this will ensure that all functions are applied strictly as intended, since the transpiled code will contain no thunks. Otherwise, non-strict functions will be passed the thunks created at transpile time as required.
export function callIfFuncAndRightArgs(
candidate: any,
line: number,
column: number,
...args: any[]
) {
...
candidate = forceIt(candidate)
if (typeof candidate === 'function') {
if (candidate.transformedFunction === undefined) { // Strict built-in function
try {
const forcedArgs = args.map(forceIt)
return candidate(...forcedArgs)
} catch (error) {
...
}
} else {
...
return candidate(...args)
}
} else if (candidate instanceof LazyBuiltIn) {
try {
if (candidate.evaluateArgs) {
args = args.map(forceIt)
}
return candidate.func(...args)
} catch (error) {
...
}
} else {
throw new CallingNonFunctionValue(candidate, dummy)
}
}
The condition of loops and conditional statements are wrapped with a call to the native runtime operator boolOrErr
. To force the evaluation of condition expressions, we simply apply forceIt
to the candidate expression here:
export function boolOrErr(candidate: any, line: number, column: number) {
candidate = forceIt(candidate)
const error = rttc.checkIfStatement(create.locationDummyNode(line, column), candidate)
if (error === undefined) {
return candidate
} else {
throw error
}
}
We make similar modifications to the native runtime operators unaryOp
and binaryOp
used to apply operators.
After the modifications to builtins discussed in previous section, the changes needed in order to support lazy lists in the transpiler are very minimal. Namely, the methods callIfFuncAndRightArgs
and callIteratively
of operators.ts
need to be extended with the case of applying lazyBuiltIn
instances:
export function callIfFuncAndRightArgs(
candidate: any,
line: number,
column: number,
...args: any[]
) {
...
if (typeof candidate === 'function') {
...
}else if (candidate instanceof LazyBuiltIn) {
try {
if (candidate.evaluateArgs) {
args = args.map(forceIt)
}
return candidate.func(...args)
} catch (error) {
...
}
...
}
This is identical to the changes in the interpreter. We mainly test whether or not the arguments of the LazyBuiltIn
instance need to be forced, then apply the corresponding function.
An identical change is needed in callIteratively
in order to handle the case of tail calls.
We include a variety of unit tests to ensure that our modifications to the interpreter and transpiler follow our specification for Lazy Source. We also check that the interpreter and transpiler yield identical results when evaluating the same program lazily. As stated before, we make use of the imperative programming of Source§3 to test the memoization of thunks with the following test-case:
let x = 1;
function incX() {
x = x + 1;
return x;
}
function square(n) {
return n * n;
}
square(incX()); // should return 4
If thunks are properly memoized, then the call to incX()
should take place once, when n
first appears in the body of square
. The result of this call is 2
. If this result is memoized, the second appearance of n
in square
should also evaluate to 2
, leading to final result of 4
. On the other hand, if thunks were not properly memoized, the call to incX()
would be executed twice (once for each appearance of n
in the body of square
), leading to a final result of 6
.
We conclude with some programs demonstrating the usefulness of lazy evaluation.
Consider the simple program:
function f(a, b) {
return a === 1 ? a : b;
}
f(1, head(null));
Calling head(null)
will certainly yield an error, as will evaluating this entire program using the regular, strict Source language. When f
is applied strictly, all of its arguments (including head(null)
) are evaluated, even if they are not used. In fact, b
is not used in f
if a
is 1. As a result, this program evaluates to 1
if lazy evaluation is enabled. Non-strict function application means that arguments are evaluated only if they are actually used within the body of the function.
Note the following recursive definition of factorial:
0! = 0
(n+1)! = (n+1) × n!
Using lazy lists, it is possible to construct a list of the all factorials 0!, 1!, 2!,...
. We start by generating a function intsFrom
that given an integer n
will produce an infinite list containing all the integers within [n, ∞)
:
function intsFrom(n) {
return pair(n, intsFrom(n+1));
}
We also make use of the helpful list operator zipWith
, which combines two lists into one by applying a given operator to corresponding values from each list:
function zipWith(f, xs, ys) {
return is_null(xs) ?
null
: is_null(ys) ?
null
: pair(f(head(xs), head(ys)), zipWith(f, tail(xs), tail(ys)));
}
Now we can use intsFrom
and zipWith
to create our list of factorials:
const factorials = pair(1, zipWith((x, y) => x * y, intsFrom(1), factorials));
Using OCaml notation for lists, this has the effect of constructing the list:
1 :: 1 * 0! :: 2 * 1! :: 3 * 2! :: ... =
0! :: 1! :: 2! :: 3! :: ...
Given a non-negative integer n
, one can compute n!
by finding the n
th element of factorials
.
This implementation of a very ancient algorithm for computing prime numbers is a particularly good example of the power of infinite streams. The algorithm is simple and goes as follows:
- Build an infinite list of integers starting from 2.
- Two is a prime number, remove all multiples of 2 from the list.
- Next number is 3, it is a prime, remove all multiples of 3 from the list.
- etc ...
We start by constructing the list of integers, similar to the previous section:
function intsFrom(n) {
return pair(n, intsFrom(n+1));
}
Then we write a function sieve which does the main computation of the algorithm:
function sieve(xs){
return pair (head (xs), sieve (filter (x=> x % head(xs)!==0, tail(xs)));
}
The sieve function returns a pair containing the head of the list xs
as the first element, and a thunk representing a call to sieve with the filtered list (so as to have no multiples of the head of xs
) as argument.
finally:
const allPrimes = sieve(intsFrom(2));
As we can see here, the implementation is very simple and translates directly from the algorithm. It illustrates the power of computing list elments only when you need them.