-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add basic string interpolation / string templates #743
base: master
Are you sure you want to change the base?
Conversation
This is a great first implementation, but I'm not sure that desugaring in parser is the best choice here :/ Personally, I'd much rather see templates up until the transformation to core so that we can:
|
You raise some very good concerns and I completely agree. I will try to go the 'non-lazy route' and I actually completely thread through templates to core. |
Don't get me wrong, having a version is still super helpful; I truly have no idea how difficult it really is to do the thing I outlined above. But I hope it's worth it, especially wrt technical debt. :) |
I am not completely against desugaring in parser, but maybe we could actually desugar
to something like:
this way, as a user I can still overload |
I just pushed a prototype that threads through string templates all the way until code generation. This has the added benefit that we remain somewhat flexible for later changes, as Jiří mentioned. Furthermore, we can re-use the string interpolation capabilities of each respective backend, e.g. currently the JS string templates are generated as template literals. As a trade-off, we need to add a Note: I currently only fully implemented it for JS. If we come to an agreement that this is the right way to proceed, I will implement for the remaining backends. (I already started working on this before Jonathan's remark and I'd rather discuss this implementation briefly before scrapping it right away :) ) |
This relates to some strategy we'll need to come up with in order to represent custom string interpolators (which is very outside of the scope of this PR, I believe...) Then I'd expect some desugaring like, for example: val name = "Joe"
val age = 42
////
val s: SQL::Stmt =
sql"SELECT * FROM Person p WHERE p.last_name = ${name} AND p.age = ${age}"
// ~>
val s: SQL::Stmt = sql { // sql: { prog: () => Unit / { Splice[String], Splice[Int] } }: SQL::Stmt
do literal("SELECT * FROM Person p WHERE p.last_name = ")
do splice(name)
do literal(" AND p.age = ")
do splice(age)
} |
187d5b0
to
3ff4df3
Compare
3ff4df3
to
9ba3039
Compare
639f835
to
497b538
Compare
3bef638
to
5ac9604
Compare
5ac9604
to
e4f24a6
Compare
There are three things keeping this from merging:
|
I'll merge #780 ASAP, then it should at least tell you what's wrong :) |
Resolves the `Effect Compiler Crash: null` part of #760. The actual problem there was that `e.getMessage` can return `null` and it does for... _StackOverflowException_ 🥳. Looking at the docs, it's recommended to use `e.toString`. While I was at it, I modified stack trace printing, now we do it properly by using `.printStackTrace`. This is very related to #731, I just want to get this out quickly to unblock other problems.
libraries/common/stringbuffer.effekt
Outdated
// resize buffer to strip trailing zeros that otherwise would be converted into 0x00 characters | ||
buffer = bytearray::resize(buffer, pos) | ||
val str = buffer.toString() | ||
// after flushing, the stringbuffer should be empty again | ||
buffer = bytearray::allocate(initialCapacity) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this could be made somewhat more efficient -- resize
copies into a newly allocated bytearray, but toString
doesn't (since on LLVM strings are bytearrays)
So it should be possible to just clear the buffer without allocating it again?
// resize buffer to strip trailing zeros that otherwise would be converted into 0x00 characters | |
buffer = bytearray::resize(buffer, pos) | |
val str = buffer.toString() | |
// after flushing, the stringbuffer should be empty again | |
buffer = bytearray::allocate(initialCapacity) | |
// resize buffer to strip trailing zeros that otherwise would be converted into 0x00 characters | |
val str = bytearray::resize(buffer, pos).toString() | |
// after flushing, the stringbuffer should be empty again | |
buffer = buffer.fill(0) |
where buffer.fill(0)
means memset(buffer, 0)
(which we don't have yet) :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you could just clear the buffer, however, then the buffer's size will not be reset to a (potentially) smaller size and instead just buffer's previous size is kept. This probably would not be a big deal, though.
Closes #783. And closes #778. I believe the way we currently handle size calculations for stack pushes/pops of non-structural types has some flaws for non-pointer values. Some changes in how we use the GEP instruction will probably fix this, but I find its syntax really confusing. I have a proof-of-concept alternate fix at the end. To bypass the calculation issues, we can also just change the size of a byte to 8 byte instead. This *hotfix* fixes the segfault in the original issue as well as all of my minified versions. Some additional explanations of the issue using an example program: ```scala def foo { action: Byte => Unit }: Unit = action(42.toByte) def main() = { foo { b => // does not segfault when printlns are swapped println(42) println(b) } } ``` <details> <summary>Pseudo-flow with relevant (for the segfault) lines marked</summary> ``` main(stack) main(stack): alloc(3 * 8 = 24) push returnAddress_1 push sharer_5 push eraser_7 blockLit = vtable[blockLit_3289_clause_13] foo(blockLit, stack) foo(blockLit, stack): alloc(3 * 8 = 24) push returnAddress_40 push sharer_5 push eraser_7 literal = 42 byte = toByte(literal) vtable = blockLit[0] closure = blockLit[1] fp = *vtable // = blockLit_3289_clause_13 fp(closure, byte, stack) blockLit_3289_clause_13(closure, byte, stack): alloc(3 * 8 + 1 = 25) // <--------------------------- push byte push returnAddress_14 push sharer_26 push eraser_30 literal = 42 printlnInt(literal, stack) printlnInt(value, stack): // _4 str = show(value) unit = println(str) dealloc(3 * 8 = 24) pop _ pop _ pop returnAddress // _14 returnAddress(unit, stack) // <--------------------------- returnAddress_14(ret, stack): dealloc(1) pop byte erase(ret) alloc(3 * 8 = 24) push returnAddress_17 push sharer_5 push eraser_7 printlnByte(byte, stack) printlnByte(byte, stack): str = show(byte) unit = println(str) dealloc(24) pop _ pop _ pop returnAddress // _17 returnAddress(unit, stack) returnAddress_17(unit, stack): dealloc(24) pop _ pop _ pop returnAddress // _40 returnAddress(unit, stack) returnAddress_40(unit, stack) dealloc(24) pop _ pop _ pop returnAddress // _1 returnAddress(unit, stack) returnAddress_1(unit, stack): dealloc(24) pop _ pop _ pop returnAddress returnAddress(unit, stack) ``` </details> The corresponding LLVM code of the relevant sections: ```llvm define tailcc void @blockLit_3289_clause_13(%Object %closure, i8 %b_2365, %Stack %stack) { entry: ; new blockLit_3289_clause_13, 1 parameters %stackPointer_33 = call ccc %StackPointer @stackAllocate(%Stack %stack, i64 25) ; <--------------- %b_2365_pointer_34 = getelementptr {i8}, %StackPointer %stackPointer_33, i64 0, i32 0 store i8 %b_2365, ptr %b_2365_pointer_34, !noalias !2 %returnAddress_pointer_35 = getelementptr {{i8}, %FrameHeader}, %StackPointer %stackPointer_33, i64 0, i32 1, i32 0 %sharer_pointer_36 = getelementptr {{i8}, %FrameHeader}, %StackPointer %stackPointer_33, i64 0, i32 1, i32 1 %eraser_pointer_37 = getelementptr {{i8}, %FrameHeader}, %StackPointer %stackPointer_33, i64 0, i32 1, i32 2 store ptr @returnAddress_14, ptr %returnAddress_pointer_35, !noalias !2 store ptr @sharer_26, ptr %sharer_pointer_36, !noalias !2 store ptr @eraser_30, ptr %eraser_pointer_37, !noalias !2 ; literalInt longLiteral_3291, n=42 %longLiteral_3291 = add i64 42, 0 ; substitution ; substitution [value_3 !-> longLiteral_3291] ; jump println_4 musttail call tailcc void @println_4(i64 %longLiteral_3291, %Stack %stack) ret void } ... define tailcc void @println_4(i64 %value_3, %Stack %stack) { entry: ; definition println_4, environment length 1 ; foreignCall pureApp_3284 : String(), foreign show_14, 1 values %pureApp_3284 = call ccc %Pos @show_14(i64 %value_3) ; foreignCall pureApp_3283 : Positive(), foreign println_1, 1 values %pureApp_3283 = call ccc %Pos @println_1(%Pos %pureApp_3284) ; substitution ; substitution [v_r_3164_3278 !-> pureApp_3283] ; return, 1 values %stackPointer_56 = call ccc %StackPointer @stackDeallocate(%Stack %stack, i64 24) %returnAddress_pointer_57 = getelementptr %FrameHeader, %StackPointer %stackPointer_56, i64 0, i32 0 %returnAddress_55 = load %ReturnAddress, ptr %returnAddress_pointer_57, !noalias !2 musttail call tailcc void %returnAddress_55(%Pos %pureApp_3283, %Stack %stack) ; <------------------------- ; here is the segfault, since returnAddress_55 is missing one byte! ret void } ``` --- A fix I came up with that does not involve making the size of a byte large: ``` llvm define tailcc void @blockLit_3289_clause_13(%Object %closure, %Byte %b_2365, %Stack %stack) { entry: ; new blockLit_3289_clause_13, 1 parameters ; SPLIT the allocation into two segments <----------------------- %stackPointar_33 = call ccc %StackPointer @stackAllocate(%Stack %stack, i64 1) %b_2365_pointer_34 = getelementptr %Byte, %StackPointer %stackPointar_33, i64 0 store %Byte %b_2365, ptr %b_2365_pointer_34, !noalias !2 %stackPointer_33 = call ccc %StackPointer @stackAllocate(%Stack %stack, i64 24) %returnAddress_pointer_35 = getelementptr %FrameHeader, %StackPointer %stackPointer_33, i64 0, i32 0 %sharer_pointer_36 = getelementptr %FrameHeader, %StackPointer %stackPointer_33, i64 0, i32 1 %eraser_pointer_37 = getelementptr %FrameHeader, %StackPointer %stackPointer_33, i64 0, i32 2 store ptr @returnAddress_14, ptr %returnAddress_pointer_35, !noalias !2 store ptr @sharer_26, ptr %sharer_pointer_36, !noalias !2 store ptr @eraser_30, ptr %eraser_pointer_37, !noalias !2 ; literalInt longLiteral_3291, n=42 %longLiteral_3291 = add i64 42, 0 ; substitution ; substitution [value_3 !-> longLiteral_3291] ; jump println_4 musttail call tailcc void @println_4(i64 %longLiteral_3291, %Stack %stack) ret void } ... ``` --------- Co-authored-by: Philipp Schuster <[email protected]>
Regarding the prelude, we have an undocumented invariant: all imports of prelude modules have to be in the prelude as well... That is, the prelude in the JS runner needs to look like this:
and That is, the following changes would be necessary (+ LLVM Runner) diff --git a/effekt/jvm/src/main/scala/effekt/Runner.scala b/effekt/jvm/src/main/scala/effekt/Runner.scala
index ab26700e..6dbaac0b 100644
--- a/effekt/jvm/src/main/scala/effekt/Runner.scala
+++ b/effekt/jvm/src/main/scala/effekt/Runner.scala
@@ -151,7 +151,7 @@ object JSNodeRunner extends Runner[String] {
override def includes(path: File): List[File] = List(path / ".." / "js")
- override def prelude: List[String] = List("effekt", "option", "list", "result", "exception", "array", "string", "ref")
+ override def prelude: List[String] = List("effekt", "option", "list", "result", "exception", "array", "string", "ref", "bytearray", "stringbuffer", "splice")
def checkSetup(): Either[String, Unit] =
if canRunExecutable("node", "--version") then Right(())
diff --git a/examples/pos/string_interpolation.effekt b/examples/pos/string_interpolation.effekt
index 60aeb0b0..54bf9a20 100644
--- a/examples/pos/string_interpolation.effekt
+++ b/examples/pos/string_interpolation.effekt
@@ -1,6 +1,3 @@
-import splice
-import stringbuffer
-
type Expr {
Var(id: String)
Abs(param: String, body: Expr)
@@ -38,4 +35,3 @@ def main() = {
val fixpoint = Abs("f", App(Abs("x", App(Var("f"), App(Var("x"), Var("x")))), Abs("x", App(Var("f"), App(Var("x"), Var("x"))))))
println(pretty"Fix point combinator: ${fixpoint}")
}
-
diff --git a/libraries/common/bytearray.effekt b/libraries/common/bytearray.effekt
index d7552919..d7232e44 100644
--- a/libraries/common/bytearray.effekt
+++ b/libraries/common/bytearray.effekt
@@ -1,5 +1,7 @@
module bytearray
+import effekt
+
/**
* A memory managed, mutable, fixed-length array of bytes.
*/
diff --git a/libraries/common/splice.effekt b/libraries/common/splice.effekt
index 9ba74d15..7de68c0d 100644
--- a/libraries/common/splice.effekt
+++ b/libraries/common/splice.effekt
@@ -1,5 +1,7 @@
module splice
+import effekt
+
import stringbuffer
effect literal(s: String): Unit
diff --git a/libraries/common/stringbuffer.effekt b/libraries/common/stringbuffer.effekt
index 3b6d3acb..bfa57e62 100644
--- a/libraries/common/stringbuffer.effekt
+++ b/libraries/common/stringbuffer.effekt
@@ -1,5 +1,6 @@
module stringbuffer
+import effekt
import bytearray
interface StringBuffer {
|
Fixes #722 and adds basic support for string templates.
There are a few caveats to keep in mind with the current initial implementation:
"${a.show} and ${b.show}"
instead ofa.show ++ " and " ++ b.show
wherea
andb
are some 'showable' values. Internally, however, this is just how it is desugared.show
on arguments which are not already of type string, the error messages are hideous and possibly confusing:error
Apart from that, the changes are really lightweight because of the early desugaring in the parser.
I am looking forward to your feedback and ideas.