strings option 5 - Use branded strings with extended prototype #7
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -164,15 +164,30 @@ Whilst we still can't accept string literals on their own, the tagged template i | |
|
||
Having `bytes` and `str` behave like a primitive value type (value equality) whilst not _actually_ being a primitive is not strictly semantically compatible with EcmaScript. However, the lowercase type names (plus a factory with no `new` keyword) communicate the intention of it being a primitive value type, and there is existing precedent for introducing new value types to the language in a similar pattern (`bigint` and `BigInt`). Essentially, if EcmaScript were to have a primitive bytes type, this is most likely what it would look like. | ||
|
||
|
||
### Option 5 - Use branded strings with extended prototype | ||
|
||
Option 2 has the developer experience that will be the most familiar to developers (coming from TypeScript or TEALScript), but suffers from semantic incompatibility. In particular, index-based functions would not work as expected (or would be very expensive to implement) because EcmaScript indexes strings by characters (strictly, UTF-16 code units), not bytes. | ||
|
||
For example, `'á'[0]` would return `'á'` in EcmaScript, but would return `0xC3` in TEALScript because it gets the first byte (and this character is a two byte sequence). | ||
|
||
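The mismatch described above can be reproduced in plain TypeScript. This is a minimal sketch, assuming a runtime that provides the standard `TextEncoder` global (Node.js or a browser); the variable names are illustrative only, and the byte view stands in for what the AVM would see.

```typescript
// ECMAScript indexes strings by UTF-16 code unit, so 'á' (U+00E1) is a single element.
const accented = '\u00E1'
const firstChar = accented[0] // the whole character

// The AVM operates on bytes. The UTF-8 encoding of U+00E1 is the two bytes 0xC3 0xA1.
const utf8 = new TextEncoder().encode(accented)
const firstByte = utf8[0] // the first byte only

console.log(accented.length, utf8.length) // one code unit versus two bytes
console.log(firstChar, firstByte.toString(16))
```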
To solve this, we could extend the prototype of `string` to have byte-specific functions. For example, `.getByte(i)` instead of `[i]` and `.sliceBytes(i)` instead of `.slice(i)`. If a developer tries to use the character-based functions, the compiler can throw an error. We can also show an error in the IDE via TypeScript plugins. | ||
|
||
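As a rough illustration of the byte-specific functions, the sketch below models `getByte` and `sliceBytes` as standalone helpers rather than prototype methods, purely to keep it self-contained. The signatures, the `Uint8Array` return type, and the use of `TextEncoder` are assumptions made for this example, not the proposed API.

```typescript
// Hypothetical helpers: the proposal would expose these as methods on `string`.
const encoder = new TextEncoder()

/** Returns the byte at index i of the UTF-8 encoding of s, or undefined when out of range. */
function getByte(s: string, i: number): number | undefined {
  return encoder.encode(s)[i]
}

/** Returns the bytes in [start, end) of the UTF-8 encoding of s. */
function sliceBytes(s: string, start: number, end?: number): Uint8Array {
  return encoder.encode(s).slice(start, end)
}

const word = 'caf\u00E9' // 'café': 4 characters, 5 UTF-8 bytes

const lastChar = word.slice(3) // character-based: the single character 'é'
const tailBytes = sliceBytes(word, 3) // byte-based: the two bytes 0xC3 0xA9
const byteAt3 = getByte(word, 3) // byte-based: 0xC3
```

The character-based `slice(3)` and the byte-based `sliceBytes(word, 3)` happen to start at the same position here only because the first three characters are ASCII; with earlier multi-byte characters the two indices would diverge.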
If the AVM were to ever support character-based operations, we could enable the character-based functions. | ||
|
||
The main downside of this approach is "extra" methods in the `string` prototype that are not applicable to the AVM. This, however, is currently how TEALScript functions with many native types and it has not been a problem for developers (provided the error is clear). As mentioned, this can also be solved at the IDE level via TypeScript plugins. | ||
|
||
## Preferred option | ||
|
||
Option 3 can be excluded because the requirement for a `new` keyword feels unnatural for representing a primitive value type. | ||
|
||
Options 1 and 2 are not preferred as they make maintaining semantic compatibility with EcmaScript impractical. | ||
|
||
Option 5 offers the most familiar developer experience at the expense of extra methods in the prototype. | ||
|
||
Option 4 gives us the most natural feeling API whilst still giving us full control over the API surface. It doesn't support the `+` operator, but supports interpolation and `.concat`, which gives us most of what `+` provides other than augmented assignment (i.e. `+=`). | ||
|
||
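To make the interpolation and `.concat` trade-off concrete, here is a minimal sketch of how a branded value type with a tagged-template factory might behave. The names `StrValue` and `str`, and the whole implementation, are hypothetical and chosen only for illustration; they are not the API defined by this proposal.

```typescript
// Hypothetical AVM string value type; the name and shape are assumptions.
class StrValue {
  constructor(readonly bytes: Uint8Array) {}

  /** Byte-wise concatenation, standing in for the unsupported `+` operator. */
  concat(...others: StrValue[]): StrValue {
    const parts = [this.bytes, ...others.map((o) => o.bytes)]
    const out = new Uint8Array(parts.reduce((n, p) => n + p.length, 0))
    let offset = 0
    for (const p of parts) {
      out.set(p, offset)
      offset += p.length
    }
    return new StrValue(out)
  }

  toString(): string {
    return new TextDecoder().decode(this.bytes)
  }
}

/** Tagged-template factory, so no `new` keyword appears at the call site. */
function str(strings: TemplateStringsArray, ...values: Array<StrValue | string>): StrValue {
  let text = strings[0]
  values.forEach((v, i) => {
    text += String(v) + strings[i + 1]
  })
  return new StrValue(new TextEncoder().encode(text))
}

const greeting = str`Hello`
const who = str`AVM`

const viaInterpolation = str`${greeting}, ${who}!`
const viaConcat = greeting.concat(str`, `, who, str`!`)

console.log(viaInterpolation.toString(), viaConcat.toString())
```

Because `StrValue` is an object type, TypeScript rejects `greeting + who` at compile time, which is why `+=` has no direct equivalent in this style.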
We should select an appropriate name for the type representing an AVM string. It should not conflict with the semantically incompatible EcmaScript type `string`. | ||
Option 4 would also require us to select an appropriate name for the type representing an AVM string. It should not conflict with the semantically incompatible EcmaScript type `string`. | ||
- `str`/`Str`: | ||
- ✅ Short | ||
- ✅ obvious what it is | ||
|
@@ -189,8 +204,8 @@ We should select an appropriate name for the type representing an AVM string. It | |
- ✅ very obvious what it is | ||
- ✅ obvious how it differs to `string` | ||
|
||
|
||
Option 5 would be the preferred option if we were to prioritize developer experience whereas option 4 would be best if we prioritized control over the prototype. | ||
This feels a little subjective. What aspects of the developer experience are prioritized by this option? The obvious one I can see is saving a couple of characters in declaration, but at the expense of having to explicitly type variables.
Are there any others?

I think "a couple of characters in declaration" is a bit reductive of the impact it has on the developer experience. When developers use strings they expect to be able to put literals between quotation marks. Adding any friction to that can be a bit jarring. We saw this with PyTeal. That being said, I agree that "developer experience" is too subjective, so I've changed it to familiarity.
||
|
||
## Selected option | ||
|
||
Option 4 has been selected as the best option | ||
TBD |