-
• #2
Tue 2021.11.02
Title: 'Minification bug: erroneous fusion of operator symbols'
I feel this is more of a parsing issue, rather than a minification issue.
Now I find this very interesting. I first tried using a trusted flashed device 2V08.220 and the native app.
On the right-hand editor side, a yellow 'Confusing minuses' warning appears in the left-hand margin.
When I upload and execute function x(), I get the same error message.
Uncaught Error: Unable to assign value to non-reference Number
at line 1 col 8
print(1--1)
Using parentheses and modifying as follows:
x = () => print(1 - (-1));
The above executes as expected, as does the mod() example, with no errors:
>x = () => print(1 - (-1));
=function (undefined) { ... }
>x()
2
=undefined
>x => mod(x- -x, 1)
=function (x) { ... }
>x()
2
=undefined
When I run the tests using the online IDE indicating the same version and using either emulator, although the original line of code does also display the orange 'Confusing minuses' warning, those upload without errors also!
2v10.187 (c) 2021 G.Williams
Found EMSCRIPTEN2, 2v10.187
Connected to Emulator
>dump()
=undefined
>x = () => print(1 - -1);
=function (undefined) { ... }
>x()
2
=undefined
>y = () => print(2 + -1);
=function (undefined) { ... }
>y()
1
=undefined
>z = () => print(2 - -1);
=function (undefined) { ... }
>z()
3
=undefined
>
Wouldn't you agree that adding the parentheses indicates a parsing issue? No clue why errors don't show in my environment when using the online WebIDE:
Browser Chrome Version 94.0.4606.81 (Official Build) (64-bit) on Windows10 v20H2 19042.1288
Maybe a few more community members can give this a whirl and report in perhaps?
-
• #3
It is obviously not a parsing issue, sensu stricto, both because the distinction between - - and -- is lexical, and because, as you point out, the runtime evaluator suffers no such problem when its input is correct. The question that needs to be answered in diagnosing the problem, then, is what could cause the space between the two hyphen-minuses to be elided (or otherwise ignored)? If it is not the minification logic that apparently runs during full-file upload, I'll be surprised. What else is empowered to remove spaces? Why else would the error message actually quote maltransformed input? Why else would the problem vanish on paths where minification is not performed?
I think the key insight here is that the expression 1--1 is erroneous (semantically, because 1 is not an lvalue, and syntactically, because -- is not an infix operator), but it is not what was written in the source.
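For anyone who wants to check that distinction outside Espruino, here is a minimal sketch for a standard JavaScript engine (Node.js or a browser console). It assumes nothing about Espruino's internals, and compiles is a hypothetical helper name introduced only for this illustration.

```javascript
// Hypothetical helper: reports whether a standard JS engine accepts `src`
// as an expression. new Function() only compiles the body; it never runs it.
function compiles(src) {
  try {
    new Function("return (" + src + ");");
    return true;
  } catch (e) {
    return false; // rejected at compile time
  }
}

const spaced = compiles("1 - -1");          // two separate '-' tokens
const parenthesised = compiles("1 - (-1)"); // the workaround from post #2
const fused = compiles("1--1");             // '--' lexes as one decrement token

console.log(spaced, parenthesised, fused);  // true true false
console.log(1 - -1, 1 - (-1));              // 2 2
```

The fused spelling never reaches evaluation on such an engine, which is consistent with the error being about text that was not in the source as written.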
I'm sure other explanations are possible, but for now my money is definitely on the assumptions made by the minifier.
-
• #4
'What else is empowered to remove spaces?'
Maybe we are splitting hairs here, but doesn't a parser read the user command and break it up into individual pieces, without spaces, that are processed in order to determine the command syntax for execution? As I understand it, the process of minification reduces each part to its simplest representation; the examples in post #1 are already minified as best they can be, aren't they?
Should one remove that whitespace, syntax errors occur and the editor points this out with a red 'Bad assignment' error.
When uploaded:
Uncaught SyntaxError: Got ')' expected EOF
at line 1 col 21
x = () => print(2 --1);
and this syntax without the space is parsed and executes just fine:
z = () => print(1 -(-1));
=function (undefined) { ... }
>z()
2
=undefined
In any event, as I don't receive the errors that are indicated in post #1 using the emulator, I'm wondering what the differences are in browser and PC environment by comparison. As an end user like yourself, I've just gotten running and am working through the Advanced Debug documentation, and this would be a good sample to use to locate the problematic area.
Am willing to put in the time to learn, but haven't grasped how to set up the environment to get started, even after following the 'Gotchas' section.
-
• #5
Terminology: a lexer is responsible for tokenisation of the input (breaking it into words, if you will, like if, camelVar, 6.02e+23, {, 'string', + and +=) while the parser is responsible for syntactic analysis, the grouping of those tokens into statements and expressions (with, admittedly, a few grey areas such as backquotes and regular expressions). It's not a logical necessity that languages be designed with this distinction, but as a practical matter almost all programming languages since the 60s seem to be, in one way or another. In the C family (to which JS belongs notationally, though not semantically), it's pretty crucial to understanding what's going on, and the formal concept (though not a typical implementation, and certainly not this implementation) is that tokenisation is a separate process that runs ‘before’ parsing. In any case, in JS, -- is one token and - - is two, just as xx is different from x x and == is different from = =, so in this case, the space is significant. The language processor, taken as a whole, is wrong to treat - - as --, and would be just as wrong to treat -- as - -.
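To make the lexer side concrete, here is a toy tokeniser in plain JavaScript. It is only a sketch of the longest-match rule for a handful of token kinds, not Espruino's implementation; TOKEN and tokenise are names made up for this example, and strings, comments and regular expressions are deliberately left out.

```javascript
// Toy lexer (an illustration, NOT Espruino's code). Alternatives are tried in
// order, so the two-character operators are listed before the one-character ones.
const TOKEN = /\s+|[A-Za-z_$][\w$]*|\d+(?:\.\d+)?(?:[eE][+-]?\d+)?|\+\+|--|==|\+=|[-+=(){};]/y;

function tokenise(src) {
  const tokens = [];
  let pos = 0;
  while (pos < src.length) {
    TOKEN.lastIndex = pos;          // sticky flag: match exactly at pos
    const m = TOKEN.exec(src);
    if (m === null) throw new SyntaxError("Unexpected character at index " + pos);
    if (m[0].trim() !== "") tokens.push(m[0]); // whitespace separates tokens but is not one
    pos = TOKEN.lastIndex;
  }
  return tokens;
}

console.log(tokenise("1 - -1")); // [ '1', '-', '-', '1' ]
console.log(tokenise("1--1"));   // [ '1', '--', '1' ]
console.log(tokenise("var x"));  // [ 'var', 'x' ]
console.log(tokenise("varx"));   // [ 'varx' ]
```

The space never shows up in the token list, yet removing it from the text changes which tokens come out, which is the whole point.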
Espruino does not always make this mistake: var x does not get confused with varx, and the far more obscure example do x(); while (!done()); does not get misinterpreted as the (equally legal) dox(); while (!done());, even during full file interactive upload to the emulator (the context that triggers the bug I was reporting).
-
• #6
Sorry @user135362 please don't waste any time answering @Robin. For some reason he repeatedly ignores my requests not to post when he has nothing useful to add.
@Robin I've mentioned this in personal messages, but let's just say it publicly: I've never banned anyone from this forum, but I'm fed up reading your toxic, smart-arse comments every day and I've lost track of how many people you've offended or put off. I've repeatedly warned you via PMs and email and I just get ignored. If I feel I ever have to warn you again, you're getting banned.
Right. @user135362 - thank you!
So as far as I can tell this comes not from the minification in the IDE, but from Espruino's internal minification that it applies when E.setFlags({pretokenise:1}) is set, which is the default for Bangle.js:
E.setFlags({pretokenise:0}) // change this to 1 and it breaks
a = () => print(1 - -1)
a();
I've filed an issue for it here: https://github.com/espruino/Espruino/issues/2086
I guess one question is how I actually fix this. I can explicitly keep a space between - - and + + but can you think of other combinations in valid JS that would also cause issues? Looking at the list of tokens I can't see anything obvious.
-
• #7
@Gordon—Oh! Mea culpa. I thought I had ruled out internal minification, but I had neglected to defer execution in my experiments, which rather invalidates them :-}. Sorry!
Here are the ones I can think of:
Misinterpreting valid and useful code:
- - vs. --
+ + vs. ++
Wildly misinterpreting technically valid but extremely unlikely code:
/ / vs. // ⇐ Humorously, the bizarre sequence 1 / /a/ does indeed throw an amazing fit, and can produce arbitrarily inappropriate error messages; add a multiline template literal later in the function for total hilarity!
Incorrectly allowing syntactically invalid code in an especially confusing way:
/…/ i vs. /…/i ⇐ The IDE flags /a/ i as a syntax error, but the runtime sees /a/i.
The obscure example do x(); while(…); is already handled correctly, and sequences with - -- and + ++ display incorrectly in error messages but apparently execute correctly. Are the increment and decrement operators internally pretokenised, or something?
I can't think of any other examples because alphanumeric tokens already preserve whitespace and JS has few cases where an operator symbol is at the left edge of an expression. Maybe ** and yield*, should they ever be implemented, might surprise people carelessly converting C code?
HTH.
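In case it helps, the cases listed above can be folded into one conservative rule for rebuilding text from tokens. This is only a sketch in plain JavaScript, not a patch against Espruino's source: needsSpace and joinTokens are hypothetical names, tokens are assumed to arrive as an array of strings with each regex literal as a single token, and I make no claim that the rule set is complete.

```javascript
// Hypothetical rule: does a space have to be kept between two adjacent tokens
// so that the rebuilt text lexes back into the same tokens?
function needsSpace(prev, next) {
  const last = prev[prev.length - 1];
  const first = next[0];
  const wordy = /[\w$]/;
  // identifiers, keywords, numbers: "var x" must not become "varx"
  if (wordy.test(last) && wordy.test(first)) return true;
  // a regex literal followed by a word: "/a/ i" must not become "/a/i"
  if (prev.length > 1 && prev[0] === "/" && last === "/" && wordy.test(first)) return true;
  // "- -", "- --", "+ +", "+ ++", "/ /a/": same symbol on both sides of the gap
  if (last === first && "+-/".includes(last)) return true;
  return false;
}

function joinTokens(tokens) {
  let out = "";
  tokens.forEach((tok, i) => {
    if (i > 0 && needsSpace(tokens[i - 1], tok)) out += " ";
    out += tok;
  });
  return out;
}

console.log(joinTokens(["1", "-", "-", "1"]));  // 1- -1
console.log(joinTokens(["x", "-", "--", "1"])); // x- --1
console.log(joinTokens(["1", "/", "/a/"]));     // 1/ /a/
console.log(joinTokens(["/a/", "i"]));          // /a/ i
```

Comparing only the touching characters keeps the check cheap, at the cost of occasionally keeping a space that was not strictly needed.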
-
• #8
Great - thanks! Wow, I hadn't even considered the regex. I'll get something in for that.
I can come up with 1 / /(\d+)/.exec("foo123bar")[1] as valid JS that would have failed :)
But yes, '++'/etc are pretokenised. Basically, anything that's more than 1 character but doesn't have a value (eg not string/id/number/etc) is turned into a token.
Do you have an example of the error message? Errors should reconstruct everything from the tokens
-
• #9
Here's an example:
>(_=>print(x - --1))()
Uncaught Error: Unable to assign value to non-reference Number
at line 1 col 10
print(x---1)
My code is indeed incorrect for the reason given, but the source snippet (if reparsed) is about x-- - 1, which is probably what I thought I had typed(!).
Bangle emulator 2v10.187.
Minimal reproduction:
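Upload into the Bangle.js emulator from the editor panel as full file (not snippet), run x():
The underlying cause has to do with the accidental fusion of the two - tokens. This case:
reports a parse error rather than a semantic error. + is similarly affected.
As ever, it should be stressed that this is a minimal reproduction and the error also occurs with real, useful code. (Why would I ever explicitly subtract a negative number? Because the numbers I'm using come from tables of data plugged into a regular formula.)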
is similarly affected.As ever, it should be stressed that this is a minimal reproduction and the error also occurs with real, useful code. (Why would I ever explicitly subtract a negative number? Because the numbers I'm using come from tables of data plugged into a regular formula.)