I'm writing an ECMAScript tokeniser and parser, and I'm trying to find out whether I can eliminate the switching between tokenising "/" as the start of a regex and as the division operator depending on parser feedback. Essentially, I want to know if I can make the tokeniser independent of the parser. (I have a gut feeling this needs too much special casing to be worth it.)
The division operator, like division-assignment, requires an operand on either side. This is in most cases easy to check for: division can ONLY appear if there is an expression before it. But if it were as easy as checking one simple thing, there would be no need for ECMAScript to specify two scanner modes, would there? No, there are cases where "/" could potentially be the start of a regular expression, but where division is the natural interpretation of it. One example is semicolon insertion.
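Take a division that is split across two lines (a made-up snippet; the identifiers and values are placeholders, not from a real program):

```javascript
// Made-up identifiers and values, purely for illustration.
const b = 12, c = 4;
let a;

a = b
/ c   // continues the expression above, i.e. a = b / c

console.log(a);   // 3
```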
A division that continues on the next line is naturally handled as if there were no newline there. But it is also possible for the next line to begin a new statement.
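For instance, with made-up identifiers again, a second line that cannot be a continuation of the first:

```javascript
// Made-up identifiers and values, purely for illustration.
const b = 1, d = 2;
let a, c;

a = b
c = d   // "b c" cannot be one expression, so a semicolon is inserted after b

console.log(a, c);   // 1 2
```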
Here, the engine inserts a semicolon and treats the second line as a second statement. Since a statement can open with a regex, it isn't entirely far-fetched for that second line to start with a regex literal.
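Something along these lines, say (a made-up snippet; the try/catch is only there so the failure can be inspected instead of aborting the script, and `hi` is deliberately left undefined):

```javascript
// Made-up identifiers; "hi" is intentionally NOT declared anywhere.
const b = 6;
let a;
let error = null;

try {
  a = b
  /hi/g.exec("hi there")   // intended as a new statement that opens with a regex
} catch (e) {
  error = e;
}

console.log(error && error.name);   // "ReferenceError"
```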
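However, that fails. For what reason? Well, because the "/" is then treated as division in the expression started on the line above, instead of as the start of a regex in a new statement. No semicolon is inserted, since a line break only triggers insertion when the following token cannot continue the statement, and "/" can.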
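To make the division reading visible rather than have it throw, one can define the names that the "regex" breaks down into. This is only a sketch: `hi`, `g`, `c` and `d` are made-up stand-ins, with `g` a dummy object whose `exec(...).map(...)` chain yields a number.

```javascript
// All identifiers here are hypothetical stand-ins chosen so the line evaluates.
const b = 100, hi = 5, c = "ignored";
const d = (x) => x;
// g mimics just enough shape for g.exec(c).map(d) to return a number.
const g = { exec: () => ({ map: () => 4 }) };
let a;

a = b
/hi/g.exec(c).map(d)   // parsed as: a = b / hi / g.exec(c).map(d)

console.log(a);   // 5, i.e. (100 / 5) / 4
```

If the second line had been a regex statement, `a` would simply be 100.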
I'm looking for other cases where division and the start of a regex would both be plausible things for the author to have meant, but where the engine prioritises division. Does anybody know of any?