warning(200): Decision can match input such as "{EOF, RULE_ID, '('}" using multiple alternatives: 1, 2The parse generator basically complains about an ambiguous grammar. At some point in the syntax description it cannot decide which path to follow for a given input sequence. It's rather obvious that the warning message is not really helpful. Neither is there any chance to find the correct line that caused the error (which is not a problem of Antlr but cause by the translation of Xtext to Antlr) nor is it easily possible to spot the concrete decision that the parser generator complains about. The worst about this is that it's not really a warning either. What the parser generator basically did is the following: It removed a possible path from the grammar description. It will always choose the one remaining path for that particular situation. Which could by chance be the one that you'd expect. But it could also be the wrong path.
As a result, alternative(s) 2 were disabled for that input
AntlrWorks
Fortunately there is a tool that helps to identify the problem: AntlrWorks allows to take a look at the grammar and visualize all the problems that it has graphically. It's still far from trivial to find the root cause of a problem but better than nothing. Make sure you pick the version 3.2 from download section if you want to give it a try.Now you may wonder how you should handle the cases that are ambiguous by definition and by intention. You could of course enable backtracking in your language and afterwards everything appears to be fine. However, you can think about backtracking as a wildcard for Antlr to remove alternatives from your grammar everywhere where it spots an ambiguity. This will shadow the real problems in the grammar that may be introduced due to subsequent changes, a refactoring or new language features. That's why I strongly recommend to go for the hard way and analyze the root cause for the warnings. As soon as you found the actual decision that the parser generator complained about, you can use a syntactic predicate to fix that locally. Now you are in control on which alternative to remove and which path to follow. I think that makes perfect sense to be in charge in those cases.
Backtracking
But the shadowing of errors in the generator is only one drawback of backtracking. It will furthermore lead to surprising messages at run-time. If you consider the following snippet it's easy to see that the right operand of the binary operation is missing.The parser will correctly report something along the lines of
mismatched input '}' expecting RULE_INTUnfortunately it does so on a totally unexpected location. If you enabled backtracking and the algorithm decides that the function declaration is not complete - it fails to read a valid function body - the parsing will roll back to the start of the function and put the most specific error message on that token. You'll see an error marker under the keyword function. However, it would be more intuitive to have that error on the binary operation. At least that's what I would expect, wouldn't you?
Syntactic Predicates
Nevertheless it's not always possible to write an unambiguous grammar. There are some common patterns that are undecidable by definition. The most famous one is the Dangling else. If a language allows nested if-else constructs, it's not definite where a subsequent else keyword may belong to. Consider the following Java snippets which only differ in formatting:The semantics of both code snippets should be independent from the formatting. Nevertheless it's ambiguous for the parser in the same way as a reader might be confused by an inconsistent indentation. Therefore you have to force the parser into one concrete direction in order to disambiguate the grammar. A syntactic predicate has to be added.
The => operator forces the parser to go a certain path if the input sequence would allow two or more possible decisions. It can be read as If you see these tokens, go this way. It's even possible to use alternatives or groups of elements as the criteria. Only the UnorderedGroup is prohibited in predicates.
In this example, the parser shall follow the given path if it can look ahead to a sequence like person.name= (or more abstract ID . ID =).
3 comments:
Once more an excellent post! Thanks Sebastian!
--
oliver
Thank you! The operator '=>' saves my life
The best ever sentence structure checker are provided here. Good grammar and good sentence structure, plus a compelling topic make up the best article.
Post a Comment