Wednesday, January 28, 2009

Xtext Corner #3 – M5: What's in the pipeline?

Some cool things are coming with the next milestone of TMF Xtext on February 6th, 2009.
  • The outline view has been redesigned. From now on, you can create a representation of your source file's structure that does not necessarily map to the semantic model one to one. You are free to include virtual nodes to emphasize aspects of special interest, or to group the objects differently from how they are arranged in the textual source file.
  • Xtext resources observe their referenced models and can reload them to reflect recent changes transparently. This improves the overall user experience and provides faster feedback.
  • We will come up with a first draft of an Xtend API for the new Xtext. It is not as powerful as the corresponding Java API, but it is still very useful, especially for early prototyping. Most notably, Xtend is very convenient when you have to work with dynamic EMF models. Furthermore, it comes with a nice collection API.
There is one more thing ...

Due to an IP issue with Antlr, we are not allowed to use this mature parser generator if we want TMF Xtext to be part of the next Eclipse release. Unfortunately, we are not satisfied with the Eclipse-compatible alternatives. That's why we decided to build our own parser generator based on the packrat algorithm. Besides the effort of implementing yet another parser generator, there are some positive side effects:
  • We learned a lot about minimizing dependencies and therefore made the parser more pluggable. If you don't like our homegrown packrat parser and don't want to use Antlr either, you can theoretically plug in any parser generator of your choice.
    Attention: This feature comes without warranty.
  • We found a nice way to define terminal symbols. With Xtext M4 you could already write your own lexical rules, albeit in a somewhat awkward syntax: the whole body of the rule was a plain string without any syntax check at design time. Xtext M5 comes with terminal rules. At first glance, they look like any other parser rule. The trick is that they let you define a kind of lexer body with a rich syntax and known semantics. Unlike plain parser rules, however, they produce exactly one (leaf) node in the parse tree and may not be interrupted by any whitespace or comment.

    Terminal rules will supersede the old-school lexer rules.

  • Hidden tokens per rule were introduced: you can define terminal tokens as hidden on a per-rule basis. If you do not want to allow whitespace inside your fully qualified names, you can disallow it easily.
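
To make the ideas above a bit more concrete, here is a toy sketch in Python. It is not the Xtext implementation and does not use Xtext grammar syntax; every name in it (Packrat, terminal_id, qualified_name, import_decl, parse) is made up for illustration. It shows the three ingredients mentioned above under simplified assumptions: packrat-style memoization of rule results per input position, a terminal rule that yields exactly one leaf, and a rule (the qualified name) in which whitespace is not hidden, while the surrounding rule still skips it.

```python
import re

ID = re.compile(r"[A-Za-z_][A-Za-z_0-9]*")
WS = re.compile(r"[ \t\r\n]+")
IMPORT = re.compile(r"import\b")


class Packrat:
    """Holds the input and a memo table keyed by (rule name, position)."""

    def __init__(self, text):
        self.text = text
        self.memo = {}
        self.evaluated = 0  # how many rule bodies were actually run

    def apply(self, rule, pos):
        # Core packrat idea: each rule is evaluated at most once per position.
        key = (rule.__name__, pos)
        if key not in self.memo:
            self.evaluated += 1
            self.memo[key] = rule(self, pos)
        return self.memo[key]


def skip_ws(p, pos):
    """Skip hidden whitespace, if any, and return the new position."""
    m = WS.match(p.text, pos)
    return m.end() if m else pos


def terminal_id(p, pos):
    """Terminal rule: produces exactly one leaf, nothing hidden inside."""
    m = ID.match(p.text, pos)
    return (("ID", m.group()), m.end()) if m else None


def qualified_name(p, pos):
    """ID ('.' ID)* with no hidden tokens: whitespace is never skipped here."""
    first = p.apply(terminal_id, pos)
    if first is None:
        return None
    leaf, pos = first
    parts = [leaf[1]]
    while p.text.startswith(".", pos):
        nxt = p.apply(terminal_id, pos + 1)
        if nxt is None:
            break
        parts.append(nxt[0][1])
        pos = nxt[1]
    return (("QualifiedName", ".".join(parts)), pos)


def import_decl(p, pos):
    """'import' QualifiedName, with whitespace hidden in this rule only."""
    pos = skip_ws(p, pos)
    m = IMPORT.match(p.text, pos)
    if m is None:
        return None
    name = p.apply(qualified_name, skip_ws(p, m.end()))
    if name is None:
        return None
    return (("Import", name[0][1]), name[1])


def parse(text):
    """Return the Import node, or None unless the whole input is consumed."""
    p = Packrat(text)
    result = p.apply(import_decl, 0)
    if result is None or skip_ws(p, result[1]) != len(text):
        return None
    return result[0]
```

With this sketch, parse("import org.eclipse.xtext") succeeds, whereas parse("import org. eclipse") is rejected because the qualified name rule does not treat whitespace as hidden. The toy grammar never backtracks, so the memo table only pays off when a rule is requested again at the same position; in a grammar with alternatives that is where a packrat parser saves work.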

1 comment:

yijunyu said...

How to define indentations just like Python rules?