Added Robot Framework lexer #2

AmiZya · 2024-10-15T20:31:39Z

This PR adds a lexer for Robot Framework test suites: https://robotframework.org/

jeffwilliams · 2024-10-15T23:39:06Z

lexers/embedded/robot.xml

+      <rule pattern="#.*$">
+        <token type="Comment"/>
+      </rule>
+      <rule pattern="\$\{[^\}]+\}|\@{[^\}]+\}|\&\{[^\}]+\}">


I noticed that this ampersand and the one below were preventing the lexer from loading. It seems to work as <rule pattern="${[^}]+}|@{[^}]+}|&{[^}]+}">

You're right, I fixed it.
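
For readers following along, here is a minimal sketch of why a bare ampersand in the `pattern` attribute can stop an XML lexer definition from loading. It uses Python's standard `xml.etree.ElementTree` as a stand-in parser; the actual loader used by the editor is not shown in this thread, so treat the parser choice and the trimmed-down `<rule>` snippets as assumptions.

```python
import xml.etree.ElementTree as ET

# Trimmed-down rule snippets (illustrative, not the full lexer file).
RAW = r'<rule pattern="\&\{[^\}]+\}"><token type="NameVariable"/></rule>'
ESCAPED = r'<rule pattern="&amp;\{[^\}]+\}"><token type="NameVariable"/></rule>'


def try_parse(snippet):
    """Return the decoded pattern attribute, or None if the XML is not well-formed."""
    try:
        return ET.fromstring(snippet).get("pattern")
    except ET.ParseError:
        return None


raw_result = try_parse(RAW)          # bare '&' starts an invalid entity reference
escaped_result = try_parse(ESCAPED)  # '&amp;' is decoded back to '&' for the regex
print(raw_result)
print(escaped_result)
```

The escaped form reaches the regex engine as a plain `&`, so the pattern itself is unchanged; only the XML encoding differs.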

jeffwilliams · 2024-10-15T23:43:11Z

lexers/embedded/robot.xml

+      <rule pattern="\$\{[^\}]+\}|\@{[^\}]+\}|\&\{[^\}]+\}">
+        <token type="NameVariable"/>
+      </rule>
+      <rule pattern="(True|False|None|null|on|off)\b">


One thing I noticed here is that this pattern will match portions of words. For example, when I tried it with a file that had the word Documentation in it, it highlighted the on at the end differently from the rest of the word.

Good catch, I wrapped it in \b so that it matches the whole word and not only parts of it.
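
As a rough illustration of the difference between the one-sided and two-sided boundary, the sketch below runs both patterns over a whole string with Python's `re` module. Python's engine is only a stand-in for the lexer's regex engine, and the sample strings are made up for the example. The result holds when the engine can see the characters before the match position.

```python
import re

# Pattern from the diff (trailing \b only) and the revised pattern (\b on both sides).
ONE_SIDED = re.compile(r"(True|False|None|null|on|off)\b")
TWO_SIDED = re.compile(r"\b(True|False|None|null|on|off)\b")

word = "Documentation"
one_sided_hit = ONE_SIDED.search(word)   # finds the trailing "on"
two_sided_hit = TWO_SIDED.search(word)   # no boundary between "i" and "o"

# A standalone "on" is still recognised by the two-sided pattern.
standalone_hit = TWO_SIDED.search("Log To Console  on")

print(one_sided_hit.group(0) if one_sided_hit else None)
print(two_sided_hit)
print(standalone_hit.group(0) if standalone_hit else None)
```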

jeffwilliams · 2024-10-21T12:31:01Z

lexers/embedded/robot.xml

+      <rule pattern="\$\{[^\}]+\}|\@{[^\}]+\}|&amp;\{[^\}]+\}">
+        <token type="NameVariable"/>
+      </rule>
+      <rule pattern="\b(True|False|None|null|on|off)\b">


I think we're getting closer.

The change to use word boundaries has fixed the issue when the keyword appears at the beginning or in the middle of a word, but there still seems to be an issue when the keyword appears at the end of a word. Here I mean 'word' as a 'space separated' word.

For example, the following document still has the 'on' at the end of the word Documentation highlighted:

*** Settings *** Documentation A test suite

I think the problem is that there is an ambiguity in the lexer grammar between the 'keyword' rule and the '.' rule in the root state that the initial word-boundary doesn't solve. I temporarily added a debug log to print the tokens and for that document I see:

syntax: token: token: Type: Keyword Value: '*** Settings ***' Start: 0 End: 16
syntax: token: token: Type: TextWhitespace Value: ' ' Start: 16 End: 17
syntax: token: token: Type: Text Value: 'Documentati' Start: 17 End: 28
syntax: token: token: Type: KeywordConstant Value: 'on' Start: 28 End: 30
syntax: token: token: Type: TextWhitespace Value: ' ' Start: 30 End: 31
syntax: token: token: Type: Text Value: 'A' Start: 31 End: 32
syntax: token: token: Type: TextWhitespace Value: ' ' Start: 32 End: 33
syntax: token: token: Type: Text Value: 'test' Start: 33 End: 37
syntax: token: token: Type: TextWhitespace Value: ' ' Start: 37 End: 38
syntax: token: token: Type: Text Value: 'suite' Start: 38 End: 43
syntax: token: token: Type: EOFType Value: '' Start: 0 End: 0

It looks like the grammar '.' rule allows parsing each letter in 'Documentati' as text (which is then combined internally), but when it sees the 'on' it matches the keyword rule. I think the word boundary is somehow honoured because the text being passed to match the regexp is the remainder of the unmatched text, which is 'on A test suite', and so the first word-boundary is satisfied.
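
The effect described above can be reproduced in isolation. This is a small sketch using Python's `re` as a stand-in for the lexer's regex engine (an assumption; the real engine is not named in this thread). It contrasts matching at an offset inside the full string with matching against the sliced remainder.

```python
import re

KEYWORD = re.compile(r"\b(True|False|None|null|on|off)\b")

text = "Documentation A test suite"
pos = text.index("on A")  # offset of the trailing "on" in "Documentation"

# Matching in place: the engine can still see the "i" before pos,
# so the leading \b is not satisfied.
in_place = KEYWORD.match(text, pos)

# Matching against the remainder only: "on" now sits at the very start
# of the string, so the leading \b is trivially satisfied.
remainder = text[pos:]
on_remainder = KEYWORD.match(remainder)

print(repr(remainder))
print(in_place)
print(on_remainder.group(0) if on_remainder else None)
```

If the lexer hands the regex only the unmatched remainder, as suspected here, the leading `\b` cannot tell a word start from a mid-word position.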

Perhaps we must only enter the 'keyword' state in a more specific condition? I don't know the syntax of robot files, but I am assuming that the keywords can only appear in test case sections, and only in a specific place in a line, and maybe only after some other syntax element that introduces it. If so, maybe we only need to enter the keyword state under those conditions.
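
One other option, sketched below, is different from the state-based approach suggested above: let the fallback rule consume a whole word at a time instead of a single character, so the keyword rule is never tried at a mid-word position. The toy lexer is purely illustrative. The rule names, rule order, and the `tokenize` and `merge_text` helpers are hypothetical and are not taken from robot.xml or from the real lexer library.

```python
import re

KEYWORD = r"\b(True|False|None|null|on|off)\b"


def tokenize(text, rules):
    """Toy remainder-based lexer: try each rule at the start of the unmatched text."""
    tokens = []
    rest = text
    while rest:
        for token_type, pattern in rules:
            m = re.match(pattern, rest)
            if m and m.end() > 0:
                tokens.append((token_type, m.group(0)))
                rest = rest[m.end():]
                break
        else:
            raise ValueError(f"no rule matches {rest!r}")
    return tokens


def merge_text(tokens):
    """Combine adjacent tokens of the same type, mimicking the merged Text tokens in the log."""
    merged = []
    for token_type, value in tokens:
        if merged and merged[-1][0] == token_type:
            merged[-1] = (token_type, merged[-1][1] + value)
        else:
            merged.append((token_type, value))
    return merged


# Hypothetical rule sets: per-character fallback versus whole-word fallback.
PER_CHAR_RULES = [("TextWhitespace", r"\s+"), ("KeywordConstant", KEYWORD), ("Text", r".")]
WORD_RULES = [("TextWhitespace", r"\s+"), ("KeywordConstant", KEYWORD),
              ("Text", r"\w+"), ("Text", r".")]

line = "Documentation A test suite"
before = merge_text(tokenize(line, PER_CHAR_RULES))
after = merge_text(tokenize(line, WORD_RULES))
print(before)
print(after)
```

With the per-character fallback the toy lexer splits Documentation the same way the debug log does; with the word-level fallback the word stays whole, while a standalone `on` is still picked up by the keyword rule because that rule is tried first.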

If you want I can give you the source-code diff to print those logs from Anvil, or give you a custom compiled binary that prints the logs so you can troubleshoot it.

Added Robot Framework lexer

2ba0bcf

jeffwilliams requested changes Oct 15, 2024

AmiZya added 3 commits October 20, 2024 02:28

Fixed the ampersand breaking the lexer

f68fe13

Changed NameTag to Name

3110d74

Match whole word instead of parts of it

9ba51ae

jeffwilliams requested changes Oct 21, 2024
