Highlight literal matches with Treesitter

With vanilla syntax highlighting, we can specify something like:

:syn match ErrorMsg /start/

if we want all occurrences of the word start to be highlighted with the ErrorMsg group. nvim-treesitter allows the key additional_vim_regex_highlighting to be passed to a parser when it is being set up; however, it appears that Treesitter highlighting always overrides anything from normal syntax highlighting. For instance, if the matched highlight group with Treesitter is a light gray, I can run the above :syn match command and the color doesn’t change, but the word inherits the bold attribute from ErrorMsg. On top of this, :TSHighlightCapturesUnderCursor from nvim-treesitter/playground shows both the Treesitter and syntax match groups correctly.

Poking around a bit (mostly reading about query syntax), it seems like I should be able to create a highlights.scm file for the appropriate :h filetype that I am dealing with, and put the file in after/queries/{language}/highlights.scm. In this file, I can then do something like:

"start" @my-start

(which is an anonymous capture?) and then use the following when setting up my nvim-treesitter.configs:

    ...,
    custom_captures = {
        ["my-start"] = "ErrorMsg",
    },
    ...

Once all of this is set up, I get errors specifying that start is not a valid node, which makes sense as all the query examples use (mostly) named nodes. It seems like matching to a node via something like (node @x (#match? @x "\\(start\\)")) would work; however, this just searches for node values that contain start, and doesn’t extract it as a special item that can be highlighted separately (as far as I can tell).

I’m not super familiar with Scheme, but will continue messing around with :TSPlayground to see if I can find anything that works.

As far as I’m aware, this is not currently possible, but any advice is greatly appreciated.

Hi and sorry for the delay.

There is a little misconception that I just want to clarify before actually going into possible solutions to your problem, so please bear with me for a second.

Tree-sitter (and all parser libraries, for that matter) understands the text you write in a deeper manner than just an array of characters: it arranges the text in a tree, associating each part of the “array of characters” it got from the start with a node in what is called the abstract syntax tree.

This leads us to your issue, and two ways of fixing it.

If what you want is to highlight the occurrences of start in a given file, parser libraries ask you another question: where do you expect this start to appear? That is, which node type should this start be? If it is an identifier, then the answer to that question is:

((identifier) @start (#eq? @start "start"))

You can see in the example above that I say the following: mark as @start all the identifier nodes whose text is "start". These are the two pieces of information we talked about: where and what.

If you want to highlight all occurrences of the word start regardless of the node type, it will usually be difficult, because parsers generally have no concept of a word: a word is not part of the language they are attempting to parse.

Thus you might have to resort to some kind of nvim_buf_set_extmark tricks in order to do so.
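
To give an idea of what such an extmark trick could look like, here is a minimal sketch. It is only an illustration, not a tested plugin: the function names find_literal and highlight_literal and the namespace name are made up for this example, and it assumes a plain, case-sensitive literal match on the whole buffer. The highlights are placed once and are not refreshed when the text changes, so you would have to re-run it (for instance from an autocommand) to keep them up to date.

```lua
-- Hypothetical helper: return the 0-based {start_col, end_col} byte ranges
-- (end exclusive) of every literal occurrence of `word` in `line`.
local function find_literal(line, word)
  local ranges = {}
  if word == "" then
    return ranges -- an empty needle would otherwise loop forever
  end
  local init = 1
  while true do
    -- The fourth argument `true` disables Lua patterns (plain search).
    local s, e = line:find(word, init, true)
    if not s then
      break
    end
    -- Lua indices are 1-based and inclusive; extmark columns are 0-based
    -- with an exclusive end, hence `s - 1` and `e`.
    table.insert(ranges, { s - 1, e })
    init = e + 1
  end
  return ranges
end

-- Hypothetical helper: place one extmark per occurrence in buffer `bufnr`.
local function highlight_literal(bufnr, word, hl_group)
  -- The namespace name is an arbitrary choice for this sketch.
  local ns = vim.api.nvim_create_namespace("literal_word_highlight")
  -- Drop marks left over from a previous run.
  vim.api.nvim_buf_clear_namespace(bufnr, ns, 0, -1)
  local lines = vim.api.nvim_buf_get_lines(bufnr, 0, -1, false)
  for row, line in ipairs(lines) do
    for _, range in ipairs(find_literal(line, word)) do
      vim.api.nvim_buf_set_extmark(bufnr, ns, row - 1, range[1], {
        end_col = range[2],
        hl_group = hl_group,
      })
    end
  end
end

-- Usage (inside Neovim): highlight_literal(0, "start", "ErrorMsg")
```

Calling highlight_literal(0, "start", "ErrorMsg") on the current buffer should mark every start, whatever node it falls in. As far as I know, extmark highlights get a higher default priority than the Treesitter ones, so the ErrorMsg colors should win, but please check that on your version.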
