But I think it will have difficulty crossing paradigm boundaries simply by using documentation.
The exact syntax does not matter, only the grammar. If you give it the grammar, and then the keywords, it can find something that has similar grammar and then use your keywords.
For instance, back in the day some academics wrote a paper that compared GPT 3.5 to a couple of inductive programming systems (including one of mine) on solving programming problems in a certain well-known esoteric language which I shall call "L". The task was to solve those programming problems one-shot. The authors asserted that the "L" problem sets were unlikely to be in 3.5's training set, but I found them without much searching in a public GitHub repo. I mean the entire dataset was right there. In this case the researchers are colleagues and friends and I know they weren't negligent or malicious, they just missed the fact that their "unlikely to be in the training set" data was on the web.
So I'd always assume that if an LLM can perform a task that's because it's seen examples of the task during its training.
That said, let's not forget that LLMs have this really shockingly powerful ability to interpolate between examples, and they can improve their performance on, say, Task A by training on Task B, where A and B are different but similar.
E.g. they seem to get better at translating between language pairs for which they have few examples of parallel text by training on other language pairs for which they have more parallel text; they seem to learn something about translation in general by training on more examples of translation. I haven't got a good reference on that handy but it's well-known (and of course over-hyped and exaggerated by tech CEOs).
So without wanting to diminish your work, I'd guess that your new language's syntax is different and novel but everything else about it is more ordinary and the similarities are such that an LLM can wing it and write you a lexer etc. After all, the whole point about parser generators and similar tools is that the task can be abstracted and separated from syntax in the first place.
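To make that separation concrete, here's a minimal sketch of my own (not taken from your language or from any real tool) of a lexer where the keyword list is just data passed in. The token shapes, the `tokenize` name and the example keywords are all made up for illustration.

```python
import re

# Hypothetical token shapes shared by many languages; the group names are illustrative.
TOKEN_RE = re.compile(
    r"\s*(?:(?P<NUMBER>\d+)|(?P<WORD>[A-Za-z_]\w*)|(?P<OP>[-+*/=()]))"
)

def tokenize(source, keywords):
    """Split source into (kind, text) pairs.

    `keywords` is the only language-specific input; the scanning logic
    never changes when the vocabulary does.
    """
    tokens = []
    pos = 0
    source = source.rstrip()  # trailing whitespace would otherwise fail to match
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        kind = match.lastgroup
        text = match.group(kind)
        if kind == "WORD":
            # Same word shape; only the lookup table decides keyword vs identifier.
            kind = "KEYWORD" if text in keywords else "IDENT"
        tokens.append((kind, text))
        pos = match.end()
    return tokens

familiar = tokenize("let x = 1", {"let"})
novel = tokenize("gostak x = 1", {"gostak"})
```

Both calls produce the same sequence of token kinds; only the keyword text differs. That's roughly the sense in which swapping the vocabulary while keeping the grammar is an easy fill-in-the-blanks job.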
In fact LLMs are very good at that sort of thing, filling in the blanks as it were. I'm old enough to remember the excitement about GPT 3.5 being able to form syntactically correct sentences with nonsensical words given to it.
For example, I just asked Chat [1]:
Hey chat. The gostak distims the doshes. What happens to the doshes?
And it promptly answered: The doshes get distimmed.
See, it even got the spelling right!
_________________
[1] https://chatgpt.com/c/6a242b65-e248-83ed-9a6e-f238a1e871b6