A universal document converter has to know what a document is, and what to do with one once it has it. 'What a document is' is an AST that is the product of a few thousand years of literate civilization. You can detect the outline of this AST (or AAST, as you might call it) by asking what must be preserved in a different printing of the same text, or in a translation.

A universal document converter is 'expected' to admit transformations on the AST of a document. Lua filters do this more or less directly; operations on the JSON representation do it another way.
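
To make the JSON route concrete, here is a rough sketch in Python of a transformation over pandoc's JSON AST. The document is a small hand-written sample, and `walk` and `emph_to_strong` are names made up for the illustration, not part of pandoc or of any library.

```python
import json

def walk(node, fn):
    """Rebuild the tree bottom-up, applying fn to every element dict
    (anything shaped like {"t": ..., "c": ...})."""
    if isinstance(node, list):
        return [walk(child, fn) for child in node]
    if isinstance(node, dict):
        node = {key: walk(value, fn) for key, value in node.items()}
        return fn(node) if "t" in node else node
    return node

def emph_to_strong(elem):
    """Hypothetical transformation: turn emphasis into strong emphasis."""
    if elem["t"] == "Emph":
        return {"t": "Strong", "c": elem["c"]}
    return elem

# Hand-written sample in the shape of pandoc's JSON output.
doc = {
    "pandoc-api-version": [1, 23, 1],
    "meta": {},
    "blocks": [
        {"t": "Para", "c": [
            {"t": "Str", "c": "plain"},
            {"t": "Space"},
            {"t": "Emph", "c": [{"t": "Str", "c": "stressed"}]},
        ]}
    ],
}

converted = walk(doc, emph_to_strong)
print(json.dumps(converted["blocks"]))
```

In real use the JSON would come from `pandoc -t json` on stdin and go back out to `pandoc -f json`; the sample dict only stands in for that.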

I had never used Lua filters, not knowing Lua, but these days I use them all the time for simple problems and am getting a clearer picture of the possibilities. This is because Claude and Codex write Lua filters at the drop of a hat.

One simple illustration I have found useful in academic writing that is published, inter alia, in HTML arises from the willful decision of the HTML bureaucracy never to include a footnote syntax, and thus to fall short of the ABCs of any document concept, however narrow and curtailed. Having said 'oh, we don't need footnotes, we have hypertext' back in Clinton time, they are too proud to change. In fact, of course, HTML is the format par excellence of footnotes, as a gander at Wikipedia will tell you. Pandoc can't parse them out of HTML, including its own HTML, since there is nothing to parse: it is the human reader, looking at the page in the browser, who recognizes them as footnotes. But you can ask Claude to write a Lua filter that, e.g., recognizes pandoc's own HTML footnotes, which are as arbitrary as everyone else's, and generates the structure the author intended, in which they are footnotes.
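
For the curious, here is the same idea sketched through the JSON route in Python rather than as a Lua filter. It rests on assumptions: that the reader hands back the reference as a Link with class `footnote-ref` pointing at `#fnN`, that the notes sit in an ordered list inside a Div with id `footnotes`, and that list position matches N. Those shapes are modeled on the markup pandoc's HTML writer emits, not checked against what the HTML reader actually produces, and the helper names are invented for the example.

```python
def is_link_with_class(elem, cls):
    """True for a Link element whose class list contains cls."""
    return (isinstance(elem, dict) and elem.get("t") == "Link"
            and cls in elem["c"][0][1])

def strip_backlinks(node):
    """Drop the return-arrow links (class footnote-back) from a note body."""
    if isinstance(node, list):
        return [strip_backlinks(x) for x in node
                if not is_link_with_class(x, "footnote-back")]
    if isinstance(node, dict):
        return {k: strip_backlinks(v) for k, v in node.items()}
    return node

def collect_notes(blocks):
    """Split top-level blocks into (body, notes); notes maps '#fnN' to blocks.
    Assumes the Nth list item is note N, counting from the list's start."""
    body, notes = [], {}
    for block in blocks:
        if block.get("t") == "Div" and block["c"][0][0] == "footnotes":
            for inner in block["c"][1]:
                if inner["t"] == "OrderedList":
                    start = inner["c"][0][0]
                    for n, item in enumerate(inner["c"][1], start=start):
                        notes[f"#fn{n}"] = strip_backlinks(item)
        else:
            body.append(block)
    return body, notes

def restore(node, notes):
    """Replace each footnote-ref Link with a real Note element."""
    if isinstance(node, list):
        return [restore(x, notes) for x in node]
    if isinstance(node, dict):
        if is_link_with_class(node, "footnote-ref") and node["c"][2][0] in notes:
            return {"t": "Note", "c": notes[node["c"][2][0]]}
        return {k: restore(v, notes) for k, v in node.items()}
    return node

# Hand-written sample: one paragraph with a reference, plus the notes Div.
blocks = [
    {"t": "Para", "c": [
        {"t": "Str", "c": "Claim."},
        {"t": "Link", "c": [
            ["fnref1", ["footnote-ref"], [["role", "doc-noteref"]]],
            [{"t": "Superscript", "c": [{"t": "Str", "c": "1"}]}],
            ["#fn1", ""],
        ]},
    ]},
    {"t": "Div", "c": [
        ["footnotes", ["footnotes"], [["role", "doc-endnotes"]]],
        [
            {"t": "HorizontalRule"},
            {"t": "OrderedList", "c": [
                [1, {"t": "Decimal"}, {"t": "Period"}],
                [[{"t": "Para", "c": [
                    {"t": "Str", "c": "Source."},
                    {"t": "Link", "c": [
                        ["", ["footnote-back"], [["role", "doc-backlink"]]],
                        [{"t": "Str", "c": "back"}],
                        ["#fnref1", ""],
                    ]},
                ]}]],
            ]},
        ],
    ]},
]

body, notes = collect_notes(blocks)
restored = restore(body, notes)
print(restored)
```

The footnotes Div is dropped from the body and the superscript link becomes a Note carrying the note's own paragraphs, which is the structure a writer for LaTeX or docx can then render as a proper footnote.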
