(Properly formatted) XML can be parsed, and streamed, by a visibly-pushdown automaton[1][2].

"Visibly Pushdown Expressions"[3] can simplify parsing with a terse syntax styled after regular expressions, and there's an extension to SQL which can query XML documents using VPAs[4].

JSON can also be parsed and validated with visibly pushdown automata. There's an interesting project[5] which aims to automatically produce a VPA from a JSON-schema to validate documents.

In theory these should be able to outperform parsers based on deterministic pushdown automata (i.e., (LA)LR parsers), but they're less widely used and understood, as they're much newer than the conventional parsing techniques and absent from the popular literature (Dragon Book, EAC, etc.).
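For anyone who hasn't met VPAs before, here is a rough sketch of the core idea in Python: the input alphabet is split into call, return, and internal symbols, and the symbol alone decides whether the stack is pushed, popped, or left untouched. This is only a toy that checks bracket nesting over an already-tokenized JSON-like stream, not a full JSON parser and not taken from any of the cited papers; the names `CALL`, `RETURN`, and `vpa_accepts` are made up for illustration.

```python
# Visibly pushdown idea: the input symbol alone decides the stack action.
# All names here are hypothetical, chosen only for this illustration.
CALL = {"{": "}", "[": "]"}   # call symbols: always push
RETURN = {"}", "]"}           # return symbols: always pop
# Every other token is "internal": the stack is left untouched.


def vpa_accepts(tokens):
    """Check the nesting of a token stream in a single streaming pass."""
    stack = []
    for tok in tokens:
        if tok in CALL:
            stack.append(CALL[tok])      # push the closer we expect later
        elif tok in RETURN:
            if not stack or stack.pop() != tok:
                return False             # unmatched or mismatched return
        # internal symbol: no stack action
    return not stack                     # accept only if every call was matched
```

For example, `vpa_accepts(["{", "a", ":", "[", "1", "]", "}"])` accepts, while `vpa_accepts(["{", "]"])` rejects. Because the stack height after any prefix depends only on which kinds of symbols were seen, the check works on a stream without lookahead.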

[1]:https://madhu.cs.illinois.edu/www07.pdf

[2]:https://www.cis.upenn.edu/~alur/Cav14.pdf

[3]:https://homes.cs.aau.dk/~srba/courses/MCS-07/vpe.pdf

[4]:https://web.cs.ucla.edu/~zaniolo/papers/002_R13.pdf

[5]:https://www.gaetanstaquet.com/ValidatingJSONDocumentsWithLea...

Without looking, I guessed that all your quotes come from academic papers. I was right.

Because real life is nothing like what is taught in CS classes.

I'm not an academic and have extensive experience with parsing.

But for whatever reason, VPAs slipped under my radar until very recently. I only discovered them a few weeks ago and have been quite fascinated. I've been reading a lot (the citations I've given are some of my recent reading), and am currently working on a visibly pushdown parser generator.

Thought it might be interesting to share in case others like me have missed out on VPAs.
