> Dion language demo (experimental project which stores program as AST).
Michael Franz [1] invented slim binaries [2] for the Oberon System. Slim binaries were program (or module) ASTs compressed with some kind of LZ-family algorithm. At the time they were much smaller than Java's JAR files, despite JAR being a ZIP archive.
[1] https://en.wikipedia.org/wiki/Michael_Franz#Research
[2] https://en.wikipedia.org/wiki/Oberon_(operating_system)#Plug...
I believe that this storage format is still in use in Oberon circles.
Yes, I am that old; I even correctly remembered Franz's last name. I thought then, and still think, that he is a genius. ;)
Why? The AST needs to be stored as bytes on disk anyway; what is problematic about having those bytes be human-readable text?
Of course text is so universal and allows for so many ways of editing that it's hard to give up. On the other hand, while text is great for input, it comes with overhead and core issues (most are already in the article, but I'm writing them down anyway):
1. Substitutions such as renaming a symbol: ensuring the correctness of the operation pretty much requires parsing the text into a graph representation first; otherwise you let go of the guarantee of correctness and perform a plain text search/replace.
2. Alternative representations requiring full and correct re-parsing such as:
- overview of flow across functions
- viewing graph based data structures, of which there tend to be many in a larger application
- imports graph and so on...
3. Querying structurally equivalent patterns when they have multiple equivalent textual representations and search in general being somewhat limited.
4. Merging changes and diffs comes with fewer guarantees than merging graphs or trees.
5. Correctness checks, such as detecting cyclic imports or ensuring the validity of the program itself, all happen at build time unless the IDE effectively maintains a duplicate program graph, continuously re-parsed from the changes, that is not equivalent to the eventual execution model.
6. Execution and build speed: parsing is a permanent overhead that grows with the application when text is the source. Yes, parsing methods are quite fast these days and the hardware is far better, but having a correct program graph is always faster than parsing, creating and verifying a new one.
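To make points 1 and 3 a bit more concrete, here is a minimal sketch using Python's standard `ast` module (Python 3.9+ for `ast.unparse`). The names `RenameVariable`, `rename_in_tree`, `rename_in_text` and `same_structure` are made up for this illustration, and the rename is deliberately naive: it ignores scoping, so a real tool would need proper name resolution on top of it.

```python
import ast

SOURCE = '''
def total(count, discount):
    label = "count"
    return count - discount
'''


class RenameVariable(ast.NodeTransformer):
    """Rename an identifier where it appears as a variable or a parameter.

    Illustrative only: this does not track scopes or shadowing.
    """

    def __init__(self, old, new):
        self.old, self.new = old, new

    def visit_Name(self, node):
        # Variable reads and writes are Name nodes.
        if node.id == self.old:
            node.id = self.new
        return node

    def visit_arg(self, node):
        # Function parameters are arg nodes.
        if node.arg == self.old:
            node.arg = self.new
        return node


def rename_in_tree(source, old, new):
    """Rename on the parsed tree, then print the tree back as text."""
    tree = RenameVariable(old, new).visit(ast.parse(source))
    return ast.unparse(tree)


def rename_in_text(source, old, new):
    """Plain search/replace, with no idea what an identifier is."""
    return source.replace(old, new)


def same_structure(a, b):
    """True if two snippets parse to the same tree, whatever their layout."""
    return ast.dump(ast.parse(a)) == ast.dump(ast.parse(b))


if __name__ == "__main__":
    print(rename_in_tree(SOURCE, "count", "n"))
    print(rename_in_text(SOURCE, "count", "n"))
    print(same_structure("x = (a+b)", "x   =   a + b"))
```

The tree-based rename leaves `discount` and the string literal alone, while the text replace mangles both. The structural comparison treats the two spellings of `a + b` as the same query target. Note that `ast.unparse` drops comments and original formatting, which is exactly the kind of thing a text-free storage format would have to solve separately.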
I think input as text is a must-have to start with no matter what, but what if the parsing step was performed immediately on stop symbols rather than later, and the result merged with the program graph immediately rather than during a separate build step?
Or what if it was like a "staging" step? E.g., write a separate function that gets parsed into the program model immediately, then try executing it, and later merge it into the main program graph, which can perform all the checks necessary to ensure the main program graph remains valid. I think it'd be more difficult to learn, but I think having these operations, and a program graph as a database, would give so much when it comes to editing, verifying and maintaining more complex programs.
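A rough sketch of what I mean by staging, again with Python's `ast` module. Everything here is hypothetical (`ProgramGraph`, `stage`, `merge`, `called_names`), and the only check performed on merge is "every directly called function is already known", standing in for whatever validity checks a real program graph would enforce. The "try executing it" part is left out.

```python
import ast
import builtins


class ProgramGraph:
    """Hypothetical program model: function name -> parsed function AST."""

    def __init__(self):
        self.functions = {}

    def stage(self, source):
        """Parse one function right away; SyntaxError surfaces immediately."""
        tree = ast.parse(source)
        if len(tree.body) != 1 or not isinstance(tree.body[0], ast.FunctionDef):
            raise ValueError("expected exactly one function definition")
        return tree.body[0]

    @staticmethod
    def called_names(func):
        """Names of functions called directly by name inside func."""
        return {
            node.func.id
            for node in ast.walk(func)
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
        }

    def merge(self, func):
        """Add a staged function only if the graph stays valid."""
        known = set(self.functions) | {func.name} | set(dir(builtins))
        missing = self.called_names(func) - known
        if missing:
            raise ValueError(f"unknown functions: {sorted(missing)}")
        self.functions[func.name] = func


if __name__ == "__main__":
    graph = ProgramGraph()
    graph.merge(graph.stage("def double(x):\n    return x * 2\n"))
    graph.merge(graph.stage("def quad(x):\n    return double(double(x))\n"))
    print(sorted(graph.functions))
```

A function that calls something the graph does not know about is rejected at merge time, so the main graph never holds a dangling reference; a function that does not even parse never gets past `stage`.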
I guess the most used one is the styles editor in Chrome DevTools, and that one is only really useful for small tweaks; even just adding new properties is already a pretty frustrating experience.
[edit] Otherwise I agree that structural editing à la IDE shortcuts is useful; I use that a lot.
In all seriousness this is being done. By me.
I would say structural editing is not a dead end because, as you mention, projects like Unison and Smalltalk show us that storing structures is compatible with having syntax.
The real problem is that we need a common way of storing parse tree structures so that we can build a semantic editor that works on the syntax of many programming languages.
[edit] At the level of the code in a function, at least.