undefined

points

[-]

  > Dion language demo (experimental project which stores program as AST).

Michael Franz [1] invented slim binaries [2] for the Oberon System. Slim binaries were program (or module) ASTs compressed with the some kind of LZ-family algorithm. At the time they were much more smaller than Java's JAR files, despite JAR being a ZIP archive.

[1] https://en.wikipedia.org/wiki/Michael_Franz#Research

[2] https://en.wikipedia.org/wiki/Oberon_(operating_system)#Plug...

I believe that this storage format is still in use in Oberon circles.

Yes, I am that old, I even correctly remembered Franz's last name. I thought then he was and still think he is a genius. ;)

by fuhsnn13 hours ago|

prev|

[-]

Structural diff tools like difftastic[1] is a good middle ground and still underexplored IMO.

[1] https://github.com/Wilfred/difftastic

by panstromek13 hours ago|

parent|

[-]

IntelliJ diffs are also really good, they are somewhat semi-structural I'd say. Not going as far as difftastic it seems (but I haven't use that one).

by zokier2 hours ago|

prev|

[-]

> but storage format as text is just fundamentally problematic.

Why? The ast needs to be stored as bytes on disk anyways, what is problematic in having having those bytes be human-readable text?

by flowerbreeze11 hours ago|

prev|

[-]

I'm quite sure I've read your article before and I've thought about this one a lot. Not so much from GIT perspective, but about textual representation still being the "golden source" for what the program is when interpreted or compiled.

Of course text is so universal and allows for so many ways of editing that it's hard to give up. On the other hand, while text is great for input, it comes with overhead and core issues for (most are already in the article, but I'm writing them down anyway):

  1. Substitutions such as renaming a symbol where ensuring the correctness of the operation pretty much requires having parsed the text to a graph representation first, or letting go of the guarantee of correctness in the first place and performing plain text search/replace.
  2. Alternative representations requiring full and correct re-parsing such as:
  - overview of flow across functions
  - viewing graph based data structures, of which there tend to be many in a larger application
  - imports graph and so on...
  3. Querying structurally equivalent patterns when they have multiple equivalent textual representations and search in general being somewhat limited.
  4. Merging changes and diffs have fewer guarantees than compared to when merging graphs or trees.
  5. Correctness checks, such as cyclic imports, ensuring the validity of the program itself are all build-time unless the IDE has effectively a duplicate program graph being continuously parsed from the changes that is not equivalent to the eventual execution model.
  6. Execution and build speed is also a permanent overhead as applications grow when using text as the source. Yes, parsing methods are quite fast these days and the hardware is far better, but having a correct program graph is always faster than parsing, creating & verifying a new one.

I think input as text is a must-have to start with no matter what, but what if the parsing step was performed immediately on stop symbols rather than later and merged with the program graph immediately rather than during a separate build step?

Or what if it was like "staging" step? Eg, write a separate function that gets parsed into program model immediately, then try executing it and then merge to main program graph later that can perform all necessary checks to ensure the main program graph remains valid? I think it'd be more difficult to learn, but I think having these operations and a program graph as a database, would give so much when it comes to editing, verifying and maintaining more complex programs.

by zelphirkalt13 hours ago|

prev|

[-]

Why would structural editing be a dead end? It has nothing to do with storage format. At least the meaning of the term I am familiar with, is about how you navigate and manipulate semantic units of code, instead of manipulating characters of the code, for example pressing some shortcut keys to invert nesting of AST nodes, or wrap an expression inside another, or change the order of expressions, all at the pressing of a button or key combo. I think you might be referring to something else or a different definition of the term.

by panstromek13 hours ago|

parent|

[-]

I'm referring to UI interfaces that allow you to do structural editing only and usually only store the structural shape of the program (e.g. no whitespace or indentation). I think at this point nobody uses them for programming, it's pretty frustrating to use because it doesn't allow you to do edits that break the semantic text structure too much.

I guess the most used one is styles editor in chrome dev tools and that one is only really useful for small tweaks, even just adding new properties is already pretty frustrating experience.

[edit] otherwise I agree that structural editing a-la IDE shortcuts is useful, I use that a lot.

by pegasus11 hours ago|

parent|

[-]

Some very bright Jetbrains folks were able to solve most of those issues. Check out their MPS IDE [1], its structured/projectional editing experience is in a class of its own.

[1] https://www.youtube.com/watch?v=uvCc0DFxG1s

by conartist613 hours ago|

prev|

[-]

Come the BABLR side. We have cookies!

In all seriousness this is being done. By me.

I would say structural editing is not a dead end, because as you mention projects like Unison and Smalltalk show us that storing structures is compatible with having syntax.

The real problem is that we need a common way of storing parse tree structures so that we can build a semantic editor that works on the syntax of many programming languages

by panstromek13 hours ago|

parent|

[-]

I think neither Unison nor Smalltalk use structural editing, though.

[edit] on the level of a code in a function at least.

by conartist611 hours ago|

parent|

[-]

No, I know that. But we do have an example of something that does: the web browser.