What does HackerNews think of graphtage?

A semantic diff utility and library for tree-like files such as JSON, JSON5, XML, HTML, YAML, and CSV.

Language: Python

#11 in Node.js
#201 in Hacktoberfest
#10 in Library
#164 in Python
I'm not familiar with Pijul, and haven't finished watching this presentation, but IME the problem with modern version control tools is that they still rely on comparing lines of plain text, something we've been doing for decades. Merge conflicts are an issue because our tools are agnostic about the actual content they're tracking.

Instead, the tools should be smarter and work on the level of functions, classes, packages, sentences, paragraphs, or whatever primitive makes sense for the project and file that is being changed. In the case of code bases, they need to be aware of the language and the AST of the program. For binary files, they need to be aware of the file format and its binary structure. This would allow them to show actually meaningful diffs, and minimize the chances of conflicts, and of producing a corrupt file after an automatic merge.
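
To make the contrast concrete, here is a minimal sketch (not how any of the tools mentioned in this thread actually work) comparing a plain line diff with a structure-aware comparison of the same two JSON documents. The helper names `line_changes` and `structural_changes` are hypothetical and only the Python standard library is used. The second document reorders keys, is pretty-printed, and changes a single value.

```python
import difflib
import json

_MISSING = object()  # sentinel: no value exists on this side at this path


def line_changes(old_text, new_text):
    """Return the lines a plain text diff reports as removed or added."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    return [line for line in diff if line.startswith(("- ", "+ "))]


def structural_changes(old, new, path="$"):
    """Recursively compare two parsed JSON values.

    Returns a list of (path, old_value, new_value) tuples. Object keys are
    compared by name, so key order and whitespace are ignored. Lists are
    compared position by position, which is a deliberate simplification.
    """
    if isinstance(old, dict) and isinstance(new, dict):
        changes = []
        for key in sorted(old.keys() | new.keys()):
            changes += structural_changes(
                old.get(key, _MISSING), new.get(key, _MISSING), f"{path}.{key}"
            )
        return changes
    if isinstance(old, list) and isinstance(new, list):
        changes = []
        for i in range(max(len(old), len(new))):
            a = old[i] if i < len(old) else _MISSING
            b = new[i] if i < len(new) else _MISSING
            changes += structural_changes(a, b, f"{path}[{i}]")
        return changes
    # Scalars, or values whose container types differ
    return [] if old == new else [(path, old, new)]


# Same data apart from "version"; key order and formatting differ.
old_text = '{"name": "demo", "tags": ["a", "b"], "version": 1}'
new_text = json.dumps({"version": 2, "tags": ["a", "b"], "name": "demo"}, indent=2)

print(len(line_changes(old_text, new_text)), "lines flagged by the text diff")
print(structural_changes(json.loads(old_text), json.loads(new_text)))
```

The text diff flags every line because the formatting changed, while the structural comparison reports only the `version` field. The positional list comparison is naive: inserting an element at the front of a list would mark every later position as changed, which is the kind of case a real semantic diff tool would need to handle with a proper matching algorithm.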

There has been some research in this area, and there are a few semantic diffing tools[1,2,3], but I'm not aware of this being widely used in any VCS.

Nowadays, with all the machine learning advances, the ideal VCS should also use ML to understand the change at a deeper level, and maybe even suggest improvements. If AI can write code for me, it could surely understand what I'm trying to do, and help me so that version control is entirely hands-free, instead of having to fight with it, and be constantly aware of it, as I have to do now.

Or, since it's more than likely that humans won't be writing code or text in the near future, we'll skip the next revolution in VCS tools, and AI will be able to version its own software. /sigh

I just finished watching the presentation, and Pijul seems like an iterative improvement over Git. Nothing jumped out at me like a killer feature that would make me want to give it a try. It might be because the author focuses too much on technical details and fixing Git's shortcomings, instead of taking a step back and rethinking what a modern VCS tool should look like today.

[1]: https://semanticdiff.com/

[2]: https://github.com/trailofbits/graphtage

[3]: https://github.com/GumTreeDiff/gumtree

I had (maybe unreasonably) hoped that this course would provide a glimpse into how CT can be applied to organizing and processing data in the sense of keywords like "knowledge graphs", "graph databases", "ontologies", and "model-based engineering". On top of that, I hoped it would cover representing operations for meaningful (semantic) version control on these representations (e.g. [1, 2]), and bidirectional transformations [3] between structured representations (e.g. [3, 4] and "Triple Graph Grammars"). I have the sense that there are dozens of disparate concepts and that category theory offers some unifying power.

I hope that there simply hasn't been enough work done to do a category-theoretic treatment of all of these topics, and that perhaps even more category theory itself needs to be developed so that there are good ways to talk about concepts that are almost-but-not-quite-entirely described or subsumed by category theory.

The alternative is that I'm painfully wrong about what applied category theory aims to be, and that I have a ton of application-specific terms to learn about and won't find a formalization of the sense in which all of these concepts relate.

[1] https://en.wikibooks.org/wiki/Understanding_Darcs/Patch_theo...

[2] https://github.com/trailofbits/graphtage

[3] http://bx-community.wikidot.com

[4] https://github.com/grammarware/bx-parsing

[5] https://en.wikipedia.org/wiki/QVT

[6] http://graphdatamodeling.com/Graph%20Data%20Modeling/GraphQL...

[7] https://neo4j.com/developer/guide-data-modeling/

[8] https://web-cats.gitlab.io/#some-of-the-cats-we-come-across

[9] http://pauillac.inria.fr/~pilkiewi/papers/boomerang-tr.pdf

This is an interesting tool for computing minimal diffs, but the result is not very human friendly. If human-readable output is your goal and you are looking for something better than diff, have a look at graphtage: https://github.com/trailofbits/graphtage

Also works for XML, HTML, YAML and CSV.