What does HackerNews think of parser?

Rewriting the Ruby Parser | Jun 2023

Well described. Also I found this on GitHub.

Tree-sitter: an incremental parsing system for programming tools | Feb 2021

This is more a function of Ruby than of tree-sitter. The tree-sitter grammars for other languages are hopefully less inscrutable. For Ruby, we basically just ported whitequark's parser [1] over to tree-sitter's grammar DSL and scanner API.

[1] https://github.com/whitequark/parser

Ruby adds experimental support for rightward assignments | Sep 2020

Ruby's parser is notoriously complex; if I remember correctly, only a few members of the core team even know how to maintain it without introducing regressions.

The craziest part of this is that Ruby does not provide a full-featured Ruby parser, so its entire static and dynamic analysis ecosystem depends on an (actually very high-quality) third-party parser, begrudgingly maintained by someone who (AFAIK) doesn't even write Ruby anymore: https://github.com/whitequark/parser

When I see new language features like this, I think of how Ruby's entire tooling ecosystem depends on the dramatically underfunded (and therefore primarily goodwill) efforts of high-output maintainers like whitequark and a few others. Ruby's highly dynamic and untyped nature means these tools are basically all the non-runtime guarantees you can get. Epitome of digital infrastructure.

Consider asking your company to fund some of these people:

* https://github.com/whitequark (maintains parser)

* https://github.com/sponsors/bbatsov (maintains RuboCop)

* https://github.com/sponsors/mbj (maintains unparser and mutant)

---

As context, I know this stuff intimately because I used to contribute heavily to most static and dynamic analysis tools in the Ruby ecosystem (https://github.com/backus?tab=overview&from=2017-12-01&to=20...) and used to track new Ruby changes really closely: https://cognitohq.com/new-features-in-ruby-2-4/

Compiler Construction by Niklaus Wirth [pdf] | Jan 2020

Agreed. I'm not going to say whether Ruby syntax is nice or not, but its use of parser generators is not a model of simplicity:

https://github.com/ruby/ruby/blob/master/parse.y

https://whitequark.org/blog/2013/04/01/ruby-hacking-guide-ch...

Ruby also changes syntax in patch-level versions: https://github.com/whitequark/parser

A criticism of Ruby (2013) | Dec 2017

I disagree substantively with this paragraph:

> Ruby has no well defined grammar. All the Ruby implementations today reuse Matz's original parsing code. There are various BNF grammars people have written for the language, but they may or may not match the actual implementation. Nor does Ruby provide a mechanism to turn code into an abstract syntax tree for you as Python and Lisp do. Anyone writing tools beyond a unit testing library must solve this problem first, before ever doing any real work.

Ruby's grammar is well-defined; it's just implementation-defined rather than specification-defined. That alone is a problem, but I think the author makes a mistake in associating an external specification with the state of being well-defined[1].

Being able to turn your code into an AST can be useful, but I think the author overstates its value. Personally, I've never wanted or needed to inspect Ruby's AST in a practical application. When I'm fiddling around, I use whitequark's parser[2][3], which has always worked well.

[1]: It's also worth noting that Ruby does have a formal ISO specification (https://www.iso.org/standard/59579.html), albeit for 1.9.x syntax (IIRC).

[2]: https://github.com/whitequark/parser

[3]: Which, notably, just takes Ruby's `rubyNN.y` and exposes the AST as Ruby objects.
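
As a point of comparison, here is a minimal sketch of turning Ruby source into a tree of plain Ruby objects. It uses Ripper from Ruby's standard library rather than whitequark's parser (a separate gem with its own node format), so it only illustrates the general idea. The sample source strings and variable names are made up for the example.

```ruby
require 'ripper'
require 'pp'

# Hypothetical sample input; any valid Ruby expression works here.
source = "1 + 2"

# Ripper.sexp returns the syntax tree as nested arrays and symbols.
tree = Ripper.sexp(source)

# The root node is [:program, [statement, ...]].
root_type, statements = tree
expression = statements.first

# A binary expression node is [:binary, left, operator, right].
node_type, _left, operator, _right = expression

pp tree
puts "root: #{root_type}, node: #{node_type}, operator: #{operator}"

# Ripper.sexp returns nil when the source cannot be parsed.
broken = Ripper.sexp("1 + )")
puts "broken source parsed? #{!broken.nil?}"
```

Ripper's nested arrays are cruder than the node objects whitequark's parser produces, but they are enough for fiddling around without installing a gem.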

Ruby Is Defined by Terrible Tools | Jul 2015

> but Ruby has pretty terrible support for parsing Ruby

https://github.com/whitequark/parser

Neither unmaintained, broken nor just for old versions. It has some really minor incompatibilities though:

https://github.com/whitequark/parser#known-issues

The same person also wrote a Python parser because apparently the ast module doesn't provide precise location information of tokens:

https://github.com/m-labs/pythonparser
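
To make "precise location information of tokens" concrete, here is a small Ruby sketch that lists each token with its line and column. It uses `Ripper.lex` from the standard library as a stand-in; it is not how pythonparser or whitequark's parser represent locations, and the sample source and hash keys are assumptions for illustration.

```ruby
require 'ripper'

# Hypothetical sample input.
source = "x = 42"

# Ripper.lex yields one entry per token, starting with
# [[line, column], event, text]; later fields are ignored here.
tokens = Ripper.lex(source).map do |(line, column), event, text|
  { line: line, column: column, event: event, text: text }
end

tokens.each do |token|
  puts format("%d:%d %-10s %p", token[:line], token[:column], token[:event], token[:text])
end
```

Lines are 1-based and columns are 0-based, and whitespace shows up as its own token, so the positions cover every character of the source.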