> Given a method specification in the form of tests, type and effect annotations with RDL, this synthesizes a Ruby function that will pass the tests. This reduces programmer effort to just writing tests that specify the function behavior and the computer writing the function implementation for you.

That's really neat! But doesn't this just mean that instead of writing your code and logic in Ruby you're writing it in RDL?

Also, it would be great to see some examples in the README or in an examples folder. It's kind of hard for me to tell exactly what I need to write and what to call to generate the Ruby.

> But doesn't this just mean that instead of writing your code and logic in Ruby you're writing it in RDL?

RDL allows the specification of types and effect labels only. You still write the tests in standard Ruby and synthesized code should satisfy the logic as checked by the tests.

> Also, it would be great to see some examples in the README or in an examples folder. It's kind of hard for me to tell exactly what I need to write and what to call to generate the Ruby.

Sorry for the sparse documentation, I plan to improve that. I have added an example to the readme. Other examples can be found in `test/benchmark` folder.

> You still write the tests in standard Ruby and synthesized code should satisfy the logic as checked by the tests.

I think its important that the code still be user-modifiable:

Take this function, y = 2 * x.

Write a test, which proves for all real numbers, that y = 2 * x.

Can't do it, can you?

Tests can't fully specify code - this effort is doomed to be incomplete.

> Write a test, which proves for all real numbers, that y = 2 * x.

Tests by definition do not prove any for all property. They just check on correctness on concrete values. Formal properties are great for proving properties on values, i.e., for your example one could easily write a logical formula and hand it off to a SMT solver for synthesis and there is enough prior work on that.

> Tests can't fully specify code - this effort is doomed to be incomplete.

RbSyn addresses a different problem: it is difficult to formally specify the behavior of an application in a high-level language that handles HTTP requests, accesses the database. Incompleteness of tests are an advantage here--it allows the specification to be tractable (these are tests programmers would have written anyway), while constraining the search space enough for the synthesizer to give a solution. In practice, it turns out these tests are good enough to synthesize methods from a lot of production grade Ruby on Rails web apps (https://arxiv.org/abs/2102.13183).

> while constraining the search space enough for the synthesizer to give a solution.

Keyword being "a". Best solution? Most easily maintained? I don't know about that.

> In practice, it turns out these tests are good enough to synthesize methods from a lot of production grade Ruby on Rails web apps

Yeah, for some very basic stuff, you might have invented an overly complicated way of building boilerplate code.

Your argument is that the tests being incomplete make it tractable and this is an "advantage" somehow - I don't think this is such an advantage. The majority of "useful" tests either are stand-ins for what the language should have given you anyways (proper type checking), or they find and fix known bugs, e.g. the intersection of three features.

They rarely properly specify functionality, usually the best way to do this is writing the actual code. At which point, to properly specify your functionality in "test" form, you must write the actual code as a test. This doesn't provide much benefit, outside of having perhaps automated one side of a counterproductive process. You do save some programmer time, but writing the code is and was never the hard part...

> Best solution? Most easily maintained?

We apply the Occam's razor here. For a given problem, a smaller program that solves the problem is a preferred solution. It is not a very good metric, and it is fine to disagree with it. But getting a "best" solution, requires one to define what "best" is, and there is no such agreed definition for code.

> The majority of "useful" tests either are stand-ins for what the language should have given you anyways (proper type checking), or they find and fix known bugs, e.g. the intersection of three features.

Find and fix known bugs is one aspect of it. You are not accounting the entire philosophy of test driven development, where one specifies functionality as tests upfront, and later writes code to pass those tests. Anyhow, we have found large projects like Discourse [1], Gitlab [2], Diaspora [3] have tests that specify code behavior well enough so that RbSyn can synthesize them. Very few tests are type-checking, bug fixing and intersection of 3 features.

[1]: https://github.com/discourse/discourse [2]: https://github.com/gitlabhq/gitlabhq [3]: https://github.com/diaspora/diaspora