What does HackerNews think of louise?

Polynomial-time Meta-Interpretive Learning

Language: Prolog

Hey, if you're interested in higher-order logic programming then you might want to have a look at Meta-Interpretive Learning (MIL). It's a new approach to Inductive Logic Programming (ILP) that learns by SLD-resolution with first- and second-order definite clauses (hence, "higher-order").

You won't find much on MIL described as "higher-order logic programming", but that's really what it is: logic programming with a higher-order program (i.e. one containing both first- and second-order clauses). The catch is that once you do this, any proof you complete gives you instantiations of the second-order variables in the second-order clauses, and now you have a first-order program that you didn't have before; hence "inductive". So it's basically a form of machine learning of logic programs, from other logic programs.
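To make the idea concrete, here is a toy sketch of that mechanism in plain Prolog. This is illustrative only and not Louise's actual code: the `metarule/3`, `prove_example/2` and `m/3` names are mine, and predicate symbols are made first-class by the usual trick of wrapping atoms as `m(Symbol, Arg1, Arg2)`:

```prolog
% A second-order "chain" clause, P(x,y) <- Q(x,z), R(z,y),
% with P, Q, R variables ranging over predicate symbols.
metarule([P,Q,R], m(P,X,Y), [m(Q,X,Z), m(R,Z,Y)]).

% Background knowledge, with the predicate symbol reified:
m(parent, alice, bob).
m(parent, bob, charlie).

% Proving one positive example against the metarule binds P, Q, R:
prove_example(Atom, Sub) :-
    metarule(Sub, Atom, Body),
    maplist(call, Body).

% ?- prove_example(m(grandparent, alice, charlie), Sub).
% Sub = [grandparent, parent, parent]
% i.e. the proof has induced the first-order clause:
%   grandparent(X,Y) :- parent(X,Z), parent(Z,Y).
```

The point is exactly the one made above: completing the proof is what produces the substitution for the second-order variables, and that substitution *is* the learned first-order clause.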

There are a few papers you can find online about MIL, but I recommend (PLUG ALERT) the documentation of the MIL system I created for my doctoral thesis as a starting point:

https://github.com/stassa/louise

I recommend that as a starting point because it's written for people who are familiar with logic programming, but not necessarily ILP. Anyway, plug over. I think MIL is an exciting new development that has of course gone under the radar of the logic programming community at large.

As triska says, there are different execution strategies for Prolog and left-recursion is not a problem in practice. To plug my own work, the Inductive Logic Programming (ILP) system I developed during my PhD (https://github.com/stassa/louise) can learn left-recursive Prolog programs thanks to an inductive Prolog meta-interpreter executed with SLG-resolution (a.k.a. tabling). I mean to say that the inductive meta-interpreter itself is tabled, so it can prove left-recursive programs without entering an infinite recursion (it needs to do that because it learns a program by proving its generalisation). And so it can learn left-recursive programs without "going infinite". That used to be a big problem for ILP, but it's now well and truly solved, in theory as well as in practice. And I say "now", but SLG-resolution is as old as a very old thing; Tamaki and Sato described it in 1986, I believe:
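For a minimal, self-contained illustration of why tabling matters here (this is plain SWI-Prolog, nothing to do with Louise's actual meta-interpreter):

```prolog
:- table path/2.

% A left-recursive definition: under plain SLD-resolution the first
% clause calls itself before consuming any input and loops forever.
% Under SLG-resolution (tabling) the repeated call is suspended,
% answers are collected in a table, and the query terminates.
path(X, Y) :- path(X, Z), edge(Z, Y).
path(X, Y) :- edge(X, Y).

edge(a, b).
edge(b, c).

% ?- path(a, X).
% enumerates X = b and X = c, then fails finitely (no infinite loop).
```

Without the `:- table path/2.` directive, the same query loops; with it, the left-recursive program behaves exactly as its declarative reading suggests.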

https://link.springer.com/chapter/10.1007/3-540-16492-8_66

But then implementations took a bit to catch up.

>> Prolog is deeply unsuited for work, use Clingo instead.

Well, that doesn't make sense. ASP doesn't even allow lists, let alone side-effects (you know, like writing to a stream), so what "work" is it better suited for than Prolog?

Also, if I remember correctly, Clingo is not a language but a grounder? It takes a first-order program and grounds the entire Herbrand base into a set of propositions so that a SAT solver can then derive the stable models of the program? Do I have that wrong?

Let me say up front that I've never heard a worse idea in my life, and I've heard a few. First of all, that's exactly why ASP can't use lists, or other function symbols: because then the Herbrand base becomes infinite and good luck grounding it. Second, SAT-solving is NP-complete. As I realised recently (https://news.ycombinator.com/item?id=34072923) ASP is an elegant framework for dealing with uncertainty in a purely symbolic manner, with negation as failure and axiomatic, classical negation. But then, why go and mess this elegance up with a dumb propositionalisation approach that loses all the good stuff about first-order logic, and that's terminally inefficient to boot? That, I still don't get.

Thanks, that's a nice example.

>> For an example. The potential hypothesis here are pre generated, but you can imagine an algorithm or adapt an existing one with a tight generalise/specialise loop.

Yes! I'm thinking of how to adapt Louise (https://github.com/stassa/louise) to do that. The fact that s(CASP) is basically a Prolog-y version of ASP (with constraints) could make it a very natural sort of modification. Or, of course, there's always Well-Founded Semantics (https://www.swi-prolog.org/pldoc/man?section=WFS).

What's hyp_gen.pl?

Note you can do machine learning of logic programs. My PhD research:

https://github.com/stassa/louise

In which case it _is_ machine learning and it still really works :D

If you want to auto-write Haskell, use MagicHaskeller:

http://nautilus.cs.miyazaki-u.ac.jp/~skata/MagicHaskeller.ht...

And while we're at it, if you want to auto-write Prolog, use my own Louise:

https://github.com/stassa/louise

My guess is the end result of all this "AI" assisted code-generation is that it will have the same impact on the software engineering industry as spreadsheets had on accounting. I also believe that this AI-powered stuff is a bit of a "two-steps forward, one step back" situation and the real innovation will begin when ideas from tools like Louise [1] are integrated into the approach taken in Codex.

When VisiCalc was released, departments of 30 accountants were reduced to 5 because of the improvement in individual worker efficiency; however, accounting itself remains largely unchanged and accountants are still a respected profession who perform important functions. There are plenty of programming problems in the world that simply aren't being solved because we haven't figured out how to reduce the burden of producing the software; code generation will simply increase the output of an individual software developer.

The same forces behind "no-code" are at work here. In fact I see a future where these two solutions intermingle: where "no-code" becomes synonymous with prompt-driven development. As we all know, however, these solutions will only take you so far -- and essentially only allow you to express problems in domains that are already well-solved. We're just expressing a higher level of program abstraction; programs that generate programs. This is a good thing and it is not a threat to the existence of our industry. Even in Star Trek they still have engineers who fix their computers...

[1] - https://github.com/stassa/louise

>> 3) all predicates work in all-modes, all the time

I think this is the feature that is most sorely missed in Prolog programming, and as you say it is possible to have it, but only if one is prepared to jump through no end of hoops. Sometimes it's just not possible with any reasonable amount of effort or any sensible amount of coding. This feature is also the one most characteristic of relations, rather than functions, or the demi-functions that Prolog predicate definitions end up as (i.e. with more or less standard inputs and outputs, a.k.a. modes).
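To make the contrast concrete (standard Prolog, nothing system-specific): append/3 is one of the rare predicates that genuinely works in all modes, which is exactly what makes it a relation rather than a demi-function:

```prolog
% The textbook definition, usable in every mode:
append([], Ys, Ys).
append([X|Xs], Ys, [X|Zs]) :- append(Xs, Ys, Zs).

% ?- append([1,2], [3], Zs).     % (+,+,-): Zs = [1,2,3]
% ?- append(Xs, [3], [1,2,3]).   % (-,+,+): Xs = [1,2]
% ?- append(Xs, Ys, [1,2]).      % (-,-,+): enumerates all splits:
%                                %   Xs = [],    Ys = [1,2] ;
%                                %   Xs = [1],   Ys = [2]   ;
%                                %   Xs = [1,2], Ys = []
```

Most user-written predicates, by contrast, loop or throw instantiation errors the moment they are called in a mode their author didn't anticipate; that's the gap being discussed here.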

I think I detect a hint of frustration in your comment, perhaps after one too many encounters with recalcitrant Prolog programmers, unwilling to give up their convenient hacks :)

Well, these days I can wear another hat, that of the ILP researcher, and there is an interesting use of modes in ILP: they are often used to reduce the size of the hypothesis search space. Because, of course, the hypothesis space for the program append(+,+,-) is much smaller than the hypothesis space for the program append(?,?,?)! Aleph, probably the best-known ILP system of all time (and its ancestor, Progol), has a special "mode declaration" syntax to specify input and output modes, for this reason.

Interestingly, the more recent approach (attention: I'm about to talk about my PhD work :) of Meta-Interpretive Learning (MIL) does away with specific modes and with Aleph and Progol's (and many other systems') mode declarations. Instead, MIL encodes inductive bias by means of second-order datalog clauses, the "metarules" (e.g. ∃P,Q,R,X ∀x,y,z: P(x,y) ← Q(x,z), R(z,y,X) is a metarule with second-order variables P,Q,R existentially quantified over the set of predicate symbols, first-order variable X existentially quantified over the set of constants, and first-order variables x,y,z universally quantified over constants). Metarules do not specify input and output modes (they are, after all, datalog). This would normally blow up the hypothesis space to unsearchable proportions, but that is managed in various ways, and an actual search of the hypothesis space (i.e. the set of sets of clauses) can be avoided altogether, for a polynomial-time learning approach.

But I digress. What I mean to say is that this theoretically mode-free learning has so far only been implemented in Prolog [1][2][3] or ASP [4]. ASP is purely declarative, but incomplete in its own way (because the Herbrand base must be fully ground, which is impossible for infinite Herbrand bases), and Prolog has the mode problem we discussed. So while we have a powerful learning approach, with a nice theoretical basis, in practice we always implement it in a less than satisfying manner. And, I dare say, I've seen the result in annoying "edge cases". So, after our discussion, I wonder if a minikanren implementation of MIL would overcome the limitations of the Prolog and ASP implementations, even at the cost of expensive occurs-checks etc. I mean, at least if one gives up efficiency there should be an equally important reward, and an in-practice realisation of the in-principle soundness and completeness of the approach would probably be that. In my opinion, anyway. At the very least it's a more compelling reason to re-implement a system in a new language than "well, it's an implementation in a new language".

Food for thought, post-thesis :)

____________

[1] Metagol:

https://github.com/metagol/metagol

(Try to ignore the scary header)

[2] Thelma:

https://github.com/stassa/thelma

[3] Louise:

https://github.com/stassa/louise

[4] Hexmil:

https://arxiv.org/abs/1805.00068

Statistics, no. Machine learning, yes. The most successful inductive programming approaches (learning programs from data) are Inductive Logic Programming approaches that basically learn Prolog programs from data. Prolog, as a language whose interpreter is an automated deductive theorem prover, is uniquely suited to learning new programs because, it turns out, induction (learning new theories from observations and existing theories) is the inverse of deduction (explaining observations with existing theories).

For some modern examples of learning Prolog programs with Prolog see:

1. Metagol:

https://github.com/metagol/metagol

2. Louise:

https://github.com/stassa/louise

3. Popper:

https://arxiv.org/abs/2005.02259

The first repo has a bibliography right at the end, though it's maybe a bit too formal.

Full disclosure: I started my PhD on Metagol and Louise is my work.

For scientific computing, check out this paper:

Using logic programming for theory representation and scientific inference

https://www.sciencedirect.com/science/article/pii/S0732118X2...

And see repo with code here:

https://jeanchristopherohner.github.io/theory-toolbox-2/

In the interest of pruning this conversation a bit I will not continue the discussion about GPT-3. Apologies, but this thread is growing too fast and I don't have the time to give your comments the attention they deserve. I am happy for you to have the last word in that matter.

>> These techniques were invented from the field of AI, but that does not mean they remain in the field of AI.

Like I say above, it is pretty uncontroversial that these approaches are part of the field of AI research. You can consult wikipedia or e.g. the calls for papers from major AI conferences, AAAI and IJCAI, if in doubt.

So I have to ask again, why do you say these approaches are not in the field of AI research? According to whom? And based on what?

I would please like an answer to the above question.

Further, I can certainly point you to successes of symbolic AI, which you say don't exist. For one thing, there are the entire fields of automated theorem proving, planning, search, game playing, knowledge representation and reasoning, etc., which you say are "not AI", but which, like I say, are still active and still state of the art in their respective tasks. These are certainly successful: they have produced systems and techniques that still work better than any alternative, and actually quite well.

For examples of specific systems that were successful in their time, see Logic Theorist [1], which proved 38 of the first 52 theorems in Principia Mathematica; Chinook [2], the first computer program to win a world championship against humans (in checkers/draughts); Deep Blue [3], the first computer to defeat a reigning world chess champion (Garry Kasparov); MYCIN [4], the first AI system to outperform human experts in disease diagnosis (specifically, diagnosis of infections); and so on.

Of course these systems have been superseded - but they were successes nonetheless. Another reason to learn the history of AI is to become aware of those systems- they, indeed, were "there".

Again I have to ask you- where does your knowledge of AI come from? When you make such strong statements about what works and what doesn't, what failed and what succeeded, are you sure you are well informed? Do you draw your knowledge from primary sources, or are you trusting the opinions of others who claim to be experts- but may not be (like in the article above)?

>> How would you tackle this with ILP?

Below I've defined the problem in the format expected by Louise [5]:

  ?- list_mil_problem(ordered/3).
  Positive examples
  -----------------
  ordered([a],[b,c],[d,e,f]).
  
  Negative examples
  -----------------
  []
  
  Background knowledge
  --------------------
  shorter/2:
  shorter(A,B):-length(A,C),length(B,D),C<D.
Given this problem definition, Louise can learn the following (Prolog) program:

  ?- learn(ordered/3).
  ordered(A,B,C):-shorter(A,B),shorter(B,C).
  true.
To explain, shorter/2 is a predicate defined as background knowledge by me. triadic_chain is a metarule, a second-order clause that provides inductive bias. length/2 is an ISO Prolog predicate.

Like I say, this is a trivial problem, not least because its solution is easy to figure out and the background knowledge and metarules are trivial to define by hand. Louise can also perform predicate invention to define new background knowledge (kind of like inventing new features) and also new metarules. That is to say, Louise can learn the shorter/2 and length/2 programs, also from very few examples- and then reuse them as background knowledge. But showing how to do that would make for a larger example. I'm happy to oblige if you are curious.

I should point out that there exists no neural net approach that can learn the same (or a similar) program from a single positive example- not least because neural nets cannot make use of background knowledge (i.e. a library of programs from which to build other programs).

__________________

[1] https://en.wikipedia.org/wiki/Logic_Theorist

[2] https://en.wikipedia.org/wiki/Chinook_(computer_program)

[3] https://en.wikipedia.org/wiki/Deep_Blue_versus_Garry_Kasparo...

[4] https://en.wikipedia.org/wiki/Mycin

[5] https://github.com/stassa/louise/

I like the idea of the whole plot being described as JSON. I was looking for a way to automatically generate plots from a project written in Prolog (https://github.com/stassa/louise). Until now I was composing R plotting scripts with Prolog, which can get a bit clunky. SWI-Prolog has a solid library for converting Prolog terms to JSON, so it should be straightforward to translate between program output in Prolog and Vega-Lite JSON. I will definitely give this a try.
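As a rough sketch of what that translation might look like: library(http/json), json_write_dict/2 and SWI-Prolog dicts are real, but plot_json/2 is a hypothetical helper of mine, and the Vega-Lite field names below are my guess at a minimal bar-chart spec:

```prolog
:- use_module(library(http/json)).

% Hypothetical helper: turn a list of Key-Value pairs into a minimal
% Vega-Lite bar-chart spec, serialised to a JSON string.
plot_json(Pairs, JSON) :-
    findall(_{x: K, y: V}, member(K-V, Pairs), Values),
    Spec = _{ data: _{values: Values},
              mark: "bar",
              encoding: _{ x: _{field: "x", type: "ordinal"},
                           y: _{field: "y", type: "quantitative"} } },
    with_output_to(string(JSON), json_write_dict(current_output, Spec)).

% ?- plot_json([a-1, b-3], S).
```

The resulting string can be handed straight to Vega-Lite; the dict-based interface means no manual escaping or string building on the Prolog side.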

Just a question to @domoritz - I noticed that in this example plot, which shows relative numbers of different types of animals produced in the US and UK, rendered as emoji:

https://vega.github.io/vega-lite/examples/isotype_bar_chart_...

- the data is encoded as hard-coded values that tell the engine where to place each icon of a sheep, pig or cow. Isn't it possible to derive these positions from numerical data, automatically?

For instance, instead of enumerating each instance of a "pig":

    "values": [
      {"country": "Great Britain", "animal": "pigs"},
      {"country": "Great Britain", "animal": "pigs"},
     ... etc. 
Shouldn't it be possible to input the actual number of pigs and have the engine put it into bins as necessary?
My research group (I'm a PhD student) is working on algorithms that learn logic programs from examples and background knowledge (also given as logic programs). The programs our algorithms learn are first-order definite datalog programs, which means they are sets of first-order Horn clauses without negation as failure and without any function symbols of arity more than 0 (i.e. nothing but constants and variables) as arguments of literals.

The reason for the restriction is completeness and efficiency. Definite datalog programs are guaranteed to terminate (given a finite predicate and constant signature), so the search for a hypothesis (the learned program) cannot "go infinite" even when the target theory (what you are trying to learn) is recursive, even when it has mutually recursive clauses. As a result our algorithms can learn recursive programs (which is rather important). The space of hypotheses at least doubles in size when negation as failure is allowed, so disallowing it also improves the efficiency of the search for hypotheses.

The problem of course is that some programs that can be expressed in Prolog, with negation as failure and arbitrary functions as arguments, cannot be expressed in definite datalog. So that's a bit of a limitation. For instance, one has to jump through hoops to learn programs with "exceptions" (A if B, except if C), say a program calculating leap years (which have exceptions for years divisible by 100 and 400) or fizzbuzz.
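For instance, here is the usual Prolog rendering of the leap-year rule, which falls outside definite datalog both because of the arithmetic and because of the cut/fail used to express the exceptions:

```prolog
% Leap years: divisible by 4, except centuries, except every 400th year.
leap(Y) :- 0 is Y mod 400, !.        % exception to the exception
leap(Y) :- 0 is Y mod 100, !, fail.  % exception: centuries are not leap years
leap(Y) :- 0 is Y mod 4.             % the general rule

% ?- leap(2000).  % true  (divisible by 400)
% ?- leap(1900).  % false (century, not divisible by 400)
% ?- leap(2024).  % true  (divisible by 4)
```

In definite datalog there is no cut and no negation as failure, so the "except if" structure has to be compiled away into explicit positive conditions, which is the hoop-jumping referred to above.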

This is not directly programming with datalog- it's machine learning of datalog programs from data. But I think it's similar to the experience you're asking for.

My group's algorithms:

Metagol (a meta-interpretive learner for definite datalog programs):

https://github.com/metagol/metagol

Louise (a polynomial-time version of Metagol):

https://github.com/stassa/louise