What does HackerNews think of duckling?
Language, engine, and tooling for expressing, testing, and evaluating composable language rules on input strings.
Which libraries? I know of Duckling [0] but what others?
I've been using Duckling [0] for extracting fuzzy dates and times from text. It does a good job, but I needed a custom build with extra rules to turn that into a great job. And that's just for dates, one of the 13 dimensions it supports. Being able to use an AI that handles them with better accuracy will be fantastic.
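For anyone curious what using it looks like, here is a minimal, untested sketch of calling Duckling from Haskell, loosely following the style of the project's example executable; the exact names (Seal vs. This for wrapping dimensions, the arguments to currentReftime) vary between Duckling versions, so treat it as an illustration rather than the definitive API.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import qualified Data.HashMap.Strict as HashMap
import Duckling.Core

main :: IO ()
main = do
  -- Reference time in a named zone; with an empty timezone-series map
  -- Duckling should fall back to UTC.
  refTime <- currentReftime HashMap.empty "America/Los_Angeles"
  let context = Context { referenceTime = refTime
                        , locale = makeLocale EN Nothing }
      options = Options { withLatent = False }
  -- Ask only for the Time dimension; the result is a list of entities with
  -- character offsets and a resolved, normalized value.
  print $ parse "next Friday at 3pm" context options [Seal Time]
```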
Does a specialised model trained to extract times and dates already exist? It's entity tagging but a specialised form (especially when dealing with historical documents where you may need Gregorian and Julian calendars).
Good. I set myself the challenge of compiling a Haskell program [1] during the Christmas holidays. It was meant to be a "one mince pie" challenge, but after an hour I discovered the VM I was using didn't have enough RAM (compilation was approaching 4GB), and then I ran out of disk space, since the Stack installation approaches 5GB and I had other stuff installed. Once a few hours had gone by (this program isn't fast to compile) I had a working program. I now have to figure out whether I can distribute just the resulting binary to other servers, or whether it needs other software like GHC installed alongside it. Having finished the pack of mince pies, that can wait for another day.
I know that when I first started compiling C/C++ software there was a learning curve and it took hours the first time, but I found it easier to get started. With Haskell, the way one version of GHC is installed first and then Stack installs a completely isolated version is confusing; add to that the inscrutable error messages (I haven't got it to hand, but one of them means OOM without saying so, and it takes a Google search to find the GitHub issue that explains it).
And this is all before I even try to experiment with, or decide to learn, some Haskell. Apart from the error messages these aren't issues with Haskell per se, but they do contribute to the experience of it.
There are several libraries for temporal normalization:
- Duckling: https://github.com/facebook/duckling
- JChronic: https://github.com/samtingleff/jchronic
- Chronic: the Ruby library that jchronic was made from
Stanford NLP and SpaCy also do tagging:
- https://github.com/stanfordnlp/stanfordnlp
- https://spacy.io/usage/linguistic-features#named-entities
Edit: Stanford NLP does not do temporal normalization. Added SpaCy
Some really nice projects do NLP without using ML at all. For instance Duckling [1] (a library made by Facebook to find entities in text) works 100% with parsing rules, and is surprisingly efficient; a sketch of such a rule follows below.
I agree with your point though: most of the time there is ML at some point in your pipeline, so you can't really avoid learning it!
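To make the "parsing rules" point concrete, here is a rough sketch of what one of Duckling's rules looks like, modeled on the English time rules in the repo (Duckling.Time.EN.Rules). The helper names (regex, tt, cycleNth) are recalled from the source and may not match every version exactly; a custom build with extra rules, as mentioned above, essentially adds more definitions like this.

```haskell
import Duckling.Types (Rule (..), regex)
import Duckling.Time.Helpers (cycleNth, tt)
import qualified Duckling.TimeGrain.Types as TG

-- A rule is a name, a pattern of regex/predicate items, and a production that
-- turns the matched tokens into a value. This one maps "(the) day after
-- tomorrow" to the day two days after the reference time.
ruleDayAfterTomorrow :: Rule
ruleDayAfterTomorrow = Rule
  { name = "day after tomorrow"
  , pattern =
    [ regex "(the )?day after tomorrow"
    ]
  , prod = \_ -> tt $ cycleNth TG.Day 2
  }
```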
Rewritten from Clojure; used to power smart products at Facebook.