What does HackerNews think of fastText?
Library for fast text representation and classification.
- accuracy is generally a few points better than Naive Bayes (NB)
- better generalisability: the word embeddings act as a bottleneck on expressiveness, whereas NB or logistic regression essentially model all words / bigrams as independent features
- trained with cross-entropy, so model scores can be used more effectively as a 'confidence'. For example, in spam filtering you might want a rule like "if prediction score > X, then filter"; Naive Bayes is not ideal here because the 'naive' independence assumption makes its scores very poorly calibrated (it tends to give extremely high or low confidence scores, and this gets worse with document length).
- is completely linear (or at least log-linear, like NB), so explainability is very simple.
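The architecture described above can be sketched in a few lines: average the word embeddings of a document, apply one linear layer, and softmax the scores into probabilities. This is a toy illustration of the idea, not the real library; the embeddings and weights below are hand-picked stand-ins, not trained values.

```python
import math

# Toy 2-d word embeddings (hand-picked, NOT trained).
EMB = {
    "free": [1.0, 0.0],
    "money": [0.9, 0.1],
    "meeting": [0.0, 1.0],
    "tomorrow": [0.1, 0.9],
}

# One weight row per class (hand-picked, NOT trained).
W = [[1.5, -1.5],   # spam
     [-1.5, 1.5]]   # ham
LABELS = ["spam", "ham"]

def embed(doc):
    """Average the embeddings of known words: the 'bottleneck'."""
    vecs = [EMB[w] for w in doc.split() if w in EMB]
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(2)]

def predict(doc):
    """Linear scores + softmax -> (label, probability)."""
    x = embed(doc)
    scores = [sum(wi * xi for wi, xi in zip(row, x)) for row in W]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = probs.index(max(probs))
    return LABELS[best], probs[best]

print(predict("free money tomorrow"))
```

Because the model is linear, explaining a prediction reduces to inspecting which words pushed the averaged vector toward which class; and because training (in the real library) minimizes cross-entropy, the softmax output is a usable confidence score for thresholding rules like the one above.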
disclaimer: I haven't really thought about NLP for about 3 years, so there may be something better than this now
Fasttext is also available in the popular NLP Python library gensim, with a good demo notebook: https://radimrehurek.com/gensim/models/fasttext.html
And of course, if you have a GPU, recurrent neural networks (or other deep learning architectures) are the endgame for the remaining 10% of problems (a good example is SpaCy's DL implementation: https://spacy.io/). Or use those libraries to incorporate fasttext for text encoding, which has worked well in my use cases.
It's worth noting for future reference that for supervised learning of labels from a text-document input, fastText (https://github.com/facebookresearch/fastText) is leagues ahead of conventional approaches in both accuracy and training speed. There is also a Python interface (https://github.com/salestock/fastText.py) for use with Django/Flask (unfortunately, recent fastText changes have broken the interface for now).
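For the supervised mode, fastText expects a plain-text training file with one example per line, where labels carry a `__label__` prefix. A stdlib-only sketch that writes such a file (the example texts and file name are made up):

```python
import os
import tempfile

# Each training line is: "__label__<label> <text>"
examples = [
    ("spam", "win free money now"),
    ("ham", "are we still meeting tomorrow"),
]

path = os.path.join(tempfile.mkdtemp(), "train.txt")
with open(path, "w", encoding="utf-8") as f:
    for label, text in examples:
        f.write(f"__label__{label} {text}\n")

# The file can then be fed to the command-line tool, e.g.:
#   fasttext supervised -input train.txt -output model
with open(path, encoding="utf-8") as f:
    print(f.read())
```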
The world owes a big THANK YOU to Tomáš Mikolov, one of the creators of word2vec[0] and fastText[1], and also to Radim Řehůřek, the interviewer, who is the creator of gensim[2].
The number of software developers and researchers in industry and academia who rely on the work of these two individuals is large and growing every day.
[0] https://code.google.com/p/word2vec/
[1] https://github.com/facebookresearch/fastText
[2] https://radimrehurek.com/gensim/
The big result here is the 15,000x speedup compared to a neural network, a gap that widens as the size of the dataset increases. But this doesn't mean neural networks are worthless. From the paper:
"Although deep neural networks have in theory much higher representational power than shallow models, it is not clear if simple text classification problems such as sentiment analysis are the right ones to evaluate them."