While there is a lot to get excited about with ML, both as a consumer and as a software developer, I can't help feeling a pang of sadness.

Big data, and in this case the relationships (the graph) between big data points, are what's needed to make great ML/AI products. By nature, the only companies that will ever have access to this data in a meaningful way are the larger ones: Google, Amazon, Apple, etc. Because of this, I worry that small upstarts may never be able to compete on these types of products in a viable way, since the requirements to build these features are so easily defensible by the larger incumbents.

I hope this is not the case, but I'm getting less and less optimistic when I see articles like this.

The graph algorithm they're describing is basically Manaal Faruqui's "retrofitting", although they don't cite him. I will make the charitable assumption that they came up with it independently (and managed to miss the excitement about it at NAACL 2015).

Here's why not to be sad: Retrofitting and its variants are quite effective, and surprisingly, they're not really that computationally intensive. I use an extension of retrofitting to build ConceptNet Numberbatch [1], which is built from open data and is the best-performing semantic model on many word-relatedness benchmarks.
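To give a sense of how lightweight the core idea is, here's a minimal sketch of Faruqui-style retrofitting in Python. The names (vectors, graph, n_iters, alpha, beta) are illustrative, not taken from ConceptNet's actual code; this is the general technique, not Numberbatch's exact implementation.

    import numpy as np

    def retrofit(vectors, graph, n_iters=10, alpha=1.0, beta=1.0):
        """Nudge each word vector toward the average of its graph
        neighbors, while staying anchored to its original value."""
        original = {w: v.copy() for w, v in vectors.items()}
        new_vecs = {w: v.copy() for w, v in vectors.items()}
        for _ in range(n_iters):
            for word, neighbors in graph.items():
                neighbors = [n for n in neighbors if n in new_vecs]
                if word not in new_vecs or not neighbors:
                    continue
                # Weighted average of the original vector and the
                # current vectors of the word's graph neighbors.
                total = alpha * original[word]
                for n in neighbors:
                    total = total + beta * new_vecs[n]
                new_vecs[word] = total / (alpha + beta * len(neighbors))
        return new_vecs

Each pass is just a weighted average over the graph's edges, which is why the whole thing is cheap: the cost scales with the number of edges, not with anything that needs a data center.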

The entirety of ConceptNet, including Numberbatch, builds in 5 hours on a desktop computer.

Big companies have resources, but they also have inertia. Some problems can be solved by throwing all the resources of Google at them, but that doesn't mean it's the only way to solve them.

[1] https://github.com/commonsense/conceptnet-numberbatch