What does HackerNews think of bert?

TensorFlow code and pre-trained models for BERT

Language: Python

#4 in Google
#4 in TensorFlow
> Has Google/Alphabet publicly released any of their AI models yet?

You mean NLP field-changing models from Google like BERT [1]? Or the Transformers paper [2]? Or the T5 model [3] (used by the company doing ChatGPT-like search currently on the front page of HN)?

1. https://arxiv.org/abs/1810.04805 code+models: https://github.com/google-research/bert

2. https://arxiv.org/abs/2112.04426

3. https://arxiv.org/abs/1910.10683 code+models: https://github.com/google-research/text-to-text-transfer-tra...

> resulting in large programs with lots of boilerplate

That was what I was trying to say when I said "the code required to implement the challenges is large enough that they are considered too inconvenient to use". This makes sense to me.

Thank you for this benchmark! I'll probably switch from jq to spyql now.

> So, orjson is part of the reason why a python-based tool outperforms tools written in C, Go, etc and deserves credit.

Yes, I definitely think this is worth mentioning upfront in the future, since, IIUC, orjson's core uses Rust (the serde library, specifically). The initial title gave me the impression that a pure-Python JSON parsing-and-querying solution was the fastest out there.
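For anyone curious, here is a quick illustrative timing sketch of that point (the payload and iteration count are made up; only `orjson.loads` and the stdlib `json.loads` are assumed):

```python
# Parse the same small document with the stdlib json module and with
# orjson (whose core is Rust/serde); timings are only illustrative.
import json
import timeit

import orjson  # pip install orjson

payload = json.dumps({"id": 1, "tags": ["a", "b"], "score": 3.14})

stdlib_time = timeit.timeit(lambda: json.loads(payload), number=100_000)
orjson_time = timeit.timeit(lambda: orjson.loads(payload), number=100_000)
print(f"stdlib json: {stdlib_time:.3f}s  orjson: {orjson_time:.3f}s")
```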

A parallel I find helpful is a claim like "the fastest BERT implementation is written in Python [0]". While the linked implementation is written in Python, it offloads the performance-critical parts to C/C++ through TensorFlow.

I'm not sure how such claims advance our understanding of the tradeoffs of programming languages. I initially thought I was going to change my mind about my impression that "Python is not a good tool for implementing fast parsing/querying", but I haven't, so I do think the title is a bit misleading.

[0] https://github.com/google-research/bert

" BERT is a method of pre-training language representations, meaning that we train a general-purpose "language understanding" model on a large text corpus (like Wikipedia), and then use that model for downstream NLP tasks that we care about (like question answering). BERT outperforms previous methods because it is the first unsupervised, deeply bidirectional system for pre-training NLP."

https://github.com/google-research/bert
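As a rough sketch of that pre-train/fine-tune pattern (using the Hugging Face transformers port of the released checkpoints rather than the repo's own TensorFlow code; "bert-base-uncased" is one of the published models, and the two-label head here is just an illustration):

```python
# Load a pre-trained BERT encoder and attach a fresh classification head
# for a downstream task; the head is randomly initialized and would still
# need fine-tuning on task data.
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

inputs = tokenizer("Is this review positive?", return_tensors="pt")
logits = model(**inputs).logits  # shape (1, 2); meaningless until fine-tuned
print(logits)
```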

This video explains a legendary paper, BERT. It leverages the Transformer encoder and introduces an innovative way to pre-train language models (masked language modeling). BERT has had a significant influence on how people approach NLP problems and has inspired a lot of follow-up studies and BERT variants.

Code: https://github.com/google-research/bert (TensorFlow), https://github.com/huggingface/transformers (PyTorch)
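A tiny demo of the masked-language-modeling objective described above, using the PyTorch port listed here (the example sentence is made up):

```python
# Ask BERT to fill in the [MASK] token; predictions come back ranked
# by probability.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")
for prediction in fill_mask("BERT is pre-trained on a large text [MASK]."):
    print(prediction["token_str"], round(prediction["score"], 3))
```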

The project mentioned in the article (BERT) is really interesting; you should look into it.

https://github.com/google-research/bert

I don't think you can get access to the actual models used to run e.g. Google Translate, but if you just want a big pretrained model as a starting point, their research departments release things pretty frequently.

For example, https://github.com/google-research/bert (the multilingual model) might be a pretty good starting point for a translator. It will probably still be a lot of work to get it hooked up to a decoder and trained, though.
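One way that hook-up could look, as a sketch only: this warm-starts both the encoder and the decoder from the multilingual checkpoint via Hugging Face transformers, and the cross-attention weights start out random, so the model still needs translation training before it is useful.

```python
# Warm-start a seq2seq model from multilingual BERT; encoder and decoder
# reuse the pre-trained weights, cross-attention is freshly initialized.
from transformers import BertTokenizer, EncoderDecoderModel

checkpoint = "bert-base-multilingual-cased"
tokenizer = BertTokenizer.from_pretrained(checkpoint)
model = EncoderDecoderModel.from_encoder_decoder_pretrained(checkpoint, checkpoint)

# Settings the decoder needs for training and generation.
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id
```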

There's probably a better pretrained model out there specifically for translation, but I'm not sure where you'd find it.