What does HackerNews think of hhvm?

A virtual machine for executing programs written in Hack.

Language: C++

#4 in PHP

> are they using some subset of the language?

They're using a derived language called Hack.

> Does the compiled code use a garbage collector? Or reference counting?

Not sure, but it's open source, so you can dig up the answers one way or another: https://github.com/facebook/hhvm

The answer to your first question is yes.

Your second question is based on an incorrect premise. Hack is internally built, but it is also open source:

https://github.com/facebook/hhvm

I joined Facebook in 2019 and left this year. Prior to FB, I worked mostly with Python, JavaScript, and C++. Even at FB, I worked mainly in Python (Instagram backend), but spent a lot of time in the Hack codebase.

My experience was that FB's Hack + HHVM stack is much easier to work with and felt more productive than any other backend stack I've used. It's important to consider that a huge portion of Facebook's backend is one giant HHVM monorepo (called www). That consistency and uniformity have allowed FB to build lots of tooling and developer-productivity features on top of this one stack. For example, when you add a feature flag, the tooling will automatically create a diff (PR) to remove the flag once it has been fully rolled out for a few weeks.

There are rough edges and weirdnesses, but HHVM is being improved pretty actively. Old mutable builtins are being (or have been) removed, the type system has gotten better, and error and warning messages keep improving. FB is very responsive to data on developer productivity.

EDIT: Another anecdote: when I first joined, I was working on both the Instagram Python codebase and the Hack API codebase (some of Instagram's APIs are in Hack/HHVM). I constantly wondered why we didn't migrate the Hack code to Python, and talked with various people about proposals and potential paths to do the migration for the APIs I worked on.

After a few months of working in both codebases, I completely flipped. After witnessing insane bugs and horrible architectural contortions designed to mitigate Python performance issues, I wondered why we didn't just migrate the Python codebase to Hack. Python (on CPython/Cinder) is just not ready for large-scale web services, and Hack is a much more productive environment than Java/C++/Rust for backend work.

TypeScript may be an even better option, but has some issues of its own.

Though it is internally-built, Hack is already open source at https://github.com/facebook/hhvm/.

FB uses a pretty wide array of languages internally -- I don't know if they release statistics publicly, but you can filter/search their open-source projects by language at https://opensource.fb.com/projects/#filter.

While tools like MySQL and PHP aren't as bad as the community makes them out to be, Facebook is a terrible example of how "impressive" these tools can be, considering that Facebook almost entirely rewrote both of them (https://github.com/webscalesql/webscalesql-5.6, https://github.com/facebook/hhvm).

It's all open source. Here's HHVM: https://github.com/facebook/hhvm. What's stopping you from working on it now? For all intents and purposes you would be hacking on it with them.

HipHop is Facebook's sponsored just-in-time compiler for PHP: HHVM (HipHop Virtual Machine). My understanding is that it's 95%-ish compatible with most PHP. They're using framework unit tests to push that percentage higher.

http://www.hhvm.com/frameworks/

https://github.com/facebook/hhvm

That's because it isn't. Or at least not in the traditional sense. It's not just some old bullshit, scripted in PHP, running on an array of scrappy LAMP boxen.

They run their PHP on HHVM, for one:

https://en.m.wikipedia.org/wiki/HipHop_for_PHP

https://github.com/facebook/hhvm

...and yeah, it executes PHP code, for sure. But right there, things are already different, and the reality is that they've written a substantial code base in C/C++.

And, two, I'm sure they retain some serious business proprietary trade secrets about their server infrastructure, meaning that while the web front-end might render out HTML like a souped-up CDN, behind the scenes, there is a shit ton of other stuff going down.

Honestly, I think they just leave the file name extensions in the URL for the sake of nostalgia.