What does Hacker News think of gcsfuse?

A user-space file system for interacting with Google Cloud Storage

Language: Go
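
Basic usage, per the project README, is a one-line mount, after which objects read as ordinary files. A minimal Go sketch of that pattern (the bucket name, mount point, and file name below are placeholders):

    package main

    import (
        "fmt"
        "log"
        "os"
        "os/exec"
    )

    func main() {
        // Assumes the gcsfuse binary is installed and GCS credentials are
        // available; "my-bucket" and "/mnt/gcs" are placeholders.
        if err := os.MkdirAll("/mnt/gcs", 0755); err != nil {
            log.Fatal(err)
        }
        // gcsfuse daemonizes by default, so this returns once the mount is ready.
        if out, err := exec.Command("gcsfuse", "my-bucket", "/mnt/gcs").CombinedOutput(); err != nil {
            log.Fatalf("mount failed: %v\n%s", err, out)
        }
        // Objects in the bucket now appear as ordinary files.
        data, err := os.ReadFile("/mnt/gcs/hello.txt")
        if err != nil {
            log.Fatal(err)
        }
        fmt.Printf("read %d bytes through the mount\n", len(data))
    }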

Is this the same gcsfuse that's been around for years, only now with official Google support?

https://github.com/GoogleCloudPlatform/gcsfuse

The only advantage of Google Cloud is the TPUs--if you're not running massive machine learning workloads, AWS is almost always the better choice on features, service, and reliability.

The *link between compute and storage* is not even officially a production product:

"Please treat gcsfuse as beta-quality software. Use it for whatever you like, but be aware that bugs may lurk, and that we reserve the right to make small backwards-incompatible changes." https://github.com/GoogleCloudPlatform/gcsfuse/

If a supposed cloud platform can't even provide a reliable way to access your data, it has no business being used in any halfway serious setting.

Disclosure: I work on Google Cloud, and helped out with this effort.

We're hoping to share more of the "How Fermilab did it" story later (maybe at NEXT?), but our blog post has a little more information [1]. End-to-end, from "Let's do this!" to Supercomputing, was about 3 weeks (including some bug fixes to the GCE support in HTCondor).

For people asking: Fermilab uploaded about 500 TB into a GCS Regional bucket (in us-central1). We have great peering at Google, so I think this was streaming at about 100 Gbps from Fermilab to us.
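
Back of the envelope, taking those numbers at face value, that upload is about half a day of sustained streaming; a trivial Go check (decimal units assumed):

    package main

    import "fmt"

    func main() {
        // Rough transfer time for the figures quoted above.
        const totalBits = 500e12 * 8 // ~500 TB expressed in bits
        const rateBps = 100e9        // ~100 Gbps sustained
        const seconds = totalBits / rateBps
        fmt.Printf("~%.1f hours\n", seconds/3600.0) // prints ~11.1 hours
    }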

As surmised, the science we ran was lots of individual tasks, each needing about 2 GB of RAM per vCPU, so during Supercomputing we right-sized the VMs using Custom Machine Types [2]. IIRC, several tasks needed up to 20 GB per task, so we sized the PD-HDD at 20 GB per vCPU. All of this was read from GCS in parallel via gcsfuse [3]; we chose a Regional bucket to minimize cost per byte and to maximize throughput efficiency (no reason to replicate elsewhere for this processing).
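
A minimal sketch of that parallel-read pattern, assuming the bucket is already mounted with gcsfuse at /mnt/gcs; the mount point, file layout, and worker count here are illustrative placeholders, not Fermilab's actual setup:

    package main

    import (
        "fmt"
        "log"
        "os"
        "path/filepath"
        "sync"
    )

    func main() {
        const mount = "/mnt/gcs" // placeholder gcsfuse mount point
        const workers = 16       // e.g. one reader per vCPU

        // Under gcsfuse, objects look like plain files on disk.
        paths, err := filepath.Glob(filepath.Join(mount, "inputs", "*.dat"))
        if err != nil {
            log.Fatal(err)
        }

        jobs := make(chan string)
        var wg sync.WaitGroup
        for i := 0; i < workers; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                for p := range jobs {
                    data, err := os.ReadFile(p) // each read fetches from GCS under the hood
                    if err != nil {
                        log.Printf("read %s: %v", p, err)
                        continue
                    }
                    fmt.Printf("%s: %d bytes\n", p, len(data))
                }
            }()
        }
        for _, p := range paths {
            jobs <- p
        }
        close(jobs)
        wg.Wait()
    }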

All the data after processing went straight back to Fermilab over that pipe. The output data size, though, was pretty small IIRC; I don't think we were ever much over 10 Gbps on output.

HTCondor was used to submit work from Fermilab onto GCE directly (the submission / schedd boxes were at Fermilab) and spun up Preemptible VMs. We used a mix of custom-32vCPU-64GB in us-central1-b and us-central1-c, as well as custom-16vCPU-32GB in us-central1-a and us-central1-f. You can see a graph of Fermilab's monitoring here [4] from when it was all set up. 160k vCPUs for $1400/hr!
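
For scale, that's $1,400 / 160,000 ≈ $0.0088 per vCPU-hour, roughly in line with Preemptible VM pricing at the time.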


[1] https://cloudplatform.googleblog.com/2016/11/Google-Cloud-HE...

[2] https://cloud.google.com/custom-machine-types/

[3] https://github.com/GoogleCloudPlatform/gcsfuse

[4] https://twitter.com/googlecloud/status/798293201681457154