What does HackerNews think of goofys?
a high-performance, POSIX-ish Amazon S3 file system written in Go
And some cool github links:
- https://github.com/kahing/goofys -- a high-performance, POSIX-ish Amazon S3 file system written in Go
- https://github.com/maxogden/mount-url -- mount a http file as if it was a local file using fuse
IMHO, if you're going to do this, I'd recommend not doing this in Postgres itself, but rather doing it at the filesystem level. It's effectively just a tiered-storage read-through cache, and filesystems have those all figured out already.
You know how pgBackRest does "partial restore" (https://pgbackrest.org/user-guide.html#restore/option-db-inc...), by making all the heap files seem to be there, but actually the ones you don't need are sparse files ftruncate(2)'d to the right length to make PG happy? And that this works because PG only cares about DB objects it's not actively querying insofar as making sure they're there under readdir(2) with the expected metadata?
Well, an object-storage FUSE filesystem, e.g. https://github.com/kahing/goofys, would make PG just as happy, because PG could see all the right files as "being there" under readdir(2), even though the files aren't really "there", and PG would block on the first open(2) of each file while goofys fetched the actual object to back the file.
(IIRC PG might open(2) all its files once on startup, just to ensure it can; you can hack around this by modding the origin-object-storage filesystem library to not eagerly "push down" its open(2)s into object fetches, instead just returning a file descriptor connected to a lazy promise for the object, and then have read(2) and write(2) thunk that lazy promise, such that the first real IO done against the virtual file is what ends up blocking to fetch the object.)
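To make the "lazy promise" idea concrete, here is a minimal Python sketch. It is not goofys code: `LazyObjectFile`, `lazy_open`, and the `fetch` callback are hypothetical names made up for illustration, and a real FUSE filesystem would wire this into its open and read handlers instead.

```python
import threading
from typing import Callable, Optional


class LazyObjectFile:
    """File-like handle whose backing object is fetched on first I/O, not on open."""

    def __init__(self, fetch: Callable[[], bytes]):
        self._fetch = fetch  # hypothetical callback that downloads the object
        self._data: Optional[bytes] = None
        self._lock = threading.Lock()
        self._pos = 0
        self.fetch_count = 0  # how many times the origin was actually hit

    def _force(self) -> bytes:
        # Thunk the promise at most once, even with concurrent readers.
        with self._lock:
            if self._data is None:
                self._data = self._fetch()
                self.fetch_count += 1
            return self._data

    def read(self, size: int = -1) -> bytes:
        data = self._force()  # the first real read is what blocks on the fetch
        end = len(data) if size < 0 else min(len(data), self._pos + size)
        chunk = data[self._pos:end]
        self._pos = end
        return chunk


def lazy_open(fetch: Callable[[], bytes]) -> LazyObjectFile:
    # "Opening" only records how to get the object; nothing is downloaded yet.
    return LazyObjectFile(fetch)
```

With this shape, a startup pass that opens every file and closes it again never touches object storage; only files that are actually read pay the fetch cost.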
So you could just make your pg_base dir into an overlayfs mountpoint for:
• top layer: tmpfs (only necessary if you don't give temp tables their own tablespace)
• middle layer: https://github.com/kahing/catfs
• bottom layer: goofys mount of the shared heap-file origin-storage bucket
Note that catfs here does better than just "fetching objects and holding onto them" — it does LRU cache eviction of origin objects when your disk gets full!
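As a rough illustration of that kind of eviction (not how catfs is actually implemented, which I haven't reproduced here), this is a small in-memory Python sketch of an LRU cache bounded by total bytes. `LRUObjectCache` and the `fetch` callback are hypothetical names; a real disk cache would track files on disk rather than byte strings.

```python
from collections import OrderedDict


class LRUObjectCache:
    """Read-through cache that evicts least recently used objects by size."""

    def __init__(self, capacity_bytes: int):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._entries = OrderedDict()  # key -> bytes, least recently used first

    def get(self, key, fetch):
        if key in self._entries:
            # Cache hit: mark as most recently used.
            self._entries.move_to_end(key)
            return self._entries[key]
        # Cache miss: read through to the origin (hypothetical fetch callback).
        data = fetch(key)
        self._put(key, data)
        return data

    def _put(self, key, data):
        self._entries[key] = data
        self.used_bytes += len(data)
        # Evict oldest entries until within budget, always keeping the newest.
        while self.used_bytes > self.capacity_bytes and len(self._entries) > 1:
            _, old = self._entries.popitem(last=False)
            self.used_bytes -= len(old)

    def keys(self):
        return list(self._entries)
```

An evicted object is not lost; the next `get` for it simply misses and goes back to the origin, which is the behavior you want for a cache sitting in front of a bucket.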
(Of course, this setup doesn't allow writes to the tables held under it. So maybe don't make this your default tablespace, but instead a secondary tablespace that "closed" partitions live in, while "open" partitions live in a node-local tablespace, with something like pg_partman creating new hourly tables, and then pg_cron running a session to note down the old ones and do a VACUUM FREEZE ?; ALTER TABLE ? SET TABLESPACE ?; on them to shove them into the secondary tablespace — which will write-through the catfs cache, pushing them down into object storage.)
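A sketch of the statement pair such a scheduled job might issue per closed partition, written as a small Python helper. The function names, the partition name, and the tablespace name are all made up for illustration, and the helper only builds SQL strings; running them (outside a transaction block, since VACUUM can't run inside one) is left to whatever client you use.

```python
from typing import List


def quote_ident(name: str) -> str:
    # Quote a single SQL identifier, doubling any embedded double quotes.
    return '"' + name.replace('"', '""') + '"'


def archive_statements(table: str, tablespace: str) -> List[str]:
    """Build the freeze-then-move statements for one closed partition.

    Takes a bare (not schema-qualified) table name.
    """
    t = quote_ident(table)
    return [
        f"VACUUM FREEZE {t};",
        f"ALTER TABLE {t} SET TABLESPACE {quote_ident(tablespace)};",
    ]


if __name__ == "__main__":
    # Hypothetical partition and tablespace names.
    for stmt in archive_statements("events_p2024_01_01_00", "s3_archive"):
        print(stmt)
```

Note that ALTER TABLE ... SET TABLESPACE moves only the table itself; any indexes would need their own ALTER INDEX ... SET TABLESPACE if you want them in the secondary tablespace too.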
As author of https://github.com/kahing/goofys/ I respectfully disagree :-)
Disclaimer: I am the author
catfs - https://github.com/kahing/catfs/ - generic disk cache for fuse filesystems
Compare to catfs (https://github.com/kahing/catfs/), which I recently posted but which did not make it to the front page; right now it's at 14 stars. I would say both projects have similar audiences and are comparable in complexity, which would mean the front page on HN gave goofys a 20x or so boost in terms of GitHub stars.
Note that the first time I posted goofys it did not make it to front page. @dang emailed me to re-post it and the second time it was boosted to front page.
I've been spending what free time I have on this. It started out as a curiosity project to learn Go and to prove that a useful and good s3fs-like project can be done relatively quickly. These days it's used by everyone from companies moving PBs of storage into S3 to research labs trying to analyze RNA sequences with 100s of machines.
A couple things I hope to get done this year:
* a reasonably easy way to use it in conjunction with Docker
* a reasonably easy way to expose this over NFS/CIFS (for devices/OSes that don't support fuse)
* a reasonably easy way to do caching
A bigger vision is to build more things on top of relatively commoditized web services so free software can adapt to the 21st century without a large operating budget.