What does HackerNews think of seaweedfs?

SeaweedFS is a fast distributed storage system for blobs, objects, files, and data lakes, built for billions of files! The blob store has O(1) disk seeks and cloud tiering. The Filer supports Cloud Drive, cross-DC active-active replication, Kubernetes, POSIX FUSE mount, S3 API, S3 Gateway, Hadoop, WebDAV, encryption, and Erasure Coding.

Language: Go

#22 in Kubernetes
A very good alternative is SeaweedFS https://github.com/chrislusf/seaweedfs/ based on the Facebook Haystack paper (efficient small-file storage) & more.
I am in the process of migrating off of Rook Ceph after using it in production for two years. Setting it up is easy thanks to Rook, but wait until Ceph comes under load; then the real fun begins. If you only need object storage, I suggest looking into SeaweedFS[0]. It's a far more lightweight and performant solution.

[0]: https://github.com/chrislusf/seaweedfs/

Usually you would benchmark on the difficult tasks, not the easiest ones. For computers, batched IO operations are much faster than random IO and can easily saturate the network.

This benchmark uses a large batch size, 64 MB, to test. There is nothing new here; most common file systems can easily do the same.

The difficult task is reading and writing lots of small files; there is a term for it: LOSF. I work on SeaweedFS, https://github.com/chrislusf/seaweedfs , which is designed to handle LOSF. And of course, it has no problem with large files at all.

As a main contributor to an open source project (https://github.com/chrislusf/seaweedfs), I can confirm that this finding is so true.

However, it seems this research did not look into Apache projects, which maintain a different culture to encourage more contributors, so much so that main contributors are encouraged to refrain from jumping in to solve an issue until another person steps in first.

> That would make it impossible to link directly to a single storage server as a single file would also be distributed.

Check out the Reed-Solomon implementation in https://github.com/chrislusf/seaweedfs/ . Small files can still be served from one server.
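The reason small files still come from one server is that each file falls entirely inside a single data shard; the other shards are only consulted when a shard is lost. The sketch below uses a single XOR parity shard to keep the code short, whereas SeaweedFS actually uses Reed-Solomon coding (10 data + 4 parity shards); the principle of single-shard reads plus cooperative recovery is the same.

```go
package main

import "fmt"

// xorParity returns the byte-wise XOR of all shards
// (equal shard lengths assumed, as in real erasure coding).
func xorParity(shards [][]byte) []byte {
	p := make([]byte, len(shards[0]))
	for _, s := range shards {
		for i, b := range s {
			p[i] ^= b
		}
	}
	return p
}

// recoverShard rebuilds the one lost shard from the survivors and
// the parity: XOR-ing everything that remains yields the missing piece.
func recoverShard(surviving [][]byte, parity []byte) []byte {
	all := append([][]byte{parity}, surviving...)
	return xorParity(all)
}

func main() {
	// Three data shards of a tiny "volume"; a small file sits
	// wholly inside shard 0, so a normal read touches one server.
	shards := [][]byte{
		[]byte("file"),
		[]byte("abcd"),
		[]byte("wxyz"),
	}
	parity := xorParity(shards)

	fmt.Println(string(shards[0])) // normal read: one shard, one server

	// Shard 0's server dies: rebuild it from shards 1, 2 and parity.
	rebuilt := recoverShard([][]byte{shards[1], shards[2]}, parity)
	fmt.Println(string(rebuilt))
}
```

A single XOR parity only survives one lost shard; Reed-Solomon generalizes this so any 4 of the 14 shards can be lost, at the cost of Galois-field arithmetic instead of plain XOR.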

It's also efficient for small files, which an image store requires.

The MinIO team cares about an issue only if you are a paying customer, not if you use the open-source version. Indeed, MinIO is not even fully S3 compatible in many edge cases, and they close the related issues by saying it's not a priority.

You might want to look at other options as well, like SeaweedFS [0], a POSIX-compliant, S3-compatible distributed file system.

[0] https://github.com/chrislusf/seaweedfs

I don't know when this was written, but MinIO does not have a great story (or really any story) around horizontal scalability. Yes, you can set it up in "distributed mode", but that is completely fixed at setup time and requires a certain number of nodes right from the beginning.

For anyone who wants HA and horizontal elastic scalability, check out SeaweedFS instead; it is based on the Facebook "Haystack" paper: https://github.com/chrislusf/seaweedfs

(I work on SeaweedFS) How about using SeaweedFS? https://github.com/chrislusf/seaweedfs

With your own dedicated server, the latency is consistent and there is no API/network cost. Extra data can be tiered to S3.

SeaweedFS as a Key-Large-Value store https://github.com/chrislusf/seaweedfs/wiki/Filer-as-a-Key-L...

Cloud Tiering https://github.com/chrislusf/seaweedfs/wiki/Cloud-Tier

I am working on SeaweedFS. It was originally designed to store images, following the Facebook Haystack paper, and should be ideal for your use case. See https://github.com/chrislusf/seaweedfs

And it already supports the S3 API, plus plain HTTP, FUSE, WebDAV, Hadoop, etc.

There should be many existing hardware options that are much cheaper than AWS S3.

I am receiving $337 per month as of now for https://github.com/chrislusf/seaweedfs

Not something to be proud of once you include the time spent evolving the project.

Lowest I've seen is https://www.hetzner.com/dedicated-rootserver/sx62 at 1.6€/TB/month. Anyone seen lower?

You should be extra careful with big servers that have little bandwidth; you might need a month to fill/empty/rebalance them.
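That "a month" figure is easy to verify with back-of-the-envelope arithmetic: 1 Gbit/s moves at best ~125 MB/s, so a 40 TB box (a rough stand-in for a large storage server) needs days of sustained line-rate transfer, and real rebalances rarely run at line rate.

```go
package main

import "fmt"

// fillDays returns how many days it takes to move `bytes` of data
// over a link of `bitsPerSec`, assuming sustained line-rate transfer.
func fillDays(bytes, bitsPerSec float64) float64 {
	bytesPerSec := bitsPerSec / 8
	return bytes / bytesPerSec / 86400 // 86400 seconds per day
}

func main() {
	fmt.Printf("40 TB over 1 Gbit/s:   %.1f days\n", fillDays(40e12, 1e9))
	fmt.Printf("40 TB over 100 Mbit/s: %.0f days\n", fillDays(40e12, 100e6))
	// At 100 Mbit/s the 40 TB box indeed takes on the order of a month.
}
```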

How will you host the images? Metadata will become a bottleneck before HDD size does.

Check out https://github.com/chrislusf/seaweedfs
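The metadata-bottleneck point is worth quantifying. A conventional filesystem spends hundreds of bytes of inode/dentry metadata (and a disk seek) per file, while a Haystack-style store like SeaweedFS keeps roughly 16 bytes per file in a volume server's in-memory index. The per-file figures below are rough illustrative estimates, not measured values.

```go
package main

import "fmt"

// metadataGB estimates total metadata size in GB for a given file
// count and assumed per-file metadata cost.
func metadataGB(files, bytesPerFile float64) float64 {
	return files * bytesPerFile / 1e9
}

func main() {
	const files = 1e9 // one billion images
	// Classic filesystem: assumed ~300 B of inode/dentry state per file.
	fmt.Printf("classic FS   (~300 B/file): %.0f GB of metadata\n", metadataGB(files, 300))
	// Haystack-style needle index: ~16 B per file, held in RAM.
	fmt.Printf("needle index (~16 B/file):  %.0f GB in RAM\n", metadataGB(files, 16))
}
```

At a billion files the needle index still fits in the RAM of one beefy server, while the classic filesystem is already paying a disk seek per metadata lookup; HDD capacity runs out much later than that.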

This is the "quick installation": http://docs.ceph.com/docs/master/start/

It really shouldn't be this complex. I would love to be able to just launch an executable with a simple config file and be done with it. SeaweedFS shines a light on how this could be improved: https://github.com/chrislusf/seaweedfs

SeaweedFS Distributed File System

https://github.com/chrislusf/seaweedfs

Seaweed-FS is a simple and highly scalable distributed file system. There are two objectives: to store billions of files, and to serve the files fast!