What does HackerNews think of zfs-auto-snapshot?

ZFS Automatic Snapshot Service for Linux

Language: Shell

For every ZFS fan, I can recommend zfs-auto-snapshot[1]. I use it on my Proxmox server[2] to manage snapshots automatically, including throwing away old ones.

[1]: https://github.com/zfsonlinux/zfs-auto-snapshot

[2]: https://pilabor.com/series/proxmox/restore-virtual-machine-v...

Not every operation. It's manual but easily automated. https://github.com/zfsonlinux/zfs-auto-snapshot
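
For anyone wondering what "manual" looks like in practice, a snapshot and its cleanup are each a single command; the linked script just runs commands like these from cron on a schedule. Pool and dataset names here are placeholders:

```
# Take a snapshot of a dataset (names are hypothetical)
zfs snapshot tank/home@2020-05-01
# List existing snapshots
zfs list -t snapshot -r tank/home
# Throw away an old one
zfs destroy tank/home@2020-04-01
```
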
(Near) zero-cost snapshots and filesystem-based incremental backups are amazing. Just today I was saved by my auto snapshots [1]: apparently I didn't `git add` a file on my feature branch, and without the snapshot I wouldn't have been able to recover it after some extensive resetting and cleaning before switching back to that branch. It's really comforting to have this easy-to-access [2] safety net available at all times.

Now that Ubuntu has ZFS built in by default, I'm seriously considering switching back, and since I too have been burned by Btrfs, I guess I'll stay with ZFS for quite some time. Still, the criticism in the blog post is fair; e.g. I was only able to get the RAM usage under control after I set hard lower and upper limits for the ARC as kernel boot parameters (`zfs.zfs_arc_max=1073741824 zfs.zfs_arc_min=536870912`).

[1] https://github.com/zfsonlinux/zfs-auto-snapshot

[2] The coolest feature is the automatic virtual mount that lets you access snapshots through the magical `.zfs` directory at the root of your filesystem.
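
As a rough sketch of the recovery workflow described above (dataset, mountpoint, and snapshot names here are made up):

```
# Make the .zfs directory visible if it isn't already (it is reachable either way)
zfs set snapdir=visible tank/home
# Browse a read-only copy of the dataset as it was at snapshot time
ls /home/.zfs/snapshot/zfs-auto-snap_hourly-2021-03-01-1200/projects/feature/
# Copy the file that never made it into git back into the live filesystem
cp /home/.zfs/snapshot/zfs-auto-snap_hourly-2021-03-01-1200/projects/feature/lost.c \
   /home/projects/feature/
```
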

I'm using zfs-auto-snapshot[1] and zfs send/recv to backup my work desktop machine to a remote server. The naive approach of just doing 'zfs send -R -I snap1 snap2 | ssh remote zfs recv -dF pool'[2] has a number of drawbacks:

* Assumes remote is available 100% of the time. Recovery from downtime at the remote probably requires some manual intervention to get things back in sync.

* If the remote is in a different physical location with limited bandwidth between the hosts, compressing the stream on the fly isn't going to use that bandwidth particularly efficiently.

I've built some scripts to help[3], dumping the necessary incremental streams into a local directory and then compressing them with lrzip[4]. This decouples the host and remote systems, and the ZFS streams compress really well: a 33MB incremental stream I have here compresses to 4.4MB with lrzip. Once you have a directory of compressed streams you can push them wherever you want (a remote server where you convert them back into ZFS snapshots, giving you a live filesystem; S3 buckets; etc.). You are also able to restore using standard operating system tools.
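
A minimal sketch of the idea, assuming a pool named tank, a dataset tank/work, and two zfs-auto-snapshot snapshots (the actual scripts in [3] handle the bookkeeping):

```
# Dump the incremental stream between two snapshots into a local file
zfs send -I tank/work@zfs-auto-snap_daily-2015-01-01 \
            tank/work@zfs-auto-snap_daily-2015-01-02 \
            > /backup/work-0101-0102.zfs
# Compress it offline with lrzip (writes /backup/work-0101-0102.zfs.lrz)
lrzip /backup/work-0101-0102.zfs
# ...ship the .lrz wherever you like; to restore on the far side:
lrunzip /backup/work-0101-0102.zfs.lrz
zfs receive -F tank/work < /backup/work-0101-0102.zfs
```
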

I'd assume btrfs is comparable, but haven't tried it myself.

[1]: https://github.com/zfsonlinux/zfs-auto-snapshot

[2]: see https://github.com/adaugherity/zfs-backup for example

[3]: currently in my local fork of zfstools at https://github.com/mhw/zfstools/blob/master/bin/zfs-save-sna...

[4]: http://ck.kolivas.org/apps/lrzip/README

I have been running ZFS on Linux in production on many storage servers for almost 3 years now. I'd like to address several points from the forum post:

If you want ZFS you need lots of expensive ECC RAM. The fact that somebody is running ZFS in her/his laptop on 2GB of RAM will be of no importance to you when that 16TB storage pool fails during the ZFS scrub or de-duplication because you have insufficient RAM or the RAM is of low quality.

First, I doubt many have 16TB of disk with 2GB of RAM in their laptop.

Second, while it's true that ZFS will use all the RAM it can, this is a good thing, not a bad thing. The more you can cache in fast RAM, the less you have to rely on slow disk. I've never understood why people think this is a bad thing. The ZFS ARC is also highly configurable, and the system administrator can limit how much RAM it uses by setting hard limits.
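
On ZFS on Linux those hard limits are the zfs_arc_max and zfs_arc_min module parameters; the values below (1GB/512MB, matching the earlier comment) are only an example:

```
# Persistent: set module options so they apply at every boot
cat >> /etc/modprobe.d/zfs.conf <<'EOF'
options zfs zfs_arc_max=1073741824
options zfs zfs_arc_min=536870912
EOF

# At runtime, the same parameters can be poked through sysfs
echo 1073741824 > /sys/module/zfs/parameters/zfs_arc_max

# Verify the current ARC size and limits
grep -E '^c(_min|_max)? ' /proc/spl/kstat/zfs/arcstats
```
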

However, the claim that "you need lots of RAM" is bogus. You only need "lots of RAM" when you enable deduplication, as the deduplicated blocks are tracked in a table (the DDT), which is stored in the ZFS ARC, which in turn lives in RAM. If you do not have enough RAM to hold the DDT safely (a good rule of thumb is "5GB of RAM for every 1TB of disk"), then the DDT will spill to platter and degrade overall pool performance. However, if you have a fast SSD as an L2ARC, say a PCI Express SSD, then the performance impact of DDT lookups is minimized.
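
If you're weighing deduplication, zdb can estimate or inspect the DDT before and after flipping the switch, and adding an L2ARC device is a one-liner; pool name and device below are placeholders:

```
# Simulate deduplication on existing data and print the expected dedup ratio
zdb -S tank
# Show DDT statistics for a pool that already has dedup enabled
zdb -D tank
# Add a fast SSD as an L2ARC cache device to absorb DDT spill-over
zpool add tank cache /dev/nvme0n1
```
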

If you do not enable deduplication, then you do not need "lots of RAM". I am running ZFS on my 12GB workstation for /home/ without any problems, and it was initially installed with only 6GB. The upgrade was at the request of our CTO, who wanted to upgrade all the admin workstations even though we didn't need it. Of course I'll take advantage of it and adjust my ARC limits, but I didn't need it.

Lastly, ZFS doesn't require ECC RAM any more than HAMMER or ext4 does. ECC is certainly a welcome advantage, as ZFS blindly trusts that what comes out of RAM is correct. But the argument for ECC applies to any situation where data integrity must be exact, 100% predictable, and redundant. ZFS isn't any more special here than any other filesystem as far as RAM is concerned. In reality, every system should have ECC RAM, even laptops.

Well known data base degradation (not suitable to keep SQL data bases).

I have not heard of this. ZFS has very rigid, strict synchronous operation. Data queued to be written is grouped into a transaction group (TXG) and flushed synchronously to the ZFS intent log (ZIL). Once the ZIL has the TXG, ZFS sends back the acknowledgement that the data is on stable storage, and the TXG is flushed to the pool in the background (by default every 5 seconds). This is the default behavior. When ZFS says it has your data, it has your data.
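
The relevant knobs, for reference; dataset and device names are hypothetical, and the defaults shown are the ZFS on Linux defaults:

```
# Per-dataset sync policy: standard (default), always, or disabled
zfs get sync tank/db
zfs set sync=standard tank/db
# TXG flush interval in seconds (default 5 on ZFS on Linux)
cat /sys/module/zfs/parameters/zfs_txg_timeout
# Databases with heavy synchronous writes usually want the ZIL on a fast SLOG
zpool add tank log mirror /dev/nvme0n1 /dev/nvme1n1
```
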

Now, if the concern is performance, ZFS does fragment over time due to its copy-on-write (COW) design. This is a problem for any COW filesystem, including Btrfs. ZFS minimizes the impact through its slab allocator. More details can be found on the author's blog: https://blogs.oracle.com/bonwick/entry/zfs_block_allocation
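
On reasonably recent OpenZFS releases you can watch free-space fragmentation directly; the pool name is a placeholder:

```
# Fragmentation of the free space, exposed as a pool property
zpool get fragmentation tank
# Or per-VDEV, alongside capacity
zpool list -v tank
```
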

No fine grained journaling like hammer history.

I'm not sure what the OP means here, but if it's related to snapshots, then it's false. Granted, snapshots are not taken by default when you create a ZFS dataset; you must run "zfs snapshot" yourself, typically from cron. But ZFS snapshots are cheap, first-class filesystems that can be navigated to restore a specific file, or the entire dataset can be rolled back to a snapshot.
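
For anyone who hasn't used them, the whole lifecycle is a handful of commands (pool, dataset, and snapshot names are made up):

```
# Take a recursive snapshot of a dataset and all its children
zfs snapshot -r tank/home@before-upgrade
# Pull a single file out of it without touching the live data
cp /tank/home/.zfs/snapshot/before-upgrade/me/notes.txt /tank/home/me/
# Or roll the entire dataset back to that point in time
zfs rollback tank/home@before-upgrade
# Destroy it once it is no longer needed
zfs destroy -r tank/home@before-upgrade
```
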

Sun had Time Slider, which was baked into the operating system by default and let you roll back a ZFS dataset easily. Even though the OpenZFS team doesn't have Time Slider, we do have https://github.com/zfsonlinux/zfs-auto-snapshot which mimics its behavior. I set this up on all of my storage servers with the following policy (a cron sketch follows the list):

Snapshot every

* 15 minutes, keeping the most recent 4

* 1 hour, keeping the most recent 24

* 1 day, keeping the most recent 30

* 1 week, keeping the most recent 8

* 1 month, keeping the most recent 12

This seems pretty fine grained to me.
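
With zfs-auto-snapshot, this policy boils down to cron entries along these lines (the keep counts are the ones above; `//` tells the script to snapshot every dataset that hasn't opted out via the com.sun:auto-snapshot property):

```
# /etc/cron.d/zfs-auto-snapshot: every 15 minutes, keep 4
*/15 * * * * root zfs-auto-snapshot --quiet --syslog --label=frequent --keep=4 //

# /etc/cron.hourly, cron.daily, cron.weekly, cron.monthly wrappers
zfs-auto-snapshot --quiet --syslog --label=hourly  --keep=24 //
zfs-auto-snapshot --quiet --syslog --label=daily   --keep=30 //
zfs-auto-snapshot --quiet --syslog --label=weekly  --keep=8  //
zfs-auto-snapshot --quiet --syslog --label=monthly --keep=12 //
```
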

No volume growing at least FreeBSD version.

I'm not sure what the OP means here. Does he mean growing the pool, a dataset, or a ZVOL? In the case of growing the pool: as hard drives are replaced with larger-capacity drives, once all the drives have been replaced, the pool can automatically expand to the larger capacity. "zpool set autoexpand=on pool" must be executed before replacing the drives.
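
The whole-pool case looks roughly like this (pool and device names are placeholders; every member of the VDEV has to be replaced before the extra space shows up):

```
zpool set autoexpand=on tank
zpool replace tank sda sdc        # swap in a bigger disk, wait for resilver
zpool status tank                 # confirm the resilver has finished
zpool online -e tank sdc          # force expansion if autoexpand was left off
zpool list tank                   # SIZE now reflects the larger drives
```
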

If the OP means growing datasets, then I guess we're talking about quotas, as datasets by default can use the entire size of the pool. By default, if I have 2TB of ZFS pool space and 100 datasets, each of the 100 datasets can grow into the full 2TB of space. As the datasets slowly fill, each dataset is aware of the storage occupied by the others, so it knows how much free space remains in the pool.

If quotas are needed, then "zfs set quota=100G pool/data1" limits that dataset to 100GB. If at any time you wish to expand or shrink the quota, just re-run the command with a new value.
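
For completeness, the quota round trip (dataset name as in the example above):

```
zfs set quota=100G pool/data1          # cap the dataset at 100GB
zfs get quota,used,available pool/data1
zfs set quota=250G pool/data1          # grow the cap later
zfs set quota=none pool/data1          # or drop it entirely
```
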

Finally, if ZVOLs are the concern: they can be resized after creation. Suppose a ZVOL was created with "zfs create -V 1G pool/swap" to be used as a 1GB swap device, and more swap is needed. Then "zfs set volsize=2G pool/swap" grows it to 2GB, and this can be done live.
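
For a swap ZVOL specifically, the resize has a couple of extra steps around it on Linux; the device paths assume the usual /dev/zvol layout:

```
swapoff /dev/zvol/pool/swap        # stop using the device
zfs set volsize=2G pool/swap       # grow the ZVOL from 1GB to 2GB
mkswap /dev/zvol/pool/swap         # rewrite the swap signature for the new size
swapon /dev/zvol/pool/swap         # and put it back into service
```
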

The only thing I can think the OP means is adding and removing VDEVs. This is tricky, and he's right that there are some limitations here. You can grow a pool by adding more VDEVs. Helping in #zfs and #zfsonlinux on Freenode, as well as on Twitter and the mailing lists, I have seen some funky ZFS pools. Usually the administrator needs more space, so a couple more disks are added, mirrored, and thrown into the existing pool without much concern for balancing. While you can add VDEVs all day long, once added you cannot remove them. At that point you must tear down the datasets and the pool and rebuild it with the layout you actually want/need.
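
Growing by VDEV is one command, and the asymmetry is easy to demonstrate (pool and devices are hypothetical; this reflects ZFS as it was when this comment was written, before top-level device removal landed in later OpenZFS releases):

```
# Adding a mirrored VDEV grows the pool immediately
zpool add tank mirror /dev/sdc /dev/sdd
zpool status tank
# Log and cache devices can be taken out again...
zpool remove tank /dev/nvme0n1
# ...but removing a data VDEV was not supported, hence "tear down and rebuild"
```
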

But, in terms of growing, I'm not sure what the OP is referring to.

Legally associated with Oracle.

Maybe. There are basically two ZFS implementations out there: one that Oracle has legal rights to, and one they don't. Oracle's ZFS is proprietary software and is the ZFS shipped with Oracle Solaris; it is currently at ZFS version 6 and pool version 35. The version used by FreeBSD, PC-BSD, GNU/Linux, Illumos, and the rest of the open-source operating systems forked from Sun's ZFS at pool version 28 and ZFS version 5. Under the CDDL it is legal to fork the software, and Oracle cannot make copyright claims against the fork, as they do not own the copyright on the forked code.

Oracle may have some software patents covering ZFS (I haven't looked) that could be troubling, but Sun released ZFS to the community with the intent that the community use it and make it better. If software patents on ZFS exist, my guess is they are royalty-free. Again, I haven't looked this up, and it really doesn't bother me.

Native encryption for ZFS has been only available in Oracle version.

While true, this really hasn't been a problem. Most storage administrators who want encrypted ZFS datasets either put encryption under ZFS, such as LUKS, or on top of the dataset, like eCryptfs. Also, the OpenZFS codebase has "feature flags", which set up the framework for things like native encryption, and last I checked it was actively being worked on. Initially, native encryption was not a priority, mostly out of fear of losing backwards compatibility with the proprietary Oracle ZFS. I think that fear has become a reality, not because of anything the OpenZFS developers have done, but because Oracle is pushing forward with their proprietary ZFS without any thought of staying compatible with OpenZFS. In any event, this point is moot, and any storage administrator who needs or wants encryption with OpenZFS knows how to set it up with a separate utility.
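
The "encryption under ZFS" option is just stacking the pool on dm-crypt/LUKS; a minimal sketch with made-up device and pool names:

```
cryptsetup luksFormat /dev/sdb            # set up the encrypted container
cryptsetup open /dev/sdb cryptdisk        # unlock it as /dev/mapper/cryptdisk
zpool create tank /dev/mapper/cryptdisk   # and build the pool on top
```
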

Upstream is dead. I don't know much about you but OpenZFS doesn't inspire lots of confidence in me.

This is news to me. OpenZFS is very actively developed. There is TRIM support for SSDs, and code is actively shared between the FreeBSD and GNU/Linux teams. There are OpenZFS developers from Illumos, SmartOS, and OmniOS, developers on the LLNL contract, and plenty from the community. So, I guess "[citation needed]".