What does HackerNews think of httm?

Interactive, file-level Time Machine-like tool for ZFS/btrfs

Language: Rust

As others have noted, these are really opinionated suggestions. And while it's perfectly fine to have an opinion, many of these range from "this isn't the way I'm used to Linux doing it" to the actually objectionable.

The ones I find most personally objectionable:

> - Don't make me name pools or snapshots. Assign pools the name {hostname}-[A-Z]. Name snapshots {pool name}_{datetime created} and give them numerical shortcuts so I never have to type that all out

Not naming pools is just bonkers. You don't create pools often enough for naming them to be a burden.

Re: not naming snapshots, you could use `httm` and `zfs allow` for that[0]:

    $ httm -S .
    httm took a snapshot named: rpool/ROOT/ubuntu_tiebek@snap_2022-12-14-12:31:41_httmSnapFileMount
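
And a sketch of the `zfs allow` half (the username `alice` and pool name `rpool` are assumptions here; the linked guide[0] has the real details):

    $ sudo zfs allow -u alice mount,snapshot rpool
    $ zfs allow rpool    # review the delegated permissions
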
> - collapse `zpool` and `zfs` into a single command

`zfs` and `zpool` are just immaculate Unix commands, each with half a dozen subcommands. One of the smartest decisions the ZFS designers made was not giving you a more complicated single administrative command.

> - Provide an obvious way to mount and navigate a snapshot dataset instead of hiding the snapshot filesystem in a hidden directory

Again -- you can do this very easily via `zfs mount`, but you'll have to trust me that a stable virtual interface also makes it very easy to search for all file versions, something which is much more difficult to achieve with btrfs, et al. See again `httm` [1].
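
For reference, the hidden directory in question is ZFS's `.zfs/snapshot` virtual interface; a quick sketch (mountpoint assumed, snapshot name borrowed from the example above):

    $ cd /srv/data/.zfs/snapshot/snap_2022-12-14-12:31:41_httmSnapFileMount
    $ ls    # a read-only view of the dataset as of that snapshot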

[0]: https://kimono-koans.github.io/opinionated-guide/#dynamic-sn... [1]: https://github.com/kimono-koans/httm

Outside of manually setting `ashift=12` on each vdev creation (because disks lie and ZFS's whitelist isn't omniscient; I've seen too many people burned by this) and `atime=off` for datasets (because burning IO updating metadata just because you've accessed something is fucking stupid), the defaults are sane and you can basically refuse to care about them after that.
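
In command form, that amounts to something like this sketch (pool name `tank` and the devices are assumptions):

    # set ashift at vdev creation; it cannot be changed afterwards
    $ sudo zpool create -o ashift=12 tank mirror /dev/sda /dev/sdb
    # stop burning IO on access-time updates
    $ sudo zfs set atime=off tank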

Every system I've used has `compression=on` set as the default, which currently means lz4. People set it manually out of paranoia left over from earlier days, I think.

For Linux systems you can set `xattr=sa` and `acltype=posixacl` if you like, which offers a minor optimization that you'll never notice.
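
For instance, a sketch with an assumed dataset name (`zfs set` accepts several properties in one go):

    $ zfs get compression tank      # confirm the default/inherited value
    $ sudo zfs set xattr=sa acltype=posixacl tank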

I suppose if you don't like how much memory ZFS uses for ARC, you can reduce it. For desktop use, 2-4 GB is plenty. For heavier active storage use, like working with big files or a slow HDD-filled NAS, 8 GB+ is a better amount.
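
On Linux this is the `zfs_arc_max` module parameter; a sketch capping ARC at 4 GiB (the value is an assumption for a desktop):

    # /etc/modprobe.d/zfs.conf
    options zfs zfs_arc_max=4294967296

    # or apply immediately without a reboot
    $ echo 4294967296 | sudo tee /sys/module/zfs/parameters/zfs_arc_max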

Dataset recordsize can be set as well, but that's really something for nerds like me who have huge static archives (set it to 1-4 MiB) or virtual machines (match qcow2's 64 KiB cluster size). The default recordsize works well enough for everything else; unless you have a particular concern, you don't need to care.
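
Sketched with assumed dataset names (note that recordsize only affects newly written files):

    $ sudo zfs set recordsize=1M tank/archive   # huge static files
    $ sudo zfs set recordsize=64K tank/vm       # match qcow2's cluster size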

I should note: beware of rolling-release Linux distros and ZFS. The Linux kernel breaks compatibility nonstop, and sometimes it can take a while for ZFS to catch up. This means your distro can update to a new kernel version, and suddenly you can't load your filesystem. ZFSBootMenu is probably the best way to navigate that; it makes rolling back easy.

You also want to set up automatic snapshots, snapshot pruning, and sending of snapshots to a backup machine. I highly recommend https://github.com/jimsalterjrs/sanoid
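
A minimal `/etc/sanoid/sanoid.conf` sketch to get started (the dataset name is an assumption; sanoid's companion tool `syncoid` handles the sending-to-a-backup-machine part):

    [tank/home]
        use_template = production

    [template_production]
        frequently = 0
        hourly = 36
        daily = 30
        monthly = 3
        autosnap = yes
        autoprune = yes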

If you really find yourself wanting to gracefully deal with rollbacks and differences in a project, HotTubTimeMachine (HTTM) is nice to be aware of: https://github.com/kimono-koans/httm

If you haven't already seen it, for restoring files/browsing different snapshots etc, httm is great: https://github.com/kimono-koans/httm
> How cool would it be if we had a great GUI for ZFS (snapshots, volume management, etc.).

How cool would it be if we had a great TUI for ZFS...

Live in the now: https://github.com/kimono-koans/httm

Excited, because in addition to ref copies/clones, httm will use this feature, if available (I've already done some work to implement it), for its `--roll-forward` operation and for faster file recoveries from snapshots [0].

As I understand it, there will be no need to copy any data from the same dataset, and this includes all snapshots. Blocks written to the live dataset can just be references to the underlying blocks, and no additional space will need to be used.

Imagine being able to continuously switch a file or a dataset back to a previous state extremely quickly without a heavyweight clone, or a rollback, etc.

Right now, httm simply diff-copies the blocks for file recovery and roll-forward. For further details, see the man page entry for `--roll-forward` and the link to the httm GitHub below:

    --roll-forward="snap_name"

    traditionally 'zfs rollback' is a destructive operation, whereas httm
    roll-forward is non-destructive.  httm will copy only the blocks and
    file metadata that have changed since a specified snapshot, from that
    snapshot, to its live dataset.  httm will also take two precautionary
    snapshots, one before and one after the copy.  Should the roll forward
    fail for any reason, httm will roll back to the pre-execution state.
    Note: This is a ZFS only option which requires super user privileges.
[0]: https://github.com/kimono-koans/httm
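
As a usage sketch, reusing the snapshot name from the example further up (yours will differ):

    $ sudo httm --roll-forward="rpool/ROOT/ubuntu_tiebek@snap_2022-12-14-12:31:41_httmSnapFileMount"
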
Really excited about this.

Once support hits in Linux, a little app of mine[0] will support block cloning for its "roll forward" operation, where all previous snapshots are preserved but a particular snapshot is rolled forward to the live dataset. Right now, data is simply diff-copied in chunks. When this support lands, there will be no need to copy any data: blocks written to the live dataset can just be references to the underlying snapshot blocks, and no extra space will need to be used.

[0]: https://github.com/kimono-koans/httm

If you decide to dive into it depending on your filesystem of choice I can recommend either Sanoid [0] and httm [1] for ZFS or Snapper [2] for BTRFS as automatic snapshot solutions. Good luck with your endeavors!

[0] https://github.com/jimsalterjrs/sanoid

[1] https://github.com/kimono-koans/httm

[2] https://wiki.archlinux.org/title/Snapper

> What is httm? I like this script as a proof of concept.

See: https://github.com/kimono-koans/httm

> But I still can imagine failure modes, e.g. inotify might start acting weird when ZFS remounts the watched directory, the OOM killer terminates it without anyone noticing, the bash loop goes haywire when the package manager updates the script (bash runs directly from the file, and when it changes during execution, bash might just continue running from the same byte offset in a completely different script).

I mean, sure, scripts gonna script. You're gonna have to make the POC work for you. But, for instance, I'm not sure half of your issues are problems with a systemd service. I'm not sure even one is a problem with a well-designed script, one which accounts for your particular issues, run as a systemd service.

> All these things actually happened to me in the past. Not to mention that if you have multiple datasets in ZFS, you cannot inotifywait on all of them at once, so you will have to manage one bash process per dataset. And the performance of bash and sudo might not be that awesome.

Yes, you can?

Just re this POC: you can inotifywait a single directory which contains multiple datasets, and httm will correctly determine and snapshot the correct one upon command. Your real bottleneck here is not sudo or bash. It's the zfs command waiting for a transaction group sync, or arranging for the transaction group to snap (or even something else, but it's definitely zfs?).
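
A sketch of that POC (the watched path is an assumption; requires inotify-tools):

    #!/bin/bash
    # snapshot the containing dataset whenever a watched file is written
    inotifywait --monitor --recursive --event close_write --format '%w%f' /srv/documents |
    while read -r file; do
        sudo httm -S "$file"
    done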

You can also use `httm -m` to simply identify the dataset and use a channel program and/or a separate script to sync. sudo and bash may not have the performance for your use case, but, hell, they are composable with everything else?

> So for real reliability you would probably want this to actualy run in ZFS/kernel context...

Yeesh, I'm not sure? Maybe for your/a few specific use cases? Note: inotify (a kernel facility) is your other bottleneck. You're never going to want to watch more than a few tens of thousands of files. The overhead is just going to be too great.

But for most use cases (your documents folder)? Give httm and inotifywait a shot.

As someone who is a total ZFS fan, I think the `zfs` and `zpool` commands are some of the best CLI commands ever made. Just immaculate. So this comment was a head-scratcher for me.

> I also don't really want to become a ZFS wizard

Admittedly, ZFS on Linux may require some additional work simply because it's not an upstream filesystem, but, once you're over that hump, ZFS feels like it lowers the mental burden of what to do with my filesystems?

I think the issue may be ZFS has some inherent new complexity that certain other filesystems don't have? But I'm not sure we can expect a paradigm shifting filesystem to work exactly like we've been used to, especially when it was originally developed on a different platform? It kinda sounds like you weren't used to a filesystem that does all these things? And may not have wanted any additional complexity?

And, I'd say, that happens to everyone? For example, I wanted to port an app I wrote for ZFS to btrfs[0]. At the time, it felt like such an unholy pain. With some distance, I see it was just a different way of doing things. Very few of the btrfs decisions with which I had intimate experience do I now look back on and say "That's just goofy!" It's more -- that's not the choice I would have made, in light of ZFS, etc., but it's not an absurd choice?

> "what's the procedure if your motherboard dies and you need to migrate your disks to a new machine?"

If your setup is anything like mine, I'm pretty certain you can just boot the root pool? Linux will take care of the rest? The reason you may not find an answer is that the answer is pretty much the same as for other filesystems?

If you have problems, rescue via a live CD[1]. Rescuing a ZFS root pool that won't boot is no-joke sysadmin work (redirect all the zpool mounts, mount --bind all the other junk, create a chroot env, do more magic...). For people, perhaps like you, who don't want the hassle, maybe it is easier elsewhere? But -- good luck!
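
To give a flavor of the magic, a hedged sketch from the live environment (pool name assumed; real rescues involve more steps):

    $ sudo zpool import -R /mnt rpool      # redirect all the zpool mounts
    $ sudo mount --bind /dev /mnt/dev      # bind all the other junk
    $ sudo mount --bind /proc /mnt/proc
    $ sudo mount --bind /sys /mnt/sys
    $ sudo chroot /mnt /bin/bash           # the chroot env; do more magic here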

[0]: https://github.com/kimono-koans/httm [1]: https://openzfs.github.io/openzfs-docs/Getting%20Started/Ubu...

> I think the only thing I would want is early warning for a disk failure.

The only other thing, you may be forgetting, is an app you never knew you needed.[0] ;)

[0, an interactive, file-level Time Machine-like tool for ZFS]: https://github.com/kimono-koans/httm

Not been my experience at all.

I have seen folks get in over their heads though. Like -- "OMG I've somehow broken boot for ZFS on root and I don't know how to fix!!" or "I'm using a custom kernel and OMG things don't work!!"

Canonical has even been complicit. When you download an HWE kernel, you don't also download a newer/matching version of the tools. This should pretty clearly be considered a bug[0] and fixed, but Canonical perhaps doesn't have the bandwidth?

I just can't lay any of these problems at the feet of ZFS, though. Right now I trust ZFS a hell of a lot more than I trust Linux to Do The Right Thing. When ZFS seems broken (though I have little experience with native encryption, so this may not apply to it), I am far more likely to think the issue is Linux, my distribution, or me than any problem with ZFS.

I'm a pretty serious ZFS lover though.[1]

[0]: https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/193... [1]: https://github.com/kimono-koans/httm

> I’m no expert but always saw Go as the better choice for CLI tooling where performance is important

I'm no expert in golang, so golang may in fact be better(?) by some metric of better, but as the author of a Rust CLI tool[0], I will say that Rust is extremely performant, and pretty fantastic at this very use case. It seems like a sweet spot to me.

[0]: https://github.com/kimono-koans/httm

Very cool. Wish you all the best.

The very day you need a self-taught dev (with sterling ZFS-lover bona fides -- https://github.com/kimono-koans/httm), please give me a ring.

I think you have to really love the idea of creating an app/script or hot key for a workflow to start to enjoy using 'fzf'. And for most, I get it, it's "Where to start? Ugh, looks like work."

For those who haven't tried it yet and want an entry point, I'd highly recommend you play around with the 'fzf' key bindings and completion scripts for zsh[0] to see what's possible. A little app of mine[1] also has an example of what one might call the minimal viable hot key script for you (note: for skim, or 'sk', a 'fzf' Rust clone).
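
As a sketch of such a hot key (assuming zsh and httm's interactive mode; the key choice is arbitrary):

    # zsh widget: browse file versions interactively with httm on Ctrl-T
    httm-widget() {
        httm -i "$PWD"
        zle reset-prompt
    }
    zle -N httm-widget
    bindkey '^T' httm-widget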

[0]: https://github.com/junegunn/fzf/tree/master/shell [1]: https://github.com/kimono-koans/httm

I have been really inspired by 'fzf' recently. I wrote a fun little ZFS utility[0] which I intended to just script with 'fzf', but I have since found skim[1], or 'sk', and now use it both as my sole fuzzy finder app (because it's supposed to be interactive, and it's faster!) and as a library for my little utility.

I like them both for making fun zsh key bindings, so, so easy.

[0]: https://github.com/kimono-koans/httm [1]: https://github.com/lotabout/skim