I've been working on this problem for a while. Website upkeep is hard to quantify, but basically every disk fails and every operating system eventually needs a serious upgrade. The timeframe that a system can run continuously is not that long compared to the timeframe that information is relevant. So the most lightweight way to keep something up and running is to make it trivial to port to many hosting configurations by simplifying the toolchain needed to rehost it. (Note that humans are part of that workflow, if it's a company)

I've written a manifesto about making a commitment to keep websites online and maintained for 10-30 years, for people who are maintaining web content: https://jeffhuang.com/designed_to_last/

And on the flipside (from a user's point of view), I've also been working on a background process that automatically captures full-resolution screenshots of every website you visit, creating your own searchable personal web archive: https://irchiver.com/

I've personally been trying to make a commitment to keep my web projects and writing online for 30 years. My original internal goal when I started thinking about this, was to outlast all the content on Twitter, Google+, and facebook.com. One of those has already been met, kind of sadly.

Hm interesting, I read over your 7 guidelines [1] and I would say I agree 50%.

> So the most lightweight way to keep something up and running is to make it trivial to port to many hosting configurations by simplifying the toolchain needed to rehost it

I agree with this, although I would use the words "standards" and multiple implementations. The linked article doesn't appear to emphasize this.

1. Return to vanilla HTML/CSS

Again I feel the relevant issues are "standards", multiple implementations, and the fallback option of "old code" like GNU coreutils (i.e. taking advantage of the Lindy Effect). Not just HTML/CSS (which certainly meets all of those criteria).

I thought about this when designing https://www.oilshell.org/ and the toolchain

- The site started in Markdown but is now standardized on CommonMark [1], which has multiple implementations. So I don't see any reason to restrict myself to plain HTML/CSS.

- The tools are written in Python and shell [2]. Both languages have multiple implementations. (Ironically, Oil is a second implementation of bash! My bash scripts run under both bash and Oil.)

- Python and shell rely on Unix, which is standardized (e.g. Linux and FreeBSD have a large common subset that can build my site).

This is perhaps unconventional, but I avoided languages that I don't know how to bootstrap, like node.js (complex JIT and build process), Go (doesn't use libc), or Rust (complex bootstrap).

On the other hand, C has multiple implementations, and Python and shell are written in C, and easy to compile. I'd say Lua also falls in this category, but node.js doesn't.

I feel like this is at least 60% of the way to making a website that lasts 30 years. The other 40% is the domain name and hosting.

This is a pretty big site now; you can tar a lot of it up and browse it locally (though it's true you may have to rewrite some self links).
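The self-link rewriting mentioned above can be sketched in a few lines of stdlib Python. This is a hypothetical illustration, not the site's actual tooling; the domain and replacement prefix are just examples, and a real version would also need to handle trailing slashes and fragment links.

```python
import re

# Hypothetical sketch: rewrite absolute self-links so a tarball of the site
# can be browsed locally. Domain and prefix are illustrative only.
def rewrite_self_links(html, domain="https://www.oilshell.org/", prefix="./"):
    return re.sub(re.escape(domain), prefix, html)

page = '<a href="https://www.oilshell.org/blog/">blog</a>'
print(rewrite_self_links(page))  # <a href="./blog/">blog</a>
```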

Of course, this is my solution as a technical user. If you're trying to solve this problem for the general public, then that's much harder.

[1] https://www.oilshell.org/blog/2018/02/14.html

[2] https://www.oilshell.org/site.html

2. Don't minimize that HTML

Mildly agree if only because it's useful to be able to read HTML to rewrite self links and so forth. Tools don't need it, but it's nice for humans.

3. Prefer one page over several

If you're allowed to use Python and shell to automate stuff, then this isn't an issue. I suppose another risk is that people besides me might not understand how to maintain my code. But I don't think it needs to be maintained -- I think it will run forever on the basis of Unix, shell, and Python. Those technologies have "crossed the chasm"; I think the jury is still out on the others.
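The kind of stdlib-only automation meant here can be sketched as follows. This is a made-up example, not the site's real generator: it builds an index page from a list of (title, url) pairs using nothing outside the standard library, so it runs on any Python you can build from a tarball.

```python
import html

# Hypothetical sketch: generate an index page from (title, url) pairs.
# No pip packages, only the stdlib.
def make_index(posts):
    items = "\n".join(
        '<li><a href="%s">%s</a></li>' % (html.escape(url), html.escape(title))
        for title, url in posts
    )
    return "<ul>\n%s\n</ul>" % items

print(make_index([("Hello", "hello.html"), ("Oil 0.9", "oil-0.9.html")]))
```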

4. End all forms of hotlinking

Yes, my site hosts all its own JS and CSS.

5. Stick with native fonts

Yes, custom fonts have a higher chance of being unreadable in the future. I just use the browser's font preference to avoid this issue. It's more future-proof and in keeping with the spirit of the web (semantic, not pixel-perfect).

6. Obsessively compress your images

Agree

7. Eliminate the broken URL risk

I think too few people are using commodity shared hosting ... I've been meaning to write some blog posts about that. I use Dreamhost but NearlyFreeSpeech is basically the same idea.

It's a Unix box and a web server that somebody else maintains. I absolutely don't care about CPUs, disk drives, even ipv4 vs. ipv6, and I've never had to.

The key point is writing to an interface and not an implementation. Commodity shared hosting is a de facto standard. The main difference between a tarball of HTML and a shared hosting site is that, say, "index.html" is always served for /, and a few other minor things.
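That "index.html for /" convention is so standard that even Python's stdlib server follows it. Here's a small self-contained sketch (my own illustration, not anything from the thread's sites) showing that serving a directory of static files needs nothing beyond the standard library:

```python
import http.server
import os
import tempfile
import threading
import urllib.request

# Sketch of the "shared hosting interface": map a directory to URLs,
# serve index.html for "/". SimpleHTTPRequestHandler does exactly this.
site_dir = tempfile.mkdtemp()
with open(os.path.join(site_dir, "index.html"), "w") as f:
    f.write("<h1>hello</h1>")

handler = lambda *a, **kw: http.server.SimpleHTTPRequestHandler(
    *a, directory=site_dir, **kw)
server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

url = "http://127.0.0.1:%d/" % server.server_address[1]
body = urllib.request.urlopen(url).read().decode()
print(body)  # <h1>hello</h1>
server.shutdown()
```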

So I expect Heroku and similar platforms to come and go, but the de facto standard of the shared hosting interface will stay forever. It's basically any Unix + any static web server; e.g. I think OpenBSD has a pretty good httpd that matches Apache/Nginx for all the relevant purposes.

GitHub Pages also qualifies.

So I guess this is adding sort of a programmer's slant to it. To be honest it took me a long time to be fluent enough in Python and shell to make a decent website :) Markdown/CommonMark definitely helps too. I had of course made HTML before this, but it was the odd page here and there. Making a whole site does require some automation, and I agree that lots of common/popular tools and hosting platforms will rot very quickly. (and like you I've seen that happen multiple times in my life!)

And I think what your guidelines might be missing is a guideline on how to make "progress". For example CommonMark being standardized is progress that has happened in the last 10 years. You don't want to be tied to the old forever. At some point you have to introduce new technologies, and the way to do that is once there's wide agreement on them and multiple implementations. (just like there's wide agreement on HTML)

I think there is / can be progress in dynamic sites too, so you don't have to stick to static!

I’d quibble over at least CommonMark and Python.

CommonMark is not good for flawless longevity: it’s not as bad as non-CommonMark Markdown, but it’s still not fully settled, implementations vary in what HTML they will allow through, and almost no one actually stops at CommonMark -- most implement other extensions, which will normally break the meaning of at least some legitimate content.

Python is risky, even if you don’t use anything outside the standard library: consider what happened with Python 2.

I use the CommonMark reference implementation, which I assume stops at CommonMark :) I just download their tarball, and it's pure C, tiny, and very easy to use e.g. from shell or Python: https://github.com/oilshell/oil/blob/master/soil/deps-tar.sh...

Nothing happened to Python 2! I still use it, and it's trivial to build and run anywhere, even if distros drop it. Just download the tarball and run ./configure and make, which I've done hundreds of times for various purposes.

(The same is not true for node.js, Go, or Rust, which as mentioned is one reason I don't use them for my site. If people stopped making binaries for them, I'd be lost.)

Ironically, if you want something that will last 30 years without maintenance and upgrades, Python 2 is better than Python 3. (I use both; Python 3 is very good for many/most things.) There are memes about "security" mostly based on misunderstandings IMO.

As you point out, the real problem is PIP packages, but that's why my site doesn't depend on any. Or if it does, I again make sure I can download a tarball, and not rely on transitive dependencies and flaky solvers.
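The "download a tarball instead of trusting a solver" approach can be sketched like this: record a SHA-256 digest for each vendored archive and verify it before use. This is a hypothetical illustration of the idea (the function names are mine), using only the standard library:

```python
import hashlib

# Hypothetical sketch: pin vendored tarballs by checksum instead of
# relying on a package manager's transitive dependency resolution.
def sha256_of(path):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def check_vendored(path, expected_digest):
    actual = sha256_of(path)
    if actual != expected_digest:
        raise ValueError("checksum mismatch for %s: %s" % (path, actual))
```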

The Python 2 stdlib is definitely good enough for making a website. It's not good enough for bigger apps, but it's great for simple website automation.

----

The higher level point is that you can always "plan for the future" at the expense of the present. IMO avoiding things like CommonMark and Python will just make your site worse right now, which defeats the purpose of preserving it in the future. So there has to be a balance against extreme conservatism, and there has to be a way of making progress (new standards) while not succumbing to bad fashions. Likewise I think Oil looks like a "retro" project to some, but it does a lot of new things, and that is the whole point.

> (The same is not true for node.js, Go, or Rust, which as mentioned is one reason I don't use them for my site. If people stopped making binaries for them, I'd be lost.)

Thankfully, this no longer appears to be the case for Rust, thanks to mrustc [0], a compiler that can build a working Rust 1.54.0 toolchain (released 2021-06-29) from source. It requires only a C++14-compatible compiler and some other common tools; I've just verified that its build script works with no problems on my Ubuntu machine. To be safe, you'd want to specify exact dependency versions for everything (or better yet, vendor them locally), since the big crates all have different policies for when they break backward compatibility with older compiler versions.

[0] https://github.com/thepowersgang/mrustc