Software developer who knows nothing about ML here: what would it take to mess around with this?

From a skim of the README, it seems like you install the package, then find some training dataset, and then ??? to use it.

An older, but similar and still impressive alternative is available here: https://github.com/CompVis/latent-diffusion

If you have a decent amount of VRAM, you can use it to start generating images with their pre-trained models. They're nowhere near as impressive as DALL-E 2, but they're still pretty damn cool. I don't know what the exact memory requirements are, but I've gotten it to run on a 1080 Ti with 11 GB.

EDIT: I also tried a 980 with 4 GB of VRAM a while back, but that failed, so you probably need more than that.
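
If you want to check where your own card falls before downloading anything, here's a rough sketch that asks `nvidia-smi` for total VRAM and compares it to the two data points above. The thresholds are just my anecdotes (11 GB worked, 4 GB failed), not official requirements, and the function names are made up for illustration. It assumes an NVIDIA card with `nvidia-smi` on your PATH.

```python
import shutil
import subprocess

# Rough data points from the comment above, NOT official requirements:
# an ~11 GB card worked, a 4 GB card failed.
KNOWN_WORKING_MIB = 11_000
KNOWN_FAILING_MIB = 4_096


def parse_gpu_memory(csv_text):
    """Parse 'name, memory_in_MiB' lines into (name, MiB) tuples."""
    gpus = []
    for line in csv_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split on the last comma so commas in the GPU name are preserved.
        name, _, mem = line.rpartition(",")
        gpus.append((name.strip(), int(mem.strip())))
    return gpus


def verdict(total_mib):
    """Compare total VRAM against the two anecdotal data points."""
    if total_mib <= KNOWN_FAILING_MIB:
        return "probably too little"
    if total_mib >= KNOWN_WORKING_MIB:
        return "probably enough"
    return "untested range"


def query_gpus():
    """Return (name, MiB) for each GPU, or [] if nvidia-smi is missing."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_memory(result.stdout)


if __name__ == "__main__":
    for name, mib in query_gpus():
        print(f"{name}: {mib} MiB -> {verdict(mib)}")
```

Anything between 4 GB and 11 GB is a guess on my part, so the script just reports it as untested.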