All the examples are portraits of people.
I have to wonder whether it works well with anything else.
it was trained on CelebA, so no, but you could for sure train it on a more varied dataset
Would it be as simple as feeding it a bunch of decolorized images along with the originals?
yes, every color image gives you a free (grayscale, color) pair, so effectively infinite training data. but the challenge will be scaling to large resolutions and getting global consistency
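on the data side it really is that simple; a minimal PyTorch sketch (the folder layout, image size and transforms are my assumptions, not anything from the repo):

```python
from pathlib import Path

from torch.utils.data import Dataset
from torchvision import transforms
from torchvision.datasets.folder import default_loader


class ColorizationPairs(Dataset):
    """Builds (grayscale, color) training pairs by decolorizing images on the fly."""

    def __init__(self, root: str, size: int = 128):
        self.paths = sorted(Path(root).glob("*.jpg"))  # assumed flat folder of jpgs
        self.to_color = transforms.Compose([
            transforms.Resize(size),
            transforms.CenterCrop(size),
            transforms.ToTensor(),          # 3xHxW float tensor in [0, 1]
        ])
        # keep 3 channels so the conditioning image matches the target's shape
        self.to_gray = transforms.Grayscale(num_output_channels=3)

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        color = self.to_color(default_loader(str(self.paths[idx])))
        gray = self.to_gray(color)          # the "decolorized" conditioning input
        return gray, color
```

wrap that in a DataLoader and you're done; the grayscale versions never even need to exist on disk.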
I guess you can always use a two-stage process. First colorize, then upscale
yeah, you can use SOTA super res, but that tends to be generative too (either diffusion-based in its own right, or more commonly GAN-based). it can be a challenge to synthesize the right high-res details.
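e.g. the two-stage idea, sketched with diffusers (here `my_colorizer` is a stand-in for whatever colorization model you trained; the x4 upscaler checkpoint is the public stabilityai one, but any SR model slots in):

```python
import torch
from PIL import Image
from diffusers import StableDiffusionUpscalePipeline

# stage 1: colorize at low resolution with your own model
gray = Image.open("old_photo.png").convert("RGB").resize((128, 128))
colorized = my_colorizer(gray)  # hypothetical: returns a 128x128 color PIL image

# stage 2: hand the small color image to an off-the-shelf diffusion upscaler
upscaler = StableDiffusionUpscalePipeline.from_pretrained(
    "stabilityai/stable-diffusion-x4-upscaler", torch_dtype=torch.float16
).to("cuda")
result = upscaler(prompt="a color photograph", image=colorized).images[0]  # 512x512
result.save("colorized_4x.png")
```

the failure mode i mean shows up in stage 2: the upscaler has to invent texture, and nothing forces it to stay faithful to the original photo.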
but that’s basically the stable diffusion paper: diffusion in a compressed latent space, with a GAN-trained autoencoder decoding back up to full resolution
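right, and the denoising itself never touches pixels; a quick round trip through the SD autoencoder shows the split (the VAE checkpoint is the public `stabilityai/sd-vae-ft-mse`, the image path is a placeholder):

```python
import torch
from PIL import Image
from torchvision import transforms
from diffusers import AutoencoderKL

# the SD autoencoder: 8x spatial compression, decoder trained with a patch-GAN loss
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse").eval()

img = Image.open("photo_512.png").convert("RGB")
x = transforms.ToTensor()(img).unsqueeze(0) * 2 - 1   # 1x3x512x512 in [-1, 1]

with torch.no_grad():
    z = vae.encode(x).latent_dist.sample()            # 1x4x64x64 latent
    recon = vae.decode(z).sample                      # back to 1x3x512x512

print(z.shape)  # the denoising U-Net only ever sees this small latent tensor
```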
https://github.com/TencentARC/T2I-Adapter
i've also seen a ControlNet do this (conditioning generation on the grayscale image).
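i haven't verified the exact checkpoint name, so the recolor ControlNet id below is a placeholder, but the diffusers wiring is standard:

```python
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

# placeholder id: substitute whichever recolor/grayscale ControlNet you find
controlnet = ControlNetModel.from_pretrained(
    "some-org/controlnet-recolor", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

gray = Image.open("old_photo.png").convert("RGB")     # grayscale conditioning image
out = pipe(
    "a color photograph",                             # the prompt can steer the palette
    image=gray,
    num_inference_steps=30,
).images[0]
out.save("recolored.png")
```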