Is there a way to have current AI tools maintain consistency when generating multiple images of a specific creature or object? For example, all images of 'Dr. Venom' need to look similar, as do all images of the same spaceship.

Yes, right now you have three options:

- DreamBooth: fine-tunes the model on images of your subject; takes ~15-20 minutes, but generally gives high-quality, diverse outputs if trained properly,

- textual inversion: you essentially find a new "word" in the embedding space that describes the object/person; this can give good results, but is generally less effective than DreamBooth,

- LoRA fine-tuning[1]: similar to DreamBooth, but instead of updating the full weights you train small low-rank weight deltas to achieve the look; faster than DreamBooth, with a much smaller output file.
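
To make the "weight deltas" and "much smaller output" points concrete, here is a rough sketch of the low-rank idea in plain Python. It is not the API of the repo linked in [1]; the function names, the 768x768 layer size and the rank of 4 are illustrative assumptions only.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_delta(B, A, alpha, rank):
    """Low-rank update: delta_W = (alpha / rank) * (B @ A)."""
    scale = alpha / rank
    return [[scale * v for v in row] for row in matmul(B, A)]

def apply_delta(W, delta):
    """Adapted weights: W + delta_W. The base weights W stay frozen."""
    return [[w + d for w, d in zip(w_row, d_row)] for w_row, d_row in zip(W, delta)]

def lora_param_count(d_out, d_in, rank):
    """Values stored by the adapter: B is d_out x rank, A is rank x d_in."""
    return rank * (d_out + d_in)

# Tiny toy layer so the matrices are easy to inspect (sizes are made up).
d_out, d_in, rank, alpha = 8, 6, 2, 2.0
rng = random.Random(0)
W = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]

# Common initialisation: A is small random, B is zero, so the delta starts at zero
# and the adapted model initially behaves exactly like the base model.
A = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]
B = [[0.0] * rank for _ in range(d_out)]
W_adapted = apply_delta(W, lora_delta(B, A, alpha, rank))

# Size comparison for one hypothetical 768x768 layer at rank 4.
full_params = 768 * 768
adapter_params = lora_param_count(768, 768, 4)
print(full_params, adapter_params, full_params // adapter_params)
```

Only A and B get trained and saved, which is why the resulting file is so much smaller than a full DreamBooth checkpoint.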

1: https://github.com/cloneofsimo/lora