If the image Emad tweeted [1] was made with SDXL, then text in images could possibly get better!

[1] https://twitter.com/emostaque/status/1671885525639380992

Text will be better simply due to scale, but it will still be limited by the use of CLIP for text encoding (BPEs + contrastive training). So that image may well be SD XL 0.9, but its text should still be worse than that of a model that uses T5, like DeepFloyd IF: https://github.com/deep-floyd/IF
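
To make the BPE half of that argument concrete, here is a toy sketch in Python. It is not CLIP's actual tokenizer: the merge table `TOY_MERGES` and the function `bpe_tokenize` are made up for illustration. It only shows the general idea that a frequent word collapses into a single opaque token, so the text encoder never sees its individual letters. It says nothing about the contrastive-training half of the argument.

```python
def bpe_tokenize(word, merges):
    """Greedy toy BPE: repeatedly apply the highest-priority merge found."""
    symbols = list(word)  # start from single characters
    ranks = {pair: i for i, pair in enumerate(merges)}  # lower = earlier merge
    while len(symbols) > 1:
        pairs = [(symbols[i], symbols[i + 1]) for i in range(len(symbols) - 1)]
        best = min(pairs, key=lambda p: ranks.get(p, float("inf")))
        if best not in ranks:
            break  # no learned merge applies any more
        merged, i = [], 0
        while i < len(symbols):
            if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                merged.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                merged.append(symbols[i])
                i += 1
        symbols = merged
    return symbols


# Hypothetical merge table, as if "stop" were frequent in the training data.
TOY_MERGES = [("o", "p"), ("s", "t"), ("st", "op")]

print(bpe_tokenize("stop", TOY_MERGES))   # one opaque token, spelling hidden
print(bpe_tokenize("spot", TOY_MERGES))   # same letters, no merges apply
print(bpe_tokenize("stops", TOY_MERGES))  # frequent stem plus a leftover piece
```

With this toy table, "stop" becomes a single token while "spot" (the same letters) stays as four characters, so whatever the model knows about how "stop" is spelled has to be learned indirectly.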