They didn't mention gpt-4-32k. Does anybody know if it will be generally available in the same timeframe?
There's still no news about the multi-modal GPT-4. I guess the image input is just too expensive to run, or it's actually not as great as they hyped.
>I guess the image input is just too expensive to run, or it's actually not as great as they hyped.
We already know they have a SOTA model that can turn images into latent-space vectors without being an insane resource hog; in fact, they give it away to competitors like Stability. [0]
My guess is that a limited set of people are using a GPT-4 + CLIP hybrid, but those use cases mostly involve deciphering pictures of text (which a CLIP-based approach would be very bad at), so they're still working on that (or on other use-case problems).
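For anyone curious what "turn images into latent-space vectors" looks like in practice, here's a minimal sketch using the Hugging Face transformers CLIP wrapper. The checkpoint name and output dimension are just what the public CLIP release exposes; nobody outside OpenAI knows which image encoder (if any) the multi-modal GPT-4 actually uses.

    from PIL import Image
    import torch
    from transformers import CLIPModel, CLIPProcessor

    # Public CLIP checkpoint; not necessarily what GPT-4's image input uses.
    model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")
    processor = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")

    image = Image.open("photo.jpg")  # any local image
    inputs = processor(images=image, return_tensors="pt")

    with torch.no_grad():
        # One forward pass of the image encoder -> one embedding per image.
        embeds = model.get_image_features(**inputs)

    print(embeds.shape)  # torch.Size([1, 768])

Which is kind of the point: the embedding step itself is a single ViT forward pass and is cheap, so if the multi-modal version really is "too expensive", the cost is probably in whatever they bolt the encoder onto, not in the image encoding.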