An AI that handles text, images, audio, and video, not just words. Now it can misunderstand you in every available format.
It's multimodal now, which means it can ignore your instructions in voice mode too.