Because he is famous on the Internet, CLIP “knows” what John Malkovich looks like. Or, more accurately: what an image people would label “John Malkovich” feels like.
That's not enough to reconstruct John Malkovich's face from text; for that you need minute facial feature parameters (eye shape, nose shape, eye-to-nose distances, etc.)
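To make "facial feature parameters" a bit more concrete, here is a rough sketch of the kind of geometric measurement I mean. Everything in it is an assumption for illustration: the landmark names, the coordinates, and the `face_parameters` helper are made up, and a real system would use many more landmarks than three.

```python
import math

# Hypothetical 2D landmark coordinates (pixels) for one face; values are made up.
landmarks = {
    "left_eye": (30.0, 40.0),
    "right_eye": (70.0, 40.0),
    "nose_tip": (50.0, 70.0),
}

def face_parameters(lm):
    """Return a scale-invariant ratio derived from eye and nose landmarks."""
    # Distance between the two eye centers.
    eye_dist = math.dist(lm["left_eye"], lm["right_eye"])
    # Midpoint between the eyes.
    eye_mid = (
        (lm["left_eye"][0] + lm["right_eye"][0]) / 2,
        (lm["left_eye"][1] + lm["right_eye"][1]) / 2,
    )
    # Distance from the eye midpoint down to the nose tip.
    eye_nose = math.dist(eye_mid, lm["nose_tip"])
    # Dividing by eye_dist removes the dependence on image scale.
    return {"eye_nose_over_eye_dist": eye_nose / eye_dist}

print(face_parameters(landmarks))
```

A ratio like this stays the same when the image is resized, which is the sort of precise, identity-specific number a text label on its own doesn't obviously pin down.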