Even taking into account the limited resolution, this is more like SD1.
https://imgur.com/a/nn9c0hB
The release is more about the multimodal captioning, which is an objective improvement. I'm not a fan of the submission title.