That trick always seemed incomplete to me because the output only contains part of the formula; it's missing the constant that actually encodes most of the information. Has anyone ever made one where the output contains everything you need?
Yes, I think a "Cartesian" quine would be pretty tough to create, because you're going from a symbolic representation to an inefficient visual representation. Could a more efficient visual representation solve it? One approach might be to output a bitmap that looks like this:
gunzip(########)
where ######## is a bitmap representation of the raw input to gunzip.