Some of us want a record of what was, not a hallucination of what might have or could have been.
Courts, for example. Forensic science was revolutionized by the widespread adoption of photography, which reduced the weight given to witnesses, who also hallucinate what might have happened.
So when I took an 8-second exposure of the aurora on Friday and then used Capture One to process the raw file to make it more vivid than it was in real life - is that a record of what was?
Don’t get me wrong, I’m not super keen on AI-type stuff in cameras as a whole. The line is muddy though. A smartphone camera straight up can’t capture the moon well, or at all. If it then looks more like it did in real life after processing, is that better or worse than my above example?
How often do you capture auroras or other beauty shots, vs. readouts on your electricity meter, stickers on the back of your furnace, receipts, and a hundred other displays and documents you need to send someone? I definitely do plenty of the latter, and in such cases, I'd really appreciate the AI not spicing things up with details it thinks should be there.
I'm by no means against the feature. Hell, I shoot 90% of my family photos in Portrait mode on my Galaxy phone, which does some creative blurring and probably some other magic[0]. I just really appreciate being able to turn the magic on or off myself. That, and knowing exactly what the magic is[1].
--
[0] - I don't know what exactly it does, but switching from normal to Portrait mode is all it takes for my photos to suddenly look amazing, as judged by my wife, vs. plain sucking.
[1] - See e.g. "scene optimizer" in Galaxy phones. It's a toggle on normal photo mode. I have no clue what it does, and I can't see any obvious immediate difference between shots taken with vs. without that feature.
When I'm taking a picture of a receipt or sticker behind a machine, I don't actually want a literal photograph of the entire scene but just a reproduction of the text content.
Any environmental lighting, color and texture of the desk, and all other visual detail are only a distraction.
So if the camera recognized this intent and just gave me the receipt looking like it came from a scanner, that would in fact be a great improvement. So I think your example is actually a point in favor of having AI meddle with most photos that people shoot.
There's a pretty famous example of Xerox's scanners using lossy compression (JBIG2) on the scans they produced... except the "compression" would sometimes substitute the wrong letter or number in text-based content. [1]
So I can foresee pretty straightforward problems with merely storing the text content from the images, to say nothing of any less binary "is it correct" questions.
I recently took a picture of a lizard on granite with large grains. When I zoomed in to identify the type of lizard, I saw that all the grains and some leaves on a tree had been simplified into some type of swirl. I find it unlikely those swirls were artifacts of the sensor itself. My assumption is the effect is related to compression, given how often it repeated, but I'm not sure.
> Some of us want a record of what was, not a hallucination of what might have or could have been
Yes, but that doesn’t imply “A camera is supposed to take pictures of what it sees”, only “cameras sometimes are supposed to take pictures of what they see”.
Some of us prefer a nice picture over a more exact record of what was; some of us will even argue that such manipulated pictures are better at capturing what was, precisely because they sacrifice some of the physical reality for the non-physical essence of one's memory of the moment.
That moon photo is a nice example. Smartphone cameras aren’t very good at capturing what the full moon looks like in our memory.
Pretty much all modern digital cameras use heuristics and algorithms to construct the image you see - it's not just a sensor grid and a bitmap file, and it hasn't been for a long time.
The important property is how the pixels are correlated with the physical reality being imaged - because the goal is to reason and learn about the depicted subject through information in the photo. Heuristics and algorithms for demosaicing, white balance, auto-brightness/dynamic range, lens correction, removing motion blur, etc. improve that correlation or improve our ability to see it. This is fine, though you need to be aware at the time which properties of the image are to be treated as relative vs. absolute.
This is also a far cry from having your camera think, "I'm a consumer camera! Normies often shoot pictures of the Moon, so this fuzzy circle must be it; let me paste a high-resolution photo of the Moon there", or "gee, normies often shoot sportsball, so this green thing must be astroturf, and this grey blob is probably the ball", etc.
Big difference between a fancy interpolation algorithm that compiles to 500 bytes and one that takes many more orders of magnitude of space because it also contains data used to add details based on what it thinks other, similar photographs contain.
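For a sense of what that first kind looks like, here's a minimal sketch of bilinear Bayer demosaicing in a few dozen lines of NumPy - no learned prior about what photos "usually" contain, just local averaging. The RGGB layout and all the names are assumptions for illustration, not any camera's actual pipeline:

    # Minimal sketch: bilinear demosaicing of an RGGB Bayer mosaic.
    # Purely illustrative - not any real camera's pipeline.
    import numpy as np
    from scipy.ndimage import convolve

    def demosaic_bilinear(raw):
        """raw: 2-D array of sensor values captured under an RGGB Bayer pattern."""
        h, w = raw.shape
        # Which photosites measured which color.
        r_mask = np.zeros((h, w), bool); r_mask[0::2, 0::2] = True
        b_mask = np.zeros((h, w), bool); b_mask[1::2, 1::2] = True
        g_mask = ~(r_mask | b_mask)

        kernel = np.ones((3, 3))  # 3x3 neighborhood
        rgb = np.zeros((h, w, 3))
        for ch, mask in enumerate((r_mask, g_mask, b_mask)):
            sparse = np.where(mask, raw, 0.0)
            # Normalized convolution: sum of measured neighbors / count of measured neighbors.
            total = convolve(sparse, kernel, mode="mirror")
            count = convolve(mask.astype(float), kernel, mode="mirror")
            # Keep measured values, fill the gaps with the local average.
            rgb[..., ch] = np.where(mask, raw, total / np.maximum(count, 1e-9))
        return rgb

Everything that routine will ever put in a pixel comes from that pixel's immediate neighbors; the ML-based enhancers being discussed instead carry a model of what similar scenes tend to look like and can inject detail from it.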