Yeah, like the way DNA evidence is used or bullet matching or fingerprint matching or blood spatter analysis or lie detector tests or fire analysis…
All of these have massive error rates (for various reasons, but simply the fact that it’s an uncontrolled environment with potentially biased data collectors & scientists opens up lots of issues).
I agree better laws should be in place, but I suspect there may be fewer issues with “enhance” than with many of the items listed above.
Right, there are many other cases that require being dealt with in all their complexity...
But that would be a fault in logic: not treating Bayesian contexts (of priors and posteriors) properly, outside a solid framework. That should be a recognized part of the process.
Not many centuries ago a form called "spectral evidence" was used, according to which the victim's statement that the suspect had appeared to them in dreams and fantasies counted as valid evidence; it was used in the Salem witch trials, for example.
It is possible to see a definite lack of "good common sense". (And a parallel title on the HN frontpage today goes "[Autoconf] makes me think we stopped evolving too soon".)
This said,
> there may be fewer issues with “enhance” than with many of the items listed above
imperfect measurement and false positives are one thing; hallucinating details is quite another...
That's not what they're doing. This was using one of those generative AI hallucination tools that "add detail" to a video to make it look fancier. Something like that clearly has no place in a courtroom.
Silly comparison. The article clearly states that Topaz Labs’ tool was used. That tool is known for being quite good, but it does still hallucinate in small portions of the image sometimes.
Photography contests, for example, explicitly exclude Topaz Labs-style gen-AI enhancements from entry.
It certainly sounds good on paper, but I feel like this has unintended consequences: Evidence captured with phone cameras is quite literally null and void. Invalid. Phones use a metric ton of AI enhancements to get anywhere near the quality of what dedicated cameras could do 15 years ago.
AI to enhance faces.
AI to denoise the insufferably noisy phone camera.
AI to align and stack multiple pictures to get better low light performance.
AI to replace the difficult to capture moon with moon pictures taken by good cameras in your area.
AI to align and stack multiple pictures to get better dynamic range.
AI to convert higher dynamic range into normal dynamic range.
AI to upscale digitally zoomed in pictures to get a consistent resolution.
AI to align and stack multiple pictures to get more resolution.
This is what your phone does on the fly. Everything your phone captures is AI-enhanced. A small sensor physically can't capture anywhere near the quality of dedicated cameras. So they use software to bridge the gap. Again, AI-enhanced, invalid evidence.
For most of these, this is not AI; these are standard image processing algorithms that were already built into the digital processor of a 15-year-old camera.
They have predictable and deterministic output, just like plain JPEG compression, which is not AI.
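To illustrate (a toy sketch of my own, not any camera vendor's actual pipeline): multi-frame stacking just averages pixels that were actually captured, reduces noise, and gives the same output for the same input every time.

    # Toy sketch (not any vendor's actual firmware): averaging aligned noisy
    # frames reduces noise without inventing detail, and the result is
    # deterministic -- the same frames always produce the same output.
    import numpy as np

    def stack_frames(frames):
        """Average pre-aligned frames; noise falls roughly with sqrt(N)."""
        return np.mean(np.stack(frames, axis=0), axis=0)

    rng = np.random.default_rng(0)
    scene = rng.uniform(0.0, 1.0, (64, 64))                  # "true" scene
    frames = [scene + rng.normal(0.0, 0.2, scene.shape) for _ in range(8)]

    stacked = stack_frames(frames)
    print("single-frame error:", np.abs(frames[0] - scene).mean())
    print("stacked error:     ", np.abs(stacked - scene).mean())
    print("deterministic:     ", np.array_equal(stacked, stack_frames(frames)))

Run it twice and you get bit-identical output; there is nothing to hallucinate.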
> [...], these are standard image processing algorithms that were already built into the digital processor of a 15-year-old camera.
I don't know what to tell you other than this is not the case at all. Computational photography failed miserably in the standalone camera market. Photographers love manual control. Cameras with built-in computational photography features flopped hard; nobody bought them. Standalone camera innovation is therefore focused on making it as easy and reliable as possible to get a good shot. Everything else is done in post-processing software, such as Lightroom or Photoshop, never in camera. For the record: my 15-year-old Canon EOS 5D Mark II can't do a single thing I listed in my comment aside from basic noise reduction for JPEG images. And that camera was considered revolutionary at the time. But the camera sure as hell makes it easy to capture the shots I need, and post-process them on my computer to get exactly what I want.
So why don't you stop and think that maybe phones should not do any of these things? Would you like to be thrown in jail because someone's phone's "AI to enhance faces" accidentally made it look like you?
As it should. AI “enhancement” is just making up extra details in the video that aren’t there to begin with. There is no rational basis to allow it: the only reason to present “enhanced” footage is that the actual footage doesn’t show what you want it to show, and by having it artificially enhanced you can convince the jury that the footage is more detailed than it really is.
That this is even up for debate is insane. It is literally no different from allowing someone to present photoshopped images as evidence.
Yep. The problem is not only that the AI-enhanced image may contain details that are not in the original, but also that the sharpened image may give the impression that some feature cannot possibly be there, when in fact it was there but impossible to discern because it was smaller than a pixel.
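To make that concrete, a toy sketch (my own illustration, not tied to any particular tool): a feature thinner than a sensor pixel averages into the background at capture time, and upscaling the capture produces a clean-looking image with no trace of it, which can wrongly suggest it was never there.

    # Toy sketch: a one-pixel-wide feature disappears into a 4x4 block average
    # at capture resolution; upscaling afterwards cannot bring it back.
    import numpy as np

    hi_res = np.zeros((8, 8))
    hi_res[:, 3] = 1.0                  # one-pixel-wide feature (a "scratch")

    # Simulate a sensor that captures at 1/4 the resolution (4x4 block mean).
    low_res = hi_res.reshape(2, 4, 2, 4).mean(axis=(1, 3))

    # "Enhance" by upscaling back to 8x8 (nearest neighbour, for simplicity).
    upscaled = np.kron(low_res, np.ones((4, 4)))

    print("feature in the scene:", hi_res[:, 3].max())    # 1.0
    print("strongest trace left:", upscaled.max())         # 0.25, a faint smear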
State of Connecticut v. Swinton ruled that the admission of digitally enhanced evidence was allowed. This is going to be a long road full of nuance before we land on a largely socially accepted test for how much adjustment is allowed in court cases.
There's a good rundown of many relevant cases in different US jurisdictions here [0]. It looks like up through the date of publication courts were pretty consistent in ruling that digitally enhanced images may be admissible if it can be shown that they are accurate replicas of an original and the enhancement was only used to make things that were already in the image more clear.
In the case of AI I do think that will be harder to prove, because there's no specific algorithm being followed that can be shown to be incapable of introducing misleading artifacts.
AI-enhanced images (as in enhanced with generative AI) have nothing in common with digitally enhanced images, other than maybe that both are done in digital form. The process, the way the image is transformed, and the result are wildly different and must be treated differently.
> At the trial, the state (plaintiff) introduced photographs of a bite mark on the victim’s body that were enhanced using a computer software program called Lucis. The computer-enhanced photographs were produced by Major Timothy Palmbach, who worked in the state’s department of public safety. Palmbach explained that he used the Lucis program to increase the image detail of the bite mark. Although the original photographs contained many layers of contrast, the human eye could perceive only a limited number of contrast layers. After digitizing the original photographs, Palmbach used the Lucis program to select a particular range of contrast. By narrowing this range, certain contrast layers in the photographs were diminished, thereby heightening the visual appearance of the bite mark. Palmbach clarified that the Lucis program did not delete any contrast layers; rather, the contrast layers that fell outside of the selected range were merely diminished. Indeed, nothing was removed from the original photographs by the enhancement process. Palmbach also testified that the Lucis program was relied upon by experts in the field of forensic science. The trial-court judge found that the computer-enhanced photographs were authenticated. Because the photographs satisfied the other requirements for admissibility, the trial-court judge admitted the photographs. Subsequently, the jury convicted Swinton. Swinton appealed.
Perfect example. Digitally enhanced in this context should mean "bring out the details of already captured photons - so our eyes can see better" and nothing else. Generative AI is completely different. It's even in the damn name!
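Roughly, the deterministic kind looks like this (a toy sketch of contrast windowing, not the actual Lucis program): a chosen intensity range is stretched, everything outside it is merely diminished, and no pixel value is invented.

    # Toy sketch of deterministic contrast windowing (not the actual Lucis
    # algorithm): the selected range [lo, hi] is stretched across most of the
    # output range, out-of-range values are squeezed into the edges rather
    # than deleted, and nothing that wasn't captured is added.
    import numpy as np

    def window_contrast(img, lo, hi, squeeze=0.05):
        img = img.astype(float)
        out = np.empty_like(img)
        below, above = img < lo, img > hi
        inside = ~(below | above)
        out[below] = squeeze * img[below] / max(lo, 1e-9)
        out[above] = (1 - squeeze) + squeeze * (img[above] - hi) / max(1 - hi, 1e-9)
        out[inside] = squeeze + (1 - 2 * squeeze) * (img[inside] - lo) / (hi - lo)
        return out

    # A faint detail living in a narrow band of intensities becomes much
    # easier to see after windowing, without anything new being added.
    patch = np.linspace(0.45, 0.55, 25).reshape(5, 5)
    print(window_contrast(patch, 0.45, 0.55).round(2))

Every step of that mapping can be audited, which is exactly what you can't say about a diffusion model's output.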
No? There is no nuance here. The Swinton case is afaict a discussion on basic deterministic contrast increase.
There is no element of "AI enhancement". AI enhancement means "apply a statistical model to make up information that is not present in the source data". Even if you take AI to be something more than a glorified statistical model, it cannot add any detail that is not in the original data. What it can do, based on its biased training data, is create imagery that looks real and can (1) make jurors believe the images presented are real (because the statistical model is geared to producing plausible imagery) and (2) introduce details that are useful to the side presenting the images. The first is bad because it means the jury believes the fake image over the lower-quality source data, because it looks better despite being fiction; the second is no different from asking someone to improve the image in Photoshop.
I find it funny that CSI-style zoom-and-enhance is now feasible, except the reality is very much like its classic parody in Red Dwarf, where 4 pixels in a reflection are just used to come up with something via diffusion.
I find it concerning that so many important decisions about "A.I." are being made lately by people who barely understand the hype of it, let alone the actual reality of it. People putting it to work in ways it's either not ready for yet, or straight up not remotely capable of / designed for; panicking and wanting to ban it entirely in various uses it is designed for / performs well at; using it to replace actual people and then being totally shocked when it completely "fumbles the ball" on some critical task; putting it in charge of autonomous weapons systems; using it to game rent prices, hiring for jobs, etc, etc, etc. Too many "powerful" people are on the "A.I. hype-train", and many (most?) of them probably don't have very ethical interests in mind, I'm betting.
And also remember that we have had attempts to produce automated identikit police-sketch generators with unclear generative (hallucinatory) boundaries, such as:
# "Developers Created AI to Generate Police Sketches. Experts Are Horrified"
Hmm, this seems incorrect. There’s no AI used for compression; that’s just standard HEIC.
There is a variety of computational photography tricks to enhance the quality of sensor data. However, none of it generates data wholesale; it just combines sensor pixels to create the image.
To the best of my knowledge, Apple has no generative post processing so wouldn’t fall afoul of this.
Samsung, on the other hand, does generate content that isn’t in the shot; for example, if it thinks there’s a moon it’ll insert a virtual one.
Google also has various post-processing technologies, like picking the best faces from multiple photos. However, it does mark them as edited.
I don’t know of any smartphone doing generative upscaling. It’s still quite a costly process that doesn’t yet scale to mobile SoCs within a user-acceptable timeframe.
Apple is using some sort of generative technology[0], though it's not clear whether it's done in the image viewer or in post-processing, or whether it's diffusion-based.
^ this parent comment is an example of the common human generative failure mode called “hallucination” which introduces fictional details about the iPhone camera
This was debunked by the original photographer as an actual leaf in the way. The comments on that post link to the author's tweet where they admit that.
Codecs don’t hallucinate new things. They have a much stricter input and output. HEIC and AV1 cannot add materially new information that wasn’t in the source data.
The rest of your points are too vague such that they could basically be applied to any number of computer algorithms.
Fair. Though in the case of the primary article the issue is hallucination, and they likely mean generative AI tools that infill to achieve the resolution increase.
This may not be germane to your example, but in my experience of German -> English translation creating several sentences can be the smoothest way to make sense out of something German gets across in a couple of overloaded compound words.
In general, German is much more verbose than English. Compound words can sometimes shorten sentences; however, English has a significantly larger vocabulary (see comparisons between Shakespeare and Goethe, for example). German also leans on long, multi-clause sentences. Because of that, German texts are typically much longer than the English originals; e.g., IIRC the German Goblet of Fire was 150 pages longer than the English one.
I think it's not said often enough that LLMs haven't solved the translation problem or anything: we only got a new tool, and it is not a perfect universal everything-tool.
- classical MTs: just jarring, "every work and without slack produce blunt child lift"
- pre-LLM AI MT: most natural, but drops expressions that it's not sure about: "We choose to visit in this decennial and it's not easy."
- LLM: too US-English-centric, dry, and weighing flow over content too much: "Is it one obsolete-aesthetic word? So scary to say that."
All of these are useful; it's just that none is a panacea for the translation problem (humans included).
The original example[0] showed a lot more happening due to the iPhone's processing, far beyond compression. It was hotly discussed[1] here. I was under the impression that's a well-known fact now.