Yeah. Does the Uber car really capture video at 480p and 15fps? Also, releasing only the video conveniently ignores the fact that these cars have IR and LIDAR. The pedestrian is hard to see in this video essentially because it is dark and they are wearing dark clothing. Neither of those is any obstacle to LIDAR or IR, and the video at least shows us that the road is clear of obstructions.
>Does the Uber car really capture video at 480p and 15fps?
Probably not, but it isn't like the inputs to the self-driving models really need to be better than that. Lower resolution helps your processing time a lot, and there's little point in capturing frames faster than you can process them.
I'm sure at least the collision avoidance part of the system would need to poll at a much higher rate than 15fps. That's up to 67ms latency you're adding. With enough miles that delay could kill people.
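For a rough sense of scale, here is a back-of-the-envelope sketch of what that worst-case frame delay means in distance travelled. The 40 mph speed is an assumed figure for illustration only, not something taken from the video or from Uber, and the function names are made up for this example.

```python
# Worst-case delay added by sampling at a given frame rate, and how far a
# car travels in that time. The speed used below is an ASSUMED value.

MPH_TO_MPS = 0.44704  # exact conversion: 1 mph = 0.44704 m/s


def frame_interval_ms(fps: float) -> float:
    """Time between two frames in ms, i.e. the worst-case wait for the next frame."""
    return 1000.0 / fps


def distance_m(speed_mph: float, delay_ms: float) -> float:
    """Distance in metres covered at a constant speed during delay_ms."""
    return speed_mph * MPH_TO_MPS * delay_ms / 1000.0


if __name__ == "__main__":
    assumed_speed_mph = 40.0  # hypothetical speed, for illustration
    for fps in (15, 30, 60, 200):
        delay = frame_interval_ms(fps)
        travelled = distance_m(assumed_speed_mph, delay)
        print(f"{fps:>3} FPS: up to {delay:5.1f} ms between frames, "
              f"{travelled:.2f} m travelled at {assumed_speed_mph:.0f} mph")
```

This only models the sampling delay; whatever time the model needs to process a frame would come on top of it.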
Average human reaction time is around 215ms. It's not an apples-to-apples comparison, because humans can react much faster to continuous situations (humans have a timing accuracy of around 9.5ms) while a machine-learning model is limited to reacting once per frame, but still.
If you want to compare against human "sample rate", it'd be equivalent to at least ~200 FPS (in order to get the same accuracy with a camera). Sure, the signal takes a moment to plumb its way through, but that's irrelevant to spotting objects.
If they're actually feeding data at 15 FPS into their ML model, then what the fuck were they expecting? Correlating movements at those framerates would be nigh-impossible.
Relying on ML for this is already comically irresponsible, but that'd just be ridiculous.
My ass, mostly. I'm extrapolating based on monitor framerates and how accurately we can see the velocity of fast-moving objects, and that I can spot a timing difference of ~5ms reliably.
Human eyes are almost comparable to a framerate if you go by neuron spiking rates, which max out somewhere in the 250-500Hz range. Obviously that's not directly comparable, but it gives an idea of how well we can deal with moving objects.
I think they're pulling it from noticeable monitor FPS rates. I'm not an expert on machine vision, so I couldn't tell you the FPS needed to correlate movements between frames.
Average human reaction time when driving, from the moment the dangerous situation arises to the first reaction, is usually considered to be in the 700-2000 ms range. The lower bound applies with optimal lighting and after something has alerted the driver in advance (e.g. a police car dashing past you a moment before), and the first reaction is not "the brakes are slammed" but "the foot is lifted from the accelerator". Hitting the brakes can take another 100-200 ms.
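To put those numbers in terms of road covered before the brakes are even touched, here is a minimal sketch using the figures above (700-2000 ms to first reaction, plus 100-200 ms to reach the brake pedal). The 40 mph speed is an assumption picked for illustration, and the function name is hypothetical.

```python
# Distance covered between the hazard appearing and the brake pedal being
# pressed, using the reaction-time ranges quoted in the comment above.
# The vehicle speed is an ASSUMED value, not a measured one.

MPH_TO_MPS = 0.44704  # exact conversion: 1 mph = 0.44704 m/s


def distance_before_braking_m(speed_mph: float,
                              reaction_ms: float,
                              pedal_move_ms: float) -> float:
    """Metres travelled at constant speed before braking actually starts."""
    total_s = (reaction_ms + pedal_move_ms) / 1000.0
    return speed_mph * MPH_TO_MPS * total_s


if __name__ == "__main__":
    assumed_speed_mph = 40.0  # hypothetical speed, for illustration
    best = distance_before_braking_m(assumed_speed_mph, 700, 100)
    worst = distance_before_braking_m(assumed_speed_mph, 2000, 200)
    print(f"best case:  {best:.1f} m before braking starts")
    print(f"worst case: {worst:.1f} m before braking starts")
```

The car is treated as moving at constant speed the whole time, which ignores the small slowdown from lifting off the accelerator.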