Can you expand a bit on the quality of Google's internal practices with user data? eg what would an engineer on Google Photos need to do to be able to access my photos, if possible at all
In general, SWEs and SREs do not have access to user data. SWEs have no prod access at all, and SREs generally can only run signed binaries built from code that has been reviewed and submitted, though there are obviously break-glass mechanisms with auditing (I'm not super familiar with those).
ML is where things tend to be a bit more grey, since being able to look at data is very useful for development; some data is scrubbed for PII and then made accessible in some form. But for products like Gmail or Photos, I would assume nobody (including ML engineers) can read your data, as these are basically impossible to sanitize.
Some products have systems that train ML models without engineers ever seeing the data, e.g. spam filtering, even when the underlying data is considered sensitive.
Basically all the data is available with little real oversight, so long as you ask permission and can offer some allusion to a relevant $JOB reason for needing it.
The fairly recent case of people's private conversations being shipped out to essentially unvetted contractors for labeling and analysis (and subsequently leaked) should serve as sufficient evidence that "shit happens." If private conversations recorded without users even initiating an interaction with their Google devices are being tossed around and leaked, forgive me if I don't believe that the tagging, timeline, and album features in Google Photos were built without some underpaid, unwatched contractor snooping through my photos without my permission.
I believe I can share an anecdote: a while ago I uploaded a bunch of photos from a work event to my corporate account. This being corporate, it runs pre-release versions of everything. The gallery managed to hit some bug in the jillions of lines of JavaScript, which I never cared to understand.
I reported the bug. Knowing how security works technically, I added to the bug the words "I'm happy for whoever works on this to take a look at the gallery; here's a world-readable sharing link." A couple of rounds of bug comments later, I was asked to sign a legally binding consent form allowing an engineer to look at the gallery. Then somehow they decided I needed to sign a different form to satisfy whatever other legal spirit required appeasing. Only then did someone finally look at the bundle of photos. They figured out whatever was triggering the bug and generated a gallery reproducing it with generic sample images. Whoever fixed the bug and added a regression test worked off that synthetic gallery instead.
My concern—and I think what everyone's concern should be—is not that Google employees have access to my data. Or even that it's used to train ML models. It's that my data is sold to advertisers via a shady behind-the-scenes marketplace, and later used to profile me in order to show me content that manipulates me into spending money.
And that, moreover, I get none of the profits from these transactions, and have no control over whom it's sold to and under what terms.