The dataset was so large they were probably not opening up the files before sending them over. It was clearly a rookie mistake by someone who had never had to do this stuff before and wouldn't have known what to look for.
Again, the vast majority of the emails were automated alert/spam emails. So it's unclear if a random sample of the data would have turned up anything interesting to look at if you didn't know what you were looking for.
I don't agree. I think opening any of the files would have made it immediately apparent that the first 256 characters of each email were included, if that's indeed what happened as the article says.
>Some things about the dataset: It’s very messy – triple quotes, semicolons, commas, oh my. There are millions of system alerts. For seattle.gov → seattle.gov communication, there are two distinct metadata records
Opening a file of that size in Excel would probably crash the desktop. And this was probably a lowly admin without a lot of other tools.
The first 256 characters of a lot of system emails are going to be just junk HTML and header tags. If you don't realize you're looking at HTML, it's not going to be apparent that you're looking at the body of an email.
It's a rookie mistake to be sure, but the admin was clearly unfamiliar with what was being requested.
Would it really have been an XLS file? I'd expect CSV, probably. In that case, just running 'less' on the file (or the Microsoft equivalent) would be fine, and wouldn't tax anyone's desktop resources.
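Even a few lines of Python will skim the first records of a huge CSV without loading the whole thing into memory. This is just a sketch; the filename and columns are made up, and the `StringIO` stands in for a real multi-gigabyte file:

```python
import csv
import io
import itertools

# Stand-in for the big export; in practice this would be open("export.csv").
# The column layout here is hypothetical.
f = io.StringIO(
    "sender,recipient,preview\n"
    "alerts@seattle.gov,admin@seattle.gov,<html><head>Disk usage warning\n"
    "a@seattle.gov,b@seattle.gov,Hi Bob about that permit\n"
)

# Stream only the first few rows -- islice never reads past them,
# so file size doesn't matter
for row in itertools.islice(csv.reader(f), 3):
    print(row)
```

Seeing a `preview` column full of email text in that output is exactly the kind of thing a ten-second look would have caught.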
That's not a "rookie mistake"; that's gross negligence. Whenever I write some sort of script to produce some sort of data file, I always look to ensure that the data file ends up looking like what I expect.
This isn't even to ensure my data file doesn't include something private, just to ensure that it actually includes what I intended it to, and I didn't do something dumb like put the data in the same field twice, or duplicate the same record over and over, or whatever.
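A minimal sketch of that kind of check (the column names and sample data here are hypothetical, not from the actual Seattle export):

```python
import csv
import io

# Hypothetical export that sneaks in a "preview" column and duplicates a row
export = io.StringIO(
    "sender,recipient,preview\n"
    "a@seattle.gov,b@seattle.gov,Hello Bob\n"
    "a@seattle.gov,b@seattle.gov,Hello Bob\n"
)
rows = list(csv.DictReader(export))

# 1. Do the columns match what we intended to release?
expected = {"sender", "recipient"}
unexpected = set(rows[0]) - expected
print("unexpected columns:", unexpected)  # the preview column jumps out

# 2. Is the same record repeated?
seen, dupes = set(), 0
for r in rows:
    key = tuple(sorted(r.items()))
    if key in seen:
        dupes += 1
    seen.add(key)
print("duplicate rows:", dupes)
```

Two checks like these take minutes to write and would have flagged both the extra column and any duplication before anything left the building.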
We are not talking about one of the geniuses here at HN. The guy answering FOIA requests for the city is the IT equivalent of the counter person at a McDonald's. I don't think it's fair to flame them as idiots when it's clearly a process issue.
That's not how access to email content should work at any organization, and I can personally tell you that responsible government organizations don't give it to entry level employees.
Discovery and FOIA-equivalent requests that I've seen at the SLTT level were handled with the care that is expected for potentially sensitive communications. I'm sure smaller orgs can't do it as well, but Seattle is probably going to have some money for this stuff.
I suppose this is why some gov officials use weird aliases for emails. It’s not on the up and up but it avoids disclosing potentially embarrassing or illegal activities…
- He FOIA'd all metadata of emails to and from the City of Seattle.
- The city IT department pushed back, saying that their policy was to hand-review each email for privacy, and this was 32m emails.
- They later acquiesced and just dumped all of the metadata into files and sent it over.
- They didn't realize the email preview was also metadata, which included the first 256 characters of each email.
- OP informed them that they had now committed a very grievous data leak.
- The city fixed the issue and legally pursued OP to ensure the data was deleted.
All in all, I am glad I am not a civil servant. The job seems awful.