mlissner's comments

Hi HN. I'm the director of Free Law Project, the org that's behind this, RECAP, and CourtListener.com.

The idea of this project is to create topic-based bots that follow certain areas of the law so that folks can access the raw data underlying the news and read it for themselves.

We have a handful of "topic bots" over at https://bots.law, and we're working on Slack/MS Teams/Discord integration as well. We'll probably be launching a Crypto bot soon, if we can find a reasonable curator for that.

We'd love your thoughts, and if you follow tech cases, hopefully you'll find our bots useful!


The whole case is here:

https://www.courtlistener.com/docket/66631003/securities-and...

And if you want to get free email updates as it develops, you can sign up here:

https://www.courtlistener.com/alert/docket/new/?pacer_case_i...


At CourtListener we send thousands of emails each day. We recently migrated to AWS SES and learned a ton of lessons beyond DMARC/SPF/etc.

Getting all of those parts right is important, but you also have to handle bounces, down recipient servers, full inboxes, changed email addresses, and a lot more.

It's been a ton of work getting this set up, so we thought we'd share our notes.
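
As a rough illustration of the bounce-handling part, here is a minimal sketch (not CourtListener's actual code) that reads one SES bounce notification and decides which addresses to stop mailing. The field names follow the SES bounce notification JSON as commonly documented; the function name and the policy of disabling only on permanent bounces are assumptions made for this example.

```python
import json


def addresses_to_disable(notification_json: str) -> list[str]:
    """Return recipients that hard-bounced and should not be mailed again.

    Hypothetical helper: the policy (disable on "Permanent" only) is an
    assumption for illustration.
    """
    msg = json.loads(notification_json)
    if msg.get("notificationType") != "Bounce":
        return []
    bounce = msg.get("bounce", {})
    if bounce.get("bounceType") != "Permanent":
        # Transient bounces (full inbox, recipient server down) are
        # candidates for a later retry rather than removal.
        return []
    return [r["emailAddress"] for r in bounce.get("bouncedRecipients", [])]
```

Transient bounces are deliberately left alone here, since a full inbox or a down server may recover.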


This uses AWS's Textract service, but if you're doing a LOT of extraction, that gets pretty expensive pretty quickly. We do thousands of pages daily on CourtListener.com and created an open source microservice for this purpose. It can take PDFs, DOCX, DOC, TXT, HTML, or a handful of other files and extract the text, doing OCR if necessary:

https://free.law/projects/doctor

We're always looking for more people to use and improve it.


This bug means that when somebody uninstalls Signal (or an iPhone disables it for lack of use), people sending messages never learn that the message didn't go through.

As a result:

  "I just missed out on placing an offer on a house because of this issue."

  "Today I ended up uninstalling Signal after an experience where this problem caused me to almost lose two of my friends."

  "I just had the experience of being very worried about a friend's well-being because he stopped responding to my messages."

And so forth. It makes Signal a very risky thing to use, not because the encryption is bad, but because the UX is. It seems easy to fix, but Signal never seems to care.


I'm the director of Free Law Project, the non-profit org that runs CourtListener. If anybody wants to get email or RSS alerts for this case, you can set them up here: https://www.courtlistener.com/alert/docket/new/?pacer_case_i...


My organization built a similar tool that can find bad redactions caused when people just use a black rectangle on top of text in PDFs: https://free.law/projects/x-ray

Very fun project. Lots of problems out there.
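
The core idea can be sketched as a geometry check. This is not x-ray's implementation, just a minimal illustration that assumes some PDF library has already extracted the word bounding boxes and the filled rectangles from a page; `Box`, `overlaps`, and `text_under_rectangles` are hypothetical names.

```python
from typing import NamedTuple


class Box(NamedTuple):
    x0: float
    y0: float
    x1: float
    y1: float


def overlaps(a: Box, b: Box) -> bool:
    """True if the two axis-aligned boxes share any area."""
    return a.x0 < b.x1 and b.x0 < a.x1 and a.y0 < b.y1 and b.y0 < a.y1


def text_under_rectangles(words, rectangles):
    """Return words that are still in the page's text layer but sit under
    a drawn rectangle, i.e. candidates for a bad redaction."""
    return [text for text, box in words
            if any(overlaps(box, r) for r in rectangles)]
```

If the function returns anything, the "redacted" text is still recoverable by selecting and copying it.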


Seems like even the size of the rectangle would give hints on what it is redacting. For example "Yes" has a bigger box vs "No".


Yes, it's been that way long before PDFs. Simply knowing the potential words, often names, that could appear in a document, gives those with the redacted documents a chance at determining what has been hidden based on size. This might be part of the reason why when declassifying documents, the redactions end up being more of a sentence than is needed. The extra buffer of hidden words gives some additional protection to what needs to be redacted.
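
A toy sketch of that size-based guessing, assuming a monospaced font so that a word's width is simply its length times a fixed character width. The function name, `char_width`, and `tolerance` are all made up for illustration; real documents would need per-glyph font metrics.

```python
def candidates_fitting(box_width, candidates, char_width=6.0, tolerance=3.0):
    """Keep only the candidate words whose estimated rendered width
    is within `tolerance` of the redaction box width."""
    return [w for w in candidates
            if abs(len(w) * char_width - box_width) <= tolerance]
```

With a box 18 units wide, `candidates_fitting(18.0, ["Yes", "No", "Maybe"])` keeps only "Yes", which is why padding the redaction with extra words helps.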


This reminds me of one of my proudest moments in high school.

For a test in German class (my worst class), the teacher had just used Tipp-Ex to blank out some words and listed them next to the text, and we had to fill them back in. I grabbed my ruler and measured all the gaps. There was one very long word, many medium-sized ones, and a few smaller ones. With this information and the context of the text, I got my first and last 10/10 in this class.


A malicious "redacting" algorithm submitted to the Underhanded C Contest used a similar idea, just at a lower level.

The plain-text PPM format stores pixel values as ASCII numbers, so flipping all digits to 0 creates a pixel which is graphically "masked" but leaks information about the original pixel: "000" means the value was larger than 99.
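
A small Python sketch of the leak (the contest entry itself was in C, and this is not its code). It assumes a plain-text image format in which each channel value is written as a decimal number without leading zeros, with a maximum value of 255; both function names are hypothetical.

```python
import re


def leaky_redact(ascii_pixels: str) -> str:
    """'Redact' by turning every digit into 0. Every value now parses as
    black, but the number of digits per value is preserved."""
    return re.sub(r"\d", "0", ascii_pixels)


def possible_range(masked_value: str, maxval: int = 255):
    """Recover the range the original value must have been in,
    based only on how many digits survived."""
    n = len(masked_value)
    low = 0 if n == 1 else 10 ** (n - 1)
    high = min(10 ** n - 1, maxval)
    return (low, high)
```

For example, `leaky_redact("255 7 42")` gives `"000 0 00"`, and `possible_range("000")` gives `(100, 255)`.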



Am I the only one who redacts info, prints it out, then scans it back in? Or redacts, then takes a screenshot before sending out?

For some reason I just never trust the PDF tool (or human error on my end) actually redacting the info, even if I were to do a print to PDF.


Nope. That's called rebroadcast. It's also used to try to "launder" photo manipulations, like compositing. I helped work on some algorithms which could pick up artifacts even after rebroadcast.

I would absolutely not trust PDF not to leak metadata. Although now you risk a metadata leak from the printer or scanner, which may or may not affect your threat model.


Careful what printer you use to print it out. Some of them add patterns of dots that can uniquely identify the print: https://en.wikipedia.org/wiki/Machine_Identification_Code


When a coworker asked me for my recommended method of creating and publicly sharing redacted copies of documents which (in their unredacted forms) contained PII for children, I told them to do this, in no uncertain terms.


> Am I the only one who redacts info, prints it out, then scans it back in?

If you have the source document, redact from the source (by actually removing the content and replacing it with an appropriate placeholder, not obscuring it) and regenerate the static (e.g., PDF) version.

If you are working from print, I think scan and redact by digital replacement (not overlay or otherwise obscure) would be sufficient. Redact->print->scan probably helps somewhat (especially if the scan is low quality) if you are using a bad redaction method to start with, but why do that?


I do the same, except for the scanning. Why not just print it to PDF?


Because some tools might still put a text layer under the printed page, so you can select the text and copy it.


Not if there is a rasterization step in the process. That's essentially what printing and scanning achieves, rasterization, and we can do that without the printer and scanner.

Of course, the artifacts introduced by printing and scanning (especially with contrast turned way up) give it an air of legitimacy, although these can also be simulated.


If you print to paper and scan, you are mostly safe, but if you do a software print to a PDF document, you might use a tool that saves the actual content as invisible text, or the whole Word document as an attachment to the PDF. I would print and scan physically if it was something important. Or just edit the Word document to remove the stuff and then print and scan, to avoid saving the edit history, since I don't know if that will be saved somewhere.

Usually I'm in full control of the software myself so I just output X instead of the secret data.


> I would print and scan physically

This degrades quality and wastes paper and toner. There are software tools to convert PDF to raster graphics.


Interesting. They have the right idea (using a black rectangle) but in the wrong program. I can see how this could trip up non-technical people.


On macOS, Preview makes a clear distinction between 'drawing on' and 'redacting' PDFs. It is an important part of UX that shooting yourself in the foot should _not_ be the default.


That's why you always Print to PDF instead of just saving the document. I'm amazed people save documents instead of printing them.


You're amazed that people use the conventional "save" method instead of a weird option in the print menu?


If they go to lengths to protect some information, then yes, I'm amazed they forget about such a simple precaution.


Wanting to redact information is not a subset of PDF knowledge. Understanding how PDFs work is not a prerequisite for wanting to redact information. Lots of people have only the most rudimentary understanding of how PDFs work, how Adobe works, and what their limits or capabilities are.


A lot of people don't even know you can print to a file instead of paper. Not sure why you're surprised about that. After all, the standard method for all formats is "save as" or "export", and it's reasonable to assume those two options include all possible ways to save a file. It's a UI quirk that goes against user expectations.


Doesn't help if the output type is wrong.

I recently discovered a manual for some home appliance with a clearly visible Word comment along with a username. It seems to have slipped in when the manual was translated.


This is fine, but Signal still doesn't tell you when the person you're sending to has uninstalled Signal. Instead, your messages go into the ether and you think the person is ignoring you. It blows my mind they haven't prioritized this. https://github.com/signalapp/Signal-Android/issues/11164


Applications can't determine when they're uninstalled. Or, not reliably anyway, and not while following platform guidelines. So the question becomes how to tell uninstalled vs left in a drawer, powered down, while on vacation.


They just have to tell you if a message isn't received after a day or two. This is already exposed via the check marks, so it's just something they have to amplify with a notification.

Or when you start writing a message to somebody, if they haven't read the last couple of messages, Signal could make that obvious. Etc. Lots of easy fixes.


Those both rely on the assumption that being offline for a little while = app uninstalled. Not always so.


They can just say the message wasn't received. They don't have to say it was uninstalled. Just loudly tell me things aren't working like I expected. That's all this takes.


You could just check for yourself if it's important. I do.


There are multiple anecdotes in this thread, on HN, that people missed that. All GP is asking for is better UX making it more obvious, because being able to check is something other than knowing to check and how to check.


It's the exact same UX as SMS, Telegram, WhatsApp, Facebook and (partially) Twitter...


I don't see why that matters? (Especially given that Signal has far fewer users and presumably higher attrition than those other platforms.) If things can be better, then it would be great if they were.


This is bad design. Why excuse bad design? When I send a text message and it doesn't arrive, my messaging app lets me know. With Signal, this is a step backward.


You only know if your SMS fails to send, not if the receiving party has deleted their messaging app, broken their phone, or changed their number.


> When I send a text message and it doesn't arrive, my messaging app lets me know

Signal does let you know: it never gets the delivered mark.


That's not letting me know, that's something I have to check.


I meant I check in Signal. It does indicate whether it's received or not and whether it's read or not.


They can determine when the user last logged in. Signal already tracks this.


That sounds like it would have privacy concerns. I don't want everyone to know when I last was on my phone.


This shows a single check mark, no? I.e., it tells you that the user hasn't received the message.


Yea, it seems like this is the most information they could give you without violating the addressee's privacy by revealing whether they have uninstalled the app. I suppose it could be worth it if, when the message remains undelivered for a while, Signal added an explicit note to that effect so the sender doesn't misunderstand.


Yes, exactly this. All that's needed is to tell senders when a message wasn't received after X hours.

You don't have to figure out if the user uninstalled. This also happens if they get a new phone and don't re-install on it, so relying on uninstalls wouldn't work anyway.
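
A minimal sketch of that rule, with no claim about how Signal stores messages: `SentMessage`, `stale_undelivered`, and the 24-hour default are hypothetical. The only input is whether a delivery receipt ever came back, so nothing about uninstalls needs to be known.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class SentMessage:
    text: str
    sent_at: datetime
    delivered_at: Optional[datetime] = None  # None = no delivery receipt yet


def stale_undelivered(messages, now, threshold=timedelta(hours=24)):
    """Return messages that have gone undelivered for at least `threshold`,
    i.e. the ones worth a loud notice to the sender."""
    return [m for m in messages
            if m.delivered_at is None and now - m.sent_at >= threshold]
```

The client could run this on a timer and surface a notification for each result, without saying anything about why delivery failed.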


How can they tell that a user has uninstalled the app? Does uninstalling send a notification to signal.org?


Uninstalling doesn't send a notification to signal.org. I've previously messaged a few people without getting a response, later realizing they never got it because they switched phones and stopped using Signal without pressing the "Delete Account" button in Signal settings. The workaround is to have the user install and register again, then press delete.

https://support.signal.org/hc/en-us/articles/360007061192-De...

> Signal must be actively working on your phone to make changes to the account. Register to see these options for your number. Deletion requests are not accepted outside of the registered app because there is no way to accurately verify whether or not a number is truly associated with the requester.


Yes, I expected as much: most users who stop using Signal (because, say, their friends use something else) are more likely to either just stop using it or uninstall the app, without explicitly deleting the account.


The FCM system they use to deliver notifications will report the delivery ID as no longer valid after an uninstall, though.


I dunno. It's true they might not even have that information.


Exactly, I messaged someone multiple times and didn't get a response. I assumed they weren't interested in hanging out any more.

I found out many months later, when we ran into each other by chance, that they don't use Signal anymore and my messages had gone into a black hole.


Another pain point for me: when I send an SMS to someone, I expect to get replies on SMS, not on Signal. Don't try to replace SMS. It's just really annoying to have half the conversation in the text messages app and the other half in the Signal app.


This is exclusive to the iOS version. Apple won't let Signal handle the SMS.

On Android it easily replaces the messages app, and you do all messaging, SMS and Signal, in one chat.

Complain to Apple. Not to Signal.


This is linked from the help page here: https://support.google.com/a/answer/60217#zippy=%2Cwhat-if-i...

> we understand some customers may not use their G Suite legacy free edition for business and may be interested in other options. If you have 10 or fewer users in your group and do not use your G Suite legacy free edition for business, please complete the form below by April 1, 2022 if you're interested in learning about different options for your account in the coming months.

Seems promising.


Now the link you provided says "May 1, 2022".


In January 2023, Chrome will stop running any manifest v2 extensions. I suspect that's about 99.9% of all extensions. This is going to be a huge loss.

