Hacker News | robertk's comments

Why not just open it inside a fully sandboxed Docker container and print it to a static image?


(Hi, disclaimer: I'm one of the current dangerzone maintainers)

You are correct: that's basically what Dangerzone is doing!

The challenges for us are to have a sandbox that stays secure and to make it possible for non-tech folks (e.g. journalists) to run this on their machines easily.

About the sandbox:

- Keeping it up to date requires some work: testing new container images, and having a way to distribute them securely to the host machines;

- In addition to running in a container, we reduce the attack surface by using gVisor¹;

- We pass a few flags to the Docker/Podman invocation, effectively blocking network access and reducing the authorized system calls.

Also, in our case the sandbox doesn't mount the host filesystem in any way, and we're streaming back pixels, which are then written to a PDF by the host (we're also currently considering adding the option to write back images instead).
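To make the flags point concrete, here is a minimal sketch of what such an invocation could look like. It is not Dangerzone's actual command line: the helper name `build_sandbox_argv`, the image name, and the seccomp profile path are all hypothetical, and only standard `docker run` options are used.

```python
import shlex


def build_sandbox_argv(image, command, seccomp_profile=None):
    """Build a `docker run` argv for a locked-down, mount-free container.

    Illustrative only: the image, command, and profile path are assumptions.
    """
    argv = [
        "docker", "run", "--rm",
        "-i",                                # keep stdin open so data can be piped in
        "--network=none",                    # no network access inside the sandbox
        "--cap-drop=ALL",                    # drop all Linux capabilities
        "--security-opt=no-new-privileges",  # block privilege escalation via setuid
        "--read-only",                       # read-only root filesystem
    ]
    if seccomp_profile is not None:
        # A custom seccomp profile narrows the set of allowed system calls.
        argv.append(f"--security-opt=seccomp={seccomp_profile}")
    # Note: no -v/--volume/--mount flags, so nothing from the host is mounted.
    argv.append(image)
    argv.extend(command)
    return argv


if __name__ == "__main__":
    # Hypothetical image and command names, for display only.
    argv = build_sandbox_argv(
        "example/pdf-to-pixels:latest",
        ["convert"],
        seccomp_profile="seccomp.json",
    )
    print(shlex.join(argv))
```

The document would go in over stdin and the pixels come back over stdout, so the host never has to share a directory with the container.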

The other part of the work is to make that easily accessible to non-tech folks. That means packaging Podman on macOS/Windows, and providing an interface that works on all major OSes.

¹ https://dangerzone.rocks/news/2024-09-23-gvisor/


Why not upload to Google Docs and view there? Way less work.


You might not want this file, or the fact that you are in possession of it, to be known to law enforcement.


Certainly, but that's what, like .0001% of PDFs people encounter?


Yep. A static image would be better, although I'd also prefer the option of getting a simple text file so that I can get the URLs out of hyperlinks.


Why not leak a dataset of N full text paraphrasings of the material, together with a zero-knowledge proof of how to take one of the paraphrasings and specifically "adjust" it to the real document (revealed in private to trusted asking parties)? Then the leaker can prove they released "at least the one true leak" without incriminating themselves. There is a cryptographic solution to this issue.


It’s slightly biased (P(even) = 0.5702; bias = +0.0702, about 7 percentage points toward heads). You can use this Claude Code prompt to determine how much:

Use your web search tool call. Fetch a list of English words and find their incident frequency in common text (as a proxy for likelihood of someone knowing or thinking of the word on the fly). Take all words 10 characters or longer. Consider their parity (even number of letters or odd). What is the likelihood a coin comes up heads if and only if a word is even when sampled by incidence rate? You can compute this by grouping even and odd words, and summing up their respective incident rates in numerator and denominator. Report back how biased away this is from 0.5. Then do the same for words at least 9 characters to avoid “even start bias” given slight Zipf distribution statistics by word length. Average the two for a “fair sample” of the bias. Then run a bootstrap estimator with random choice of “at least N chars” (8 <= N <= 15) and random subsets of the dictionary (say 50% of words or whatever makes statistical sense). Report back the estimate of the bias with confidence interval (multiple bootstrap methods). How biased is this method from exactly random bits (0.5 prob heads/tails) at various confidence intervals?
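For anyone who wants to see the shape of that calculation without running the prompt, here is a rough sketch. The word list and frequencies in `TOY_FREQS` are made up for illustration (not real corpus data), so the numbers it prints say nothing about the 0.5702 figure above. It also uses random 50% subsets with a percentile interval, which is closer to subsampling than a textbook bootstrap with replacement.

```python
import random

# Hypothetical toy frequencies (arbitrary units); NOT real corpus data.
TOY_FREQS = {
    "question": 350.0, "business": 300.0, "something": 500.0,
    "important": 420.0, "community": 210.0, "government": 400.0,
    "understand": 120.0, "information": 310.0, "development": 260.0,
    "relationship": 95.0, "international": 140.0,
    "responsibility": 40.0, "characteristic": 18.0, "characteristics": 25.0,
}


def p_even(freqs, min_len):
    """Frequency-weighted P(word length is even) among words of length >= min_len."""
    total = sum(f for w, f in freqs.items() if len(w) >= min_len)
    if total == 0:
        raise ValueError("no words at or above min_len")
    even = sum(f for w, f in freqs.items() if len(w) >= min_len and len(w) % 2 == 0)
    return even / total


def bootstrap_bias(freqs, n_rounds=2000, subset_frac=0.5, seed=0):
    """Estimate bias = P(even) - 0.5 over random cutoffs (8..15) and random subsets.

    Returns (mean, lower, upper), where lower/upper are the 2.5th and 97.5th
    percentiles of the resampled biases.
    """
    rng = random.Random(seed)
    words = sorted(freqs)
    k = max(1, int(len(words) * subset_frac))
    biases = []
    for _ in range(n_rounds):
        min_len = rng.randint(8, 15)
        subset = {w: freqs[w] for w in rng.sample(words, k)}
        try:
            biases.append(p_even(subset, min_len) - 0.5)
        except ValueError:
            continue  # this subset had no words long enough; skip the round
    biases.sort()
    n = len(biases)
    lower = biases[int(0.025 * n)]
    upper = biases[min(n - 1, int(0.975 * n))]
    return sum(biases) / n, lower, upper


if __name__ == "__main__":
    print("P(even), >=10 chars:", p_even(TOY_FREQS, 10))
    print("P(even), >=9 chars: ", p_even(TOY_FREQS, 9))
    print("bias (mean, 2.5%, 97.5%):", bootstrap_bias(TOY_FREQS))
```

With a real frequency list swapped in for `TOY_FREQS`, the first two lines correspond to the "at least 10" and "at least 9" steps of the prompt, and the last to the resampled interval.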


“Slightly fringe”


Heh

Arthur C. Clarke once suggested that some supernovae could be industrial accidents. A curiously romantic idea, and one I rather like!


I am sorry for your loss, Aella. I sobbed with you.

“Each passing minute is a greater percentage of the final minutes we have,” and yet “these [final] seconds are so soft”.

Death needs to die, some future dying day, not yet.

from everyone who’s had a mom, we join you: “Momma, I love you”.



Yes these look perfect! Thank you.


The Apple paper does not look at its own data: the model outputs become short past some thresholds because the models reflectively realize they do not have the context to respond in the steps as requested, and suggest a Python program instead, just as a human would. One of the penalized environments is proven in the literature to be impossible to solve for n>6, seemingly unbeknownst to the authors. I consider this, and more, the definitive rebuttal of the sloppiness of the paper: https://www.alignmentforum.org/posts/5uw26uDdFbFQgKzih/bewar...


If I read a comment that has any probability of changing my mind about a fact or opinion, I always go to the user page to check their registration date. No hard cut-off date, but I usually discount or ignore any account registered in 2020 or later.


Sure, but what about false positives? What about real accounts newer than that? This is a workaround but not a good solution.


That's a sacrifice I'm willing to make, personally.


wait if they make a good point that has changed your mind, you discount it if you don’t like the source?

so you prefer authority of the messenger over merit of the message?


In some cases, yes: if their argument is based on their own personal experience and it turns out that experience isn't true.


you can buy old accounts for like $3


Shawn, there is a mildly redacted version available at https://huggingface.co/datasets/monology/pile-uncopyrighted


Thank you.


No, it doesn’t. This concerns a corporation subject to legitimate national security concerns, not “a person, or a group of people.”


an American corporation does in fact have some recognized legal personhood, and so a 'bill of attainder' could technically be found to exist within a legislative act which violates the legal rights of one.

https://en.wikipedia.org/wiki/Corporate_personhood#In_the_Un...

