
I think that you misunderstood nullc's question. They are asking what happens if at some point one of the poly1305 MACs in the file is incorrect. Not if someone truncates the file.


We're talking about the same thing.

I saw that it used a streaming AEAD, but that's actually what inspired my question.

Since (from the github page) it reads stdin, it can't two-pass the file.

So it appears that if you hand it a file with midstream corruption it's going to feed a truncated input down your pipeline.

That has consequences. They may well be less serious consequences than buffering a potentially unlimited amount of data in memory :), but it's useful to make the behavior very clear because it wouldn't require too advanced an idiot to make something that was exploitable on this basis.


It's Rogaway's STREAM scheme from https://eprint.iacr.org/2015/189.pdf. Are you pointing out a problem in the paper, or in some specific idiosyncrasy you see of how it's implemented here? If so: what is it?

The AGL post the spec links to directly talks more generally about the high-level strategy: you're buffering chunks of files. You're only ever releasing authenticated plaintext. If you're piping to something processing plaintext on-line, that thing might need to wait for the end-of-file signal before processing or else potentially operate on a truncated file (by some integral number of chunks). `age` is still just a Unix program.


My question was asking to confirm that it indeed will put out a truncated output when given a mid-stream corrupted input (and that it doesn't do something like buffer just to validate).

That behavior should be clearly documented, so that users can be advised that their pipelines need to safely handle that case.

> that thing might need to wait for the end-of-file signal before processing or else potentially operate on a truncated file

Exactly. The docs should say this clearly, or someone will manage to create an interesting vulnerability with it eventually. :)

Could go with a message that points out that encryption doesn't authenticate the source-- which is a not uncommon misuse that shows up with PGP, where people assume that the source is authentic if the input was encrypted, even where no signature is used. (the fact that corrupted input gives an "authentication failed" message might be particularly misleading)


It's streaming on-line encryption. That's literally the point of streaming encryption: not buffering whole messages. The rest of your point directly follows from "not buffering whole messages".


Indeed. And the readme and the usage output make no mention of streaming, buffering, on-line, authentication, or anything related.

This is a potential security relevant behavior that most users-- who haven't written or analyzed tools like this-- would find surprising.

For those following along, I went and tested it-- since the behavior wasn't documented or clear from the code. If it encounters midstream corruption it truncates the output, exits with a non-zero return and prints some error text to stderr: "Error: chacha20poly1305: message authentication failed\n[ Did age not do what you expected? Could an error be more useful? Tell us: https://filippo.io/age/report ]"

If the input is truncated, it either does that-- or if the truncation is on a block boundary it prints "Error: unexpected EOF\n[ Did age not do what you expected? Could an error be more useful? Tell us: https://filippo.io/age/report ]" instead.

It's not a problem, but it should be documented.


This is a security footgun and a vulnerability waiting to happen, but bash is at fault, not age. age does the best it can do (while maintaining O(1) memory requirement) by exiting non-zero, but the shell swallows that if it's in the middle of a pipeline.
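To make the swallowed exit status concrete, here is a minimal bash sketch. `fake_decrypt` is a hypothetical stand-in for a streaming decryptor (it is not the real age CLI): it emits one good chunk, then fails the way such a tool would on a bad MAC.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for a streaming decryptor, not the real age CLI:
# it emits one verified chunk, then fails as if a later chunk's MAC was bad.
fake_decrypt() {
  printf 'chunk-1\n'
  echo 'Error: message authentication failed' >&2
  return 1
}

# Default behavior: the pipeline's status is that of the last command (cat),
# so the decryptor's failure is invisible to the caller.
fake_decrypt 2>/dev/null | cat >/dev/null
default_status=$?

# With pipefail, a failing stage anywhere in the pipeline makes it non-zero.
set -o pipefail
fake_decrypt 2>/dev/null | cat >/dev/null
pipefail_status=$?
set +o pipefail

echo "default=$default_status pipefail=$pipefail_status"
# prints: default=0 pipefail=1
```

Note that in both cases the downstream command has already consumed `chunk-1` by the time the failure is reported; the exit status only tells the caller after the fact.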


IMHO it's not that bad. It's actually quite usable, and reasonably easy to handle safely.

Use

  # bash only; unfortunately pipefail is not POSIX
  set -eu -o pipefail
and some care when writing scripts. Possibly decrypt to a file first.
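A rough sketch of the "decrypt to a file first" option, assuming bash. `fake_decrypt` and `safe_decrypt` are hypothetical names made up for this illustration; `fake_decrypt` is a stand-in for a streaming decryptor (not the real age CLI) that fails when it reads a line saying CORRUPT.

```bash
#!/usr/bin/env bash
set -eu -o pipefail

# Hypothetical stand-in for a streaming decryptor (not the real age CLI):
# copies stdin to stdout line by line and fails on a line reading CORRUPT.
fake_decrypt() {
  local line
  while IFS= read -r line; do
    if [ "$line" = CORRUPT ]; then
      echo 'Error: message authentication failed' >&2
      return 1
    fi
    printf '%s\n' "$line"
  done
}

# Decrypt into a temp file and release it only if the decryptor exited zero.
# Nothing downstream ever sees a truncated plaintext.
safe_decrypt() {
  local tmp
  tmp=$(mktemp)
  if fake_decrypt >"$tmp"; then
    cat "$tmp"
    rm -f "$tmp"
  else
    rm -f "$tmp"
    return 1
  fi
}

good=$(printf 'line1\nline2\n' | safe_decrypt)
bad_status=0
bad=$(printf 'line1\nCORRUPT\nline2\n' | safe_decrypt 2>/dev/null) || bad_status=$?

echo "good=[$good]"
echo "bad=[$bad] bad_status=$bad_status"
```

The trade-off is the one discussed upthread: the whole plaintext is buffered (here on disk) before anything is released, so the streaming property is given up in exchange for all-or-nothing output.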

A proper and likely footgun would be decrypting and passing tainted plaintext and only then exiting nonzero. E.g.

  decrypt < file | sh  # owned
Definitely should be documented either way.


I agree with all of what you said.

The footgun you described can still happen if there's a verification error somewhere in the middle. You could still conceivably craft exploits using only truncation of the plaintext, depending on the situation.

No one should "decrypt < file | sh" (or anything | sh without verifying), but they will. Doesn't matter if we have POSIX or non-POSIX shell flags that can fix it, the defaults are bad.

There's nothing tools like age can do about that, though.

Edit: I was thinking more along the lines of

    if decrypt < file | postprocess > tempfile
    then
        sh tempfile
    fi
where postprocess exits zero. This is where the default shell behavior fails. The "decrypt < file | sh" antipattern is something not even the shell can do anything about.


> No one should "decrypt < file | sh" (or anything | sh without verifying)

I was thinking of self-prepared scripts, tooling or owner-controlled distribution. Decrypt+good signature is precisely what I want.

Anyway, as nmadden pointed out, age does not provide source authentication duh. AFAIU that means, all the streaming semantics and blockwise AEAD are practically useless, unless you are using the password encryption, which is helpfully blocked from automation.


> The "decrypt < file | sh" antipattern is something not even the shell can do anything about.

It could refuse to accept input from stdin if it's not a terminal.


> If it encounters midstream corruption it truncates the output, exits with a non-zero return and prints some error text std stderr:

Do you mean it releases output even if the encrypted file is corrupt or tampered with? Isn't this one of the issues in e-fail?


I think the problem with e-fail was that gpg would output data before verifying it. Age will only output chunks of data after they've been verified.


The fact that this is the point of streaming encryption does not preclude the usefulness of pointing it out explicitly. It eliminates a reasoning step by spelling it out, which is always a good thing for critical things, IMO.


Serious question: If you're not signing (age does not*), then what is the point of the AEAD STREAM scheme? By definition, nothing is authenticated, right?


Consider this attack.

You found a vulnerability in FooSmith and want to collect a bounty. You're keeping the vuln secret both for security reasons, and so no one else can jump your claim.

FooSmith has announced a bounty process where you can claim a bounty by sending an encrypted message with a novel vulnerability according to a specified process.

So you send a report using the mandatory bounty collection form, which starts off with a fixed position field "Bitcoin address to pay bounty to: <address goes here>".

I happen to know what address you're going to use since you posted it so everyone could see when you got paid. I happen to have write access to FooSmith's issue tracker. I xor youraddress xor myaddress into the stream at the right position, and tada thanks to the fragility of stream ciphers, esp unauthenticated ones: it decrypts to a different message that asks for the payout to my address.
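The xor trick can be shown with a toy in bash. This is only an illustration under made-up assumptions: a fixed hex string stands in for a real stream cipher's keystream, the "addresses" are placeholder strings of equal length, and there is no MAC. The helper names are hypothetical.

```bash
#!/usr/bin/env bash
# Toy illustration only: a fixed hex keystream stands in for a real stream
# cipher's output, and there is no MAC. All values here are made up.

# XOR two equal-length hex strings, byte by byte.
xor_hex() {
  local a=$1 b=$2 out='' i
  for ((i = 0; i < ${#a}; i += 2)); do
    out+=$(printf '%02x' $(( 0x${a:i:2} ^ 0x${b:i:2} )))
  done
  printf '%s' "$out"
}

# Convert ASCII text to hex and back.
to_hex() { printf '%s' "$1" | od -An -v -tx1 | tr -d ' \n'; }
from_hex() {
  local h=$1 i
  for ((i = 0; i < ${#h}; i += 2)); do printf "\\x${h:i:2}"; done
}

keystream=3a91c4e07b125fd8     # secret: the attacker never learns this
yours=$(to_hex 'addrAAAA')     # the address the attacker knows you used
mine=$(to_hex 'addrBBBB')      # the attacker's replacement, same length

ciphertext=$(xor_hex "$yours" "$keystream")   # what you sent
delta=$(xor_hex "$yours" "$mine")             # computed without the key
tampered=$(xor_hex "$ciphertext" "$delta")    # the attacker's edit in transit
decrypted=$(from_hex "$(xor_hex "$tampered" "$keystream")")

echo "$decrypted"
# prints: addrBBBB
```

The attacker only needs the known plaintext and its position, never the key. A per-chunk MAC is what makes the receiver reject `tampered` instead of decrypting it.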

Adding a digital signature to the encrypted message wouldn't have magically made it secure: I would just rip that one off and replace it with my own-- FooSmith can't authenticate a signature here, the authentication is "common membership inside an encrypted message", and without authentication that can't work securely.

There are other attacks when the encryption lacks auth. Imagine you run a network service that accepts encrypted messages and decrypts them then reports back various distinct result messages based on what the input decrypted to.

I have an encrypted message for your service authored by someone else and I'd like to learn about its content. Without auth I could start sending it to you over and over again, flipping bits in it to learn about the content. In some cases, when the planets align just right, this kind of bug lets you use the service as a decryption oracle-- you can get the entire encrypted message!

(Toy example: if the service reports the input in an error message, simply corrupting the first bit might instantly get you the content. But it can be much more complex and subtle than that.)

This isn't to say that you couldn't build a security protocol that didn't use authed encryption... you can, but without auth the encryption doesn't form a nice abstracted layer and much more of the application has to be analyzed from the perspective of cryptographic attacks. History has shown people fail to do this well, so authed encryption should almost always be used unless there is a really good reason why it can't be.


'nullc and I are talking about the same thing.



