His comments are actually the most insightful points I've seen in the discussion about PRISM:
I have my own suspicions -- which I won't go into here -- about what PRISM was actually about. I'll just say that there are ways to intercept people's Google, Facebook, etc., traffic in bulk without sticking any moles into the org -- or directly tapping their lines. You may find some interesting hints in the leaked PRISM slides [1], especially the second and fourth ones shown there. The subtleties of phrasing are important, I suspect, not because they were trying to be elliptical but because they reveal what was obvious to the people who were giving that presentation.
And like I said, I have both some reason to believe that there aren't such devices inside Google, and that the PRISM slides are actually talking about a somewhat different kind of data collection -- one that's done from outside the companies.
Beam splitters (prisms?) inside the backbone providers. All traffic goes to its destination unharmed, but the NSA gets all the packets. SSL is harder, but all you need is the private keys. Those are hard to get but not impossible for someone with the resources of the government. This is the only scalable way to do what they are supposed to be doing without involving lots of outsiders. Note that the people who have really clammed up in all this over the past few days are the telecoms.
The linked Quora answer (from the co-author of Firesheep) says that even in that case one can't decrypt passively captured traffic if perfect forward secrecy is used.
Google.com uses ephemeral Diffie-Hellman key exchange, which provides perfect forward secrecy.
So... If I understand everything correctly, it should be impossible to decrypt passively captured HTTPS traffic to/from google.com.
We are not disagreeing with anything he says. We are saying the NSA has the private key.
Ian Gallagher: > so if you have the private key, you can decrypt that key, and then use it to decrypt the bulk-encrypted data.
Like he says, if you have the key, Wireshark can decrypt the data trivially.
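To make the "if you have the key" point concrete: with the plain RSA key-exchange ciphersuites, the client sends the pre-master secret encrypted under the server's RSA public key, so anyone who recorded the traffic and later obtains the private key can unwrap it and derive the session keys. A rough sketch of that step in Python (using the `cryptography` package; this is just the shape of the problem, not real TLS code):

```python
# Sketch: why static-RSA key exchange fails against a key-holding eavesdropper.
# NOT real TLS -- real TLS wraps the pre-master secret with PKCS#1 v1.5, which
# is what we use here, but the message framing is omitted.
import os
from cryptography.hazmat.primitives.asymmetric import rsa, padding

# The server's long-term RSA key pair (the one behind its certificate).
server_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

# Client side: pick a random pre-master secret, encrypt it to the server.
pre_master = os.urandom(48)
wrapped = server_key.public_key().encrypt(pre_master, padding.PKCS1v15())
# 'wrapped' is what a beam splitter would see on the wire.

# Eavesdropper side: with a copy of the private key, recover the pre-master
# secret from the recorded bytes, then derive the same session keys.
recovered = server_key.decrypt(wrapped, padding.PKCS1v15())
assert recovered == pre_master
```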
If you have the CA master keys, then the only thing you can do is perform a MITM attack, but not silently decrypt the raw data. A MITM attack would eventually get detected.
Ephemeral Diffie-Hellman creates a new key per connection in a public-safe manner. You cannot eavesdrop on such a connection, even if you have the signing key. The question then becomes: are the SSL sessions actually using that mode?
> Ephemeral Diffie-Hellman creates a new key per connection in a public-safe manner.
How does that make a difference when you have the Diffie-Hellman key? We are saying they have the Diffie-Hellman keys, not the signing keys, nor the block cipher key that is exchanged. They have the only key that matters.
How are they getting the DH keys without cooperation from at least one of the SSL endpoints involved? They're newly generated at every SSL handshake, you can't just get a mole to hand you the keys once and be done with it. If you had the certificate private key, you could do a MITM, but this requires a LOT more resources and would be much more easily detectable.
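For contrast with the static-RSA case sketched above, here's roughly what the ephemeral exchange looks like (again only a sketch with the `cryptography` package, not actual TLS): the key pairs are generated fresh for each handshake and discarded, the certificate key only signs the server's half, and nothing the server keeps long-term will decrypt a recording after the fact.

```python
# Sketch: ephemeral (elliptic-curve) Diffie-Hellman, fresh keys per handshake.
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey

def handshake():
    # Both sides generate brand-new ephemeral keys for this connection only.
    client_eph = X25519PrivateKey.generate()
    server_eph = X25519PrivateKey.generate()

    # Only the *public* halves cross the wire (the server signs its half with
    # its certificate key so the client knows it's the real server).
    client_pub = client_eph.public_key()
    server_pub = server_eph.public_key()

    # Each side combines its own secret with the other's public value.
    shared_client = client_eph.exchange(server_pub)
    shared_server = server_eph.exchange(client_pub)
    assert shared_client == shared_server
    return shared_client

# Two connections, two unrelated session secrets; a passive tap sees only the
# public values, and the certificate's signing key never enters the math.
assert handshake() != handshake()
```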
But I am still lost on how it would be detectable. From Google's end, some client just disconnected. From the client's end, the internet just gained a tiny bit of latency.
If you had Google's certificate private key, you could pretend to be Google. It's undetectable from the user's perspective. I think we should trust Google to keep their private keys safe, although it would help a lot if they published in general terms how they accomplish this.
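Worth separating two cases: a MITM with a mis-issued certificate (the CA-key scenario above) is detectable if the client pins the certificate it expects; a MITM done with Google's actual private key serves the genuine certificate, so pinning won't catch it. A minimal pinning sketch (the fingerprint below is a placeholder, not Google's real one):

```python
# Sketch: detecting a certificate swap by pinning the expected fingerprint.
import hashlib
import socket
import ssl

EXPECTED_SHA256 = "00" * 32  # placeholder pin, not a real fingerprint

def cert_fingerprint(host, port=443):
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port)) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            der = tls.getpeercert(binary_form=True)
    return hashlib.sha256(der).hexdigest()

seen = cert_fingerprint("mail.google.com")
if seen != EXPECTED_SHA256:
    print("certificate changed -- could be routine rotation, could be a MITM:", seen)
```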
The signing key for Gmail's certificate is a 1024-bit RSA key. That key size is simply not safe against an attacker like the NSA today, so we may as well assume they have the private key even if Google didn't voluntarily give it to them.
But while the signing key may allow them to impersonate Google in some circumstances, it doesn't really help decrypting passively recorded TLS traffic to the real Google. For that, they would need to break the ECDH key exchange, and if Google uses reasonable elliptic curve parameters, that's presumably much harder than factoring a 1024-bit RSA modulus, at least with known cryptanalytic techniques.
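For anyone who wants to check these claims against a live connection, here's a rough sketch that reports the certificate's key size and whether the negotiated ciphersuite gives forward secrecy (assumes the `cryptography` package is installed; the output is only as meaningful as whichever frontend server you happen to hit):

```python
# Sketch: check the server certificate's key size and whether the negotiated
# ciphersuite provides forward secrecy (ECDHE/DHE, or TLS 1.3 which always does).
import socket
import ssl
from cryptography import x509

def inspect(host, port=443):
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port)) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            cipher_name, tls_version, _bits = tls.cipher()
            cert = x509.load_der_x509_certificate(tls.getpeercert(binary_form=True))
    key_bits = cert.public_key().key_size
    pfs = "ECDHE" in cipher_name or "DHE" in cipher_name or tls_version == "TLSv1.3"
    print(host, tls_version, cipher_name, f"{key_bits}-bit cert key",
          "forward secret" if pfs else "NOT forward secret")

inspect("mail.google.com")
```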
"I think we should trust Google to keep their private keys safe, although it would help a lot if the published in general terms how they accomplish this."
Really, I would think it would be easy for the NSA, etc. to get an operative inside Google, FB, etc. and steal these. Intelligence organizations are very good at this, after all.
>How are they getting the DH keys without cooperation from at least one of the SSL endpoints involved?
One possibility is to actually compute discrete logarithms.
Does anyone know what elliptic curve parameters Gmail uses for key exchange? If the parameters are large, it is not feasible to break discrete logs using known methods, but while I'm usually wary of claims that the NSA is miles ahead of the academic research community, I could perhaps believe they have faster algorithms for e.g. some NIST curves.
If you hold the theory that the traffic is being intercepted and the parties have compromised TLS keys, the test for complicity is obvious: failing to rotate out the TLS keys, failing to deploy HSTS, and failing to switch to EDH ciphersuites everywhere.
These are all moderately 'cheap' steps if you believe you're being compromised in this manner.
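The HSTS part, at least, is trivially observable from the outside (ciphersuites can be checked the same way as in the connection sketch earlier). A quick check:

```python
# Sketch: check whether a site sends an HSTS header on its HTTPS responses.
from urllib.request import urlopen

with urlopen("https://accounts.google.com/") as resp:
    hsts = resp.headers.get("Strict-Transport-Security")
print("HSTS:", hsts if hsts else "not set")
```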
I was being a bit devious. I am just so tired of tptacek dismissing stuff I say with asinine arguments and then watching my comment get downvoted to hell.
Ok, so they split and copy all the packets. Is nobody else concerned with the complexity of tagging, filtering, rebuilding, and contextualizing that volume of packet data?
Beam splitters are not enough, they would need something to interpret this traffic.
Then you just process basic metadata: size, source IP, destination IP, timing, and statistical analysis of the binary. Assuming they have ways of converting an IP to an identity, that information alone would be hugely revealing. In fact, basic metadata is what they have admitted to recording.
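Most of that metadata falls out of a capture with off-the-shelf tools and no decryption at all; a rough sketch with scapy (the pcap path is made up):

```python
# Sketch: the kind of per-packet metadata that falls out of a capture
# without any decryption. Requires scapy and a capture file.
from scapy.all import rdpcap, IP, TCP

for pkt in rdpcap("capture.pcap"):
    if IP in pkt and TCP in pkt:
        print(pkt.time,                       # timing
              pkt[IP].src, pkt[IP].dst,       # who talked to whom
              pkt[TCP].sport, pkt[TCP].dport, # which services
              len(pkt))                       # size
```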
That's why, IMHO, secret sharing[1] algorithms are so important.
We store only parity data in one data center, and the rest in another on a different continent. Any data intercepted or lost does not damage the integrity of the whole, and it also means an ISP cannot discriminate based on the raw binary data.
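The simplest version of that idea is a two-way XOR split: one data center holds pure randomness, the other holds the parity, and either share alone is indistinguishable from noise. A toy sketch (real deployments would want something like Shamir's scheme for k-of-n thresholds):

```python
# Sketch: 2-of-2 XOR secret splitting -- either share alone reveals nothing.
import secrets

def split(data: bytes):
    share_a = secrets.token_bytes(len(data))               # pure randomness, DC #1
    share_b = bytes(a ^ b for a, b in zip(share_a, data))  # "parity", DC #2
    return share_a, share_b

def combine(share_a: bytes, share_b: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(share_a, share_b))

a, b = split(b"user mail goes here")
assert combine(a, b) == b"user mail goes here"
```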
Here's a much simpler explanation: The Feds submit a FISA order for specific data collection. The companies' lawyers approve it. Then the NSA has a convenient user interface for accessing that data (perhaps real-time?) somehow from the companies' servers (possibly through an intermediary). How else is this data being sent to Ft. Meade? Thumb drives via FedEx?
The dates on the slides might be when a company has erected some convenient access point to grab the data "lawfully" obtained by a FISA order. Microsoft whipped something together quickly. Apple took years to get the UX just right.
Frankly, sucking in ALL of the Internet seems extremely difficult and useless. We're talking GOOG+AAPL+MS+YHOO+Skype+many more. And all for $20M/year? The gov't spends more on toilet paper.
These are my thoughts exactly. I cannot imagine any hugely sophisticated data collection infrastructure costing a mere $20M a year.
More likely this is software written to take in structured data obtained by subpoena -- as it's generated by targeted users. This "ultimate user data liberation" API may have even been the system at Google that was attacked by the Chinese: http://www.washingtonpost.com/world/national-security/chines...
It's possible PRISM is a separate program that's accounted for differently. We've had pretty good evidence for a while now that the government does have firehose type capability in at least some locations.
For only $20M/yr (which is nothing by government standards), I could definitely see that being a roadmap for building the user friendly endpoint to obtain the relatively small number of legally obtained records from each provider.
A target's phone call, e-mail or chat will take the cheapest path, not the physically most direct path - you can't always predict the path
Dates When PRISM Collection Began For Each Provider
This is complete conjecture, but this reads to me like the NSA set up its own backhauls and set up peering agreements at artificially low prices to get traffic going over their pipes. Is there historical data for route announcements available anywhere? There are a lot of specific dates that could confirm/disprove this.
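Historical announcement data does exist -- RouteViews and RIPE RIS have been archiving BGP tables and updates for years. A quick sketch of pulling a prefix's history from the RIPEstat API (I'm assuming the `routing-history` data call and response shape here; check their docs if it has changed):

```python
# Sketch: pull historical routing data for a prefix from RIPEstat.
# Assumes the "routing-history" data call exists at this URL and that the
# response carries a top-level "data" object -- verify against RIPEstat docs.
import requests

resp = requests.get(
    "https://stat.ripe.net/data/routing-history/data.json",
    params={"resource": "8.8.8.0/24"},  # prefix of interest; pick your own
    timeout=30,
)
resp.raise_for_status()
print(resp.json().get("data"))
```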
This is expounded upon in some news articles: in particular, that data is routed not by the most geographically direct route but by whichever route is cheapest in dollar terms.
It's a very important tidbit of information that gets to the heart of the matter -- the NSA is a backbone operator[1]. It's a tall claim to make; however, their involvement in internet exchange points (and other regional network access points) would be far more difficult to conceal.
Acquire access to major routers, send routing commands, route target traffic into their hidden networks, and avoid physically wiretapping anything. And this can be done globally:
Getting your target's communications [...] flowing into and through the U.S. is as easy as announcing BGP route advertisements globally.
Actually, the Great Firewall of China started investigating realtime, fine-grained control of national routing infrastructure as early as 2003.[1] This allows them to apply routing policies to routers nationwide in seconds. One observable effect is that a single address can be null-routed immediately if blocking it with TCP resets fails. It is believed that the recent HTTPS MITM of GitHub was also aided by this routing framework. And the GFW is viewed by the Chinese government as a national security framework. No wonder the USG is doing the same thing.
[1]: Liu, G., Yun, X., Fang, B., Hu, M. 2003. A control method for large-scale network based on routing diffusion. Journal of China Institute of Communications: 10.
There are two features in the slides that need to be accounted for in any explanation.
1. The second slide, which implies that the fact that internet traffic passes through the US is relevant.
This implies either that some physical interception was necessary (whether from outside the companies, or from the inside with their cooperation), or that it was legally necessary (in order to require the companies to deliver the information). And yet the issue of what the NSA could require US companies to reveal would seem orthogonal to which traffic passes through the US, since in that case what matters would be what is stored on US servers.
2. The implied cooperation from the companies in the slides. According to the Washington Post, the "Special Source Operations" in the logo refers to "the NSA term for alliances with trusted U.S. companies". Given this, and the use of the term "providers", it seems unlikely that no cooperation from the companies is involved.
From the above, I would propose that the PRISM scheme must involve some combination of physical interception, and cooperation from the companies involved.
For example, the NSA may be intercepting traffic then asking the companies for private keys for some of the traffic they intercept. This would fit the above facts in that it requires the traffic to pass through the US, and also requires cooperation from the companies.
I wonder if he's hinting at spying on XMPP -- a lot of the companies implicated use this protocol, and the PowerPoint's timeline aligns with many of these companies' mobile growth as well as their XMPP usage -- also, this might explain why Google is suddenly dropping XMPP support (rather than just being "closed").
Any ideas what he could be thinking?
1. http://www.washingtonpost.com/wp-srv/special/politics/prism-...