The PRISM Details Matter

jholman · on June 11, 2013

Please take all my upvotes. This is the first useful article I've seen about PRISM, with the _possible_ exception of the scoop itself, but not excluding the existing analysis from the Guardian and WaPo.

For $20M, you can't do what Prism claims to do for $20M. Even suppose you limited your scope only to Google, AND now you limit only to GMail, AND your development and installation costs are free, and all you're spending on is bandwidth and servers and cooling and maintenance staff, I still doubt you can read all of GMail for $20M/year.

Then there's the bit where they claim that analysts, what's the phrasing, "quite literally can watch your ideas form as you type". Seriously, think for a second what it would mean to be able to do that. Think of how that would work, in terms of network protocols. Now go install wireshark and open a gmail window and start composing an email, or writing half of a chat (without hitting Enter). I haven't done the experiment; maybe you can be the first and break a scoop!

Obviously the people who reported on PRISM have told some lies about what it does. Either it can't do what they say it does, or it cost a lot more than $20M, or both.

(I don't want to minimize the seriousness of the accusations... they should be given due appraisal. But they basically cannot be literally true as written.)

And look. Google gets about 100 "requests" (legal demands) from government per working day, and responds (under legal duress) about 30 per working day in the US alone. You think maybe some kind of workflow management software might be appropriate here, or you're imagining this all goes back and forth over email? Or maybe you're imagining an anonymous FTP server somewhere? Obviously the right thing to do is build a secure custom ticket-tracking system, to manage demands, push-back, authorization, fulfillment, etc. If we want to believe it's all okay, here's a plausible story. Maybe it's true, maybe it's not.

guelo · on June 11, 2013

About the "quite literally can watch your ideas form as you type" quote, it's not clear that it's referring to PRISM. It came at the end of the long WaPo article and comes across to me as something Snowden might have said about NSA's capabilities in general, not necessarily PRISM.

  Firsthand experience with these systems, and horror at their capabilities, 
  is what drove a career intelligence officer to provide PowerPoint slides 
  about PRISM and supporting materials to The Washington Post in order to 
  expose what he believes to be a gross intrusion on privacy. “They quite 
  literally can watch your ideas form as you type,” the officer said.

If the NSA is snooping at the line level, which we know since the AT&T closet revelation a decade ago that they do, then they could see real-time typing in apps like Google Instant.

chm · on June 11, 2013

Agreed, but his point still stands that either:

1) Their budget is an order of magnitude higher; 2) PRISM is just an efficient legal data collection framework, an API that companies are forced to comply with

markjaquith · on June 11, 2013

If they can snoop SSL'd content, that would be a bombshell. Many of the implicated companies use SSL for most of their properties. FB is all over SSL, for example.

For stuff in FB, FB could be pushing that target-specific data to collection points in real-ish time. Instead of doing daily dumps, you just provide a targeted stream of data.

lern_too_spel · on June 11, 2013

He was a sysadmin for three months. He never claimed to have firsthand knowledge of this tool. He found the slides on the intranet and jumped to conclusions.

uptown · on June 11, 2013

"...and jumped to conclusions"

You seem to know all about that.

guelo · on June 11, 2013

Read the first five words of the WaPo paragraph I quoted again.

lern_too_spel · on June 12, 2013

Where did Snowden himself claim firsthand experience? Nowhere. That sentence is the author's conjecture. It's shoddy journalism.

This is exactly the same as how the slides saying that the NSA collects data directly from the companies was misinterpreted by WaPo to mean direct access to all the companies' user data. I suspect Snowden jumped to the same conclusion when he came across those slides while sysadminning, and that's why he's in the mess he's in now.

thrownaway2424 · on June 11, 2013

I agree, this is one of the few useful analyses of the PRISM leaks that makes much sense. One begins to feel that the Snowden person is a relatively low-level employee of government contractors, with a predisposition to the EFF/EPIC/ACLU end of the spectrum, who came across some slide decks, misinterpreted them, and constructed some far-fetched conclusions therefrom, and wrapped himself in the middle. Greenwald probably didn't do him any favors, because he has a history of grabbing any barely-true story and detonating it (see: dozens of wrong things written by him in regards to the Plame thing.)

Under rational scrutiny the PRISM story has fallen to pieces. It doesn't make any sense that all of the high-level executives and hundreds of thousands of top engineers have no idea what is happening, while some guy from Booz Allen and a blogger are the only people with the truth.

foxylad · on June 11, 2013

I also agree that the article poses useful questions that need answers. However, in this case I think it is extremely important that we get real answers, and don't allow ourselves to be swayed by ad-hominem or other specious arguments. Greenwald may be a flake, and it is one unknown's word against that of many public and powerful people... but lets not let that stop us demanding those answers.

These allegations are very serious and if by any remote chance they were true, those powerful people would be busy trying to make Greenwald look like a flake and Snowden like a confused tech incompetent.

fexl · on June 11, 2013

This article discusses how high-level executives and many top engineers might have no idea what is happening:

https://financialcryptography.com/mt/archives/001431.html

Excerpt: "How is this apparent contradiction possible? It is generally done via secret arrangements not with the company, but with the employees. The company does not provide back-door access, but the people do. The trick is to place people with excellent tech skills and dual loyalties into strategic locations in the company. These 'assets' will then execute the work required in secret, and spare the company and most all of their workmates the embarrassment. ..."

In a discussion of this article (on a cryptography list) I observed this incredulous response: "Hmm. So what does that mean a team of ex-military/intelligence security people work there way up or get assistance with contacts and references, replace all the key people in a companies inner security department and start coding up backdoors, APIs and allowing VPN access to it? All without telling anyone or getting noticed by ops people etc."

To which the other party retorted: "Been there. They are noticed, but you get orders from on high to shut up and not notice."

If that's all true, then it sounds like only a very few engineers and managers acting as moles will have specific knowledge of the program. A few non-mole engineers will sense that something's afoot, but they'll stay mum. Maybe that's as far as it goes.

thrownaway2424 · on June 11, 2013

Well that's great, but it's also stupid. Companies like Google and Facebook have hundreds of high-level engineers staring at all levels of their system all day long, trying to find out where their microseconds have gone. And these people are responsible for umpteen billions of dollars in capital expenditures every year, and responsible for capacity planning and so forth. The theory espoused at the link you posted requires that all of these people are either not smart enough to notice that an external entity is using their resources, or that these people, who I would point out are largely not Americans, are in on the conspiracy, or, finally, that the NSA is capable of pulling off their surveillance without having any detectable impact on production CPU, memory, storage, and networking.

These are highly implausible scenarios.

cpleppert · on June 11, 2013

It is also, as far as I know, illegal to hack or disrupt computer networks in that way (even for the NSA). If they had warrants giving them access to information they wanted it would have been overkill to do something like that, risk getting caught, and now have to go to the company for intelligence cooperation in the information.

dnautics · on June 11, 2013

and if the NSA should break the law that makes it illegal to hack networks in this way, what would be its punishment? Do we send the NSA to jail? Do we fine the NSA? Who do you think pays the for the lawsuit (punishment, legal team, etc) when the NSA breaks the law?

As for the agent: two words - qualified immunity

fexl · on June 11, 2013

I had the same thought as you, which is why I posted the skeptical response "... All without telling anyone or getting noticed by ops people etc."

To anyone who notices, there's always "shut up and not notice." But there's also "oh that rsync you see there is for our geographically redundant backup facility" or whatever -- in other words, dissembling.

zero_intp · on June 11, 2013

As an engineer for a tier 1, I can affirm that such requests seem to enter the company laterally at the VP level.

betterunix · on June 11, 2013

"For $20M, you can't do what Prism claims to do for $20M."

You seem to be assuming that the equipment budget is tied into that. I may have missed something in the details of this program, but I would guess that the NSA's equipment budget is separate. If I had to guess, I would say that the NSA has many software systems that share supercomputer resources (did you think that big new datacenter in Utah was just for storage?); PRISM is one of these, but probably not the only one.

I am guessing that $20M is the price of developing, maintaining, and improving the PRISM software only. The NSA's main problem is not hardware, it is algorithmic -- they are processing a lot of data, and they need software that can scale well and give good results. $20M is a big budget to pay top computer scientists to solve algorithmic problems.

mtgx · on June 11, 2013

If PRISM is just the front-end/interface, then it wouldn't cost too much. I doubt PRISM refers to all the servers they had to buy for example.

eightyone · on June 11, 2013

It doesn't seem like $20-million would even cover the cost of what the government is claiming PRISM actually is, so I think you're right.

montagg · on June 11, 2013

HN titlebar flash: PRISM is just a weekend project.

flyinRyan · on June 11, 2013

Perhaps they have, shall we say "outside" sources of funding like the CIA is famous for.

waxjar · on June 11, 2013

GMail does send what you've typed to their servers and stores it as a concept so you don't lose your work in case your browser or computer crashes. I seriously doubt the NSA has access to that stuff, but technically it would be more or less possible ;)

o0-0o · on June 11, 2013

When I read the slides, the $20M USD price sounds like what the US Gov was SELLING PRISM access for. Likely to contractors and governments I suppose. Only speculation, but yeah.

SanderMak · on June 11, 2013

May be the $20M is the projected cost (or amount paid to, or contribution required, or...) for each 'participant'?

mpyne · on June 11, 2013

I'll note (again) that the NSA slides could indeed be talking about direct from the central servers of the company without actually having direct access to the central servers of the company.

This is also not a pedantic point. It's almost literally like adding a getter method to a class instead of giving direct access to a member variable. E.g, for range-checking.

  class GoogleCentralServers
  {
    vector<CompanyApprovedFISA> m_fisas;
    vector<PersonalData>        m_persons;

    // ...
    PersonalData PRISM_get(int fisa_id) const
    {
      if(fisa_id < 0 || fisa_id >= m_fisas.size())
        throw GFY::NSA{};
      else
        return m_persons[m_fisas[fisa_id].person_id];
    }
  };

In our 20¢ implementation of PRISM we can see that the data is still coming directly from the personal data store, but that the NSA does not actually have direct access to that data; it has to go through a company-provided intermediary that checks that the request is valid and only then proceeds to obtain the data and return it.

In fact I'm pretty convinced this is the most probable explanation of all the different testimony and evidence provided to this point. You could later add provisions to do real-time updating of the data in the lockbox, etc. etc., but that wouldn't change the core of the system.

cpleppert · on June 11, 2013

The slide says "collection directly from the servers." If you look at the context it is comparing traditional SIGINT wiretapping with PRISM. Now it seems clear (as you suggested) that this refers not to the method(direct access) but rather to the provenance(where does it come from, directly not indirectly from wiretaps) of the data.

I'm not sure what to make of this article though: http://www.washingtonpost.com/world/national-security/us-com... Everything in this article from the Wash Post directly contradicts what the government and the companies are claiming. It really is real time, direct access to data without any mediation from company staff at all.

The article seems credible as it comes from multiple intelligence sources and executives(!) at the company. I don't really expect executives to lie in a way would raise suspicions about their company like that. If PRISM was really just a dropbox executives would be very happy to say so(in private at least so reporters could calm people down)

It is a lot easier to imagine intelligence sources saying it is just a dropbox than executives at companies admitting they installed a back door for the government.

Right now, I'm undecided.

mpyne · on June 11, 2013

What would make sense to me is that the company is able to "mediate" the FISA warrant, not the access request itself. Kind of like the TLS CA architecture; once you go and "trust" the root CA you would automatically trust requests that were signed by that CA.

Presumably they could ensure that their automated mediator limits the data collection only to things approved by the FISA warrant, in this case there would be no need to keep manually double-checking as the computer would do that for the company.

Likewise the automated system could continue to update the data in the "drop box". Think long-polling techniques or push notifications to the NSA analyst's machine.

Terretta · on June 11, 2013

Thank you. Assuming it's true that email address in, data out, then to the agent getting the info, and to the user whose info is gotten, the layers involved are irrelevant.

jack-r-abbit · on June 11, 2013

The layers are relevant if one of those layers checks that the request is valid and legal. Assuming it's true that email address in, [access check layer], data out, then we have nothing to worry about.

mpyne · on June 11, 2013

Well, we still have to worry that only the NSA has access to the layer. But even with ongoing crypto breaks I trust computers more than I trust people (in a world of spear phishing) to do the right thing.

jack-r-abbit · on June 11, 2013

Being worried about unauthorized access to your server is not a new worry and is not limited to this topic. So if that is our only worry, then this whole thing is a non-issue.

wyck · on June 11, 2013

There is something odd about the PRISM leak, only 5 slides have been published, yet Snowden wanted the full 41 slides released.

Both Glenn Greenwald and Barton Gellman seem to have no desire to publish the rest, Greenwald had this to say about them ..."we’re not publishing NSA tech methods”.

At this point the "tech methods" might be the only truth to this enigmatic subject.

Greenwald is between a rock and a hard place, on one hand the rest of the slides might be too sensitive to publish, and on the other hand none of what has been published makes much sense.

eightyone · on June 11, 2013

I found these tweets a couple days ago from Glenn Greenwald. Click to see the full context, of course.

"The highly detailed NSA technical materials on how they eavesdrop, that I have & am not publishing." [1]

"You won't - it contains very specific technical NSA means for collection - we'd probably be prosecuted if we did." [2]

"There are all sorts of secrets I have I'm not publishing - it's public interest v. harm analysis at the heart of journalism." [3]

[1] https://twitter.com/ggreenwald/status/343470800784982016

[2] https://twitter.com/ggreenwald/status/343454484917280770

[3] https://twitter.com/ggreenwald/status/343453682387542016

jack-r-abbit · on June 11, 2013

This. THIS. This a hundred times over. If I could give this a thousand upvotes I would. It actually makes sense to have a place to put the stuff that these agencies might be requesting from the companies. What else are they going to do? mail a CD? email a zip file? Wouldn't it make sense to just have a spot that is secure and give the various agencies access and then copy over anything they've requested (and granted access to) for them to collect at their leisure. An FTP site. A Dropbox-like thing.

But sadly, this one will get lost in the sea of panic.

Terretta · on June 11, 2013

If you can submit a query by email address and get a Google Takeout or Facebook graph style data dump back, what does it really matter how many pieces of equipment or API layers are between the request and the database?

Your Facebook profile page itself doesn't have "direct access" to Facebook's "central" servers, it goes through several application layers, not to mention several intermediary server and networking layers. So what?

I haven't seen any dispute of "email address > data dump", and as far as I'm concerned, arguing about the directness of the word direct (when even front end vs back end is hazy on HN) is just distraction.

This article is just hand waving. If an agent who wants info can put in an email address and get out user info, it's a distinction without a difference.

DanielBMarkham · on June 11, 2013

I liked this, and it brings up an important point: details matter.

I'd be careful, though. After TIA was destroyed a few years back, all of these ideas just got new codenames and continued on. My point being that NSA runs a huge bucket of programs that all get stuff from all over. Yes, for Google or FB, we can speculate that maybe there's a ticket system. Likely, even.

But that's missing the point. As Snowden himself said, he could get anything he wanted to know about people simply by placing the right intercepts. Want to see what they type? Go for the telco records. Want to get all their personal details? Fetch that from the credit bureaus. Sometimes you might have to send a real, live person to attach a physical monitoring device. The pathways here are myriad.

This isn't some single system that somehow is massively accessing every scrap of data. It looks like an asynchronous system that interacts with a platoon of other systems, already developed, to provide a total view of anybody you'd like. In some cases, auto-generated FISA requrests or NSLs are probably created. In some others, not.

There's a danger here for we libertarians to go around like their head is on fire. That's the way it's always been. But I'd be careful with oversimplifying the actual nature of the discussion also.

Secrecy really sucks, because all we end up doing is speculating about what might be created and what might be happening. Meh.

pserwylo · on June 11, 2013

As with other commenters here, I completely agree.

The NSA/Verizon issue has been conflated with the PRISM issue, because they were both published very close together and involve the NSA and broad scale data snooping.

I am horrified about the NSA/Verizon thing - because there is a court order that explains exactly what is going on, and it doesn't require too much technical knowledge to understand how bad it actually is.

The PRISM thing will scare me if it turns out to have "direct" and "unilateral" access to the servers of these companies. However, I doubted this from the beginning. Although I agree very strongly with this article, even they don't go far enough. The term "the companies central servers" keeps coming up, but even that is not a good representation of how the companies would store data of interest to the NSA.

Do the Washington Post journalists think that you just turn on the server like you do your desktop, open a folder called "John Smith", and then see a text document with all of their chat messages? Most software developers would have enough trouble trying to figure out exactly how the DB and Application layer of a piece of software works together in the first place - some decisions that DB designers or application developers (including myself) are horrifyingly baffling. And each company would have had engineers who made equally baffling, but extremely different design decisions with regards to data storage.

Then there is the number of servers involved. Would somebody at the NSA say "Hmm, I want to get details on John Smith, so I'll just log into Facebook's central servers and pull the details up." - there would be thousands of servers involved, and plenty of duplication and de-normalization. But still, what servers have details on John Smith? There probably isn't even somebody at Facebook who could tell you, let alone somebody accessing a supposed system like PRISM.

As the original article states, I think that it is more likely a system to deal with warrants and getting info from these companies through the typical channels that most of us are probably familiar with.

Now, if there is another AT&T-esque issue where the government is tapping into fibre to store data in communication, possible for future decryption, then that is a whole nother issue. As a previous HNer mentioned (can't find the reference now) we all "know" that the government is probably doing that, but none of us KNOW that they are, until we see a court order like that from the Snowden leak.

wavesounds · on June 11, 2013

Mr. Snowden said he could read any citizen, even the presidents, email if he new the address[1]. Theres a number of ways this could be accomplished, hopefully the technically details of which will come out eventually, but theres no question he saying its being accomplished.

http://www.guardian.co.uk/world/video/2013/jun/09/nsa-whistl...

brown9-2 · on June 11, 2013

However there is a question of how much exaggeration he is using.

For example, technically a police officer could shoot anyone if they wish. The only thing stopping them is society's laws.

That the NSA has the capability to ask any company for an email address's data is different from any employee having the ability to do that peeking without repercussions.

jchung · on June 11, 2013

The details matter if you want to discuss whether what is happening now is appropriate or not. In fact, it's precisely half of what is required. The other half is an opinion on what ought to happen. Can we finally move the discussion toward what policies ought to be in place? If we skip over that discussion in our rage, the whole activity will have been for naught.

einhverfr · on June 11, 2013

I think Medium goes off the tracks just a little here. I want to point out a couple of obvious things.

1. Prism is not necessarily the only way they get data. There is nothing ruling out having an expedited system of getting requests sent, and having some sort of access.

2. "Directly from the servers of...." can only be reasonably read one way. There is no reasonable reading of the slide which puts it in line with the NYT story.

I think it is therefore necessary to note three possibilities:

1. The NYT story is wrong.

2. The slide author was wrong.

3. NSA uses both these methods.

markjaquith · on June 11, 2013

“the servers” could be read many ways. Doesn't necessarily mean direct access to all their data. Doesn't mean unilateral access as alleged. The slides we have don't give enough detail to allege direct and unilateral access. Especially in light of companies, NSA, NYT sources all denying this.

einhverfr · on June 12, 2013

"Directly from" means direct access, "the servers of" means the servers controlled by. That's the only plausible reading. The thing is it doesn't necessarily mean "all of the servers of" however indirect or mediated access would be contrary to the slide's plain language, and if these were mediating servers controlled by the other company that would be quite misleading to the NSA employee audience.

handymanne · on June 11, 2013

This story seems pretty carefully thought out in its analysis. I wonder if some of the claims in it are true. That would be somewhat interesting, although I'm still stuck on the more important issues of the PRISM revelation (the ease of the potential for criminal spying on anyone, especially Americans, the lies told by American companies to Americans and foreigners, and the damage caused to those companies and their clients by this forced compliance). We've already dropped Google's services. It looks pretty clear that the original articles on PRISM were technically inaccurate (wrong) in some relevant (if not important) ways but I'm less inclined than most here to look for the most charitable interpretation of the words of those who have just been shown to be lying to my face so vociferously. Reality probably does not include an NSA that can access anyone's servers but we've known for years that, here in San Francisco, there is an NSA room to which feeds for some of these sites are duplicated. Many seem to believe that this must be for ease of returning search results to the feds but isn't a simpler explanation that it houses a mirror of commercial servers (so that they could continue to legitimately claim no access to company servers and produce no load issues for engineers to find)? Many here seem to believe that the access could be real time. They read "watch your ideas as you type" as something magical but isn't a simpler explanation that they meant what most near-lay-people mean when they say that... that it has capability like google's autocomplete (which also lends credence to the mirror idea)?

brown9-2 · on June 11, 2013

I think it's useful on this topic to consider things from the NSA's perspective. If you were them, would you rather have:

- direct access to any of Google's backend servers, and the need to decode GFS and BigTable bits to reconstruct a user's data, times 9 companies?

- an API provided by the companies that gives you the data in a common format when requested?

One is much harder for the government than the other.

utunga · on June 12, 2013

This is bang on. The tech details matter - a lot. Thought I'd add to the discussion by linking/quoting from this piece from wired UK http://www.wired.co.uk/news/archive/2013-06/11/prism-powerpo...

"To us geeks, the tech methods, of course, are the most interesting part. Snowden evidently wanted the entire presentation published, at least at one point in his discussions with the Post's Gellman. On 24 May, "Snowden asked for a guarantee that the Washington Post would publish -- within 72 hours -- the full text of [the] PowerPoint presentation," Gellman wrote on 9 June, in a fascinating account of his interactions with the whistleblower. "I told him we would not make any guarantee about what we published or when."

skwirl · on June 11, 2013

I am glad a post like this reached the front page. There have been many aspects of this story that are odd to me and should be odd to anyone here once they move past the visceral outrage of being told the NSA has direct access to your e-mails and Facebook account. Obviously other people realize this, but I haven't seen many comments about it (and I have to blame myself partly for that).

I find it odd that the NSA has managed to basically gain admin access to the databases of Facebook, Google, Skype, Microsoft, etc. without anyone working for these tech companies coming forward. I know some people have proposed theories as to why, but I haven't found them to be very convincing.

I also find the denials from all of these companies puzzling if the NSA slides are accurate. While some people have tried to read non denials out of the denials, these readings have seemed contrived as several of them seem pretty air tight to me when I read them (Google's is a good example).

The $20 million annual price tag is also laughable, especially to those who have a feel for how much large technical projects cost (and this one would be gigantic).

If we found out that PRISM was simply a software system for companies to comply with FISA warrants that they were already complying with, with those companies pushing data related to individually identified persons and without supplying any more data than they already were or are legally required to, I would not be overly surprised. And this entire thing would be a non story. The details definitely matter.

What really keeps me from settling on that idea entirely is that I don't think that could be true without Snowden having some kind of malicious intent (or being very mistaken about the nature of the systems he had access to). He seems like a true believer civil libertarian, and does not seem evil or stupid.

If this program is really nothing special, than the government should have no problem declassifying more information about it. But even if they do, will we believe them?

Mordor · on June 11, 2013

Well, in the 'if you've got nothing to hide' vein, I think it would be appropriate for an international team of UN cyber inspectors to walk into the NSA and all related company data centers for an unrestricted access all areas fact finding mission.

tvwonline · on June 11, 2013

I agree the details matter!

Each company would need to build an API for them, which would also need to be maintained. Imagine a dev updating some code and finding out he broke the build in the secret US govt. API? Ok that is a bit flippant, but the practicality of accessing the data is an issue, and would have involved lots of devs, managers etc which I would think increases the chances of leaks from with in each Internet company. There have been none, even now.

Just a thought.

Actually, hold that thought. Is it weird that no one is saying "I work at company x and I did or didn't see this".

jack-r-abbit · on June 11, 2013

It doesn't even have to be an API. How about just a secure server that only contains the stuff they can have. Or some software that they use as a request/response queue. They log in and get it.

eranation · on June 11, 2013

Very good point. My question exactly.