In the US, subpoenas come from the Justice Department (either state or federal depending on the crime for which evidence is being sought). The court that issued the subpoena is named on it, and the person or entity being served has the right to see why some government agency felt it could aid in uncovering a crime that had already been committed. The person or entity then has the opportunity to challenge the subpoena in court prior to complying with it. This is sometimes informally called "quashing the subpoena." According to my sister-in-law, who is a defense attorney, the most common result of challenging a subpoena is to get what it asks for narrowed down to just what is plausibly responsive.
In the article, this response: "As a result we are currently developing new data retention and disclosure policies. These policies will relate to our procedures for future government data requests, how and for what duration we store personally identifiable information such as user access records, and policies that make these explicit for our users and community." is good practice for limiting what a subpoena can request (you can't give what you don't have).
At Blekko we logged access records in such a way that we could use PII for 48 hours, and then it was deleted. The CTO, Greg Lindahl, is a huge privacy advocate, and this sort of architecture made it possible to get information to improve our ranking and service without compromising people's privacy. In practice I don't think any agency could go from "we have a suspect" to "issue a subpoena" in 48 hrs, so it was a useful way for us to stay out of the crosshairs. The most interesting event was the FBI asking for information on IP addresses that had accessed their honeypot CSAM site. Those turned out to be some of the machines in the crawling cluster. Given that the site was outside the crawl "horizon" and didn't rank (very few sites linked to it), it didn't even make it into the cache for rank analysis. But in that case the turnaround time was impressive. Of course, that is because they were just using their own logs to generate subpoena requests.
As I recall (and I'm not a lawyer, so don't rely on this advice), the lawyers had advised that as long as the retention period was published, even if a subpoena asked for a longer lookback, you could meet your obligation by returning "all the data you had," which would only be 48 hrs' worth.
Had a jurisdiction said, "You should have expected ..." I expect our response would have been, "We have published what we retain, we conform to federal and state laws, and you knew ahead of time we wouldn't have more than 48 hrs' worth."
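To illustrate the idea (this is not Blekko's actual code, just a minimal sketch with hypothetical names), a 48-hour PII window can be enforced by scrubbing the identifying field from any record older than the published retention period. The `AccessRecord` type, the choice to keep the non-PII part of the record, and the `respond_to_request` helper are all assumptions made for the example:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional

# Assumed published retention window for PII.
RETENTION = timedelta(hours=48)


@dataclass
class AccessRecord:
    timestamp: datetime
    query: str           # non-PII, kept for ranking analysis (assumption)
    ip: Optional[str]    # PII; set to None once scrubbed


def scrub_expired(records: List[AccessRecord], now: datetime) -> int:
    """Remove PII from records older than RETENTION; return how many were scrubbed."""
    cutoff = now - RETENTION
    scrubbed = 0
    for record in records:
        if record.timestamp < cutoff and record.ip is not None:
            record.ip = None
            scrubbed += 1
    return scrubbed


def respond_to_request(records: List[AccessRecord], ip: str, now: datetime) -> List[AccessRecord]:
    """Return every record still tied to `ip`, i.e. "all the data you have"."""
    scrub_expired(records, now)
    return [record for record in records if record.ip == ip]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    log = [
        AccessRecord(now - timedelta(hours=72), "old query", "203.0.113.7"),
        AccessRecord(now - timedelta(hours=1), "recent query", "203.0.113.7"),
    ]
    for record in respond_to_request(log, "203.0.113.7", now):
        print(record.timestamp.isoformat(), record.query)
```

In a sketch like this, a request with a longer lookback simply finds nothing attributable past the window, because the identifying field no longer exists for older records. A real system would also have to purge backups and any derived copies, which this example does not attempt.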
That said, jurisdiction when it comes to the Internet is always kind of "weird". Did you use the web service in your house in Columbus, OH, or did you use the web service on a server in a data center in California? As I recall, our TOS also had a requirement that any legal action be brought in California, but I don't think we ever tested that in court.