PACER Deleting Old Cases; Time to Fix PACER

engined · on Aug 26, 2014

PACER is definitely interesting, a bit antiquated, and to date, the data has mostly resided in the hands of the big information companies (Lexis, Westlaw, etc.).

I've been building a system/website to access, search and develop intelligent analytics from PACER court information. We're tracking cases, attorneys, parties, judges, as well as the actual case dockets. The data is a treasure trove of information, and if anyone's interested, I'd be very happy to chat more about it.

The site (a signup for now as I'm working out the kinks in the system) is www.docketleads.com. Email me there or ping me here for more info.

declan · on Aug 26, 2014

I worked on a similar project a decade ago written mostly in Perl with the frontend in PHP (hey, it was 2004, folks!). Just checked and I still even have the old courtbot.com domain I registered for the project.

I suspect you'll find pretty quickly that there's a limit to how far regular expressions or similar techniques can take you if you want to normalize and reference precedents and make sense of cases. That's why Lexis and Westlaw pay actual attorneys considerable sums to summarize cases, and why they can still command such princely subscription fees even in 2014. But analytics might be interesting. A family member is a judge, and her judicial office keeps track of how many cases she decides per month, how many reversals she receives, etc. I don't know if those are made public -- certainly I'm not aware of any project to do it across a large data set, and I wish you luck with it.
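
To make the regex point concrete, here is a rough Python sketch, not code from any real project. The pattern, the `extract_citations` helper, and the sample text (including both case names and citations) are all invented for illustration. It picks up full citations of the "volume reporter page" form but has no idea what a short-form reference like "id. at 120" points back to, which is roughly where pattern matching runs out and a human reader takes over.

```python
import re

# Hypothetical pattern for full citations such as "123 U.S. 456" or
# "321 F.3d 654". It covers only a handful of reporters; this is an
# assumption for the sketch, not a complete list.
CITATION_RE = re.compile(
    r"\b(?P<volume>\d{1,4})\s+"
    r"(?P<reporter>U\.S\.|S\.\s?Ct\.|F\.(?:2d|3d)?|F\.\s?Supp\.(?:\s?2d)?)\s+"
    r"(?P<page>\d{1,5})\b"
)


def extract_citations(text):
    """Return (volume, reporter, page) tuples for each full citation found."""
    return [
        (int(m["volume"]), m["reporter"], int(m["page"]))
        for m in CITATION_RE.finditer(text)
    ]


# Made-up sample text: two full citations plus one short-form reference.
sample = (
    "See Doe v. Roe, 123 U.S. 456 (1999); id. at 460. "
    "Cf. Alpha v. Beta, 321 F.3d 654, 660 (9th Cir. 2003)."
)

citations = extract_citations(sample)
for volume, reporter, page in citations:
    print(volume, reporter, page)
```

The two full citations come back as tuples, but "id. at 460" and the pin cite "660" are silently dropped. Resolving those means tracking context across sentences, and that is before you get to whether a case was followed, distinguished, or overruled.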

engined · on Aug 26, 2014

You're definitely right about case/precedent information, but what we've found is that there's a whole other world of info that can be neatly organized with a lot of crunching, and a small bit of manual manipulation.

The big guys chasing this are highly focused almost entirely on lawyers, in the context of providing them case analysis tools. We've found a bit of a different niche which doesn't need as much fidelity/granularity to the information, but needs it nonetheless.

In any case, I'd love to chat about your experience, even if a decade old. Can I PM you?

declan · on Aug 26, 2014

Sure, happy to chat! What you're doing seems interesting, especially if you're not targeting the lawyer/case research market. My email address is in my HN profile. Though I am working nonstop on http://recent.io/ right now. :)

r00fus · on Aug 26, 2014

I wonder if Recap [1] would help in addressing the censorship/deletion issue. Ultimately, the way we fund these programs is the root of the problem (and the privatization of what is supposed to be public data).

[1] https://www.recapthelaw.org

engined · on Aug 26, 2014

RECAP hasn't been nearly as active as its initiators had hoped. The data there is pretty good however, and for a handful of key cases, I would say it's very good. The biggest issue with the data is that it's spotty. Since it relies on individuals to pull info on each case, some cases may only have partial information (not all the parties, attorneys, etc. represented), or not have the full docket available (and rarely, if ever, all of the documents associated with a case).

anseljh · on Aug 26, 2014

RECAP recently found a new home with the Free Law Project, so hopefully things will improve. http://freelawproject.org/2014/05/19/our-recap-partnership-w...

cjbprime · on Aug 26, 2014

Recap only helps here if everyone accesses/pays for all of the files that are about to be deleted, and doing so would surely cost millions or billions.

rayiner · on Aug 26, 2014

I don't get the bit about "privatization." PACER is run by the judiciary.

Natsu · on Aug 26, 2014

I assume they're confused by the part where it's run essentially for-profit by the government.

DannyBee · on Aug 26, 2014

The judiciary sees it as a profit center. Folks have offered to essentially buy the data and make it entirely public. But they see too much profit from it.

rayiner · on Aug 26, 2014

Of course nobody is "profiting." There are no shareholders getting dividends or execs getting bonuses. They use it to fund the operations of the judiciary in the face of a shortage of funding from Congress.

DannyBee · on Aug 28, 2014

Except, of course, that PACER has various requirements that conflict with this, and they make things as hard as possible in order to keep this profit (which was 150 million in 2008) up.

For example, written opinions that "set forth a reasoned explanation for a court's decision" must be free of charge.

They make it as difficult as possible to access these, and do not allow any sort of bulk download, because doing so would make PACER/courtweb less useful as a pay service.

thrownaway2424 · on Aug 26, 2014

It would cost Google negligible money to host this data and the only people who would be upset would be the rent-seeking jerks responsible for the current PACER debacle.

And EDGAR after that.

RubberSoul · on Aug 26, 2014

What do you dislike about EDGAR? I find EDGAR to be pretty good. It's easy to search and completely free.

amha · on Aug 26, 2014

Un-fucking-believable. PACER has always been awful (I've used it since about 2005), but this is a new low---this is ACTIVE awfulness.

I assume, based on the weird specificity of what they're removing, that the PACER office is doing this at the request of the individual courts. Which just sort of underscores how awful this is---that courts get to decide how public their own opinions are.

thinkcomp · on Aug 26, 2014

Not so. The AO forced the courts to do it according to two people at the Second Circuit.

The most likely explanation is that as part of the "upgrade" of CM/ECF (the write component of PACER) they needed to jettison old databases that used a different schema. This is of course nonsense. They've likely spent over $100 million on this upgrade since 2007, though actual numbers are surprisingly hard to come by. For that price they could have probably afforded a few coders to convert the older databases over.

toomuchtodo · on Aug 26, 2014

Is there any way to obtain this old data through a FOIA? Or do those requests not apply to US courts?

declan · on Aug 26, 2014

Alas, FOIA applies only to federal government agencies that are in the executive branch. It doesn't apply to federal courts or the U.S. Congress.

toomuchtodo · on Aug 26, 2014

Well that's depressing.

declan · on Aug 26, 2014

Yep. Though Congress could liberate all of PACER, retrospective and prospective, if it chose -- one data dump to Carl Malamud would do it. The appropriations bills are wending their way through the legislative process right now (mostly out of committee), and that might be a vehicle to add a one-line amendment. Would require a lot of work in the next month or two.

toomuchtodo · on Aug 26, 2014

Who do I call or whose door do I bang on?

declan · on Aug 28, 2014

I'd point you in the direction of Carl Malamud, Jim Harper at the Cato Institute, and EFF, probably in that order. Jim's made it a project to liberate government data; Carl's gone further and made it his life's work.

Inside Congress itself? Hmm. I'm spending my time working on http://recent.io/ and not paying close attention nowadays. But if you're local to the SF south bay try Rep. Lofgren? I've done some Q&As with her and found she's one of the smarter and better-informed members of Congress on tech policy issues.

thinkcomp · on Aug 26, 2014

Not really. You could try but I doubt it would get you anywhere.

This needs Congressional attention. Of course, Congress is on vacation. Not that it would matter.

Still, contact your representatives.

oneweirdtrick · on Aug 26, 2014

The day that PACER gets fixed is the day judges stop using WordPerfect.

MWil · on Aug 26, 2014

I'll say it again: Bonkers!