Lessons I learned building an MPs' expenses crowdsourcing app for the Guardian (simonwillison.net)
57 points by simonw on Dec 20, 2009 | 32 comments


It's nice to see someone from a big public-facing operation straightforwardly describing the mistakes made during a tech project. I learn a lot more about real-world development from reading about the hacks and errors than I do from the much more common "We used this technology because it gave us this cool feature" style of writeup. It's also more dramatic.


I seriously have improved my programming more from reading little asides from folks on their blogs than I have from all the conferences I've been sent to. (Recently, an HN comment caused me to replace ImageMagick with GraphicsMagick. Cut 50% of my request times for less than 5 minutes of work -- 2 lines of code had to change. Where have you been all my life?)

Yay for sharing information.


Here we were, seeing headlines about some MPs who claimed a gas bill twice, or a second home when they didn't really need a second home. It's peanuts. It's irrelevant. It's small fry. Obviously if you make rules fuzzy and allow for second homes, people will push those rules up to the limit. Just like anyone else would.

Whilst at the same time, bailing out banks for BILLIONS.

I guess it's easier to sell papers with a headline about an MP's husband claiming expenses for a £10 adult movie, than it is to investigate a multi-billion bank giveaway.

If the crowd-sourcing finds a receipt for a gimp outfit you can be 100% sure that'll be front page news. But why should anyone care? Is there a risk that MPs may be spending too much on gimp outfits? Could they all start claiming gimp outfits and we'd have no money to pay the NHS bills? No. Exactly.


I don't see how these two things are even slightly related. Other than the Lloyds business, what was not above board about the bank bailout? What laws were broken? How were any of the people at the Bank of England who recommended and actually implemented the bailout personally enriched?

Note that according to the latest Budget, the actual cost of the bank bailout when the dust settles is expected to be 10Bn (5% of the welfare budget) and if Alastair Darling can be convinced not to wreck RBS, we (the taxpayer) might even make a profit on it. We own 84% of RBS, that's where the money went, into acquiring stuff, not into a black hole. Not that we'll see any of it mind, but that too is beside the point.

If you want to conflate anything, MPs expenses + "dodgy dossier" is the way to go.


The point is, the Bank bailout was spending taxpayers' money without our consent.

Claiming a few blue movies on the taxpayer is the same, but hardly in the same league in terms of amount of money spent.

I'd just rather read about real news - that matters. Not a few grand some MPs have claimed on expenses that they maybe shouldn't have but it's a grey area.

What is the total money claimed by all MPs, that shouldn't have been claimed by MPs? 6 figures, maybe early 7 figures? Oh well, that'll keep the NHS running for another few hours - the scale of this just seems minuscule.

It's like finding out someone in the ship's bar has stolen £10 from petty cash while the Titanic sinks.

</rant>


If you believe the Clinton impeachment was about perjury in a court of law by the highest law enforcement officer in the nation, it was arguably worse than all of the above.

But based on what I read of the first batch of (leaked) expense disclosures, they paint a pattern nearly as bad as Clinton's, much worse than "claiming a few blue movies", one of an irredeemably corrupt ruling class.

Granted, this is corruption at a "retail" personal level, but it's hard to imagine these same people aren't as corrupt when they're acting in their capacity as MPs and ... how do you say it? ... members of the government/ministers/front benchers/whatever.


Yes, exactly. While the corruption may be "retail," the fact or suspicion calls into question their ability to conduct more substantial business in the people's interest.


The Bank of England has always been the "lender of last resort". That's integral to fractional reserve banking. That ship has sailed my friend :-)

Now personally, I would have no problem with a few banks failing. There are 700-odd in the City, what's one or two? Wiping out existing shareholders was the right way to do it tho'.


>The point is, the Bank bailout was spending taxpayers' money without our consent.

Tax money is always spent without our consent in this sense. I don't recall the government asking my personal permission the last time they built a hospital or maintained a road. But they are elected and authorized to spend that money.


Interesting article, and kudos to the Guardian, one of the only worthwhile papers left in the UK, for funding such a venture.


The Guardian's pretty well positioned for this kind of thing. We're owned by the Scott Trust which keeps us free from undue influence by shareholders and actively encourages long-term thinking about the future of journalism. We also have a very strong technology department - the site runs on an in-house custom CMS (Spring/Hibernate/Velocity) and we have plenty of flexibility to try out new things.

It's a really fun place to work.


>We're owned by the Scott Trust which keeps us free from undue influence by shareholders and actively encourages long-term thinking about the future of journalism.

In the interest of full disclosure, you should also point out that the Graun has a near monopoly on public sector job ads, so is effectively State-subsidized.

(Back in the late 90s I worked on your old StoryServer implementation... What a bunch of arse that was).


"the Graun has a near monopoly on public sector job ads, so is effectively State-subsidized."

Cobblers. (And a lazy Conservative Party talking point to boot.)

The TES and Times Higher have more education jobs, and local papers get a far higher local government spend in aggregate. And in the Guardian that's only Tuesdays and Wednesdays (having bought them before I can tell you that the Saturday job supplement is merely a repeat and I never paid more for it).

In fact the vast majority of advertised public sector jobs are not advertised in the Guardian. Under what bizarre definition of 'monopoly' would this fall?


I don't know anything about the specifics of public advertising in the British newspaper market, but I will say that a substantial ad buy two days a week could be a significant subsidy to the paper as a whole. And how much you paid for the Saturday supplement is irrelevant: what matters is what the advertisers paid for that reprint.


Sorry, I wasn't clear. In my experience, the advertisers don't pay for the reprint at all.


Show me a BBC ad in the Daily Mail, then.


Right after you show me any media job ad in the Daily Mail that's not for DMGT itself, sure.


Are you claiming that the Daily Mail doesn't accept media ads except from itself?

If a general circulation pub is willing to take govt ads but govt doesn't place them, it's fair to ask why.


No, I'm saying nobody else in the media industry is prepared to buy them, and therefore the BBC is not remotely exceptional in this regard.


It's the same Guardian Media Group, ultimately owned by the Scott Trust, that's currently slashing and burning its regional media properties in print, TV and radio to keep the heavily loss-making Guardian afloat.

I hear their Manchester regional arm MEN Media, not content with closing all its small weekly newspaper offices, turning its local TV station Channel M into an infomercial, repeat and music video channel and cutting half its staff, is now up for sale. [1]

The MEN Media weeklies, which this time last year had busy newsrooms in each town, are now literally reduced to having a journo sitting in a public library waiting for people to go to them with stories for one afternoon a week. It's a shame for a newspaper that started out as the Manchester Guardian.

It's all very well pumping money into this kind of thing, but how will it pay for itself when the more popular and profitable MEN is gone?

[1] http://goo.gl/bTGJ


The link between the first and the second paragraph says http://mps-expenses2.guardian.co.uk/ but points to the (old) http://mps-expenses.guardian.co.uk/.


Blast, as if that bit wasn't confusing enough. Thanks for the tip, fixed.


While I'm mostly far right of center by US or U.K. standards, I find a whole bunch of U.K. papers interesting.

While I obviously prefer the Torygraph, which first broke this story (and what can you say about the paper that employs Matt http://www.telegraph.co.uk/news/matt/?cartoon=6798426&cc...), many others like The Guardian frequently deliver very interesting stuff such as this.


Heather Brooke fought long and hard to break the story, but in the end the Telegraph obviously had deeper pockets. But you have to give the Telegraph credit for the way they reported it - Labour one week, Tories the next - with both parties receiving an individual public shaming, hence not being able to simply point fingers at each other like they usually do.


I got the impression it was even better than that, e.g. didn't David Cameron say some things that he quickly had to walk back when the 2nd shoe dropped?

A bit like how the ACORN videos over here were carefully released so that generally at least one statement made after each quickly became "non-operative" as it's said over here.


Not following the ACORN example: The district court tossed out their funding cut due to Bill of Attainder and the MA-AG cleared them, complaining that the unedited videos didn't support the claims illustrated in the edited videos.

If the shoes dropped they found their way quickly back on ACORN's feet.


I'm referring entirely to the sequencing of revelations in these respective scandals.

Andrew Breitbart's BigGovernment.com first released one taken at the Baltimore office, followed by Washington, D.C. After the first or second, the head of ACORN said "Well, yes, but we threw them out of the office at A, B, New York City, etc."

Then Breitbart released the one in NYC. Lather, rinse, repeat.

(It's a bit misleading to say "the MA-AG cleared them", seeing as how that was the execrable (Amiraults) former MA-AG Scott Harshbarger, who was hired by ACORN for an "independent" inquiry.)


Great story. It was actually kind of interesting to see EC2 relegated to a small part of one paragraph -- sounds like it did its job as expected. What else can you tell us (or me at least :-) about this aspect of the project?


There's not a lot to say really - it Just Worked. We used it for the last project as well. This time we had two instances, one running MySQL and Redis and the other running Django. Last time we started with one instance and then moved to two when it turned out I'd written a bunch of inefficient code (the ORDER BY RAND for example). This time we started with two, but I'd avoided the dumb mistakes I made first time round so scaling turned out not to be a problem. Now that most of the activity is over we'll scale down to just one instance.
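For anyone wondering why ORDER BY RAND hurts: the database has to assign a random value to every row and sort the lot just to return one. A minimal sketch of one common workaround follows. It is not necessarily what the Guardian app did; the `page` table, its columns and the function names are all made up for illustration, and it uses SQLite (where the function is `RANDOM()`, not MySQL's `RAND()`) purely so it runs standalone.

```python
import random
import sqlite3


def build_demo_db(n=1000):
    """Create an in-memory table of n hypothetical pages (illustrative schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE page (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany(
        "INSERT INTO page (id, title) VALUES (?, ?)",
        [(i, f"page {i}") for i in range(1, n + 1)],
    )
    return conn


def random_page_slow(conn):
    """The pattern to avoid: every row gets a random sort key on each request."""
    return conn.execute(
        "SELECT id, title FROM page ORDER BY RANDOM() LIMIT 1"
    ).fetchone()


def random_page_fast(conn, rng=random):
    """Pick a random id in [MIN(id), MAX(id)] and take the first row at or after it.

    Both queries can use the primary key index. Assumes integer ids; gaps are
    tolerated, but rows that follow a gap are picked slightly more often.
    """
    lo, hi = conn.execute("SELECT MIN(id), MAX(id) FROM page").fetchone()
    if lo is None:  # empty table
        return None
    target = rng.randint(lo, hi)
    return conn.execute(
        "SELECT id, title FROM page WHERE id >= ? ORDER BY id LIMIT 1",
        (target,),
    ).fetchone()
```

Calling `random_page_fast(build_demo_db())` returns an `(id, title)` tuple. If a perfectly uniform pick matters, the bias after gaps has to be dealt with some other way, for example by keeping the candidate ids in a separate store and sampling from that.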


Absolutely amazing that it's possible to get this kind of application up in a week. Kudos to everyone involved.

That said, make the categorisation buttons AJAX, and can we have higher resolution scans? Or is that how they were supplied? The couple of pages I did had items on them that were basically illegible.


Are you talking about the new system or the old one? The new one is http://mps-expenses2.guardian.co.uk/ - it doesn't seem to fit Ajax quite as well (we should definitely have used Ajax for the first one).

The scans were sized to fit the design - with hindsight, having a "zoom" option would have made sense for some pages. They were large enough for the documents we had in advance, but you can never be entirely sure what the actual released documents will look like.


I was talking about whatever was linked with the post. Yes, I think it was the one you mention.

When a user clicks a button to categorise an entry, the whole page reloads - this could be handled by Ajax. And yeah, the scans could and probably should be bigger.

Not that I want to sound like a jerk because I think you did a phenomenal job. Just suggestions for next time. But seriously you and the NYT are on the cutting edge of this shit and much respect is due.



