Security Lessons Learned From The Diaspora Launch (kalzumeus.com)
250 points by pw on Sept 23, 2010 | 139 comments



"For example, if you were logged in to a Diaspora seed and knew the ID of any photo on the server, changing the URL of any destroy action from the ID of a photo you own to an ID of any other photo would let you delete that second photo."

When I was working as a pen tester I would completely scold developers for letting this happen - telling them that with everything we know today about security and good programming practices there is no way you should allow that to happen. Off-by-one bugs, timing attacks etc. are more excusable, but this, this is just amateur hour.

That was 11 years ago.


> That was 11 years ago.

The problem is, these kids are from college. They don't teach you stuff like "writing a secure web application" in college, or even try to.

(Not that this is unreasonable, though perhaps I'm suggesting that there should be different career paths for CS majors and people who intend to be professional programmers. (I say as a CS-educated professional programmer))


I've always thought it would make more sense for CS degrees to be for computer scientists (ie, people who want to do more high-level theoretical work), and that software development was more of a trade school, where you learned the languages, and were soon thrown into real-world style projects and apprenticeships.

Imagine if your nurse came out of college having never stepped foot into a hospital, having only read about how to take vitals and such, but never having done it on a live human being.


I agree with the sentiment, but it's important to note that excellent programming requires some pretty high level theoretical understanding of Computer Science (i.e. Algorithms). There's a slow way to do everything and a fast way to do some things; programmers need to understand the theory behind this. In addition, if you're trying to teach someone how to write secure code, they're going to need at least some understanding of crypto. Crypto has theoretical aspects to it as well. Basically when you keep all the parts of theory that are useful to programming, there isn't much extra stuff left. It turns out the things you could remove only amount to about one theory course in your standard competitive Computer Science curriculum.

The real reason why we have an issue producing quality programmers is more fundamental; we just don't have enough people who are good at teaching it. It's not because we "waste" time on one or two courses on some theoretical aspects.

Another thing to consider is allowing Computer Science majors to opt out of general education requirements in favor of more programming classes. Allowing this would free up a semester or more for most undergraduates (as opposed to the one course saved from cutting theory). Even with just average teaching, a semester can make a big difference.


That's why professional programmers should get a B.S. in CS and then an M.S. in software engineering (or a few years of internship experience such as that required for licensed architects). Unfortunately, this is one of those things you can't say because it increases the opportunity cost of a programming career.


Or a professional programmer could just get a job and learn software engineering that way.


...by getting scolded by more senior programmers for playing amateur hour and allowing authenticated, unauthorized actions?


The theory is that they know you're entry-level and do some mentoring, code review, etc. to teach you about those sorts of things. That sort of depends on you getting a job with a good company, though.


Cooperative Education (http://en.wikipedia.org/wiki/Cooperative_education) is another great option. I think it gave me a huge leg up on the students who didn't participate.


We're conflating a lot of different professions here. Phlebotomists only have trade school, but they're still going to be exposed to some pure science - at the least biology but likely chemistry as well - in high school.

Nurses and especially Doctors have years of pure science before they're ever allowed into their trade schools.

Likewise there are a variety of software careers, from sysadmin to developer to architect that require varied levels of education (though much like nurses developers can only benefit from better understanding of the principles behind their art.)


>Nurses and especially Doctors have years of pure science before they're ever allowed into their trade schools.

This is an (unnecessary) North American tic, whose pernicious influence is spreading.

Medicine has traditionally been an undergraduate degree in Europe and all former European colonies aside from the US, and those countries that are in its cultural sphere (like S. Korea, which got rid of its undergraduate medicine degrees.) My cousin started his Medicine degree at 17. It will take him five years. It's not like this is even unknown in the States, IIRC UCSD has a runaround where you get a Bachelor while doing an M.D.

And even in the US there are different types of nurses, some of whom went to college and some of whom didn't (LPN, RN and Nurse Practitioner). I understand demanding continuing education and testing to ensure competency, and college is one means of doing that, but not the only one.


And demand for doctors is kept artificially high via a small number of available med schools[1], incenting the profession to erect increasingly high barriers to becoming a doctor.

[1]: http://www.nytimes.com/2010/02/15/education/15medschools.htm...


I've always felt the problem is that class projects never have to be production ready. You're expected to show that you've learned and can implement the theoretical material, but you never have to turn that proof of knowledge into a complete, well tested project.

That's one of my biggest regrets about my CS degrees (BS/MS). I took so many classes and did so many class projects that I never had to see one project all the way through to complete, tested, usable release.

And I don't think it's a CS department's responsibility to expect class projects to be built to a shippable standard. However, I do think that it's the school's duty to encourage students to work on a real, production project, of their own creation or as a contributor, on their own time - even if it means taking on a reduced academic load.


> of their own creation or as a contributor, on their own time - even if it means taking on a reduced academic load.

But that's why the apprenticeship model would work so well. I've always thought the way union electricians are trained (a four year apprenticeship, which includes a lot of practical and theoretical schooling, until becoming a journeyman) would be a great fit for software development.


In Austria there are Höhere Technische Lehranstalten (HTL, http://en.wikipedia.org/wiki/H%C3%B6here_Technische_Lehranst...) which offer vocational education (but the whole educational system is very different from the US and other countries, which is reflected in the problems faced in the conversion to the Bologna system, i.e. bachelor/master etc.).

Over 5 years, students (usually aged 14 when they enter the school) receive practical training in programming as well as a theoretical foundation, though in no way as thoroughly as in any university program. In the first year of the informatics branch they let students enjoy the beauty of programming linked lists in C, which is quite tough for many.


As an ex-student, I would rather take a CS degree though, even though I was in it for the software development. You end up learning about software development anyway, in the beginning of your career, so it's not a good use of your time to devote yourself to this in college.


> You end up learning about software development anyway, in the beginning of your career

...and that's exactly what people are complaining about: the Diaspora devs learning about software development practices in the beginning of their career, while writing code for release.


I wouldn't really call CS degrees "high level theoretical work". Plus, I blame the professors.

I see Master's level students all the time who don't even know the basics of programming. That's just not acceptable and gives the university a bad rep.


Depends on the course - the one I did was heavily theory/maths based. We had to do a lot of development but we weren't taught that much about good development practices - those kind of practical issues are pretty much orthogonal to CS and are much better picked up in a practical environment anyway.


If you need a college professor to tell you "Check that the resource is owned by the logged in user requesting a change before you change it", you might never, ever be a good programmer no matter what. This was apparent to me when I'd been doing PHP for like 6 months. These issues are common sense.
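
A minimal sketch of that check in plain Ruby, with no Rails involved. The `Photo` struct, the `PHOTOS` hash, and `destroy_photo` are hypothetical names made up for illustration, not Diaspora's code. The only point is that the ownership test happens on the server, whatever ID arrives in the URL.

```ruby
# Hypothetical in-memory stand-ins for a model and its table.
Photo = Struct.new(:id, :owner_id)

PHOTOS = {
  1 => Photo.new(1, :alice),
  2 => Photo.new(2, :bob)
}

# Deletes a photo only if it belongs to current_user.
# Returns :deleted on success and :not_found otherwise, so a caller
# cannot tell "no such photo" apart from "someone else's photo".
def destroy_photo(store, current_user, photo_id)
  photo = store[photo_id]
  return :not_found if photo.nil? || photo.owner_id != current_user

  store.delete(photo_id)
  :deleted
end
```

With this in place, alice asking to delete photo 2 gets `:not_found` and bob's photo stays where it is.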


My first web app had authorization. I didn't even go to school for CS.

Simply put, they focused too much on trendy tools and libraries like MongoDB and CarrierWave and neglected the basics.


First of all, this type of thing, preventing users from just changing around URLs, shouldn't need to be taught. It is pretty common sense. When you make something like this, if you don't wonder: "Gee, what would happen if somebody changed photo_id=123&delete=true to photo_id=124&delete=true, would it delete photo 124?" then I'd have to say you aren't a very curious individual. That likely doesn't bode well for your programming prowess.

Validating user input is probably the first thing you learn about web application programming, which is frequently taught at universities, or in books titled "web application programming" which you should at least skim if you're going to start a project like this. Don't blame college for this. Just because it is something that isn't focused on in college (it is, though), and they went to college, does not mean it was college's fault. Would it be fair to blame college for any other mistakes they made, just as long as college did not "focus" on it? No. Some things are common sense.

Most likely, the culprit was time constraints, which is far more excusable.


What a stupid excuse. I'm a "college kid". I don't neglect simple access control in my software.


> The problem is, these kids are from college. They don't teach you stuff like "writing a secure web application" in college, or even try to.

Bingo. I went to Georgia Tech, which has a pretty damn good CS program, and I had to hunt for security classes. One was a "special topics" course that wasn't available very often and didn't have anything to do with application security (was a Net. Sec. course). The other was not a CS course but a Comp. Eng. course and was focused on penetration testing. :/ I actually earned a "Network Security" certificate with my degree which I never even knew was available (it wasn't mentioned anywhere in the course literature).

Since I've graduated they have redone the whole CS dept so I don't know if things have changed, though.

But like someone else said, a lot of this stuff is common sense, especially if you're a programmer and have systems knowledge. And I think most programmers have the habit of imagining all the different ways things could break when they are coding, too. Like a hacker's curiosity that most of us share. I know when something looks obviously wrong on a website or in an application I'm using I start to poke around and see what I can uncover.


They likely didn't learn Rails in college either


> They don't teach you stuff like "writing a secure web application" in college

We had a class on it. They basically pushed us through OWASP from front to back :)


I disagree, this should be an obvious security capacity: Don't let people who are not permissioned to modify a given resource modify a given resource.

I might be able to excuse this since they're fundamentally still in alpha (or pre-alpha) and were rushing to get code out.


"I might be able to excuse this since they're fundamentally still in alpha"

I wouldn't. Authorization is the sort of thing that has to be done first.


Any decent rails book will have a section on common security flaws and how to avoid them. There are plenty of web tutorials on the same topic. All the flaws in the linked article are very basic and should have been avoided if they took the time to RTFM.


I can probably go all night on this, but a couple things from a quick read of this (very good) post:

First, mass assignment.

The answer to mass-assignment bugs is "attr_accessible". Accessible attributes can be set via update/build/new; nothing else can. Every Rails AR model should have an "attr_accessible" line in it.

I've met smart dev teams working under the misconception that attr_accessible means "these are the attributes that can be changed based on user requests", and so virtually everything is made accessible. No! If something's not attr_accessible, you just set it manually (user.foo = params[:user][:foo]). It's not painful and the extra line expresses something important ("this is a sensitive attribute"). Attributes are inaccessible until they prove themselves mass-assignment-worthy.
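
To make the idea concrete without depending on Rails, here is a rough plain-Ruby imitation of a whitelist. The `User` class, the `ACCESSIBLE` constant, and this `update_attributes` are assumptions written for illustration; they only mimic the behaviour described above and are not ActiveRecord's implementation.

```ruby
# Hypothetical stand-in for a model with a mass-assignment whitelist.
class User
  # Only these attributes may be set in bulk from request parameters.
  ACCESSIBLE = %w[name email].freeze

  attr_accessor :name, :email, :admin

  # Bulk update: silently skips any key that is not on the whitelist.
  def update_attributes(params)
    params.each do |key, value|
      public_send("#{key}=", value) if ACCESSIBLE.include?(key.to_s)
    end
    self
  end
end

# A hostile form submission that tries to smuggle in admin=true.
mallory = User.new.update_attributes('name' => 'Mallory', 'admin' => true)

# The sensitive attribute is only ever set explicitly, on its own line.
root = User.new.update_attributes('name' => 'Root')
root.admin = true
```

Here `mallory.admin` stays `nil` because `admin` never made it onto the whitelist, while `root.admin` is set by the one explicit assignment.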

Second, the string interpolation in the regex.

Real quick: don't ever let users interpolate arbitrary strings into regular expressions. Regular expression libraries are terribly complicated and not very well tested. To illustrate (but not fully explain) the danger here, run this line of code:

     ruby -e "'=XX===============================' =~ /X(.+)+X/"
There are worse things that can happen to you with regex injection than a trivial DoS, but that should be enough motivation.
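
If all you need is "does this text contain what the user typed", one way to avoid the problem is to escape the input so it can only match literally. A small sketch, where `literal_matcher` is a made-up helper name; `Regexp.escape` is the standard library call doing the work.

```ruby
# Builds a regex that matches user_input literally, anywhere in a string.
# Regexp.escape backslash-escapes metacharacters such as ( ) + . *
def literal_matcher(user_input)
  Regexp.new(Regexp.escape(user_input))
end

# The hostile pattern from above becomes a harmless literal search.
matcher = literal_matcher('X(.+)+X')
found = matcher.match?('=XX===============================')
```

This only covers the case where users supply text to search for. It does nothing for an app that genuinely wants users to write patterns.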

Oh, one more thing: I appreciate Patrick's take on systems failures breaking Rails apps before underlying crypto flaws will, but even if they had protected their keys, their crypto wouldn't have worked. Don't build things that require crypto. You aren't going to get it right.


> Every Rails AR model should have an "attr_accessible" line in it.

I'd do you one better: use an initializer to monkeypatch ActiveRecord::Base and fire "attr_accessible nil", which will cause mass assignment to fail on any object you create from a class which doesn't make the assignment explicit.


That's clever. Want a job? =)


You two, get a (conference) room :-p


Is that offer good for anyone? http://news.ycombinator.com/item?id=1031126


Of course. Drop me a line. I'd love to talk to you. We love talking to HN people.


Good luck! I've been trying to hire him for months now. ;p


I'm keeping this in my back pocket the next time I have a conversation about why I prefer Ruby's monkey patching paradigm to Python's strictness. This is better than all my current examples. :)


The good part about Ruby: you can monkeypatch around framework defaults which do not maximize for your project's circumstances. The bad part about Ruby: your least talented coder can monkeypatch around security features which make his life more difficult ("attr_accessible? Stupid Rails coders, don't they know they have private for that shit? Well, I'll redefine it to just NOP. I am the awesome!")


I find educating the least talented coder I work with a mostly social problem that I can solve over lunch [1]. But yes, I've definitely felt the pain of monkey patching gone awry. :-p Working around a restriction enforced by your language is no picnic either though and it's not really something you can solve cleanly.

[1] Obviously large companies with massive Ruby code bases can't really do this. Not sure what to say there.


The real problem is with the coders that you don't eat lunch with — the authors of the shitty gems that get pulled in as dependencies.

For your company's code, reopening a class should be a huge flag in code review (something like gerrit should be in place at every large company), but it's not sustainable to police the dependencies of the libraries you use, especially when the default in the Rails community is spray and pray.


Note that this specific fix doesn't need you to open/monkey-patch ActiveRecord::Base, you can just do ActiveRecord::Base.send(:attr_accessible, nil) in an initializer.


In Shapado we use a safe_update method like this so we always need to specify which attributes can be updated:

@question.safe_update(%w[title body language tags], params[:question])


I like this better than my solution, which was to specify which params were allowed for each controller action and remove any that weren't allowed.


I'd go one further and say that if you use string interpolation with anything remotely related to user input at all in a web app, you probably just wrote a security hole of some sort. I use "probably" in the frequentist sense of the term, because the odds of a given user-input string interpolation in an arbitrarily-chosen web app being in at least one of 1. an HTML string 2. a database string 3. a javascript string or 4. something else that is executable by something, without the context-relevant encoding function, are quite high. Same goes for any string concatenation involving anything from the environment or user without the conspicuous presence of an encoding function of some kind, since interpolation is just syntax sugar for concatenation.
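
As one instance of a "context-relevant encoding function", here is the HTML case in Ruby. `greeting_html` is a hypothetical helper; `CGI.escapeHTML` is from the standard library. This is only a sketch for HTML text: the same input bound for SQL or a JavaScript string needs a different encoder.

```ruby
require 'cgi'

# Interpolates user input into an HTML fragment, encoding it for the
# HTML context first. CGI.escapeHTML escapes & < > " and '.
def greeting_html(user_name)
  "<p>Hello, #{CGI.escapeHTML(user_name)}!</p>"
end

safe = greeting_html('<script>alert(1)</script>')
# safe is "<p>Hello, &lt;script&gt;alert(1)&lt;/script&gt;!</p>"
```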

"Don't let users interpolate, ever" is close to truth. It isn't quite truth, but it's a lot shorter than the truth.


> ruby -e "'=XX===============================' =~ /X(.+)+X/"

Why does that hang in Ruby? In Perl it's fine...


Because Ruby uses PCRE, but Perl uses its own regex engine which handles cases like that a lot better. That pattern is rejected right away by Perl's engine because it sees that there isn't another X in that string:

    anchored "X" at 0 floating "X" at 2..2147483647 (checking floating) minlen 3 
    Guessing start of match in sv for REx "X(.+)+X" against "=XX==============================="
    Found floating substr "X" at offset 2...
    Contradicts anchored substr "X", trying floating at offset 3...
    Did not find floating substr "X"...
    Match rejected by optimizer
    
Changing the pattern to /X(.)X/ will cause Perl to do real work. But it'll complete it in 0.003s while Ruby will just hang.

I wrote an article a while back about the differences between PCRE and Perl's engine: http://use.perl.org/~avar/journal/33585


Thanks for the explanation


Well... basically, it sounds like Ruby's regex engine needs some work, hmm?


No, people should know better than to write regexes like /X(.+)+X/, with gratuitous doubly-nested "+" characters. :-) This code performs fine when written as /X(.+)X/, and it matches the same set of strings.

Regexp engines are subtle beasts, and there's a couple different ways to implement them (DFAs vs NFAs, simple engines vs lots of clever special cases, etc.). See O'Reilly's "Mastering Regular Expressions" for an exhaustive discussion.


Interesting. Any other recommendations on how to secure regex's that take in user input in Ruby/Rails?


First of all you should think hard before taking regexes from users. Even if you do it correctly you'll still (presumably) need to search over your entire dataset, instead of doing something more lightweight like relying on SQL indexes. Use it with care.

You should use a regex engine that's explicitly designed to take potentially hostile input. Like the Plan9 engine, or Google's re2 engine which powers Google Code Search.

You can also just use Ruby's dangerous PCRE engine if you do something like forking off another process with strict ulimits which executes the regex for you. Then you can just kill it if it starts running away with your resources. Look into how e.g. evaluation bots that work on the popular IRC channels on FreeNode are implemented. POE::Component::IRC::Plugin::Eval on the CPAN is a good example.
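
A rough sketch of the fork-and-limit approach in Ruby, assuming a Unix-like system where `fork` and `Process.setrlimit` are available. `match_in_child` and its keyword arguments are names invented here, and a real service would want more care around error reporting and memory limits.

```ruby
require 'timeout'

# Runs a possibly hostile regex in a child process with a CPU limit.
# Returns true or false for match / no match, and nil if the pattern was
# invalid, the child was killed by its CPU limit, or the wall-clock
# timeout expired.
def match_in_child(pattern_source, input, cpu_seconds: 1, wall_seconds: 3)
  pid = fork do
    # The kernel terminates the child once it uses this much CPU time.
    Process.setrlimit(Process::RLIMIT_CPU, cpu_seconds)
    begin
      exit!(Regexp.new(pattern_source).match?(input) ? 0 : 1)
    rescue RegexpError
      exit!(2) # invalid pattern
    end
  end

  begin
    _, status = Timeout.timeout(wall_seconds) { Process.wait2(pid) }
  rescue Timeout::Error
    # Child is still running: kill it and reap it so no zombie is left.
    Process.kill('KILL', pid)
    Process.wait(pid)
    return nil
  end

  # A child killed by a signal (for example SIGXCPU) never "exited".
  return nil unless status.exited?

  case status.exitstatus
  when 0 then true
  when 1 then false
  end
end
```

The parent only ever sees an exit status, so a runaway pattern costs at most the CPU and wall-clock budget you gave the child.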


This assumes that PCRE doesn't still contain memory corruption flaws, despite not being heavily tested, and being in effect a programming language interpreter. Tavis Ormandy found a couple serious problems a few years ago.

I'd just scrub the hell out of strings before passing them to a regex engine.


Even if it does it's a pretty remote possibility that it'll be exploitable if you limit the input to say 100 bytes. Pretty hard to get a Perl or Ruby level program of that size to exploit some memory corruption at the C level.


Good advice, thanks you two!


It just seems to me that if Perl's regex engine handles this without a problem, and Ruby's implementation freaks out, something should be improved about the Ruby one.


I don't have access to a box with ruby on at the moment, what does that line do?


It just hangs ruby (like in an endless loop)


Ah fair enough, thanks!


Lesson learned: Never let the outside world see your First Big Project Ever.

This is what Fred Brooks would have called the First System. Everybody builds this thing at the beginning of their career, and it's always this embarrassing. Mine, in 1996, took this a step further and actually prepared SQL statements in javascript before submitting them to the server to run. Yours probably did something equally bad. It's the rule of the First System. It's where we learn all the lessons we need so that we can do things right the Third time (I'll spare the Second System description for the time being...)

The key though is to make sure that nobody ever sees that code. Hopefully it will be locked away in some intranet vacation-time-planning app that nobody will ever dig into. That way you can look back at it in shame, but few others will ever know about it.

So here we have a team of people who have clearly never built anything at all, trying to learn on the job while being scrutinized by the entire world and actually submitting their code for public review.

God help them.


Quite the contrary. Provided they have the right attitude, the programmers at Diaspora just learned more on their first project than they ever would have if they hadn't shown it to anyone. That's one of the primary benefits of participating in open source software.


I think the point is that this is the stuff that you always learn in the course of building your First Big Thing. By opening it up to public ridicule rather than to a polite code review by an understanding senior dev, they don't really get any extra learning out of the deal. Just extra humiliation.


Yes, but Fred Brooks also warned that the Second System would be even worse!


No need to but. I actually mentioned that above :)


I'll admit, when people came down hard on them vis a vis security, I figured it was just a bit sloppy as a preview release.

I was wrong. These aren't really security "holes", as that's not a strong enough word. I think the best way to put it is they accidentally created the first social network wiki.


When you put it that way, suddenly I'm intrigued by the concept.


Obviously real people will object when you manipulate their friend lists and their photo galleries. But there's real possibilities in a wikified social network for fictional characters.


Permissions-based, perhaps? I often feel facebook is a little wiki-ish when I find myself tagged in a new album of photos a friend uploads.


Maybe they should take that strength and build on it! Then they can say they were visionaries instead.

I agree though, these aren't like subtle security holes that would need a security expert to review. Checking that a user owns the resource on which they are requesting modification is basically common sense.


You've said this twice now, and it still isn't true. Don't be the Youtube commenter who, on a video on how to hold and what to call parts of a musical instrument and how many people got it wrong, said "this video is useless, it's all common sense".

If it were common sense to do it, they would have done it. It's not. It's a very distinct thought pattern shift from "the browser is a part of the execution of our code and it will only try a delete link which the code has generated" to "the user can request anything at any time no matter what links we have or haven't generated or what they can see on screen".

It's a learned shift specific to some subsets of some kinds of computer programmers, not at all "common sense".

(and besides, even if it were common sense, what's the point in your comment then?)


Sure, I've said this a few times because I'm talking to various people, in my perception of how posting here works.

Your assumption that everyone shares in common sense equally is a bit optimistic.

So, then you must agree that they clearly don't understand, as you say "the user can request anything at any time no matter what links we have or haven't generated or what they can see on screen". To me, this shows a lack of understanding of basic guidelines of web programming, namely that you can never, never trust user input, whether it's form submissions or cookies.

Perhaps not common sense, but nor is it an advanced principle. If you've ever used Firebug for more than a couple of hours, you'd have figured out on your own that you can change forms and then submit them. If you've even used a browser for a while, you will have realized you can type in different numbers in query strings. If they haven't noticed that by now - what are they doing taking on a project like this?


The landlord of the pub I worked in once used to say: "There is no such thing as common sense. It's all experience"


It's not that the pre-alpha Diaspora has insecurities that bothers me, it's the whole execution.

What I really would like to see is a documented protocol - based on XMPP or some other established, well-tested protocol would be good, but if not then at least something.

Once you have that protocol - which tells you how Diaspora "seeds" communicate securely - you can let others build their own implementation, using Rails, PHP, Python, doesn't matter. Sure, release a reference implementation in Rails, but the protocol is the most important thing.

Unfortunately what we have is just another Facebook clone done in Rails, which is disappointing.


It looks like a classic case of poorly managed expectations - the technologists would prefer an approach like the one you outline (and this was what I was expecting), however given their visibility they had to deliver a working application that people could download and install and have it do something.

While meeting both of those objectives in those timescales might be possible it would be a truly remarkable achievement. Not surprisingly it didn't happen and they released something that pleased nobody - all we can hope for is that they learn some lessons and move onto better things.


That's what I found interesting when looking into OneSocialWeb, their focus on already existing protocols (XMPP plus some XEPs http://onesocialweb.org/developers-xmpp.html), instead of adhering to Not invented here.


Google tried this with Wave.

Nobody stepped up to write a decent client, and the product was judged (unfavorably) on the merits of the reference implementation.


One more bug in the first code snippet: their use of find_by_id means that @album could be nil, causing an error when they attempt to edit it. Most people would use Album.find and then catch the error/show a 404.

I realize these guys are in college, but they really should have (a) brought people's expectations in line with their abilities and (b) reached out to experienced developers to help them out. Intridea probably would have given them a few developer hours per week to help code, advise them, and so on. Just a little input and guidance would have saved them a lot of grief.


Totally, their lives would be so much better now if they had just asked 2-3 experienced Rails developers to review their work before releasing it like this. Actually, the fact that they didn't think to do that kind of illustrates a problem.


They shared a space with Pivotal Labs. Either they were lazy and didn't think to ask about stuff that matters - ui is nice[1], but securing your app is more important - or the Pivotal Labs guys dropped the ball big time.

I'm inclined to think it was neither and they just didn't think anyone would notice. It happens.

[1] http://www.joindiaspora.com/2010/07/01/one-month-in.html


"NoSQL Doesn’t Mean No SQL Injection"

I lol'd. Mind if I use that?

MongoDB is harder to secure and filter because you have all of Javascript to worry about, rather than just SQL (and where most servers can escape arguments themselves through prepared statements etc.).

SQL databases are also well understood (e.g., in MS-SQL I can stop the remainder of the statement from executing with '--'). MongoDB with its JS engine is still a big unknown.


Similarly:

"...secret squirrel double-plus alpha unrelease..."

Mind if I use that? It would be a terrific title for an animal fighting game I've been itching to make.


It's a snowclone of a line from Animal House.


...which also references 1984 and a Hanna Barbera cartoon. Of course it's less funny now that I've analyzed it, but still a good joke.


I haven't used MongoDB, but I think there is a fundamental difference in the way data is updated. I don't think you have to escape JavaScript in the user input, because you don't update by submitting a single String to execute. The user input is just data.


It depends on the client driver. They have insert, delete, save etc. which send those commands with the user supplied data encoded, but most of the drivers also have an exec or execute which dumps what the user enters straight onto the db.

For example:

http://www.php.net/manual/en/mongodb.execute.php

"This method allows you to run arbitary JavaScript on the database."


Most Mongo queries don't involve javascript, they're abusing a special operator.


There are a lot of lessons to be learned from Diaspora, beyond the (brilliant) points made about security.

1. When media hype provides you with $200k, you're still best served by bringing people's expectations down to earth. There was no way they were going to be able to build anything approaching a Facebook killer in 3 months, and it would have been best if they would have made that clear in the beginning.

2. There are a number of projects like Diaspora that have been working for years towards the same exact goal. The only way Diaspora could have succeeded where those have failed (or not-yet-succeeded), is if they had properly articulated where they would go right where others had gone wrong. If you can't do that, you probably don't have the perspective necessary to take on such a huge project.

3. We all need to be less susceptible to the story of the "boy wonders" taking on the establishment. Between Diaspora and Haystack+, these should be sobering lessons about the dangers of hype and a good human-interest story when the result we need is stable, well-written code.

+ http://blog.jgc.org/2010/09/myth-of-boy-wizard.html

4. Open source isn't magic. Rails isn't magic. Having a good idea and a whole lot of heart isn't magic. It's true that the software world isn't exactly a meritocracy, but at the same time, we need to recognize that you have to be able to build something generally usable, and if you can't, there's nothing that will save you except for harder work and learned lessons. Some bugs will be fixed by the interest in Diaspora, but there are big architecture questions here that need to be resolved, and coordinating that democratically through the internet in a sea of strangers is a logistical nightmare. And while rails does provide a lot of functionality out-of-the-box, a project of this size isn't held up by how long it takes to write the photo-uploading code, it's held up by the big-picture stuff that rails can't really help you with. In the end, programming is programming. You need specs, mockups, user stories, documentation, all kinds of unsexy stuff.

I think we'll probably see Diaspora stabilize into something usable at some point. But I'm very doubtful that will be anytime soon, and I'm especially doubtful that it will be before the other projects (Elgg, Appleseed, OneSocialWeb, StatusNet, etc.) mature into the Facebook killer people want to see.

Building the kind of open source social networking software necessary to take on Facebook at its own game is such a massive, complex undertaking, and it's such undiscovered territory, that there really is a big disadvantage to being the new kid on the block.


Good post. It comes down to inexperience, and I am sure we have all been through it ("pfffft.. I could do that in 4 hours" -- younger me).

There were red flags in their Kickstarter post:

* "We are four talented young programmers from NYU’s Courant Institute trying to raise money so we can spend the summer building Diaspora"

* "Diaspora knows how to securely share (using GPG) your pictures, videos, and more."

* "We have a plan, a bunch of ideas and the programming chops to build Diaspora. What we need is the time it takes to iron out a powerful, secure, and elegant piece of software. Daniel, Ilya, Raphael, and Maxwell are all ready to trade our internships and summer jobs for three months totally focused on building Diaspora"

* "We promise to you that Diaspora will be aGPL software which will released at the end of the summer."

They also said that no similar system exists (they do)


> Open source isn't magic.

Isn't it though? I think popular open source projects have magic. These guys put out a demo with lousy unsecured code and rookie mistakes and within a week the worst offenders were identified and repaired.

I'm not saying they are great developers, but certainly OSS is some sort of powerful magic.


Open source is good at getting the small bugs, the syntax errors, the security holes with clear and direct fixes, etc.

Open source is not good for making use of the "Many eyeballs" for architectural decisions. It's where the wisdom of crowds provides little benefit. This is why most open source projects are not a haphazard bazaar of stone soup contributors, like most people think, but are actually small dedicated teams of talented software developers, who are usually paid.

I think our conception of what OSS can achieve, simply by being OSS, is somewhat inflated.

(Don't get me wrong, though, OSS is fantastic, and I think it's, by design, much better than closed source)


The biggest problem, I think, and I speak from experience, is that some of the best Rails tutorials leave out a lot of important security issues entirely. Some of them might do a trivial paragraph on password encryption, but most of them leave out any mention of these kinds of basic security flaws.

Even if the writers don't write chapters dedicated to security (and obviously even that wouldn't be enough), they should at least note which code is entirely unsafe in production, and where you can learn more about it.

Every Rails book is filled with this sort of code that people learn, and then use.


Obligatory reference:

http://expatsoftware.com/articles/2007/03/examplecode-produc...

(exampleCode != productionCode)


This is a great post and shows how awesome of a community we have. The Diaspora guys are a bit behind when it comes to solid programming, but Patrick (among others) reviewed the code, noted flaws, reported and advised them on how to fix them, then wrote a great post explaining what was wrong to help others avoid the same mistakes. Stuff like this is both educational and helps the programming community move forward. Thanks to Patrick.


Thanks for the web-app security pointers in the post: the RoR security guide[1] and the OWASP Top 10[2]. These should be required reading for anyone making a public web application.

[1] http://guides.rubyonrails.org/security.html (Well-written, like the other guides. Totally worth reading fully).

[2] http://www.owasp.org/index.php/Top_10_2010-Main (Open Web Application Security Project's top application security risks for 2010)


Not to be totally nitpicky but if they're using any recent version of Rails (I haven't looked at the source yet), the DESTROY action doesn't respond to GET by default. That doesn't change the fact that they don't scope deletes to the logged-in user's assets.
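
For anyone wondering what "scoping deletes to the logged-in user's assets" looks like, here is a minimal plain-Ruby sketch. It is not Diaspora's code: `Photo`, `PhotoStore`, `destroy_unscoped`, and `destroy_for` are hypothetical names, and an in-memory array stands in for the database. In Rails the same idea is usually expressed by looking the record up through the current user's association rather than through the model class.

```ruby
# Hypothetical in-memory stand-in for a photo table; not Diaspora's code.
Photo = Struct.new(:id, :owner_id)

class PhotoStore
  class NotFound < StandardError; end

  def initialize(photos)
    @photos = photos
  end

  # Unscoped: anyone who knows an ID can destroy that photo.
  def destroy_unscoped(id)
    @photos.reject! { |photo| photo.id == id }
  end

  # Scoped: the lookup itself is limited to the given user's photos,
  # so somebody else's ID behaves exactly like a missing record.
  def destroy_for(user_id, id)
    photo = @photos.find { |p| p.id == id && p.owner_id == user_id }
    raise NotFound, "no photo #{id} for user #{user_id}" unless photo

    @photos.delete(photo)
  end

  def ids
    @photos.map(&:id)
  end
end
```

The point of the scoped version is that there is no separate "is this mine?" check to forget: a photo owned by someone else simply cannot be found.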


POST vs. GET is a little bit of a red herring anyways, since either method works for CSRF. (I'm adding to your comment, not amending it).


I'm certainly not qualified to argue with you on anything security related (nor would I want to). I'm simply clarifying Rails' default handling of destructive actions, which is probably semi-offtopic.


Would appreciate if more articles like this are posted on HN, useful and practical!


Yeah - I usually come on here to find the technical sorts of articles that I learn from and have been seeing less and less of these lately.


It's true. And as a result, I'm seeing more and more threads like this one. :)


For that to happen, HN update speed would have to slow dramatically.

(I favor this and agree with you, incidentally.)


I knew Diaspora had problems when I got an email from the guy who hosts Openspora telling me that [two] other people had the same username.


I like that the phrase "Lil Bobby Tables" to reference dropping the database via SQL injection has entered the hacker vernacular.

A nice one-page security guide would be a 'lil bobby tables' guide to databases: SQL injection for any database (SQL or NoSQL), with the goal of helping developers prevent these attacks.



Well, I don't have a ready reference for SQL injections, but how about one for XSS? http://ha.ckers.org/xss.html If that doesn't teach you anything, well, my hat's off to you.


Wow, those are some pretty amateur mistakes. Just taking whatever data you got from the user with no checks? Really? And that's not something you fix later. That should stick out like a sore thumb the moment you write it, making you check that user == logged_in_user before you even move on. Wow.


Did anyone really believe that a bunch of guys were going to write a secure privacy-based facebook-killer application in three or four months?

That's a challenging proposition even for experienced teams.

Though we all know that they warned us that the software was still full of bugs and should be treated as experimental.

Patrick says that doesn't matter, that they haven't got the foundation right and anything built on top of it will likely fall.

Maybe they can still iterate and fix those things. Maybe they will scrap large sections of the code and re-write it properly. I suggest they keep trying and learn something from all of this. Us older geeks can be pretty harsh sometimes. We often expect newcomers to not make the same mistakes we once did. That's how we learn though, so don't take it the wrong way.

Next time just don't promise more than you can deliver.


I had my doubts about Diaspora, but now I know for sure: it will be a complete fiasco. Those are horrendous errors that show that those guys have absolutely no idea about web programming - and I can't see them learning it quickly (certainly not before the planned release date of the final version). Sad thing.


Indeed, the level of errors shown here demonstrates that they would probably need about 2-3 years of experience to be decent at doing work like this. I think the best thing they could have done is hire 3 skilled developers to work for 6 months at $5k a month.


Which begs the question - where did the money go?

What they've released looks like your average weekend github side-project. I suspect a large % of Hacker News members have projects like this (though hopefully with better security ;-)). So what did they spend the money on?


For their sake, I hope they still have most of it!


I appreciate the two links on how to handle security issues in web applications.

As many have pointed out, they do not teach this stuff in university, and to acquire such knowledge you tend to have to be very proactive about your development if you do not have industry experience.

So thanks a lot for sending these nuggets our way.

You stated that the team is manifestly out of their depth in terms of web security. How do you suggest that they proceed given their one-month deadline? Can they pay a security expert to resolve these issues, or does no one want to come within a mile of this?


So one of the painful lessons I picked up in industry is this: an impossible deadline plus a great desire to change is still an impossible deadline. Scheduling is not my bag, and I'm far less well acquainted with Diaspora than you are, but if I was four man months from release at the day job with the state of the project where I think it is I would be sending out emails saying "We will not hit this. It is impossible. We need to cut scope or push back release." to my superiors.

I don't know if you can get a security expert to fix this for you. You can certainly wave a big enough check around to get somebody to look over your code, but that won't magically improve code you haven't written yet. Also, it is highly likely that Diaspora is architecturally insecure -- that, beyond the "Oops, didn't check the input" code-level whoopsies, your federation strategy as written (and apparently as not documented outside of the source code) just cannot be made to work right.


That's what I feel. Not only do they have an impossible deadline to hit right now, but the deadline they did hit was probably impossible as well.

They may have spent most of the summer architecting this thing and figuring out things like "seeds". When it came down to putting code on screen, as it were, they had minimal time to do so. All of these security issues feel like the release was rushed and that they might have been better off releasing it silently, without a way to deploy public nodes, and having a blog post explaining the situation.


Thanks for the reply. I agree with the impossible deadline bit. Can you give an example of what you would consider an architecturally insecure web application?


Pretend someone described email over HTTP as a hit new webapp. Email over HTTP is architecturally insecure. Architecturally, there is no way to tell that people are who they say they are. Architecturally, the message is readable by every server between the endpoints. There is no notion of trust baked into email, so you're going to pour gazillions down the drain to retrofit anti-spam mechanics over the insecure architecture.

In terms of micro-architecture, take a look at Wordpress. I love Wordpress, don't get me wrong, but it almost can't be made secure due to some design decisions that can't be reversed, such as "Wordpress templates contain executable code with direct unfiltered access to the database."


What idiots those Diaspora guys are! All that excellent security consulting, for free! They don't know the first thing about software development! It makes me so mad, I'm going to write up a carefully researched and detailed account of other errors they've made! Then they'll see how clueless they are - again! Ha, what amateurs!


EDIT here because it's too late to edit my above comment. I just want to clarify, to distance myself from the recent genius/tragedy submission: I'm mocking the people who criticize the Diaspora guys for not being perfect. I'm pro release early, release often and anti perfectionism. IMHO, Patrick is doing exactly the right thing - helping to build something better. But my post could be interpreted as mocking Patrick - that's not what I meant, and I apologize for the ambiguity.

On reflection, I shouldn't have expressed this through mockery at all, but through helping. :(


Excellent article. I can see why under the kind of time pressure they have been under there would be issues like these.

Interesting. I was thinking of this in comparison to PHP, where a beginner would much more likely be individually assigning each input into a SQL statement. Sure, there are all kinds of things that can go wrong there too, but mass assignment looks like it makes it so easy for someone to shoot themselves in the foot.


I've always wondered about this (not being a code-monkey-ninja-wizard, myself)...

If you open source something, unless it's perfectly written, wouldn't the hacking potential be... near 100%? If everyone can see how you do everything it seems like even a minor slip up will potentially surrender your site.

Could someone explain this (I'm probably missing a piece of the puzzle I can't place)?


Having source code makes both attacks and defenses easier. It isn't automatic that a bug present in source code will be caught, either by attackers or defenders -- "a million eyes make all bugs shallow" is basically horsepuckey. The bugs in the PNG processing routines, for example, took something like a decade for someone to find and exploit. All Java implementations of OpenID are OSS. All are vulnerable to timing attacks, at least as of a month or so ago. I implemented one of them and passed my implementation past our resident God King of Engineers and he didn't see that fault, either.
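
To make the timing-attack point concrete, here is a generic Ruby sketch (not taken from any OpenID library; `leaky_equal?` and `constant_time_equal?` are hypothetical names). The first comparison bails out at the first mismatching byte, so its running time hints at how much of a guessed secret was correct. The second always walks every byte and only looks at the accumulated result at the end, which is the usual shape of the fix. Both return the same answers, which is exactly why code review tends to miss the difference.

```ruby
# Early-exit comparison: returns as soon as one byte differs, so the
# running time leaks the length of the matching prefix.
def leaky_equal?(a, b)
  return false unless a.bytesize == b.bytesize

  a.bytes.each_with_index do |byte, i|
    return false unless byte == b.getbyte(i)
  end
  true
end

# Constant-time style comparison: XORs every byte pair and ORs the
# results together, checking the accumulator only after the full pass.
def constant_time_equal?(a, b)
  return false unless a.bytesize == b.bytesize

  diff = 0
  a.bytes.each_with_index { |byte, i| diff |= byte ^ b.getbyte(i) }
  diff.zero?
end
```

In real code you would normally reach for a vetted helper from your framework or crypto library rather than hand-rolling this; the sketch is only meant to show why the two versions differ.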

Now, if you're a highly anticipated project and you're making errors covered in every Security 101 article which happen to be very visible, then OSSing your code makes it highly likely that people will see those, for good and ill. What scares me for Diaspora's future isn't those errors -- it is the part of the iceberg below the waterline. I mean, if you're steaming at full speed towards a gigantic "I'M GONNA RIP UP YOUR BOAT!" sign, there is probably something underwater and I doubt any qualified security guy (I am so not one) will donate you a few tens of thousands of dollars to tell you how screwed you are right now.


On about 80% of our projects we'll never see source code. Not having source code is a speed bump for a professional appsec tester. The problem with Diaspora isn't that it's open source; it's that they launched their open source project when it wasn't ready for that.


With source code available attackers are white-box testing as opposed to black-box testing.

Black-boxing is where you throw known arguments at a system and measure the responses, and hence deduce what the system is doing.

White-box testing is knowing what is happening internally, so you can immediately skip to step two of a security test - which is exploiting.

With black-boxing it takes a very very long time to learn the entire system and its workings, but it can be done. Having the code just means you skip that step. E.g. with Diaspora, going from launch to being exploited was a matter of minutes.

The Twitter URL escaping bug from this week was from within one of their public source code repositories. While they don't release everything as open source, they have released enough to give an attacker a good view of their stack and how it works. Bugs not detected in the open source code are likely to also appear in other parts of the platform that are closed source (since they weren't detected in the first place).


That's why you don't run pre-alpha code on a production server. The advantage of publishing code this early is that people like the author can point out flaws and help secure the software.


The idea is that people can help you eliminate insecure methods--if the code is available. "Security through obscurity" leaves you with no assistance and engenders a false sense of security where determined attackers are concerned.

That said, MPWILGSIANSE (my password is LadyGaga so I am no security expert)


It is funny, I have looked at the code and am not a big ruby/rails person, but see obvious flaws and security holes. However, a lot of the discussion around these have been either in the "it is amateurish" bucket or the "are they expecting help from the OSS community? that is leveraging eyeballs, that may not come".

Now, while it hasn't been their primary intention (most likely) to have the OSS community (and others curious about the code) provide fixes and feedback, they sure have gotten a lot of input and advice. I also suspect they have learned quite a lot through the feedback - positive, constructive, and inflammatory. You really can't ask much more -- aside from some direct help.

While I have no plan to run my own Diaspora node, I do look forward to seeing how the code and project evolve.


[deleted]


Hire a coder from any of the outsourcing places. I hate to plug a specific link - really I do, but truth be told people can be found to do a job like that for $5.


After reading the article, also read this comment: http://www.kalzumeus.com/2010/09/22/security-lessons-learned...

The comment negates some of the statements made in the post


No, that comment just helps to explain how trivial these things would have been to work around. That's the whole point of the article, that these guys missed all the most obvious things that you need to do to secure your application.

There are no deep, tricky issues explained because absolutely zero effort was needed to find a half dozen breathtakingly bad practices floating at the surface.

So yes, of course it's trivially fixable. The problem is that it wasn't trivially fixed and they thought they were ready to release it.


Please read the whole comment I linked to. I was talking about things like the OP saying that browsers could delete things by prefetching (GET requests) or that update_attributes does a double assignment, which are simply not true.


I don't understand what the comment means by double assignment, but I guarantee that update_attributes will indeed let me overwrite owner_id in the manner specified. Would you like me to demonstrate this with code against a specific git revision? It isn't hard.
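
For readers who haven't run into mass assignment before, here is a minimal plain-Ruby sketch of the problem and of an allowlist fix. It does not use Rails and is not Diaspora's code: `PhotoRecord`, `update_all`, `update_safely`, and `ASSIGNABLE` are hypothetical names chosen for illustration. Rails of this era offers `attr_accessible` for the same purpose.

```ruby
# Hypothetical plain-Ruby model; real Rails models differ in detail.
class PhotoRecord
  attr_accessor :caption, :owner_id

  # Attributes a user is allowed to set through a submitted form.
  ASSIGNABLE = %i[caption].freeze

  def initialize(owner_id:, caption: "")
    @owner_id = owner_id
    @caption = caption
  end

  # Naive mass assignment: every submitted key is written to the object,
  # including owner_id if the attacker adds it to the request.
  def update_all(params)
    params.each { |key, value| public_send("#{key}=", value) }
    self
  end

  # Filtered mass assignment: keys outside the allowlist are ignored.
  def update_safely(params)
    params.each do |key, value|
      public_send("#{key}=", value) if ASSIGNABLE.include?(key.to_sym)
    end
    self
  end
end
```

With a submitted hash such as `{ "caption" => "hi", "owner_id" => 666 }`, `update_all` hands the photo to user 666 while `update_safely` changes only the caption.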


No, I'm happy to believe you. It just seems that assumptions like "the code doesn’t check to see if the destroy action is called by an HTTP POST or not." are incorrect and still in the post. There also isn't a proper answer to that comment so far.



It wasn't my reply, I just thought it was an interesting one.

The original post lists a lot of samples, and the reply adds some more information to those; the "reply" moves this almost into an argumentum ad hominem.

I'm just saying that it would be nice if the OP would at least delete the stuff that is just plain wrong.


Well, I for one am glad that I learned from articles like this. It's a hell of a lot easier to read discussion from "real programmers" and learn from them. Now that I'm in college stuff like this is second nature.


OK, you've made headlines. Now, where are your patches? ^_^


I don't think this is that big of a deal. Pretty much every developer I know has learned about security through this exact process. Either a senior developer or user exposes the flaw and smart, but new, developers quickly realize they didn't understand the attack angles. Without concrete experience it's pretty hard to appreciate how exactly these attacks work. But after a few exploits you start getting paranoid, understand the basic problems, and are always thinking and reading about how your code might be missing something. These guys seem reasonably smart so I expect after a few months of exposure, they'll be fine.


Why would the community of experienced developers who are supposedly expected to be interested in working on this project find it rewarding to sit around and wait for them to work through their training wheels? This whole situation really is absurd.


It doesn't seem absurd to me that a few college friends want to hack together a project to fix a problem they see. I think it's great, regardless of whether they succeed. Nobody is forcing you to join them.


Nothing at all is absurd about what they're doing - what is surreal is all the media attention and the expectations it has created in the average person regarding this. I don't think it's in the benefit of these four guys, actually. They're lucky in some ways and really unfortunate from other perspectives. I don't even know how I'd feel if I was in their shoes right now.

The idea that 'these guys are going to get all this money, and work for three months, and then surely deliver this top notch 'facebook killer' that we all want to use!' is a bit absurd, that's all, given that there are plenty of other projects working in this area, and there would be no reason at all to think that this one would be superior, the best, or even suitable. If it wasn't for the nytimes, etc. this would be just another project on github. The media has taken a normal situation and screwed it up majorly, and it doesn't appear to me that it's going to turn out that great for anyone involved.



