
If there's one thing that defines the Linux kernel development process, it's that large/wide changes are never accepted in bulk. Changes must be broken up into independent pieces that make sense by themselves and can be well understood by the people doing the merges.

For example, if you want to introduce a new filesystem that requires VFS features not yet present, you submit those features first and make a case for their usefulness beyond your own use case. Then you submit your new filesystem, adapted to use those features as they were merged.

Failing to go through this process results in failure to be accepted into mainline. Sure, this makes some useful stuff fall by the wayside (usually due to ego clashes), but it's the only way to maintain any sanity in a project as large and active as the Linux kernel, and that's a good thing in the long run.




You don't evolve proper security through a million tiny commits. You design it. This attitude is why containers on Linux will probably never have the security that Solaris or the BSDs have. It will never get better until that is rethought, if you're right.


I don't think I buy this argument; at least, I don't think grsecurity backs up this argument.

grsecurity is a big collection of different techniques. It isn't a total rethink of the Linux kernel; it's a bunch of patches spread over a large surface area. Some of the changes are far-reaching, yes, but even so, each of the techniques represented in grsecurity could be broken into independent patches. So, why haven't they been? (I honestly don't know why. It's not something I follow closely.)


> So, why haven't they been?

Grsec upstream argues that only accepting part of the whole is worse than having their patches outside mainline in bulk, because it would give developers and users a false sense of security to only provide some of the hardening grsec does.

See this[1] LWN article from 2009, which is a kernel developer's post with a retort in the comments by some grsec developers.

[1]: https://lwn.net/Articles/313621/

The frustration is that we are now seven years later and the situation has only gotten worse - plenty of kernel exploits have emerged, no distro is shipping grsec comprehensively, grsec itself had to stop providing stable releases, and the arguments for why many critical features, like the internal bug-prevention functionality, will never be mainlined and will instead be re-implemented with far less thoroughly tested surrogates have become more religious than logical.


The irony is that the grsecurity folks complain of limited time and unwillingness to do that work. But had they done so back then, they wouldn't still be maintaining a huge patchset against the massive moving target that is the kernel. Once a patch is in, the onus is on every kernel developer not to break it. As long as it remains in a silo, nobody has an obligation not to make breaking changes.

I don't have a lot of sympathy for that position. I respect the work, but not the hostile-to-collaboration approach.


This is the problem with grsecurity and the security community in general, I think. grsecurity includes things that are relatively safe and beneficial. It also - for example - redefines integer arithmetic kernel-wide in a way that causes massive false positives and kernel panics. Insisting that, if you don't want the parts likely to take critical servers down at the worst possible time for no reason, you may as well not bother at all is ridiculous, broken, and leads to reasonable people just not bothering.
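To make that concrete, here's a minimal userspace sketch - not grsecurity's actual GCC plugin, and checked_sub/time_after_toy are made-up names - of how a blanket "unsigned wraparound is a bug, halt" policy collides with arithmetic that wraps on purpose, like the kernel's jiffies-style time comparisons:

    /* Minimal userspace sketch, not the real grsecurity/PaX instrumentation. */
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Stand-in for a kernel-wide "overflow means bug, stop now" policy. */
    static uint32_t checked_sub(uint32_t a, uint32_t b)
    {
        uint32_t diff;
        if (__builtin_sub_overflow(a, b, &diff)) {
            fprintf(stderr, "unsigned wraparound in %u - %u\n", a, b);
            abort();                /* kernel analogue: panic/oops */
        }
        return diff;
    }

    /* Mimics the kernel's time_after(): relies on modular arithmetic so it
     * keeps working after the tick counter wraps around. */
    static int time_after_toy(uint32_t a, uint32_t b)
    {
        return (int32_t)checked_sub(b, a) < 0;
    }

    int main(void)
    {
        uint32_t deadline = UINT32_MAX - 5; /* armed just before the counter wraps */
        uint32_t now = 5;                   /* counter has since wrapped past it */

        /* "Is the deadline still in the future?" The correct answer is 0, and
         * plain modular arithmetic computes it fine, but the blanket check
         * aborts on the intentional wraparound instead: a false positive. */
        printf("%d\n", time_after_toy(deadline, now));
        return 0;
    }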


> This is the problem with grsecurity and the security community in general

Yep. They keep coming across as treating security as something binary. Either it is secure or it is not.

And the moment even the most convoluted of CVEs gets published, anything affected is in their view insecure. And thus needs to be taken out of production until a fix has been applied.

Frankly, it seems quite a number of the most outspoken in the community are not in it for the ho-hum daily security process, but for some kind of grand joust with "the man". Thus their low-water mark for security is "can it stop the NSA".


> This is why containers on Linux will probably never have the security that Solaris or the BSDs have

The dozens of changes which you refer to as containers are very much bolted on to both FreeBSD and Solaris. It would have been nice if that weren't true, but it's just not realistic to go back to the drawing board, for either FreeBSD or Linux, to get this single feature.

Just think about cgroups, which required some rather deep reworking of several subsystems in Linux and took years to get in place. And that's just one of many pieces that containers can potentially make use of.
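For a sense of what that one piece looks like from userspace, here's a minimal sketch of the cgroup v2 filesystem interface, assuming the unified hierarchy is mounted at /sys/fs/cgroup with the memory controller enabled and you run it as root; the "demo" group and the 128 MiB limit are just illustrative:

    /* Minimal sketch of the cgroup v2 filesystem interface; the "demo"
     * group, the mount point and the limit are illustrative assumptions. */
    #include <errno.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/stat.h>
    #include <unistd.h>

    static void write_file(const char *path, const char *val)
    {
        FILE *f = fopen(path, "w");
        if (!f) { perror(path); exit(1); }
        fputs(val, f);
        fclose(f);
    }

    int main(void)
    {
        char pid[32];

        /* Creating a directory in the hierarchy creates the cgroup. */
        if (mkdir("/sys/fs/cgroup/demo", 0755) && errno != EEXIST) {
            perror("mkdir");
            return 1;
        }

        /* Cap memory usage for everything placed in this group (128 MiB). */
        write_file("/sys/fs/cgroup/demo/memory.max", "134217728\n");

        /* Move the current process into the group... */
        snprintf(pid, sizeof(pid), "%d\n", getpid());
        write_file("/sys/fs/cgroup/demo/cgroup.procs", pid);

        /* ...so this shell, and anything it spawns, inherits the limit. */
        execlp("sh", "sh", "-c", "cat /proc/self/cgroup", (char *)NULL);
        perror("execlp");
        return 1;
    }

And that's only resource control; namespaces, seccomp, capabilities and the rest are separate pieces that grew into the kernel the same gradual way.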


I don't know about the security situation with FreeBSD, but the OpenBSD team seem to make security a central consideration of everything they do.

http://www.jp.openbsd.org/security.html


Why do you believe those are opposing activities? I can make a journey by helicopter or by walking and the destination is still the same; however, as I walk there, someone might point me to a better one.


That's what people used to say about OS kernels, but Linux proved them wrong by evolving the most successful kernel on the planet by almost all measures.


True, but one measure Linux is not the best at is security. By many accounts, OpenBSD leads the pack there.


> If there's one thing that defines the Linux kernel development process, it's that large/wide changes are never accepted in bulk. Changes must be broken up into independent pieces that make sense by themselves and can be well understood by the people doing the merges.

And that's how you end up with clearly superior subsystem implementations (BFQ) that linger outside the kernel, as GP mentions. Because a professor doing free work should somehow find a way to hack a cohesive working subsystem into the mess that is the old codebase, so as to satisfy the need of the maintainers (who are paid to work on this stuff) for a gradual transition (which will eventually rip out all of the old code anyway).


That's just it, isn't it? What makes it clearly superior? (BTW, I'm not arguing it isn't.)

Linux supports quite a lot of different workloads on quite a lot of different hardware. Looking for the best possible performance while ensuring reliability is a very big task and a very big responsibility, not to be taken lightly by stereotyping kernel maintainers as villains looking to make victims out of poor, well-intentioned contributors.

Also, arguing that contributors don't have the means to go through a public review process intended to result in better code doesn't really add much credibility to the original code, does it? (Not saying that's the case with BFQ either.)

There have been many instances in the past where "clearly superior" patches ended up having pathological behavior for many workloads - something that would only have been discovered after impacting a lot of users, if it weren't for the review processes in place. Some of these were merged anyway (but not made default) while others didn't prove to provide enough improvement (even in the cases they optimize for) to justify the extra bloat.


It's been a while since I've looked at the papers or read the discussion surrounding its inclusion into mainline (so I apologize if I get some details wrong), but my understanding regarding the first point is twofold:

> That's just it, isn't it? What makes it clearly superior? (BTW, I'm not arguing it isn't.)

A) There is a range of benchmarks which support that conclusion across various hardware and workloads. [0][1]

B) It is my understanding that present-day CFQ is a complicated codebase with various engineering improvements that don't have formal specs or theoretical guarantees, while the improvements BFQ brings have theoretical underpinnings [2][3], or at the least a spec other than "read the code to understand how and why".

> Looking for the best possible performance while ensuring reliability is a very big task and a very big responsibility not to be taken lightly by stereotyping kernel maintainers as some villains looking to make victims out or poor, well-intentioned, contributors.

It's definitely not my intention to villainize kernel developers (for example, Tejun Heo appears to be highly supportive in [4]), and I don't think Paolo is a victim here; but I do think that both casual and professional users are hurt by the end result.

> Also, arguing that contributors don't have the means to go through a public review process intended to result in better code doesn't really add much credibility to the original code, does it? (Not saying that's the case with BFQ either.)

I mean, to the extent that you are talking about ripping out a subsystem for a new one, it's nearly impossible to really talk about having the means. The latter is, I presume, a non-trivial task even for someone with a long track record of good engineering and maintenance. Paolo is, if anything, also a very good engineer - but that's not even the case for a lot of academics. I guess from my perspective, if an academic provides you with formalized improvements + benchmarks + working code, that's already really, really great; there is very significant reason to have engineers with a long track record of reliability who can review the specs, implement/verify/improve the code to get it up to standards, and deal with the requirements/politics for making these changes. Again, my understanding can be totally off here, I'm not a kernel developer - I am just a user who patches with BFQ (and CK for my personal machine) since, given the current process, improvements do not seem to be forthcoming.
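(For what it's worth, once the patched kernel is running, picking the scheduler is just a sysfs write. A minimal sketch, assuming a device named sda and a kernel that actually carries the BFQ patch:)

    /* Minimal sketch of switching a block device's I/O scheduler via sysfs;
     * "sda" and the presence of "bfq" in the kernel are assumptions. */
    #include <stdio.h>

    int main(void)
    {
        const char *path = "/sys/block/sda/queue/scheduler";
        char line[256];
        FILE *f;

        /* Lists the available schedulers with the active one in brackets,
         * e.g. "noop deadline [cfq] bfq". */
        f = fopen(path, "r");
        if (!f || !fgets(line, sizeof(line), f)) { perror(path); return 1; }
        fclose(f);
        printf("before: %s", line);

        /* Writing a name switches the queue to it (needs root). */
        f = fopen(path, "w");
        if (!f) { perror(path); return 1; }
        fputs("bfq\n", f);
        fclose(f);
        return 0;
    }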

[0] http://algo.ing.unimo.it/people/paolo/disk_sched/results.php

[1] http://algo.ing.unimo.it/people/paolo/disk_sched/comparison....

[2] http://algo.ing.unimo.it/people/paolo/disk_sched/description...

[3] http://algo.ing.unimo.it/people/paolo/disk_sched/bfq-techrep...

[4] https://lkml.org/lkml/2014/5/30/412


AFAIK they never tried submitting in bulk so I'm not sure why this (sensible, I think) process is used as an argument against them. I don't know if they tried to post piece by piece (and when/if they gave up).

But there's a pretty well-documented anti-grsecurity sentiment on the lists dating waaaay back. So if I were them I wouldn't necessarily be convinced that I should invest my time in slicing to begin with. ;)


VFS is a good example of something that's hairy and where it's easy to introduce regressions affecting everyone. That's why Al Viro dissects stuff in detail.

Most regressions are introduced in hardware drivers, which are the largest part of the kernel and the part that sees the most churn. It's actually great that we don't see more regressions, though the Intel DRM drivers have gotten worse in quality over the last year.



