It might be okay even without the "is it being 'distributed'?" question. The GPL on the kernel doesn't require that an entire distribution be 100% free-software, just that modifications to the actual kernel have to be GPL'd (and even then, there's an exception for proprietary kernel modules). So it's possible Amazon is including some non-open-sourced libraries, userland tools, or kernel modules in their distribution.
I must have scrolled right past it when reading the PDF.
It turns out that no, you can't materialize all the source with a single command. get_reference_source is a rather absurd Python script: https://gist.github.com/86abe580675500a35900
It requires your nonsecret AWS account ID as a parameter, but the server it's making requests to is only available inside the EC2 network. They already know who you are, they rented you the damn box! There's no check that the account ID is the one that created the box either. As an added bonus, the input sanitization code allows dashes in the account ID, which is how Amazon always displays it, but then passes the ID along verbatim to a web service that does not accept them.
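For what it's worth, the dash problem looks like a one-function fix. A minimal sketch, assuming the web service only wants the bare 12 digits; normalize_account_id is a name I made up, it is not in the original script:

```python
import re

def normalize_account_id(raw):
    """Return the account ID as 12 bare digits, dropping the display dashes."""
    # Assumption: the service accepts only the digits, with no separators.
    digits = raw.strip().replace("-", "")
    if not re.fullmatch(r"[0-9]{12}", digits):
        raise ValueError("expected a 12-digit AWS account ID, got %r" % raw)
    return digits
```

So "1234-5678-9012", the way the console shows it, and "123456789012" would both be sent as the same value.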
It takes one package at a time, which must already be installed for it to match the name, and the script is interactive: it always attempts to prompt you with "Are these parameters correct? Please type 'yes' to continue", even if it's not connected to a TTY.
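The prompt is equally easy to make scriptable. A rough sketch of what I'd expect instead, where confirm and assume_yes are hypothetical names, and skipping the prompt when stdin is not a TTY is my own design choice rather than anything the script does:

```python
import sys

def confirm(prompt, stdin=None, assume_yes=False):
    """Ask for confirmation only when a human can actually answer."""
    stdin = sys.stdin if stdin is None else stdin
    # No terminal attached (cron, pipe, ssh without a pty): don't block on input.
    if assume_yes or not stdin.isatty():
        return True
    sys.stdout.write(prompt + " ")
    sys.stdout.flush()
    return stdin.readline().strip().lower() == "yes"
```

A stricter variant could refuse to run without a TTY unless assume_yes is set, which is arguably safer for anything destructive.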
The web service responds with a unique signed S3 URL that is set to expire 30 minutes in the future, plus or minus a minute or so. The script then downloads the file to a fixed location: /usr/src/srpm/debug/
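If you want to see how long a given URL is good for, here is a small sketch. It assumes the old-style query-string-signed S3 URL, which carries an Expires parameter in Unix epoch seconds; I haven't pasted the actual response, so treat the URL shape as an assumption, and seconds_until_expiry is my own helper name:

```python
import time
from urllib.parse import urlparse, parse_qs

def seconds_until_expiry(signed_url, now=None):
    """Return how many seconds remain before the signed URL expires."""
    query = parse_qs(urlparse(signed_url).query)
    # Assumption: the URL has an Expires=<epoch seconds> query parameter.
    expires = int(query["Expires"][0])
    now = time.time() if now is None else now
    return expires - now
```

A result around 1800 right after the request would match the 30-minute window described above.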
Most of this could be alleviated by just hacking up the shitty Python script, but still, this is ridiculous. Why did they do it at all?
It looks like this is, roughly, a very scaled-down version of CentOS 5 with EC2 tweaks and all the EC2 tools pre-installed. Great if you want something super lean and ready to customize.
They claim that most CentOS packages should work out of the box.
It’s basically a custom Linux distro maintained by Amazon that’s based on CentOS 5. One of the biggest problems with EC2 until recently was that the official AMIs provided by Amazon were ridiculously out of date. For example, the standard Fedora AMI is Fedora 8, while the latest version of Fedora is Fedora 13.
However, the fact that it's based on CentOS 5 doesn't seem great to me. Granted, I've only started using CentOS recently, but it seems like most packages are not kept up to date in the default repositories, and you end up having to add a few independently maintained ones to get more recent software.
But yeah, you can always just install from source, I suppose. Perhaps I'm "Doing It Wrong" on CentOS.
CentOS 5.x tracks RHEL 5.x exactly, bug for bug. That is why some of it appears old.
If you don't need or want a free version of Linux that is exactly compatible with RHEL, then simply add some repositories, such as EPEL or the DAG repository (http://dag.wieers.com/rpm/).
It doesn't allow anything that wasn't possible before. It's just a virtual machine image. Kinda like how anyone could roll their own Windows virtual machine images in Virtual PC by booting up and installing Windows, but Microsoft provides pre-built images with Windows so you don't have to.
The point of this is to make one image that is up to date and works well so that there's a good official choice. Even if you're fine with rolling your own, not having to do so really lowers the barrier to entry.
Is this actually legit under the GPL given that in use the software is never conveyed beyond Amazon's property?