How long ago was this? I had this same experience, but there's a new implementation for memory as of a few months ago which seems to have solved this weird "I need to mention every memory in every answer" behavior.
In 2025 there's no reason for anyone to be logging into an AWS account via the root credentials and this should have been addressed in the preventative measures.
There's no actual control improvements here, just "we'll follow our procedures better next time" which imo is effectively doing nothing.
Also this is really lacking in detail about how it was determined that no PII was accessed. What audit logs were checked? Where was this data stored?
Overall this is a super disappointing postmortem...
> In 2025 there's no reason for anyone to be logging into an AWS account via the root credentials and this should have been addressed in the preventative measures.
I am curious what preventative measures you expect in this situation? To my knowledge it is not actually possible to disable the root account. They also had it restricted to only 3 people with MFA which also seems pretty reasonable.
It is not unheard of to have a situation where your normal means of logging in is unavailable (let's say it relies on Okta and Okta goes down) and you need to get into the account; root may be your only option in a disaster situation. Given this was specifically for oncall, someone having that makes sense.
Not saying there were not failures because there clearly are, but there have been times I have had to use root when I had no other option to get into an account.
You don't need the root account, unless you need to bypass all policies. In such a scenario, you can use the root access reset flow instead, reducing standing access.
As for other flows (break glass, non-SSO etc), that can all be handled using IAM users. You'd normally use SAML to assume a role, but when SSO is down you'd use your fallback IAM user and then assume the role you need.
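For the curious, a minimal sketch of what that fallback looks like, assuming a local profile for the break-glass IAM user plus a made-up account ID and role name (none of these names come from the article):

    import boto3

    # Break-glass flow: authenticate as the fallback IAM user (stored locally
    # as the "breakglass" profile), then assume the role you actually need.
    # Profile name, account ID, and role name are placeholders.
    session = boto3.Session(profile_name="breakglass")
    sts = session.client("sts")

    resp = sts.assume_role(
        RoleArn="arn:aws:iam::123456789012:role/OncallAdmin",
        RoleSessionName="sso-outage-breakglass",
        DurationSeconds=3600,
    )
    creds = resp["Credentials"]

    # Work with the short-lived role credentials instead of the IAM user's
    # long-lived keys.
    ec2 = boto3.client(
        "ec2",
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )

The point is that the long-lived credential only exists to bootstrap into a role, so the real permissions still live on the role.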
As for how you disable the root account: standalone accounts can't, but you can still prevent use/misuse by setting a long random password and not writing it down anywhere. In an Org, the org can disable root on member accounts.
To me that sounds like security by obscurity not actual security.
If you have the ability to go through the reset flow, then why is that much different from the username and password being available to a limited set of users? That would not have prevented this from happening if the determination was made that all 3 of these users need the ability to possibly get into root.
As far as having an IAM user, I fail to see how that is actually much better. You still have a user sitting there with long-running credentials that need to be saved somewhere outside of how you normally access AWS. Meaning it is also something that could easily be missed if someone left.
Sure yes you could argue that the root user and that IAM user would have drastically different permissions, but the core problem would still exist.
But then you are adding another account(s) on top of the root account that must exist that you now need to worry about.
Regardless of the option you take, the root of the problem they had was twofold. First, they did not have alerts on usage of the root account (which they would still need if they switched to long-running IAM users instead, and now they would also need to monitor root since that reset flow exists). Second, their offboarding workflow did not properly rotate that password, and a similar problem would exist with a long-running IAM user that needed to be deleted.
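For reference, the alerting half is usually a metric filter on the CloudTrail log group plus an alarm. A rough sketch, assuming CloudTrail is already delivering to a log group; the log group name and SNS topic ARN are placeholders:

    import boto3

    logs = boto3.client("logs")
    cloudwatch = boto3.client("cloudwatch")

    # CIS-style filter that matches any action taken by the root identity.
    logs.put_metric_filter(
        logGroupName="cloudtrail-logs",  # placeholder
        filterName="root-account-usage",
        filterPattern='{ $.userIdentity.type = "Root" && $.userIdentity.invokedBy NOT EXISTS && $.eventType != "AwsServiceEvent" }',
        metricTransformations=[{
            "metricName": "RootAccountUsage",
            "metricNamespace": "Security",
            "metricValue": "1",
        }],
    )

    # Page someone whenever that metric is non-zero.
    cloudwatch.put_metric_alarm(
        AlarmName="root-account-usage",
        Namespace="Security",
        MetricName="RootAccountUsage",
        Statistic="Sum",
        Period=300,
        EvaluationPeriods=1,
        Threshold=1,
        ComparisonOperator="GreaterThanOrEqualToThreshold",
        TreatMissingData="notBreaching",
        AlarmActions=["arn:aws:sns:us-east-1:123456789012:security-alerts"],  # placeholder
    )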
At the end of the day there is not a perfect solution to this problem, but I think just saying that you would never use root is ignoring several other issues that don't go away just by not using root.
Not using root means not bypassing policies, and when you do use root there is no way to not bypass all policies. So yes, never using root makes that issue go away completely.
As for all the other stuff: what it does is create distinct identities with distinct credentials and distinct policies. It means there is no multi-party rotation required; you can nuke the identity and credentials of a specific person and be done with it. So again, a real solution to a real problem.
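Concretely, offboarding then becomes "delete that one person's break-glass user" instead of "rotate a shared secret among everyone who knew it". A sketch, with a made-up user name:

    import boto3

    iam = boto3.client("iam")
    user = "breakglass-jane"  # hypothetical per-person break-glass user

    # Remove console access, if it exists.
    try:
        iam.delete_login_profile(UserName=user)
    except iam.exceptions.NoSuchEntityException:
        pass

    # Remove all long-lived access keys.
    for key in iam.list_access_keys(UserName=user)["AccessKeyMetadata"]:
        iam.delete_access_key(UserName=user, AccessKeyId=key["AccessKeyId"])

    # Deactivate MFA devices; after detaching policies and groups the user
    # itself can be deleted.
    for mfa in iam.list_mfa_devices(UserName=user)["MFADevices"]:
        iam.deactivate_mfa_device(UserName=user, SerialNumber=mfa["SerialNumber"])

No one else's access changes, and nothing is left behind for the departed person to use.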
It depends on what the goal of all of this was, which is unclear. If the goal was simply to get the data that they originally wanted it does not solve that problem and it would have just happened a different way.
According to the article there were 11 days between the first actions taken and them finding out it happened.
Say that instead of a root account you have a long-running IAM user that can assume the role you normally use through SSO. If you also do not monitor that user with proper alerts and proper offboarding procedures, then they could have logged into that account and retrieved the data they wanted.
Which again is why I am saying that just not using root is not a magic bullet that would have avoided problems. Maybe the situation would have been different, but they still could have done a lot in 11 days.
The problem was that the user's credentials were revoked, but because the root account was a shared credential it wasn't. Had the break-glass account also been a user-specific account, it would have fit into any 'revoke everything for user XYZ' workflow instead of being a root-account edge case.
So, in short, this would likely have prevented this, as the normal offboarding for user-bound credentials already worked out fine.
Does it? Pretty sure that logging in as root generates one CloudTrail event per action, regardless of whether you did it with a saved password or you reset the password. Resetting the password doesn't generate a CloudTrail event as far as I've seen.
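If you want to check what a root session actually did, you can pull the records back out of CloudTrail. A quick sketch; the identity type isn't a lookup attribute, so filter client-side on the event record:

    import json
    import boto3

    ct = boto3.client("cloudtrail")

    # Look up recent console logins and keep only the ones made as Root.
    pages = ct.get_paginator("lookup_events").paginate(
        LookupAttributes=[{"AttributeKey": "EventName", "AttributeValue": "ConsoleLogin"}]
    )
    for page in pages:
        for event in page["Events"]:
            record = json.loads(event["CloudTrailEvent"])
            if record.get("userIdentity", {}).get("type") == "Root":
                print(event["EventTime"],
                      record.get("sourceIPAddress"),
                      record.get("responseElements", {}).get("ConsoleLogin"))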
Sometimes I log into the root account to see the billing information.
I created an "administrator" account, but apparently it can't see the billing information, including the very-important amount of remaining cloud credits.
Maybe I could spend time fiddling with IAM to get the right privileges, but I have more pressing tasks. And besides, on my personal AWS account I only log in with the root account.
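For what it's worth, there are usually two separate things in play here: the root user has to enable "IAM user and role access to Billing information" in the account settings, and the IAM identity needs a billing permission. A rough sketch of the second part (the user name is a placeholder; newer accounts may need the fine-grained billing actions rather than the legacy aws-portal ones):

    import json
    import boto3

    iam = boto3.client("iam")

    # Legacy billing-console permissions; assumes the root-only account
    # setting for IAM access to billing has already been enabled.
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["aws-portal:ViewBilling", "aws-portal:ViewUsage"],
            "Resource": "*",
        }],
    }

    iam.put_user_policy(
        UserName="administrator",  # placeholder
        PolicyName="view-billing",
        PolicyDocument=json.dumps(policy),
    )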
It's not an enforcement issue so much as it is a heavily exploited loophole. Part of the reason freight trains are so long is so that they can't fit in passing sidings. Since Amtrak does fit, they end up having to yield because the freight trains simply cannot.
Could this be fixed by legislation on max train length to ensure all trains fit in sidings? Yes. Will that legislation get passed? No.
This is correct but needs more explanation. What the commenter is alluding to is Precision Scheduled Railroading ("PSR") [1]. Basically this means having really long trains with half the crew and cutting down on safety inspections to increase profits by spending more time delivering freight. It also gets around the Amtrak priority. Why fewer staff? Because you only need one engine crew for a train twice as long.
Increasing train length on tracks not designed for it is a safety issue. Think about it: you have a whole bunch of separate carriages. Some are turning because that's where they are on the track. Others are going uphill, yet others downhill. All of these forces become a problem that arguably increases the likelihood of derailment, the kind we had in East Palestine, Ohio a few years ago.
The labor situation is so bad that there was the threat of a strike in the Biden administration. For what? Paid sick leave, mainly. Biden got Congress to use their powers to end a strike by "essential" workers and then quietly later went and partially conceded to their demands.
Retiring crews haven't been replaced, so labor is at dire levels, all to slightly increase profits. It was estimated that if UP conceded to all the union's demands it would reduce their profit by 6%. Not revenue, profit.
Having an enforced max length on any route, especially those with commuter service, is not a bad idea. The tendency of freight is to scale up the number of cars as much as possible for efficiency, while passenger services work better shorter with more frequent service.
Yes there are myriad other reasons Amtrak gets delayed, it is not like this is the only bottleneck they have, but that doesn't mean this is not also a key problem.
No idea how true/false the comments are, but one reason to lie would be to scapegoat someone else for Amtrak's problems. If an airline's flights were regularly delayed by 6-24 hours, they'd go out of business.
I don't think anyone is outright lying. I think they are just not telling the full story. What that full story is though I have no idea, nor do I have any way to figure it out. (Any investigative reporters want to spend a year or two tracking this down? Beware that there probably isn't enough interest to pay for the time you spend)
From what people in the industry have told me, freight train management is no less scummy than any other kind of freight transportation management, and they continually make trains longer and longer despite nearly everyone’s objections. Some are miles long so there’s no way engineers can see the front of the train even with a gentle curve, and they’re taking hazardous cargo through populated areas.
Wasn't there a big train crash with hazardous materials on an understaffed train a few years ago? And a strike for more sensible working conditions that was struck down by Congress?
I love that the US moves so much freight by train rather than truck, but everything I hear about how trains are run in the US sounds terrible.
The infrastructure is horrifying and the railroads do everything possible to defer any and all maintenance. Practically every train arrives late, but the customers can't really do much about it (how else are you going to move 4 million pounds of coal?)
Yeah, kind of like when they put 'style' on the end of a product that copies the aesthetic of something without functionally being that thing (like a kosher-style deli or a professional-style stove), the mainstream Democratic leaders are pro-labor-style politicians.
I've been out for a couple years and was just a code monkey anyway. But they did treat me poorly, and it was quite shocking seeing how the sausage was made.
Yes, but S3 has single-region redundancy that is better than GCP's. Your data in two AZs in one region is in two physically separate buildings. So multi-region is less important for durability.
Thank you for creating the project! Oddly enough I came across fck-nat last week and have been looking into incorporating it into my project app to avoid paying the $30+ per month for the managed solution.
As someone who is new to setting up VPCs and networking, how does this work? I was so curious I even tried to query ChatGPT about it a couple of days ago but I got a less than satisfactory answer.
Is the secret to making it work disabling the "source destination check"? Say a host in the private subnet wants to connect to a host on the internet: it tries to connect to <PublicIP> and sends some IP packets over the subnet via the ENI. Does the VPC subnet act like an old-school ethernet segment where fck-nat receives the packets for <PublicIP> (since the source/dest check is disabled), forwards them to the internet gateway, and does the network address translation when it receives a response packet?
From the VPC perspective, the key here is understanding that subnets within VPCs have route tables that determine where traffic from your subnet goes next. In this case traffic to the internet is sent to an interface on the NAT instance.
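Roughly, yes. On the VPC side there are only two pieces, sketched below with placeholder IDs; on the instance itself, fck-nat turns on IP forwarding and an iptables masquerade rule so return traffic gets translated back:

    import boto3

    ec2 = boto3.client("ec2")

    # 1. Let the NAT instance receive packets that aren't addressed to it
    #    (this is the "source/dest check" being disabled).
    ec2.modify_instance_attribute(
        InstanceId="i-0123456789abcdef0",  # placeholder NAT instance
        SourceDestCheck={"Value": False},
    )

    # 2. Point the private subnet's default route at the NAT instance's ENI,
    #    so 0.0.0.0/0 traffic from private hosts is delivered to it. It then
    #    rewrites the source address and forwards packets out through the
    #    public subnet's internet gateway.
    ec2.create_route(
        RouteTableId="rtb-0123456789abcdef0",       # private subnet's route table
        DestinationCidrBlock="0.0.0.0/0",
        NetworkInterfaceId="eni-0123456789abcdef0", # NAT instance's ENI
    )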