Back in the days of MyISAM, before Google had their own ad network, I worked for the world's largest advertising network. It had a global reach of 75%, meaning three out of four people saw at least one of our ads daily.
I was trying to learn MySQL and the CTO made the mistake of giving me access to the prod database. This huge network that served most of the ads in the world ran off of only two huge servers running in an office outside Los Angeles.
MyISAM takes a table-level read lock on every SELECT query. I did not know this at the time. I was running a number of queries trying to pull historical performance data for all our ads across all time. They were taking a long time, so I let them run in the background while working on a spreadsheet somewhere else.
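(In hindsight the mechanics were simple. A rough sketch of the failure mode, with made-up table names, assuming the tables were MyISAM like the rest of the database:)

    -- Session 1: long analytics query. On MyISAM this holds a
    -- table-level read lock on `impressions` for its whole duration.
    SELECT ad_id, COUNT(*) FROM impressions GROUP BY ad_id;

    -- Session 2 (the ad servers): any write to the same table queues
    -- behind that read lock until session 1 finishes or is killed,
    -- and later reads then queue behind the waiting write.
    UPDATE impressions SET processed = 1 WHERE ad_id = 42;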
A little while later I hear some murmuring. Apparently the whole network was down. The engineering team was frantically trying to find the cause of the problem. Eventually, the CTO approaches my desk. "Were you running some queries on the database?" "Yes." "The query you ran was trying to generate billions of rows of results and locked up the entire database. Roughly three quarters of the ads in the world have been gone for almost two hours."
After the second time I did this, he showed me the MySQL EXPLAIN command and I finally twigged that some kinds of JOINs can go exponential.
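For the curious, this is the kind of thing EXPLAIN makes visible before you run anything. A simplified, hypothetical example (invented tables and columns):

    -- With no usable index on the join column, MySQL falls back to
    -- scanning one table once per row of the other.
    EXPLAIN
    SELECT a.campaign_id, i.shown_at
    FROM ads a
    JOIN impressions i ON i.ad_id = a.ad_id;
    -- "type: ALL" on both tables means full scans, and the "rows"
    -- estimates multiply: that product is the work the server is
    -- about to do, visible before a single row is fetched.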
Kudos to him for never revoking my access and letting me learn things the hard way. Also, if he worked for me I would have fired him.
Sounds like you appreciated that your boss gave you space to learn, and understood that you made an honest mistake, but you’d fire someone who made this mistake if they were working for you?
It's not good to punish people for making mistakes in the course of their work (especially if that work is meant to be educational)
It is good to punish people who give access to production databases to people who shouldn't have it. And the guy learning MySQL should not be given that access.
Taking down prod is always a symptom of a systemic failure. The person responsible for the systemic failure should see the consequences, not the person responsible for the symptom.
> The person responsible for the systemic failure should see the consequences
You don't see the contradiction in terms there? A systemic failure is by definition not the responsibility of one person. You're saying people should be able to make mistakes. But not those people.
Having worked in a very large, bureaucratic company, I can say that I strongly suspect just ignoring systemic failures as learning opportunities is also not sustainable. Too many times I’ve had to yield on something and say “I guess they’ll learn when this fails,” only to see them easily move on or get promoted before the failure occurs. They don’t learn their lessons.
I suspect the solution is to find a way to make sure the consequences of the decision are fed back properly into the system directed at the right person. How to do that, I have no idea.
It's not that you ignore them. But your first step if someone makes a mistake should not be to fire them. Maybe they need some coaching, maybe they have too much access or authority, but everyone makes mistakes. The key is whether or not they learn from them, or keep making them.
A piece of the system (a junior developer) is allowed to make mistakes. The person responsible for architecting and protecting the system (the CTO)... less so.
That depends; this might have been this CTO’s first time as CTO. Without knowing the story, they could well have been pretty far out of their element and just lucky to be a founder or something.
Even C-level people have to have their first day as C-level, and of course they will make mistakes.
The important thing is learning from them of course.
Well, one could argue that "giving access to production databases to people who shouldn't have it" was just "making mistakes in the course of their work" for this dude.
I’ve never understood the logic of firing someone over a mistake like that. They’re now the person least likely to make a similar mistake and they will maintain the institutional knowledge to help ensure it doesn’t happen again.
Easy: (1) He wasn't my boss. (2) He allowed a person not associated with his team or even the tech department to conduct potentially harmful operations on the production database without supervision. (3) Those actions resulted in millions of dollars of lost revenue and make-goods. (4) He did not coach the person who brought the database down. (5) He repeated the mistake.
How? AFAIK a single join can be at most quadratic, and multiple joins should be at most polynomial, where the exponent is the number of tables joined. To go exponential, you'd need some kind of recursion or self-reference, and I know of no way to express such a thing as an ordinary join statement.
(of course quadratic performance is already prohibitively slow on large tables, so there is no need to go exponential in order to take "forever")
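To make that concrete, a toy example (invented tables, each with n rows): the result of a k-table join is bounded by the product of the table sizes, so it grows polynomially in n and only blows up as you keep adding tables.

    -- Three tables of n rows each, joined on a key that barely
    -- restricts anything, can produce up to n * n * n rows.
    SELECT COUNT(*)
    FROM t1
    JOIN t2 ON t2.group_id = t1.group_id
    JOIN t3 ON t3.group_id = t1.group_id;
    -- Quadratic (two tables) is already enough to pin a large
    -- production database for hours; no true exponential needed.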
In our software, a minority codepath sometimes reported database deadlocks. Nothing critical, but it littered the ops error logs and probably displayed error messages to a few customers. So I added a pessimistic exclusive lock to a query, which basically solved the deadlock problem (not a great solution, but it worked). However, what I missed was that the query, even though it was in a minority code path, touched another table used in basically all hot-path queries. So the code seemed to work fine until it was deployed to all servers, at which point every operation across the whole cluster got serialized through this single lock.
So, yeah, database locks can bite you hard!
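Roughly the shape of it, with hypothetical table names (not the real queries):

    -- Intended fix for the rare deadlock: take the lock up front.
    START TRANSACTION;
    SELECT 1 FROM settings WHERE id = 1 FOR UPDATE;  -- exclusive row lock
    -- ... minority-path work ...
    COMMIT;

    -- The catch: if hot-path transactions also update (or lock) that
    -- same `settings` row, every one of them queues behind this lock,
    -- and throughput across the cluster collapses to one at a time.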
Not as bad as yours, but also MySQL and also blocking a prod table: on my first job after graduating, I once ran a delete command on about 20 rows of a quite large table (maybe 500M+ rows), and it caused deadlocks because of gap locks. It has been 6 years, so I don't really remember the details.
I was no expert, but I knew a bit about MySQL optimization at the time; it seems that sometimes you just do things without thinking them through.
15 minutes later, the sysadmin team PMed me asking WTF I was doing, and I realized what had happened.
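I don't remember the exact queries, but the general trap looks something like this (invented table, assuming an index on the range column and InnoDB's default REPEATABLE READ):

    -- Session 1
    START TRANSACTION;
    DELETE FROM events
     WHERE created_at BETWEEN '2019-01-01' AND '2019-01-02';
    -- Next-key locks now cover the scanned index range and its gaps.

    -- Session 2, concurrently
    INSERT INTO events (created_at, payload)
    VALUES ('2019-01-01 12:00:00', '...');
    -- Blocks on session 1's gap lock; several sessions doing this in
    -- different orders is a classic recipe for deadlocks.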
Someone hogging the database with an analytics query is an honest error because of an insidious footgun inherent in the technology stack. On the other hand, the CTO permitted access to the production database ... why? To learn MySQL, it would have been sufficient to set up a local instance, or to connect to testing/staging environments to get at some data.