A principled way to solve problems

travisjungroth · on May 6, 2022

I have a few variants of problem solving methodologies for different situations. They are incredibly similar to this post. They are also incredibly similar to the process in How to Solve It[0].

I intended to memorize the one-page summary to How to Solve It last year. I generally don't get past the first few sentences because I find them so relevant. It's like a mantra in my mind. "Understanding the Problem. First. You have to understand the problem."

[0]Summary: https://www.math.utah.edu/~alfeld/math/polya.html

hinkley · on May 6, 2022

I sometimes state this as, “ask ‘why’ one more time than you think is reasonable.” Otherwise known as “Ask ‘why’ until you get tired of it, then ask one more time.”

If you don’t understand the problem you’re fixing symptoms.

travisjungroth · on May 6, 2022

The "Why?" questions are a bit different. Usually they take you to another problem versus telling you about the problem at hand. In its worst form, because it's such a simple technique, you can get quite lazy about it.

If you start with "The website is slow.", the Why? questions might take you to "Why is speed important to our users?"

Maybe before you jump off the original problem so quickly, it's good to understand it. "Who said that? Are they talking about latency or speed? Is it the time to the beginning of the response or the completion or something in between? What browser and internet were they on? How fast is the website?"
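To make the "beginning of the response or the completion" distinction concrete, here is a minimal Python sketch. It assumes the response arrives as an iterable of byte chunks; `time_response`, `Timing`, and the injectable `clock` parameter are hypothetical names for illustration, not part of any library.

```python
import time
from typing import Callable, Iterable, NamedTuple


class Timing(NamedTuple):
    first_chunk_s: float  # seconds until the first chunk arrived
    total_s: float        # seconds until the stream was exhausted


def time_response(chunks: Iterable[bytes],
                  clock: Callable[[], float] = time.perf_counter) -> Timing:
    """Consume a chunked response and report two different notions of 'slow'."""
    start = clock()
    first = None
    for _ in chunks:
        if first is None:
            # Record the moment the first chunk shows up.
            first = clock() - start
    total = clock() - start
    # An empty response has no first chunk; fall back to the total time.
    return Timing(first if first is not None else total, total)
```

A site can look fine on one number and bad on the other, so it helps to know which one the complaint was about before asking any "Why?" questions.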

patrck · on May 6, 2022

It's good to practice the different situations.

I spend most of my time `duck-talking` with a TeX doc that alternates between annotated questions and answers from some REPL. When I go AFK and have a problem, I try to punt because otherwise I am much more likely to make a mistake. :/

jrvarela56 · on May 6, 2022

Equally important as asking 'what problem are we solving?' is to ask 'what does it look like for this problem to be solved?'.

The tough part about having problem-focus is not being able to converge towards a solution. There's fear about being too hasty or going into specifics too early. Getting to a good enough solution as a group requires some leaps from problem-space that may feel like abandoning problem-focus.

asplake · on May 6, 2022

Right. Related to that:

Why is solving that important?

How would we know that we’ve solved it?

When we’ve solved it, then what happens?

danbruc · on May 6, 2022

The most important point is the first one.

Ask yourself what’s the problem you’re trying to solve

Be sure that you understand it deeply, and that you at least have an idea of how would you know that the problem is solved.

But I think this still does not stress enough how important truly understanding the problem is. More often than not, the problem you are immediately facing is not the problem you should solve. When you are facing some problem, there might be a more or less obvious way to solve it. But then think again. And again. Is this really your actual problem or did you go wrong much earlier? If this is indeed the case and you can identify the true root problem, you will immediately know - not only will your immediate problem suddenly become trivial, but you will also see how many other things that never seemed quite right could be improved.

mrandish · on May 6, 2022

Over the years I fell into a role where I get called in to help solve business problems in different parts of a large tech company when the team working on it gets stuck. I always begin by gathering the key stakeholders and asking "What exactly are we trying to accomplish?"

After working enough of these through to resolution, I've noticed that more than half the time, the key to the root problem is found in the answer to that first question. That's how I learned to start with that question and then to carefully probe the exact terms and associated meanings in the various answers that emerge around the table. As I gently push toward agreement on a clear, concise problem statement, different definitions, scopes, boundaries, etc. emerge. It's rarely a short conversation, and oftentimes we never get past that first question without discovering something really key to unblocking the situation.

CharlesW · on May 6, 2022

> I always begin by gathering the key stakeholders and asking "What exactly are we trying to accomplish?"

Also the first question listed in TFA.

_carbyau_ · on May 6, 2022

I see troubleshooting as being more like the OODA loop: https://en.wikipedia.org/wiki/OODA_loop

Observe - Orient - Decide - Act

But it might be more palatable as: Observe - Understand - Plan - Do

Often, the issue will be a symptom that you don't necessarily understand and so the "Plan - Do" part of the loop will aim to get more information so as to better understand. I.e., maybe an experiment through change, or maybe simply gathering some logs.

But in a large organisation working on an urgent problem it can be hard to have everyone involved being in the same stage of the loop with the resulting chaos you'd expect.

mellavora · on May 6, 2022

The OODA loop is explicitly as you said

> Observe - Orient - Decide - Act

The "Orient" step is usually done inside of your existing mental framework (Boyd is pretty explicit about this, see the blue box in your linked wikipedia article).

Now if you are a great troubleshooter operating in your field of expertise, your "orient" step (and mental framework) might be sufficient to find and solve the trouble. Think of an experienced engineer noticing some issues in a junior's code submission.

But OODA is not so good for solving problems where you need to extend your mental framework.

Your proposed:

> Observe - Understand - Plan - Do

could very well be a good approach, but it is not at all OODA, especially if the Understand phase involves a re-evaluation of your mental framework for the problem.

Hence all the "whys". You are questioning the situation and the way you think about the situation.

_carbyau_ · on May 9, 2022

That blue box specifically has "new information" and "analysis and synthesis". If your previous Act step got you new information you can "orient" yourself with respect to it.

OODA is often thought of in a "do or die, moments count" context (given its origins...) where learning lots of stuff is often impossible, hence irrelevant. But with a longer time scope you can learn more. And given 10 minutes, a person familiar with the environment can learn a lot about an issue, without necessarily having solved it.

Applied to troubleshooting, so often the Decide and Act bits are about gathering more information! Hence my "logs" or "expt" examples. Maybe it is breaking out a debugger, or gathering stats on A vs B, but each Action gets one step closer.

The main reason I like to think of it this way is to prompt "stepping back". By being able to say "Where am I at in the loop?" it gives permission to yourself to mentally step back from the coalface for a minute while still not giving up on the issue.

And every time I have seen successful leadership in resolving a problem, it is effectively someone acting kind of like a "flight controller", making sure the workers' OODA loops at least don't mess each other up and, better yet, are collaborative. I.e., keep them bubbling until someone says "I've got it!".

For me, OODA is valuable in the post incident review. You can lay out every step and figure out how to get there faster.

edmundsauto · on May 6, 2022

Do you find OODA useful for general problem solving? I’ve only found it useful in dynamic environments, especially when there are other agents who act in addition to reacting.

The theory is that you want to get inside the other agents' decision loop, so you can iterate faster than they do. For general problems, I have not found it very useful - curious how you use it for a typical engineering problem, if you wouldn’t mind sharing an example.

_carbyau_ · on May 9, 2022

Like a lot of thought models it depends how you apply it as to whether it is any use.

OODA came from adversarial practice. So often the intent is to be competitive vs the adversary. But if your adversary is inert - a problem to fix - then your time pressure to "get inside their loop" is gone. But of course you have other time pressures - presumably management or someone screaming for you to fix it. Or maybe it is a memory leak that you know will hit a critical point.

From my reply to a sibling comment:

OODA is often thought of in a "do or die, moments count" context (given its origins...) where learning lots of stuff is often impossible, hence irrelevant. But with a longer time scope you can learn more. And given 10 minutes, a person familiar with the environment can learn a lot about an issue, without necessarily having solved it.

Applied to troubleshooting, so often the Decide and Act bits are about gathering more information! Hence my "logs" or "expt" examples. Maybe it is breaking out a debugger, or gathering stats on A vs B, but each Action gets one step closer.

The main reason I like to think of it this way is to prompt "stepping back". By being able to say "Where am I at in the loop?" it gives permission to yourself to mentally step back from the coalface for a minute while still not giving up on the issue.

And every time I have seen successful leadership in resolving a problem, it is effectively someone acting kind of like a "flight controller", making sure the workers' OODA loops at least don't mess each other up and, better yet, are collaborative. I.e., keep them bubbling until someone says "I've got it!".

For me, OODA is valuable in the post incident review. You can lay out every step and figure out how to get there faster.

civilized · on May 6, 2022

There is a good insight here - people often forget to evaluate whether their ideas really solve the problem that motivated them to have the idea. Once they have the idea, their brains get all fixated on "if we do the idea, the world will become magical".

jll29 · on May 6, 2022

> how would you know that the problem is solved [?].

That's the key: the clearer you are about assessing or measuring success, the easier finding the actual solution will become. Often, trying to define a quantitative metric for how close a partial solution is to the ultimate (complete) solution can help you track progress, and then you're no longer "flying blind" on your innovation journey.

jll29 · on May 6, 2022

We should also ask ourselves: how can the problem be avoided in the first place.

Some of the most elegant solutions in my career I have been able to find by inquiring whether the problem should exist at all, as it may be a side-effect of a broken process.

revskill · on May 6, 2022

To me, it's divide and conquer.

yakkomajuri · on May 6, 2022

Nice blog style btw

mowfask · on May 6, 2022

My approach to solving problems that span complex systems:

1. Instrument 2. Measure 3. Interpret 4. Act

Iterate as necessary.

I have come to see this pattern working on electronics design, embedded software, industrial control systems, networking and webapp backends.

Breaking it down:

1. Instrument

Understand subsystems and their interfaces. Use tooling around these interfaces to trace the interplay between subsystems. Make sure all tooling is synchronized so you can correlate information across tools via timestamps. If you can't instrument remotely, bite the bullet and reproduce locally. This ties back into the design phase: Design interfaces to be instrumentable, ideally remotely. Test points on PCBs, traceable APIs in software, using network protocols that tools like wireshark can decode. Pub/Sub systems are great for this, as you can easily add another subscriber for instrumenting all communication. Don't rely on "what happens to be available" for instrumentation. AWS CloudWatch will miss that one crucial piece of information. Your oscilloscope tip will not make reliable contact on a QFN pad. Simply stated: Become good at interfaces and make them accessible.
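As a rough illustration of the Pub/Sub point above (adding one more subscriber to observe all communication), here is a minimal in-process sketch in Python. `Bus`, `tap`, and the topic name are hypothetical and invented for this example; a real broker will have its own API for wildcard or catch-all subscriptions.

```python
from collections import defaultdict
from typing import Any, Callable

# A handler receives the topic and the message payload.
Handler = Callable[[str, Any], None]


class Bus:
    """Tiny in-process publish/subscribe bus (illustrative only)."""

    def __init__(self) -> None:
        self._subs = defaultdict(list)  # topic -> list of handlers
        self._taps = []                 # handlers that see every topic

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subs[topic].append(handler)

    def tap(self, handler: Handler) -> None:
        """Register an instrumentation subscriber for all topics."""
        self._taps.append(handler)

    def publish(self, topic: str, message: Any) -> None:
        # Taps run first so the trace records a message even if a
        # regular handler later raises.
        for handler in self._taps:
            handler(topic, message)
        for handler in self._subs[topic]:
            handler(topic, message)


bus = Bus()
trace = []
bus.tap(lambda topic, msg: trace.append((topic, msg)))
bus.subscribe("motor/speed", lambda topic, msg: None)  # normal consumer
bus.publish("motor/speed", 1200)
```

The point of the design is that the producer and the normal consumer do not change at all; instrumentation is just one more subscriber.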

2. Measure

Take the time to properly run tests and gather data. For issues in systems spanning mechanical, electrical, digital and software domains, you won't have one tool to do it all for you. Data preparation and cross correlation will be a manual process in most cases. That is ok.
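A minimal Python sketch of the manual cross-correlation described above, assuming each tool's export is already sorted by time and all timestamps come from synchronized clocks. `Event`, `merge_timelines`, and `near` are hypothetical helper names, and the sample events are made up for illustration.

```python
import heapq
from typing import Iterable, Iterator, List, NamedTuple


class Event(NamedTuple):
    t: float     # seconds on a shared, synchronized clock (assumption)
    source: str  # which tool produced the event
    text: str


def merge_timelines(*streams: Iterable[Event]) -> Iterator[Event]:
    """Merge per-tool event streams, each sorted by time, into one timeline."""
    return heapq.merge(*streams, key=lambda e: e.t)


def near(timeline: Iterable[Event], t: float, window_s: float) -> List[Event]:
    """Return events within window_s seconds of time t."""
    return [e for e in timeline if abs(e.t - t) <= window_s]


# Made-up sample data from two tools.
scope = [Event(1.0, "scope", "trigger"), Event(3.0, "scope", "glitch")]
log = [Event(2.9, "log", "retry"), Event(5.0, "log", "ok")]

timeline = list(merge_timelines(scope, log))
suspects = near(timeline, 3.0, 0.2)  # what else happened around the glitch?
```

If the clocks are not synchronized, the merge still runs but the resulting order is meaningless, which is why the synchronization advice in step 1 matters.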

3. Interpret

This is about understanding your problem and digging down from high-level symptoms to low-level root causes. Don't jump to conclusions. Let the data sink in to identify second order effects. Don't rush it because of pressure from your boss or the customer.

4. Act

Now that you understand your problem at a deeper level, it should be straightforward to apply corrective action. This might not solve the issue yet, but you will get closer to the root cause.

Two notes:

* Never stop after step 4! Always iterate once more so you can be confident the issue is actually solved and not just hidden by some effect.

* If you're a team player, document each step. A short note and screenshot in an issue tracker go a long way.

About engineering mindsets:

I find it infuriating when people calling themselves engineers don't follow any practice like this. Yes, you can solve problems through sheer experience or by hitting your head against the wall for long enough. Alone. On simple systems. But working together on complex systems you have to apply some methodology. Doesn't have to be my methodology, just not no methodology. For me a big red flag is when engineers don't understand why something works. Not understanding why something doesn't work is ok. We are human and systems are complex. But getting something to work, wondering why it does and then sending it to the customer? That's not engineering, that's tinkering. It's asking for trouble.

mellavora · on May 6, 2022

> I find it infuriating when people calling themselves engineers don't follow any practice like this.

Andy Grove's (the engineering legend) wisdom is summed up in one sentence: "Everything is a process, which can be improved."

If you don't have a process for solving a problem (where problem is as implied by the parent), you are not engineering.

mejutoco · on May 6, 2022

I agree. Note also that if your process resembles sacrificing a goat you might also not be engineering (a process is necessary but not sufficient).

IMO an alternative way of putting it is you need to apply the scientific method.

https://en.wikipedia.org/wiki/Scientific_method

edit: grammar

StopDarkPattern · on May 6, 2022

Spoiler. There are no principles referenced. This is just a random list of steps for problem solving. Please provide the decisions that come out of this grand scheme so we can learn so much wisdom.