By "reasoning" I meant the fact that o*(-mini) does "chain-of-thought", in other words, it prompts itself to "reason" before responding to you, whereas GPT-4o(-mini) just directly responds to your prompt. Thus, it is not appropriate to compare o*(-mini) and GPT-4o(-mini) unless you implement "chain-of-thought" for GPT-4o(-mini) and compare that with o*(-mini). See also: https://docs.anthropic.com/en/docs/build-with-claude/prompt-...
Yes you use the models for the same things, and one is better than the other for said thing. The reasoning process is an implementation detail that does not concern anybody when evaluating the models, esp since "open"ai does not expose it. I just want llms to to task X which is usually "write a function in Y language that does W, taking these Z stuff into account", and for that i have found no reason to switch away from sonnet yet.