> I haven’t tried o3, but one issue I struggle with in large context analysis tasks is the LLMs are never thorough.
o3 does look very promising with regard to large-context analysis. I used the same raw data and ran the same prompt as Simon for GPT-4o, GPT-4o mini, and DeepSeek R1, then compared their outputs. You can find the analysis below:
https://beta.gitsense.com/?chat=46493969-17b2-4806-a99c-5d93...
The o3-mini model was quite thorough. With reasoning models, it looks like handling long context might have gotten a lot better.
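For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of the workflow: run one prompt plus the raw data against each model and save the outputs for side-by-side review. The model IDs, file names, and the DeepSeek endpoint here are my assumptions, not details pulled from the chats linked above.

```python
# Minimal sketch: send the same prompt + raw data to several chat models and
# write each answer to its own file so the analyses can be diffed afterwards.
# Model IDs, the DeepSeek base URL, and file names are illustrative assumptions.
import os
from openai import OpenAI

PROMPT = "Summarize the key themes in the attached discussion."
RAW_DATA = open("hn_thread.txt").read()  # the same raw data for every model

# (client, model) pairs: OpenAI-hosted models plus DeepSeek's
# OpenAI-compatible endpoint for R1 (deepseek-reasoner).
targets = [
    (OpenAI(), "gpt-4o"),
    (OpenAI(), "gpt-4o-mini"),
    (OpenAI(), "o3-mini"),
    (OpenAI(base_url="https://api.deepseek.com",
            api_key=os.environ["DEEPSEEK_API_KEY"]), "deepseek-reasoner"),
]

for client, model in targets:
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": f"{PROMPT}\n\n{RAW_DATA}"}],
    )
    with open(f"analysis_{model}.md", "w") as out:
        out.write(response.choices[0].message.content)
```

With the outputs in separate files, a plain diff (or another LLM pass) makes the thoroughness differences easy to spot.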
Edit:
I was curious whether I could get R1 to be more thorough, and the comparison turned up the following interesting tidbits:
- Depth Variance: R1 analysis provides more technical infrastructure insights, while o3-mini focuses on developer experience
- Geopolitical Focus: Only R1 analysis addresses China-West tensions explicitly
- Philosophical Scope: R1 contains broader industry meta-commentary absent in o3-mini
- Contrarian Views: o3-mini dedicates a specific section to minority opinions
- Temporal Aspects: R1 emphasizes future-looking questions, while o3-mini focuses on current implementation
You can find the full analysis at
https://beta.gitsense.com/?chat=95741f4f-b11f-4f0b-8239-83c7...