Even with the axis starting at 30%, the MMLU graph is wrong. All four bars are drawn at the wrong height. Even their own 73.7% is not at the right height. The Mixtral bar, at 71.4%, ends below the 70% mark of the axis.
This is really the kind of marketing trick that makes me avoid a provider / publisher. I can't build trust this way.
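A quick sanity check of the claim above, as a minimal sketch: on a linear axis that starts at 30%, any bar for a value above 70 has to end above the 70% gridline, whatever the axis maximum is. The function name `bar_top_fraction` and the axis maximum of 100 are assumptions for illustration; the chart's actual upper limit is not stated in the thread.

```python
def bar_top_fraction(value, axis_min=30.0, axis_max=100.0):
    """Return how far up the plot area (0.0 to 1.0) a bar's top should sit
    on a linear axis running from axis_min to axis_max."""
    if axis_max <= axis_min:
        raise ValueError("axis_max must be greater than axis_min")
    return (value - axis_min) / (axis_max - axis_min)


# Scores quoted in the thread; axis_max=100 is an assumption.
gridline_70 = bar_top_fraction(70.0)
for label, score in [("their model", 73.7), ("Mixtral", 71.4)]:
    top = bar_top_fraction(score)
    status = "above" if top > gridline_70 else "at or below"
    print(f"{label}: {score}% should end {status} the 70% gridline")
```

Since the mapping is linear, the ordering of bar tops relative to gridlines does not depend on the assumed maximum, so a 71.4% bar ending below the 70% mark cannot come from axis truncation alone.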
I believe they are using the percentages as part of the height of the bar chart! I thought I'd seen every way someone could do dataviz wrong (particularly with a bar chart), but this one is new to me.
That's really strange and incredibly frustrating - but slightly less so if it's consistent with all of the bars (including their own).
I take issue with their choice of bar ordering - they placed the lowest-performing model directly next to theirs to make the gap as visible as possible, and shoved the second-best model (Grok-1) as far from theirs as possible. Seems intentional to me. The more marketing tricks you pile up in a dataviz, the less trust I place in your product for sure.
Interesting! It is probably one of the worst tricks I have seen in a while for a bar graph. Never seen this one before. Trust vanishes instantly when facing that kind of dataviz.
Wow, that is indeed a novel approach haha, took me a moment to even understand what you described, since I would never imagine someone plotting a bar chart like that.
MMLU is not a good benchmark and needs to stop being used.
I can't find the section, but at the end of one of the videos at https://www.youtube.com/@aiexplained-official/videos he does a deep dive into the questions and answers in MMLU, and there are so many typos, omissions, and errors in the questions and the answers that it should no longer be used.
It’s an honest mistake in scaling the bars. It’s getting fixed soon. The percentages are correct though. In the process of converting the Excel chart to pretty graphs for the blog, the scale got messed up.