
nl on July 25, 2023, on: Attention Is Off By One

In the Qualcomm AI paper linked in this post, it turns out they use a similar testing approach:

BERT 109M, testing perplexity

OPT 125M, testing perplexity

ViT 22M, testing on ImageNet top-1