Show HN: Analyzing GPT-4 Tokens with Llama3 (koenvangilst.nl)
1 point by vnglst on May 2, 2024
Inspired by Andrej Karpathy's excellent YouTube video on tokenizers, I used Llama3 to analyze all 100,000 GPT-4 tokens. The results were somewhat expected: a strong focus on English and code. Interestingly, only 124 tokens were dedicated to my native Dutch, which might explain why GPT-4 underperforms in that language.
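
For readers who want to try something similar, here is a minimal sketch of the general shape of such an analysis: list every token in the vocabulary, label each one, and count the labels. It is not the code behind the linked post. It assumes the third-party `tiktoken` package and its `cl100k_base` encoding as the source of the GPT-4 vocabulary, and `heuristic_label` is a hypothetical stand-in for the Llama3 classification step, which the post does not describe in detail.

```python
from collections import Counter
from typing import Callable, Iterable


def load_gpt4_vocab() -> list[bytes]:
    """Return the raw bytes of every token in cl100k_base.

    Assumption: the third-party `tiktoken` package is installed. It downloads
    the vocabulary file the first time the encoding is requested.
    """
    import tiktoken  # imported lazily so the rest of the sketch runs without it

    enc = tiktoken.get_encoding("cl100k_base")
    return enc.token_byte_values()


def heuristic_label(token: bytes) -> str:
    """Hypothetical stand-in for an LLM call: a crude character-class guess."""
    try:
        text = token.decode("utf-8")
    except UnicodeDecodeError:
        # Some tokens are fragments of a multi-byte UTF-8 character.
        return "partial-bytes"
    stripped = text.strip()
    if not stripped:
        return "whitespace"
    if stripped.isascii() and stripped.isalpha():
        return "ascii-word"
    if stripped.isdigit():
        return "number"
    if stripped.isascii():
        return "symbol-or-code"
    return "non-ascii-text"


def tally(vocab: Iterable[bytes], label: Callable[[bytes], str]) -> Counter:
    """Count how many tokens receive each label."""
    return Counter(label(tok) for tok in vocab)


if __name__ == "__main__":
    # Small hand-picked sample so the sketch runs offline.
    sample = [b" the", b"123", b"\xe2\x82", b"  ", b"();", "é".encode("utf-8")]
    print(tally(sample, heuristic_label))
    # For the full vocabulary: tally(load_gpt4_vocab(), heuristic_label)
```

A character-class heuristic like this cannot tell Dutch from English, which is presumably why a language model was used as the labeler. Swapping `heuristic_label` for a function that prompts a model keeps the rest of the pipeline unchanged.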