> Especially in a chatbot use case with cumulative prompting, which is the best fit for such a large context vs. the default, cheaper 8k window.
Depends on what is going on with the images and how they translate into tokens. I really have no idea, but it could be that 32k tokens (a lot of text) translate to only a few images for few-shot prompting.
The paper doesn't seem to mention image tokenization, but it should be possible to infer something about the token rate by actually using the API and looking at how one is charged.
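If/when image inputs are exposed through the API, one could probe this empirically by comparing the billed prompt tokens for the same message with and without an attached image. A rough sketch of that idea, assuming image inputs land in the chat completions API and usage is reported the same way it is for text; the model name, the image-message format, and the URL are all assumptions:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def prompt_tokens(content) -> int:
    """Send a prompt and return how many input tokens we were billed for."""
    resp = client.chat.completions.create(
        model="gpt-4o",   # assumption: some vision-capable chat model
        messages=[{"role": "user", "content": content}],
        max_tokens=1,     # keep the completion (and its cost) minimal
    )
    return resp.usage.prompt_tokens

# Same text with and without an image; the difference is an empirical
# estimate of how many tokens one image costs.
text_only = prompt_tokens("Describe this image in one word.")
with_image = prompt_tokens([
    {"type": "text", "text": "Describe this image in one word."},
    {"type": "image_url", "image_url": {"url": "https://example.com/test.png"}},
])
print(f"approx. tokens per image: {with_image - text_only}")
```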
Currently, CLIP's largest variant uses patch-14 on 336x336 images, which translates to 577 ViT tokens [(336/14)^2 + 1]. It might end up being fairly token-efficient depending on how it's implemented (the paper doesn't elaborate).
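A back-of-the-envelope sketch of that arithmetic, assuming (speculatively) that each image costs one LLM context token per ViT token; the prompt-overhead figure is made up purely for illustration:

```python
# ViT token count for CLIP ViT-L/14 at 336x336.
# The "+1" is the [CLS] token prepended to the patch sequence.
image_size = 336
patch_size = 14
tokens_per_image = (image_size // patch_size) ** 2 + 1  # 24^2 + 1 = 577

# If each image really consumed that many context tokens, a 32k window
# would still leave room for dozens of images.
context = 32_768
prompt_overhead = 2_000  # assumed budget for instructions and text
max_images = (context - prompt_overhead) // tokens_per_image
print(tokens_per_image, max_images)  # 577, 53
```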