Hacker News new | past | comments | ask | show | jobs | submit login

You might be able to use an even cheaper model. Google Gemini 1.5 Flash 8B is Input: $0.04 / Output: $0.15 per 1M tokens.

17 input tokens and 2 output tokens * 10 million jobs = 170,000,000 input tokens, 20,000,000 output tokens... which costs a total of $6.38 https://tools.simonwillison.net/llm-prices

As for rate limits, https://ai.google.dev/pricing#1_5flash-8B says 4,000 requests per minute and 4 million tokens per minute - so you could run those 10 million jobs in about 2500 minutes or 42 hours. I imagine you could pull a trick like sending 10 items in a single prompt to help speed that up, but you'd have to test carefully to check the accuracy effects of doing that.




Consider applying for YC's Summer 2025 batch! Applications are open till May 13

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: