
My costs for embedding are so small compared to inference that I don't generally notice them.

But am I crazy, or did the pre-production version of gemini-embedding-001 have a much larger max context length?

Edit: It seems like it did? 8k -> 2k? Huge downgrade if true; I was really excited about the experimental model reaching GA before seeing that.
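
To illustrate what the smaller limit means in practice, here's a rough sketch assuming the google-genai Python SDK and the 2,048-token GA input limit. The 4-chars-per-token chunk sizing is just a heuristic, not a real tokenizer, so treat the numbers as illustrative:

    from google import genai

    client = genai.Client()  # reads GEMINI_API_KEY from the environment

    MAX_TOKENS = 2048        # reported GA input limit (was ~8k pre-GA)
    CHARS_PER_TOKEN = 4      # crude heuristic, not an exact tokenizer
    CHUNK_CHARS = MAX_TOKENS * CHARS_PER_TOKEN

    def embed_document(text: str) -> list[list[float]]:
        # Anything over the input limit now has to be split before embedding.
        chunks = [text[i:i + CHUNK_CHARS] for i in range(0, len(text), CHUNK_CHARS)]
        result = client.models.embed_content(
            model="gemini-embedding-001",
            contents=chunks,
        )
        return [e.values for e in result.embeddings]

With an 8k limit most single documents fit in one call; at 2k you end up managing four times as many chunks (and deciding how to pool or index them) for the same corpus.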


