Hacker News | serjester's comments

Karpathy had an amazing tweet about this if you’re interested in a deep dive.

[1] https://x.com/karpathy/status/1902046003567718810


This is a hand-wavy article that dismisses VLMs without acknowledging the real-world performance everyone is seeing. It would be far more useful if you published an eval.


Congrats on the launch - your value-add is quite confusing to someone working at the applied AI layer. This comes off as more of a research project than a business. You're going to need an incredibly compelling sales pitch for me to send my data to an unknown vendor to fix a problem that might be obviated by the next model release (or just by stronger evals and prompt engineering). Best of luck.


>You're going to need an incredibly compelling sales pitch for me to send my data to an unknown vendor

I agree! Our customers require on-prem deployments, though, so nothing is being sent to us outside their environment.


I'm not sure if they're gearing up for an announcement, but about 9 days ago they dropped the preview warning from their README. I'm assuming they're still working through final housekeeping items before formally announcing it.

[1] https://github.com/astral-sh/ty/commit/7a6b79d37e165f2e73189...


I mean, it's been announced already. Caveat: It's still very far from being competitive with SOTA type checkers like basedpyright.

Still, it's great that this is being worked on, and I expect that in a year or two ty will be comprehensive enough to migrate to.


Seems like their critique boils down to two areas: pandas limitations and fewer built-ins to lean on.

Personally, I've found Polars has solved most of the "ugly" problems I had with pandas: it's way faster, has an ergonomic API, seamless pandas interop, and amazing support for custom extensions. We have to keep in mind pandas is almost 20 years old now.

I will agree that Shiny is an amazing package, but I would argue it's less important now that LLMs will write most of your code.


If Cloudflare goes down, you can blame them. If your hand-rolled solution fails when Cloudflare exists, you're going to have a tough time explaining to leadership why you're in charge of the technical roadmap. Choose your battles; this is not a hill worth dying on.


It's disappointing there's no flash / lite version - this is where Google has excelled up to this point.


Maybe they're slow-rolling the announcements to stay in the news longer.


Most likely. And/or they use the full model to train the smaller ones somehow.


The term of art is distillation.
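To make that concrete, here is a toy, stdlib-only sketch of the distillation idea: soften the teacher's output distribution with a temperature and penalize the student for diverging from it. The function names and temperature value are illustrative, not any lab's actual training recipe.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperature flattens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the softened teacher distribution to the student's."""
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    return sum(ti * math.log(ti / si) for ti, si in zip(t, s))
```

A matching student incurs zero loss; the further its distribution drifts from the teacher's, the larger the penalty, which is the gradient signal the small model trains on.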


As someone who also worked at a large automaker, I think you're making large, unfounded assumptions.

Warranty data flows up from the technicians - good luck getting any auto technician to properly tag data. Their job is to fix a specific customer’s problem, not identify systematic issues.

There are a million things that make the data inherently messy. For example, a technician might replace five parts before finally identifying the root cause.

Therefore, you need some department to sit between millions of raw claims and engineering. I'd be curious what kind of alternative you have in mind.


Seems like a very natural fit for fine tuning - would have loved to see more on the LLM side.


Did you try fine tuning the LLMs?

