
That is an excellent prompt to tuck away in your back pocket and try again on future iterations of this technology. It will be an interesting milestone when, or if, any of these systems get good enough at comprehensive research to provide a correct answer.





If you keep the prompt the same, at some point the data will appear in a training set and the model might simply have the answer.

So even though it might be a good check today, it might not remain a good benchmark.

I think we need a way to keep updating the prompts, without increasing their complexity, so that we can properly verify model improvements. ARC Deep Research, anyone?


Well, to test research capabilities, one could just adapt the year (2024 -> 2025) in the prompt.
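
A minimal sketch of that idea, assuming the prompt can be written as a template with the year as its only variable. The template text and the names `PROMPT_TEMPLATE` and `build_prompt` are placeholders, since the original prompt is not quoted in this thread.

```python
# Hypothetical template: the actual research question is not quoted in the
# thread, so the wording after the year is only a stand-in.
PROMPT_TEMPLATE = "Using data from the {year} season, <rest of the research question>"


def build_prompt(year: int) -> str:
    """Return the benchmark prompt with the given year substituted in."""
    return PROMPT_TEMPLATE.format(year=year)


# One prompt per year; only the year changes, so the difficulty should stay
# roughly constant while the expected answer differs.
prompts = {year: build_prompt(year) for year in (2024, 2025)}
```

This only refreshes the question. It does nothing about the answer for the new year eventually being published somewhere and scraped.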

I am not sure what happens if some site keeps tracking these metrics and that manages to find its way into the training data.

There are some NBA fan sites that do keep track of some of these tournament-level final metrics.


Wouldn't somebody need to answer the question below? Or do you mean the discussion of its weakness might somehow make it stronger the next time it's trained?

I think it can be both. What happens if discussing the weakness surfaces more relevant links for the question and somehow helps a model trained on scraped web data to learn?

I am not sure whether the model needs the exact answer or whether backlinks to sites where the answer can be found are enough. Maybe just documenting how to do it could do the job as well...



