
I'd say the original Alpaca paper is a good source of inspiration on how to create datasets; they even shared the script used to generate data with the OpenAI API.

One thing that came to mind for creating datasets for other programming languages would be to start with this Python dataset and use GPT-4 to convert each example to its equivalent in other languages. You can even automatically test each generated example and ask GPT-4 to correct any errors.
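A rough sketch of that convert, test, and correct loop, in case it helps. This is only an illustration under assumptions: `ask_model` and `run_tests` are hypothetical callables you would supply yourself (one wrapping whatever GPT-4 client you use, the other compiling and running the candidate in the target language), and the prompt wording and retry limit are placeholders, not anything taken from the Alpaca script.

```python
from typing import Callable, Optional, Tuple

# Hypothetical interfaces, supplied by the caller:
#   ask_model(prompt) -> generated code as a string
#   run_tests(code)   -> (passed, error_message)
AskModel = Callable[[str], str]
RunTests = Callable[[str], Tuple[bool, str]]


def translate_with_repair(
    python_example: str,
    target_language: str,
    ask_model: AskModel,
    run_tests: RunTests,
    max_attempts: int = 3,
) -> Optional[str]:
    """Translate one example, feeding test failures back until it passes."""
    prompt = f"Convert this Python code to {target_language}:\n\n{python_example}"
    for _ in range(max_attempts):
        candidate = ask_model(prompt)
        passed, error = run_tests(candidate)
        if passed:
            return candidate
        # Show the model its own output and the failure so it can correct it.
        prompt = (
            f"This {target_language} code failed with:\n{error}\n\n"
            f"Code:\n{candidate}\n\nReturn a corrected version."
        )
    # Give up on examples that never pass, so they stay out of the dataset.
    return None
```

Passing the model and the test runner in as functions keeps the loop independent of any particular API client or language toolchain, and makes it easy to dry-run with stubs before spending tokens.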



