Hacker News | novacode007's comments

Looks interesting to me, but I had some questions. Could you elaborate on the process of expert review and validation? How do you ensure the quality and accuracy of the datasets created?


We have a team of domain experts who vet the instruction dataset. We do typical RLHF (reinforcement learning from human feedback) and connect it back to our SFT (supervised fine-tuning) loop. That's why we call ourselves "hardware and human in loop". Humans play an important role in ensuring the quality and accuracy of our dataset.


Got it, and how well does it work with more complex documents, like those with a lot of images or intricate tables? I'm curious about how accurately it aligns the content with the source code in those cases.


We use multimodal RAG and tools similar to unstructured.io. We generate structured output and then use an LLM again to match it against our AST-parsed source code. The matching part is really complex and still needs manual inspection and validation.
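To make the matching step a bit more concrete, here is a minimal sketch of the general idea, not our actual pipeline. It assumes Python source, uses the standard `ast` module to pull out function names, arguments, and docstrings, and swaps the LLM matcher for a plain token-overlap (Jaccard) score. The names `tokenize`, `extract_functions`, `match_chunk`, and the `threshold` value are all hypothetical and chosen only for illustration.

```python
import ast
import re


def tokenize(text):
    """Split text into a set of lowercase alphanumeric tokens."""
    return {t.lower() for t in re.split(r"[^A-Za-z0-9]+", text) if t}


def extract_functions(source):
    """Map each function name to tokens from its name, arguments, and docstring."""
    functions = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            tokens = tokenize(node.name)
            tokens |= tokenize(ast.get_docstring(node) or "")
            for arg in node.args.args:
                tokens |= tokenize(arg.arg)
            functions[node.name] = tokens
    return functions


def match_chunk(chunk, functions, threshold=0.2):
    """Return the best-matching function name, or None to flag manual review.

    Jaccard overlap is a stand-in (assumption) for the LLM-based matcher.
    """
    chunk_tokens = tokenize(chunk)
    best_name, best_score = None, 0.0
    for name, tokens in functions.items():
        union = chunk_tokens | tokens
        score = len(chunk_tokens & tokens) / len(union) if union else 0.0
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None


# Toy source file and two document chunks (made up for this example).
SOURCE = '''
def read_sensor(channel):
    """Read a raw value from the ADC channel."""

def write_register(address, value):
    """Write a value to a device register."""
'''

functions = extract_functions(SOURCE)
matched = match_chunk("The ADC sensor is read per channel", functions)
unmatched = match_chunk("Quarterly marketing update", functions)
```

Chunks that fall below the threshold come back as `None`, which is where a human reviewer would step in.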


Please visit https://h2loop.ai/ to learn more about H2LooP.

