Hacker News

One way to get around context length limits is to perform embedding and retrieval over your entire corpus. Langchain (https://langchain.readthedocs.io/en/latest/) and Milvus (https://milvus.io) are one stack you can use for this.


Can you elaborate on how this works?


You run the corpus through an embedding model piecemeal, recording the model's representation of each chunk as a vector of floating-point numbers. Then, when performing a completion request, you first query the stored vectors for the chunks closest to the prompt and include the best matches as context.
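A minimal sketch of that flow, using a toy bag-of-words embedding in place of a real model (in practice you'd call an embedding API and store the vectors in something like Milvus; the names and chunks below are illustrative):

```python
import math
import re

def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())

def embed(text, vocab):
    # Toy stand-in for a real embedding model: a normalized
    # bag-of-words vector over a shared vocabulary.
    words = tokenize(text)
    vec = [float(words.count(w)) for w in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def retrieve(query, chunks, vocab, k=2):
    # Rank chunks by cosine similarity to the query; keep the top k.
    # A vector database does this step at scale with ANN indexes.
    q = embed(query, vocab)
    sims = [(sum(a * b for a, b in zip(q, embed(c, vocab))), c)
            for c in chunks]
    sims.sort(key=lambda pair: -pair[0])
    return [c for _, c in sims[:k]]

# Corpus chunks, embedded ahead of time in a real system.
chunks = [
    "Milvus is a vector database built for similarity search.",
    "LangChain chains calls into language model pipelines.",
    "The capital of France is Paris.",
]
query = "Which database should I use to store embedding vectors?"
vocab = sorted({w for text in chunks + [query] for w in tokenize(text)})

# Retrieve the closest chunk and prepend it to the completion prompt.
context = retrieve(query, chunks, vocab, k=1)
prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: " + query
```

The key design point is that retrieval happens per request: only the nearest chunks are spent from the context budget, so the corpus itself can be arbitrarily large.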




