All examples on this page assume manual prompt building; I deliberately avoided the topic of generating prompts automatically via code.
But you are right that this is currently an enormous issue for systems that create prompts programmatically. I am actively looking for solutions to this problem and would be very interested to hear if anyone has a good one.
Very interesting use of ChatGPT and prompt engineering. For your problem, summarizing a large document, splitting the document into smaller parts is indeed the way to go.
I also ran into problems operating on large documents myself. In my case, I had an insurance policy that I wanted to extract information from.
My solution: use the OpenAI API to convert the document into OpenAI's embeddings and save those embeddings in a vector database. Then run a similarity search against the database to find the chunks of the document most likely to relate to my query, and pass only those chunks to GPT in the information-extraction prompt.
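To make the retrieval step concrete, here is a minimal offline sketch. A toy word-count "embedding" stands in for OpenAI's embedding endpoint, and a plain Python list stands in for the vector database; the policy chunks are made-up placeholder text:

```python
from collections import Counter
import math

# Hypothetical stand-in for the OpenAI embeddings call; in the real pipeline
# each chunk would be sent to the embeddings API instead.
def embed(text: str) -> Counter:
    return Counter(text.lower().replace("?", "").replace(".", "").split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# "Vector database": split the document into chunks and store (text, embedding).
chunks = [
    "The deductible for water damage is 500 euros per claim.",
    "Fire damage is covered up to the full insured value.",
    "Claims must be reported within 14 days of the incident.",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

def top_chunks(query: str, k: int = 1) -> list[str]:
    # Similarity search: rank chunks by closeness to the query embedding.
    q = embed(query)
    ranked = sorted(index, key=lambda pair: -cosine(q, pair[1]))
    return [text for text, _ in ranked[:k]]

# Only the best-matching chunks would then be pasted into the GPT prompt.
print(top_chunks("What is the deductible for water damage?"))
```

The real version swaps `embed` for API calls and the list for a proper vector store, but the shape of the pipeline — chunk, embed, search, prompt — is the same.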
I plan to create a guide on how to tackle these problems after I consolidate my findings.
My solution was to write a bit of code that produces a CSV, then query it with a langchain-based CSV agent. Since the agent delegates to pandas, it effectively has no token limit, but it also has no overview of the data beyond what pandas returns.
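The key point is that the agent never puts the whole table into the prompt; it has the LLM write a small pandas expression and only the expression's result flows back through the token window. A rough sketch of that idea, with made-up sample data in place of my generated CSV:

```python
import io
import pandas as pd

# Placeholder CSV, standing in for the file my code writes out.
csv_text = """region,month,revenue
north,jan,1200
north,feb,1400
south,jan,900
south,feb,1100
"""
df = pd.read_csv(io.StringIO(csv_text))

# A CSV agent would ask the LLM to produce a pandas expression like this one
# and execute it; only the short result string is fed back to the model, so
# the table itself can be arbitrarily large without hitting the token limit.
result = df.groupby("region")["revenue"].sum()
print(result.to_dict())  # → {'north': 2600, 'south': 2000}
```

The flip side, as noted above, is that the model only ever "sees" the outputs of the expressions it runs, never the data as a whole.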