I am refactoring a system that implements a flow I suspect is common across organisations: it collects raw event streams, pulls out the unique URLs, crawls them, extracts the content, creates vectors, and uses these outputs in downstream analysis and objects.
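To make the flow concrete, here is a minimal sketch of the stages. Every name in it (`unique_urls`, `extract_text`, `embed`, `run_pipeline`) is hypothetical, the crawl step is an injected `fetch` function backed by an in-memory dict rather than real HTTP, and the "embedding" is a toy hashed bag-of-words standing in for a real model. It only illustrates the shape of the problem, not the actual system.

```python
import hashlib
import re
from typing import Callable, Iterable


def unique_urls(events: Iterable[dict]) -> list[str]:
    """Collect distinct URLs from raw events, preserving first-seen order."""
    seen: dict[str, None] = {}
    for event in events:
        url = event.get("url")
        if url:
            seen.setdefault(url, None)
    return list(seen)


def extract_text(html: str) -> str:
    """Crude tag stripping; a real extractor would be far more careful."""
    without_tags = re.sub(r"<[^>]+>", " ", html)
    return re.sub(r"\s+", " ", without_tags).strip()


def embed(text: str, dim: int = 8) -> list[float]:
    """Toy hashed bag-of-words vector (assumption: stands in for a real model)."""
    vec = [0.0] * dim
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    return vec


def run_pipeline(
    events: Iterable[dict], fetch: Callable[[str], str]
) -> dict[str, list[float]]:
    """events -> unique URLs -> crawl -> extract -> vectors."""
    return {url: embed(extract_text(fetch(url))) for url in unique_urls(events)}


# Stubbed crawl: an in-memory mapping instead of network access.
fake_pages = {
    "https://example.com/a": "<p>hello world</p>",
    "https://example.com/b": "<h1>hello</h1>",
}
events = [
    {"url": "https://example.com/a"},
    {"url": "https://example.com/b"},
    {"url": "https://example.com/a"},  # duplicate, crawled only once
    {"type": "click"},                 # event without a URL is skipped
]
vectors = run_pipeline(events, fake_pages.__getitem__)
```

Passing `fetch` in as a parameter is one of the design choices I am unsure about: it keeps the stages testable, but it says nothing about queuing, retries, or incremental recrawls.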
I have many different solutions that “work”, but I am not sure which is optimal.
I find both the internet and LLMs to be very poor resources for this type of work.
So where does one learn how others have implemented such systems? Happy to read books, blogs, or repos!