Hacker News | dluc's comments

For #3 I'd recommend https://github.com/microsoft/kernel-memory :)


We are also developing an open-source solution for those who would like to test it out and/or contribute; it can be consumed as a web service or embedded into .NET apps. The project is codenamed "Semantic Memory" (available on GitHub) and offers customizable external dependencies, such as Azure Queues, RabbitMQ, or other alternatives, plus options for Azure Cognitive Search and Qdrant (with plans to include Weaviate and more). The architecture is similar, with queues and pipelines.

We believe that enabling custom dependencies and logic, as well as the ability to add/remove pipeline steps, is crucial. As of now, there is no definitive answer to the best chunk size or embedding model, so our project aims to provide the flexibility to inject and replace components and pipeline behavior.
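The "add/remove pipeline steps" idea can be sketched in a few lines. This is illustrative Python, not the project's actual (C#) API; every name here is hypothetical, and the pipeline is just an ordered list of handler functions that can be reordered, removed, or swapped:

```python
# Illustrative sketch of a pluggable ingestion pipeline; all names are
# hypothetical, not the project's actual API.

def extract(doc):
    # Stand-in extraction: the input is already plain text here.
    return {"text": doc, "chunks": [], "embeddings": []}

def partition(state, chunk_size=20):
    # Chunk size is one of the knobs users may want to replace.
    text = state["text"]
    state["chunks"] = [text[i:i + chunk_size]
                       for i in range(0, len(text), chunk_size)]
    return state

def embed(state):
    # Stand-in embedding: chunk length instead of a real vector.
    state["embeddings"] = [len(c) for c in state["chunks"]]
    return state

def run_pipeline(doc, steps):
    state = extract(doc)
    for step in steps:
        state = step(state)
    return state

# Steps can be added, removed, or replaced with custom handlers.
default_steps = [partition, embed]
result = run_pipeline("some document text to index", default_steps)
print(len(result["chunks"]), result["embeddings"])  # → 2 [20, 7]
```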

Regarding scalability, LLM text generators and GPUs remain a limiting factor in this area as well. LLMs hold great potential for analyzing input data, and I believe the focus should be less on the speed of queues and storage and more on finding the optimal way to integrate LLMs into these pipelines.


The queues and storage are the foundation on which some of these other integrations can be built. Fully agree on the need for LLMs within the pipelines to help with data analysis.

Our current perspective has been on leveraging LLMs as part of async processes to help analyze data. This only really works when your data follows a template, so the same analysis can be applied to a vast number of documents. Otherwise it becomes too expensive to do on a per-document basis.
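The cost concern is easy to quantify with back-of-the-envelope math. A toy estimate, where the per-token price is a hypothetical placeholder and not any provider's actual rate:

```python
# Back-of-the-envelope cost of a per-document LLM analysis pass.
# The price per 1K tokens below is a hypothetical placeholder.

def analysis_cost(num_docs, tokens_per_doc, price_per_1k_tokens):
    """Total cost of one LLM analysis pass over a corpus."""
    return num_docs * (tokens_per_doc / 1000.0) * price_per_1k_tokens

# 1M documents at ~2K tokens each, at a hypothetical $0.01 / 1K tokens:
total = analysis_cost(1_000_000, 2_000, 0.01)
print(f"${total:,.0f}")  # → $20,000
```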

What types of analysis are you doing with LLMs? Have you started to integrate some of these into your existing solution?


Currently we use LLMs to generate a summary, used as an additional chunk. As you might guess, this can take time, so we postpone the summarization to the end (the current default pipeline is: extract, partition, generate embeddings, save embeddings, summarize, generate embeddings of the summary, save them).
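The "summary as an additional chunk, deferred to the end" ordering can be sketched like this (illustrative Python; the summarizer is a stub, not a real LLM call, and none of these names come from the project):

```python
# Sketch of deferring summarization to the end of ingestion and
# storing the summary as one additional chunk (illustrative only).

def summarize(text, max_len=30):
    # Stand-in for an LLM call: truncate instead of summarizing.
    return text[:max_len]

def ingest(text, chunk_size=25):
    chunks = [text[i:i + chunk_size]
              for i in range(0, len(text), chunk_size)]
    # Summarization runs last because it is the slowest step; its
    # output is just another chunk, embedded and saved like the rest.
    chunks.append(summarize(text))
    return chunks

chunks = ingest("a long document that we split into small pieces for retrieval")
print(len(chunks))  # → 4 (3 text chunks + 1 summary chunk)
```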

Initial tests, though, are showing that summaries affect the quality of answers, so we'll probably remove the step from the default flow and use it only for specific data types (e.g. chat logs).

There are a bunch of synthetic-data scenarios we want to leverage LLMs for. Without going too much into detail: sometimes "reading between the lines", some memory consolidation patterns (e.g. a "dream phase"), etc.


Makes sense. Interesting that summaries sometimes affect quality.

For synthetic-data scenarios, are you also thinking about synthetic queries over the data? (Trying to predict which chunks might be used more than others.)


Yes, queries and also planning.

For instance, given the user "ask" (which could be any generic message in a copilot), decide how to query one or multiple storages. Ultimately, companies and users have different storages, and only a few can be indexed with vectors (and additional fine-tuned models). There's also a lot of "legacy" structured data accessible only with SQL and similar languages, so a "planner" (in the SK sense of planners) could be useful to query vector indexes, text indexes, and knowledge graphs, combining the results.
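The routing idea can be sketched as a toy in plain Python. This is unrelated to SK's actual planner implementation: the three storages are fake in-memory stand-ins, and a real planner would use an LLM to choose storages and build queries rather than keyword matching:

```python
# Toy "planner" that routes a user ask to one or more storages and
# merges the results. All storages are fake in-memory stand-ins.

def vector_search(ask):
    return ["vector-hit"] if "policy" in ask else []

def sql_search(ask):
    return ["sql-row"] if "orders" in ask else []

def graph_search(ask):
    return ["graph-node"] if "related" in ask else []

def plan_and_query(ask):
    # A real planner would use an LLM to pick storages and build the
    # queries; simple keyword routing keeps this sketch runnable.
    results = []
    for search in (vector_search, sql_search, graph_search):
        results.extend(search(ask))
    return results

print(plan_and_query("show orders related to the refund policy"))
# → ['vector-hit', 'sql-row', 'graph-node']
```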


Really interesting library.

Is anyone aware of something similar but hooked into Google Cloud infra instead of Azure?


We could easily add that if there's interest, e.g. using Pub/Sub and Cloud Storage. If there are .NET client libraries, it should be straightforward to implement the relevant interfaces. Similar considerations apply to the inference side: embeddings and text generation.



Why .NET apps specifically?


Multiple reasons, some are subjective as usual in these choices. Customers, performance, existing SK community, experience, etc.

However, the recommended use is running it as a web service, so from a consumer perspective the language doesn't really matter.


Good alternative: https://www.linen.dev/


The model doesn't run code; it generates text that happens to be code. It's up to the client calling the model API to use this text, e.g. compiling and executing it (if that's your scenario) and calling the model again to fix the original code if needed.
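That generate → execute → repair loop can be sketched as follows. The "model" here is a hard-coded stub standing in for a real LLM API call, and `eval` stands in for compile-and-execute; every name is hypothetical:

```python
# Sketch of the generate -> execute -> repair loop described above.
# `model` is a stub; a real client would call an LLM API here.

def model(prompt, attempt):
    # Fake model: returns broken code first, then a fixed version.
    return "1 +" if attempt == 0 else "1 + 1"

def try_run(code):
    try:
        return True, eval(code)  # stand-in for compile-and-execute
    except SyntaxError as err:
        return False, str(err)

def generate_and_fix(prompt, max_attempts=3):
    for attempt in range(max_attempts):
        code = model(prompt, attempt)
        ok, result = try_run(code)
        if ok:
            return code, result
        # Feed the error back as text: the model never runs code,
        # it only sees the failure description in the next prompt.
        prompt = f"{prompt}\nPrevious code failed: {result}"
    raise RuntimeError("no working code produced")

print(generate_and_fix("add one and one"))  # → ('1 + 1', 2)
```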


Hello there all, dluc from Semantic Kernel :wave: Looking forward to attending :-)


Nice pun, it's an edge case...


The search filter seems to have some hard-coded logic, preferring languages like R (4 projects [1]) over Scala (5 projects [2]). I wonder if that's based on trends or purely taste (or just cache :-)).

  [1] https://opensource.google.com/projects/search?q=%20&language=r
  [2] https://opensource.google.com/projects/search?q=%20&language=scala


Maybe it has something to do with the actual amount of Scala code. E.g. GitHub reports Bazel as 90%+ Java.


Having worked mostly with Mesos and k8s, I found k8s configuration superior, e.g. how one can import secrets and config files, and more importantly set up the network without address translation or port forwarding. Tooling seems OK with both, and I agree one needs to spend some time getting familiar with the CLI and nomenclature, IMHO because both are quite flexible and powerful.
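For example, injecting a secret into a pod is a few lines of manifest. A minimal, generic Kubernetes example (the Secret name and key are made up, not tied to any tooling discussed here):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: demo
spec:
  containers:
    - name: app
      image: nginx:1.25
      env:
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-credentials   # an existing Secret object
              key: password
```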


To the author: the custom-metrics link is broken.

Looking forward to comparing that with an autoscale solution I'm working on.


Perhaps the repo was private until recently.

