I try to think about what should be a framework and what should be a library. Libraries are tools that help you achieve a task -- for example, building a prompt, calling LLM models, or communicating with a vector database.
Frameworks are more process driven, aimed at achieving a complex task. ReactJS is a good example with its component model -- it sets a process for building web applications so you can build more complex ones, while still leaving you lots of flexibility in the implementation details. A framework should provide as much flexibility as possible.
Similarly, we are trying to build our framework to streamline the process of LLM development so you can iterate on your LLM application faster. To set up this process, we enforce very high-level interfaces for how you build (input & output schema), evaluate, and deploy your application. We give developers full flexibility over the low-level implementation details, and we keep the framework extensible so you can use any external tools you want within it.
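To make the idea concrete, here's a minimal sketch of what such a high-level interface could look like. The names (`AgentInput`, `AgentOutput`, `Agent`, `EchoAgent`) are illustrative assumptions, not our actual API -- the point is just that the framework fixes the input/output boundary and leaves the body free-form:

```typescript
// Hypothetical sketch: the framework enforces only the input/output
// schema; everything inside the implementation is up to you.

interface AgentInput {
  userMessage: string;
  // free-form per-request configuration the caller can vary
  appConfig?: Record<string, unknown>;
}

interface AgentOutput {
  message: string;
}

interface Agent {
  chat(input: AgentInput): AgentOutput;
}

// Inside the boundary you can do anything: prompt building, RAG
// lookups, any model provider, arbitrary business logic.
class EchoAgent implements Agent {
  chat(input: AgentInput): AgentOutput {
    return { message: `You said: ${input.userMessage}` };
  }
}

const agent: Agent = new EchoAgent();
const out = agent.chat({ userMessage: "hello" });
```

Because every application exposes the same boundary, the framework can evaluate and deploy any implementation uniformly.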
Hi, yeah unfortunately this is only in Typescript at the moment. As we refine the framework more, we'll look for a more language agnostic approach, or provide support across different languages.
I agree that you want flexibility in what you use to build your LLM application. For example, if you want direct API-level access for building and searching through your RAG layer, calling your LLM models, and other business logic, you should have it -- there's a lot of opportunity to fine-tune each of these layers. However, you are still left with thousands of combinations to experiment with, e.g. which prompt template x RAG context x LLM model gives the best results, and you need a framework that helps you manage those thousands of experiments. That is where I'm trying to position this framework: it helps you scale to trying thousands of different configurations of your LLM application so you can improve its performance, while providing as much flexibility as possible over the components you use to actually build it.
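The combinatorics are easy to see with a toy sketch (illustrative only -- the dimensions and values here are made up, not the framework's API):

```typescript
// Enumerating the experiment space: prompt template x RAG setting x model.

const promptTemplates = ["concise", "detailed", "chain-of-thought"];
const ragTopK = [3, 10];
const models = ["gpt-4o", "llama-3-70b"];

interface ExperimentConfig {
  promptTemplate: string;
  topK: number;
  model: string;
}

// Cartesian product of all dimensions.
const configs: ExperimentConfig[] = [];
for (const promptTemplate of promptTemplates) {
  for (const topK of ragTopK) {
    for (const model of models) {
      configs.push({ promptTemplate, topK, model });
    }
  }
}
```

Even this toy example yields 3 x 2 x 2 = 12 configurations; add a few more dimensions (chunking strategy, temperature, reranker) and real applications reach thousands.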
With our framework, if you want the flexibility where
> You know exactly what goes into the prompt, how it’s parsed, what params are used or when they are changed
We provide this for you. We just give you a process that lets you try and evaluate different configurations of your LLM application layer at scale.
Interesting read. Honestly I don't have enough business experience to draw any conclusion, but here's one point I disagree with.
The article states that as companies pivot towards more specialized verticals -- e.g. LlamaIndex focusing on managed document parsing / OCR -- they are going to get smaller and smaller and eventually die. I don't think narrowing scope means a company can't have a viable business. If LlamaIndex charged a $100K base price per enterprise and had 1,000 customers, they'd be doing at least $100M in revenue, which is a very viable business.
If you are curious about this topic, maybe this is a good podcast for you :)
We provide complete flexibility in how you call your LLM model. So if you have your on-prem LLM behind an API, you would just write the standard code to call that API from within our framework.
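For example, something like this plain HTTP call would work -- the endpoint URL, payload shape, and response shape below are assumptions about your on-prem server, not anything our framework imposes:

```typescript
// Hedged sketch: calling an on-prem LLM behind a plain HTTP API.
// Endpoint and request/response shapes are hypothetical.

interface CompletionRequest {
  prompt: string;
  max_tokens: number;
}

function buildRequest(prompt: string, maxTokens = 256): CompletionRequest {
  return { prompt, max_tokens: maxTokens };
}

async function callOnPremLLM(prompt: string): Promise<string> {
  const res = await fetch("http://llm.internal.example:8080/v1/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildRequest(prompt)),
  });
  const data = (await res.json()) as { text: string };
  return data.text;
}

const req = buildRequest("hi");
```

Since this is just ordinary application code, any auth, retries, or routing logic you already have carries over unchanged.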
Hey, thanks for the question. Are you talking about standard evaluation tools like promptfoo? These evaluation frameworks are often just tools that help you grade the response of your LLM application. They don't, however, help you build an LLM application that makes it easy to test different configurations and evaluate them. That is where we differ -- we help you build an application designed for easily testing different configurations so you can evaluate them much faster.
The process we see when companies adopt an evaluation framework is that every time they want to try a new configuration, they change their code-base, write the code to run an evaluation, review that result in isolation, and then try to compare it with other changes they made, sometimes far in the past. This usually makes new changes very slow and very unorganized.
With us, we help you build your LLM application so that it's easy to swap components. From there, when you want to see how your application performs with a certain configuration, we have a UI where you pass in the configuration settings and run an evaluation. We also save all your previous evaluations so you can easily compare them with each other. As a result, testing different configurations of your application and evaluating them is very easy and fast.
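One common pattern for making components swappable is to select them by name from a registry, so a configuration object can pick the implementation at runtime. This sketch is illustrative only (the registry and `runApp` are made-up names, not our API):

```typescript
// Illustrative only: wiring an app so a component swaps by configuration.

type ModelCaller = (prompt: string) => string;

// Registry of interchangeable implementations, keyed by config value.
const modelRegistry: Record<string, ModelCaller> = {
  stub: (p) => `[stub] ${p}`,
  upper: (p) => p.toUpperCase(),
};

function runApp(userMessage: string, config: { model: string }): string {
  const callModel = modelRegistry[config.model];
  if (!callModel) throw new Error(`unknown model: ${config.model}`);
  return callModel(userMessage);
}

const result = runApp("hi", { model: "stub" });
```

With this shape, an evaluation run is just the same app invoked repeatedly with different config objects.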
Hey, thanks for checking out the framework! We just released this week so there aren't any data points to share yet. But as we onboard more dev teams, we're planning to write about their process and outcomes over the next few months.
I'm also working on taking the theory behind the blog post above and converting it into a more practical guide using our framework. It should be out within the next two weeks. You can get notified when we release a blog post by signing up for our newsletter: https://palico.us22.list-manage.com/subscribe?u=84ba2d0a4c03...