Thanks, that is very informative! I have heard about the tokenization process be...

wruza · 2025-02-10T20:45:40 1739220340

When I say LLMs, I mean literal large language models, like all of them in the general "Text-to-Text" && "Transformers" categories, loadable into text-generation-webui. Most people probably only have experience with cloud LLMs https://www.google.com/search?q=big+LLM+companies . Most cloud LLMs are based on transformers (but we don't know what they are cooking in secrecy) https://ai.stackexchange.com/questions/46288/are-there-any-n... . Copilot, Cursor and other frontends are just software that uses some LLM as the main driver, via standard API (e.g. tgwebui can emulate openai api). Connectivity is not a problem here, cause everything is really simple API-wise.

I have heard about the tokenization process before when I tried stable diffusion, but honestly I can't understand it. It sounds important but it also sounds like a very superficial layer whose only purpose is to remove ambiguity, the important work being done by the next layer in the process.

SD is special because it's actually two networks (or more, I lost track of SD tech), which are sort of synchronized into the same "latent space". So your prompt becomes a vector that basically points at the compressed representation of a picture in that space, which then gets decompressed by VAE. And enhanced/controlled by dozens of plugins in case of A1111 or Comfy, with additional specialized networks. I'm not sure how this relates to text-to-text thing, probably doesn't.

mhast · 2025-02-11T13:34:50 1739280890

If you want to get a better understanding of this I recommend playing around in the "chat playgrounds" on some of the engines.

The Google one allows for some free use before you have to pay for tokens. (Usually you can buy $5 worth of tokens as a minimum and that will give you more than you can use up with manual requests.)

https://aistudio.google.com/prompts/new_chat

This UI allows you to alter the system prompt (which is usually hidden from the user on eg ChatGPT) and change to different models and change parameters. And then you give it the chat input similar as any other site.

You can also install a program like "LM Studio" and that will allow you to download models (through the UI) and run locally on your own machine. This gives you a similar interface to what you see in the Google AI Studio but you run it locally. And with downloaded models. (The model you download is the actual LLM which is basically very large amount of parameters you combine with the input tokens to get the next token the system outputs.)

For a more fundamental introduction to what all these systems do there are a number of Computerphile videos which are quite informative. Unfortunately I can't find a good playlist of them all but here's one of the early ones. (Robert Miles is in many of them.) https://www.youtube.com/watch?v=rURRYI66E54