Especially in the age of AI tools, I have also thought about this a few times. The current idea I have is something like a parking meter. Every expensive transaction (like calling a model) would subtract from a money pool, and every visitor could see how much is still left in the pool. In addition, a list of the top 5 donors with their amounts might improve the group dynamic (as on pay-what-you-want pages such as humblebundle.com).
It would be more about covering the cost than about making someone rich, but I think that is what most of the people who build stuff care about. Sadly, I don't know a service yet that offers this model.
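A minimal sketch of the parking-meter idea, assuming hypothetical names (`DonationMeter`, amounts in cents) since no such service exists yet: donations feed a shared pool, each expensive call charges it, and the balance plus a top-donors list stay visible to visitors.

```python
from dataclasses import dataclass, field

@dataclass
class DonationMeter:
    """Shared money pool that expensive operations (e.g. model calls)
    draw from; the remaining balance is public to every visitor."""
    balance_cents: int = 0
    donors: dict[str, int] = field(default_factory=dict)

    def donate(self, donor: str, amount_cents: int) -> None:
        # Track per-donor totals so a top-5 list can be shown.
        self.donors[donor] = self.donors.get(donor, 0) + amount_cents
        self.balance_cents += amount_cents

    def charge(self, cost_cents: int) -> bool:
        # Refuse the expensive call once the meter hits zero.
        if self.balance_cents < cost_cents:
            return False
        self.balance_cents -= cost_cents
        return True

    def top_donors(self, n: int = 5) -> list[tuple[str, int]]:
        # Largest contributors first, like a pay-what-you-want leaderboard.
        return sorted(self.donors.items(), key=lambda kv: kv[1], reverse=True)[:n]
```

The interesting design question is what `charge` should do at zero: blocking the call outright is what produces the psychology problem discussed below.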
This won't work when the meter is at zero, due to human psychology. New visitors will say: "No one subsidized my experience (indeed, I don't even know what $thing does), but <creator> wants me to subsidize $thing for others."
The whole "subsidize for other visitors" concept is weaker than "pay <creator>".
One of my favorite recent KDE features: Press Meta+t to design a custom window layout, and later hold Shift while you drag a window to place it in a slot in that layout.
I mean, if you let the LLM build a Tetris bot, it would be 1000x better than what the LLMs are doing here. So yes, it is fun to win against an AI, but against that kind of processing power you really shouldn't be able to win. It is only possible because LLMs are not built for such tasks.
While Qwen2.5 was pre-trained on 18 trillion tokens, Qwen3 uses nearly twice that amount, with approximately 36 trillion tokens covering 119 languages and dialects.
Thanks for the info, but I don't think it answers the question. I mean, you could train a 20-node network on 36 trillion tokens. It wouldn't make much sense, but you could. So I was asking more about the number of nodes/parameters, or the file size in GB.
These are the Max-series models, whose weights are unreleased, so probably larger than the largest released one. Also, when referring to models, use Hugging Face or ModelScope (wherever the model is published); Ollama is a really poor source of model info. They have some bad naming (like confusing people about the DeepSeek R1 models), renaming, and more, and they default to Q4 quants, which is a good sweet spot but really degrades performance compared to the raw weights.
It is one thing to do that while you have that boss, but something completely different to keep acting that way even when you have a different boss. The more people on a team who keep their mouths shut, the less effective the team will be.
That is exactly the point. But it makes sense if you look at it from the other side. When you put in the effort to maintain a project, there have to be boundaries to the social interactions, and when those are reached, "just fork it" is a pressure valve to protect the ones who put in the effort to maintain projects.
Many people think they know how something should be done better, but as a community, we have to protect the ones who are not just talking, but actually maintaining.
I actually ran the numbers on time dilation! At 600 km/s (0.2% of the speed of light), the effect is surprisingly small: we basically 'save' about 63 seconds a year compared to an observer at rest relative to the CMB. Not enough to live forever, but enough to be late for a meeting.
From a syntax perspective, I prefer the component syntax in Vue / Riot, which is HTML-like. That way, the general structure is clear, and you have to learn only the additional directives. As a bonus, syntax highlighting in most editors just works without an additional plugin.