The U.S. needs changes to its constitution if it ever wants to return to where it was and get the rest of the world to play along again.
Unlike in almost every other Western country, the DoJ is not an independent institution, which makes it impossible to uphold the law if the White House doesn't want to. The only thing preventing a sitting president from going after his political enemies is a "gentleman's agreement" between administrations in the United States.
Stability is key, and as we can clearly see, there isn't any.
I can't instinctively process how many R's are in STRAWBERRY. I use my vision to get it, though, almost immediately.
I feel plain transformers simply don't get access to the modalities that a human would use. I can't use my "talking" centers to count letters in words either.
You just need to pay attention to realize you don't use your language skills to count letters.
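A quick sketch of the modality gap being described: a model operating on subword tokens never "sees" individual letters, while character-level code counts them trivially. The token split below is hypothetical; real tokenizers differ.

```python
# Hypothetical subword split of "strawberry" -- real tokenizers vary.
tokens = ["str", "aw", "berry"]

# From the token IDs alone, letter counts are not directly visible;
# the model would have to have memorized the spelling of each token.

# Character-level access makes the question trivial:
count = "strawberry".count("r")
print(count)  # 3
```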
The author creates art using their own custom library, which uses CSS-like syntax to render HTML, SVG, and more recently shaders. The point isn't that this is the best way to do it. It's simply a trick the author used to accomplish something with their own bespoke library.
To get to a single knowledge cutoff, they spent 16.5 wall-clock hours on a cluster of 128 NVIDIA GH200 GPUs (~2,100 GPU-hours), plus some minor amount of time for finetuning. The prerelease_notes.md in the repo is a great description of how one would achieve that.
While I know there will be a lot of complications in this, a quick search suggests these GPUs run ~$2/hr, so roughly $4,000-4,500 if you don't already have access to a cluster. I don't know how important the cluster is here: whether the training requires some minimal number of GPUs (so a single machine would take more than 128x longer, or not work at all), or whether a cluster of 128 GPUs is less efficient per GPU but faster overall. A 4B model feels like it'd be fine on one to two of those GPUs?
Also of course this is for one training run, if you need to experiment you'd need to do that more.
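The arithmetic behind that cost estimate, as a back-of-envelope sketch (the $2/GPU-hour rate is a rough figure from a quick search, not a quoted price):

```python
# Back-of-envelope training cost from the numbers above.
gpus = 128
wall_clock_hours = 16.5
price_per_gpu_hour = 2.0  # assumed rough spot price; real pricing varies

gpu_hours = gpus * wall_clock_hours       # 128 * 16.5 = 2112 GPU-hours
cost = gpu_hours * price_per_gpu_hour     # 2112 * $2 = $4224
print(f"{gpu_hours:.0f} GPU-hours, ~${cost:,.0f} per run")
```

Multiply by the number of experimental runs you expect to need before the final one.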