Does anyone else have trouble loading the qwen blogs? I always get their loading placeholders and nothing ever comes in. I don't know if this is ad-blocker related or what… (I've even disabled it, but it still won't load.)
What doesn't make sense to me is that the cameras are nowhere near as good as human eyes. The dynamic range sucks, the car doesn't put down a visor or wear sunglasses to deal with blinding light, the resolution is much worse, etc. Why not invest in the cameras themselves if this is your claim?
I always see this argument but from experience I don't buy it. FSD and its cameras work fine driving with the sun directly in front of the car. When driving manually I need the visor so far down I can only see the bottom of the car in front of me.
The cameras on Teslas only really lose visibility when dirty. Especially in winter when there's salt everywhere. Only the very latest models (2025+?) have decent self-cleaning for the cameras that get very dirty.
For which car? The older the car (hardware) version the worse it is. I've never had any front camera blinding issues with a 2022 car (HW3).
The thing to remember about cameras is that what you see in an image/display is not what the camera sees. Processing the image for display reduces the dynamic range, but FSD could work off the raw sensor data.
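A toy sketch of that point (my own illustration, not Tesla's actual pipeline): a 12-bit raw sensor distinguishes 4096 brightness levels, but an 8-bit display image only keeps 256, so a naive tone mapping collapses nearby highlight detail that the raw data still contains.

```python
def to_display(raw_12bit):
    """Naive linear tone mapping: 12-bit raw value -> 8-bit display value."""
    return raw_12bit >> 4  # drop the 4 least significant bits

# Two distinct highlight details in the raw data...
a, b = 4000, 4010
print(to_display(a), to_display(b))  # prints "250 250" -> the distinction is gone
```

Real pipelines use much smarter tone curves than a bit shift, but the underlying point stands: the display image is a lossy view of what the sensor captured.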
It doesn't run well on HW3 at all. HW4 has significantly better FSD when running comparable versions (v14). The software has little to do with the front camera getting blinded though.
"Works fine" as in it can follow a wide asphalt road's white lines. That is an absolutely trivial task; a Lego Mindstorms robot could follow a line just fine with a black/white sensor.
This vision clearly doesn't scale to more complex scenarios.
Cerebras has effectively 100% yield on these chips. They have an internal structure made by just repeating the same small modular units over and over again. This means they can just fuse off the broken bits without affecting overall function. It's not like it is with a CPU.
I suggest reading their website; they explain pretty well how they manage good yield. Though I'm not an expert in this field, it does make sense, and I would be surprised if they were caught lying.
Defects are best measured on a per-wafer basis, not per-chip. So if your chips are huge and you can only fit 4 chips on a wafer, 1 defect can cut your yield by 25%. If they're smaller and you fit 100 chips on a wafer, then 1 defect on the wafer only cuts yield by 1%. Of course, there's more to this when you start reading about "binning", fusing off cores, etc.
There's plenty of information out there about how CPU manufacturing works, why defects happen, and how they're handled. Suffice to say, the comment makes perfect sense.
That's why you typically fuse off defective sub-units and just have a slightly slower chip. GPU and CPU manufacturers have done this for at least 15 years now, that I'm aware of.
Sure it does. If it’s many small dies on a wafer, then imperfections don’t ruin the entire batch; you just bin those components. If the entire wafer is a single die, you have much less tolerance for errors.
They can be made from large wafers. A defect typically breaks whatever chip it's on, so one defect on a large wafer filled with many small chips will still just break one chip of the many on the wafer. If your chips are bigger, one defect still takes out a chip, but now you've lost more of the wafer area because the chip is bigger. So you get a super-linear scaling of loss from defects as the chips get bigger.
With careful design, you can tolerate some defects. A multi-core CPU might have the ability to disable a core that's affected by a defect, and then it can be sold as a different SKU with a lower core count. Cerebras uses an extreme version of this, where the wafer is divided up into about a million cores, and a routing system that can bypass defective cores.
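A hedged sketch of that binning idea (the SKU tiers here are made up for illustration, not any vendor's real product ladder): cores hit by defects are fused off, and the chip is sold at the largest tier its remaining good cores support instead of being scrapped.

```python
def bin_chip(total_cores, defective):
    """Fuse off defective cores, then bin the chip by surviving core count."""
    good = total_cores - len(defective)
    for sku_cores in (8, 6, 4):  # hypothetical SKU ladder
        if good >= sku_cores:
            return f"{sku_cores}-core SKU"
    return "scrap"

print(bin_chip(8, defective=[]))               # "8-core SKU": fully working die
print(bin_chip(8, defective=[3]))              # "6-core SKU": one core fused off
print(bin_chip(8, defective=[0, 1, 2, 3, 4]))  # "scrap": too few good cores left
```

Cerebras's scheme is this taken to the extreme: with roughly a million tiny cores and a routing fabric, losing a handful of cores to defects barely changes the sellable part.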
There’s an expected amount of defects per wafer. If a chip has a defect, then it is lost (simplification). A wafer with 100 chips may lose 10 to defects, giving a yield of 90%. The same wafer but with 1000 smaller chips would still have lost only 10 of them, giving 99% yield.
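The arithmetic above in a few lines (using the comment's own simplification that each defect kills exactly one chip):

```python
def yield_pct(chips_per_wafer, defects=10):
    """Yield when each of `defects` defects kills one chip on the wafer."""
    return 100 * (chips_per_wafer - defects) / chips_per_wafer

print(yield_pct(100))   # 90.0 -> 90% yield with 100 large chips per wafer
print(yield_pct(1000))  # 99.0 -> 99% yield with 1000 smaller chips per wafer
```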
As another comment in this thread states, Cerebras seems to have solved this by building their big chip out of many much smaller cores that can be disabled if they have defects.
Indeed, the original comment you replied to actually made no sense in this case. But there seemed to be some confusion in the thread, so I tried to clear that up. I hope I'll get to talk with one of the Cerebras engineers one day; that chip is really one of a kind.
It forces you to read after every write. E.g. you edit line 15 so it becomes two lines. Now you either need arithmetic to shift every later line number, or you need to re-read the full file to reindex it.
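The renumbering problem in miniature (a toy file, not any particular editing tool): after one line becomes two, every later line number you remembered is stale.

```python
lines = [f"line {i}" for i in range(1, 21)]  # a 20-line file

# Edit: line 15 (index 14) becomes two lines.
lines[14:15] = ["line 15a", "line 15b"]

# "line 18" was at line number 18 before the edit; now it isn't.
print(lines.index("line 18") + 1)  # prints 19 -> stale line references break
```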
we dug into those sorts of questions with hypertokens: a robust hash for lines, code, table rows, or any in-context tokens, to give models photographic memory
one mechanism we establish is that each model has a fidelity window, i.e., r tokens of content per s tag tokens; each tag token adds extra GUID-like marker capacity via its embedding vector; since 1-, 2-, and 3-digit numbers are only one token in top models, a single hash token lacks enough capacity & separation in latent space
we also show the hash should be properly prefix-free, with unique symbols per digit, e.g., if hashing with A-K for the first digit and L-Z for the second, then A,R is a legal hash whereas M,C is not
we can do all this & more rather precisely, as we show in our arXiv paper on the same; the next update goes deeper into group theory, info theory, etc. on boosting model recall, reasoning, tool calls, etc. by way of robust hashing
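A minimal sketch of the prefix-free scheme as I read the comment above (the alphabets are the comment's example, not necessarily the paper's exact construction): each digit position draws from its own disjoint symbol set, so no valid tag is a prefix of another and each symbol reveals its own position.

```python
# Position 0 symbols come from A-K, position 1 symbols from L-Z.
DIGIT_ALPHABETS = ["ABCDEFGHIJK", "LMNOPQRSTUVWXYZ"]

def is_valid_tag(tag):
    """A tag is valid if each character comes from its position's alphabet."""
    return len(tag) == len(DIGIT_ALPHABETS) and all(
        ch in DIGIT_ALPHABETS[pos] for pos, ch in enumerate(tag)
    )

print(is_valid_tag("AR"))  # True: A is a legal first symbol, R a legal second
print(is_valid_tag("MC"))  # False: M is not a legal first symbol
```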
The author writes that these hashes are 2 or 3 characters long, I assume depending on the line count. That's good for almost 48k lines; if you have more lines than that, you have other issues.
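Where "almost 48k" plausibly comes from (my assumption: a 36-symbol alphabet, 0-9 plus A-Z, with tags of length 2 or 3):

```python
ALPHABET = 36  # assumed alphabet size: digits 0-9 plus letters A-Z
capacity = ALPHABET**2 + ALPHABET**3  # all 2-char tags plus all 3-char tags
print(capacity)  # prints 47952 -> just under 48k distinct tags
```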