It seems like they have learned floating-point parameters x and y so that dequantized(bit 0) = x and dequantized(bit 1) = y, so there is no built-in asymmetry. More precisely, they learned a zero point and a scale, but that's equivalent to this simpler two-value model in the 1-bit case.
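
A minimal sketch of that equivalence (function and parameter names are mine, not from the paper): with only two codes, an affine scale/zero-point dequantizer and a directly learned pair {x, y} parameterize exactly the same set of mappings.

    # Affine dequantization, as used in standard quantization schemes.
    def dequantize_affine(bit, scale, zero_point):
        return scale * (bit - zero_point)

    # The simpler two-value model: one learned value per code.
    def dequantize_two_values(bit, x, y):
        return y if bit else x

    # Any (scale, zero_point) pair maps onto an (x, y) pair and vice versa.
    scale, zero_point = 0.7, 0.4
    x = dequantize_affine(0, scale, zero_point)   # -0.28
    y = dequantize_affine(1, scale, zero_point)   #  0.42

    for bit in (0, 1):
        assert abs(dequantize_affine(bit, scale, zero_point)
                   - dequantize_two_values(bit, x, y)) < 1e-12
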

It still seems like there would be a problem: either [x, y] looks like [0-ish, 1-ish] and you can't have negative weights, or [x, y] looks like [-1-ish, 1-ish] and you can't have "don't care" (zero) weights. But if you have some redundancy in your neurons, I guess this is acceptable, because you can cancel out the positive contribution from a neuron you don't care about with a negative contribution from a very similar neuron that you also don't care about.
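
A toy sketch of that cancellation (entirely my own construction for illustration): two near-duplicate neurons given +1 and -1 output weights contribute almost nothing to the sum, emulating the zero weight that a [-1-ish, 1-ish] codebook can't express directly.

    import numpy as np

    rng = np.random.default_rng(0)
    inputs = rng.normal(size=(1000, 8))

    w = rng.normal(size=8)
    neuron_a = np.tanh(inputs @ w)            # a neuron we don't care about
    neuron_b = np.tanh(inputs @ (w + 1e-3))   # a near-duplicate of it

    # With output weights restricted to {-1, +1}, pair them with opposite signs.
    combined = (+1) * neuron_a + (-1) * neuron_b

    print(np.abs(neuron_a).mean())   # O(1) contribution on its own
    print(np.abs(combined).mean())   # nearly zero: the pair cancels out
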
