Hacker News

> the recent (-1,0,1) encoding?

A side point, but this "recent" encoding goes back to a 2017 paper from the Allen Institute. These days a seven-year-old paper is ancient.

They went further and showed you could get away with binary; you don't even need ternary!




Goes back before then. This got popularized by BinaryConnect in 2015, and groups were training binary networks as early as 2011.

You are probably referring to XNOR-Net, and the novel piece there was also using binary activations (which BitNet does not).

So as far as I can tell, BitNet is basically BinaryConnect applied to LLMs.

https://arxiv.org/abs/1511.00363
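A minimal numpy sketch of the BinaryConnect idea (my own illustration, not code from the paper): keep full-precision latent weights, but let the forward pass see only their signs.

```python
import numpy as np

def binarize(w):
    # Deterministic binarization: replace each weight with its sign,
    # so the forward pass only ever uses values in {-1, +1}.
    return np.where(w >= 0, 1.0, -1.0)

rng = np.random.default_rng(0)
latent_w = rng.normal(size=(4, 3))   # full-precision "master" weights
x = rng.normal(size=(1, 4))          # a single input row

y = x @ binarize(latent_w)           # forward pass with binary weights
# In training, gradient updates are applied to latent_w (straight-through
# estimator), and the weights are re-binarized on the next forward pass.
```

The key trick is that the real-valued latent weights accumulate small gradient updates that a pure {-1, +1} representation couldn't express.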


Thanks for your informative comment. What HN is for!


The BitNet paper showed worse results than an fp16 transformer with the same parameter count. The shocking result in the 1.58b paper (same group) is no quality loss compared to fp16.





