In 2025, Microsoft researchers released BitNet b1.58 2B4T, a 2-billion-parameter model trained on 4 trillion tokens, with open weights and open inference code, demonstrating performance competitive with full-precision models of similar size.
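The "b1.58" in the name refers to ternary weights, each of which carries log2(3) ≈ 1.58 bits of information. A minimal NumPy sketch of the absmean ternary quantization scheme along the lines described by Ma et al. (2024) is shown below; the function and variable names are illustrative rather than taken from any released implementation.

```python
import numpy as np

def absmean_ternary_quantize(W: np.ndarray, eps: float = 1e-8):
    """Quantize a weight matrix to ternary values {-1, 0, +1}.

    Sketch of the absmean scheme described for BitNet b1.58:
    scale by the mean absolute weight, then round and clip to [-1, 1].
    Returns the ternary matrix and the scale needed to dequantize.
    """
    gamma = np.abs(W).mean()                       # per-tensor scale: mean |w|
    W_ternary = np.clip(np.round(W / (gamma + eps)), -1, 1)
    return W_ternary.astype(np.int8), gamma

# Example usage: each ternary weight stores log2(3) ~ 1.58 bits.
W = np.random.randn(4, 4).astype(np.float32)
W_q, scale = absmean_ternary_quantize(W)
W_approx = W_q * scale                             # dequantized approximation
```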
Some researchers point out that the scaling laws of large language models favor low-bit weights only in the case of undertrained models; as the number of training tokens increases, the deficiencies of low-bit quantization become apparent.
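One way to state this concern, sketched here as a hedged illustration rather than the exact fitted formulation of Kumar et al. (2024) or Ouyang et al. (2024), is a Chinchilla-style loss with a precision-dependent penalty term:

```latex
% Illustrative form only: A, B, E, \alpha, \beta and the shape of \delta are
% assumptions for exposition, not constants fitted by any cited paper.
\[
  \mathcal{L}(N, D, P) \;\approx\; \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}} + E
  \;+\; \underbrace{\delta(N, D, P)}_{\text{quantization penalty}},
  \qquad
  \frac{\partial \delta}{\partial D} > 0,
  \quad
  \frac{\partial \delta}{\partial P} < 0 .
\]
```

On this reading, the penalty is negligible for undertrained models (few tokens D relative to parameters N) but grows as D increases, which is the regime in which low-bit quantization is reported to fall behind higher-precision training.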
Ma et al. 2024, p. 1. - Ma, Shuming; Wang, Hongyu; Ma, Lingxiao; Wang, Lei; Wang, Wenhui; Huang, Shaohan; Dong, Li; Wang, Ruiping; Xue, Jilong; Wei, Furu (2024-02-27). "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits". arXiv:2402.17764 [cs.CL]. https://arxiv.org/abs/2402.17764
Friha et al. 2024, p. 5822. - Friha, Othmane; Amine Ferrag, Mohamed; Kantarci, Burak; Cakmak, Burak; Ozgun, Arda; Ghoualmi-Zine, Nassira (2024). "LLM-Based Edge Intelligence: A Comprehensive Survey on Architectures, Applications, Security and Trustworthiness". IEEE Open Journal of the Communications Society. 5: 5799–5856. doi:10.1109/OJCOMS.2024.3456549. ISSN 2644-125X. https://doi.org/10.1109/OJCOMS.2024.3456549
Hutson 2024. - Hutson, Matthew (2024-05-30). "1-bit LLMs Could Solve AI's Energy Demands". IEEE Spectrum. Retrieved 2025-04-22. https://spectrum.ieee.org/1-bit-llm
Morales 2025. - Morales, Jowi (2025-04-17). "Microsoft researchers build 1-bit AI LLM with 2B parameters". Tom's Hardware. Retrieved 2025-04-21. https://www.tomshardware.com/tech-industry/artificial-intelligence/microsoft-researchers-build-1-bit-ai-llm-with-2b-parameters-model-small-enough-to-run-on-some-cpus
Huyen 2024, p. 330. - Huyen, Chip (2024-12-04). AI Engineering. O'Reilly Media. ISBN 978-1-0981-6627-4. Retrieved 2025-04-22. https://books.google.com/books?id=S7M1EQAAQBAJ&pg=PA330
Wang et al. 2023, p. 1. - Wang, Hongyu; Ma, Shuming; Dong, Li; Huang, Shaohan; Wang, Huaijie; Ma, Lingxiao; Yang, Fan; Wang, Ruiping; Wu, Yi; Wei, Furu (2023). "BitNet: Scaling 1-bit Transformers for Large Language Models". arXiv:2310.11453 [cs.CL]. https://arxiv.org/abs/2310.11453
Ma et al. 2025. - Ma, Shuming; Wang, Hongyu; Huang, Shaohan; Zhang, Xingxing; Hu, Ying; Song, Ting; Xia, Yan; Wei, Furu (2025). "BitNet b1.58 2B4T Technical Report". arXiv:2504.12285 [cs.CL]. https://arxiv.org/abs/2504.12285
Ouyang et al. 2024. - Ouyang, Xu; Ge, Tao; Hartvigsen, Thomas; Zhang, Zhisong; Mi, Haitao; Yu, Dong (2024). "Low-Bit Quantization Favors Undertrained LLMs: Scaling Laws for Quantized LLMs with 100T Training Tokens". arXiv:2411.17691 [cs.LG]. https://arxiv.org/abs/2411.17691
Kumar et al. 2024. - Kumar, Tanishq; Ankner, Zachary; Spector, Benjamin F.; Bordelon, Blake; Muennighoff, Niklas; Paul, Mansheej; Pehlevan, Cengiz; Ré, Christopher; Raghunathan, Aditi (2024). "Scaling Laws for Precision". arXiv:2411.04330 [cs.LG]. https://arxiv.org/abs/2411.04330