Papers, related resources, and how to cite
The academic works below are listed in reverse chronological order.
SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression (Jun 2023)
Authors: Tim Dettmers, Ruslan Svirschevski, Vage Egiazarian, Denis Kuznedelev, Elias Frantar, Saleh Ashkboos, Alexander Borzunov, Torsten Hoefler, Dan Alistarh
@article{dettmers2023spqr,
title={SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression},
author={Dettmers, Tim and Svirschevski, Ruslan and Egiazarian, Vage and Kuznedelev, Denis and Frantar, Elias and Ashkboos, Saleh and Borzunov, Alexander and Hoefler, Torsten and Alistarh, Dan},
journal={arXiv preprint arXiv:2306.03078},
year={2023}
}
QLoRA: Efficient Finetuning of Quantized LLMs (May 2023)
Authors: Tim Dettmers, Artidoro Pagnoni, Ari Holtzman, Luke Zettlemoyer
@article{dettmers2023qlora,
title={{QLoRA}: Efficient Finetuning of Quantized {LLMs}},
author={Dettmers, Tim and Pagnoni, Artidoro and Holtzman, Ari and Zettlemoyer, Luke},
journal={arXiv preprint arXiv:2305.14314},
year={2023}
}
The Case for 4-bit Precision: k-bit Inference Scaling Laws (Dec 2022)
Authors: Tim Dettmers, Luke Zettlemoyer
@inproceedings{dettmers2023case,
title={The case for 4-bit precision: k-bit inference scaling laws},
author={Dettmers, Tim and Zettlemoyer, Luke},
booktitle={International Conference on Machine Learning},
pages={7750--7774},
year={2023},
organization={PMLR}
}
LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale (Nov 2022)
Authors: Tim Dettmers, Mike Lewis, Younes Belkada, Luke Zettlemoyer
@article{dettmers2022llm,
title={{LLM.int8()}: 8-bit Matrix Multiplication for Transformers at Scale},
author={Dettmers, Tim and Lewis, Mike and Belkada, Younes and Zettlemoyer, Luke},
journal={arXiv preprint arXiv:2208.07339},
year={2022}
}
8-bit Optimizers via Block-wise Quantization (Oct 2021)
Authors: Tim Dettmers, Mike Lewis, Sam Shleifer, Luke Zettlemoyer
@article{DBLP:journals/corr/abs-2110-02861,
author = {Tim Dettmers and
Mike Lewis and
Sam Shleifer and
Luke Zettlemoyer},
title = {8-bit Optimizers via Block-wise Quantization},
journal = {CoRR},
volume = {abs/2110.02861},
year = {2021},
url = {https://arxiv.org/abs/2110.02861},
eprinttype = {arXiv},
eprint = {2110.02861},
timestamp = {Thu, 21 Oct 2021 16:20:08 +0200},
biburl = {https://dblp.org/rec/journals/corr/abs-2110-02861.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}