PEFT documentation


Bone

Block-Affine Adaptation (Bone) is inspired by GQA and MQA and leverages the sparsity of LLM weights to design the Block-Affine Adaptation (Bone) structure. In Bone, the original weights are divided into multiple subspaces, all of which share a single low-rank matrix, initialized to zero, for updates. Extensive experiments demonstrate that Bone consistently outperforms LoRA and its variants across various tasks, while also offering superior computational efficiency.

The abstract from the paper is:

Low-Rank Adaptation (LoRA) has achieved remarkable training results by freezing the original weights and training only low-rank matrices, establishing itself as the predominant fine-tuning method for LLMs. In pursuit of performance closer to full-parameter training, a series of LoRA variants have emerged, such as LoRA+, PISSA, Olora, and LoRA-GA. This paper introduces a novel PEFT technique distinct from LoRA, called Block-Affine Adaptation (Bone). By dividing the original weights into multiple subspaces that share a single matrix for weight updates, Bone simplifies the process by requiring the trainable matrix to be initialized to zero, eliminating the need for complex initialization as in some LoRA variants. Compared to LoRA, Bone significantly reduces memory usage and achieves faster computation. Evaluation of both NLU and NLG tasks demonstrates that Bone substantially outperforms LoRA and its variants. Inspired by Pissa, we further proposed the 'Weight Guide' theory to better utilize the information from the original weights. By integrating 'Weight Guide' with Bone, we developed a new structure called Block-Affine Transformation (Bat), and ablation experiments confirmed the effectiveness of 'Weight Guide'.
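The core idea above, one weight split into blocks that all share a single zero-initialized trainable matrix, can be illustrated with a deliberately simplified numpy sketch. This is an illustration of the concept only, not PEFT's actual Bone implementation (which operates on `torch` modules):

```python
import numpy as np

def bone_delta(out_features: int, in_features: int, bone: np.ndarray) -> np.ndarray:
    """Tile the single shared (r x r) trainable block over every sub-space of the weight."""
    r = bone.shape[0]
    assert out_features % r == 0 and in_features % r == 0, "r must evenly divide the weight"
    return np.tile(bone, (out_features // r, in_features // r))

r = 4
W = np.random.randn(8, 8)          # frozen base weight
bone = np.zeros((r, r))            # zero initialization: the adapter starts as a no-op
W_adapted = W + bone_delta(8, 8, bone)
print(np.allclose(W_adapted, W))   # True: only `bone` is ever updated during training
```

Because the shared block starts at zero, the adapted model is exactly the base model at initialization, which is why Bone needs no elaborate initialization scheme.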

BoneConfig

class peft.BoneConfig

< >

( task_type: typing.Union[str, peft.utils.peft_types.TaskType, NoneType] = None peft_type: typing.Union[str, peft.utils.peft_types.PeftType, NoneType] = None auto_mapping: typing.Optional[dict] = None base_model_name_or_path: typing.Optional[str] = None revision: typing.Optional[str] = None inference_mode: bool = False r: int = 64 target_modules: Optional[Union[list[str], str]] = None exclude_modules: Optional[Union[list[str], str]] = None init_weights: bool | Literal['bat'] = True layers_to_transform: Optional[Union[list[int], int]] = None layers_pattern: Optional[str] = None bias: str = 'none' modules_to_save: Optional[list[str]] = None )

Parameters

  • r (int) — The rank of Bone across different layers. It is best to set r to an even number; otherwise, the default initialization method will not work.
  • target_modules (Optional[Union[List[str], str]]) — The names of the modules to apply the adapter to. If this is specified, only the modules with the specified names will be replaced. When passing a string, a regex match will be performed. When passing a list of strings, either an exact match will be performed or it is checked if the name of the module ends with any of the passed strings. If this is specified as 'all-linear', then all linear modules are chosen, excluding the output layer. If this is not specified, modules will be chosen according to the model architecture. If the architecture is not known, an error will be raised; in this case, you should specify the target modules manually.
  • exclude_modules (Optional[Union[List[str], str]]) — The names of the modules not to apply the adapter to. When passing a string, a regex match will be performed. When passing a list of strings, either an exact match will be performed or it is checked if the name of the module ends with any of the passed strings.
  • init_weights (bool | Literal["bat"]) — Different initializations correspond to different Bone variants. By default (True), the Bone structure is used, while "bat" selects the Bat structure.
  • layers_to_transform (Union[List[int], int]) — The layer indices to transform. If a list of ints is passed, the adapter is applied to the layer indices in that list. If a single integer is passed, the transformation is applied to the layer at that index.
  • layers_pattern (str) — The layer pattern name, used only if layers_to_transform is different from None.
  • modules_to_save (List[str]) — List of modules apart from adapter layers to be set as trainable and saved in the final checkpoint.

This is the configuration class to store the configuration of a BoneModel.
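As a minimal sketch, a BoneConfig is typically applied through peft's get_peft_model rather than by constructing BoneModel directly. The model name and target modules below are illustrative assumptions, and the snippet assumes peft and transformers are installed:

```python
>>> from transformers import AutoModelForCausalLM
>>> from peft import BoneConfig, get_peft_model

>>> model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
>>> config = BoneConfig(
...     r=64,  # must evenly divide the targeted weight dimensions; even values work best
...     target_modules=["q_proj", "v_proj"],
...     task_type="CAUSAL_LM",
... )
>>> peft_model = get_peft_model(model, config)
>>> peft_model.print_trainable_parameters()
```

Passing init_weights="bat" instead of the default True would select the Bat variant described in the paper.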

BoneModel

class peft.BoneModel

< >

( model peft_config: Union[PeftConfig, dict[str, PeftConfig]] adapter_name: str low_cpu_mem_usage: bool = False ) torch.nn.Module

Parameters

  • model (torch.nn.Module) — The model to which the adapter tuner layers will be attached.
  • config (BoneConfig) — The configuration of the Bone model.
  • adapter_name (str) — The name of the adapter, defaults to "default".
  • low_cpu_mem_usage (bool, optional, defaults to False) — Create empty adapter weights on meta device. Useful to speed up the loading process.

Returns

torch.nn.Module

The Bone model.

Creates a Block-Affine Adaptation (Bone) model from a pretrained model. The method is described in detail in https://arxiv.org/abs/2409.15371.

Example:

>>> from diffusers import StableDiffusionPipeline
>>> from peft import BoneModel, BoneConfig

>>> config_te = BoneConfig(
...     r=8,
...     target_modules=["k_proj", "q_proj", "v_proj", "out_proj", "fc1", "fc2"],
...     init_weights=True,
... )
>>> config_unet = BoneConfig(
...     r=8,
...     target_modules=[
...         "proj_in",
...         "proj_out",
...         "to_k",
...         "to_q",
...         "to_v",
...         "to_out.0",
...         "ff.net.0.proj",
...         "ff.net.2",
...     ],
...     init_weights=True,
... )

>>> model = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
>>> model.text_encoder = BoneModel(model.text_encoder, config_te, "default")
>>> model.unet = BoneModel(model.unet, config_unet, "default")

Attributes:

  • model (~torch.nn.Module) — The model to be adapted.
  • peft_config (BoneConfig): The configuration of the Bone model.

delete_adapter

< >

( adapter_name: str )

Parameters

  • adapter_name (str) — The name of the adapter to be deleted.

Deletes an existing adapter.

merge_and_unload

< >

( progressbar: bool = False safe_merge: bool = False adapter_names: typing.Optional[typing.List[str]] = None )

Parameters

  • progressbar (bool) — Whether to show a progress bar indicating the unload and merge process.
  • safe_merge (bool) — Whether to activate the safe merging check to verify that there are no potential NaN values in the adapter weights.
  • adapter_names (List[str], optional) — The list of adapter names that should be merged. If None, all active adapters will be merged. Defaults to None.

This method merges the Bone layers into the base model. This is needed if someone wants to use the base model as a standalone model.
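Conceptually, merging works because the Bone update is additive: folding the learned delta into the base weight gives a plain model whose forward pass matches the adapted one. A simplified numpy sketch of that equivalence (not PEFT's actual merge code):

```python
import numpy as np

rng = np.random.default_rng(0)
r = 2
W = rng.standard_normal((4, 4))            # frozen base weight
bone = rng.standard_normal((r, r))         # a "trained" shared block
delta = np.tile(bone, (4 // r, 4 // r))    # Bone's block-tiled update

x = rng.standard_normal(4)
y_adapted = W @ x + delta @ x              # adapter attached: base output + Bone update
W_merged = W + delta                       # what merging folds into the base weight
print(np.allclose(W_merged @ x, y_adapted))  # True: the merged model is equivalent
```

After merging, the adapter modules carry no extra information and can be removed, which is why the merged base model can be used standalone.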

unload

< >

( )

Gets back the base model by removing all the Bone modules without merging. This returns the original base model.
