Abstract
Vector-based Random Matrix Adaptation (VeRA) reduces the number of trainable parameters by 10x compared to LoRA while maintaining performance, as demonstrated on the GLUE and E2E benchmarks and in instruction-following with Llama2 7B.
Low-rank adaptation (LoRA) is a popular method that reduces the number of trainable parameters when fine-tuning large language models, but it still faces acute storage challenges when scaling to even larger models or deploying numerous per-user or per-task adapted models. In this work, we present Vector-based Random Matrix Adaptation (VeRA), which reduces the number of trainable parameters by 10x compared to LoRA, yet maintains the same performance. It achieves this by using a single pair of low-rank matrices shared across all layers and learning small scaling vectors instead. We demonstrate its effectiveness on the GLUE and E2E benchmarks, and show its application in instruction-following with just 1.4M trainable parameters using the Llama2 7B model.
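To make the mechanism concrete, here is a minimal PyTorch-style sketch of a VeRA-adapted linear layer under the setup described in the abstract: a single pair of frozen random low-rank matrices shared across all layers, with only per-layer scaling vectors trained. The class name `VeRALinear`, the initialization constants, and the example shapes are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class VeRALinear(nn.Module):
    """Hypothetical sketch of a VeRA-adapted linear layer.

    The frozen pretrained weight is augmented with diag(b) @ B @ diag(d) @ A,
    where A and B are frozen random matrices shared across all adapted layers
    and only the scaling vectors d and b are trained.
    """
    def __init__(self, base_linear: nn.Linear, shared_A: torch.Tensor,
                 shared_B: torch.Tensor, rank: int):
        super().__init__()
        self.base = base_linear
        for p in self.base.parameters():          # freeze the pretrained weights
            p.requires_grad = False
        # Shared, frozen random projections: A is (rank, in), B is (out, rank)
        self.register_buffer("A", shared_A)
        self.register_buffer("B", shared_B)
        # Trainable scaling vectors: d scales the rank dims, b scales the outputs
        # (initial values here are assumptions, not the paper's exact settings)
        self.d = nn.Parameter(torch.full((rank,), 0.1))
        self.b = nn.Parameter(torch.zeros(base_linear.out_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.base(x)                        # frozen pretrained path
        delta = (x @ self.A.t()) * self.d         # project down, scale by d
        delta = (delta @ self.B.t()) * self.b     # project up, scale by b
        return out + delta

# Usage sketch: one pair of random matrices reused by every adapted layer,
# so each layer only adds rank + out_features trainable parameters.
rank, d_in, d_out = 256, 4096, 4096
shared_A = torch.randn(rank, d_in) / d_in ** 0.5
shared_B = torch.randn(d_out, rank) / rank ** 0.5
layer = VeRALinear(nn.Linear(d_in, d_out), shared_A, shared_B, rank)
```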
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- NOLA: Networks as Linear Combination of Low Rank Random Basis (2023)
- Decomposed Prompt Tuning via Low-Rank Reparameterization (2023)
- IncreLoRA: Incremental Parameter Allocation Method for Parameter-Efficient Fine-tuning (2023)
- LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models (2023)
- Hydra: Multi-head Low-rank Adaptation for Parameter Efficient Fine-tuning (2023)
Please give a thumbs up to this comment if you found it helpful!
If you want recommendations for any paper on Hugging Face, check out this Space