minerva.models.finetune_adapters
Classes
LoRA: LoRA (Low-Rank Adaptation) for Linear Layers.
Module Contents
- class minerva.models.finetune_adapters.LoRA(original_module, bias=True, alpha=1, r=4)[source]
Bases: torch.nn.Module
LoRA (Low-Rank Adaptation) for Linear Layers.
This module applies low-rank adaptation to an existing linear layer. LoRA enables fine-tuning of pre-trained models efficiently by introducing learnable low-rank matrices that adapt the weights of the original layer while keeping its parameters frozen.
Parameters
- original_module : torch.nn.Module
The original linear or transformer layer (e.g., torch.nn.Linear) to which LoRA is applied. It must have in_features and out_features attributes.
- bias : bool, optional
Whether to include a bias term in the LoRA adaptation layers. Default is True.
- alpha : float, optional
The scaling factor for the LoRA output. Default is 1.
- r : int, optional
The rank of the low-rank matrices used for adaptation. Default is 4.
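As a quick check of how alpha and r interact, the default configuration yields the scaling value 0.25 shown in the attribute list further below, consistent with the common LoRA convention scaling = alpha / r (an assumption; the exact rule is not stated in this docstring):
>>> import torch.nn as nn
>>> from minerva.models.finetune_adapters import LoRA
>>> LoRA(nn.Linear(128, 64)).scaling  # defaults: alpha=1, r=4
0.25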
Attributes
- original_module : torch.nn.Module
The original module that LoRA adapts.
- matrix_A : torch.nn.Linear
The low-rank matrix A with dimensions (in_features, r).
- matrix_B : torch.nn.Linear
The low-rank matrix B with dimensions (r, out_features).
- scaling : float
The scaling factor applied to the LoRA adaptation output.
Methods
- init_weights():
Initializes the weights of the low-rank matrices A and B. Matrix A is initialized using Kaiming uniform initialization, and matrix B is initialized with zeros.
- forward(x):
Computes the forward pass through the adapted module.
Examples
>>> import torch
>>> import torch.nn as nn
>>> from minerva.models.finetune_adapters import LoRA
>>> # Original linear layer
>>> original_layer = nn.Linear(128, 64)
>>> # Wrap the original layer with LoRA
>>> lora_layer = LoRA(original_layer, alpha=2, r=8)
>>> # Input tensor
>>> x = torch.randn(16, 128)  # batch size of 16
>>> # Forward pass
>>> output = lora_layer(x)
>>> print(output.shape)
torch.Size([16, 64])
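Since the docstring states that the original layer's parameters stay frozen during fine-tuning, a typical training setup (a sketch; whether LoRA freezes them itself is not documented here) disables their gradients explicitly and optimizes only the LoRA parameters:
>>> # Freeze the wrapped layer; train only the LoRA matrices.
>>> for p in lora_layer.original_module.parameters():
...     _ = p.requires_grad_(False)
>>> optimizer = torch.optim.AdamW(
...     (p for p in lora_layer.parameters() if p.requires_grad), lr=1e-3
... )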
- forward(x)[source]
Forward pass of the LoRA module.
Computes the output as the original module's output plus the low-rank adaptation output, with the adaptation term scaled by the scaling factor.
Parameters
- x : torch.Tensor
The input tensor with shape (batch_size, in_features).
Returns
- torch.Tensor
The output tensor with shape (batch_size, out_features).
Notes
The output is computed as

.. math::

   y = \text{original\_module}(x) + \text{scaling} \cdot B(A(x)),

where A and B are the learnable low-rank matrices.
- Parameters:
x (torch.Tensor)
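The snippet below reuses lora_layer and x from the Examples above. It is a sketch showing that the forward pass matches the formula in the Notes, assuming the documented attribute names:
>>> y = lora_layer.original_module(x) + lora_layer.scaling * lora_layer.matrix_B(lora_layer.matrix_A(x))
>>> torch.allclose(y, lora_layer(x))
True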
- init_weights()[source]
Initialize weights for the low-rank matrices.
Matrix A is initialized with Kaiming uniform initialization, which is suitable for layers with ReLU activations. Matrix B is initialized with zeros to ensure that the original module’s behavior is not perturbed at the start.
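For illustration, this scheme can be reproduced with torch.nn.init on hypothetical stand-ins for matrix_A and matrix_B (a sketch; the layer sizes and bias handling are assumptions, not taken from the actual implementation):
>>> import torch
>>> import torch.nn as nn
>>> A = nn.Linear(128, 4, bias=False)  # hypothetical stand-in for matrix_A (r=4)
>>> B = nn.Linear(4, 64, bias=False)   # hypothetical stand-in for matrix_B
>>> _ = nn.init.kaiming_uniform_(A.weight)  # Kaiming uniform for A
>>> _ = nn.init.zeros_(B.weight)            # zeros for B
>>> torch.all(B(A(torch.randn(16, 128))) == 0)  # the adaptation starts as a no-op
tensor(True)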
- matrix_A
- matrix_B
- original_module
- scaling = 0.25
- Parameters:
original_module (torch.nn.Module)
bias (bool)
alpha (int)
r (int)