AutoGGUF/docs/convert_lora_to_gguf.py

"""
LoRA to GGUF Converter
This script converts a Hugging Face PEFT LoRA adapter into a GGUF file that GGML-based applications can load.
Key features:
    - Supports various output formats (f32, f16, bf16, q8_0, auto)
    - Handles big-endian and little-endian architectures
    - Provides options for lazy evaluation and verbose output
    - Combines base model information with LoRA adapters (see the sketch after this list)
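For background, the combination rests on the standard LoRA decomposition: each adapted weight carries low-rank factors A (shape r x d_in) and B (shape d_out x r), and the effective weight delta is (alpha / r) * B @ A. A minimal conceptual sketch of that math, not the script's actual API:

    import torch

    def lora_delta(
        lora_a: torch.Tensor,  # shape (r, d_in)
        lora_b: torch.Tensor,  # shape (d_out, r)
        alpha: float,
        rank: int,
    ) -> torch.Tensor:
        # Standard LoRA reconstruction: delta_W = (alpha / r) * B @ A.
        return (alpha / rank) * (lora_b @ lora_a)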
Classes:
    PartialLoraTensor: Dataclass for storing partial LoRA tensor information (sketched below).
    LoraTorchTensor: Custom tensor class for LoRA operations and transformations.
    LoraModel: Extends the base model class to incorporate LoRA-specific functionality.
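Adapter files list the A and B halves of each pair as separate tensors, so the converter needs a place to hold one half until its partner appears. A minimal sketch of such a container, assuming the field names (A/B) used by this variant of the script:

    from dataclasses import dataclass

    import torch

    @dataclass
    class PartialLoraTensor:
        # Either half may still be None until both tensors for a given
        # base weight have been read from the adapter file.
        A: torch.Tensor | None = None
        B: torch.Tensor | None = None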
Functions:
    get_base_tensor_name: Extracts the base tensor name from a LoRA tensor name (see the sketch after this list).
    pyinstaller_include: Placeholder for PyInstaller import handling.
    parse_args: Parses command-line arguments for the script.
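PEFT names adapter tensors by wrapping the base weight's name, e.g. 'base_model.model.model.layers.0.self_attn.q_proj.lora_A.weight'. Recovering the base name amounts to stripping the PEFT prefix and the lora_A/lora_B suffix; a sketch under that assumption (the exact prefixes handled by this script may differ):

    def get_base_tensor_name(lora_tensor_name: str) -> str:
        # Drop the PEFT wrapper prefix and the adapter-half suffix so the
        # name lines up with the corresponding base-model tensor.
        base_name = lora_tensor_name.replace("base_model.model.", "")
        base_name = base_name.replace(".lora_A.weight", ".weight")
        base_name = base_name.replace(".lora_B.weight", ".weight")
        return base_name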
Usage:
    python convert_lora_to_gguf.py --base <base_model_path> <lora_adapter_path> [options]
Arguments:
    --base: Path to the directory containing the base model (required)
    lora_path: Positional path to the directory containing the LoRA adapter (required)
    --outfile: Path to write the output file (optional)
    --outtype: Output format (f32, f16, bf16, q8_0, auto; default: f16)
    --bigendian: Indicate that the model will run on a big-endian machine
    --no-lazy: Disable lazy evaluation (uses more RAM)
    --verbose: Increase output verbosity
    --dry-run: Perform a dry run without writing any files
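For example, a typical conversion to f16 might look like this (all paths are illustrative):

    python convert_lora_to_gguf.py \
        --base ./models/Llama-3-8B \
        ./adapters/my-lora-adapter \
        --outfile my-lora-adapter-f16.gguf \
        --outtype f16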
The script processes LoRA adapters, combines them with base model information,
and generates a GGML-compatible file for use in various applications.
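As one such application, a recent llama.cpp build can apply the converted adapter at load time via its --lora flag (binary and flag names assume a current llama.cpp build):

    llama-cli -m base-model.gguf --lora my-lora-adapter-f16.gguf -p "Hello"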
Note: This script depends on torch, gguf, and safetensors. Ensure these libraries are installed before running the script.
"""