"""
LoRA to GGUF Converter

This script converts a Hugging Face PEFT LoRA adapter to a GGML-compatible file format.

Key features:
- Supports various output formats (f32, f16, bf16, q8_0, auto)
- Handles big-endian and little-endian architectures
- Provides options for lazy evaluation and verbose output
- Combines base model information with LoRA adapters (see the sketch after this list)
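
The LoRA combination itself follows the standard low-rank update: the adapter
stores two small matrices per target weight, and the effective weight is the
base weight plus their scaled product. A rough illustrative sketch (shapes and
variable names are hypothetical, not taken from this script):

    import torch

    r, alpha = 16, 32.0                  # adapter rank and scaling factor
    W = torch.zeros(4096, 4096)          # base model weight (placeholder)
    A = torch.randn(r, 4096)             # lora_A: (rank, in_features)
    B = torch.randn(4096, r)             # lora_B: (out_features, rank)

    # Effective weight once the adapter is applied: W + (alpha / r) * (B @ A)
    W_effective = W + (alpha / r) * (B @ A)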

Classes:
    PartialLoraTensor: Dataclass for storing partial LoRA tensor information.
    LoraTorchTensor: Custom tensor class for LoRA operations and transformations.
    LoraModel: Extends the base model class to incorporate LoRA-specific functionality.
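
As an illustration of the bookkeeping involved, a "partial" LoRA tensor can be
modeled as a dataclass that holds the A and B halves until both have been seen.
This is only a plausible sketch, not the exact definition used in the script:

    from dataclasses import dataclass
    from typing import Optional

    import torch

    @dataclass
    class PartialLoraTensor:
        # Filled in as the matching lora_A / lora_B tensors are encountered.
        A: Optional[torch.Tensor] = None
        B: Optional[torch.Tensor] = None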

Functions:
    get_base_tensor_name: Extracts the base tensor name from a LoRA tensor name.
    pyinstaller_include: Placeholder for PyInstaller import handling.
    parse_args: Parses command-line arguments for the script.
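
PEFT adapters typically name tensors like base_model.model.<module>.lora_A.weight;
a minimal sketch of mapping such a name back to its base tensor name (the naming
scheme is assumed and may differ from the script's actual implementation):

    def get_base_tensor_name(lora_tensor_name: str) -> str:
        # Strip the PEFT wrapper prefix and the lora_A / lora_B suffixes.
        base_name = lora_tensor_name.replace("base_model.model.", "")
        base_name = base_name.replace(".lora_A.weight", ".weight")
        base_name = base_name.replace(".lora_B.weight", ".weight")
        return base_name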

Usage:
    python lora_to_gguf.py --base <base_model_path> <lora_adapter_path> [options]
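
For example, with hypothetical paths and an explicit output type:

    python lora_to_gguf.py --base ./models/base-model ./adapters/my-lora \
        --outtype q8_0 --outfile my-lora-q8_0.gguf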

Arguments:
    --base: Path to the directory containing the base model file (required)
    lora_path: Path to the directory containing the LoRA adapter file (required, positional)
    --outfile: Path to write the output file (optional)
    --outtype: Output format (f32, f16, bf16, q8_0, auto; default: f16)
    --bigendian: Flag to indicate big-endian machine execution
    --no-lazy: Disable lazy evaluation (uses more RAM)
    --verbose: Increase output verbosity
    --dry-run: Perform a dry run without writing files
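
A minimal argparse sketch mirroring these options (flag names and the --outtype
default come from the list above; help text is illustrative):

    import argparse

    def parse_args() -> argparse.Namespace:
        parser = argparse.ArgumentParser(
            description="Convert a PEFT LoRA adapter to a GGML-compatible file"
        )
        parser.add_argument("--base", required=True,
                            help="directory containing the base model file")
        parser.add_argument("lora_path",
                            help="directory containing the LoRA adapter file")
        parser.add_argument("--outfile",
                            help="path to write the output file")
        parser.add_argument("--outtype", default="f16",
                            choices=["f32", "f16", "bf16", "q8_0", "auto"],
                            help="output format")
        parser.add_argument("--bigendian", action="store_true",
                            help="target a big-endian machine")
        parser.add_argument("--no-lazy", action="store_true",
                            help="disable lazy evaluation (uses more RAM)")
        parser.add_argument("--verbose", action="store_true",
                            help="increase output verbosity")
        parser.add_argument("--dry-run", action="store_true",
                            help="perform a dry run without writing files")
        return parser.parse_args()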

The script processes LoRA adapters, combines them with base model information,
and generates a GGML-compatible file for use in various applications.

Note: This script requires specific dependencies like torch, gguf, and safetensors.
Ensure all required libraries are installed before running the script.
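
For example, assuming the PyPI package names match the import names used by the
script, the core dependencies can be installed with:

    pip install torch gguf safetensors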
"""