AutoGGUF/docs/convert_lora_to_gguf.py

"""
LoRA to GGUF Converter
This script converts a Hugging Face PEFT LoRA adapter into a GGUF file that GGML-based applications can load.
Key features:
    - Supports various output formats (f32, f16, bf16, q8_0, auto)
    - Handles big-endian and little-endian architectures
    - Provides options for lazy evaluation and verbose output
    - Combines base model information with LoRA adapters (see the sketch after this list)
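For background, the combination rests on the standard LoRA decomposition: each adapted weight carries low-rank factors A (shape r x d_in) and B (shape d_out x r), and the effective weight delta is (alpha / r) * B @ A. A minimal conceptual sketch of that math, not the script's actual API:

    import torch

    def lora_delta(
        lora_a: torch.Tensor,  # shape (r, d_in)
        lora_b: torch.Tensor,  # shape (d_out, r)
        alpha: float,
        rank: int,
    ) -> torch.Tensor:
        # Standard LoRA reconstruction: delta_W = (alpha / r) * B @ A.
        return (alpha / rank) * (lora_b @ lora_a)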
Classes:
    PartialLoraTensor: Dataclass for storing partial LoRA tensor information (sketched below).
    LoraTorchTensor: Custom tensor class for LoRA operations and transformations.
    LoraModel: Extends the base model class to incorporate LoRA-specific functionality.
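Adapter files list the A and B halves of each pair as separate tensors, so the converter needs a place to hold one half until its partner appears. A minimal sketch of such a container, assuming the field names (A/B) used by this variant of the script:

    from dataclasses import dataclass

    import torch

    @dataclass
    class PartialLoraTensor:
        # Either half may still be None until both tensors for a given
        # base weight have been read from the adapter file.
        A: torch.Tensor | None = None
        B: torch.Tensor | None = None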
Functions:
    get_base_tensor_name: Extracts the base tensor name from a LoRA tensor name (see the sketch after this list).
    pyinstaller_include: Placeholder for PyInstaller import handling.
    parse_args: Parses command-line arguments for the script.
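PEFT names adapter tensors by wrapping the base weight's name, e.g. 'base_model.model.model.layers.0.self_attn.q_proj.lora_A.weight'. Recovering the base name amounts to stripping the PEFT prefix and the lora_A/lora_B suffix; a sketch under that assumption (the exact prefixes handled by this script may differ):

    def get_base_tensor_name(lora_tensor_name: str) -> str:
        # Drop the PEFT wrapper prefix and the adapter-half suffix so the
        # name lines up with the corresponding base-model tensor.
        base_name = lora_tensor_name.replace("base_model.model.", "")
        base_name = base_name.replace(".lora_A.weight", ".weight")
        base_name = base_name.replace(".lora_B.weight", ".weight")
        return base_name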
Usage:
    python convert_lora_to_gguf.py --base <base_model_path> <lora_adapter_path> [options]
Arguments:
    --base: Path to the directory containing the base model (required)
    lora_path: Positional path to the directory containing the LoRA adapter (required)
    --outfile: Path to write the output file (optional)
    --outtype: Output format (f32, f16, bf16, q8_0, auto; default: f16)
    --bigendian: Indicate that the model will run on a big-endian machine
    --no-lazy: Disable lazy evaluation (uses more RAM)
    --verbose: Increase output verbosity
    --dry-run: Perform a dry run without writing any files
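For example, a typical conversion to f16 might look like this (all paths are illustrative):

    python convert_lora_to_gguf.py \
        --base ./models/Llama-3-8B \
        ./adapters/my-lora-adapter \
        --outfile my-lora-adapter-f16.gguf \
        --outtype f16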
The script processes LoRA adapters, combines them with base model information,
and generates a GGML-compatible file for use in various applications.
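As one such application, a recent llama.cpp build can apply the converted adapter at load time via its --lora flag (binary and flag names assume a current llama.cpp build):

    llama-cli -m base-model.gguf --lora my-lora-adapter-f16.gguf -p "Hello"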
Note: This script depends on torch, gguf, and safetensors. Ensure these libraries are installed before running the script.
"""