AutoGGUF - automated GGUF model quantizer

This application provides a graphical user interface for quantizing GGUF models using the llama.cpp library. It allows users to download different versions of llama.cpp, manage multiple backends, and perform quantization tasks with various options.

Main features:

  1. Download and manage llama.cpp backends
  2. Select and quantize GGUF models
  3. Configure quantization parameters
  4. Monitor system resources during quantization

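Under the hood, the quantization step wraps llama.cpp's quantize tool. As a minimal sketch of the kind of command the GUI assembles (the binary name, file paths, and the build_quantize_cmd helper are illustrative; recent llama.cpp releases ship the tool as llama-quantize, older ones as quantize):

```python
import subprocess

def build_quantize_cmd(binary, input_gguf, output_gguf, qtype="Q4_K_M"):
    # Assemble a llama.cpp quantize invocation. The paths and the
    # default quantization type here are placeholders, not the
    # application's actual internals.
    return [binary, input_gguf, output_gguf, qtype]

cmd = build_quantize_cmd("llama-quantize", "model-f16.gguf", "model-Q4_K_M.gguf")
print(" ".join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

The GUI's job is then to expose the input/output paths and quantization type as form fields and stream the subprocess output back to the user.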
Usage:

Cross platform:

  1. Install dependencies, either with pip install -r requirements.txt or directly with pip install PyQt6 requests psutil.
  2. Start the application with python src/main.py (or the run.bat script on Windows).

Windows:

  1. Download the latest release, extract all files to a folder, and run AutoGGUF.exe
  2. Enjoy!

Building:

cd src
pip install -U pyinstaller
pyinstaller main.py
cd dist/main
./main (main.exe on Windows)

Dependencies:

  • PyQt6
  • requests
  • psutil
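psutil supplies the system-resource readings used during quantization. A minimal sketch of the kind of polling a GUI could do (the sample_resources helper is illustrative, not the application's actual code):

```python
import psutil

def sample_resources():
    # Take one snapshot of system load. cpu_percent(interval=...) blocks
    # for the sampling window, so a GUI would call this off the main
    # thread or via a timer.
    return {
        "cpu_percent": psutil.cpu_percent(interval=0.1),
        "ram_percent": psutil.virtual_memory().percent,
    }

print(sample_resources())
```

In a PyQt6 application, a QTimer firing every second or so is a common way to refresh such readings without freezing the UI.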

To be implemented:

  • Actual progress bar tracking
  • Download safetensors from HF and convert to unquantized GGUF
  • Specify multiple KV overrides
  • Better error handling
  • Selecting the output/token embedding type

Troubleshooting:

  • llama.cpp quantization errors out with an iostream error: create the quantized_models directory (or set a valid output directory)

User interface: (screenshot)