Automatically quantize GGUF models

BuildTools 45d0212abe Merge branch 'main' of https://github.com/leafspark/AutoGGUF		2024-09-04 17:53:44 -07:00
.github	feat(ui): add AutoFP8 quantization window	2024-09-02 18:17:29 -07:00
assets	refactor: prepare repo for v1.8.1	2024-09-04 17:19:54 -07:00
docs	refactor: prepare repo for v1.8.1	2024-09-04 17:19:54 -07:00
plugins	feat(core): implement plugins	2024-08-22 20:08:02 -07:00
src	fix: reorganize imports and remove pytz	2024-09-04 17:53:32 -07:00
.env.example	feat: support key shortcuts for AutoFP8 window	2024-09-04 17:31:39 -07:00
.gitattributes	chore: add .gitattributes	2024-08-04 18:52:14 -07:00
.gitignore	feat(core): implement plugins	2024-08-22 20:08:02 -07:00
.pre-commit-config.yaml	ci: remove crlf	2024-08-04 21:15:34 -07:00
CHANGELOG.md	docs: update CHANGELOG.md	2024-09-04 17:36:48 -07:00
CODE_OF_CONDUCT.md	docs: add code of conduct	2024-08-05 11:47:18 -07:00
CONTRIBUTING.md	docs: update docstrings and small code fixes	2024-08-16 19:43:48 -07:00
LICENSE	add details	2024-08-03 19:41:08 -07:00
README.md	docs: update showcase image	2024-09-04 17:33:53 -07:00
SECURITY.md	docs: add more information to SECURITY.md	2024-09-04 17:47:39 -07:00
build.bat	edit favicon	2024-08-04 16:04:04 -07:00
build.sh	build: add cross-platform build scripts	2024-08-04 19:04:02 -07:00
build_optimized.bat	refactor: edit build scripts and add README mention	2024-08-13 20:10:06 -07:00
requirements.txt	build(deps): update setuptools requirement from ~=68.2.0 to ~=74.0.0	2024-09-01 11:19:39 +00:00
run.bat	modify backend check logic	2024-08-04 09:12:07 -07:00
run.sh	feat(conversion): add HF to GGUF conversion + refactor localization	2024-08-05 13:29:30 -07:00
setup.py	docs: update CHANGELOG.md	2024-09-04 17:36:48 -07:00

README.md

AutoGGUF - automated GGUF model quantizer

AutoGGUF provides a graphical user interface for quantizing GGUF models using the llama.cpp library. It allows users to download different versions of llama.cpp, manage multiple backends, and perform quantization tasks with various options.

Features

Download and manage llama.cpp backends
Select and quantize GGUF models
Configure quantization parameters
Monitor system resources during quantization
Parallel quantization + imatrix generation
LoRA conversion and merging
Preset saving and loading
AutoFP8 quantization

Usage

Cross-platform

Install dependencies:
```
pip install -r requirements.txt
```
Run the application:
```
python src/main.py
```
or use the run.bat script (Windows) or the run.sh script (Linux/macOS).

macOS and Ubuntu builds are produced by GitHub Actions; you can download the binaries from the releases section.

Windows

Standard builds:

Download the latest release
Extract all files to a folder
Run AutoGGUF-x64.exe

Setup builds:

Download the setup variant of the latest release
Extract all files to a folder
Run the setup program
The .GGUF extension will be registered with the program automatically
Run the program from the Start Menu or desktop shortcuts

To use the program's local server, set the AUTOGGUF_SERVER environment variable to "enabled" before launching. The server is then reachable on port 7001.
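To confirm that the server came up, a minimal sketch like the one below can poll the port until it accepts TCP connections. It assumes the server listens on localhost. This README does not list the server's endpoints, so only TCP reachability is checked, and `wait_for_port` is a hypothetical helper name, not part of AutoGGUF.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 10.0) -> bool:
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Nothing is listening yet: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(0.2)
```

For example, after starting AutoGGUF with AUTOGGUF_SERVER set to "enabled", `wait_for_port("127.0.0.1", 7001)` should return True once the server is listening.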

Verifying Releases

Linux/macOS:

```
gpg --import AutoGGUF-v1.5.0-prerel.asc
gpg --verify AutoGGUF-v1.5.0-Windows-avx2-prerel.zip.sig AutoGGUF-v1.5.0-Windows-avx2-prerel.zip
sha256sum -c AutoGGUF-v1.5.0-prerel.sha256
```

Windows (PowerShell):

```
# Import the public key
gpg --import AutoGGUF-v1.5.0-prerel.asc

# Verify the signature
gpg --verify AutoGGUF-v1.8.1-Windows-avx2.zip.sig AutoGGUF-v1.8.1-Windows-avx2.zip

# Check SHA256
$fileHash = (Get-FileHash -Algorithm SHA256 AutoGGUF-v1.8.1-Windows-avx2.zip).Hash.ToLower()
$storedHash = (Get-Content AutoGGUF-v1.8.1.sha256 | Select-String AutoGGUF-v1.8.1-Windows-avx2.zip).Line.Split()[0]
if ($fileHash -eq $storedHash) { "SHA256 Match" } else { "SHA256 Mismatch" }
```

Release keys are identical to the ones used for signing commits.
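The SHA256 check can also be done the same way on any platform with Python. The following is a minimal sketch, not part of AutoGGUF: the function names are hypothetical, and it assumes the .sha256 file uses the `sha256sum` layout of one `<hex digest>  <file name>` entry per line.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large archives are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum_file(archive: Path, checksum_file: Path) -> bool:
    """Compare the archive's hash with its entry in a sha256sum-style file."""
    for line in checksum_file.read_text().splitlines():
        parts = line.split()
        # Assumed layout: "<hex digest>  <file name>"; sha256sum may prefix
        # the file name with "*" to mark binary mode.
        if len(parts) >= 2 and parts[-1].lstrip("*") == archive.name:
            return parts[0].lower() == sha256_of(archive)
    return False  # no entry for this archive
```

For example, `matches_checksum_file(Path("AutoGGUF-v1.8.1-Windows-avx2.zip"), Path("AutoGGUF-v1.8.1.sha256"))` mirrors the PowerShell comparison. It checks integrity only, so the `gpg --verify` step is still needed to confirm who signed the release.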

Building

Cross-platform

```
pip install -U pyinstaller
./build.sh RELEASE   # or: ./build.sh DEV
cd build/<type>/dist/   # replace <type> with the build type you chose
./AutoGGUF
```

Windows

```
REM Pass either RELEASE or DEV as the build type
build RELEASE
```

Find the executable in build/<type>/dist/AutoGGUF.exe.

You can also build with Nuitka, which takes longer to build but produces a faster executable:

```
REM Pass either RELEASE or DEV as the build type
build_optimized RELEASE
```

Dependencies

Find them in requirements.txt.

Localizations

View the list of supported languages at AutoGGUF/wiki/Installation#configuration (LLM translated, except for English).

To use a specific language, set the AUTOGGUF_LANGUAGE environment variable to one of the listed language codes. Some languages may not be fully supported yet; those fall back to English.
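When running from source, the variable can be set for a single launch without changing your shell configuration. The sketch below is one way to do that, under stated assumptions: the helper names are hypothetical, the command is the `python src/main.py` invocation from the Usage section, and it must be run from the repository root.

```python
import os
import subprocess
import sys


def env_with_language(language_code: str) -> dict:
    """Copy the current environment and set AUTOGGUF_LANGUAGE in the copy."""
    env = os.environ.copy()
    env["AUTOGGUF_LANGUAGE"] = language_code
    return env


def launch(language_code: str) -> subprocess.Popen:
    """Start AutoGGUF from source with the given language code."""
    return subprocess.Popen(
        [sys.executable, "src/main.py"],
        env=env_with_language(language_code),
    )
```

Calling `launch("fr-FR")` would start the program with that code. "fr-FR" is only a placeholder here, so check the wiki list for the codes that are actually supported.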

Issues

None!

Planned Features

Time estimation for quantization
Actual progress bar tracking
Perplexity testing
HuggingFace upload/download (coming in the next release)
AutoFP8 quantization (partially done) and bitsandbytes (coming soon)

Troubleshooting

SSL module cannot be found error: Install OpenSSL, or run from source with python src/main.py or the run.bat script (install the requests package first with pip install requests).

Contributing

Fork the repo, make your changes, and ensure you have the latest commits when merging. Include a changelog of new features in your pull request description. Read CONTRIBUTING.md for more information.
