1. Q: What is the difference between quantization types?
A: Different quantization types offer different tradeoffs between model size and inference quality. IQ1_S produces the smallest files but has the worst quality and the longest quantization time, while Q8_0 offers much better quality and faster quantization at the cost of a larger file size.
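
As a rough sketch of this tradeoff in practice: AutoGGUF automates llama.cpp's quantization tooling under the hood, so the two extremes above correspond roughly to `llama-quantize` runs like the ones below. The file names and imatrix path are illustrative placeholders, not output from AutoGGUF itself.

```bash
# Hypothetical llama.cpp invocations illustrating the size/quality tradeoff.
# IQ1_S (like the other IQ1/IQ2 types) requires an importance matrix file,
# which makes it slower to produce.
./llama-quantize --imatrix imatrix.dat model-f16.gguf model-IQ1_S.gguf IQ1_S

# Q8_0 keeps near-original quality and quantizes quickly, but the resulting
# file is much larger.
./llama-quantize model-f16.gguf model-Q8_0.gguf Q8_0
```
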
2. Q: Can I quantize any HuggingFace model?
A: Most HuggingFace models compatible with the GGUF format can be quantized using AutoGGUF, although you first need to convert the model with `python convert_hf_to_gguf.py --outtype auto path_to_your_hf_model` and then move the resulting GGUF file into the `models` folder.
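
A minimal sketch of that workflow on the command line; it assumes the converter writes the GGUF next to the checkpoint, and the glob is a placeholder since the exact output file name depends on the model:

```bash
# Convert the HF checkpoint to GGUF; --outtype auto picks the output precision.
python convert_hf_to_gguf.py --outtype auto path_to_your_hf_model

# Move the converted file into AutoGGUF's models folder.
# The glob is a placeholder: convert_hf_to_gguf.py derives the actual file
# name from the model's metadata.
mv path_to_your_hf_model/*.gguf models/
```
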
3. Q: How long does quantization take?
A: Quantization time depends on the model size, quantization type, and your hardware. It can range from minutes to several hours.