LLM Model Testing

Downloading models from Hugging Face

First, set a proxy in PowerShell. Variables set this way only apply to the current session.

$env:HTTP_PROXY="http://username:password@xxxx.xxxx.xxxx.xxxx:3030"
$env:HTTPS_PROXY="http://username:password@xxxx.xxxx.xxxx.xxxx:3030"
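
If the proxy password contains characters such as @ or :, the proxy URL will not parse as intended unless those characters are percent-encoded. The following is a minimal sketch of building such a URL; the function name, credentials, and host are hypothetical placeholders, not values from this setup.

```python
from urllib.parse import quote


def build_proxy_url(username, password, host, port, scheme="http"):
    """Return a proxy URL with the credentials percent-encoded.

    All arguments are illustrative placeholders; substitute your own values.
    """
    # safe="" forces reserved characters such as '@', ':' and '/' to be encoded
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    return f"{scheme}://{user}:{pwd}@{host}:{port}"
```

The returned string can then be assigned to $env:HTTP_PROXY and $env:HTTPS_PROXY in the same way as shown above.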

Alternatively, use the hf-mirror mirror site to speed up downloads:

$env:HF_ENDPOINT = "https://hf-mirror.com"

Download the model you want. Llama models are gated, so log in first and make sure your account has been granted access.

huggingface-cli login
huggingface-cli download meta-llama/Llama-3.2-1B
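
By default, huggingface-cli download stores files in the Hugging Face cache, while the conversion step later in this post reads from models/Llama-3.2-1B/. One way to bridge that gap is the CLI's --local-dir option. The sketch below only assembles the command and environment; the helper names and the models directory are assumptions made for illustration.

```python
import os
import subprocess
from pathlib import Path
from typing import Dict, List, Optional


def build_download_command(repo_id: str, models_root: Path) -> List[str]:
    """Build a huggingface-cli command that downloads into models_root/<model name>."""
    local_dir = models_root / repo_id.split("/")[-1]
    return ["huggingface-cli", "download", repo_id, "--local-dir", str(local_dir)]


def hub_env(base: Dict[str, str], endpoint: Optional[str] = None) -> Dict[str, str]:
    """Copy an environment mapping and optionally point HF_ENDPOINT at a mirror."""
    env = dict(base)
    if endpoint:
        env["HF_ENDPOINT"] = endpoint
    return env


def download(repo_id: str, models_root: Path, endpoint: Optional[str] = None) -> None:
    """Run the download; raises CalledProcessError if the CLI fails."""
    subprocess.run(
        build_download_command(repo_id, models_root),
        env=hub_env(dict(os.environ), endpoint),
        check=True,
    )
```

For example, download("meta-llama/Llama-3.2-1B", Path("models"), endpoint="https://hf-mirror.com") would place the files under models/Llama-3.2-1B, assuming you have already logged in.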

llama.cpp

Set up the llama.cpp environment

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
conda create -n llama.cpp python=3.11
conda activate llama.cpp
pip install -r requirements.txt

Verify that the dependencies are installed correctly

python convert_hf_to_gguf.py
usage: convert_hf_to_gguf.py [-h] [--vocab-only] [--outfile OUTFILE] [--outtype {f32,f16,bf16,q8_0,tq1_0,tq2_0,auto}] [--bigendian] [--use-temp-file] [--no-lazy] [--model-name MODEL_NAME] [--verbose]
                             [--split-max-tensors SPLIT_MAX_TENSORS] [--split-max-size SPLIT_MAX_SIZE] [--dry-run] [--no-tensor-first-split] [--metadata METADATA]
                             model
convert_hf_to_gguf.py: error: the following arguments are required: model

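If you see the usage message followed by the missing-argument error, as above, the dependencies are installed correctly.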
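
The same check can be automated. This is a minimal sketch, assuming it is run from the llama.cpp checkout: argparse exits with status 2 and prints the usage text to stderr when a required argument is missing, whereas a missing dependency normally shows up as a Python traceback with a different exit status. The function names are hypothetical.

```python
import subprocess
import sys


def looks_healthy(returncode: int, stderr: str) -> bool:
    """True if the output matches argparse's missing-argument error."""
    return returncode == 2 and "usage:" in stderr and "required: model" in stderr


def check_converter(script: str = "convert_hf_to_gguf.py") -> bool:
    """Run the converter with no arguments and inspect how it fails."""
    result = subprocess.run(
        [sys.executable, script], capture_output=True, text=True
    )
    return looks_healthy(result.returncode, result.stderr)
```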

Convert the model to GGUF format

python convert_hf_to_gguf.py models/Llama-3.2-1B/
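
To control the output file name and tensor type explicitly, the --outfile and --outtype options listed in the usage message can be passed as well. The following sketch builds such a command; the helper names and the output file name are assumptions chosen to match the examples in this post.

```python
import subprocess
import sys
from pathlib import Path
from typing import List

# Values copied from the --outtype choices in the usage message above
OUTTYPES = {"f32", "f16", "bf16", "q8_0", "tq1_0", "tq2_0", "auto"}


def build_convert_command(model_dir: Path, outfile: Path, outtype: str = "f16") -> List[str]:
    """Build the convert_hf_to_gguf.py invocation for a local model directory."""
    if outtype not in OUTTYPES:
        raise ValueError(f"unsupported outtype: {outtype}")
    return [
        sys.executable, "convert_hf_to_gguf.py", str(model_dir),
        "--outfile", str(outfile),
        "--outtype", outtype,
    ]


def convert(model_dir: Path, outfile: Path, outtype: str = "f16") -> None:
    """Run the conversion from inside the llama.cpp checkout."""
    subprocess.run(build_convert_command(model_dir, outfile, outtype), check=True)
```

For example, convert(Path("models/Llama-3.2-1B"), Path("Llama-3.2-1B-F16.gguf")) would write an F16 GGUF file to the current directory.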

LM Studio

Place the converted GGUF model under C:\Users\<username>\.cache\lm-studio\models\ and LM Studio will detect it as a local model.

For example:

"C:\Users\<username>\.cache\lm-studio\models\meta-llama\Llama-3.2-1B\Llama-3.2-1B-F16.gguf"

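The copy step can be scripted. The sketch below assumes the publisher/model/file layout from the example path above; the models directory may differ depending on your LM Studio version, and the function names are hypothetical.

```python
import shutil
from pathlib import Path
from typing import Optional


def lmstudio_destination(home: Path, publisher: str, model: str, filename: str) -> Path:
    """Return <home>/.cache/lm-studio/models/<publisher>/<model>/<filename>."""
    return home / ".cache" / "lm-studio" / "models" / publisher / model / filename


def install_model(src: Path, publisher: str, model: str, home: Optional[Path] = None) -> Path:
    """Copy a GGUF file into the LM Studio models directory and return its new path."""
    dest = lmstudio_destination(home or Path.home(), publisher, model, src.name)
    dest.parent.mkdir(parents=True, exist_ok=True)  # create publisher/model folders
    shutil.copy2(src, dest)
    return dest
```

For example, install_model(Path("Llama-3.2-1B-F16.gguf"), "meta-llama", "Llama-3.2-1B") copies the file into the current user's models directory.
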
Ollama

Loading a GGUF model in Ollama

Create a file named Modelfile with the following content:

FROM ./Llama-3.2-1B-F16.gguf
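
If you convert several models, the Modelfile can be generated rather than written by hand. This is a minimal sketch that emits only the FROM line shown above; the helper names are hypothetical, and it assumes the GGUF file sits in the same directory as the Modelfile.

```python
from pathlib import Path


def modelfile_text(gguf_name: str) -> str:
    """Return Modelfile content pointing at a GGUF file in the same directory."""
    return f"FROM ./{gguf_name}\n"


def write_modelfile(directory: Path, gguf_name: str) -> Path:
    """Write a Modelfile next to the GGUF file and return its path."""
    path = directory / "Modelfile"
    path.write_text(modelfile_text(gguf_name), encoding="utf-8")
    return path
```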

Create the Ollama model from the Modelfile

ollama create llama3.2:1b -f .\Modelfile

Run the model

ollama run llama3.2:1b

TrumanDu's Book
