LLM: LoRA

1. The peft library

1.1 Environment setup

conda create -n peft python=3.10
conda activate peft

# install the required packages (development versions from GitHub)
pip install git+https://github.com/huggingface/transformers
pip install git+https://github.com/huggingface/accelerate
pip install git+https://github.com/huggingface/peft
conda install pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=11.8 -c pytorch -c nvidia
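
To sanity-check the installation, here is a minimal sketch of wrapping a model with a LoRA adapter via peft. The rank, alpha, and dropout values are illustrative, and target_modules is set for GPT-2's fused attention projection (matching the gpt2-medium checkpoint downloaded in section 2):

from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

# base model; gpt2-medium matches the checkpoint downloaded below
model = AutoModelForCausalLM.from_pretrained("gpt2-medium")

# illustrative LoRA hyperparameters; c_attn is GPT-2's fused QKV projection
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["c_attn"],
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # prints trainable vs. total parameter counts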

1.2 Running the LoRA code

Prepare the training data first; create_datasets.sh ships with the LoRA repository (LoRA-main) whose examples/NLG paths appear later in this post:

bash create_datasets.sh

2. Overview of downloading models and data from Hugging Face

2.1 Model download

Official Hugging Face documentation: https://huggingface.co/docs/huggingface_hub/v0.19.3/guides/download

Downloading an entire repository

from huggingface_hub import snapshot_download

repo_id = "gpt2-medium"  # model name on Hugging Face
local_dir = "/data/wangyh/mllms/LoRA-main/examples/NLG/pretrained_checkpoints/"  # local directory to store the model
local_dir_use_symlinks = False  # save regular files rather than symlinks into the blob cache
# token = "XXX"  # access token generated on Hugging Face

# # if a proxy is needed
# proxies = {
#     'http': 'XXXX',
#     'https': 'XXXX',
# }

snapshot_download(
    repo_id=repo_id,
    local_dir=local_dir,
    local_dir_use_symlinks=local_dir_use_symlinks,
    # token=token,
    # proxies=proxies
)
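
Once the snapshot is on disk, the model can be loaded directly from the local path instead of the hub name. A minimal sketch, assuming the directory above now contains a standard GPT-2 checkpoint:

from transformers import AutoModelForCausalLM, AutoTokenizer

local_dir = "/data/wangyh/mllms/LoRA-main/examples/NLG/pretrained_checkpoints/"

# load the tokenizer and weights from the local snapshot; no network access needed
tokenizer = AutoTokenizer.from_pretrained(local_dir)
model = AutoModelForCausalLM.from_pretrained(local_dir)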

Downloading a single file from a repository

from huggingface_hub import hf_hub_download

repo_id = "gpt2-medium"  # model name on Hugging Face
file_name = "pytorch_model.bin"  # file to fetch from the repository
local_dir = "/data/wangyh/mllms/LoRA-main/examples/NLG/pretrained_checkpoints_2/"  # local directory to store the file
local_dir_use_symlinks = False  # save a regular file rather than a symlink into the blob cache
# token = "XXX"  # access token generated on Hugging Face

# # if a proxy is needed
# proxies = {
#     'http': 'XXXX',
#     'https': 'XXXX',
# }

hf_hub_download(
    repo_id=repo_id,
    filename=file_name,
    local_dir=local_dir,
    local_dir_use_symlinks=local_dir_use_symlinks,
    # token=token,
    # proxies=proxies
)
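
As an alternative to the Python API, recent versions of huggingface_hub (including the v0.19.x release documented above) also ship a huggingface-cli download command. A sketch covering both cases, with flags mirroring the Python arguments and the same paths as above:

# download a full repository
huggingface-cli download gpt2-medium \
    --local-dir /data/wangyh/mllms/LoRA-main/examples/NLG/pretrained_checkpoints/ \
    --local-dir-use-symlinks False

# download a single file from a repository
huggingface-cli download gpt2-medium pytorch_model.bin \
    --local-dir /data/wangyh/mllms/LoRA-main/examples/NLG/pretrained_checkpoints_2/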