用LlamaFactory微调Qwen-2 VL

在这篇博文中，我们将探索如何使用 LlamaFactory 框架微调多模态大模型Qwen-2 VL。无论你是经验丰富的 AI 开发人员还是刚刚起步，本指南都将为你提供定制 Qwen-2 VL 以满足你的特定需求的知识。

1、Qwen-2 VL：多模态冠军

在我们进入微调过程之前，让我们花点时间来欣赏这个惊人模型的功能。

开源：这意味着它可供所有人免费使用和修改，促进 AI 社区内的创新和协作。
紧凑尺寸：与许多需要大量计算资源的大型语言模型 (LLM) 不同，Qwen-2 VL 非常紧凑，个人和资源有限的小型团队都可以使用它。
多模态：能够同时处理文本和图像，使 Qwen-2 VL 能够处理各种任务，从图像字幕到视觉问答。

2、LlamaFactory：微调的简便方法

微调是将预训练模型调整为特定任务的过程。这对于提高模型的性能和实现最佳结果至关重要。LlamaFactory 通过其用户友好的界面和强大的功能简化了此过程。

LlamaFactory 就像拥有一个装满 AI 魔法工具的工具箱，让你可以：

微调各种 AI 模型：从 LLM 到 Qwen-2 VL 等多模态模型。
使用“低代码”或“无代码”方法：这意味着你不必是编码专家即可开始使用。
为特定任务自定义模型：训练你的模型以进行图像字幕、文本摘要或你能想到的任何其他任务。

用LlamaFactory 微调 Qwen-2 VL 模型有两种主要方法：

LlamaBoard：无代码方法

LlamaBoard 是一个可视化、用户友好的界面，让您无需编写一行代码即可微调模型。它非常适合初学者和那些喜欢更直观方法的人。

LlamaFactory CLI：命令行灵活性

LlamaFactory CLI 通过命令行命令提供更大的灵活性和对微调过程的控制。这对于想要尝试各种参数和设置的有经验的用户来说是理想的选择。

3、入门：设置你的环境

让我们为微调冒险做好准备：

Google Colab Pro：你需要访问 Google Colab Pro 以获得必要的计算资源。免费的 Colab 无法完成这项任务！
克隆 LlamaFactory：使用 git clone 从 GitHub 下载 LlamaFactory 存储库。
安装依赖项：通过运行 pip install -r requirements.txt 确保你拥有所有必需的软件包。
准备数据：收集你将用于微调模型的文本和图像数据。

!git clone https://github.com/hiyouga/LLaMA-Factory.git

%cd LLaMA-Factory

!pip install -r requirements.txt

!pip install bitsandbytes

!pip install git+https://github.com/huggingface/transformers.git
!pip install -e ".[torch, metrics]"
!pip install liger-kernel

3.1 Llama Board

import os
!GRADIO_SHARE=1 llamafactory-cli webui

4、微调过程：分步指南

对于这篇博文，我们将重点介绍 LlamaFactory CLI 方法，但 LlamaBoard 的步骤类似。

创建配置文件 (JSON)

首先创建一个 JSON 文件，概述微调过程的参数。这包括你正在使用的模型、数据集和所需的训练设置等内容。

启动微调过程

使用 llama_factory train 命令，将路径传递到你的 JSON 配置文件。

监控训练

观察微调过程的输出和进度。这将让你深入了解模型的学习方式。

合并微调模型

训练完成后，你可以使用 LlamaFactory 中提供的 merge_adapter 函数将微调模型与原始模型合并。

测试和部署

最后，评估微调模型的性能并将其部署到您的应用程序中。

5、Llama Factory CLI


import json

args = dict(
  stage="sft",                        # do supervised fine-tuning
  do_train=True,
  model_name_or_path="Qwen/Qwen2-VL-2B-Instruct", # use bnb-4bit-quantized Llama-3-8B-Instruct model
  dataset="mllm_demo,identity",             # use alpaca and identity datasets
  template="qwen2_vl",                     # use llama3 prompt template
  finetuning_type="lora",                   # use LoRA adapters to save memory
  lora_target="all",                     # attach LoRA adapters to all linear layers
  output_dir="qwen2vl_lora",                  # the path to save LoRA adapters
  per_device_train_batch_size=2,               # the batch size
  gradient_accumulation_steps=4,               # the gradient accumulation steps
  lr_scheduler_type="cosine",                 # use cosine learning rate scheduler
  logging_steps=10,                      # log every 10 steps
  warmup_ratio=0.1,                      # use warmup scheduler
  save_steps=1000,                      # save checkpoint every 1000 steps
  learning_rate=5e-5,                     # the learning rate
  num_train_epochs=3.0,                    # the epochs of training
  max_samples=500,                      # use 500 examples in each dataset
  max_grad_norm=1.0,                     # clip gradient norm to 1.0
  loraplus_lr_ratio=16.0,                   # use LoRA+ algorithm with lambda=16.0
  fp16=True,                         # use float16 mixed precision training
  use_liger_kernel=True,                   # use liger kernel for efficient training
)


json.dump(args, open("train_qwen2vl.json", "w", encoding="utf-8"), indent=2)
!llamafactory-cli train train_qwen2vl.json

args = dict(
  model_name_or_path="Qwen/Qwen2-VL-2B-Instruct", # use official non-quantized Llama-3-8B-Instruct model
  adapter_name_or_path="qwen2vl_lora",            # load the saved LoRA adapters
  template="qwen2_vl",                     # same to the one in training
  finetuning_type="lora",                  # same to the one in training
  export_dir="qwen2vl_2b_instruct_lora_merged",              # the path to save the merged model
  export_size=2,                       # the file shard size (in GB) of the merged model
  export_device="cpu",                    # the device used in export, can be chosen from `cpu` and `cuda`
  #export_hub_model_id="your_id/your_model",         # the Hugging Face hub ID to upload model
)

json.dump(args, open("merge_qwen2vl.json", "w", encoding="utf-8"), indent=2)

%cd /content/LLaMA-Factory/

!llamafactory-cli export merge_qwen2vl.json

final_model_path = "/content/LLaMA-Factory/qwen2vl_2b_instruct_lora_merged"

hf_model_repo = "skuma307/Qwen2-VL-2B-Instruct-LoRA-FT"

from huggingface_hub import notebook_login

notebook_login()

from huggingface_hub import HfApi, HfFolder, Repository

# Create an instance of HfApi
api = HfApi()

api.upload_folder(
    folder_path=final_model_path,    # The folder containing the model files
    repo_id=hf_model_repo,                # Your authentication token
    commit_message="Initial model upload"  # Optional commit message
)

print(f"Model pushed to: {hf_model_repo}")

故障排除：常见错误及其解决方法

GPU 内存问题：如果遇到内存不足错误，请尝试清理缓存、释放 GPU 内存或减小批处理大小。
缺少依赖项：仔细检查是否安装了所有必要的依赖项。
数据格式问题：确保你的数据格式正确且与 LlamaFactory 兼容。

7、结束语：微调的力量

使用 LlamaFactory 对 Qwen-2 VL 等多模态模型进行微调开辟了无限可能。它允许你针对特定任务自定义模型的功能，从而提高准确性和性能。

别忘了查看 LlamaFactory GitHub 存储库，你将找到全面的文档、代码示例和有用的资源。

原文链接：Fine-Tuning the Multimodal Marvel: Qwen-2 VL with LlamaFactory

汇智网翻译整理，转载请标明出处