Integrating Llama Models with MCP Servers

In this blog post, we will learn how to use MCP servers with any open-source LLM, OpenAI, or Google Gemini. We will build a simple CLI agent that can control and interact with a SQLite database.

AI agents are evolving from simple chatbots into (semi-)autonomous systems capable of complex reasoning, planning, and interacting with the real world. These agents are not just theoretical concepts; they will go into production across various verticals, handling increasingly complex and long-running tasks. As their capabilities grow, so do the challenges of managing and evolving them.

Today, AI agents are typically designed with all tools, structure, and resources bundled into a single monolithic application. As the system grows, this becomes increasingly challenging, leading to maintenance difficulties, hindering innovation, and creating bottlenecks in development.

To address these challenges, Anthropic proposed an open standard called the Model Context Protocol (MCP). It attempts to decouple components and define clear interfaces. In this blog post, you will learn how to use MCP servers with any open-source LLM (such as Llama 3), OpenAI, or Google Gemini. You will learn how to build a simple CLI agent that can control and interact with a SQLite database.

1. What is the Model Context Protocol (MCP)?

The Model Context Protocol (MCP) is an open standard that enables developers to build secure agents and complex workflows on top of LLMs. MCP lets AI agents connect seamlessly to a variety of external data sources and tools through a standardized way of providing context to applications. MCP implements a client-server architecture in which the AI agent starts a client that communicates with servers. The current MCP server components are (a client-side sketch of accessing them follows the list):

  1. Tools: functions the LLM can call to perform specific actions, for example a weather API.
  2. Resources: data sources that can be accessed, similar to GET endpoints in a REST API. Resources provide data without performing significant computation.
  3. Prompts: predefined templates for using tools or resources in the most optimal way.
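
As a rough client-side illustration of these three primitives, here is a minimal sketch using the official Python mcp SDK. It is hedged: the server launch command (uvx mcp-server-sqlite) and the exact result shapes are assumptions based on the official SQLite server; the ClientSession methods are the same ones used in the full example below.

import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Assumed launch command for the official SQLite MCP server via uvx
server = StdioServerParameters(
    command="uvx", args=["mcp-server-sqlite", "--db-path", "test.db"]
)

async def explore():
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()          # 1. callable functions
            resources = await session.list_resources()  # 2. readable data sources
            prompts = await session.list_prompts()      # 3. predefined templates
            print(tools, resources, prompts)

asyncio.run(explore())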

MCP allows AI agents to interact with multiple resources through one unified protocol. Tools and resources can be updated, tested, scaled, and reused across different platforms independently, without duplicating code.

However, there are still some limitations and missing features, including remote MCP servers and how to properly handle authentication. The good news is that these are actively being worked on!

Note: If you want to see the Google Gemini example, check out the repository.

2. Using an MCP Server with Other AI Models

We will build a simple CLI agent that can control a SQLite database through an MCP server, using the official SQLite MCP server run via Docker. The architecture is quite simple:

  1. First, we create our LLM client, e.g. OpenAI or GenerativeModel.
  2. Initialize the MCP client and connect to our SQLite MCP server.
  3. Load the existing tools, resources, and prompts provided by the MCP server.
  4. Convert the tools into LLM-compatible function-calling tools (JSON Schema) that can call back into our MCP server (a sketch of this conversion follows the list).
  5. Create a custom system message based on the available MCP features.
  6. Start an interactive loop that waits for user input.
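
To make step 4 concrete, here is what a single converted tool can look like in the OpenAI tools format. The read_query tool name comes from the SQLite server used below; its exact inputSchema is illustrative, since the real schema is supplied by the server at runtime.

# An MCP tool exposes a name, a description, and a JSON Schema for its inputs.
# Wrapped into the OpenAI function-calling format, read_query might look like:
read_query_schema = {
    "type": "function",
    "function": {
        "name": "read_query",
        "description": "Execute a SELECT query on the SQLite database",
        "parameters": {  # illustrative inputSchema from the server
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}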

In our example, we will use Meta Llama 3.3 70B Instruct hosted on the Hugging Face Inference API and access it through the openai SDK.

Note: Before we get started, make sure you have Docker set up and are logged in to your Hugging Face account:

huggingface-cli login --token YOUR_TOKEN

Before we start coding, we need to install the required libraries. We will use the openai SDK to interact with the Hugging Face Inference API, and we need the official Python mcp SDK.

pip install huggingface_hub openai "mcp==1.1.2"

Note: Unlike other blog posts, you will find the complete code file below, including detailed code comments for the different parts.

import json
from huggingface_hub import get_token
from openai import AsyncOpenAI
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
from typing import Any, List
import asyncio
 
MODEL_ID = "meta-llama/Llama-3.3-70B-Instruct"
 
# System prompt that guides the LLM's behavior and capabilities
# This helps the model understand its role and available tools
SYSTEM_PROMPT = """You are a helpful assistant capable of accessing external functions and engaging in casual chat. Use the responses from these function calls to provide accurate and informative answers. The answers should be natural and hide the fact that you are using tools to access real-time information. Guide the user about available tools and their capabilities. Always utilize tools to access real-time information when required. Engage in a friendly manner to enhance the chat experience.
 
# Tools
 
{tools}
 
# Notes 
 
- Ensure responses are based on the latest information available from function calls.
- Maintain an engaging, supportive, and friendly tone throughout the dialogue.
- Always highlight the potential of available tools to assist users comprehensively."""
 
 
# Initialize OpenAI client with HuggingFace inference API
# This allows us to use Llama models through HuggingFace's API
client = AsyncOpenAI(
    base_url="https://api-inference.huggingface.co/v1/",
    api_key=get_token(),
)
 
 
class MCPClient:
    """
    A client class for interacting with the MCP (Model Context Protocol) server.
    This class manages the connection and communication with the SQLite database through MCP.
    """
 
    def __init__(self, server_params: StdioServerParameters):
        """Initialize the MCP client with server parameters"""
        self.server_params = server_params
        self.session = None
        self._client = None
 
    async def __aenter__(self):
        """Async context manager entry"""
        await self.connect()
        return self
 
    async def __aexit__(self, exc_type, exc_val, exc_tb):
        """Async context manager exit"""
        if self.session:
            await self.session.__aexit__(exc_type, exc_val, exc_tb)
        if self._client:
            await self._client.__aexit__(exc_type, exc_val, exc_tb)
 
    async def connect(self):
        """Establishes connection to MCP server"""
        self._client = stdio_client(self.server_params)
        self.read, self.write = await self._client.__aenter__()
        session = ClientSession(self.read, self.write)
        self.session = await session.__aenter__()
        await self.session.initialize()
 
    async def get_available_tools(self) -> List[Any]:
        """
        Retrieve a list of available tools from the MCP server.
        """
        if not self.session:
            raise RuntimeError("Not connected to MCP server")
 
        tools = await self.session.list_tools()
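        # ListToolsResult is a pydantic model that iterates as (field_name, value)
        # pairs, so we unpack twice to reach the actual list of Tool objects.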
        _, tools_list = tools
        _, tools_list = tools_list
        return tools_list
 
    def call_tool(self, tool_name: str) -> Any:
        """
        Create a callable function for a specific tool.
        This allows us to execute database operations through the MCP server.
 
        Args:
            tool_name: The name of the tool to create a callable for
 
        Returns:
            A callable async function that executes the specified tool
        """
        if not self.session:
            raise RuntimeError("Not connected to MCP server")
 
        async def callable(*args, **kwargs):
            response = await self.session.call_tool(tool_name, arguments=kwargs)
            return response.content[0].text
 
        return callable
 
 
async def agent_loop(query: str, tools: dict, messages: List[dict] = None):
    """
    Main interaction loop that processes user queries using the LLM and available tools.
 
    This function:
    1. Sends the user query to the LLM with context about available tools
    2. Processes the LLM's response, including any tool calls
    3. Returns the final response to the user
 
    Args:
        query: User's input question or command
        tools: Dictionary of available database tools and their schemas
        messages: List of messages to pass to the LLM, defaults to None
    """
    messages = (
        [
            {
                "role": "system",
                "content": SYSTEM_PROMPT.format(
                    tools="\n- ".join(
                        [
                            f"{t['name']}: {t['schema']['function']['description']}"
                            for t in tools.values()
                        ]
                    )
                ),  # Creates System prompt based on available MCP server tools
            },
        ]
        if messages is None
        else messages  # reuse existing messages if provided
    )
    # add user query to the messages list
    messages.append({"role": "user", "content": query})
 
    # Query LLM with the system prompt, user query, and available tools
    first_response = await client.chat.completions.create(
        model=MODEL_ID,
        messages=messages,
        tools=([t["schema"] for t in tools.values()] if len(tools) > 0 else None),
        max_tokens=4096,
        temperature=0,
    )
    # detect how the LLM call was completed:
    # tool_calls: if the LLM used a tool
    # stop: If the LLM generated a general response, e.g. "Hello, how can I help you today?"
    stop_reason = (
        "tool_calls"
        if first_response.choices[0].message.tool_calls is not None
        else first_response.choices[0].finish_reason
    )
 
    if stop_reason == "tool_calls":
        # Extract tool use details from response
        for tool_call in first_response.choices[0].message.tool_calls:
            arguments = (
                json.loads(tool_call.function.arguments)
                if isinstance(tool_call.function.arguments, str)
                else tool_call.function.arguments
            )
            # Call the tool with the arguments using our callable initialized in the tools dict
            tool_result = await tools[tool_call.function.name]["callable"](**arguments)
            # Add the tool result to the messages list
            messages.append(
                {
                    "role": "tool",
                    "tool_call_id": tool_call.id,
                    "name": tool_call.function.name,
                    "content": json.dumps(tool_result),
                }
            )
 
        # Query LLM with the user query and the tool results
        new_response = await client.chat.completions.create(
            model=MODEL_ID,
            messages=messages,
        )
 
    elif stop_reason == "stop":
        # If the LLM stopped on its own, use the first response
        new_response = first_response
 
    else:
        raise ValueError(f"Unknown stop reason: {stop_reason}")
 
    # Add the LLM response to the messages list
    messages.append(
        {"role": "assistant", "content": new_response.choices[0].message.content}
    )
 
    # Return the LLM response and messages
    return new_response.choices[0].message.content, messages
 
 
async def main():
    """
    Main function that sets up the MCP server, initializes tools, and runs the interactive loop.
    The server is run in a Docker container to ensure isolation and consistency.
    """
    # Configure Docker-based MCP server for SQLite
    server_params = StdioServerParameters(
        command="docker",
        args=[
            "run",
            "--rm",  # Remove container after exit
            "-i",  # Interactive mode
            "-v",  # Mount volume
            "mcp-test:/mcp",  # Map local volume to container path
            "mcp/sqlite",  # Use SQLite MCP image
            "--db-path",
            "/mcp/test.db",  # Database file path inside container
        ],
        env=None,
    )
 
    # Start MCP client and create interactive session
    async with MCPClient(server_params) as mcp_client:
        # Get available database tools and prepare them for the LLM
        mcp_tools = await mcp_client.get_available_tools()
        # Convert MCP tools into a format the LLM can understand and use
        tools = {
            tool.name: {
                "name": tool.name,
                "callable": mcp_client.call_tool(
                    tool.name
                ),  # returns a callable function for the rpc call
                "schema": {
                    "type": "function",
                    "function": {
                        "name": tool.name,
                        "description": tool.description,
                        "parameters": tool.inputSchema,
                    },
                },
            }
            for tool in mcp_tools
            if tool.name
            != "list_tables"  # Excludes list_tables tool as it has an incorrect schema
        }
 
        # Start interactive prompt loop for user queries
        messages = None
        while True:
            try:
                # Get user input and check for exit commands
                user_input = input("\nEnter your prompt (or 'quit' to exit): ")
                if user_input.lower() in ["quit", "exit", "q"]:
                    break
 
                # Process the prompt and run agent loop
                response, messages = await agent_loop(user_input, tools, messages)
                print("\nResponse:", response)
                # print("\nMessages:", messages)
            except KeyboardInterrupt:
                print("\nExiting...")
                break
            except Exception as e:
                print(f"\nError occurred: {e}")
 
 
if __name__ == "__main__":
    asyncio.run(main())

You can run the agent and start chatting with it via:

python sqlite_llama_mcp_agent.py

Here is an example conversation showing what you can do and how the agent responds:

Enter your prompt (or 'quit' to exit): HEllo 
Response: Hello! How can I assist you today?

Enter your prompt (or 'quit' to exit): Who are you?
Response: I am a helpful assistant capable of accessing external functions and engaging in casual chat. I can use the available tools to provide accurate and informative answers. The available tools include read_query,
write_query, create_table, describe_table, and append_insight. I can guide you about these tools and their capabilities, and I will utilize them to access real-time information when required.

Enter your prompt (or 'quit' to exit): Cool! Create a new table for fake products. Each product should have a name and a price. Generate 10 fake products based on video games.
Response: The table has been created successfully with the following schema:
- name (TEXT): The name of the product.
- price (REAL): The price of the product.

Here are 10 fake products based on video games:
1. "The Last of Us" T-shirt - $19.99
2. Minecraft Creeper Plush Toy - $14.99
3. Grand Theft Auto V Poster - $9.99
4. The Legend of Zelda: Breath of the Wild Strategy Guide - $24.99
5. Call of Duty: Modern Warfare Gaming Keyboard - $69.99
6. World of Warcraft Subscription Card (3 months) - $39.99
7. Assassin's Creed Odyssey Action Figure - $29.99
8. Fortnite Monopoly Board Game - $29.99    
9. Resident Evil 2 Remake Collector's Edition - $99.99
10. Pokemon Sword and Shield Nintendo Switch Bundle - $399.99

Let me know if you'd like to add more products or perform any other operations on the table!

Feel free to change the code and add debug statements to better understand the output of each step.
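
For example, you could dump the converted tool schemas right after the tools dict is built in main(), to inspect exactly what the LLM sees (json is already imported at the top of the file):

# Inside main(), after the tools dict is created:
print(json.dumps([t["schema"] for t in tools.values()], indent=2))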

3. Closing Thoughts

This simple example demonstrates the basics of building AI agents that use MCP servers with open LLMs, OpenAI, or Google Gemini. It shows the core principles of how an agent connects to a server, interacts with tools, and responds to user commands. It is not a framework or a robust implementation; its purpose is to help you understand the basic building blocks of a distributed AI agent architecture using MCP servers.


Original article: How to use Anthropic MCP Server with open LLMs, OpenAI or Google Gemini
