Building AI Agents from Scratch, No Frameworks

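Agents are not complicated!

Our goal is to build a chatbot agent using only the OpenAI API and Python, with no framework. The agent can use tools such as a calculator or a knowledge base. Building it by hand gives you a foundation in how agent systems work internally.

If you are just getting started and want some background on agents, you can refer to this article.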

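1. High-Level Idea

We will build an AI agent that can:

Accept user input
Decide which tool to use (calculator, knowledge base, or no tool)
Execute that tool manually (no automatic magic)
Return a response

We will simulate OpenAI function calling, while controlling tool selection and the execution flow ourselves.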

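2. What Is an Agent, and Why Do You Need One?

An AI agent is a system that can:

Perceive input (e.g., a user's message)
Decide on a course of action (e.g., perform a calculation or look up a fact)
Use tools (e.g., functions, APIs) to carry out the task
Return meaningful output

Agents are useful because they can:

Use external tools
Maintain context across a multi-turn conversation (full memory requires storing the conversation history)
Be extended easily with new capabilities as needs grow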

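3. Planning Your Agent

Before writing code, we need to decide the following:

Which tools should our agent use?
Is tool invocation controlled by the LLM or by our Python code?
How does the agent maintain state or memory?
Do we use OpenAI's built-in function calling, or simulate it ourselves?

We will simulate function calling ourselves: a structured prompt sent through OpenAI's regular ChatCompletion API asks the model for the tool choice and its input, and our code extracts both from the reply.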
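
To make the idea of a structured prompt concrete, here is a minimal sketch that builds the system prompt from a table of tool descriptions, so the prompt and the tool list cannot drift apart. The names `TOOL_DESCRIPTIONS`, `build_selector_prompt`, and `build_messages` are assumptions made for illustration and are not part of the original article; the tool names match the ones used later.

```python
import json

# Hypothetical tool descriptions; the tool names match the ones used in this article.
TOOL_DESCRIPTIONS = {
    "calculator": "Evaluate an arithmetic expression such as 5 * (4 + 3).",
    "knowledge_search": "Answer factual or informational questions.",
    "none": "Anything else, such as greetings or small talk.",
}

def build_selector_prompt(tool_descriptions):
    """Build a system prompt that asks the model for a JSON tool choice."""
    lines = ["You are a tool router. Choose exactly one tool for the user's message:"]
    for name, description in tool_descriptions.items():
        lines.append(f"- {name}: {description}")
    lines.append("Respond ONLY with a JSON object like this:")
    lines.append(json.dumps({"tool": "calculator", "input": "5 * (4 + 3)"}))
    return "\n".join(lines)

def build_messages(user_input, tool_descriptions=TOOL_DESCRIPTIONS):
    """Return the chat messages to send to a chat completion endpoint."""
    return [
        {"role": "system", "content": build_selector_prompt(tool_descriptions)},
        {"role": "user", "content": user_input},
    ]

if __name__ == "__main__":
    for message in build_messages("What is 7 * (2 + 3)?"):
        print(message["role"], "->", message["content"])
```

Adding a tool then means adding one entry to the table; the prompt text follows automatically.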

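4. Key Concepts and Components

Components and their roles:

LLM → The language model that processes and interprets the query
Prompt → Tells the LLM how to behave and what output we expect
Tool selector → Parses the user input and decides which tool to use
Tool runner → Executes the calculator or the knowledge search
Reasoning loop → Manages the conversation, context, and replies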

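5. Model Layer vs. Application Layer

Layers and their responsibilities:

Model layer (LLM) → Understands natural language and routes intent
Application layer (e.g., Python code) → Executes tools, manages flow, and sends responses

This separation lets you swap out tools, prompts, or models easily.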
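
One way to picture this separation is a small factory that wires a selector (the model layer) to a registry of plain Python functions (the application layer). This is a sketch under assumptions: `make_agent` and `keyword_selector` are hypothetical names, and the selector below is a keyword stand-in for the LLM so the example runs without an API key.

```python
from typing import Callable, Dict, Optional, Tuple

ToolFn = Callable[[str], Optional[str]]
Selector = Callable[[str], Tuple[str, str]]

def make_agent(select_tool: Selector, tools: Dict[str, ToolFn],
               fallback: Callable[[str], str]) -> Callable[[str], str]:
    """Wire a selector (model layer) to a tool registry (application layer)."""
    def agent(user_input: str) -> str:
        tool_name, tool_input = select_tool(user_input)
        tool = tools.get(tool_name)
        if tool is None:
            # Unknown tool or "none": answer with the fallback instead.
            return fallback(user_input)
        result = tool(tool_input)
        return result if result is not None else "I couldn't find an answer."
    return agent

# Stand-in for the LLM selector: routes anything containing a digit to the calculator.
def keyword_selector(user_input: str) -> Tuple[str, str]:
    if any(ch.isdigit() for ch in user_input):
        return "calculator", user_input
    return "none", user_input

agent = make_agent(
    select_tool=keyword_selector,
    tools={"calculator": lambda expr: f"(would compute) {expr}"},
    fallback=lambda text: f"(would chat about) {text}",
)

if __name__ == "__main__":
    print(agent("2 + 2"))
    print(agent("Hello!"))
```

Because the agent only sees callables, replacing the stand-in selector with an LLM-backed one, or registering another tool, does not touch the control flow.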

https://www.flyingfish.vc/blog/back-to-basics

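6. Minimal Agent Architecture

User input  
    ↓  
LLM tool selector (decides which tool to use and with what input)  
    ↓  
Python tool executor (runs the calculator or the knowledge search)  
    ↓  
Final response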

7. Implementation Steps

7.1 Tool Definitions

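import os
import re

import openai

# Note: openai.ChatCompletion is the legacy interface of the openai package (versions before 1.0).
openai.api_key = os.getenv("OPENAI_API_KEY")

def calculate_expression(expression):
    # Keep only digits, arithmetic operators, parentheses, dots, and spaces.
    safe_expr = re.sub(r'[^0-9+\-*/(). ]', '', expression)
    if safe_expr.strip() == "":
        return None
    try:
        # eval only sees whitelisted characters; huge powers such as 9**9**9 can still be slow.
        return eval(safe_expr)
    except Exception:
        return None

knowledge_base = {
    "python": "Python is a programming language known for its readability.",
    "openai": "OpenAI is the company that created ChatGPT.",
    "ai": "AI stands for Artificial Intelligence, which simulates human intelligence."
}

# This only simulates knowledge retrieval; in a real system this would be your knowledge retriever.
def search_knowledge(query):
    q = query.lower()
    # Substring match: simple, but "ai" also matches words such as "rain".
    for keyword, value in knowledge_base.items():
        if keyword in q:
            return value
    return None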
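
The calculator above relies on `eval` behind a character whitelist. As an alternative sketch, not the article's method, the expression can be parsed with Python's standard `ast` module and evaluated by walking the tree, so that only the four arithmetic operators and unary signs are ever executed. The function name `safe_calculate` is an assumption made for illustration.

```python
import ast
import operator

# Only these operators are allowed; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}

def _evaluate(node):
    # Plain int/float literals only (bool, str, and complex are rejected).
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_evaluate(node.operand))
    raise ValueError("unsupported expression")

def safe_calculate(expression):
    """Return the numeric value of an arithmetic expression, or None on failure."""
    try:
        tree = ast.parse(expression.strip(), mode="eval")
        return _evaluate(tree.body)
    except (SyntaxError, ValueError, ZeroDivisionError, OverflowError, RecursionError):
        return None

if __name__ == "__main__":
    print(safe_calculate("7 * (2 + 3)"))
    print(safe_calculate("__import__('os')"))
```

Function calls, names, and the power operator are all rejected because they never appear in the allowed node types, which is a stricter guarantee than filtering characters.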

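7.2 Simulating Function Calling (the LLM as Tool Router)

Instead of using OpenAI's built-in function calling, we ask the LLM to:

Decide which tool to use: calculator, knowledge_search, or none
Return a JSON-style structure that we parse ourselves

How does the agent know when to use the knowledge base?

The agent uses the knowledge base when:

The user asks a factual or informational question
Our system prompt tells it: "If the question is factual, choose knowledge_search."
The LLM understands the semantics of queries such as:
"What is Python?"
"Who is OpenAI?"
"Tell me about AI."

It responds like this:

{ "tool": "knowledge_search", "input": "What is Python?" }

If it finds no match (e.g., "What is the capital of France?"), the LLM may reply with "tool": "none", and we fall back to a regular chat response.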
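
Since the model's reply is just text, the parsing step deserves some care. The following is a minimal sketch of a defensive parser, assuming the reply format shown above; `parse_tool_call` and `VALID_TOOLS` are hypothetical names introduced here. It tolerates extra text around the JSON object and falls back to "none" when the reply is malformed or names an unknown tool.

```python
import json

VALID_TOOLS = {"calculator", "knowledge_search", "none"}

def parse_tool_call(reply_text, user_input):
    """Parse the model's reply into (tool, tool_input), falling back to 'none'."""
    # Models sometimes wrap the JSON in extra text, so take the outermost braces.
    start = reply_text.find("{")
    end = reply_text.rfind("}")
    if start == -1 or end <= start:
        return "none", user_input
    try:
        data = json.loads(reply_text[start:end + 1])
    except json.JSONDecodeError:
        return "none", user_input
    tool = data.get("tool")
    tool_input = data.get("input")
    # Reject unknown tools and non-string fields instead of trusting the model.
    if not isinstance(tool, str) or tool not in VALID_TOOLS:
        return "none", user_input
    if not isinstance(tool_input, str):
        return "none", user_input
    return tool, tool_input

if __name__ == "__main__":
    print(parse_tool_call('{ "tool": "knowledge_search", "input": "What is Python?" }', "What is Python?"))
    print(parse_tool_call("Sorry, I am not sure.", "Hello!"))
```

Validating the tool name against a fixed set keeps a misbehaving reply from steering the application into code paths it was never meant to reach.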

import json

def query_openai_tool_selector(user_input):
    messages = [
        {
            "role": "system",
            "content": (
                "You're an AI assistant. Based on the user's message, "
                "decide which tool to use: 'calculator', 'knowledge_search', or 'none'. "
                "If the question is factual, choose 'knowledge_search'. "
                "Respond ONLY with a JSON object like this:\n"
                "{ \"tool\": \"calculator\", \"input\": \"5 * (4 + 3)\" }\n"
                "or\n"
                "{ \"tool\": \"none\", \"input\": \"Hello!\" }"
            )
        },
        {"role": "user", "content": user_input}
    ]

    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=messages
    )
    try:
        # Parse the reply as JSON; never eval() text produced by a model.
        tool_call = json.loads(response.choices[0].message['content'])
        return tool_call.get("tool"), tool_call.get("input")
    except (json.JSONDecodeError, AttributeError, TypeError):
        # Malformed or non-object reply: fall back to a plain chat response.
        return "none", user_input

7.3 Tool Execution and Response Handling

def query_openai(user_input):
    tool, tool_input = query_openai_tool_selector(user_input)
    if tool == "calculator":
        result = calculate_expression(tool_input)
        # Compare with None so that a legitimate result of 0 is still reported.
        return f"The result is: {result}" if result is not None else "I couldn't compute that."
    elif tool == "knowledge_search":
        result = search_knowledge(tool_input)
        return result if result else "I couldn't find relevant information."
    else:
        # No tool needed: answer with a regular chat completion.
        response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[
                {"role": "system", "content": "You are a helpful assistant."},
                {"role": "user", "content": user_input}
            ]
        )
        return response.choices[0].message['content']

7.4 How It Works

User: "What is 7 * (2 + 3)?"  
    ↓  
LLM (tool selector): { "tool": "calculator", "input": "7 * (2 + 3)" }  
    ↓  
Python tool: calculate_expression("7 * (2 + 3)") → 35  
    ↓  
Final reply: "The result is: 35"

8. Complete Final Agent Code

import json
import os
import re

import openai

# Note: openai.ChatCompletion is the legacy interface of the openai package (versions before 1.0).
openai.api_key = os.getenv("OPENAI_API_KEY")

def calculate_expression(expression):
    # Keep only digits, arithmetic operators, parentheses, dots, and spaces.
    safe_expr = re.sub(r'[^0-9+\-*/(). ]', '', expression)
    if safe_expr.strip() == "":
        return None
    try:
        return eval(safe_expr)
    except Exception:
        return None

knowledge_base = {
    "python": "Python is a programming language known for its readability.",
    "openai": "OpenAI is the company that created ChatGPT.",
    "ai": "AI stands for Artificial Intelligence, which simulates human intelligence."
}

# This only simulates knowledge retrieval.
def search_knowledge(query):
    q = query.lower()
    for keyword, value in knowledge_base.items():
        if keyword in q:
            return value
    return None

def query_openai_tool_selector(user_input):
    messages = [
        {
            "role": "system",
            "content": (
                "You're an AI assistant. Based on the user's message, "
                "decide which tool to use: 'calculator', 'knowledge_search', or 'none'. "
                "If the question is factual, choose 'knowledge_search'. "
                "Respond ONLY with a JSON object like this:\n"
                "{ \"tool\": \"calculator\", \"input\": \"5 * (4 + 3)\" }\n"
                "or\n"
                "{ \"tool\": \"none\", \"input\": \"Hello!\" }"
            )
        },
        {"role": "user", "content": user_input}
    ]

    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=messages
    )

    try:
        # Parse the reply as JSON; never eval() text produced by a model.
        tool_call = json.loads(response.choices[0].message['content'])
        return tool_call.get("tool"), tool_call.get("input")
    except (json.JSONDecodeError, AttributeError, TypeError):
        return "none", user_input

def query_openai(user_input):
    tool, tool_input = query_openai_tool_selector(user_input)
    if tool == "calculator":
        result = calculate_expression(tool_input)
        # Compare with None so that a legitimate result of 0 is still reported.
        return f"The result is: {result}" if result is not None else "I couldn't compute that."
    elif tool == "knowledge_search":
        result = search_knowledge(tool_input)
        return result if result else "I couldn't find relevant information."
    else:
        response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[
                {"role": "system", "content": "You are a helpful assistant."},
                {"role": "user", "content": user_input}
            ]
        )
        return response.choices[0].message['content']

# Simple command-line chat loop.
print("Welcome! Ask me anything. Type 'quit' to exit.")
while True:
    user_input = input("You: ")
    if user_input.lower() == "quit":
        print("Assistant: Goodbye!")
        break
    print("Assistant:", query_openai(user_input))

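9. Closing Thoughts

You now have a working AI agent that:

Understands user input
Uses LLM output to choose tools intelligently
Executes functions manually instead of relying on OpenAI's built-in tool calling
Keeps the logic clear and under your control

Ways you can build on this:

Add more tools (weather, reminders, file handling, etc.)
Add more reasoning and planning capabilities
Add contextual memory (e.g., memory = [])
Build a web interface (Streamlit, Flask, React)
Connect it to your business logic and APIs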
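
As a starting point for the contextual memory item, here is a small sketch that grows the `memory = []` idea into a rolling window of chat messages. The class name `ConversationMemory` and the `max_messages` parameter are assumptions made for illustration; the message dictionaries use the same role/content shape as the calls above.

```python
class ConversationMemory:
    """Keeps a rolling window of chat messages in role/content format."""

    def __init__(self, system_prompt, max_messages=20):
        self.system_prompt = system_prompt
        self.max_messages = max_messages
        self.messages = []

    def add(self, role, content):
        self.messages.append({"role": role, "content": content})
        # Drop the oldest messages once the window is full.
        if len(self.messages) > self.max_messages:
            self.messages = self.messages[-self.max_messages:]

    def as_messages(self):
        # The system prompt always goes first, followed by the stored history.
        return [{"role": "system", "content": self.system_prompt}] + self.messages

if __name__ == "__main__":
    memory = ConversationMemory("You are a helpful assistant.", max_messages=4)
    memory.add("user", "My name is Ada.")
    memory.add("assistant", "Nice to meet you, Ada.")
    memory.add("user", "What is my name?")
    for message in memory.as_messages():
        print(message)
```

Passing `memory.as_messages()` as the `messages` argument of the fallback chat call, and recording each user and assistant turn with `add`, would let the model see earlier turns. The window size trades context length against token cost.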

Original article: Building AI Agents from Scratch -No Frameworks

Translated and compiled by 汇智网; please credit the source when reposting.