Agent Study Notes

An agent is an entity that can perceive its environment, make decisions, and take actions.

Mainstream Agent Design Ideas

CoT

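Chain of Thought (CoT): input -> reasoning chain -> output

Simply adding the sentence "Let's think step by step" to the prompt is enough to make a large model use a chain of thought when it reasons.

The core idea of CoT is to guide the model to think step by step through a series of intermediate reasoning steps, so that it produces more accurate and coherent output. The approach is especially useful for tasks that need complex reasoning, mathematical calculation, or multi-step problem solving.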
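
As a minimal sketch of the zero-shot trigger described above, the helper below only builds the prompt text; it does not call any model. The function name `build_cot_prompt` and the `Q:` / `A:` layout are assumptions made for illustration, not part of any library.

```python
# Zero-shot CoT trigger sentence appended to the prompt.
COT_TRIGGER = "Let's think step by step."


def build_cot_prompt(question: str) -> str:
    """Return a prompt that asks the model to reason before answering.

    The "Q:" / "A:" layout is a hypothetical convention for this sketch.
    """
    return f"Q: {question.strip()}\nA: {COT_TRIGGER}"


prompt = build_cot_prompt("A shop sells pens at 3 for $2. How much do 12 pens cost?")
print(prompt)
```

The returned string would then be sent to whatever model client is in use; the model is expected to continue after the trigger with its intermediate steps.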

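ReAct

ReAct (Reasoning and Acting) is a chain-of-thought pattern that combines reasoning with action. The idea is that, while producing an answer, the model not only works out a chain of reasoning but also takes actions based on the context. By alternating reasoning steps and action steps, ReAct suits open-ended question answering, dialogue, and task-execution applications.

LangChain, AutoGPT, and MetaGPT all implement the ReAct idea as a loop of thought -> action -> action input -> observation.

The classic ReAct prompt structure in LangChain: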

PREFIX = """Answer the following questions as best you can. You have access to the following tools:"""
FORMAT_INSTRUCTIONS = """Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question"""
SUFFIX = """Begin!

Question: {input}
Thought:{agent_scratchpad}"""

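The last line of the prompt is Thought:{agent_scratchpad}. agent_scratchpad holds every thought and action the agent has already produced, so the next thought -> action -> observation cycle can read the full history through it. This is what gives the agent continuity from one step to the next.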
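
To make the role of agent_scratchpad concrete, here is a small sketch of how past steps might be rendered into that slot. It is not LangChain's implementation: the `Step` tuple layout and the `format_scratchpad` helper are hypothetical, and only the SUFFIX template is taken from the prompt above.

```python
from typing import List, Tuple

# Same SUFFIX template as in the prompt above.
SUFFIX = """Begin!

Question: {input}
Thought:{agent_scratchpad}"""

# One past step: (thought, action, action_input, observation).
# This tuple layout is an assumption made for the sketch.
Step = Tuple[str, str, str, str]


def format_scratchpad(steps: List[Step]) -> str:
    """Render past steps as text that ends with an open "Thought:" line."""
    text = ""
    for thought, action, action_input, observation in steps:
        text += f" {thought}\nAction: {action}\nAction Input: {action_input}\n"
        text += f"Observation: {observation}\nThought:"
    return text


steps = [
    ("I need the population of France.", "search",
     "population of France", "About 68 million."),
]
prompt = SUFFIX.format(
    input="How many people live in France?",
    agent_scratchpad=format_scratchpad(steps),
)
print(prompt)
```

With an empty step list the scratchpad is an empty string, so the prompt simply ends with "Thought:". After each step the history grows, and the prompt again ends with an open "Thought:" for the model to continue.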

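General prompt structure of a ReAct agent:

Prefix: introduces the tool descriptions
Format: defines the ReAct agent's output format

Question: the question entered by the user
Thought: the ReAct agent reasons about how to act
Action: the tool to use
Action Input: the input the tool requires
Observation: the result of executing the action
(Repeat the Thought / Action / Observation cycle as needed)

Final Thought: arrives at the final conclusion
Final Answer: the answer to the question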
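
The structure above can be sketched as a small loop. This is an illustration under stated assumptions, not how LangChain, AutoGPT, or MetaGPT implement it: `react_loop`, `add_tool`, and `scripted_llm` are hypothetical names, the prompt omits the prefix and format instructions for brevity, and the "model" is a scripted stand-in that returns canned replies so the example runs without any API.

```python
import re
from typing import Callable, Dict, List

ACTION_RE = re.compile(r"Action:\s*(.+?)\s*\nAction Input:\s*(.*)", re.DOTALL)
FINAL_RE = re.compile(r"Final Answer:\s*(.*)", re.DOTALL)


def react_loop(question: str, llm: Callable[[str], str],
               tools: Dict[str, Callable[[str], str]], max_steps: int = 5) -> str:
    """Alternate model output and tool calls until a Final Answer appears."""
    scratchpad = ""
    for _ in range(max_steps):
        prompt = f"Question: {question}\nThought:{scratchpad}"
        output = llm(prompt)
        final = FINAL_RE.search(output)
        if final:
            return final.group(1).strip()
        match = ACTION_RE.search(output)
        if match is None:
            raise ValueError(f"Could not parse model output: {output!r}")
        tool_name, tool_input = match.group(1).strip(), match.group(2).strip()
        tool = tools.get(tool_name)
        observation = tool(tool_input) if tool else f"Unknown tool: {tool_name}"
        # Append this step so the next prompt contains the full history.
        scratchpad += f"{output}\nObservation: {observation}\nThought:"
    raise RuntimeError("No final answer within max_steps")


def add_tool(expr: str) -> str:
    """Toy tool: sums integers separated by '+'."""
    return str(sum(int(part) for part in expr.split("+")))


# Scripted stand-in for a real model; it also records the prompts it receives.
replies: List[str] = [
    " I should add the numbers.\nAction: add\nAction Input: 2+3",
    " I now know the final answer\nFinal Answer: 5",
]
seen_prompts: List[str] = []


def scripted_llm(prompt: str) -> str:
    seen_prompts.append(prompt)
    return replies[len(seen_prompts) - 1]


answer = react_loop("What is 2 + 3?", scripted_llm, {"add": add_tool})
print(answer)  # prints 5
```

The second prompt the scripted model receives already contains "Observation: 5", which is the continuity that the scratchpad provides. The `max_steps` cap is a design choice for this sketch so that a model that never emits a final answer cannot loop forever.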
