GLM-4.6 Supports Thinking and Chain-of-Thought Passback
Enabling Thinking for GLM-4.6 in Claude Code
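GLM has supported Claude Code since version 4.5, and I have been following it ever since. Many users have reported that they cannot enable thinking in Claude Code. I recently received sponsorship from Zhipu, so I set out to investigate.
First, according to the official documentation, the /chat/completions endpoint has thinking enabled by default, but the model itself decides whether it needs to think: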
thinking object
Only GLM-4.5 and later models support this parameter. Controls whether the model enables chain-of-thought.
thinking.type enum<string> default:enabled
Whether to enable chain-of-thought (when enabled, GLM-4.6 and GLM-4.5 decide automatically whether to think, while GLM-4.5V always thinks). Default: enabled.
Available options: enabled, disabled
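To make the parameter concrete, here is a minimal sketch of a request body that sets `thinking.type` explicitly. Only the `thinking` object comes from the documentation quoted above. The `buildChatBody` helper, the simplified message type, and the `"glm-4.6"` model id are assumptions made for illustration, not part of any official SDK.

```typescript
// Hypothetical, simplified request shape. Only `thinking` is taken from the
// quoted documentation; everything else is an illustrative assumption.
type ThinkingType = "enabled" | "disabled";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatCompletionBody {
  model: string;
  messages: ChatMessage[];
  thinking: { type: ThinkingType };
}

// Build a /chat/completions body with the thinking switch set explicitly.
function buildChatBody(
  model: string,
  messages: ChatMessage[],
  thinkingType: ThinkingType = "enabled" // mirrors the documented default
): ChatCompletionBody {
  return { model, messages, thinking: { type: thinkingType } };
}

// "glm-4.6" is an assumed model id used only for this example.
const exampleBody = buildChatBody("glm-4.6", [
  { role: "user", content: "Why is the sky blue?" },
]);
console.log(JSON.stringify(exampleBody, null, 2));
```

Note that for GLM-4.6 and GLM-4.5, `enabled` only permits thinking. Whether the model actually thinks is still its own decision, which is the root of the problem described next.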
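Claude Code sends a large volume of its own prompts, and this interference severely hampers the GLM model's own judgment about when to think, so the model rarely thinks. We therefore need to guide the model into believing that thinking is required. However, claude-code-router is only a proxy, so all it can do is modify the prompts and parameters.
At first I tried simply deleting Claude Code's system prompt. The model did think, but without that prompt Claude Code could no longer be driven. So instead we inject a prompt that explicitly tells the model it needs to think.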
// transformer.ts
import { UnifiedChatRequest } from "../types/llm";
import { Transformer } from "../types/transformer";

// Instruction injected into the request to push the model to always reason.
const REASONING_PROMPT =
  "You are an expert reasoning model. \nAlways think step by step before answering. Even if the problem seems simple, always write down your reasoning process explicitly. \nNever skip your chain of thought. \nUse the following output format:\n<reasoning_content>(Write your full detailed thinking here.)</reasoning_content>\n\nWrite your final conclusion here.";

export class ForceReasoningTransformer implements Transformer {
  name = "forcereasoning";

  async transformRequestIn(
    request: UnifiedChatRequest
  ): Promise<UnifiedChatRequest> {
    // Append the instruction to the system prompt (array-style content only).
    const systemMessage = request.messages.find(
      (item) => item.role === "system"
    );
    if (Array.isArray(systemMessage?.content)) {
      systemMessage.content.push({ type: "text", text: REASONING_PROMPT });
    }

    const lastMessage = request.messages[request.messages.length - 1];
    // Repeat the instruction at the end of the latest user turn.
    if (lastMessage?.role === "user" && Array.isArray(lastMessage.content)) {
      lastMessage.content.push({ type: "text", text: REASONING_PROMPT });
    }
    // After a tool result, add a new user message carrying the instruction.
    if (lastMessage?.role === "tool") {
      request.messages.push({
        role: "user",
        content: [{ type: "text", text: REASONING_PROMPT }],
      });
    }
    return request;
  }
}
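There are two reasons for having the model put its thinking inside a reasoning_content tag rather than a think tag:
- Using the think tag directly does not activate thinking very well. My guess is that the think tag was part of the dataset the model was trained on.
- If the think tag is used, the model's reasoning content gets split off into a separate field, which brings us to the chain-of-thought passback problem discussed next.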
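With the reasoning_content tag, the reasoning stays inline in the text the model returns, following the output format requested by the injected prompt. The sketch below shows one way such text could be split back into reasoning and final answer. The `splitReasoning` helper and the `SplitOutput` type are hypothetical names introduced for this example, not claude-code-router code.

```typescript
// Hypothetical helper: separates the inline reasoning from the final answer.
interface SplitOutput {
  reasoning: string | null; // null when no reasoning_content tag is present
  answer: string;
}

function splitReasoning(text: string): SplitOutput {
  // Non-greedy match so only the first tagged block is captured.
  const match = text.match(
    /<reasoning_content>([\s\S]*?)<\/reasoning_content>/
  );
  if (!match) {
    return { reasoning: null, answer: text.trim() };
  }
  const start = match.index ?? 0;
  // The answer is everything outside the tagged block.
  const answer = (
    text.slice(0, start) + text.slice(start + match[0].length)
  ).trim();
  return { reasoning: match[1].trim(), answer };
}

const sample =
  "<reasoning_content>The user wants a file list, so I should call ls.</reasoning_content>\n\nI will list the files.";
console.log(splitReasoning(sample));
```

The sketch assumes at most one tagged block per response. A model that ignores the requested format simply yields `reasoning: null`, with the whole text treated as the answer.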