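Using MiMo's thinking.type

MiMo's reasoning parameter is thinking.type. Xiaomi's documentation describes only two values for it: enabled and disabled. If you want to try a field such as reasoning_effort, first send requests to the target model and compare quality, latency, and token usage; add it to your production configuration only after confirming that it has an effect.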

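Confirm that this article applies to your target MiMo model

First confirm that the target model still accepts thinking.type. This article applies to the MiMo models with thinking capability offered on GenStudio, such as mimo-v2.5-pro.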

Set thinking.type for the current request

To enable thinking for the current request, use thinking.type: "enabled". The following Python example uses the OpenAI SDK.

language-python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["API_KEY"],
    base_url="https://cloud.infini-ai.com/maas/v1",
)

response = client.chat.completions.create(
    model="mimo-v2.5-pro",
    messages=[{"role": "user", "content": "What is 2 + 2? Give only the final answer."}],
    max_tokens=256,
    extra_body={"thinking": {"type": "enabled"}},
)

To disable thinking, use:

language-python

extra_body = {"thinking": {"type": "disabled"}}

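If reasoning fields still appear after you disable thinking, first check whether the target model enables thinking by default or enforces it.

Verify the thinking.type request body with curl

Use curl first to confirm that MiMo's thinking.type is passed in the request body. Before running the command, set the API_KEY environment variable in the current terminal. The curl command below works in POSIX-style shells such as bash and zsh (macOS/Linux, WSL, Git Bash). If you use Windows PowerShell or CMD, adjust the command to that shell's syntax.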

language-shell

curl --request POST \
  --url "https://cloud.infini-ai.com/maas/v1/chat/completions" \
  --header "Accept: application/json, text/event-stream" \
  --header "Authorization: Bearer $API_KEY" \
  --header "Content-Type: application/json" \
  --data-raw '{
    "model": "mimo-v2.5-pro",
    "messages": [
      {
        "role": "user",
        "content": "What is 2 + 2? Give only the final answer."
      }
    ],
    "max_tokens": 256,
    "thinking": {
      "type": "enabled"
    }
  }'

Read MiMo's reasoning_content

When MiMo returns visible reasoning, read it from reasoning_content. The SDK's types may not declare this field, so access it dynamically.

language-python

message = response.choices[0].message
reasoning = getattr(message, "reasoning_content", None)

if reasoning:
    print(reasoning)
print(message.content)

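In streaming responses, read delta.reasoning_content.

Preserve reasoning_content after tool calls

In agent or tool-calling flows, if the assistant message carries both tool_calls and reasoning_content, send both back when you continue the conversation.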
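The streaming case can be sketched as a small accumulator. This is a minimal sketch, not the platform's reference code: collect_stream is a hypothetical helper name, and the stand-in chunks only imitate the shape of OpenAI SDK stream chunks so the logic can be run without a network call.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Accumulate reasoning text and answer text from streamed chat chunks."""
    reasoning_parts = []
    content_parts = []
    for chunk in chunks:
        # Some chunks (for example a trailing usage chunk) may carry no choices.
        if not getattr(chunk, "choices", None):
            continue
        delta = chunk.choices[0].delta
        # The SDK types may not declare reasoning_content, so read it dynamically.
        reasoning = getattr(delta, "reasoning_content", None)
        if reasoning:
            reasoning_parts.append(reasoning)
        content = getattr(delta, "content", None)
        if content:
            content_parts.append(content)
    return "".join(reasoning_parts), "".join(content_parts)


def _chunk(**delta_fields):
    # Hypothetical stand-in that mimics the attribute layout of a stream chunk.
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(**delta_fields))])


fake_chunks = [
    _chunk(reasoning_content="2 + 2 ", content=None),
    _chunk(reasoning_content="is 4.", content=None),
    _chunk(content="4"),
    SimpleNamespace(choices=[]),
]
reasoning_text, answer_text = collect_stream(fake_chunks)
print(reasoning_text)
print(answer_text)
```

With a real client, the same helper would presumably receive the iterator returned by client.chat.completions.create(..., stream=True, extra_body={"thinking": {"type": "enabled"}}).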

language-python

assistant_message = {
    "role": "assistant",
    "content": response.choices[0].message.content or "",
    "tool_calls": [
        {
            "id": tool_call.id,
            "type": "function",
            "function": {
                "name": tool_call.function.name,
                "arguments": tool_call.function.arguments,
            },
        }
        for tool_call in response.choices[0].message.tool_calls
    ],
}

reasoning = getattr(response.choices[0].message, "reasoning_content", None)
if reasoning:
    assistant_message["reasoning_content"] = reasoning

messages.append(assistant_message)

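Do not send back only the tool call arguments while dropping the reasoning field from the same assistant message.

Treat reasoning_effort as an acceptance check

Xiaomi's official documentation does not describe reasoning_effort. Current testing shows that mimo-v2.5-pro does not return a 400 when the value is low, medium, or high; this shows only that the request is accepted, not that the field changes quality, latency, or token usage.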

language-python

extra_body = {
    "thinking": {"type": "enabled"},
    "reasoning_effort": "high",
}

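Expose it as a production setting only when repeated requests against the target model and target task show an explainable difference in quality, latency, or token usage.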
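One way to gather that evidence is a small comparison harness. The sketch below is an assumption-laden illustration: compare_effort and fake_send are hypothetical names, the token counts in fake_send are made-up placeholders rather than measurements, and quality still has to be judged separately on your own task.

```python
import time
from statistics import mean
from types import SimpleNamespace


def compare_effort(send, efforts=("low", "medium", "high"), repeats=3):
    """Return mean latency and mean completion tokens per reasoning_effort value.

    `send` is any callable that takes an extra_body dict and returns a response
    object exposing usage.completion_tokens.
    """
    summary = {}
    for effort in efforts:
        latencies, tokens = [], []
        for _ in range(repeats):
            extra_body = {"thinking": {"type": "enabled"}, "reasoning_effort": effort}
            start = time.perf_counter()
            response = send(extra_body)
            latencies.append(time.perf_counter() - start)
            tokens.append(response.usage.completion_tokens)
        summary[effort] = {
            "mean_latency_s": mean(latencies),
            "mean_completion_tokens": mean(tokens),
        }
    return summary


def fake_send(extra_body):
    # Stand-in for a real request; the numbers are placeholders, not measurements.
    sizes = {"low": 10, "medium": 20, "high": 30}
    tokens = sizes[extra_body["reasoning_effort"]]
    return SimpleNamespace(usage=SimpleNamespace(completion_tokens=tokens))


summary = compare_effort(fake_send, repeats=2)
for effort, stats in summary.items():
    print(effort, stats)
```

In real use, `send` could wrap client.chat.completions.create with a fixed model, prompt, and max_tokens so that only reasoning_effort varies between runs.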

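Fix 400 errors caused by missing fields in tool call history

A 400 after a tool call usually comes from an incomplete message history. When troubleshooting, check the following:

Whether the tool_calls of the previous assistant message are complete.
Whether the tool message's tool_call_id matches the previous assistant message.
Whether reasoning_content, if the previous assistant message returned it, was preserved during serialization.
Whether hand-written reasoning was mistakenly added for a turn where thinking was disabled and no reasoning field was returned.

Fix the message history first, then adjust the prompt or model parameters.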
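The first two checks can be automated before a request is sent. The following is a rough sketch under stated assumptions: find_history_problems is a hypothetical helper, get_weather is an invented tool name, and messages are assumed to be plain dicts in the chat completions format. It cannot tell whether reasoning_content was dropped, because that requires comparing against the original response.

```python
def find_history_problems(messages):
    """Return human-readable problems found in a list of chat message dicts."""
    problems = []
    pending_ids = set()  # tool_call ids from the latest assistant message
    for index, message in enumerate(messages):
        role = message.get("role")
        if role == "assistant":
            pending_ids = set()
            for call in message.get("tool_calls") or []:
                function = call.get("function") or {}
                if not call.get("id") or not function.get("name"):
                    problems.append(f"message {index}: incomplete tool_call")
                else:
                    pending_ids.add(call["id"])
        elif role == "tool":
            call_id = message.get("tool_call_id")
            if call_id in pending_ids:
                pending_ids.discard(call_id)
            else:
                problems.append(
                    f"message {index}: tool_call_id {call_id!r} has no matching tool_call"
                )
    return problems


history = [
    {"role": "user", "content": "What is the weather in Beijing?"},
    {
        "role": "assistant",
        "content": "",
        "reasoning_content": "I should call the weather tool.",
        "tool_calls": [
            {
                "id": "call_1",
                "type": "function",
                # get_weather is a hypothetical tool used only for illustration.
                "function": {"name": "get_weather", "arguments": "{\"city\": \"Beijing\"}"},
            }
        ],
    },
    # Deliberately wrong id to show what the check reports.
    {"role": "tool", "tool_call_id": "call_2", "content": "sunny"},
]
problems = find_history_problems(history)
print(problems)
```

Running a check like this before each follow-up request makes it easier to separate history bugs from prompt or parameter issues.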
