
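Large Language Model API Service Tutorial

This guide shows how to call GenStudio's preset large language model API service with common tools.

TIP

GenStudio also supports deploying models to dedicated instances to provide a private API service. Note that the API domain of a self-deployed model service differs from the public API domain provided by the platform. See Deploying Model Services for details.

OpenAI API compatibility

The GenStudio large language model API service provides an endpoint that implements OpenAI's /v1/chat/completions API.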

https://cloud.infini-ai.com/maas/v1/chat/completions

NOTE

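For details on the API endpoint's paths and parameters, see the Large Language Model API Reference.

Prerequisites

Using curl

You can send API requests directly with the curl commands in the examples below.

TIP

Replace $API_KEY with the API key you obtained.

Verify a single-turn conversation (non-streaming)

The following request starts a single-turn conversation. It does not set the stream parameter, so the API service uses its default response mode (non-streaming). The example loads the API key and base URL from environment variables.

  • API_KEY: GenStudio API key
  • DEFAULT_BASE_URL: GenStudio API URL; use https://cloud.infini-ai.com/maas/v1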
shell
curl "${DEFAULT_BASE_URL}/chat/completions" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $API_KEY" \
    -d '{
         "model": "megrez-3b-instruct",
         "messages": [
            { "role": "user", "content": "你是谁?" }
          ]
     }'
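
For readers who prefer a script to the command line, the same request can be assembled with the Python standard library. This is a minimal sketch, not an official client: the helper name build_chat_request is hypothetical, and it assumes the API_KEY and DEFAULT_BASE_URL environment variables described above. The request is only built here, not sent.

```python
import json
import os
import urllib.request

# Assumed environment variables, mirroring the curl example.
API_KEY = os.getenv("API_KEY", "")
BASE_URL = os.getenv("DEFAULT_BASE_URL", "https://cloud.infini-ai.com/maas/v1")


def build_chat_request(base_url: str, api_key: str, model: str,
                       messages: list) -> urllib.request.Request:
    """Build (but do not send) a POST request for the chat completions endpoint."""
    # The base URL already ends in /v1, so only /chat/completions is appended.
    url = base_url.rstrip("/") + "/chat/completions"
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_chat_request(
    BASE_URL, API_KEY, "megrez-3b-instruct",
    [{"role": "user", "content": "你是谁?"}],
)
print(request.full_url)
# To actually send it: urllib.request.urlopen(request).read()
```

Keeping the path join in one helper avoids accidentally repeating the /v1 segment that the base URL already contains.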

A single-turn conversation can also carry a system message, for example:

json
"messages": [
    { "role": "system", "content": "请以嘲讽语气回答" },
    { "role": "user", "content": "你是谁?" }
]

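The API service uses non-streaming responses by default. On success, the generated content is returned all at once in a single JSON response body.

An example response body: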

json
{
    "id": "chatcmpl-n5McEDBxBdxNDbx2CA8Rz8",
    "object": "chat.completion",
    "created": 1708497105,
    "model": "qwen2.5-7b-instruct",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "我是来自阿里云的大规模语言模型。"
            },
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 22,
        "total_tokens": 39,
        "completion_tokens": 17
    }
}
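
As an illustration of how a client might read this body, the sketch below parses the example response with the standard json module. The helper names extract_reply and usage_adds_up are hypothetical, and the check that prompt_tokens plus completion_tokens equals total_tokens is an assumption based on the usual OpenAI-style accounting rather than something this page states.

```python
import json

# Response body copied from the example above.
RAW_RESPONSE = """
{
    "id": "chatcmpl-n5McEDBxBdxNDbx2CA8Rz8",
    "object": "chat.completion",
    "created": 1708497105,
    "model": "qwen2.5-7b-instruct",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "我是来自阿里云的大规模语言模型。"},
            "finish_reason": "stop"
        }
    ],
    "usage": {"prompt_tokens": 22, "total_tokens": 39, "completion_tokens": 17}
}
"""


def extract_reply(response: dict) -> str:
    """Return the assistant text of the first choice."""
    return response["choices"][0]["message"]["content"]


def usage_adds_up(response: dict) -> bool:
    """Check the assumed relation prompt + completion == total."""
    usage = response["usage"]
    return usage["prompt_tokens"] + usage["completion_tokens"] == usage["total_tokens"]


response = json.loads(RAW_RESPONSE)
print(extract_reply(response))
print(response["choices"][0]["finish_reason"], usage_adds_up(response))
```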

IMPORTANT

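  • If the text fails content moderation, a blocked field with the value true is added and no further output is returned.
  • If the text passes content moderation, no blocked field is present and the response continues normally.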
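
A client can guard against moderated output by checking for this field before using a response. The sketch below is only an illustration: the notes above do not say where the blocked field sits, so it assumes the field appears at the top level of each response object, and the helper names is_blocked and take_until_blocked are hypothetical.

```python
from typing import Iterable, List


def is_blocked(payload: dict) -> bool:
    """True when the payload carries blocked == true.

    Assumption: the field is at the top level of the response object.
    """
    return payload.get("blocked") is True


def take_until_blocked(payloads: Iterable[dict]) -> List[dict]:
    """Keep payloads up to, but not including, the first blocked one."""
    accepted = []
    for payload in payloads:
        if is_blocked(payload):
            break  # Nothing after a blocked payload should be used.
        accepted.append(payload)
    return accepted


chunks = [{"id": "a"}, {"id": "b", "blocked": True}, {"id": "c"}]
print(len(take_until_blocked(chunks)))
```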

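Verify a single-turn conversation (streaming)

The following example explicitly sets the stream parameter to true, so the API service returns the content as a streaming response. The example loads the API key and base URL from environment variables.

  • API_KEY: GenStudio API key
  • DEFAULT_BASE_URL: GenStudio API URL; use https://cloud.infini-ai.com/maas/v1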
shell
curl "${DEFAULT_BASE_URL}/chat/completions" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $API_KEY" \
    -d '{
         "model": "megrez-3b-instruct",
         "stream": true,
         "messages": [
            { "role": "user", "content": "你是谁?" }
          ]
     }'

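In streaming mode, on success the generated content is returned incrementally as server-sent events (SSE), one chunk at a time.

An example response body: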

json
{"id": "chatcmpl-3HjYf888MzQ6XAHADiPanf", "model": "megrez-3b-instruct", "choices": [{"index": 0, "delta": {"role": "assistant"}, "finish_reason": null}]}

{"id": "chatcmpl-3HjYf888MzQ6XAHADiPanf", "model": "megrez-3b-instruct", "choices": [{"index": 0, "delta": {"content": "我是"}, "finish_reason": null}]}

{"id": "chatcmpl-3HjYf888MzQ6XAHADiPanf", "model": "megrez-3b-instruct", "choices": [{"index": 0, "delta": {"content": "来自"}, "finish_reason": null}]}

{"id": "chatcmpl-3HjYf888MzQ6XAHADiPanf", "model": "megrez-3b-instruct", "choices": [{"index": 0, "delta": {"content": "无问"}, "finish_reason": null}]}

{"id": "chatcmpl-3HjYf888MzQ6XAHADiPanf", "model": "megrez-3b-instruct", "choices": [{"index": 0, "delta": {"content": "芯穹"}, "finish_reason": null}]}

{"id": "chatcmpl-3HjYf888MzQ6XAHADiPanf", "model": "megrez-3b-instruct", "choices": [{"index": 0, "delta": {"content": "的"}, "finish_reason": null}]}

{"id": "chatcmpl-3HjYf888MzQ6XAHADiPanf", "model": "megrez-3b-instruct", "choices": [{"index": 0, "delta": {"content": "超"}, "finish_reason": null}]}

{"id": "chatcmpl-3HjYf888MzQ6XAHADiPanf", "model": "megrez-3b-instruct", "choices": [{"index": 0, "delta": {"content": "大规模"}, "finish_reason": null}]}

{"id": "chatcmpl-3HjYf888MzQ6XAHADiPanf", "model": "megrez-3b-instruct", "choices": [{"index": 0, "delta": {"content": "语言"}, "finish_reason": null}]}

{"id": "chatcmpl-3HjYf888MzQ6XAHADiPanf", "model": "megrez-3b-instruct", "choices": [{"index": 0, "delta": {"content": "模型"}, "finish_reason": null}]}

{"id": "chatcmpl-3HjYf888MzQ6XAHADiPanf", "model": "megrez-3b-instruct", "choices": [{"index": 0, "delta": {"content": ","}, "finish_reason": null}]}

{"id": "chatcmpl-3HjYf888MzQ6XAHADiPanf", "model": "megrez-3b-instruct", "choices": [{"index": 0, "delta": {"content": "我"}, "finish_reason": null}]}

{"id": "chatcmpl-3HjYf888MzQ6XAHADiPanf", "model": "megrez-3b-instruct", "choices": [{"index": 0, "delta": {"content": "叫"}, "finish_reason": null}]}

{"id": "chatcmpl-3HjYf888MzQ6XAHADiPanf", "model": "megrez-3b-instruct", "choices": [{"index": 0, "delta": {"content": "无"}, "finish_reason": null}]}

{"id": "chatcmpl-3HjYf888MzQ6XAHADiPanf", "model": "megrez-3b-instruct", "choices": [{"index": 0, "delta": {"content": "问"}, "finish_reason": null}]}

{"id": "chatcmpl-3HjYf888MzQ6XAHADiPanf", "model": "megrez-3b-instruct", "choices": [{"index": 0, "delta": {"content": "天"}, "finish_reason": null}]}

{"id": "chatcmpl-3HjYf888MzQ6XAHADiPanf", "model": "megrez-3b-instruct", "choices": [{"index": 0, "delta": {"content": "权"}, "finish_reason": null}]}

{"id": "chatcmpl-3HjYf888MzQ6XAHADiPanf", "model": "megrez-3b-instruct", "choices": [{"index": 0, "delta": {"content": "。"}, "finish_reason": null}]}

{"id": "chatcmpl-3HjYf888MzQ6XAHADiPanf", "object": "chat.completion.chunk", "created": 1708486029, "model": "megrez-3b-instruct", "choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}], "usage": {"prompt_tokens": 22, "total_tokens": 40, "completion_tokens": 18}}
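
To turn chunks like these back into one reply, a client concatenates the delta.content values in order. The sketch below shows one way to do that; the helper name join_stream is hypothetical. It also tolerates a "data:" prefix and a final "[DONE]" marker, which is an assumption borrowed from the OpenAI streaming convention and is not shown in the sample above.

```python
import json
from typing import Iterable, Optional, Tuple


def join_stream(lines: Iterable[str]) -> Tuple[str, Optional[str]]:
    """Concatenate delta.content across chunk lines; return (text, finish_reason)."""
    parts = []
    finish_reason = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # Blank separator lines between chunks.
        if line.startswith("data:"):  # Assumed SSE framing, see lead-in.
            line = line[len("data:"):].strip()
        if line == "[DONE]":  # Assumed end-of-stream marker.
            break
        choice = json.loads(line)["choices"][0]
        # The first chunk carries only the role and the last an empty delta.
        parts.append(choice["delta"].get("content") or "")
        finish_reason = choice.get("finish_reason") or finish_reason
    return "".join(parts), finish_reason


sample = [
    '{"id": "x", "model": "megrez-3b-instruct", "choices": [{"index": 0, "delta": {"role": "assistant"}, "finish_reason": null}]}',
    '',
    '{"id": "x", "model": "megrez-3b-instruct", "choices": [{"index": 0, "delta": {"content": "我是"}, "finish_reason": null}]}',
    '{"id": "x", "model": "megrez-3b-instruct", "choices": [{"index": 0, "delta": {"content": "来自"}, "finish_reason": null}]}',
    '{"id": "x", "model": "megrez-3b-instruct", "choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}]}',
]
text, reason = join_stream(sample)
print(text, reason)
```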

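Verify a multi-turn conversation

The API service accepts multi-turn conversation requests. One user message plus one assistant message counts as one turn (a system message may also be included).

The following example shows a multi-turn conversation request. It loads the API key and base URL from environment variables.

  • API_KEY: GenStudio API key
  • DEFAULT_BASE_URL: GenStudio API URL; use https://cloud.infini-ai.com/maas/v1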
shell
curl "${DEFAULT_BASE_URL}/chat/completions" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $API_KEY" \
    -d '{
        "model": "megrez-3b-instruct",
        "messages": [
            { "role": "user", "content": "你是谁?" },
            { "role": "assistant", "content": "我是大模型回答助手" },
            { "role": "user", "content": "你能做什么?" }
        ]
     }'
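
Because every request must carry the earlier turns, client code usually keeps a running history and appends to it. The following is a minimal sketch of that bookkeeping; the helper names add_turn and build_messages are hypothetical and not part of any SDK.

```python
from typing import List, Optional

Message = dict  # {"role": ..., "content": ...}


def add_turn(history: List[Message], user_text: str, assistant_text: str) -> List[Message]:
    """Return a new history with one completed turn (user + assistant) appended."""
    return history + [
        {"role": "user", "content": user_text},
        {"role": "assistant", "content": assistant_text},
    ]


def build_messages(history: List[Message], next_user_text: str,
                   system_text: Optional[str] = None) -> List[Message]:
    """Assemble the messages array: optional system message, past turns, new question."""
    messages = []
    if system_text:
        messages.append({"role": "system", "content": system_text})
    messages.extend(history)
    messages.append({"role": "user", "content": next_user_text})
    return messages


history = add_turn([], "你是谁?", "我是大模型回答助手")
messages = build_messages(history, "你能做什么?")
print(len(messages))
```

The resulting messages list matches the array sent in the curl request above and can be passed as the "messages" field of the request body.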

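Using the OpenAI Python SDK

The 无问芯穹 large model API service can be called through the official OpenAI Python SDK.

Initialize the client

The GenStudio API service provides an endpoint that implements OpenAI's /v1/chat/completions API, so you can connect to it with the OpenAI Python client.

  • GENSTUDIO_API_KEY: GenStudio API key
  • DEFAULT_BASE_URL: when using the default endpoint, https://cloud.infini-ai.com/maas/v1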
python
import os
from openai import OpenAI

API_KEY = os.getenv("GENSTUDIO_API_KEY")
DEFAULT_BASE_URL = os.getenv("DEFAULT_BASE_URL")  

client = OpenAI(api_key=API_KEY, base_url=DEFAULT_BASE_URL)
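
Since os.getenv returns None for an unset variable, it can help to validate the configuration before creating the client. The sketch below is one possible approach: the helper name load_settings and the FALLBACK_BASE_URL constant are hypothetical, and the fallback value is simply the public endpoint quoted in this tutorial.

```python
import os
from typing import Mapping

# Public endpoint quoted in this tutorial, used when DEFAULT_BASE_URL is unset.
FALLBACK_BASE_URL = "https://cloud.infini-ai.com/maas/v1"


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Return keyword arguments for OpenAI(...), failing early on a missing key."""
    api_key = env.get("GENSTUDIO_API_KEY")
    if not api_key:
        raise RuntimeError("GENSTUDIO_API_KEY is not set")
    base_url = (env.get("DEFAULT_BASE_URL") or FALLBACK_BASE_URL).rstrip("/")
    return {"api_key": api_key, "base_url": base_url}


# Demonstration with an explicit mapping instead of the real environment.
settings = load_settings({"GENSTUDIO_API_KEY": "sk-example"})
print(settings["base_url"])
# With the SDK installed: client = OpenAI(**load_settings())
```

Failing early gives a clearer error than a rejected request later, and an unset DEFAULT_BASE_URL no longer leaves the client pointing at its built-in default endpoint.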

Verification 1 (streaming)

python
stream = client.chat.completions.create(
    model="qwen1.5-72b-chat",
    messages=[{"role": "user", "content": "根据中国古代圣人孔子的思想,人生的意义是什么?"}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")

Example output:

孔子认为,人生的意义在于实现“仁”,即以仁爱之心对待他人,追求道德完善,以及实现社会和谐。他强调“修身、齐家、治国、平天下”,认为一个人应该首先修养自身,然后才能管理好家庭,进一步治理好国家,最终达到天下和平。此外,孔子也重视学习和知识,他认为“学而时习之,不亦说乎?”通过不断学习和实践,可以提升自我,接近人生的意义。

Verification 2 (non-streaming)

python
chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "say '谁是卧底。'",
        }
    ],
    model="qwen1.5-72b-chat",
)

print(chat_completion.choices[0].message.content)

Example output:

谁是卧底?

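Using LangChain's OpenAI plugin

The GenStudio large model API service can be called through LangChain's OpenAI plugin.

Verification 1 (streaming)

The following example loads the API base URL and API key from environment variables.

  • GENSTUDIO_API_KEY: GenStudio API key
  • DEFAULT_BASE_URL: the default GenStudio API endpoint, https://cloud.infini-ai.com/maas/v1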
python
import os
from typing import Any

# BaseCallbackHandler is provided by langchain_core, a dependency of langchain_openai
from langchain_core.callbacks import BaseCallbackHandler
from langchain_openai import ChatOpenAI

# Load the API key and base URL from environment variables
API_KEY = os.getenv("GENSTUDIO_API_KEY")
DEFAULT_BASE_URL = os.getenv("DEFAULT_BASE_URL")

# Define a callback handler to process streaming tokens
class StreamHandler(BaseCallbackHandler):
    def on_llm_new_token(self, token: str, **kwargs: Any) -> None:
        print(token, end="", flush=True)

# Initialize the ChatOpenAI model with streaming enabled
llm_streaming = ChatOpenAI(
    openai_api_key=API_KEY, 
    openai_api_base=DEFAULT_BASE_URL,
    streaming=True,
    callbacks=[StreamHandler()]  # Pass the callback handler
)

# Define your messages
messages = [
    {"role": "system", "content": "You are a pedantic ancient Chinese scholar, who always answers in Simplified Chinese."},
    {"role": "user", "content": "Tell me a joke."}
]

# Get a response from the chat model
response = llm_streaming.invoke(input=messages)

# Output the response (optional, as it is already printed by the callback)
print("\n\n***********\n\n以下是 LLM 的完整回复:\n\n", response)

Example output:

有一只深海鱼,每天都自由地游来游去,但它却一点也不开心。因为它压力很大。

***********

以下是 LLM 的完整回复:

content='有一只深海鱼,每天都自由地游来游去,但它却一点也不开心。因为它压力很大。' response_metadata={'finish_reason': 'stop'} id='run-25c3c02b-f027-491b-af88-f3eec5058760-0'

Verification 2 (non-streaming)

python
from langchain_openai import ChatOpenAI

# Initialize the ChatOpenAI model with streaming disabled
llm_non_streaming = ChatOpenAI(
    openai_api_key=API_KEY, 
    openai_api_base=DEFAULT_BASE_URL,
    streaming=False
)

# Define your messages
messages = [
    {"role": "system", "content": "You are a pedantic ancient Chinese scholar, who always answers in Simplified Chinese."},
    {"role": "user", "content": "Tell me a joke."}
]

# Get a response from the chat model
response = llm_non_streaming.invoke(input=messages)

# Print the full response (nothing has been printed yet in non-streaming mode)
print("\n\n***********\n\n以下是 LLM 的完整回复:\n\n", response)

Example output:

***********
    
以下是 LLM 的完整回复:

content='有一只深海鱼,每天都自由地游来游去,但它却一点也不开心。因为它压力很大。' response_metadata={'token_usage': {'completion_tokens': 24, 'prompt_tokens': 36, 'total_tokens': 60}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None} id='run-48bb7543-1312-4f94-8313-dc3b18b2d87e-0'