API Reference
Chat Completions
Create AI-powered chat completions with blazing-fast response times.
POST https://api.infe.io/v1/chat/completions
Basic Example
```shell
curl https://api.infe.io/v1/chat/completions \
  -H "Authorization: Bearer $INFE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "infe-pulse",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model ID (e.g., infe-pulse) |
| messages | array | Yes | List of messages in the conversation |
| stream | boolean | No | Enable streaming responses. Default: false |
| temperature | number | No | Sampling temperature (0-2). Default: 1 |
| max_tokens | integer | No | Maximum tokens to generate |
| top_p | number | No | Nucleus sampling probability. Default: 1 |
| stop | string or array | No | Stop sequences |
| tools | array | No | List of tools/functions the model can call |
| response_format | object | No | Force JSON output with {"type": "json_object"} |
| seed | integer | No | Seed for deterministic generation |
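The optional parameters above can be combined in a single request body. A minimal sketch of such a body follows; the model name, prompt, and the specific parameter values are illustrative, not prescribed:

```python
import json

# Sketch of a request body combining several optional parameters.
payload = {
    "model": "infe-pulse",
    "messages": [{"role": "user", "content": "Write a haiku about the sea"}],
    "temperature": 0.7,   # lower values make output more deterministic
    "max_tokens": 64,     # cap on generated tokens
    "top_p": 0.9,         # nucleus sampling cutoff
    "stop": ["\n\n"],     # stop at the first blank line
    "seed": 42,           # request best-effort determinism
}

body = json.dumps(payload)  # this JSON string is what gets POSTed
```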
Message Object
Each message in the messages array has this structure:
| Field | Type | Description |
|---|---|---|
| role | string | system, user, assistant, or tool |
| content | string | The message content |
| name | string | Optional name for the participant |
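Putting the fields above together, a multi-turn conversation can be sketched as a list of message objects; the participant name and conversation content here are illustrative:

```python
# Sketch of a multi-turn messages array using the fields above.
messages = [
    {"role": "system", "content": "You are a quiz host."},
    {"role": "user", "name": "alice", "content": "Ask me a geography question."},
    {"role": "assistant", "content": "What is the capital of France?"},
    {"role": "user", "name": "alice", "content": "Paris."},
]

# Every message must carry one of the four valid roles.
valid_roles = {"system", "user", "assistant", "tool"}
assert all(m["role"] in valid_roles for m in messages)
```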
Streaming (Recommended)
Streaming delivers responses as they're generated. This is where Infe's speed advantage is most visible—instant response as tokens stream in.
```python
from openai import OpenAI

client = OpenAI(
    api_key="your-infe-api-key",
    base_url="https://api.infe.io/v1"
)

stream = client.chat.completions.create(
    model="infe-pulse",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
)

for chunk in stream:
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="", flush=True)
```
Tool Calling
Let the model call functions in your application:
```python
from openai import OpenAI

client = OpenAI(
    api_key="your-infe-api-key",
    base_url="https://api.infe.io/v1"
)

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather in a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City name"
                    }
                },
                "required": ["location"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="infe-titan",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=tools
)

# Check if the model wants to call a function
if response.choices[0].message.tool_calls:
    tool_call = response.choices[0].message.tool_calls[0]
    print(f"Function: {tool_call.function.name}")
    print(f"Arguments: {tool_call.function.arguments}")
```
JSON Mode
Force the model to output valid JSON:
```python
import json

response = client.chat.completions.create(
    model="infe-pulse",
    messages=[
        {"role": "system", "content": "Output valid JSON only."},
        {"role": "user", "content": "List 3 colors with hex codes"}
    ],
    response_format={"type": "json_object"}
)

data = json.loads(response.choices[0].message.content)
print(data)
```
Response
```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1705369200,
  "model": "infe-pulse",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 8,
    "total_tokens": 33
  }
}
```
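A minimal sketch of reading the fields of a response like the one above, assuming the raw HTTP body has been parsed as JSON (the SDK exposes the same fields as object attributes):

```python
import json

# The example response from this section, as a raw JSON string.
raw = """{"id": "chatcmpl-abc123", "object": "chat.completion",
"created": 1705369200, "model": "infe-pulse",
"choices": [{"index": 0, "message": {"role": "assistant",
"content": "The capital of France is Paris."}, "finish_reason": "stop"}],
"usage": {"prompt_tokens": 25, "completion_tokens": 8, "total_tokens": 33}}"""

data = json.loads(raw)
answer = data["choices"][0]["message"]["content"]
finish = data["choices"][0]["finish_reason"]  # "stop" means a natural end
used = data["usage"]["total_tokens"]          # prompt + completion tokens

print(answer)  # The capital of France is Paris.
```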