Getting Started

Quickstart

Get your first AI response in under 30 seconds.

Why Infe? We don't have every feature. We have the one that matters: speed. Our edge-native architecture delivers sub-100ms latency globally.

1. Get Your API Key

Create an account and generate an API key from your dashboard.

Go to infe.io/dashboard
Navigate to API Keys
Click Create New Key
Copy your key (starts with infe_)

Keep it secret! Never share or commit your API key. Use environment variables in production.
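One way to follow this advice in Python is to read the key from the environment at startup and fail early if it is missing. The sketch below assumes the key is exported as INFE_API_KEY (the variable name used in the curl example in this guide); the function name load_infe_api_key is illustrative and not part of any SDK.

```python
import os


def load_infe_api_key(env=os.environ):
    """Return the Infe API key from the INFE_API_KEY environment variable.

    `env` defaults to the real environment; pass a dict to test the function.
    """
    key = env.get("INFE_API_KEY", "").strip()
    if not key:
        raise RuntimeError("INFE_API_KEY is not set")
    # Keys copied from the dashboard start with "infe_".
    if not key.startswith("infe_"):
        raise ValueError("INFE_API_KEY does not look like an Infe key")
    return key
```

Checking the prefix catches a common mistake, such as pasting a key from another provider, before any request is sent.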

2. Install the SDK

Infe is fully compatible with the OpenAI SDK. Install the SDK and point it at the Infe endpoint, https://api.infe.io/v1.

pip install openai

3. Make Your First Request

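Send a chat completion request to Infe. The curl command below calls the endpoint directly. If you use the OpenAI SDK instead, the only changes from a standard OpenAI setup are base_url and your Infe API key.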

curl https://api.infe.io/v1/chat/completions \
  -H "Authorization: Bearer $INFE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "infe-pulse",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'
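The same request can be sent from Python through the OpenAI SDK. This is a minimal sketch, assuming the key is exported as INFE_API_KEY as in the curl command; the helper names build_request and first_response are illustrative and are not part of Infe's or OpenAI's API.

```python
import os

INFE_BASE_URL = "https://api.infe.io/v1"


def build_request(prompt, model="infe-pulse"):
    """Build the keyword arguments for client.chat.completions.create()."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def first_response(prompt):
    """Send one chat completion request to Infe and return the reply text."""
    # Imported here so build_request() works even without the SDK installed.
    from openai import OpenAI

    client = OpenAI(
        api_key=os.environ["INFE_API_KEY"],  # never hardcode the key
        base_url=INFE_BASE_URL,              # the only Infe-specific setting
    )
    completion = client.chat.completions.create(**build_request(prompt))
    return completion.choices[0].message.content
```

Calling first_response("Hello!") sends the same JSON body as the curl command and returns the assistant's reply as a string.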

4. Enable Streaming

For real-time responses, set stream=True. Tokens are then delivered as they are generated, which is where Infe's speed is most visible.

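import os

from openai import OpenAI

# Read the key from the environment instead of hardcoding it.
client = OpenAI(
    api_key=os.environ["INFE_API_KEY"],
    base_url="https://api.infe.io/v1",
)

# stream=True returns an iterator of chunks instead of one full response.
stream = client.chat.completions.create(
    model="infe-pulse",
    messages=[
        {"role": "user", "content": "Tell me a story"}
    ],
    stream=True,
)

# Print each piece of text as soon as it arrives.
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()  # finish with a newline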

What's Next?

Platform Overview

Learn why Infe is the fastest AI inference platform.

Chat Completions API

Full API reference with all parameters.

Model Catalog

Explore Infe Pulse, Titan, and Core models.

Billing & Pricing

Understand Infe Units and pricing tiers.