Find API documentation, integration steps, and workflow examples. Access everything needed to connect, deploy, and optimize with the largest free model catalog and energy-smart routing.
The /v1/chat/completions API is designed to generate text-based responses from various language models. It's built for flexibility, allowing you to choose your model, adjust settings, and optimize for cost, speed, or token rate.
The strategy feature allows you to optimize model selection for specific criteria when multiple providers offer the same model. By adding strategy tags to your model parameter, you can prioritize models based on price, latency, or token rate.
Strategy tags are appended to the model name using the "@" separator:
Price:
Latency:
Token Rate:
When using multiple strategies, they can be combined in any order:
Price & Latency:
Token Rate & Price:
Latency & Token Rate & Price:
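The tag syntax above can be sketched as a small string-building helper. Note that the exact tag spellings (`price`, `latency`, `tokenrate`) and the model name used here are assumptions for illustration; check the catalog for the identifiers your account actually accepts.

```python
def with_strategies(model: str, *strategies: str) -> str:
    """Append "@" strategy tags to a model name, e.g. 'model@price@latency'.

    Tag spellings here are assumed, not confirmed by the docs above.
    """
    return "".join([model] + [f"@{s}" for s in strategies])

# Single strategy: route to the cheapest provider of this model.
print(with_strategies("llama-3.1-8b-instruct", "price"))

# Multiple strategies can be combined in any order.
print(with_strategies("llama-3.1-8b-instruct", "latency", "tokenrate", "price"))
```

The combined string is passed as the `model` parameter of a normal `/v1/chat/completions` request.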
Welcome to CLōD. This guide covers everything you need to go from zero to your first API call, including how the model catalog works, how energy-based pricing keeps your costs low, and how to organize your usage with Projects.
Go to https://app.clod.io and sign up for free. No credit card required.
Your free account gives you immediate access to:
When creating an API key, give it a descriptive name (e.g., dev-key, n8n-workflow) and optionally assign it to a Project. The dashboard also displays the API endpoint alongside your key for quick reference.
Store your key as an environment variable; never hardcode it:
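For example, in your shell profile (the variable name `CLOD_API_KEY` is a convention used in this guide, not mandated by the platform):

```shell
# Add to ~/.bashrc, ~/.zshrc, or your deployment's secret manager.
export CLOD_API_KEY="your-api-key-here"

# Confirm the variable is set without printing the secret itself.
[ -n "$CLOD_API_KEY" ] && echo "CLOD_API_KEY is set"
```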
Never expose your API key in client-side code, public repositories, or version-controlled config files.
CLōD organizes all models into three categories. You can browse the full catalog at https://app.clod.io/auth/models.
Open-source models available at no cost. Identified by a Free label in the catalog.
Models hosted directly on CLōD's GPU infrastructure across North America. This is where energy-based dynamic pricing applies (see Step 4 below). Many free models are also available under this category.
Proprietary closed-source models (e.g., GPT-4o, Claude) that CLōD proxies through a unified endpoint. These are models CLōD cannot host directly.
The value here is simplicity: instead of managing separate API keys and billing accounts for each provider, you access all of them through a single CLōD API key. A 5% routing fee applies on top of the provider's standard rate.
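As a quick sanity check on the 5% routing fee, the effective rate is simply the provider's rate times 1.05. The provider price below is illustrative only, not a real quote:

```python
# Illustrative only: provider_rate is a made-up figure, not a live quote.
provider_rate = 2.50   # provider's price, USD per 1M tokens
routing_fee = 0.05     # CLōD's 5% routing fee on Frontier models

effective_rate = provider_rate * (1 + routing_fee)
print(f"${effective_rate:.4f} per 1M tokens")  # → $2.6250 per 1M tokens
```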
This is CLōD's unique value proposition for hosted models.
CLōD operates GPU servers at multiple locations across North America. Electricity costs at each location fluctuate throughout the day based on real-time energy market prices. CLōD continuously monitors these costs and dynamically adjusts token pricing to always route your request to the lowest-cost available Data Center.
What this means for you:
Viewing price history: In the Models page of your dashboard, each CLōD-hosted model displays a live pricing chart. You can toggle between 4-hour, 24-hour, and 7-day views to understand pricing trends. If your workload is flexible, scheduling high-volume jobs during low-energy-cost windows can reduce your inference spend materially.
For current token prices, always refer to the live model card in your dashboard; rates update in real time.
CLōD's API is fully OpenAI-compatible. If you already use the OpenAI SDK, you only need to change the base_url and your API key.
Base URL: https://api.clod.io/v1
Endpoint: POST /v1/chat/completions
Projects let you group models, API keys, and logs into isolated environments. This is useful for separating teams, use cases, or deployment stages.
To create a project:
Project features: