Find API documentation, integration steps, and workflow examples. Access everything needed to connect, deploy, and optimize with the largest free model catalog and energy-smart routing.
The /v1/chat/completions API is designed to generate text-based responses from various language models. It's built for flexibility, allowing you to choose your model, adjust settings, and optimize for cost, speed, or token rate.
The strategy feature allows you to optimize model selection for specific criteria when multiple providers offer the same model. By adding strategy tags to your model parameter, you can prioritize models based on price, latency, or token rate.
Strategy tags are appended to the model name using the "@" separator:
Price:
Latency:
Token Rate:
When using multiple strategies, they can be combined in any order:
Price & Latency:
Token Rate & Price:
Latency & Token Rate & Price:
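The tag syntax above can be sketched as a small string-building helper. Note that the exact tag spellings (`price`, `latency`, `tokenrate`) and the model name used here are assumptions for illustration; check the catalog for the identifiers your account actually accepts.

```python
def with_strategies(model: str, *strategies: str) -> str:
    """Append "@" strategy tags to a model name, e.g. 'model@price@latency'.

    Tag spellings here are assumed, not confirmed by the docs above.
    """
    return "".join([model] + [f"@{s}" for s in strategies])

# Single strategy: route to the cheapest provider of this model.
print(with_strategies("llama-3.1-8b-instruct", "price"))

# Multiple strategies can be combined in any order.
print(with_strategies("llama-3.1-8b-instruct", "latency", "tokenrate", "price"))
```

The combined string is passed as the `model` parameter of a normal `/v1/chat/completions` request.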
Welcome to CLōD. This guide covers everything you need to go from zero to your first API call, including how the model catalog works, how energy-based pricing keeps your costs low, and how to organize your usage with Projects.
Go to https://app.clod.io and sign up for free. No credit card required.
Your free account gives you immediate access to:
When creating an API key, give it a descriptive name (e.g., dev-key, n8n-workflow) and optionally assign it to a Project. The dashboard also displays the API endpoint alongside your key for quick reference.
Store your key as an environment variable; never hardcode it:
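For example, in your shell profile (the variable name `CLOD_API_KEY` is a convention used in this guide, not mandated by the platform):

```shell
# Add to ~/.bashrc, ~/.zshrc, or your deployment's secret manager.
export CLOD_API_KEY="your-api-key-here"

# Confirm the variable is set without printing the secret itself.
[ -n "$CLOD_API_KEY" ] && echo "CLOD_API_KEY is set"
```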
Never expose your API key in client-side code, public repositories, or version-controlled config files.
CLōD organizes all models into three categories. You can browse the full catalog at https://app.clod.io/auth/models.
Open-source models available at no cost. Identified by a Free label in the catalog.
Models hosted directly on CLōD's GPU infrastructure across North America. This is where energy-based dynamic pricing applies (see Step 4 below). Many free models are also available under this category.
Proprietary closed-source models (e.g., GPT-4o, Claude) that CLōD proxies through a unified endpoint. These are models CLōD cannot host directly.
The value here is simplicity: instead of managing separate API keys and billing accounts for each provider, you access all of them through a single CLōD API key. A 5% routing fee applies on top of the provider's standard rate.
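As a quick sanity check on the 5% routing fee, the effective rate is simply the provider's rate times 1.05. The provider price below is illustrative only, not a real quote:

```python
# Illustrative only: provider_rate is a made-up figure, not a live quote.
provider_rate = 2.50   # provider's price, USD per 1M tokens
routing_fee = 0.05     # CLōD's 5% routing fee on Frontier models

effective_rate = provider_rate * (1 + routing_fee)
print(f"${effective_rate:.4f} per 1M tokens")  # → $2.6250 per 1M tokens
```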
This is CLōD's unique value proposition for hosted models.
CLōD operates GPU servers at multiple locations across North America. Electricity costs at each location fluctuate throughout the day based on real-time energy market prices. CLōD continuously monitors these costs and dynamically adjusts token pricing to always route your request to the lowest-cost available Data Center.
What this means for you:
Viewing price history: In the Models page of your dashboard, each CLōD-hosted model displays a live pricing chart. You can toggle between 4-hour, 24-hour, and 7-day views to understand pricing trends. If your workload is flexible, scheduling high-volume jobs during low-energy-cost windows can reduce your inference spend materially.
For current token prices, always refer to the live model card in your dashboard; rates update in real time.
CLōD's API is fully OpenAI-compatible. If you already use the OpenAI SDK, you only need to change the base_url and your API key.
Base URL: https://api.clod.io/v1
Endpoint: POST /v1/chat/completions
Projects let you group models, API keys, and logs into isolated environments. This is useful for separating teams, use cases, or deployment stages.
To create a project:
Project features: