v1.0.0

Introduction

Welcome to the Nexusify API. We provide a unified interface to the world's most capable AI models — text generation, vision, tool calling, and image synthesis — all through a single, OpenAI-compatible integration point.

Base URL

All API requests are made to the following base URL. Every endpoint listed in this documentation is relative to it.

https://api.nexusify.co/v1

What's Included

Unified Text API

Switch between Gemini, GPT-4, Llama, Mistral, Grok, DeepSeek, and 99+ other models with a single parameter.

Image Generation

Access Flux, ZImage, Imagen 4, Klein, and GPT Image all via one standardized endpoint.

OpenAI-Compatible

Drop-in replacement for existing OpenAI integrations — no SDK changes needed.

Getting Started

Obtaining an API Key

To start building with Nexus, you'll need a unique API key. The process takes under a minute and requires no credit card.

Step 1 — Start Building

Navigate to nexusify.co and click the "Start building" button on the homepage.

Step 2 — Authenticate

You'll be redirected to the authentication portal at login.nexusify.co. You can either complete the CAPTCHA verification or sign in with Google or Discord.

Step 3 — Open the Dashboard

After a successful login you'll land on your developer dashboard:

https://dash.nexusify.co/dashboard

Step 4 — Copy Your Key

Scroll to the "API Credentials" section. Your key is obscured by default. Click "Copy Secret Key" to copy it to your clipboard.

Keep it secret. Never paste your API key into client-side JavaScript, browser extensions, or public repositories. Treat it like a password.

Authentication

Every request to the Nexusify API must include your API key in the Authorization header using the Bearer scheme.

Request Header

HTTP Header
Authorization: Bearer YOUR_API_KEY
Security Notice. Do not expose your API key in browser-side code or commit it to version control. Use environment variables and server-side requests.

Example with Environment Variable

The idiomatic approach is to store your key in an environment variable and reference it at runtime, as shown below.

Shell
# Add to your .env file or shell profile
export NEXUS_API_KEY="your_key_here"

Python
import os
api_key = os.getenv("NEXUS_API_KEY")

JavaScript
const apiKey = process.env.NEXUS_API_KEY;
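
One way to combine the environment variable with the Bearer header is a small helper that fails early when the key is missing. This is a minimal sketch: the function name build_headers is hypothetical and not part of any Nexusify SDK.

```python
import os


def build_headers(env_var: str = "NEXUS_API_KEY") -> dict:
    """Return request headers with the Bearer token read from the environment."""
    api_key = os.getenv(env_var)
    if not api_key:
        # Fail early with a clear message instead of sending an unauthenticated request
        raise RuntimeError(f"Environment variable {env_var} is not set")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

The returned dictionary can be passed as the headers argument of whichever HTTP client you use.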

Pricing & Credits

Nexus uses a credit-based system. New accounts receive $25 in free credits every week, automatically renewed on Mondays — no credit card required to start.

How Credits Work

🎁

Weekly Free Credits

$25 in free credits every Monday. Automatically topped up — no action needed on your part.

💳

Paid Credits

Purchase additional credits at any time from your dashboard. They never expire and are only drawn after free credits are exhausted.

Pay-as-you-go

You're billed only for what you use. Credits for failed requests are automatically refunded.

Text Model Pricing

Text models are billed per million tokens, with input and output priced separately. All prices are in USD.

Model                    Input / 1M tokens   Output / 1M tokens
gpt-4                    $30.00              $60.00
grok-3                   $3.00               $15.00
llama-3.1-405b-instruct  $5.00               $15.00
gemini-2.5-flash         $0.15               $0.60
gpt-4o-mini              $0.15               $0.60
mistral-large-3          $2.00               $6.00
deepseek-v3.2            $0.27               $1.10
kimi-k2.5                $0.80               $3.00
qwen3-235b-a22b          $0.20               $0.60
All other models follow per-token pricing. Refer to the full model list for details.
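
As a rough illustration of per-million-token billing, the sketch below estimates the cost of one request from its token counts. The price dictionary copies three rows from the table above; the function name is hypothetical, and the exact rounding applied by the billing system is not documented here.

```python
# USD per 1M tokens as (input, output), copied from the pricing table
PRICES = {
    "gpt-4": (30.00, 60.00),
    "gemini-2.5-flash": (0.15, 0.60),
    "deepseek-v3.2": (0.27, 1.10),
}


def estimate_text_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the USD cost of a request; input and output are priced separately."""
    input_price, output_price = PRICES[model]
    return (prompt_tokens * input_price + completion_tokens * output_price) / 1_000_000


# 22 prompt tokens and 148 completion tokens on gemini-2.5-flash
print(f"{estimate_text_cost('gemini-2.5-flash', 22, 148):.7f}")
```

The token counts can be taken from the usage object returned with each response.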

Image Model Pricing

Image generation is billed at a flat rate per request. The base price applies to 512×512 px; larger resolutions scale proportionally.

Model       Base (512×512)   Large (1024×1024)
flux        $0.003           $0.012
zimage      $0.005           $0.020
klein       $0.006           $0.024
gptimage    $0.040           $0.160
qwen-image  $0.020           $0.080
wan-image   $0.008           $0.032
p-image     $0.055           $0.220
Low balance. When your credit balance is insufficient, requests return a 402 error. Purchase credits at dash.nexusify.co.
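
The two listed sizes are consistent with a price that scales with pixel count (1024×1024 has four times the pixels of 512×512 and costs four times as much). The sketch below applies that rule to arbitrary sizes; treating intermediate sizes as strictly linear in pixel count is an assumption, and the function name is hypothetical.

```python
# Base prices in USD for a 512x512 image, copied from the table above
BASE_PRICES = {"flux": 0.003, "zimage": 0.005, "klein": 0.006, "gptimage": 0.040}
BASE_PIXELS = 512 * 512


def estimate_image_cost(model: str, width: int, height: int) -> float:
    """Scale the 512x512 base price by the ratio of requested pixels to base pixels."""
    return BASE_PRICES[model] * (width * height) / BASE_PIXELS


print(round(estimate_image_cost("flux", 1024, 1024), 4))
```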

Limits & Errors

Understanding the operational limits and HTTP error responses helps you build more resilient integrations.

Usage Limits

130 Requests / Minute
$25 Free Credits / Week
With Paid Credits

Error Codes

Code  Status                 Description
400   Bad Request            Missing required parameters or invalid request format.
401   Unauthorized           Invalid or missing API key in the Authorization header.
402   Insufficient Credits   Your credit balance is too low. Purchase credits at the dashboard.
429   Too Many Requests      You have exceeded the 130 requests per minute rate limit.
500   Internal Server Error  An unexpected error occurred on our end. Please retry after a short wait.
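
A common way to handle 429 and 500 responses is to retry with exponential backoff, while treating 400, 401, and 402 as errors that a retry cannot fix. The following is a minimal sketch under those assumptions: send is a hypothetical callable that performs one request and returns a (status_code, body) pair, and the delay values are illustrative rather than recommended by the API.

```python
import random
import time

RETRYABLE_STATUS = {429, 500}  # rate limited, or a transient server error


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff with full jitter: a random wait in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0, min(cap, base * 2 ** attempt))


def call_with_retries(send, max_attempts: int = 4, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or the attempts run out."""
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE_STATUS or attempt == max_attempts - 1:
            return status, body
        sleep(backoff_delay(attempt))
```

Passing sleep as a parameter keeps the helper easy to test without real delays.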
Text Generation

Chat Completions

Generate text using 99 distinct AI models through a single, OpenAI-compatible endpoint. The request and response format is identical to the OpenAI Chat Completions API, so existing SDKs work without modification.

POST /chat/completions

Parameters

Name         Type     Required  Description
model        string   Yes       The model ID to use, e.g. gpt-4, gemini-2.5-flash, mistral-large-3.
messages     array    Yes       Array of message objects, each with a role ("system", "user", "assistant") and content.
stream       boolean  No        If true, responses are streamed as Server-Sent Events.
temperature  float    No        Sampling temperature between 0.0 and 2.0. Defaults to 0.7.
max_tokens   integer  No        Maximum number of tokens to generate in the response.

Example Request

cURL
curl https://api.nexusify.co/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $NEXUS_API_KEY" \
  -d '{
    "model": "gemini-2.5-flash",
    "messages": [
      {"role": "user", "content": "Explain general relativity to a 10-year-old."}
    ]
  }'

Python
import requests

response = requests.post(
    "https://api.nexusify.co/v1/chat/completions",
    headers={
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    },
    json={
        "model": "gemini-2.5-flash",
        "messages": [
            {"role": "user", "content": "Explain general relativity to a 10-year-old."}
        ]
    }
)
print(response.json()["choices"][0]["message"]["content"])

JavaScript
const res = await fetch("https://api.nexusify.co/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.NEXUS_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "gemini-2.5-flash",
    messages: [
      { role: "user", content: "Explain general relativity to a 10-year-old." }
    ]
  })
});
const data = await res.json();
console.log(data.choices[0].message.content);

Response Format

The response mirrors the OpenAI Chat Completions response schema.

JSON Response
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "gemini-2.5-flash",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Imagine space is like a stretched rubber sheet..."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 22,
    "completion_tokens": 148,
    "total_tokens": 170
  }
}
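
When stream is set to true, the reply arrives as Server-Sent Events rather than a single JSON object. The sketch below shows one way to turn those lines into text. It assumes the chunks follow the OpenAI streaming shape (choices[0].delta.content, ending with data: [DONE]), which this page implies but does not show for this endpoint; the function name is hypothetical.

```python
import json


def iter_stream_text(lines):
    """Yield content fragments from an iterable of SSE lines (strings)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and comment lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(payload)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        text = choices[0].get("delta", {}).get("content")
        if text:
            yield text
```

With the requests library, the lines could come from response.iter_lines(decode_unicode=True) on a request made with stream=True.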

Available Models

A complete list of the 99 text models currently supported by Nexusify. Pass any Model ID as the model parameter in your request.

Provider Model ID Capabilities Input ($/1M) Output ($/1M)

Vision (Image Input)

Several models support multimodal input, allowing you to pass images alongside text in the messages array using the standard OpenAI content-parts format.

Supported models: gemini-3-flash-preview, kimi-k2.5, llama-3.2-90b-vision-instruct, mistral-large-3, mistral-small-3.1, and others tagged Vision in the model list.

How It Works

Instead of passing a plain string as content, you pass an array of content parts. Each part is either a text block or an image_url block. Images can be sent as public URLs or as base64-encoded data URIs.

Example Request

cURL
curl https://api.nexusify.co/v1/chat/completions \
  -H "Authorization: Bearer $NEXUS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3-flash-preview",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "What is in this image?"},
        {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}}
      ]
    }]
  }'

Python
import requests

response = requests.post(
    "https://api.nexusify.co/v1/chat/completions",
    headers={"Authorization": f"Bearer {api_key}"},
    json={
        "model": "gemini-3-flash-preview",
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/photo.jpg"}}
            ]
        }]
    }
)

Sending a Base64 Image

To send an image from disk, encode it as a base64 data URI and place it in the url field:

JSON snippet
{
  "type": "image_url",
  "image_url": {
    "url": "data:image/jpeg;base64,/9j/4AAQSkZJRgAB..."
  }
}
Graceful degradation. If you send an image to a non-vision model (e.g. deepseek-v3.2), the image content is silently stripped and only the text portion of the message is forwarded. No error is returned.
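
Building the data URI by hand is error-prone, so a small helper can help. This sketch uses only the Python standard library; the function names are hypothetical, and falling back to image/jpeg when the MIME type cannot be guessed is an assumption.

```python
import base64
import mimetypes


def image_part_from_bytes(data: bytes, mime_type: str = "image/jpeg") -> dict:
    """Wrap raw image bytes in an image_url content part using a base64 data URI."""
    encoded = base64.b64encode(data).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }


def image_part_from_file(path: str) -> dict:
    """Read an image from disk and guess its MIME type from the file extension."""
    mime_type, _ = mimetypes.guess_type(path)
    with open(path, "rb") as f:
        return image_part_from_bytes(f.read(), mime_type or "image/jpeg")
```

The returned dictionary goes into the content array next to a text part, in the same position as the URL-based image_url part in the request examples.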

Tool Calling

Tool calling — also known as function calling — lets the model invoke your application's functions during a conversation. Instead of replying with plain text, the model returns a structured JSON payload describing which function to call and with what arguments. Your code executes the function and feeds the result back, allowing the model to produce a final, grounded answer.

Broadly supported. Tool calling works natively with the vast majority of models on this API — including all GPT, Grok, DeepSeek, Kimi, Mistral, Cohere Command, Nemotron, Llama 3.x / 4, MiniMax, and Gemini models. Parameters are forwarded as-is to the underlying provider, so any model that supports the OpenAI tools schema works out of the box. Look for the Tools tag in the model list for models with confirmed tool-calling support.

How It Works

The full conversation loop has four steps.

Step                        Who acts  What happens
1. Send request             Your app  POST to /v1/chat/completions with a tools array describing available functions.
2. Model calls a tool       Model     Returns finish_reason: "tool_calls" and a tool_calls array with the function name and JSON arguments.
3. Execute & return result  Your app  Run the function locally. Append the assistant message (with tool_calls) and a new role: "tool" message containing the result.
4. Final answer             Model     Reads the tool result and produces a natural-language reply to the user.

Request Parameters

Parameter            Type             Description
tools                array            List of function definitions. Each entry must have type: "function" and a function object with name, description, and a JSON Schema parameters object.
tool_choice          string / object  "auto": the model decides when to call a tool (default).
                                      "none": tools are defined but the model must not call any.
                                      {"type":"function","function":{"name":"..."}}: forces a specific function.
parallel_tool_calls  boolean          When true (default), the model may invoke multiple tools in a single turn. Set to false to enforce sequential calls.

Step 1 — Initial Request

cURL
curl https://api.nexusify.co/v1/chat/completions \
  -H "Authorization: Bearer $NEXUS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v3.2",
    "messages": [
      {"role": "user", "content": "What is the weather in Madrid right now?"}
    ],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather for a given city.",
        "parameters": {
          "type": "object",
          "properties": {
            "city":  {"type": "string",  "description": "City name, e.g. Madrid"},
            "units": {"type": "string",  "enum": ["celsius", "fahrenheit"], "description": "Temperature unit"}
          },
          "required": ["city"]
        }
      }
    }],
    "tool_choice": "auto"
  }'

Python
import requests, json

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a given city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city":  {"type": "string"},
                "units": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["city"]
        }
    }
}]

messages = [{"role": "user", "content": "What is the weather in Madrid right now?"}]

r = requests.post(
    "https://api.nexusify.co/v1/chat/completions",
    headers={"Authorization": f"Bearer {api_key}"},
    json={"model": "deepseek-v3.2", "messages": messages, "tools": tools}
)

JavaScript
const tools = [{
  type: "function",
  function: {
    name: "get_weather",
    description: "Get the current weather for a given city.",
    parameters: {
      type: "object",
      properties: {
        city:  { type: "string" },
        units: { type: "string", enum: ["celsius", "fahrenheit"] }
      },
      required: ["city"]
    }
  }
}];

const response = await fetch("https://api.nexusify.co/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${apiKey}`,
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    model: "deepseek-v3.2",
    messages: [{ role: "user", content: "What is the weather in Madrid right now?" }],
    tools,
    tool_choice: "auto"
  })
});
const data = await response.json();

Step 2 — Model Response (tool call)

When the model wants to call a tool it sets finish_reason to "tool_calls" and returns a tool_calls array. The arguments field is always a JSON-encoded string.

Response — tool_calls
{
  "id": "chatcmpl-xyz",
  "choices": [{
    "finish_reason": "tool_calls",
    "message": {
      "role": "assistant",
      "content": null,
      "tool_calls": [{
        "id": "call_a1b2c3",
        "type": "function",
        "function": {
          "name": "get_weather",
          "arguments": "{\"city\": \"Madrid\", \"units\": \"celsius\"}"
        }
      }]
    }
  }]
}

Step 3 — Execute & Return Result

Parse the arguments string, call your function, and send a follow-up request. You must include the assistant's tool-call message in the history, followed by a new role: "tool" message.

Python
assistant_msg = r.json()["choices"][0]["message"]
tool_call     = assistant_msg["tool_calls"][0]
args          = json.loads(tool_call["function"]["arguments"])

# Execute your function
result = get_weather(city=args["city"], units=args.get("units", "celsius"))

# Build follow-up messages
messages.append(assistant_msg)  # keep assistant turn with tool_calls
messages.append({
    "role":         "tool",
    "tool_call_id": tool_call["id"],
    "content":      json.dumps(result)
})

# Second request — get final answer
final = requests.post(
    "https://api.nexusify.co/v1/chat/completions",
    headers={"Authorization": f"Bearer {api_key}"},
    json={"model": "deepseek-v3.2", "messages": messages, "tools": tools}
)
print(final.json()["choices"][0]["message"]["content"])

JavaScript
const assistantMsg = data.choices[0].message;
const toolCall     = assistantMsg.tool_calls[0];
const args         = JSON.parse(toolCall.function.arguments);

// Execute your function
const result = await getWeather(args.city, args.units ?? "celsius");

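// Build follow-up messages (start from the original user message sent in Step 1)
const messages = [{ role: "user", content: "What is the weather in Madrid right now?" }];
const followUp = [
  ...messages,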
  assistantMsg,  // keep the assistant tool_calls turn
  {
    role:         "tool",
    tool_call_id: toolCall.id,
    content:      JSON.stringify(result)
  }
];

// Second request — get final answer
const finalRes = await fetch("https://api.nexusify.co/v1/chat/completions", {
  method: "POST",
  headers: { "Authorization": `Bearer ${apiKey}`, "Content-Type": "application/json" },
  body: JSON.stringify({ model: "deepseek-v3.2", messages: followUp, tools })
});
const answer = (await finalRes.json()).choices[0].message.content;
console.log(answer);
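
The examples above handle only the first entry of tool_calls. Because parallel_tool_calls defaults to true, a single assistant turn may contain several calls, and each one needs its own role: "tool" reply. The sketch below is one way to cover that case; run_tool_calls and the registry mapping are hypothetical names, and returning an error object for an unknown tool is a design assumption rather than documented behavior.

```python
import json


def run_tool_calls(assistant_msg: dict, registry: dict) -> list:
    """Execute every tool call in an assistant message and build the tool replies.

    registry maps function names to Python callables that accept keyword arguments.
    """
    tool_messages = []
    for call in assistant_msg.get("tool_calls") or []:
        name = call["function"]["name"]
        handler = registry.get(name)
        if handler is None:
            result = {"error": f"unknown tool: {name}"}
        else:
            # arguments is a JSON-encoded string, so decode it before calling
            args = json.loads(call["function"]["arguments"])
            result = handler(**args)
        tool_messages.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(result),
        })
    return tool_messages
```

Append the assistant message first, then extend the history with the returned list before sending the follow-up request.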

Step 4 — Final Model Response

Response — final answer
{
  "choices": [{
    "finish_reason": "stop",
    "message": {
      "role": "assistant",
      "content": "The current temperature in Madrid is 28°C with clear skies."
    }
  }]
}

Defining Good Tools

The quality of your function description directly affects how reliably the model will call it. Follow these guidelines for best results.

Practice                             Why it matters
Write a clear, specific description  The model reads this to decide when to call the function. Vague descriptions lead to missed or wrong calls.
Describe every parameter             Include a description field for each property in the JSON Schema so the model knows what value to fill in.
Use enum for fixed values            Prevents the model from inventing values for parameters that have a known set of options.
Mark required parameters             List every field the function cannot work without in the required array.
Keep names snake_case                Function and parameter names like get_current_weather are more reliably understood than camelCase or abbreviations.

Supported Models

The following models have confirmed native tool-calling support. Any other model on the API may also accept the tools parameter if the underlying provider supports it.

Provider       Models
DeepSeek       deepseek-v3.1, deepseek-v3.2
OpenAI         gpt-4, gpt-4o-mini, gpt-5 series, gpt-5.x series
xAI            grok-3, grok-4, grok-4.1 series
Moonshot       kimi-k2, kimi-k2.5
MiniMax        minimax-m2, minimax-m2.5
Mistral AI     mistral-small-3.2, mistral-nemotron, mistral-large-3, ministral-3-8b, ministral-3-14b
Cohere         command-a-3, command-r-plus, command-r
Meta / NVIDIA  llama-3.3-70b-instruct, llama-3.1-405b-instruct, llama-3.1-8b-instruct, nemotron-super-49b-v1, nemotron-super-49b-v1.5
Alibaba        qwen3-235b-a22b, qwen3-next-80b
Google         gemini-2.5-flash, gemini-3-flash-preview
RNJ AI         rnj-1
Vercel AI SDK / LangChain users. This API is fully compatible with the OpenAI client library, the Vercel AI SDK, and LangChain. Pass baseURL: "https://api.nexusify.co/v1" and your Nexus key — tool calling works identically to the native OpenAI SDK.
Image Generation

Generate Image

Create images from text prompts using state-of-the-art diffusion and synthesis models. A single endpoint provides access to all supported image models.

POST /generate-image

Parameters

Name    Type     Default  Description
prompt  string            A detailed text description of the image to generate. Required.
model   string   flux     The image model ID to use. See Image Models for the full list.
width   integer  512      Output width in pixels. Maximum 2048.
height  integer  512      Output height in pixels. Maximum 2048.

Example Request

cURL
curl https://api.nexusify.co/v1/generate-image \
  -H "Authorization: Bearer $NEXUS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Cyberpunk city with neon rain, cinematic lighting",
    "model": "flux",
    "width": 1024,
    "height": 1024
  }'

JavaScript
const res = await fetch("https://api.nexusify.co/v1/generate-image", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.NEXUS_API_KEY}`,
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    prompt: "Cyberpunk city with neon rain, cinematic lighting",
    model: "flux",
    width: 1024,
    height: 1024
  })
});

const data = await res.json();
console.log(data.imageUrl);

Python
import requests

response = requests.post(
    "https://api.nexusify.co/v1/generate-image",
    headers={"Authorization": f"Bearer {api_key}"},
    json={
        "prompt": "Cyberpunk city with neon rain, cinematic lighting",
        "model": "flux",
        "width": 1024,
        "height": 1024
    }
)
image_url = response.json()["imageUrl"]

Response

A successful response includes the image URL, metadata about the request, and the current credit balance of your account. The image is hosted temporarily — it expires after 2 hours.

JSON Response
{
  "success": true,
  "model": "flux",
  "prompt": "A futuristic city at sunset",
  "size": "512x512",
  "imageUrl": "/data/generated-images/pollinations-1735000000000-a1b2c3d4e.png",
  "imagePath": "/data/generated-images/pollinations-1735000000000-a1b2c3d4e.png",
  "expiresIn": "2 hours",
  "message": "Image generated successfully",
  "user": {
    "email": "you@example.com",
    "plan": "free",
    "usageRemaining": 24.85,
    "nextGenerationAvailableIn": 0
  }
}
Field                Type    Description
imageUrl             string  Relative path to the generated image. To access it, prepend https://api.nexusify.co, e.g. https://api.nexusify.co/data/generated-images/pollinations-....png
imagePath            string  Same as imageUrl. Provided for convenience.
size                 string  Dimensions of the generated image, e.g. "512x512".
expiresIn            string  How long the image will remain accessible. Images are deleted after 2 hours.
user.usageRemaining  float   Your remaining credit balance in USD after this request.
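
Because imageUrl is a relative path, it has to be joined with the API host before it can be downloaded. A minimal sketch using the standard library follows; the function name is hypothetical.

```python
from urllib.parse import urljoin

API_HOST = "https://api.nexusify.co"


def absolute_image_url(image_url: str) -> str:
    """Resolve the relative imageUrl against the API host; absolute URLs pass through unchanged."""
    return urljoin(API_HOST, image_url)


print(absolute_image_url("/data/generated-images/example.png"))
```

Since images are deleted after 2 hours, download and store the file yourself if you need it for longer.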

Image Gen (OpenAI Format)

An OpenAI-compatible image generation endpoint. If you are already using the OpenAI Images API, you can point your client at Nexusify with zero code changes — just swap the base URL and API key.

POST /images/generations

Drop-in replacement. The request and response format is identical to POST /v1/images/generations from the OpenAI API. Existing SDKs and integrations work without modification: just set base_url to https://api.nexusify.co/v1.

Parameters

Name             Type     Default      Description
prompt           string                A text description of the desired image. Required.
model            string   flux         The image model ID to use. See Image Models for the full list.
size             string   "1024x1024"  Image dimensions as "WxH", e.g. "512x512" or "1024x1792". Larger sizes are billed proportionally.
response_format  string   "url"        "url" returns a hosted image URL; "b64_json" returns the raw image as a base64 string.
n                integer  1            Number of images to generate. Currently always 1.
stream           boolean  false        If true, the server emits Server-Sent Events while generating the image, then delivers the final result as the last event.

Response

A successful response returns a JSON object that mirrors the OpenAI Images API response shape:

JSON Response
{
  "created": 1712345678,
  "data": [
    {
      "url": "https://api.nexusify.co/v1/images/gen-abc123.png",
      "revised_prompt": "A majestic mountain landscape at sunset..."
    }
  ]
}

When response_format is "b64_json", the url field is omitted and a b64_json field containing the base-64 encoded PNG is returned instead.

Generated images are accessible via GET /v1/images/<filename> and are retained for 2 hours after creation.
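
When response_format is "b64_json", the image has to be decoded before it can be saved. The sketch below assumes the response shape described above (a data array whose first item carries b64_json); the function name is hypothetical.

```python
import base64


def image_bytes_from_response(response: dict) -> bytes:
    """Decode the first image of a b64_json response into raw bytes."""
    item = response["data"][0]
    if "b64_json" not in item:
        raise ValueError("no b64_json field; was response_format set to 'b64_json'?")
    return base64.b64decode(item["b64_json"])
```

The returned bytes can be written to disk with open(path, "wb").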

Example Request

cURL
curl https://api.nexusify.co/v1/images/generations \
  -H "Authorization: Bearer $NEXUS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "A majestic mountain landscape at golden hour",
    "model": "flux",
    "size": "1024x1024",
    "response_format": "url"
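  }'

Python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["NEXUS_API_KEY"],  # read the key from the environment
    base_url="https://api.nexusify.co/v1"  # include /v1; the SDK appends /images/generations
)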

response = client.images.generate(
    prompt="A majestic mountain landscape at golden hour",
    model="flux",
    size="1024x1024",
    response_format="url",
    n=1
)

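print(response.data[0].url)

JavaScript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.NEXUS_API_KEY, // read the key from the environment
  baseURL: "https://api.nexusify.co/v1", // include /v1; the SDK appends /images/generations
});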

const response = await client.images.generate({
  prompt: "A majestic mountain landscape at golden hour",
  model: "flux",
  size: "1024x1024",
  response_format: "url",
  n: 1,
});

console.log(response.data[0].url);

Streaming

Set stream: true to receive Server-Sent Events while the image is being generated. Each event carries a status field; the final event contains the complete response object.

SSE Stream (cURL)
curl https://api.nexusify.co/v1/images/generations \
  -H "Authorization: Bearer $NEXUS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Neon-lit cyberpunk city at midnight",
    "model": "klein",
    "size": "1024x1024",
    "stream": true
  }'

# Server-Sent Events output:
# data: {"status":"generating","progress":0.2}
# data: {"status":"generating","progress":0.7}
# data: {"created":1712345678,"data":[{"url":"https://api.nexusify.co/v1/images/gen-xyz.png"}]}
Image retrieval. Images returned via url can be fetched with a plain GET request — no authentication required. They expire after 2 hours.
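
A client has to tell progress events apart from the final result. The sketch below does so by checking for the data field, based on the sample events shown above. The function name is hypothetical, and tolerating a [DONE] terminator is an assumption, since none is documented for this endpoint.

```python
import json


def read_image_stream(lines):
    """Split SSE lines into a list of progress values and the final response object."""
    progress, final = [], None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # ignore blank separators and comment lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # assumption: accept a terminator even though none is documented here
        event = json.loads(payload)
        if "data" in event:
            final = event  # the last event carries the complete response
        elif "progress" in event:
            progress.append(event["progress"])
    return progress, final
```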

Image Models

Seven image models are currently available, each optimized for different use cases. Pass the ID as the model parameter in your generate-image request.

ID  Description  Best For  Base Price (512×512)
Text Generation

Text Generate Legacy

A simpler, more direct alternative to /chat/completions. Instead of wrapping everything in a messages array, you can pass a plain prompt string and get a plain completion string back — no extra nesting required. It supports the same 99 models, streaming, conversation history, and all generation parameters.

When to use this endpoint. If you're building a quick prototype, a simple chatbot, or just want to call a model without constructing a full messages array, this endpoint is the easier choice. For production integrations or OpenAI SDK compatibility, use /chat/completions instead.
POST /text/generate

Parameters

Name               Type            Required  Default  Description
model              string          Yes                The model ID to use. Any model from the full model list is supported.
prompt             string          *                  A plain text message to send to the model. Required if messages is not provided.
messages           array           *                  Full conversation history in OpenAI format. Use this instead of prompt for multi-turn conversations.
systemInstruction  string          No                 A system-level instruction that shapes the model's behavior for the entire conversation.
temperature        float           No        0.7      Controls randomness. 0.0 is deterministic, 2.0 is very creative.
max_tokens         integer         No        300      Maximum number of tokens to generate.
top_p              float           No        1.0      Nucleus sampling threshold. Values below 1.0 restrict the token pool.
stop               string / array  No        null     One or more sequences that will stop generation when encountered.
stream             boolean         No        false    If true, the response is streamed as Server-Sent Events.
userid             string          No                 An identifier for the user. When provided, the server stores conversation history and automatically includes it in future requests with the same userid.

* Either prompt or messages must be provided.

Example Request

cURL
curl https://api.nexusify.co/v1/text/generate \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $NEXUS_API_KEY" \
  -d '{
    "model": "gemini-2.5-flash",
    "prompt": "Explain how black holes form in two sentences.",
    "temperature": 0.7,
    "max_tokens": 150
  }'

Python
import requests

response = requests.post(
    "https://api.nexusify.co/v1/text/generate",
    headers={
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    },
    json={
        "model": "gemini-2.5-flash",
        "prompt": "Explain how black holes form in two sentences.",
        "temperature": 0.7,
        "max_tokens": 150,
    }
)
print(response.json()["completion"])

JavaScript
const res = await fetch("https://api.nexusify.co/v1/text/generate", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.NEXUS_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "gemini-2.5-flash",
    prompt: "Explain how black holes form in two sentences.",
    temperature: 0.7,
    max_tokens: 150,
  })
});

const data = await res.json();
console.log(data.completion);

Response

On success, the response contains a completion field with the model's reply as a plain string — no nested arrays to unwrap.

JSON Response
{
  "success": true,
  "model": "gemini-2.5-flash",
  "completion": "Black holes form when a massive star exhausts its nuclear fuel and collapses under gravity...",
  "reasoning": "...",
  "userid": null,
  "historyLength": 2,
  "messagesUsed": 1,
  "usage": {
    "prompt_tokens": 14,
    "completion_tokens": 38,
    "total_tokens": 52
  },
  "user": {
    "email": "you@example.com",
    "plan": "free",
    "usageRemaining": 24.99
  }
}
Field                Type     Description
completion           string   The model's generated reply.
reasoning            string   Chain-of-thought output, only present on reasoning models (e.g. grok-3-mini, deepseek-v3.2).
historyLength        integer  Total messages in the conversation after this turn. Only meaningful when userid is used.
messagesUsed         integer  Number of messages sent to the model in this request.
usage                object   Token breakdown: prompt_tokens, completion_tokens, total_tokens.
user.usageRemaining  float    Your remaining credit balance in USD.

Streaming

Set "stream": true to receive the response as a sequence of Server-Sent Events. Each event delivers a chunk of the completion as it's generated. The stream ends with a data: [DONE] message.

cURL
curl https://api.nexusify.co/v1/text/generate \
  -H "Authorization: Bearer $NEXUS_API_KEY" \
  -H "Content-Type: application/json" \
  --no-buffer \
  -d '{
    "model": "gpt-5-mini",
    "prompt": "Write a short story about a robot learning to paint.",
    "stream": true,
    "max_tokens": 400
  }'

JavaScript
const res = await fetch("https://api.nexusify.co/v1/text/generate", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.NEXUS_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "gpt-5-mini",
    prompt: "Write a short story about a robot learning to paint.",
    stream: true,
    max_tokens: 400,
  })
});

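const reader = res.body.getReader();
const decoder = new TextDecoder();
let buffer = "";
let finished = false;

while (!finished) {
  const { value, done } = await reader.read();
  if (done) break;
  // An SSE line can be split across network chunks, so buffer partial lines
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split("\n");
  buffer = lines.pop(); // keep the incomplete trailing line for the next read
  for (const line of lines) {
    if (!line.startsWith("data: ")) continue;
    const payload = line.slice(6).trim();
    if (payload === "[DONE]") { finished = true; break; } // stop the outer loop too
    const json = JSON.parse(payload);
    process.stdout.write(json.choices[0]?.delta?.content ?? "");
  }
}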

Conversation History

Pass a userid string to enable persistent conversation memory. The server saves each exchange and automatically prepends the history on your next request with the same userid and model — so the model always has context from prior turns without you managing it manually.

Example with userid
// First message
{ "model": "gpt-5-mini", "prompt": "My name is Alex.", "userid": "user_42" }

// Follow-up — model remembers the name
{ "model": "gpt-5-mini", "prompt": "What's my name?", "userid": "user_42" }
// → completion: "Your name is Alex."

To clear the stored history for a user, send a DELETE request to /v1/text/history/{userid}. This resets the conversation across all models at once.
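
The sketch below shows how the history endpoint URL and a DELETE request could be built with the Python standard library. The helper names are hypothetical, and percent-encoding the userid is a defensive assumption; the response body of the DELETE call is not documented here, so only the status code is returned.

```python
import urllib.request
from urllib.parse import quote

BASE_URL = "https://api.nexusify.co/v1"


def history_url(userid: str) -> str:
    """Build the history endpoint URL, percent-encoding the userid."""
    return f"{BASE_URL}/text/history/{quote(userid, safe='')}"


def clear_history(userid: str, api_key: str) -> int:
    """Send the DELETE request and return the HTTP status code."""
    request = urllib.request.Request(
        history_url(userid),
        method="DELETE",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

Calling clear_history("user_42", api_key) would reset the conversation started in the example above.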


Responses API

The Responses API is OpenAI's modern alternative to Chat Completions. Instead of a flat choices array, it returns a structured output array of typed items — separate objects for the assistant message and (when present) for reasoning traces. This makes it especially well-suited for agentic applications, multi-step workflows, and any integration that already targets the OpenAI Responses API shape.

POST /responses

Parameters

Name                  Type             Required  Default  Description
model                 string           Yes                The model ID to use. Any model from the full model list is supported.
input                 string / array   Yes                The user's input. Can be a plain string, an array of {role, content} objects (same shape as messages in Chat Completions), or an array of Responses API message objects with typed content parts.
instructions          string           No                 A system-level instruction prepended to the conversation. Equivalent to a system message in Chat Completions.
max_output_tokens     integer          No                 Maximum number of tokens to generate. Maps to max_tokens internally.
temperature           float            No        1.0      Sampling temperature between 0.0 and 2.0.
top_p                 float            No        1.0      Nucleus sampling threshold.
stream                boolean          No        false    If true, the response is delivered as a sequence of named Server-Sent Events.
tools                 array            No                 List of tool definitions available to the model (same schema as Chat Completions).
tool_choice           string / object  No        "auto"   Controls whether the model calls a tool.
store                 boolean          No        true     Whether this response should be stored. Reflected in the response envelope but not processed server-side.
metadata              object           No        {}       Arbitrary key/value pairs echoed back in the response. Useful for tagging requests.
previous_response_id  string           No                 ID of a prior response to continue from. Echoed in the response envelope.
user                  string           No                 An identifier for the end-user. Used for audit purposes; not stored as conversation history.

Example Request

cURL
curl https://api.nexusify.co/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $NEXUS_API_KEY" \
  -d '{
    "model": "grok-4",
    "input": "Explain the twin paradox in simple terms.",
    "instructions": "You are a friendly physics tutor.",
    "temperature": 0.7,
    "max_output_tokens": 300
  }'
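Python
import os
import requests

# Read the key from the environment, as the cURL and JavaScript examples do
api_key = os.environ["NEXUS_API_KEY"]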

response = requests.post(
    "https://api.nexusify.co/v1/responses",
    headers={
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    },
    json={
        "model": "grok-4",
        "input": "Explain the twin paradox in simple terms.",
        "instructions": "You are a friendly physics tutor.",
        "temperature": 0.7,
        "max_output_tokens": 300,
    }
)
data = response.json()
# The "message" item is always the last entry in the output array
text = data["output"][-1]["content"][0]["text"]
print(text)
JavaScript
const res = await fetch("https://api.nexusify.co/v1/responses", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.NEXUS_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "grok-4",
    input: "Explain the twin paradox in simple terms.",
    instructions: "You are a friendly physics tutor.",
    temperature: 0.7,
    max_output_tokens: 300,
  })
});
const data = await res.json();
// Find the assistant message by type; a reasoning item may precede it
const msg = data.output.find(o => o.type === "message");
console.log(msg.content[0].text);

Response Format

Rather than a flat choices array, the response wraps everything in an output array of typed items. A standard (non-reasoning) response contains one item of "type": "message". When the model produces a reasoning trace, a "type": "reasoning" item appears first at index 0, and the message shifts to index 1.

JSON Response — standard model
{
  "id": "resp_01j9abc123",
  "object": "response",
  "created_at": 1735000000,
  "model": "grok-4",
  "status": "completed",
  "output": [{
    "type": "message",
    "id": "msg_01abc",
    "role": "assistant",
    "status": "completed",
    "content": [{
      "type": "output_text",
      "text": "Imagine twins Alice and Bob..."
    }]
  }],
  "usage": {
    "input_tokens": 22,
    "output_tokens": 84,
    "total_tokens": 106
  },
  "temperature": 0.7,
  "top_p": 1,
  "max_output_tokens": 300,
  "error": null
}
JSON Response — reasoning model (e.g. grok-4-thinking)
{
  "id": "resp_01j9xyz456",
  "object": "response",
  "model": "grok-4-thinking",
  "status": "completed",
  "output": [
    {
      "type": "reasoning",
      "id": "rs_01abc",
      "summary": [{
        "type": "summary_text",
        "text": "Let me think about special relativity step by step..."
      }]
    },
    {
      "type": "message",
      "id": "msg_01def",
      "role": "assistant",
      "status": "completed",
      "content": [{
        "type": "output_text",
        "text": "Imagine twins Alice and Bob..."
      }]
    }
  ],
  "usage": {
    "input_tokens": 22,
    "output_tokens": 312,
    "total_tokens": 334
  }
}
| Field | Type | Description |
| --- | --- | --- |
| id | string | Unique response identifier, prefixed resp_. |
| object | string | Always "response". |
| status | string | Always "completed" for non-streaming responses. |
| output | array | Ordered list of output items. A "reasoning" item (with a summary array) appears first when the model produces a thinking trace; the "message" item always appears last. |
| output[].content[].text | string | The assistant's final reply text. Access via output.find(o => o.type === "message").content[0].text. |
| usage.input_tokens | integer | Tokens consumed by the prompt (estimated for providers that omit usage). |
| usage.output_tokens | integer | Tokens in the generated reply. |
| error | null / object | null on success; an error object on failure. |
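
Because error is null on success and an object on failure, it is worth checking that field before reading anything else from the envelope. The sketch below shows one way to do that and then read the token counts. NexusResponseError and read_usage are hypothetical names, and since the shape of the error object is not documented here, the code only converts it to a string rather than assuming specific keys.

```python
from typing import Tuple


class NexusResponseError(Exception):
    """Hypothetical exception raised when the envelope carries an error."""


def read_usage(response: dict) -> Tuple[int, int]:
    """Return (input_tokens, output_tokens), raising if error is set."""
    error = response.get("error")
    if error is not None:
        # The error object's fields are not specified in this section,
        # so it is stringified as-is instead of being picked apart.
        raise NexusResponseError(str(error))
    usage = response.get("usage") or {}
    return usage.get("input_tokens", 0), usage.get("output_tokens", 0)


# Usage with the token counts from the standard-model example response
sample = {"error": None, "usage": {"input_tokens": 22, "output_tokens": 84}}
print(read_usage(sample))  # prints: (22, 84)
```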

Multi-turn Conversations

Pass input as an array of message objects to send a full conversation history, just as you would with the messages parameter in Chat Completions. Each item should have a role ("user" or "assistant") and a content string.

Multi-turn input
{
  "model": "gemini-2.5-flash",
  "instructions": "You are a helpful assistant.",
  "input": [
    { "role": "user",      "content": "My name is Alex." },
    { "role": "assistant", "content": "Nice to meet you, Alex!" },
    { "role": "user",      "content": "What is my name?" }
  ]
}
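
Since the full history travels with every request, the client has to keep the list of turns itself. A minimal sketch of that bookkeeping follows; the Conversation class and its method names are illustrative assumptions, not part of the API. It only builds request bodies and does not send them.

```python
from typing import Dict, List, Optional


class Conversation:
    """Hypothetical client-side holder for multi-turn history."""

    def __init__(self, model: str, instructions: Optional[str] = None) -> None:
        self.model = model
        self.instructions = instructions
        self.turns: List[Dict[str, str]] = []

    def add_user(self, text: str) -> None:
        self.turns.append({"role": "user", "content": text})

    def add_assistant(self, text: str) -> None:
        self.turns.append({"role": "assistant", "content": text})

    def payload(self) -> dict:
        """Build the body for POST /responses from the turns so far."""
        body = {"model": self.model, "input": list(self.turns)}
        if self.instructions:
            body["instructions"] = self.instructions
        return body


chat = Conversation("gemini-2.5-flash", "You are a helpful assistant.")
chat.add_user("My name is Alex.")
chat.add_assistant("Nice to meet you, Alex!")
chat.add_user("What is my name?")
body = chat.payload()  # same shape as the multi-turn input example
```

After each reply arrives, calling add_assistant with the reply text keeps the history in step for the next request.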

Streaming

Set "stream": true and the server delivers the response as a sequence of named Server-Sent Events. Each event has both an event: field (the event name) and a data: field (a JSON object). The stream follows the OpenAI Responses API event lifecycle exactly, so any client library that supports that spec will work without changes.

cURL
curl https://api.nexusify.co/v1/responses \
  -H "Authorization: Bearer $NEXUS_API_KEY" \
  -H "Content-Type: application/json" \
  --no-buffer \
  -d '{
    "model": "grok-4.1-fast",
    "input": "Write a haiku about the ocean.",
    "stream": true
  }'
JavaScript
const res = await fetch("https://api.nexusify.co/v1/responses", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.NEXUS_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "grok-4.1-fast",
    input: "Write a haiku about the ocean.",
    stream: true,
  })
});

const reader = res.body.getReader();
const decoder = new TextDecoder();
let buffer = "";

while (true) {
  const { value, done } = await reader.read();
  if (done) break;
  // stream: true keeps multi-byte characters intact when they span two chunks
  buffer += decoder.decode(value, { stream: true });

  // Events are separated by double newlines
  const parts = buffer.split("\n\n");
  buffer = parts.pop() ?? ""; // keep incomplete trailing chunk

  for (const part of parts) {
    const dataLine = part.split("\n").find(l => l.startsWith("data: "));
    if (!dataLine) continue;
    const evt = JSON.parse(dataLine.slice(6));

    if (evt.type === "response.output_text.delta") {
      process.stdout.write(evt.delta); // stream text as it arrives
    }
    if (evt.type === "response.completed") {
      console.log("\n[done]", evt.response.usage);
    }
  }
}
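
The same buffering logic can be written in Python. The sketch below works on text that has already been decoded and, unlike the loop above, also reads the event: field so that callers can dispatch on the event name. parse_sse is a hypothetical helper, and it assumes, as the JavaScript example does, that the server separates lines with "\n" and events with a blank line.

```python
import json
from typing import List, Tuple


def parse_sse(buffer: str) -> Tuple[List[Tuple[str, dict]], str]:
    """Split buffered SSE text into complete (event_name, data) pairs.

    Returns the parsed events and the incomplete trailing text, which the
    caller should prepend to the next chunk it receives.
    """
    *blocks, remainder = buffer.split("\n\n")
    events = []
    for block in blocks:
        name = ""
        data_lines = []
        for line in block.split("\n"):
            if line.startswith("event:"):
                name = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data_lines.append(line[len("data:"):].lstrip())
        if data_lines:
            # Multiple data: lines in one event are joined with newlines.
            events.append((name, json.loads("\n".join(data_lines))))
    return events, remainder
```

Called repeatedly as chunks arrive, with the returned remainder prepended to the next chunk, this yields each event exactly once.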

Streaming Events Reference

The following events are emitted in order during a streaming response. Your client only needs to handle the ones relevant to its use case — most applications only need response.output_text.delta for live text and response.completed for the final state.

| Event name | When it fires | Key fields in data |
| --- | --- | --- |
| response.created | Immediately, before any tokens are generated. | response.id, response.status: "in_progress" |
| response.in_progress | Generation has started. | response.status: "in_progress" |
| response.output_item.added | A new output item (e.g. the message) has been opened. | output_index, item.type, item.role |
| response.content_part.added | A content part inside the message has been opened. | item_id, content_index, part.type: "output_text" |
| response.reasoning_text.delta | Only on reasoning models: a chunk of the thinking trace. | delta (string chunk) |
| response.output_text.delta | A chunk of the assistant's reply text. | delta (string chunk), item_id |
| response.output_text.done | The full reply text has been sent. | text (complete accumulated string) |
| response.content_part.done | The content part is closed. | part.text |
| response.output_item.done | An output item is fully delivered (one per item in output). | item (complete item object) |
| response.completed | The full response is ready. Contains the final response object with usage. | response (complete response object) |
| response.done | Terminal event; the stream is closed after this. | response (same as response.completed) |
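
A client that only cares about the text, the reasoning trace, and the final usage can ignore most of the lifecycle. The following sketch accumulates state from (event name, data) pairs; StreamState is a hypothetical class, and the payload fields it reads (delta, response.usage) are the ones listed in the events reference above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StreamState:
    """Hypothetical accumulator for a streamed response."""

    text: str = ""
    reasoning: str = ""
    usage: Optional[dict] = None
    finished: bool = False

    def handle(self, name: str, data: dict) -> None:
        if name == "response.output_text.delta":
            self.text += data["delta"]
        elif name == "response.reasoning_text.delta":
            self.reasoning += data["delta"]
        elif name == "response.completed":
            self.usage = data["response"].get("usage")
        elif name == "response.done":
            self.finished = True
        # Every other lifecycle event is deliberately ignored here.


state = StreamState()
for name, data in [
    ("response.created", {"response": {"id": "resp_1", "status": "in_progress"}}),
    ("response.output_text.delta", {"delta": "Waves "}),
    ("response.output_text.delta", {"delta": "roll"}),
    ("response.completed", {"response": {"usage": {"total_tokens": 12}}}),
    ("response.done", {"response": {}}),
]:
    state.handle(name, data)
```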

Reasoning Models

When you use a thinking-capable model such as grok-4-thinking, grok-4.1-thinking, or grok-3-thinking, the server emits an additional response.reasoning_text.delta stream event for each chunk of the internal reasoning trace. In the final (non-streaming) response this trace appears as a top-level "type": "reasoning" item at index 0 of the output array, with the actual reply in the "message" item at index 1. Models that sometimes think and sometimes don't (hybrid models such as grok-4 and grok-4.1-expert) will include the reasoning item only when they choose to reason — your code should always check output[i].type rather than assuming a fixed index.
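
One way to follow that advice is to walk the output array and branch on each item's type. The sketch below returns the reasoning summary (or None when the model did not reason) together with the reply. split_output is a hypothetical helper; it relies only on the item shapes shown in the two example responses in this section.

```python
from typing import Optional, Tuple


def split_output(response: dict) -> Tuple[Optional[str], Optional[str]]:
    """Return (reasoning_summary, reply_text) from a response object."""
    reasoning: Optional[str] = None
    reply: Optional[str] = None
    for item in response.get("output", []):
        kind = item.get("type")
        if kind == "reasoning":
            # Reasoning items carry their text in a summary array.
            reasoning = "".join(part.get("text", "") for part in item.get("summary", []))
        elif kind == "message":
            reply = "".join(part.get("text", "") for part in item.get("content", []))
    return reasoning, reply
```

The same call works for standard, reasoning, and hybrid models, because it never depends on the position of an item in the array.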