Introduction
Welcome to the Nexusify API. We provide a unified interface to the world's most capable AI models — text generation, vision, tool calling, and image synthesis — all through a single, OpenAI-compatible integration point.
Base URL
All API requests are made to the base URL below. Every endpoint listed in this documentation is relative to it.
https://api.nexusify.co
What's Included
Unified Text API
Switch among 99+ models, including Gemini, GPT-4, Llama, Mistral, Grok, and DeepSeek, with a single parameter.
Image Generation
Access Flux, ZImage, Imagen 4, Klein, and GPT Image all via one standardized endpoint.
OpenAI-Compatible
Drop-in replacement for existing OpenAI integrations — no SDK changes needed.
Obtaining an API Key
To start building with Nexus, you'll need a unique API key. The process takes under a minute and requires no credit card.
Step 1 — Start Building
Navigate to nexusify.co and click the "Start building" button on the homepage.
Step 2 — Authenticate
You'll be redirected to the authentication portal at login.nexusify.co. You can either complete the CAPTCHA verification or sign in with Google or Discord.
Step 3 — Open the Dashboard
After a successful login you'll land on your developer dashboard.
Step 4 — Copy Your Key
Scroll to the "API Credentials" section. Your key is obscured by default. Click "Copy Secret Key" to copy it to your clipboard.
Authentication
Every request to the Nexusify API must include your API key in the Authorization header using the Bearer scheme.
Request Header
```
Authorization: Bearer YOUR_API_KEY
```
Example with Environment Variable
The idiomatic approach is to store your key in an environment variable and reference it at runtime, as shown below.
```bash
# Add to your .env file or shell profile
export NEXUS_API_KEY="your_key_here"
```

```python
import os

api_key = os.getenv("NEXUS_API_KEY")
```

```javascript
const apiKey = process.env.NEXUS_API_KEY;
```
Pricing & Credits
Nexus uses a credit-based system. Every account receives $25 in free credits each week, automatically renewed on Mondays — no credit card required to start.
How Credits Work
Weekly Free Credits
$25 in free credits every Monday. Automatically topped up — no action needed on your part.
Paid Credits
Purchase additional credits at any time from your dashboard. They never expire and are only drawn after free credits are exhausted.
Pay-as-you-go
You're billed only for what you use. Credits for failed requests are automatically refunded.
Text Model Pricing
Text models are billed per million tokens, with input and output priced separately. All prices are in USD.
| Model | Input / 1M tokens | Output / 1M tokens |
|---|---|---|
| gpt-4 | $30.00 | $60.00 |
| grok-3 | $3.00 | $15.00 |
| llama-3.1-405b-instruct | $5.00 | $15.00 |
| gemini-2.5-flash | $0.15 | $0.60 |
| gpt-4o-mini | $0.15 | $0.60 |
| mistral-large-3 | $2.00 | $6.00 |
| deepseek-v3.2 | $0.27 | $1.10 |
| kimi-k2.5 | $0.80 | $3.00 |
| qwen3-235b-a22b | $0.20 | $0.60 |

All other models follow per-token pricing. Refer to the full model list for details.
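As an illustration of the billing math, the sketch below computes the cost of a single request from its token usage. Prices are taken from the table above; the helper name is ours, not part of the API.

```python
# USD per 1M tokens, as (input, output), from the pricing table above.
PRICES = {
    "gpt-4": (30.00, 60.00),
    "gemini-2.5-flash": (0.15, 0.60),
    "deepseek-v3.2": (0.27, 1.10),
}

def request_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Return the USD cost of one request for the given token counts."""
    input_price, output_price = PRICES[model]
    return (prompt_tokens * input_price + completion_tokens * output_price) / 1_000_000
```

For example, a gemini-2.5-flash call with 22 prompt and 148 completion tokens costs (22 × 0.15 + 148 × 0.60) / 1,000,000 ≈ $0.0000921.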
Image Model Pricing
Image generation is billed at a flat rate per request. The base price applies to 512×512 px; larger resolutions scale proportionally.
| Model | Base (512×512) | Large (1024×1024) |
|---|---|---|
| flux | $0.003 | $0.012 |
| zimage | $0.005 | $0.020 |
| klein | $0.006 | $0.024 |
| gptimage | $0.040 | $0.160 |
| qwen-image | $0.020 | $0.080 |
| wan-image | $0.008 | $0.032 |
| p-image | $0.055 | $0.220 |
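"Scales proportionally" here means the price grows with pixel area relative to the 512×512 base, which is consistent with the Large column above (1024×1024 has 4× the pixels and 4× the price). A minimal sketch of that arithmetic:

```python
def image_price(base_price_usd: float, width: int, height: int) -> float:
    """Scale the 512x512 base price by pixel area, per the table above."""
    return base_price_usd * (width * height) / (512 * 512)

# 1024x1024 has 4x the pixels of 512x512, so flux costs 0.003 * 4 = 0.012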
Requests made with an insufficient balance fail with a 402 error. Purchase credits at dash.nexusify.co.
Limits & Errors
Understanding the operational limits and HTTP error responses helps you build more resilient integrations.
Usage Limits
Requests are rate-limited to 130 per minute; exceeding this limit returns a 429 error (see below).
Error Codes
| Code | Status | Description |
|---|---|---|
| 400 | Bad Request | Missing required parameters or invalid request format. |
| 401 | Unauthorized | Invalid or missing API key in the Authorization header. |
| 402 | Insufficient Credits | Your credit balance is too low. Purchase credits at the dashboard. |
| 429 | Too Many Requests | You have exceeded the 130 requests per minute rate limit. |
| 500 | Internal Server Error | An unexpected error occurred on our end. Please retry after a short wait. |
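A resilient integration can retry 429 and 500 responses with exponential backoff. The sketch below is illustrative; the retry policy (4 retries, 1s base, 30s cap) is our choice, not something the API prescribes.

```python
import time

import requests

RETRYABLE = {429, 500}  # rate limited / transient server error

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff: 1s, 2s, 4s, ... capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt))

def post_with_retry(url: str, max_retries: int = 4, **kwargs) -> requests.Response:
    """POST, retrying retryable status codes with exponential backoff."""
    for attempt in range(max_retries + 1):
        resp = requests.post(url, **kwargs)
        if resp.status_code not in RETRYABLE or attempt == max_retries:
            return resp
        time.sleep(backoff_delay(attempt))
    return resp
```

Note that 400, 401, and 402 are not retried: they indicate a problem with the request itself, not a transient failure.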
Chat Completions
Generate text using 99 distinct AI models through a single, OpenAI-compatible endpoint. The request and response format is identical to the OpenAI Chat Completions API, so existing SDKs work without modification.
Parameters
| Name | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | The model ID to use, e.g. gpt-4, gemini-2.5-flash, mistral-large-3. |
| messages | array | Yes | Array of message objects, each with a role ("system", "user", "assistant") and content. |
| stream | boolean | No | If true, responses are streamed as Server-Sent Events. |
| temperature | float | No | Sampling temperature between 0.0 and 2.0. Defaults to 0.7. |
| max_tokens | integer | No | Maximum number of tokens to generate in the response. |
Example Request
```bash
curl https://api.nexusify.co/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $NEXUS_API_KEY" \
  -d '{
    "model": "gemini-2.5-flash",
    "messages": [
      {"role": "user", "content": "Explain general relativity to a 10-year-old."}
    ]
  }'
```

```python
import requests

response = requests.post(
    "https://api.nexusify.co/v1/chat/completions",
    headers={
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    },
    json={
        "model": "gemini-2.5-flash",
        "messages": [
            {"role": "user", "content": "Explain general relativity to a 10-year-old."}
        ]
    }
)
print(response.json()["choices"][0]["message"]["content"])
```

```javascript
const res = await fetch("https://api.nexusify.co/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.NEXUS_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "gemini-2.5-flash",
    messages: [
      { role: "user", content: "Explain general relativity to a 10-year-old." }
    ]
  })
});
const data = await res.json();
console.log(data.choices[0].message.content);
```
Response Format
The response mirrors the OpenAI Chat Completions response schema.
```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "gemini-2.5-flash",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Imagine space is like a stretched rubber sheet..."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 22,
    "completion_tokens": 148,
    "total_tokens": 170
  }
}
```
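When stream is true, chunks instead arrive as Server-Sent Events, one data: line per chunk. The sketch below consumes them with requests; the delta-style chunk shape is an assumption based on OpenAI compatibility, and the helper names are ours.

```python
import json

import requests

def parse_sse_line(line: str):
    """Decode one SSE `data:` line; returns None for [DONE] and non-data lines."""
    if not line.startswith("data: "):
        return None
    payload = line[len("data: "):]
    if payload == "[DONE]":
        return None
    return json.loads(payload)

def stream_chat(api_key: str, model: str, content: str):
    """Yield content deltas from a streaming chat completion."""
    with requests.post(
        "https://api.nexusify.co/v1/chat/completions",
        headers={"Authorization": f"Bearer {api_key}"},
        json={
            "model": model,
            "stream": True,
            "messages": [{"role": "user", "content": content}],
        },
        stream=True,
    ) as r:
        for raw in r.iter_lines(decode_unicode=True):
            chunk = parse_sse_line(raw or "")
            if chunk:
                yield chunk["choices"][0].get("delta", {}).get("content") or ""
```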
Available Models
A complete list of the 99 text models currently supported by Nexusify. Pass any Model ID as the model parameter in your request.
| Provider | Model ID | Capabilities | Input ($/1M) | Output ($/1M) |
|---|---|---|---|---|
Vision (Image Input)
Several models support multimodal input, allowing you to pass images alongside text in the messages array using the standard OpenAI content-parts format.
Vision-capable models include gemini-3-flash-preview, kimi-k2.5, llama-3.2-90b-vision-instruct, mistral-large-3, mistral-small-3.1, and others tagged Vision in the model list.
How It Works
Instead of passing a plain string as content, you pass an array of content parts. Each part is either a text block or an image_url block. Images can be sent as public URLs or as base64-encoded data URIs.
Example Request
```bash
curl https://api.nexusify.co/v1/chat/completions \
  -H "Authorization: Bearer $NEXUS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3-flash-preview",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "What is in this image?"},
        {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}}
      ]
    }]
  }'
```

```python
response = requests.post(
    "https://api.nexusify.co/v1/chat/completions",
    headers={"Authorization": f"Bearer {api_key}"},
    json={
        "model": "gemini-3-flash-preview",
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/photo.jpg"}}
            ]
        }]
    }
)
```
Sending a Base64 Image
To send an image from disk, encode it as a base64 data URI and place it in the url field:
```json
{
  "type": "image_url",
  "image_url": {
    "url": "data:image/jpeg;base64,/9j/4AAQSkZJRgAB..."
  }
}
```
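Building that data URI from a file on disk is a few lines of standard-library Python; the helper name and default MIME type here are ours, not part of the API.

```python
import base64

def to_data_uri(path: str, mime: str = "image/jpeg") -> str:
    """Read a local image and encode it as a base64 data URI."""
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```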
If you send image content to a model without vision support (e.g. deepseek-v3.2), the image content is silently stripped and only the text portion of the message is forwarded. No error is returned.
Tool Calling
Tool calling — also known as function calling — lets the model invoke your application's functions during a conversation. Instead of replying with plain text, the model returns a structured JSON payload describing which function to call and with what arguments. Your code executes the function and feeds the result back, allowing the model to produce a final, grounded answer.
The standard OpenAI tools schema works out of the box. Look for the Tools tag in the model list for models with confirmed tool-calling support.
How It Works
The full conversation loop has four steps.
| Step | Who acts | What happens |
|---|---|---|
| 1. Send request | Your app | POST to /v1/chat/completions with a tools array describing available functions. |
| 2. Model calls a tool | Model | Returns finish_reason: "tool_calls" and a tool_calls array with the function name and JSON arguments. |
| 3. Execute & return result | Your app | Run the function locally. Append the assistant message (with tool_calls) and a new role: "tool" message containing the result. |
| 4. Final answer | Model | Reads the tool result and produces a natural-language reply to the user. |
Request Parameters
| Parameter | Type | Description |
|---|---|---|
| tools | array | List of function definitions. Each entry must have type: "function" and a function object with name, description, and a JSON Schema parameters object. |
| tool_choice | string / object | "auto" — model decides when to call a tool (default). "none" — tools are defined but the model must not call any. {"type":"function","function":{"name":"..."}} — force a specific function. |
| parallel_tool_calls | boolean | When true (default), the model may invoke multiple tools in a single turn. Set to false to enforce sequential calls. |
Step 1 — Initial Request
```bash
curl https://api.nexusify.co/v1/chat/completions \
  -H "Authorization: Bearer $NEXUS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v3.2",
    "messages": [
      {"role": "user", "content": "What is the weather in Madrid right now?"}
    ],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather for a given city.",
        "parameters": {
          "type": "object",
          "properties": {
            "city": {"type": "string", "description": "City name, e.g. Madrid"},
            "units": {"type": "string", "enum": ["celsius", "fahrenheit"], "description": "Temperature unit"}
          },
          "required": ["city"]
        }
      }
    }],
    "tool_choice": "auto"
  }'
```

```python
import requests, json

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a given city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "units": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["city"]
        }
    }
}]

messages = [{"role": "user", "content": "What is the weather in Madrid right now?"}]

r = requests.post(
    "https://api.nexusify.co/v1/chat/completions",
    headers={"Authorization": f"Bearer {api_key}"},
    json={"model": "deepseek-v3.2", "messages": messages, "tools": tools}
)
```

```javascript
const tools = [{
  type: "function",
  function: {
    name: "get_weather",
    description: "Get the current weather for a given city.",
    parameters: {
      type: "object",
      properties: {
        city: { type: "string" },
        units: { type: "string", enum: ["celsius", "fahrenheit"] }
      },
      required: ["city"]
    }
  }
}];

const messages = [{ role: "user", content: "What is the weather in Madrid right now?" }];

const response = await fetch("https://api.nexusify.co/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${apiKey}`,
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    model: "deepseek-v3.2",
    messages,
    tools,
    tool_choice: "auto"
  })
});
const data = await response.json();
```
Step 2 — Model Response (tool call)
When the model wants to call a tool it sets finish_reason to "tool_calls" and returns a tool_calls array. The arguments field is always a JSON-encoded string.
```json
{
  "id": "chatcmpl-xyz",
  "choices": [{
    "finish_reason": "tool_calls",
    "message": {
      "role": "assistant",
      "content": null,
      "tool_calls": [{
        "id": "call_a1b2c3",
        "type": "function",
        "function": {
          "name": "get_weather",
          "arguments": "{\"city\": \"Madrid\", \"units\": \"celsius\"}"
        }
      }]
    }
  }]
}
```
Step 3 — Execute & Return Result
Parse the arguments string, call your function, and send a follow-up request. You must include the assistant's tool-call message in the history, followed by a new role: "tool" message.
```python
assistant_msg = r.json()["choices"][0]["message"]
tool_call = assistant_msg["tool_calls"][0]
args = json.loads(tool_call["function"]["arguments"])

# Execute your function
result = get_weather(city=args["city"], units=args.get("units", "celsius"))

# Build follow-up messages
messages.append(assistant_msg)  # keep assistant turn with tool_calls
messages.append({
    "role": "tool",
    "tool_call_id": tool_call["id"],
    "content": json.dumps(result)
})

# Second request — get final answer
final = requests.post(
    "https://api.nexusify.co/v1/chat/completions",
    headers={"Authorization": f"Bearer {api_key}"},
    json={"model": "deepseek-v3.2", "messages": messages, "tools": tools}
)
print(final.json()["choices"][0]["message"]["content"])
```

```javascript
const assistantMsg = data.choices[0].message;
const toolCall = assistantMsg.tool_calls[0];
const args = JSON.parse(toolCall.function.arguments);

// Execute your function
const result = await getWeather(args.city, args.units ?? "celsius");

// Build follow-up messages
const followUp = [
  ...messages,
  assistantMsg, // keep the assistant tool_calls turn
  {
    role: "tool",
    tool_call_id: toolCall.id,
    content: JSON.stringify(result)
  }
];

// Second request — get final answer
const finalRes = await fetch("https://api.nexusify.co/v1/chat/completions", {
  method: "POST",
  headers: { "Authorization": `Bearer ${apiKey}`, "Content-Type": "application/json" },
  body: JSON.stringify({ model: "deepseek-v3.2", messages: followUp, tools })
});
const answer = (await finalRes.json()).choices[0].message.content;
console.log(answer);
```
Step 4 — Final Model Response
```json
{
  "choices": [{
    "finish_reason": "stop",
    "message": {
      "role": "assistant",
      "content": "The current temperature in Madrid is 28°C with clear skies."
    }
  }]
}
```
Defining Good Tools
The quality of your function description directly affects how reliably the model will call it. Follow these guidelines for best results.
| Practice | Why it matters |
|---|---|
| Write a clear, specific description | The model reads this to decide when to call the function. Vague descriptions lead to missed or wrong calls. |
| Describe every parameter | Include a description field for each property in the JSON Schema so the model knows what value to fill in. |
| Use enum for fixed values | Prevents the model from inventing values for parameters that have a known set of options. |
| Mark required parameters | List every field the function cannot work without in the required array. |
| Keep names snake_case | Function and parameter names like get_current_weather are more reliably understood than camelCase or abbreviations. |
Supported Models
The following models have confirmed native tool-calling support. Any other model on the API may also accept the tools parameter if the underlying provider supports it.
| Provider | Models |
|---|---|
| DeepSeek | deepseek-v3.1, deepseek-v3.2 |
| OpenAI | gpt-4, gpt-4o-mini, gpt-5 series, gpt-5.x series |
| xAI | grok-3, grok-4, grok-4.1 series |
| Moonshot | kimi-k2, kimi-k2.5 |
| MiniMax | minimax-m2, minimax-m2.5 |
| Mistral AI | mistral-small-3.2, mistral-nemotron, mistral-large-3, ministral-3-8b, ministral-3-14b |
| Cohere | command-a-3, command-r-plus, command-r |
| Meta / NVIDIA | llama-3.3-70b-instruct, llama-3.1-405b-instruct, llama-3.1-8b-instruct, nemotron-super-49b-v1, nemotron-super-49b-v1.5 |
| Alibaba | qwen3-235b-a22b, qwen3-next-80b |
| Google | gemini-2.5-flash, gemini-3-flash-preview |
| RNJ AI | rnj-1 |
You can also use the official OpenAI SDK: configure it with baseURL: "https://api.nexusify.co/v1" and your Nexus key — tool calling works identically to the native OpenAI SDK.
Generate Image
Create images from text prompts using state-of-the-art diffusion and synthesis models. A single endpoint provides access to all supported image models.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| prompt | string | — | A detailed text description of the image to generate. Required. |
| model | string | flux | The image model ID to use. See Image Models for the full list. |
| width | integer | 512 | Output width in pixels. Maximum 2048. |
| height | integer | 512 | Output height in pixels. Maximum 2048. |
Example Request
```bash
curl https://api.nexusify.co/v1/generate-image \
  -H "Authorization: Bearer $NEXUS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Cyberpunk city with neon rain, cinematic lighting",
    "model": "flux",
    "width": 1024,
    "height": 1024
  }'
```

```javascript
const res = await fetch("https://api.nexusify.co/v1/generate-image", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.NEXUS_API_KEY}`,
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    prompt: "Cyberpunk city with neon rain, cinematic lighting",
    model: "flux",
    width: 1024,
    height: 1024
  })
});
const data = await res.json();
console.log(data.imageUrl);
```

```python
response = requests.post(
    "https://api.nexusify.co/v1/generate-image",
    headers={"Authorization": f"Bearer {api_key}"},
    json={
        "prompt": "Cyberpunk city with neon rain, cinematic lighting",
        "model": "flux",
        "width": 1024,
        "height": 1024
    }
)
image_url = response.json()["imageUrl"]
```
Response
A successful response includes the image URL, metadata about the request, and the current credit balance of your account. The image is hosted temporarily — it expires after 2 hours.
```json
{
  "success": true,
  "model": "flux",
  "prompt": "A futuristic city at sunset",
  "size": "512x512",
  "imageUrl": "/data/generated-images/pollinations-1735000000000-a1b2c3d4e.png",
  "imagePath": "/data/generated-images/pollinations-1735000000000-a1b2c3d4e.png",
  "expiresIn": "2 hours",
  "message": "Image generated successfully",
  "user": {
    "email": "you@example.com",
    "plan": "free",
    "usageRemaining": 24.85,
    "nextGenerationAvailableIn": 0
  }
}
```
| Field | Type | Description |
|---|---|---|
| imageUrl | string | Relative path to the generated image. To access it, prepend https://api.nexusify.co — e.g. https://api.nexusify.co/data/generated-images/pollinations-....png |
| imagePath | string | Same as imageUrl. Provided for convenience. |
| size | string | Dimensions of the generated image, e.g. "512x512". |
| expiresIn | string | How long the image will remain accessible. Images are deleted after 2 hours. |
| user.usageRemaining | float | Your remaining credit balance in USD after this request. |
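Because imageUrl is a relative path, prepend the API origin before fetching, and fetch within the 2-hour retention window. A minimal sketch (helper names are ours):

```python
from urllib.parse import urljoin

import requests

API_ORIGIN = "https://api.nexusify.co"

def absolute_image_url(image_url: str) -> str:
    """Turn the relative imageUrl from the response into a fetchable URL."""
    return urljoin(API_ORIGIN, image_url)

def download_image(image_url: str, dest: str) -> None:
    """Fetch the generated image before its 2-hour expiry and save it to disk."""
    resp = requests.get(absolute_image_url(image_url), timeout=60)
    resp.raise_for_status()
    with open(dest, "wb") as f:
        f.write(resp.content)
```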
Image Gen (OpenAI Format)
An OpenAI-compatible image generation endpoint. If you are already using the OpenAI Images API, you can point your client at Nexusify with zero code changes — just swap the base URL and API key.
This endpoint mirrors POST /v1/images/generations from the OpenAI API. Existing SDKs and integrations work without modification — just set base_url to https://api.nexusify.co.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| prompt | string | — | A text description of the desired image. Required. |
| model | string | flux | The image model ID to use. See Image Models for the full list. |
| size | string | "1024x1024" | Image dimensions as "WxH", e.g. "512x512" or "1024x1792". Larger sizes are billed proportionally. |
| response_format | string | "url" | "url" returns a hosted image URL; "b64_json" returns the raw image as a base-64 string. |
| n | integer | 1 | Number of images to generate. Currently always 1. |
| stream | boolean | false | If true, the server emits Server-Sent Events while generating the image, then delivers the final result as the last event. |
Response
A successful response returns a JSON object that mirrors the OpenAI Images API response shape:
```json
{
  "created": 1712345678,
  "data": [
    {
      "url": "https://api.nexusify.co/v1/images/gen-abc123.png",
      "revised_prompt": "A majestic mountain landscape at sunset..."
    }
  ]
}
```
When response_format is "b64_json", the url field is omitted and a b64_json field containing the base-64 encoded PNG is returned instead.
Generated images are accessible via GET /v1/images/<filename> and are retained for 2 hours after creation.
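When you request response_format: "b64_json", writing the decoded image to disk is a single base64 call. A small sketch (the helper name is ours):

```python
import base64

def save_b64_image(b64_json: str, dest: str) -> None:
    """Decode the base-64 PNG payload from the response and write it to disk."""
    with open(dest, "wb") as f:
        f.write(base64.b64decode(b64_json))
```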
Example Request
```bash
curl https://api.nexusify.co/v1/images/generations \
  -H "Authorization: Bearer $NEXUS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "A majestic mountain landscape at golden hour",
    "model": "flux",
    "size": "1024x1024",
    "response_format": "url"
  }'
```

```python
from openai import OpenAI

client = OpenAI(
    api_key="$NEXUS_API_KEY",
    base_url="https://api.nexusify.co"
)

response = client.images.generate(
    prompt="A majestic mountain landscape at golden hour",
    model="flux",
    size="1024x1024",
    response_format="url",
    n=1
)
print(response.data[0].url)
```

```javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "$NEXUS_API_KEY",
  baseURL: "https://api.nexusify.co",
});

const response = await client.images.generate({
  prompt: "A majestic mountain landscape at golden hour",
  model: "flux",
  size: "1024x1024",
  response_format: "url",
  n: 1,
});
console.log(response.data[0].url);
```
Streaming
Set stream: true to receive Server-Sent Events while the image is being generated. Each event carries a status field; the final event contains the complete response object.
```bash
curl https://api.nexusify.co/v1/images/generations \
  -H "Authorization: Bearer $NEXUS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Neon-lit cyberpunk city at midnight",
    "model": "klein",
    "size": "1024x1024",
    "stream": true
  }'

# Server-Sent Events output:
# data: {"status":"generating","progress":0.2}
# data: {"status":"generating","progress":0.7}
# data: {"created":1712345678,"data":[{"url":"https://api.nexusify.co/v1/images/gen-xyz.png"}]}
```
Images returned via url can be fetched with a plain GET request — no authentication required. They expire after 2 hours.
Image Models
Seven image models are currently available, each optimized for different use cases. Pass the ID as the model parameter in your generate-image request.
| ID | Description | Best For | Base Price (512×512) |
|---|---|---|---|
Text Generate Legacy
A simpler, more direct alternative to /chat/completions. Instead of wrapping everything in a messages array, you can pass a plain prompt string and get a plain completion string back — no extra nesting required. It supports the same 99 models, streaming, conversation history, and all generation parameters.
This endpoint is retained for backward compatibility. For new, OpenAI-compatible integrations, prefer /chat/completions instead.
Parameters
| Name | Type | Required | Default | Description |
|---|---|---|---|---|
| model | string | Yes | — | The model ID to use. Any model from the full model list is supported. |
| prompt | string | * | — | A plain text message to send to the model. Required if messages is not provided. |
| messages | array | * | — | Full conversation history in OpenAI format. Use this instead of prompt for multi-turn conversations. |
| systemInstruction | string | No | — | A system-level instruction that shapes the model's behavior for the entire conversation. |
| temperature | float | No | 0.7 | Controls randomness. 0.0 is deterministic, 2.0 is very creative. |
| max_tokens | integer | No | 300 | Maximum number of tokens to generate. |
| top_p | float | No | 1.0 | Nucleus sampling threshold. Values below 1.0 restrict the token pool. |
| stop | string / array | No | null | One or more sequences that will stop generation when encountered. |
| stream | boolean | No | false | If true, the response is streamed as Server-Sent Events. |
| userid | string | No | — | An identifier for the user. When provided, the server stores conversation history and automatically includes it in future requests with the same userid. |
* Either prompt or messages must be provided.
Example Request
```bash
curl https://api.nexusify.co/v1/text/generate \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $NEXUS_API_KEY" \
  -d '{
    "model": "gemini-2.5-flash",
    "prompt": "Explain how black holes form in two sentences.",
    "temperature": 0.7,
    "max_tokens": 150
  }'
```

```python
import requests

response = requests.post(
    "https://api.nexusify.co/v1/text/generate",
    headers={
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    },
    json={
        "model": "gemini-2.5-flash",
        "prompt": "Explain how black holes form in two sentences.",
        "temperature": 0.7,
        "max_tokens": 150,
    }
)
print(response.json()["completion"])
```

```javascript
const res = await fetch("https://api.nexusify.co/v1/text/generate", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.NEXUS_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "gemini-2.5-flash",
    prompt: "Explain how black holes form in two sentences.",
    temperature: 0.7,
    max_tokens: 150,
  })
});
const data = await res.json();
console.log(data.completion);
```
Response
On success, the response contains a completion field with the model's reply as a plain string — no nested arrays to unwrap.
```json
{
  "success": true,
  "model": "gemini-2.5-flash",
  "completion": "Black holes form when a massive star exhausts its nuclear fuel and collapses under gravity...",
  "reasoning": "...",
  "userid": null,
  "historyLength": 2,
  "messagesUsed": 1,
  "usage": {
    "prompt_tokens": 14,
    "completion_tokens": 38,
    "total_tokens": 52
  },
  "user": {
    "email": "you@example.com",
    "plan": "free",
    "usageRemaining": 24.99
  }
}
```
| Field | Type | Description |
|---|---|---|
| completion | string | The model's generated reply. |
| reasoning | string | Chain-of-thought output, only present on reasoning models (e.g. grok-3-mini, deepseek-v3.2). |
| historyLength | integer | Total messages in the conversation after this turn. Only meaningful when userid is used. |
| messagesUsed | integer | Number of messages sent to the model in this request. |
| usage | object | Token breakdown: prompt_tokens, completion_tokens, total_tokens. |
| user.usageRemaining | float | Your remaining credit balance in USD. |
Streaming
Set "stream": true to receive the response as a sequence of Server-Sent Events. Each event delivers a chunk of the completion as it's generated. The stream ends with a data: [DONE] message.
```bash
curl https://api.nexusify.co/v1/text/generate \
  -H "Authorization: Bearer $NEXUS_API_KEY" \
  -H "Content-Type: application/json" \
  --no-buffer \
  -d '{
    "model": "gpt-5-mini",
    "prompt": "Write a short story about a robot learning to paint.",
    "stream": true,
    "max_tokens": 400
  }'
```

```javascript
const res = await fetch("https://api.nexusify.co/v1/text/generate", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.NEXUS_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "gpt-5-mini",
    prompt: "Write a short story about a robot learning to paint.",
    stream: true,
    max_tokens: 400,
  })
});

// Simple SSE reader; assumes each `data:` line arrives within one chunk.
const reader = res.body.getReader();
const decoder = new TextDecoder();
while (true) {
  const { value, done } = await reader.read();
  if (done) break;
  const chunk = decoder.decode(value);
  for (const line of chunk.split("\n")) {
    if (!line.startsWith("data: ")) continue;
    const payload = line.slice(6);
    if (payload === "[DONE]") break;
    const json = JSON.parse(payload);
    process.stdout.write(json.choices[0]?.delta?.content ?? "");
  }
}
```
Conversation History
Pass a userid string to enable persistent conversation memory. The server saves each exchange and automatically prepends the history on your next request with the same userid and model — so the model always has context from prior turns without you managing it manually.
```jsonc
// First message
{ "model": "gpt-5-mini", "prompt": "My name is Alex.", "userid": "user_42" }

// Follow-up — model remembers the name
{ "model": "gpt-5-mini", "prompt": "What's my name?", "userid": "user_42" }
// → completion: "Your name is Alex."
```
To clear the stored history for a user, send a DELETE request to /v1/text/history/{userid}. This resets the conversation across all models at once.
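A sketch of calling that DELETE endpoint with requests; the URL pattern comes from the paragraph above, while the helper names are ours.

```python
import requests

def history_url(userid: str) -> str:
    """Build the history endpoint URL for a given user."""
    return f"https://api.nexusify.co/v1/text/history/{userid}"

def clear_history(api_key: str, userid: str) -> None:
    """Delete stored conversation history for this userid across all models."""
    resp = requests.delete(
        history_url(userid),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    resp.raise_for_status()
```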
Responses API
The Responses API is OpenAI's modern alternative to Chat Completions. Instead of a flat choices array, it returns a structured output array of typed items — separate objects for the assistant message and (when present) for reasoning traces. This makes it especially well-suited for agentic applications, multi-step workflows, and any integration that already targets the OpenAI Responses API shape.
Parameters
| Name | Type | Required | Default | Description |
|---|---|---|---|---|
| model | string | Yes | — | The model ID to use. Any model from the full model list is supported. |
| input | string / array | Yes | — | The user's input. Can be a plain string, an array of {role, content} objects (same shape as messages in Chat Completions), or an array of Responses API message objects with typed content parts. |
| instructions | string | No | — | A system-level instruction prepended to the conversation. Equivalent to a system message in Chat Completions. |
| max_output_tokens | integer | No | — | Maximum number of tokens to generate. Maps to max_tokens internally. |
| temperature | float | No | 1.0 | Sampling temperature between 0.0 and 2.0. |
| top_p | float | No | 1.0 | Nucleus sampling threshold. |
| stream | boolean | No | false | If true, the response is delivered as a sequence of named Server-Sent Events. |
| tools | array | No | — | List of tool definitions available to the model (same schema as Chat Completions). |
| tool_choice | string / object | No | "auto" | Controls whether the model calls a tool. |
| store | boolean | No | true | Whether this response should be stored. Reflected in the response envelope but not processed server-side. |
| metadata | object | No | {} | Arbitrary key/value pairs echoed back in the response. Useful for tagging requests. |
| previous_response_id | string | No | — | ID of a prior response to continue from. Echoed in the response envelope. |
| user | string | No | — | An identifier for the end-user. Used for audit purposes; not stored as conversation history. |
Example Request
```bash
curl https://api.nexusify.co/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $NEXUS_API_KEY" \
  -d '{
    "model": "grok-4",
    "input": "Explain the twin paradox in simple terms.",
    "instructions": "You are a friendly physics tutor.",
    "temperature": 0.7,
    "max_output_tokens": 300
  }'
```

```python
import requests

response = requests.post(
    "https://api.nexusify.co/v1/responses",
    headers={
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    },
    json={
        "model": "grok-4",
        "input": "Explain the twin paradox in simple terms.",
        "instructions": "You are a friendly physics tutor.",
        "temperature": 0.7,
        "max_output_tokens": 300,
    }
)
data = response.json()

# The assistant message is the last output item
text = data["output"][-1]["content"][0]["text"]
print(text)
```

```javascript
const res = await fetch("https://api.nexusify.co/v1/responses", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.NEXUS_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "grok-4",
    input: "Explain the twin paradox in simple terms.",
    instructions: "You are a friendly physics tutor.",
    temperature: 0.7,
    max_output_tokens: 300,
  })
});
const data = await res.json();

// Grab the assistant message output item
const msg = data.output.find(o => o.type === "message");
console.log(msg.content[0].text);
```
Response Format
Rather than a flat choices array, the response wraps everything in an output array of typed items. A standard (non-reasoning) response contains one item of "type": "message". When the model produces a reasoning trace, a "type": "reasoning" item appears first at index 0, and the message shifts to index 1.
```json
{
  "id": "resp_01j9abc123",
  "object": "response",
  "created_at": 1735000000,
  "model": "grok-4",
  "status": "completed",
  "output": [{
    "type": "message",
    "id": "msg_01abc",
    "role": "assistant",
    "status": "completed",
    "content": [{
      "type": "output_text",
      "text": "Imagine twins Alice and Bob..."
    }]
  }],
  "usage": {
    "input_tokens": 22,
    "output_tokens": 84,
    "total_tokens": 106
  },
  "temperature": 0.7,
  "top_p": 1,
  "max_output_tokens": 300,
  "error": null
}
```

```json
{
  "id": "resp_01j9xyz456",
  "object": "response",
  "model": "grok-4-thinking",
  "status": "completed",
  "output": [
    {
      "type": "reasoning",
      "id": "rs_01abc",
      "summary": [{
        "type": "summary_text",
        "text": "Let me think about special relativity step by step..."
      }]
    },
    {
      "type": "message",
      "id": "msg_01def",
      "role": "assistant",
      "status": "completed",
      "content": [{
        "type": "output_text",
        "text": "Imagine twins Alice and Bob..."
      }]
    }
  ],
  "usage": {
    "input_tokens": 22,
    "output_tokens": 312,
    "total_tokens": 334
  }
}
```
| Field | Type | Description |
|---|---|---|
| id | string | Unique response identifier, prefixed resp_. |
| object | string | Always "response". |
| status | string | Always "completed" for non-streaming responses. |
| output | array | Ordered list of output items. A "reasoning" item (with a summary array) appears first when the model produces a thinking trace; the "message" item always appears last. |
| output[].content[].text | string | The assistant's final reply text. Access via output.find(o => o.type === "message").content[0].text. |
| usage.input_tokens | integer | Tokens consumed by the prompt (estimated for providers that omit usage). |
| usage.output_tokens | integer | Tokens in the generated reply. |
| error | null / object | null on success; an error object on failure. |
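As a worked example of the fields above, here is a small helper that pulls the reply text from either response shape. This is a sketch; the name extract_text is ours, not part of any SDK. It selects the "message" item by type instead of assuming a fixed index, so it works whether or not a reasoning item is present.

```python
def extract_text(data: dict) -> str:
    """Return the assistant's reply text from a /v1/responses payload.

    Selects the "message" output item by type, so it handles both
    plain responses and reasoning responses (where a "reasoning"
    item occupies index 0).
    """
    msg = next(item for item in data["output"] if item["type"] == "message")
    return msg["content"][0]["text"]
```

Applied to either of the example payloads above, this returns "Imagine twins Alice and Bob...".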
Multi-turn Conversations
Pass input as an array of message objects to send a full conversation history, just as you would with the messages parameter in Chat Completions. Each item should have a role ("user" or "assistant") and a content string.
{
"model": "gemini-2.5-flash",
"instructions": "You are a helpful assistant.",
"input": [
{ "role": "user", "content": "My name is Alex." },
{ "role": "assistant", "content": "Nice to meet you, Alex!" },
{ "role": "user", "content": "What is my name?" }
]
}
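The API is stateless: your application keeps the running history and resends the whole thing each turn. A minimal Python sketch of that bookkeeping (the helper name build_payload is ours, purely illustrative):

```python
def build_payload(model: str, instructions: str, history: list, user_message: str) -> dict:
    """Append the new user turn to the running history and assemble
    a /v1/responses request body containing the full conversation."""
    history.append({"role": "user", "content": user_message})
    return {
        "model": model,
        "instructions": instructions,
        "input": list(history),  # copy so later turns don't mutate this payload
    }

history = [
    {"role": "user", "content": "My name is Alex."},
    {"role": "assistant", "content": "Nice to meet you, Alex!"},
]
payload = build_payload(
    "gemini-2.5-flash", "You are a helpful assistant.", history, "What is my name?"
)
```

After each response arrives, append the assistant's reply to history before the next call, so the model keeps full context.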
Streaming
Set "stream": true and the server delivers the response as a sequence of named Server-Sent Events. Each event has both an event: field (the event name) and a data: field (a JSON object). The stream follows the OpenAI Responses API event lifecycle exactly, so any client library that supports that spec will work without changes.
curl https://api.nexusify.co/v1/responses \
-H "Authorization: Bearer $NEXUS_API_KEY" \
-H "Content-Type: application/json" \
--no-buffer \
-d '{
"model": "grok-4.1-fast",
"input": "Write a haiku about the ocean.",
"stream": true
}'
const res = await fetch("https://api.nexusify.co/v1/responses", {
method: "POST",
headers: {
"Authorization": `Bearer ${process.env.NEXUS_API_KEY}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
model: "grok-4.1-fast",
input: "Write a haiku about the ocean.",
stream: true,
})
});
const reader = res.body.getReader();
const decoder = new TextDecoder();
let buffer = "";
while (true) {
const { value, done } = await reader.read();
if (done) break;
buffer += decoder.decode(value);
// Events are separated by double newlines
const parts = buffer.split("\n\n");
buffer = parts.pop() ?? ""; // keep incomplete trailing chunk
for (const part of parts) {
const dataLine = part.split("\n").find(l => l.startsWith("data: "));
if (!dataLine) continue;
const evt = JSON.parse(dataLine.slice(6));
if (evt.type === "response.output_text.delta") {
process.stdout.write(evt.delta); // stream text as it arrives
}
if (evt.type === "response.completed") {
console.log("\n[done]", evt.response.usage);
}
}
}
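The same buffering logic translates directly to Python. Below is a sketch of a standalone SSE parser (the helper name iter_sse_data is ours, not part of any SDK) that you could feed from requests' iter_content(decode_unicode=True) on a stream=True response:

```python
import json

def iter_sse_data(chunks):
    """Yield the parsed JSON from each `data:` line in a stream of
    SSE text chunks. Events are separated by blank lines; a trailing
    partial event is buffered until a later chunk completes it."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        events = buffer.split("\n\n")
        buffer = events.pop()  # keep the incomplete trailing event
        for event in events:
            for line in event.split("\n"):
                if line.startswith("data: "):
                    yield json.loads(line[6:])
```

In practice you would wrap it like: for evt in iter_sse_data(resp.iter_content(chunk_size=None, decode_unicode=True)) and print evt["delta"] whenever evt["type"] is "response.output_text.delta".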
Streaming Events Reference
The following events are emitted in order during a streaming response. Your client only needs to handle the ones relevant to its use case — most applications only need response.output_text.delta for live text and response.completed for the final state.
| Event name | When it fires | Key fields in data |
|---|---|---|
| response.created | Immediately — before any tokens are generated. | response.id, response.status: "in_progress" |
| response.in_progress | Generation has started. | response.status: "in_progress" |
| response.output_item.added | A new output item (e.g. the message) has been opened. | output_index, item.type, item.role |
| response.content_part.added | A content part inside the message has been opened. | item_id, content_index, part.type: "output_text" |
| response.reasoning_text.delta | Only on reasoning models — a chunk of the thinking trace. | delta (string chunk) |
| response.output_text.delta | A chunk of the assistant's reply text. | delta (string chunk), item_id |
| response.output_text.done | The full reply text has been sent. | text (complete accumulated string) |
| response.content_part.done | The content part is closed. | part.text |
| response.output_item.done | An output item is fully delivered (one per item in output). | item (complete item object) |
| response.completed | The full response is ready. Contains the final response object with usage. | response (complete response object) |
| response.done | Terminal event — stream is closed after this. | response (same as response.completed) |
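The minimum handling described above — live text plus final state — fits in a few lines. A sketch (the function name handle_event is ours) that collects deltas and signals completion:

```python
def handle_event(evt: dict, pieces: list) -> bool:
    """Process one parsed SSE event: append any text delta to `pieces`
    and return True once the final response has arrived."""
    if evt["type"] == "response.output_text.delta":
        pieces.append(evt["delta"])
    return evt["type"] == "response.completed"
```

All other event types fall through untouched, which is safe: they carry bookkeeping your application can ignore.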
Reasoning Models
When you use a thinking-capable model such as grok-4-thinking, grok-4.1-thinking, or grok-3-thinking, the server emits an additional response.reasoning_text.delta stream event for each chunk of the internal reasoning trace. In the final (non-streaming) response this trace appears as a top-level "type": "reasoning" item at index 0 of the output array, with the actual reply in the "message" item at index 1. Models that sometimes think and sometimes don't (hybrid models such as grok-4 and grok-4.1-expert) will include the reasoning item only when they choose to reason — your code should always check output[i].type rather than assuming a fixed index.
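To make that type-checking advice concrete, here is a hedged sketch (the helper name split_response is ours) that walks the output array and separates the optional reasoning summary from the final reply:

```python
def split_response(data: dict):
    """Return (reasoning_summary_or_None, reply_text) from a
    /v1/responses payload, checking each item's type rather than
    relying on fixed positions in the output array."""
    reasoning = None
    reply = None
    for item in data["output"]:
        if item["type"] == "reasoning":
            reasoning = "".join(part["text"] for part in item["summary"])
        elif item["type"] == "message":
            reply = item["content"][0]["text"]
    return reasoning, reply
```

For a hybrid model that chose not to reason, the first element is simply None and your UI can skip rendering the thinking trace.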