Introduction
Welcome to the Nexusify API. We provide a unified interface to the world's most capable AI models — text generation, vision, tool calling, and image synthesis — all through a single, OpenAI-compatible integration point.
Base URL
All API requests are made to the base URL below. Every endpoint listed in this documentation is relative to it.
https://api.nexusify.co
What's Included
Unified Text API
Switch among 99+ models, including Gemini, GPT-4, Llama, Mistral, Grok, and DeepSeek, with a single parameter.
Image Generation
Access Flux, ZImage, Imagen 4, Klein, and GPT Image all via one standardized endpoint.
OpenAI-Compatible
Drop-in replacement for existing OpenAI integrations — no SDK changes needed.
Obtaining an API Key
To start building with Nexus, you'll need a unique API key. The process takes under a minute and requires no credit card.
Step 1 — Start Building
Navigate to nexusify.co and click the "Start building" button on the homepage.
Step 2 — Authenticate
You'll be redirected to the authentication portal at login.nexusify.co. You can either complete the CAPTCHA verification or sign in with Google or Discord.
Step 3 — Open the Dashboard
After a successful login you'll land on your developer dashboard.
Step 4 — Copy Your Key
Scroll to the "API Credentials" section. Your key is obscured by default. Click "Copy Secret Key" to copy it to your clipboard.
Authentication
Every request to the Nexusify API must include your API key in the Authorization header using the Bearer scheme.
Request Header
```
Authorization: Bearer YOUR_API_KEY
```
Example with Environment Variable
The idiomatic approach is to store your key in an environment variable and reference it at runtime, as shown below.
```bash
# Add to your .env file or shell profile
export NEXUS_API_KEY="your_key_here"
```

```python
import os

api_key = os.getenv("NEXUS_API_KEY")
```

```javascript
const apiKey = process.env.NEXUS_API_KEY;
```
Pricing & Credits
Nexus uses a credit-based system. Every account receives $25 in free credits each week, automatically renewed on Mondays — no credit card required to start.
How Credits Work
Weekly Free Credits
$25 in free credits every Monday. Automatically topped up — no action needed on your part.
Paid Credits
Purchase additional credits at any time from your dashboard. They never expire and are only drawn after free credits are exhausted.
Pay-as-you-go
You're billed only for what you use. Credits for failed requests are automatically refunded.
Text Model Pricing
Text models are billed per million tokens, with input and output priced separately. All prices are in USD.
| Model | Input / 1M tokens | Output / 1M tokens |
|---|---|---|
| gpt-4 | $30.00 | $60.00 |
| grok-3 | $3.00 | $15.00 |
| llama-3.1-405b-instruct | $5.00 | $15.00 |
| gemini-2.5-flash | $0.15 | $0.60 |
| gpt-4o-mini | $0.15 | $0.60 |
| mistral-large-3 | $2.00 | $6.00 |
| deepseek-v3.2 | $0.27 | $1.10 |
| kimi-k2.5 | $0.80 | $3.00 |
| qwen3-235b-a22b | $0.20 | $0.60 |

All other models follow per-token pricing. Refer to the full model list for details.
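As an illustration of the billing math, the sketch below computes the cost of a single request from its token usage. Prices are taken from the table above; the helper name is ours, not part of the API.

```python
# USD per 1M tokens, as (input, output), from the pricing table above.
PRICES = {
    "gpt-4": (30.00, 60.00),
    "gemini-2.5-flash": (0.15, 0.60),
    "deepseek-v3.2": (0.27, 1.10),
}

def request_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Return the USD cost of one request for the given token counts."""
    input_price, output_price = PRICES[model]
    return (prompt_tokens * input_price + completion_tokens * output_price) / 1_000_000
```

For example, a gemini-2.5-flash call with 22 prompt and 148 completion tokens costs (22 × 0.15 + 148 × 0.60) / 1,000,000 ≈ $0.0000921.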
Image Model Pricing
Image generation is billed at a flat rate per request. The base price applies to 512×512 px; larger resolutions scale proportionally.
| Model | Base (512×512) | Large (1024×1024) |
|---|---|---|
| flux | $0.003 | $0.012 |
| zimage | $0.005 | $0.020 |
| klein | $0.006 | $0.024 |
| gptimage | $0.040 | $0.160 |
| qwen-image | $0.020 | $0.080 |
| wan-image | $0.008 | $0.032 |
| p-image | $0.055 | $0.220 |
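"Scales proportionally" here means the price grows with pixel area relative to the 512×512 base, which is consistent with the Large column above (1024×1024 has 4× the pixels and 4× the price). A minimal sketch of that arithmetic:

```python
def image_price(base_price_usd: float, width: int, height: int) -> float:
    """Scale the 512x512 base price by pixel area, per the table above."""
    return base_price_usd * (width * height) / (512 * 512)

# 1024x1024 has 4x the pixels of 512x512, so flux costs 0.003 * 4 = 0.012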
Requests made with an insufficient balance fail with a 402 error. Purchase credits at dash.nexusify.co.
Limits & Errors
Understanding the operational limits and HTTP error responses helps you build more resilient integrations.
Usage Limits
Requests are rate-limited to 130 per minute; exceeding this limit returns a 429 error (see below).
Error Codes
| Code | Status | Description |
|---|---|---|
| 400 | Bad Request | Missing required parameters or invalid request format. |
| 401 | Unauthorized | Invalid or missing API key in the Authorization header. |
| 402 | Insufficient Credits | Your credit balance is too low. Purchase credits at the dashboard. |
| 429 | Too Many Requests | You have exceeded the 130 requests per minute rate limit. |
| 500 | Internal Server Error | An unexpected error occurred on our end. Please retry after a short wait. |
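A resilient integration can retry 429 and 500 responses with exponential backoff. The sketch below is illustrative; the retry policy (4 retries, 1s base, 30s cap) is our choice, not something the API prescribes.

```python
import time

import requests

RETRYABLE = {429, 500}  # rate limited / transient server error

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff: 1s, 2s, 4s, ... capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt))

def post_with_retry(url: str, max_retries: int = 4, **kwargs) -> requests.Response:
    """POST, retrying retryable status codes with exponential backoff."""
    for attempt in range(max_retries + 1):
        resp = requests.post(url, **kwargs)
        if resp.status_code not in RETRYABLE or attempt == max_retries:
            return resp
        time.sleep(backoff_delay(attempt))
    return resp
```

Note that 400, 401, and 402 are not retried: they indicate a problem with the request itself, not a transient failure.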
Chat Completions
Generate text using 99 distinct AI models through a single, OpenAI-compatible endpoint. The request and response format is identical to the OpenAI Chat Completions API, so existing SDKs work without modification.
Parameters
| Name | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | The model ID to use, e.g. gpt-4, gemini-2.5-flash, mistral-large-3. |
| messages | array | Yes | Array of message objects, each with a role ("system", "user", "assistant") and content. |
| stream | boolean | No | If true, responses are streamed as Server-Sent Events. |
| temperature | float | No | Sampling temperature between 0.0 and 2.0. Defaults to 0.7. |
| max_tokens | integer | No | Maximum number of tokens to generate in the response. |
Example Request
```bash
curl https://api.nexusify.co/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $NEXUS_API_KEY" \
  -d '{
    "model": "gemini-2.5-flash",
    "messages": [
      {"role": "user", "content": "Explain general relativity to a 10-year-old."}
    ]
  }'
```

```python
import requests

response = requests.post(
    "https://api.nexusify.co/v1/chat/completions",
    headers={
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    },
    json={
        "model": "gemini-2.5-flash",
        "messages": [
            {"role": "user", "content": "Explain general relativity to a 10-year-old."}
        ]
    }
)
print(response.json()["choices"][0]["message"]["content"])
```

```javascript
const res = await fetch("https://api.nexusify.co/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.NEXUS_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "gemini-2.5-flash",
    messages: [
      { role: "user", content: "Explain general relativity to a 10-year-old." }
    ]
  })
});
const data = await res.json();
console.log(data.choices[0].message.content);
```
Response Format
The response mirrors the OpenAI Chat Completions response schema.
```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "gemini-2.5-flash",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Imagine space is like a stretched rubber sheet..."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 22,
    "completion_tokens": 148,
    "total_tokens": 170
  }
}
```
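When stream is true, chunks instead arrive as Server-Sent Events, one data: line per chunk. The sketch below consumes them with requests; the delta-style chunk shape is an assumption based on OpenAI compatibility, and the helper names are ours.

```python
import json

import requests

def parse_sse_line(line: str):
    """Decode one SSE `data:` line; returns None for [DONE] and non-data lines."""
    if not line.startswith("data: "):
        return None
    payload = line[len("data: "):]
    if payload == "[DONE]":
        return None
    return json.loads(payload)

def stream_chat(api_key: str, model: str, content: str):
    """Yield content deltas from a streaming chat completion."""
    with requests.post(
        "https://api.nexusify.co/v1/chat/completions",
        headers={"Authorization": f"Bearer {api_key}"},
        json={
            "model": model,
            "stream": True,
            "messages": [{"role": "user", "content": content}],
        },
        stream=True,
    ) as r:
        for raw in r.iter_lines(decode_unicode=True):
            chunk = parse_sse_line(raw or "")
            if chunk:
                yield chunk["choices"][0].get("delta", {}).get("content") or ""
```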
Available Models
A complete list of the 99 text models currently supported by Nexusify. Pass any Model ID as the model parameter in your request.
| Provider | Model ID | Capabilities | Input ($/1M) | Output ($/1M) |
|---|---|---|---|---|
Vision (Image Input)
Several models support multimodal input, allowing you to pass images alongside text in the messages array using the standard OpenAI content-parts format.
Vision-capable models include gemini-3-flash-preview, kimi-k2.5, llama-3.2-90b-vision-instruct, mistral-large-3, mistral-small-3.1, and others tagged Vision in the model list.
How It Works
Instead of passing a plain string as content, you pass an array of content parts. Each part is either a text block or an image_url block. Images can be sent as public URLs or as base64-encoded data URIs.
Example Request
```bash
curl https://api.nexusify.co/v1/chat/completions \
  -H "Authorization: Bearer $NEXUS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3-flash-preview",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "What is in this image?"},
        {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}}
      ]
    }]
  }'
```

```python
response = requests.post(
    "https://api.nexusify.co/v1/chat/completions",
    headers={"Authorization": f"Bearer {api_key}"},
    json={
        "model": "gemini-3-flash-preview",
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/photo.jpg"}}
            ]
        }]
    }
)
```
Sending a Base64 Image
To send an image from disk, encode it as a base64 data URI and place it in the url field:
```json
{
  "type": "image_url",
  "image_url": {
    "url": "data:image/jpeg;base64,/9j/4AAQSkZJRgAB..."
  }
}
```
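Building that data URI from a file on disk is a few lines of standard-library Python; the helper name and default MIME type here are ours, not part of the API.

```python
import base64

def to_data_uri(path: str, mime: str = "image/jpeg") -> str:
    """Read a local image and encode it as a base64 data URI."""
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```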
If you send image content to a model without vision support (e.g. deepseek-v3.2), the image content is silently stripped and only the text portion of the message is forwarded. No error is returned.
Tool Calling
Tool calling — also known as function calling — lets the model invoke your application's functions during a conversation. Instead of replying with plain text, the model returns a structured JSON payload describing which function to call and with what arguments. Your code executes the function and feeds the result back, allowing the model to produce a final, grounded answer.
The standard OpenAI tools schema works out of the box. Look for the Tools tag in the model list for models with confirmed tool-calling support.
How It Works
The full conversation loop has four steps.
| Step | Who acts | What happens |
|---|---|---|
| 1. Send request | Your app | POST to /v1/chat/completions with a tools array describing available functions. |
| 2. Model calls a tool | Model | Returns finish_reason: "tool_calls" and a tool_calls array with the function name and JSON arguments. |
| 3. Execute & return result | Your app | Run the function locally. Append the assistant message (with tool_calls) and a new role: "tool" message containing the result. |
| 4. Final answer | Model | Reads the tool result and produces a natural-language reply to the user. |
Request Parameters
| Parameter | Type | Description |
|---|---|---|
| tools | array | List of function definitions. Each entry must have type: "function" and a function object with name, description, and a JSON Schema parameters object. |
| tool_choice | string / object | "auto" — model decides when to call a tool (default). "none" — tools are defined but the model must not call any. {"type":"function","function":{"name":"..."}} — force a specific function. |
| parallel_tool_calls | boolean | When true (default), the model may invoke multiple tools in a single turn. Set to false to enforce sequential calls. |
Step 1 — Initial Request
```bash
curl https://api.nexusify.co/v1/chat/completions \
  -H "Authorization: Bearer $NEXUS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v3.2",
    "messages": [
      {"role": "user", "content": "What is the weather in Madrid right now?"}
    ],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather for a given city.",
        "parameters": {
          "type": "object",
          "properties": {
            "city": {"type": "string", "description": "City name, e.g. Madrid"},
            "units": {"type": "string", "enum": ["celsius", "fahrenheit"], "description": "Temperature unit"}
          },
          "required": ["city"]
        }
      }
    }],
    "tool_choice": "auto"
  }'
```

```python
import requests, json

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a given city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "units": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["city"]
        }
    }
}]

messages = [{"role": "user", "content": "What is the weather in Madrid right now?"}]

r = requests.post(
    "https://api.nexusify.co/v1/chat/completions",
    headers={"Authorization": f"Bearer {api_key}"},
    json={"model": "deepseek-v3.2", "messages": messages, "tools": tools}
)
```

```javascript
const tools = [{
  type: "function",
  function: {
    name: "get_weather",
    description: "Get the current weather for a given city.",
    parameters: {
      type: "object",
      properties: {
        city: { type: "string" },
        units: { type: "string", enum: ["celsius", "fahrenheit"] }
      },
      required: ["city"]
    }
  }
}];

const messages = [{ role: "user", content: "What is the weather in Madrid right now?" }];

const response = await fetch("https://api.nexusify.co/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${apiKey}`,
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    model: "deepseek-v3.2",
    messages,
    tools,
    tool_choice: "auto"
  })
});
const data = await response.json();
```
Step 2 — Model Response (tool call)
When the model wants to call a tool it sets finish_reason to "tool_calls" and returns a tool_calls array. The arguments field is always a JSON-encoded string.
```json
{
  "id": "chatcmpl-xyz",
  "choices": [{
    "finish_reason": "tool_calls",
    "message": {
      "role": "assistant",
      "content": null,
      "tool_calls": [{
        "id": "call_a1b2c3",
        "type": "function",
        "function": {
          "name": "get_weather",
          "arguments": "{\"city\": \"Madrid\", \"units\": \"celsius\"}"
        }
      }]
    }
  }]
}
```
Step 3 — Execute & Return Result
Parse the arguments string, call your function, and send a follow-up request. You must include the assistant's tool-call message in the history, followed by a new role: "tool" message.
```python
assistant_msg = r.json()["choices"][0]["message"]
tool_call = assistant_msg["tool_calls"][0]
args = json.loads(tool_call["function"]["arguments"])

# Execute your function
result = get_weather(city=args["city"], units=args.get("units", "celsius"))

# Build follow-up messages
messages.append(assistant_msg)  # keep assistant turn with tool_calls
messages.append({
    "role": "tool",
    "tool_call_id": tool_call["id"],
    "content": json.dumps(result)
})

# Second request — get final answer
final = requests.post(
    "https://api.nexusify.co/v1/chat/completions",
    headers={"Authorization": f"Bearer {api_key}"},
    json={"model": "deepseek-v3.2", "messages": messages, "tools": tools}
)
print(final.json()["choices"][0]["message"]["content"])
```

```javascript
const assistantMsg = data.choices[0].message;
const toolCall = assistantMsg.tool_calls[0];
const args = JSON.parse(toolCall.function.arguments);

// Execute your function
const result = await getWeather(args.city, args.units ?? "celsius");

// Build follow-up messages
const followUp = [
  ...messages,
  assistantMsg, // keep the assistant tool_calls turn
  {
    role: "tool",
    tool_call_id: toolCall.id,
    content: JSON.stringify(result)
  }
];

// Second request — get final answer
const finalRes = await fetch("https://api.nexusify.co/v1/chat/completions", {
  method: "POST",
  headers: { "Authorization": `Bearer ${apiKey}`, "Content-Type": "application/json" },
  body: JSON.stringify({ model: "deepseek-v3.2", messages: followUp, tools })
});
const answer = (await finalRes.json()).choices[0].message.content;
console.log(answer);
```
Step 4 — Final Model Response
```json
{
  "choices": [{
    "finish_reason": "stop",
    "message": {
      "role": "assistant",
      "content": "The current temperature in Madrid is 28°C with clear skies."
    }
  }]
}
```
Defining Good Tools
The quality of your function description directly affects how reliably the model will call it. Follow these guidelines for best results.
| Practice | Why it matters |
|---|---|
| Write a clear, specific description | The model reads this to decide when to call the function. Vague descriptions lead to missed or wrong calls. |
| Describe every parameter | Include a description field for each property in the JSON Schema so the model knows what value to fill in. |
| Use enum for fixed values | Prevents the model from inventing values for parameters that have a known set of options. |
| Mark required parameters | List every field the function cannot work without in the required array. |
| Keep names snake_case | Function and parameter names like get_current_weather are more reliably understood than camelCase or abbreviations. |
Supported Models
The following models have confirmed native tool-calling support. Any other model on the API may also accept the tools parameter if the underlying provider supports it.
| Provider | Models |
|---|---|
| DeepSeek | deepseek-v3.1, deepseek-v3.2 |
| OpenAI | gpt-4, gpt-4o-mini, gpt-5 series, gpt-5.x series |
| xAI | grok-3, grok-4, grok-4.1 series |
| Moonshot | kimi-k2, kimi-k2.5 |
| MiniMax | minimax-m2, minimax-m2.5 |
| Mistral AI | mistral-small-3.2, mistral-nemotron, mistral-large-3, ministral-3-8b, ministral-3-14b |
| Cohere | command-a-3, command-r-plus, command-r |
| Meta / NVIDIA | llama-3.3-70b-instruct, llama-3.1-405b-instruct, llama-3.1-8b-instruct, nemotron-super-49b-v1, nemotron-super-49b-v1.5 |
| Alibaba | qwen3-235b-a22b, qwen3-next-80b |
| Google | gemini-2.5-flash, gemini-3-flash-preview |
| RNJ AI | rnj-1 |
You can also use the official OpenAI SDK: configure it with baseURL: "https://api.nexusify.co/v1" and your Nexus key — tool calling works identically to the native OpenAI SDK.
Generate Image
Create images from text prompts using state-of-the-art diffusion and synthesis models. A single endpoint provides access to all supported image models.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| prompt | string | — | A detailed text description of the image to generate. Required. |
| model | string | flux | The image model ID to use. See Image Models for the full list. |
| width | integer | 512 | Output width in pixels. Maximum 2048. |
| height | integer | 512 | Output height in pixels. Maximum 2048. |
Example Request
```bash
curl https://api.nexusify.co/v1/generate-image \
  -H "Authorization: Bearer $NEXUS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Cyberpunk city with neon rain, cinematic lighting",
    "model": "flux",
    "width": 1024,
    "height": 1024
  }'
```

```javascript
const res = await fetch("https://api.nexusify.co/v1/generate-image", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.NEXUS_API_KEY}`,
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    prompt: "Cyberpunk city with neon rain, cinematic lighting",
    model: "flux",
    width: 1024,
    height: 1024
  })
});
const data = await res.json();
console.log(data.imageUrl);
```

```python
response = requests.post(
    "https://api.nexusify.co/v1/generate-image",
    headers={"Authorization": f"Bearer {api_key}"},
    json={
        "prompt": "Cyberpunk city with neon rain, cinematic lighting",
        "model": "flux",
        "width": 1024,
        "height": 1024
    }
)
image_url = response.json()["imageUrl"]
```
Response
A successful response includes the image URL, metadata about the request, and the current credit balance of your account. The image is hosted temporarily — it expires after 2 hours.
```json
{
  "success": true,
  "model": "flux",
  "prompt": "A futuristic city at sunset",
  "size": "512x512",
  "imageUrl": "/data/generated-images/pollinations-1735000000000-a1b2c3d4e.png",
  "imagePath": "/data/generated-images/pollinations-1735000000000-a1b2c3d4e.png",
  "expiresIn": "2 hours",
  "message": "Image generated successfully",
  "user": {
    "email": "you@example.com",
    "plan": "free",
    "usageRemaining": 24.85,
    "nextGenerationAvailableIn": 0
  }
}
```
| Field | Type | Description |
|---|---|---|
| imageUrl | string | Relative path to the generated image. To access it, prepend https://api.nexusify.co — e.g. https://api.nexusify.co/data/generated-images/pollinations-....png |
| imagePath | string | Same as imageUrl. Provided for convenience. |
| size | string | Dimensions of the generated image, e.g. "512x512". |
| expiresIn | string | How long the image will remain accessible. Images are deleted after 2 hours. |
| user.usageRemaining | float | Your remaining credit balance in USD after this request. |
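Because imageUrl is a relative path, prepend the API origin before fetching, and fetch within the 2-hour retention window. A minimal sketch (helper names are ours):

```python
from urllib.parse import urljoin

import requests

API_ORIGIN = "https://api.nexusify.co"

def absolute_image_url(image_url: str) -> str:
    """Turn the relative imageUrl from the response into a fetchable URL."""
    return urljoin(API_ORIGIN, image_url)

def download_image(image_url: str, dest: str) -> None:
    """Fetch the generated image before its 2-hour expiry and save it to disk."""
    resp = requests.get(absolute_image_url(image_url), timeout=60)
    resp.raise_for_status()
    with open(dest, "wb") as f:
        f.write(resp.content)
```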
Image Gen (OpenAI Format)
An OpenAI-compatible image generation endpoint. If you are already using the OpenAI Images API, you can point your client at Nexusify with zero code changes — just swap the base URL and API key.
This endpoint mirrors POST /v1/images/generations from the OpenAI API. Existing SDKs and integrations work without modification — just set base_url to https://api.nexusify.co.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| prompt | string | — | A text description of the desired image. Required. |
| model | string | flux | The image model ID to use. See Image Models for the full list. |
| size | string | "1024x1024" | Image dimensions as "WxH", e.g. "512x512" or "1024x1792". Larger sizes are billed proportionally. |
| response_format | string | "url" | "url" returns a hosted image URL; "b64_json" returns the raw image as a base-64 string. |
| n | integer | 1 | Number of images to generate. Currently always 1. |
| stream | boolean | false | If true, the server emits Server-Sent Events while generating the image, then delivers the final result as the last event. |
Response
A successful response returns a JSON object that mirrors the OpenAI Images API response shape:
```json
{
  "created": 1712345678,
  "data": [
    {
      "url": "https://api.nexusify.co/v1/images/gen-abc123.png",
      "revised_prompt": "A majestic mountain landscape at sunset..."
    }
  ]
}
```
When response_format is "b64_json", the url field is omitted and a b64_json field containing the base-64 encoded PNG is returned instead.
Generated images are accessible via GET /v1/images/<filename> and are retained for 2 hours after creation.
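When you request response_format: "b64_json", writing the decoded image to disk is a single base64 call. A small sketch (the helper name is ours):

```python
import base64

def save_b64_image(b64_json: str, dest: str) -> None:
    """Decode the base-64 PNG payload from the response and write it to disk."""
    with open(dest, "wb") as f:
        f.write(base64.b64decode(b64_json))
```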
Example Request
```bash
curl https://api.nexusify.co/v1/images/generations \
  -H "Authorization: Bearer $NEXUS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "A majestic mountain landscape at golden hour",
    "model": "flux",
    "size": "1024x1024",
    "response_format": "url"
  }'
```

```python
from openai import OpenAI

client = OpenAI(
    api_key="$NEXUS_API_KEY",
    base_url="https://api.nexusify.co"
)

response = client.images.generate(
    prompt="A majestic mountain landscape at golden hour",
    model="flux",
    size="1024x1024",
    response_format="url",
    n=1
)
print(response.data[0].url)
```

```javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "$NEXUS_API_KEY",
  baseURL: "https://api.nexusify.co",
});

const response = await client.images.generate({
  prompt: "A majestic mountain landscape at golden hour",
  model: "flux",
  size: "1024x1024",
  response_format: "url",
  n: 1,
});
console.log(response.data[0].url);
```
Streaming
Set stream: true to receive Server-Sent Events while the image is being generated. Each event carries a status field; the final event contains the complete response object.
```bash
curl https://api.nexusify.co/v1/images/generations \
  -H "Authorization: Bearer $NEXUS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Neon-lit cyberpunk city at midnight",
    "model": "klein",
    "size": "1024x1024",
    "stream": true
  }'

# Server-Sent Events output:
# data: {"status":"generating","progress":0.2}
# data: {"status":"generating","progress":0.7}
# data: {"created":1712345678,"data":[{"url":"https://api.nexusify.co/v1/images/gen-xyz.png"}]}
```
Images returned via url can be fetched with a plain GET request — no authentication required. They expire after 2 hours.
Image Models
Seven image models are currently available, each optimized for different use cases. Pass the ID as the model parameter in your generate-image request.
| ID | Description | Best For | Base Price (512×512) |
|---|---|---|---|
Text Generate Legacy
A simpler, more direct alternative to /chat/completions. Instead of wrapping everything in a messages array, you can pass a plain prompt string and get a plain completion string back — no extra nesting required. It supports the same 99 models, streaming, conversation history, and all generation parameters.
This endpoint is retained for backward compatibility. For new, OpenAI-compatible integrations, prefer /chat/completions instead.
Parameters
| Name | Type | Required | Default | Description |
|---|---|---|---|---|
| model | string | Yes | — | The model ID to use. Any model from the full model list is supported. |
| prompt | string | * | — | A plain text message to send to the model. Required if messages is not provided. |
| messages | array | * | — | Full conversation history in OpenAI format. Use this instead of prompt for multi-turn conversations. |
| systemInstruction | string | No | — | A system-level instruction that shapes the model's behavior for the entire conversation. |
| temperature | float | No | 0.7 | Controls randomness. 0.0 is deterministic, 2.0 is very creative. |
| max_tokens | integer | No | 300 | Maximum number of tokens to generate. |
| top_p | float | No | 1.0 | Nucleus sampling threshold. Values below 1.0 restrict the token pool. |
| stop | string / array | No | null | One or more sequences that will stop generation when encountered. |
| stream | boolean | No | false | If true, the response is streamed as Server-Sent Events. |
| userid | string | No | — | An identifier for the user. When provided, the server stores conversation history and automatically includes it in future requests with the same userid. |
* Either prompt or messages must be provided.
Example Request
```bash
curl https://api.nexusify.co/v1/text/generate \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $NEXUS_API_KEY" \
  -d '{
    "model": "gemini-2.5-flash",
    "prompt": "Explain how black holes form in two sentences.",
    "temperature": 0.7,
    "max_tokens": 150
  }'
```

```python
import requests

response = requests.post(
    "https://api.nexusify.co/v1/text/generate",
    headers={
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    },
    json={
        "model": "gemini-2.5-flash",
        "prompt": "Explain how black holes form in two sentences.",
        "temperature": 0.7,
        "max_tokens": 150,
    }
)
print(response.json()["completion"])
```

```javascript
const res = await fetch("https://api.nexusify.co/v1/text/generate", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.NEXUS_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "gemini-2.5-flash",
    prompt: "Explain how black holes form in two sentences.",
    temperature: 0.7,
    max_tokens: 150,
  })
});
const data = await res.json();
console.log(data.completion);
```
Response
On success, the response contains a completion field with the model's reply as a plain string — no nested arrays to unwrap.
```json
{
  "success": true,
  "model": "gemini-2.5-flash",
  "completion": "Black holes form when a massive star exhausts its nuclear fuel and collapses under gravity...",
  "reasoning": "...",
  "userid": null,
  "historyLength": 2,
  "messagesUsed": 1,
  "usage": {
    "prompt_tokens": 14,
    "completion_tokens": 38,
    "total_tokens": 52
  },
  "user": {
    "email": "you@example.com",
    "plan": "free",
    "usageRemaining": 24.99
  }
}
```
| Field | Type | Description |
|---|---|---|
| completion | string | The model's generated reply. |
| reasoning | string | Chain-of-thought output, only present on reasoning models (e.g. grok-3-mini, deepseek-v3.2). |
| historyLength | integer | Total messages in the conversation after this turn. Only meaningful when userid is used. |
| messagesUsed | integer | Number of messages sent to the model in this request. |
| usage | object | Token breakdown: prompt_tokens, completion_tokens, total_tokens. |
| user.usageRemaining | float | Your remaining credit balance in USD. |
Streaming
Set "stream": true to receive the response as a sequence of Server-Sent Events. Each event delivers a chunk of the completion as it's generated. The stream ends with a data: [DONE] message.
```bash
curl https://api.nexusify.co/v1/text/generate \
  -H "Authorization: Bearer $NEXUS_API_KEY" \
  -H "Content-Type: application/json" \
  --no-buffer \
  -d '{
    "model": "gpt-5-mini",
    "prompt": "Write a short story about a robot learning to paint.",
    "stream": true,
    "max_tokens": 400
  }'
```

```javascript
const res = await fetch("https://api.nexusify.co/v1/text/generate", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.NEXUS_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "gpt-5-mini",
    prompt: "Write a short story about a robot learning to paint.",
    stream: true,
    max_tokens: 400,
  })
});

// Simple SSE reader; assumes each `data:` line arrives within one chunk.
const reader = res.body.getReader();
const decoder = new TextDecoder();
while (true) {
  const { value, done } = await reader.read();
  if (done) break;
  const chunk = decoder.decode(value);
  for (const line of chunk.split("\n")) {
    if (!line.startsWith("data: ")) continue;
    const payload = line.slice(6);
    if (payload === "[DONE]") break;
    const json = JSON.parse(payload);
    process.stdout.write(json.choices[0]?.delta?.content ?? "");
  }
}
```
Conversation History
Pass a userid string to enable persistent conversation memory. The server saves each exchange and automatically prepends the history on your next request with the same userid and model — so the model always has context from prior turns without you managing it manually.
```jsonc
// First message
{ "model": "gpt-5-mini", "prompt": "My name is Alex.", "userid": "user_42" }

// Follow-up — model remembers the name
{ "model": "gpt-5-mini", "prompt": "What's my name?", "userid": "user_42" }
// → completion: "Your name is Alex."
```
To clear the stored history for a user, send a DELETE request to /v1/text/history/{userid}. This resets the conversation across all models at once.
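A sketch of calling that DELETE endpoint with requests; the URL pattern comes from the paragraph above, while the helper names are ours.

```python
import requests

def history_url(userid: str) -> str:
    """Build the history endpoint URL for a given user."""
    return f"https://api.nexusify.co/v1/text/history/{userid}"

def clear_history(api_key: str, userid: str) -> None:
    """Delete stored conversation history for this userid across all models."""
    resp = requests.delete(
        history_url(userid),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    resp.raise_for_status()
```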
Responses API
The Responses API is OpenAI's modern alternative to Chat Completions. Instead of a flat choices array, it returns a structured output array of typed items — separate objects for the assistant message and (when present) for reasoning traces. This makes it especially well-suited for agentic applications, multi-step workflows, and any integration that already targets the OpenAI Responses API shape.
Parameters
| Name | Type | Required | Default | Description |
|---|---|---|---|---|
| model | string | Yes | — | The model ID to use. Any model from the full model list is supported. |
| input | string / array | Yes | — | The user's input. Can be a plain string, an array of {role, content} objects (same shape as messages in Chat Completions), or an array of Responses API message objects with typed content parts. |
| instructions | string | No | — | A system-level instruction prepended to the conversation. Equivalent to a system message in Chat Completions. |
| max_output_tokens | integer | No | — | Maximum number of tokens to generate. Maps to max_tokens internally. |
| temperature | float | No | 1.0 | Sampling temperature between 0.0 and 2.0. |
| top_p | float | No | 1.0 | Nucleus sampling threshold. |
| stream | boolean | No | false | If true, the response is delivered as a sequence of named Server-Sent Events. |
| tools | array | No | — | List of tool definitions available to the model (same schema as Chat Completions). |
| tool_choice | string / object | No | "auto" | Controls whether the model calls a tool. |
| store | boolean | No | true | Whether this response should be stored. Reflected in the response envelope but not processed server-side. |
| metadata | object | No | {} | Arbitrary key/value pairs echoed back in the response. Useful for tagging requests. |
| previous_response_id | string | No | — | ID of a prior response to continue from. Echoed in the response envelope. |
| user | string | No | — | An identifier for the end-user. Used for audit purposes; not stored as conversation history. |
Example Request
```bash
curl https://api.nexusify.co/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $NEXUS_API_KEY" \
  -d '{
    "model": "grok-4",
    "input": "Explain the twin paradox in simple terms.",
    "instructions": "You are a friendly physics tutor.",
    "temperature": 0.7,
    "max_output_tokens": 300
  }'
```

```python
import requests

response = requests.post(
    "https://api.nexusify.co/v1/responses",
    headers={
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    },
    json={
        "model": "grok-4",
        "input": "Explain the twin paradox in simple terms.",
        "instructions": "You are a friendly physics tutor.",
        "temperature": 0.7,
        "max_output_tokens": 300,
    }
)
data = response.json()

# The assistant message is the last output item
text = data["output"][-1]["content"][0]["text"]
print(text)
```

```javascript
const res = await fetch("https://api.nexusify.co/v1/responses", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.NEXUS_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "grok-4",
    input: "Explain the twin paradox in simple terms.",
    instructions: "You are a friendly physics tutor.",
    temperature: 0.7,
    max_output_tokens: 300,
  })
});
const data = await res.json();

// Grab the assistant message output item
const msg = data.output.find(o => o.type === "message");
console.log(msg.content[0].text);
```
Response Format
Rather than a flat choices array, the response wraps everything in an output array of typed items. A standard (non-reasoning) response contains one item of "type": "message". When the model produces a reasoning trace, a "type": "reasoning" item appears first at index 0, and the message shifts to index 1.
```json
{
  "id": "resp_01j9abc123",
  "object": "response",
  "created_at": 1735000000,
  "model": "grok-4",
  "status": "completed",
  "output": [{
    "type": "message",
    "id": "msg_01abc",
    "role": "assistant",
    "status": "completed",
    "content": [{
      "type": "output_text",
      "text": "Imagine twins Alice and Bob..."
    }]
  }],
  "usage": {
    "input_tokens": 22,
    "output_tokens": 84,
    "total_tokens": 106
  },
  "temperature": 0.7,
  "top_p": 1,
  "max_output_tokens": 300,
  "error": null
}
```

```json
{
  "id": "resp_01j9xyz456",
  "object": "response",
  "model": "grok-4-thinking",
  "status": "completed",
  "output": [
    {
      "type": "reasoning",
      "id": "rs_01abc",
      "summary": [{
        "type": "summary_text",
        "text": "Let me think about special relativity step by step..."
      }]
    },
    {
      "type": "message",
      "id": "msg_01def",
      "role": "assistant",
      "status": "completed",
      "content": [{
        "type": "output_text",
        "text": "Imagine twins Alice and Bob..."
      }]
    }
  ],
  "usage": {
    "input_tokens": 22,
    "output_tokens": 312,
    "total_tokens": 334
  }
}
```
| Field | Type | Description |
|---|---|---|
| id | string | Unique response identifier, prefixed resp_. |
| object | string | Always "response". |
| status | string | Always "completed" for non-streaming responses. |
| output | array | Ordered list of output items. A "reasoning" item (with a summary array) appears first when the model produces a thinking trace; the "message" item always appears last. |
| output[].content[].text | string | The assistant's final reply text. Access via output.find(o => o.type === "message").content[0].text. |
| usage.input_tokens | integer | Tokens consumed by the prompt (estimated for providers that omit usage). |
| usage.output_tokens | integer | Tokens in the generated reply. |
| error | null / object | null on success; an error object on failure. |
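As a worked example of the fields above, here is a small helper that pulls the reply text from either response shape. This is a sketch; the name extract_text is ours, not part of any SDK. It selects the "message" item by type instead of assuming a fixed index, so it works whether or not a reasoning item is present.

```python
def extract_text(data: dict) -> str:
    """Return the assistant's reply text from a /v1/responses payload.

    Selects the "message" output item by type, so it handles both
    plain responses and reasoning responses (where a "reasoning"
    item occupies index 0).
    """
    msg = next(item for item in data["output"] if item["type"] == "message")
    return msg["content"][0]["text"]
```

Applied to either of the example payloads above, this returns "Imagine twins Alice and Bob...".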
Multi-turn Conversations
Pass input as an array of message objects to send a full conversation history, just as you would with the messages parameter in Chat Completions. Each item should have a role ("user" or "assistant") and a content string.
{
"model": "gemini-2.5-flash",
"instructions": "You are a helpful assistant.",
"input": [
{ "role": "user", "content": "My name is Alex." },
{ "role": "assistant", "content": "Nice to meet you, Alex!" },
{ "role": "user", "content": "What is my name?" }
]
}
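The API is stateless: your application keeps the running history and resends the whole thing each turn. A minimal Python sketch of that bookkeeping (the helper name build_payload is ours, purely illustrative):

```python
def build_payload(model: str, instructions: str, history: list, user_message: str) -> dict:
    """Append the new user turn to the running history and assemble
    a /v1/responses request body containing the full conversation."""
    history.append({"role": "user", "content": user_message})
    return {
        "model": model,
        "instructions": instructions,
        "input": list(history),  # copy so later turns don't mutate this payload
    }

history = [
    {"role": "user", "content": "My name is Alex."},
    {"role": "assistant", "content": "Nice to meet you, Alex!"},
]
payload = build_payload(
    "gemini-2.5-flash", "You are a helpful assistant.", history, "What is my name?"
)
```

After each response arrives, append the assistant's reply to history before the next call, so the model keeps full context.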
Streaming
Set "stream": true and the server delivers the response as a sequence of named Server-Sent Events. Each event has both an event: field (the event name) and a data: field (a JSON object). The stream follows the OpenAI Responses API event lifecycle exactly, so any client library that supports that spec will work without changes.
curl https://api.nexusify.co/v1/responses \
-H "Authorization: Bearer $NEXUS_API_KEY" \
-H "Content-Type: application/json" \
--no-buffer \
-d '{
"model": "grok-4.1-fast",
"input": "Write a haiku about the ocean.",
"stream": true
}'
const res = await fetch("https://api.nexusify.co/v1/responses", {
method: "POST",
headers: {
"Authorization": `Bearer ${process.env.NEXUS_API_KEY}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
model: "grok-4.1-fast",
input: "Write a haiku about the ocean.",
stream: true,
})
});
const reader = res.body.getReader();
const decoder = new TextDecoder();
let buffer = "";
while (true) {
const { value, done } = await reader.read();
if (done) break;
buffer += decoder.decode(value);
// Events are separated by double newlines
const parts = buffer.split("\n\n");
buffer = parts.pop() ?? ""; // keep incomplete trailing chunk
for (const part of parts) {
const dataLine = part.split("\n").find(l => l.startsWith("data: "));
if (!dataLine) continue;
const evt = JSON.parse(dataLine.slice(6));
if (evt.type === "response.output_text.delta") {
process.stdout.write(evt.delta); // stream text as it arrives
}
if (evt.type === "response.completed") {
console.log("\n[done]", evt.response.usage);
}
}
}
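The same buffering logic translates directly to Python. Below is a sketch of a standalone SSE parser (the helper name iter_sse_data is ours, not part of any SDK) that you could feed from requests' iter_content(decode_unicode=True) on a stream=True response:

```python
import json

def iter_sse_data(chunks):
    """Yield the parsed JSON from each `data:` line in a stream of
    SSE text chunks. Events are separated by blank lines; a trailing
    partial event is buffered until a later chunk completes it."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        events = buffer.split("\n\n")
        buffer = events.pop()  # keep the incomplete trailing event
        for event in events:
            for line in event.split("\n"):
                if line.startswith("data: "):
                    yield json.loads(line[6:])
```

In practice you would wrap it like: for evt in iter_sse_data(resp.iter_content(chunk_size=None, decode_unicode=True)) and print evt["delta"] whenever evt["type"] is "response.output_text.delta".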
Streaming Events Reference
The following events are emitted in order during a streaming response. Your client only needs to handle the ones relevant to its use case — most applications only need response.output_text.delta for live text and response.completed for the final state.
| Event name | When it fires | Key fields in data |
|---|---|---|
| response.created | Immediately — before any tokens are generated. | response.id, response.status: "in_progress" |
| response.in_progress | Generation has started. | response.status: "in_progress" |
| response.output_item.added | A new output item (e.g. the message) has been opened. | output_index, item.type, item.role |
| response.content_part.added | A content part inside the message has been opened. | item_id, content_index, part.type: "output_text" |
| response.reasoning_text.delta | Only on reasoning models — a chunk of the thinking trace. | delta (string chunk) |
| response.output_text.delta | A chunk of the assistant's reply text. | delta (string chunk), item_id |
| response.output_text.done | The full reply text has been sent. | text (complete accumulated string) |
| response.content_part.done | The content part is closed. | part.text |
| response.output_item.done | An output item is fully delivered (one per item in output). | item (complete item object) |
| response.completed | The full response is ready. Contains the final response object with usage. | response (complete response object) |
| response.done | Terminal event — stream is closed after this. | response (same as response.completed) |
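The minimum handling described above — live text plus final state — fits in a few lines. A sketch (the function name handle_event is ours) that collects deltas and signals completion:

```python
def handle_event(evt: dict, pieces: list) -> bool:
    """Process one parsed SSE event: append any text delta to `pieces`
    and return True once the final response has arrived."""
    if evt["type"] == "response.output_text.delta":
        pieces.append(evt["delta"])
    return evt["type"] == "response.completed"
```

All other event types fall through untouched, which is safe: they carry bookkeeping your application can ignore.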
Reasoning Models
When you use a thinking-capable model such as grok-4-thinking, grok-4.1-thinking, or grok-3-thinking, the server emits an additional response.reasoning_text.delta stream event for each chunk of the internal reasoning trace. In the final (non-streaming) response this trace appears as a top-level "type": "reasoning" item at index 0 of the output array, with the actual reply in the "message" item at index 1. Models that sometimes think and sometimes don't (hybrid models such as grok-4 and grok-4.1-expert) will include the reasoning item only when they choose to reason — your code should always check output[i].type rather than assuming a fixed index.
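To make that type-checking advice concrete, here is a hedged sketch (the helper name split_response is ours) that walks the output array and separates the optional reasoning summary from the final reply:

```python
def split_response(data: dict):
    """Return (reasoning_summary_or_None, reply_text) from a
    /v1/responses payload, checking each item's type rather than
    relying on fixed positions in the output array."""
    reasoning = None
    reply = None
    for item in data["output"]:
        if item["type"] == "reasoning":
            reasoning = "".join(part["text"] for part in item["summary"])
        elif item["type"] == "message":
            reply = item["content"][0]["text"]
    return reasoning, reply
```

For a hybrid model that chose not to reason, the first element is simply None and your UI can skip rendering the thinking trace.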