Chat API
Chat Completions
OpenAI-compatible chat endpoint with automatic memory injection. Drop-in replacement that gives your AI persistent memory.
How It Works
When you send a chat request, Hebbrix automatically searches for relevant memories and injects them into the system prompt. Your AI gets context from past conversations without any extra code.
1. Receive Message: the user sends a message to your AI.
2. Search Memories: Hebbrix finds relevant context from stored memories.
3. Respond with Context: the AI responds with the full memory context.
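The three steps above can be sketched as plain Python. This is an illustrative mock of the flow only; `handle_chat`, the keyword matching, and the memory list are hypothetical, not Hebbrix internals.

```python
# Illustrative sketch of the request flow. Function and variable names
# are hypothetical; Hebbrix performs these steps server-side.
def handle_chat(user_message: str, memory_store: list[str]) -> list[dict]:
    # 1. Receive the user's message.
    # 2. Search stored memories for relevant context
    #    (a naive keyword match stands in for real retrieval).
    words = user_message.lower().split()
    relevant = [m for m in memory_store
                if any(w in m.lower() for w in words)]
    # 3. Inject the matches into the system prompt before calling the model.
    system = "You are a helpful assistant."
    if relevant:
        system += "\nRelevant memories:\n" + "\n".join(f"- {m}" for m in relevant)
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_message},
    ]

messages = handle_chat("What's my favorite color?",
                       ["Favorite color is blue"])
print(messages[0]["content"])
```

The returned `messages` list is what would ultimately be sent to the model, with past context already folded into the system prompt.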
Endpoint
POST /v1/chat/completions
Code Examples
Drop-in Replacement for OpenAI
Python (OpenAI SDK)
from openai import OpenAI

# Just change the base URL and API key
client = OpenAI(
    api_key="mem_sk_your_hebbrix_key",
    base_url="https://api.hebbrix.com/v1",
)

# Use exactly like OpenAI
response = client.chat.completions.create(
    model="gpt-5-nano",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What did we discuss last time?"},
    ],
)

print(response.choices[0].message.content)
# The response will include context from previous conversations!
With Hebbrix SDK
Python (direct API)
import os
import requests
BASE = "https://api.hebbrix.com/v1"
H = {"Authorization": f"Bearer {os.environ['HEBBRIX_API_KEY']}"}
# First turn — the backend extracts and stores the fact automatically
# when the `features.memory` and `features.learning` flags are enabled.
r = requests.post(
    f"{BASE}/chat/completions",
    headers=H,
    json={
        "model": "gpt-5-nano",
        "messages": [
            {"role": "user", "content": "Remember that my favorite color is blue"},
        ],
        "session_id": "user_123_session",
        "features": {
            "memory": True,
            "user_profile": True,
            "learning": True,
        },
    },
)

# Later conversation — the backend injects prior memories as context
r = requests.post(
    f"{BASE}/chat/completions",
    headers=H,
    json={
        "model": "gpt-5-nano",
        "messages": [
            {"role": "user", "content": "What's my favorite color?"},
        ],
        "session_id": "user_123_session",
    },
)

print(r.json()["choices"][0]["message"]["content"])
# → "Your favorite color is blue!"
Streaming Response
Python (Streaming)
from openai import OpenAI

# Drop-in OpenAI client — just point base_url at Hebbrix
client = OpenAI(
    api_key="mem_sk_your_hebbrix_key",
    base_url="https://api.hebbrix.com/v1",
)

# Enable streaming for real-time responses
stream = client.chat.completions.create(
    model="gpt-5-nano",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="")

cURL Example
POST /v1/chat/completions

curl -X POST "https://api.hebbrix.com/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5-nano",
    "messages": [
      {
        "role": "user",
        "content": "What are my preferences?"
      }
    ],
    "features": {
      "memory": true
    }
  }'

Memory Context in Response
Each response includes a memory_context object showing which memories were used:
Response memory_context
{
  "memory_context": {
    "memories_used": 5,
    "profile_injected": true,
    "relevance_score": 0.89,
    "processing_time_ms": 45,
    "memories": [
      {
        "id": "mem_abc123",
        "content": "User prefers dark mode",
        "score": 0.95
      }
    ]
  }
}
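For debugging or logging, the `memory_context` object can be summarized client-side. A minimal sketch, assuming `memory_context` sits at the top level of the response JSON as in the sample above; `summarize_memory_context` is a hypothetical helper, not part of any SDK.

```python
# Hypothetical helper: summarize which memories informed a reply.
# Assumes memory_context appears at the top level of the response JSON.
def summarize_memory_context(response_json: dict) -> str:
    ctx = response_json.get("memory_context", {})
    lines = [
        f"memories used: {ctx.get('memories_used', 0)}",
        f"profile injected: {ctx.get('profile_injected', False)}",
        f"relevance score: {ctx.get('relevance_score')}",
    ]
    for mem in ctx.get("memories", []):
        lines.append(f"- {mem['id']}: {mem['content']} (score {mem['score']})")
    return "\n".join(lines)

# Sample response fragment from the docs above
sample = {
    "memory_context": {
        "memories_used": 5,
        "profile_injected": True,
        "relevance_score": 0.89,
        "processing_time_ms": 45,
        "memories": [
            {"id": "mem_abc123", "content": "User prefers dark mode", "score": 0.95}
        ],
    }
}
print(summarize_memory_context(sample))
```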