Hebbrix
Production pattern

Chatbot & Agent Memory

The recommended end-to-end pattern for using Hebbrix as the memory layer for a conversational chatbot or an autonomous agent — provisioning, corrections, read-after-write consistency, low-latency search, and current-vs-retired facts.

1. One collection per user

2. Process corrections

3. Poll processing_status

4. Search with fast:true

5. Filter current facts

6. Audit superseded

7. Scoped chat

Core principles

One collection per user

The collection is the isolation boundary — every read is hard-scoped to it. Keep one per user (or per agent / namespace).
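
To keep the one-collection-per-user rule cheap in practice, the collection id can be resolved once and reused. The sketch below is one possible shape, not part of the Hebbrix API: `makeCollectionResolver` and `collectionFor` are hypothetical names, `createCollection` stands in for the client's `createUserCollection`, and the in-memory `Map` is an assumed placeholder for wherever you persist user records.

```javascript
// Sketch: create a user's collection at most once, then reuse the stored id.
// `createCollection(userId)` is any function that resolves to a collection id.
// The default Map is a hypothetical stand-in for a real user table.
function makeCollectionResolver(createCollection, store = new Map()) {
  const inflight = new Map(); // de-duplicates concurrent first requests per user
  return async function collectionFor(userId) {
    if (store.has(userId)) return store.get(userId);
    if (!inflight.has(userId)) {
      inflight.set(
        userId,
        Promise.resolve(createCollection(userId))
          .then((id) => { store.set(userId, id); return id; })
          .finally(() => inflight.delete(userId)),
      );
    }
    return inflight.get(userId);
  };
}
```

With the client from this page, usage might look like `const collectionFor = makeCollectionResolver((id) => mem.createUserCollection(id));`.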

Writes durable on return; indexing async

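/memories/process returns as soon as the write is durable; it does not wait for indexing (the call itself takes ~5s for fact extraction, see the latency table below). A memory becomes searchable once processing_status == "completed" (~2–8s). Poll it when you need read-after-write.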

conflict_status == "none" is the current truth

Corrections retire the old fact (conflict_status = "superseded"). The default listing hides retired facts; search excludes them.

superseded_by_id traces the chain

A superseded memory points at the memory that replaced it. Use include_superseded=true for audit views only.

fast:true for latency-sensitive search

Skips the neural reranker (fusion-only ranking): ~350–450ms and consistent. Normal search adds reranking quality but its p95 can include occasional ~2–3s outliers.
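
If you want reranked results when they are quick but never want to wait on an outlier, one possible approach is to issue both searches and take the reranked one only if it lands inside a latency budget. This is a sketch under assumptions: `searchWithinBudget` and the 800 ms default are illustrative choices, and `search(query, { fast })` stands in for a bound call to the client's `search` method. It costs two requests per query.

```javascript
// Sketch: prefer reranked results if they arrive within budgetMs, else use the fast ones.
// `search(query, { fast })` must return a promise of search results.
async function searchWithinBudget(search, query, budgetMs = 800) {
  const fast = search(query, { fast: true });
  const reranked = search(query, { fast: false });
  fast.catch(() => {}); // mark as handled; awaiting `fast` below still rethrows
  let timer;
  const timeout = new Promise((resolve) => { timer = setTimeout(resolve, budgetMs, null); });
  try {
    // A failed or late reranked search resolves to null and falls through to `fast`.
    const winner = await Promise.race([reranked.catch(() => null), timeout]);
    return winner ?? (await fast);
  } finally {
    clearTimeout(timer);
  }
}
```

For example: `searchWithinBudget((q, opts) => mem.search(collectionId, q, opts), "what is my preferred IDE")`.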

Scope chat with collection_id

Memory chat without a scope searches the whole account and is slow. Always pass collection_id. learning:true adds no response-time cost — extraction runs in the background.
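
A small guard can make the "always pass collection_id" rule hard to forget. The sketch below builds the same request body the client on this page sends to /chat/completions and refuses to build an unscoped one; `scopedChatBody` is a hypothetical helper name, not a Hebbrix function.

```javascript
// Sketch: build a memory-chat request body and reject a missing collection id.
// The body shape mirrors the chat() method of the client on this page.
function scopedChatBody(collectionId, messages, { model = "gpt-5-nano", learning = false } = {}) {
  if (typeof collectionId !== "string" || collectionId.trim() === "") {
    throw new TypeError("collection_id is required: unscoped memory chat searches the whole account");
  }
  return { model, collection_id: collectionId, messages, features: { memory: true, learning } };
}
```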

A reusable client

Node.js 18+ with built-in fetch — no SDK required.

hebbrix-memory.js
const BASE = "https://api.hebbrix.com/v1";
const sleep = (ms) => new Promise((r) => setTimeout(r, ms));

class HebbrixMemory {
  constructor(apiKey) { this.key = apiKey; }

  async #req(path, { method = "GET", body } = {}) {
    const res = await fetch(BASE + path, {
      method,
      headers: {
        Authorization: `Bearer ${this.key}`,
        ...(body ? { "Content-Type": "application/json" } : {}),
      },
      body: body ? JSON.stringify(body) : undefined,
    });
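    const text = await res.text();
    let json = null;
    // Tolerate non-JSON bodies (e.g. a proxy's HTML error page) so the HTTP status still surfaces.
    try { json = text ? JSON.parse(text) : null; } catch { json = { raw: text }; }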
    if (!res.ok) {
      // Structured errors carry an error_id you can quote to support.
      throw Object.assign(new Error(`HTTP ${res.status}`), { status: res.status, body: json });
    }
    return json;
  }

  // 1. One collection per user
  createUserCollection(userId) {
    return this.#req("/collections", {
      method: "POST",
      body: { name: `user:${userId}`, metadata: { user_id: userId } },
    }).then((c) => c.id);
  }

  // 2. Process a conversation turn / correction
  process(collectionId, messages) {
    return this.#req("/memories/process", {
      method: "POST",
      body: { collection_id: collectionId, messages },
    });
  }

  getMemory(id) {
    return this.#req(`/memories/${encodeURIComponent(id)}`);
  }

  // 3. Read-after-write: wait until a memory is searchable
  async waitUntilSearchable(memoryId, { maxMs = 15000, intervalMs = 1000 } = {}) {
    const start = Date.now();
    while (Date.now() - start <= maxMs) {
      const m = await this.getMemory(memoryId);
      if (m.processing_status === "completed") return m;
      await sleep(intervalMs);
    }
    return null; // still indexing; safe to retry the search later
  }

  // 4. Low-latency search
  search(collectionId, query, { fast = true, limit = 8, minScore } = {}) {
    return this.#req("/search", {
      method: "POST",
      body: {
        collection_id: collectionId, query, limit, fast,
        ...(minScore !== undefined ? { min_score: minScore } : {}),
      },
    });
  }

  // 6. Audit view: includes retired facts + supersede chain
  listMemories(collectionId, { includeSuperseded = false, limit = 50 } = {}) {
    const q = new URLSearchParams({ collection_id: collectionId, limit: String(limit) });
    if (includeSuperseded) q.set("include_superseded", "true");
    return this.#req(`/memories?${q.toString()}`);
  }

  // 7. Memory-aware chat (OpenAI-compatible)
  chat(collectionId, messages, { model = "gpt-5-nano", learning = false } = {}) {
    return this.#req("/chat/completions", {
      method: "POST",
      body: { model, collection_id: collectionId, messages, features: { memory: true, learning } },
    });
  }
}
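
The client above throws errors that carry `status` and `body`, which is enough to layer a retry on top. The following is a minimal sketch, assuming that 429 and 5xx responses are worth retrying (this page does not say which statuses Hebbrix treats as transient); `withRetry` is a hypothetical helper.

```javascript
// Sketch: retry a call with exponential backoff when the thrown error looks transient.
// Assumption: HTTP 429 and 5xx are retryable; anything else is rethrown immediately.
async function withRetry(fn, { retries = 3, baseMs = 200 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const retryable = err.status === 429 || (err.status >= 500 && err.status <= 599);
      if (!retryable || attempt >= retries) throw err;
      await new Promise((r) => setTimeout(r, baseMs * 2 ** attempt));
    }
  }
}
```

It is probably safest to wrap reads such as `withRetry(() => mem.search(collectionId, query))`; retrying a write like `process` could record the same turn twice unless the API de-duplicates it.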

End-to-end example

example.js
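// Run as an ES module (top-level await). Assumes the HebbrixMemory class from
// hebbrix-memory.js is defined in, or imported into, this file.
const mem = new HebbrixMemory(process.env.HEBBRIX_API_KEY);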

// 1. One collection per user (do this once, then store the id).
const collectionId = await mem.createUserCollection("user-123");

// 2. Process a turn, then a correction.
await mem.process(collectionId, [
  { role: "user", content: "My preferred IDE is Cursor and my timezone is America/New_York." },
  { role: "assistant", content: "Noted." },
]);

const correction = await mem.process(collectionId, [
  { role: "user", content: "Actually, my preferred IDE is VS Code, not Cursor." },
  { role: "assistant", content: "Updated." },
]);

// The UPDATE event's memory_id is the NEW (current) fact.
const updated = correction.events.find((e) => e.event === "UPDATE");

// 3. Read-after-write: wait until the new fact is searchable.
if (updated?.memory_id) await mem.waitUntilSearchable(updated.memory_id);

// 4. Search (fast path) — returns the corrected fact, not the stale one.
const results = await mem.search(collectionId, "what is my preferred IDE");
//   -> { results: [{ memory_id, content: "User's preferred IDE is VS Code.", score, conflict_status: "none" }] }

// 5. Filter current facts (search already excludes retired facts; this is the
//    rule for any raw-memory reads).
const current = (results.results || []).filter((r) => (r.conflict_status ?? "none") === "none");

// 6. Audit view — see the supersede chain.
const audit = await mem.listMemories(collectionId, { includeSuperseded: true });
//   items include the retired fact:
//   { content: "User's preferred IDE is Cursor.", conflict_status: "superseded",
//     superseded_by_id: "<id of the VS Code memory>" }

// 7. Memory-aware chat, scoped to the user's collection.
const reply = await mem.chat(collectionId, [
  { role: "user", content: "Which IDE should I open?" },
], { learning: true });
//   reply.choices[0].message.content references the current fact (VS Code).
//   reply.memory_context.memories_used -> how many memories grounded the answer.

Read-after-write consistency

Writes are durable on return; vector + keyword indexing is asynchronous (~2–8s). A memory is guaranteed searchable once its processing_status is "completed".

read-after-write
POST /v1/memories/process        // returns once the write is durable (events + ids)
GET  /v1/memories/{id}           // poll until processing_status == "completed"
POST /v1/search                  // now reliably reflects the write

If you don't need strict read-after-write (e.g. background ingestion), skip the poll — the memory becomes searchable on its own.
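
The poll in the client above uses a fixed interval. As a variation, the sketch below polls with capped exponential backoff, which checks quickly at first and backs off if indexing takes longer. `waitForCompleted` and its option names are hypothetical, and `getMemory` is any function that resolves to an object with `processing_status`.

```javascript
// Sketch: poll until processing_status is "completed", doubling the delay up to capMs.
// Returns the memory on success, or null if maxMs elapses first.
async function waitForCompleted(getMemory, memoryId, { maxMs = 15000, firstMs = 500, capMs = 4000 } = {}) {
  const deadline = Date.now() + maxMs;
  let delay = firstMs;
  for (;;) {
    const m = await getMemory(memoryId);
    if (m.processing_status === "completed") return m;
    const remaining = deadline - Date.now();
    if (remaining <= 0) return null; // still indexing; the caller may retry later
    await new Promise((r) => setTimeout(r, Math.min(delay, remaining)));
    delay = Math.min(delay * 2, capMs);
  }
}
```

With the client from this page: `await waitForCompleted((id) => mem.getMemory(id), updated.memory_id)`.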

Current vs. retired facts

  • conflict_status == "none" — the current fact.
  • conflict_status == "superseded" / "duplicate" — retired.
  • GET /v1/memories (default) and /v1/search exclude retired facts.
  • GET /v1/memories?include_superseded=true includes them, each with superseded_by_id pointing at the replacement.
  • superseded_by_id is populated in normal correction flows; it can be null only if the replacement memory was itself later deleted. Treat conflict_status as the authoritative "is this retired?" signal.
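
For an audit view, the rules above can be turned into a small chain walker that starts at a retired memory and follows superseded_by_id to the current fact. This is a sketch under assumptions: `resolveCurrent` is a hypothetical helper, and it assumes each listed item exposes its identifier as `id` (the field name is not confirmed on this page; adjust if the listing uses `memory_id`).

```javascript
// Sketch: follow superseded_by_id from any memory to the current replacement.
// Returns the current memory, or null if the chain is broken, cyclic, or the id is unknown.
// Assumption: items carry `id`, `conflict_status`, and `superseded_by_id`.
function resolveCurrent(items, startId) {
  const byId = new Map(items.map((m) => [m.id, m]));
  const seen = new Set();
  let m = byId.get(startId);
  while (m && (m.conflict_status ?? "none") !== "none") {
    if (seen.has(m.id) || m.superseded_by_id == null) return null;
    seen.add(m.id);
    m = byId.get(m.superseded_by_id);
  }
  return m ?? null;
}
```

Pass it the items from a listing fetched with include_superseded=true, so that both the retired memories and their replacements are present.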

Latency expectations (warm, scoped)

Endpoint | p50 | p95 | Notes
POST /v1/search (fast:true) | ~350–450ms | ~450ms | Fusion-only ranking; consistent. Use for latency-sensitive paths.
POST /v1/search (normal) | ~400ms | up to ~2–3s | Adds a neural reranker; p95 includes occasional outliers.
POST /v1/memories/process | ~5s | ~7s | LLM fact extraction / conflict resolution.
POST /v1/search/reason | ~2.7s | ~3.6s | Multi-step grounded reasoning (LLM).
POST /v1/chat/completions (memory, scoped) | ~2s | ~4.6s | learning:true adds no response-time cost.

Figures are warm calls. The first request after idle or a deploy pays a one-time cold-start — send a lightweight warm-up request if you need a predictable first call. Memory-enabled chat without a collection_id searches the whole account and is much slower; always scope it.
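
To compare these figures with your own deployment, you can time warm calls and compute p50/p95 yourself. The sketch below uses the nearest-rank percentile definition; `percentile` and `timeCall` are hypothetical helpers, and `performance.now()` is the global available in Node.js 18+.

```javascript
// Sketch: nearest-rank percentile over latency samples (milliseconds).
function percentile(samples, p) {
  if (samples.length === 0) throw new RangeError("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// Time a single async call and return its duration in milliseconds.
async function timeCall(fn) {
  const t0 = performance.now();
  await fn();
  return performance.now() - t0;
}
```

Discard the first sample after idle or a deploy if you want warm-only numbers, since that call includes the cold start.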

Agent (policy / correction memory)

The same pattern works for an autonomous agent — store policies and corrections, and retrieve grounded policy before acting.

agent.js
// Assumes `mem` (a HebbrixMemory instance) and `agentCollectionId` from your own setup.
// Store a policy / correction
await mem.process(agentCollectionId, [
  { role: "user", content: "For customer ACME, refunds above $250 require human approval." },
  { role: "assistant", content: "Recorded." },
]);

// Before acting, retrieve grounded policy (fast path)
const policy = await mem.search(agentCollectionId, "refund approval threshold for ACME", { fast: true });

// For a grounded natural-language decision, use reasoning search
const decision = await fetch("https://api.hebbrix.com/v1/search/reason", {
  method: "POST",
  headers: { Authorization: `Bearer ${process.env.HEBBRIX_API_KEY}`, "Content-Type": "application/json" },
  body: JSON.stringify({
    collection_id: agentCollectionId,
    query: "Can I approve a $275 refund for ACME without a human?",
  }),
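}).then((r) => {
  // The endpoint returns non-2xx on internal failure, so check before parsing.
  if (!r.ok) throw new Error(`HTTP ${r.status}`);
  return r.json();
});
//   decision.answer -> grounded answer citing the policy.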
