Billing & Usage
Understand Hebbrix pricing tiers, monitor your API usage, and manage your subscription programmatically.
Pricing Tiers
Free
For side projects and experimentation
- 1K credits/month
- GPT-5-nano model
- Hybrid search
- Community support
Starter
For indie developers
- 25K credits/month
- GPT-5-nano model
- Hybrid search
- BYOK support
- Priority support
Pro
For teams and production apps
- 200K credits/month
- GPT-5-nano model
- RL training
- 99.9% SLA guarantee
- Advanced analytics
Scale
For teams with heavy workloads
- 1M credits/month
- GPT-5-nano model
- Knowledge graph
- Team collaboration
- Custom integrations
Enterprise
For large organizations
- Pay as you go
- Any model (BYOK)
- Dedicated infrastructure
- SOC2 / HIPAA
- Custom SLA
Usage Limits
| Resource | Free | Starter | Pro | Scale | Enterprise |
|---|---|---|---|---|---|
| Price | $0/mo | $19/mo | $99/mo | $399/mo | Custom |
| Credits | 1K/mo | 25K/mo | 200K/mo | 1M/mo | Pay as you go |
| Rate limit | 60/min | 300/min | 1,200/min | 2,000/min | 3,000/min |
| Default Model | GPT-5-nano | GPT-5-mini | GPT-5-mini | GPT-5-mini | GPT-5-mini (Any with BYOK) |
| Knowledge Graph | - | - | ✓ | ✓ | ✓ |
| BYOK | - | ✓ | ✓ | ✓ | ✓ |
| RL Training | - | - | ✓ | ✓ | ✓ |
| SLA | - | - | 99.9% | 99.9% | Custom |
| Support | Community | Priority | Priority | Priority | 24/7 Dedicated |
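The limits in the table above can be encoded as data when you want to pick a plan from expected traffic. The following is a minimal sketch, not part of the Hebbrix SDK: `Tier`, `TIERS`, and `smallest_tier_for` are hypothetical names, and the lowercase tier identifiers other than `pro` are assumptions. The numbers are copied from the table; Enterprise is omitted because its price is custom and its credits are pay as you go.

```python
from typing import NamedTuple, Optional


class Tier(NamedTuple):
    name: str               # lowercase identifier (assumed, except "pro")
    price_usd: int          # monthly price from the table
    credits_per_month: int  # monthly credit allowance
    requests_per_min: int   # rate limit


# Values copied from the Usage Limits table, ordered by price.
TIERS = [
    Tier("free", 0, 1_000, 60),
    Tier("starter", 19, 25_000, 300),
    Tier("pro", 99, 200_000, 1_200),
    Tier("scale", 399, 1_000_000, 2_000),
]


def smallest_tier_for(credits_per_month: int,
                      peak_requests_per_min: int) -> Optional[Tier]:
    """Return the cheapest listed tier that covers both needs, or None."""
    for tier in TIERS:
        if (tier.credits_per_month >= credits_per_month
                and tier.requests_per_min >= peak_requests_per_min):
            return tier
    return None  # beyond Scale: talk to sales about Enterprise


choice = smallest_tier_for(credits_per_month=40_000, peak_requests_per_min=100)
print(choice.name if choice else "enterprise")
```

The function checks credits and rate limit together because either one can be the binding constraint: a low-volume workload with short bursts above 1,200 requests per minute still needs Scale.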
Code Examples
Check Usage
import os
import requests
BASE = "https://api.hebbrix.com/v1"
H = {"Authorization": f"Bearer {os.environ['HEBBRIX_API_KEY']}"}
# GET /v1/usage: dashboard-style overview, with the api_calls object
# exposed at the top level of the response.
r = requests.get(f"{BASE}/usage", headers=H)
usage = r.json()
ac = usage["api_calls"]
print(f"API Calls: {ac['used']}/{ac['limit']}") # limit = -1 for unlimited tiers
print(f"Remaining: {ac['remaining']}")
print(f"Percentage: {ac['percentage']:.1f}%")
# Summary block has success_rate, total_bytes_*, avg_latency_ms, etc.
print(f"Avg latency: {usage['summary']['avg_latency_ms']} ms")
Monitor Usage Programmatically
# Warn when nearing the quota. Handle "unlimited" (limit=-1) safely.
r = requests.get(f"{BASE}/usage", headers=H)
ac = r.json()["api_calls"]
if ac["limit"] > 0:
    percent_used = (ac["used"] / ac["limit"]) * 100
    if percent_used > 80:
        print(f"Warning: {percent_used:.1f}% of API calls used this period")
    if percent_used > 95:
        print("Critical: upgrade or wait for the quota reset")
else:
    print("Unlimited tier: no quota pressure")
Upgrade Subscription
# POST /v1/billing/upgrade — body accepts `tier` (preferred) or the
# legacy `plan` key. Both resolve to the same internal field.
r = requests.post(
    f"{BASE}/billing/upgrade",
    headers=H,
    json={"tier": "pro", "billing_interval": "monthly"},
)
new_sub = r.json()
# GET /v1/billing/subscription — current plan + renewal date
r = requests.get(f"{BASE}/billing/subscription", headers=H)
sub = r.json()
print(f"Status: {sub['status']}")
print(f"Renews: {sub['current_period_end']}")
Rate Limit Headers
Every API response includes headers to help you track your rate limits:
| Header | Description |
|---|---|
| X-RateLimit-Limit | Max requests per window |
| X-RateLimit-Remaining | Requests remaining in window |
| X-RateLimit-Reset | Unix timestamp when window resets |
| X-Monthly-Usage | Current month API call count |
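The four headers in the table can be read into a small structure so that client code can decide when to slow down. This is a sketch under stated assumptions: `RateLimitInfo`, `parse_rate_limit_headers`, and `seconds_until_reset` are hypothetical helper names, and the sample values mirror the example response in this section.

```python
import time
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitInfo:
    limit: int          # X-RateLimit-Limit
    remaining: int      # X-RateLimit-Remaining
    reset_at: int       # X-RateLimit-Reset (Unix timestamp)
    monthly_usage: int  # X-Monthly-Usage


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitInfo:
    """Read the four documented headers from a response's header mapping."""
    return RateLimitInfo(
        limit=int(headers["X-RateLimit-Limit"]),
        remaining=int(headers["X-RateLimit-Remaining"]),
        reset_at=int(headers["X-RateLimit-Reset"]),
        monthly_usage=int(headers["X-Monthly-Usage"]),
    )


def seconds_until_reset(info: RateLimitInfo,
                        now: Optional[float] = None) -> float:
    """Seconds left in the current window, never negative."""
    current = time.time() if now is None else now
    return max(0.0, float(info.reset_at) - current)


# Sample header values; a real call would pass r.headers from requests.
sample = {
    "X-RateLimit-Limit": "100",
    "X-RateLimit-Remaining": "95",
    "X-RateLimit-Reset": "1705312800",
    "X-Monthly-Usage": "45230",
}
info = parse_rate_limit_headers(sample)
print(info.remaining, seconds_until_reset(info, now=1705312770))
```

With `requests`, `r.headers` is a case-insensitive mapping, so it can be passed to `parse_rate_limit_headers` directly. The `now` parameter exists only to make the helper easy to test.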
HTTP/1.1 200 OK
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1705312800
X-Monthly-Usage: 45230
Content-Type: application/json
Overage & Limits
When you hit limits
Once you exceed your monthly API call quota, further requests return 429 Too Many Requests. Upgrade your plan or wait until the next billing period.
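One way to handle a 429 on the client is a bounded retry that waits for the window named in `X-RateLimit-Reset`. The sketch below is an assumption about client-side handling, not documented Hebbrix behavior: `call_with_429_retry` is a hypothetical helper, and `send` is any zero-argument callable that returns a `(status_code, headers)` pair.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str]]  # (status code, headers)


def call_with_429_retry(
    send: Callable[[], Response],
    max_attempts: int = 3,
    max_wait_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.time,
) -> Response:
    """Call send(); on 429, wait until X-RateLimit-Reset (capped) and retry."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 1
    while True:
        status, headers = send()
        # Return on success, on any non-429 error, or when out of attempts.
        if status != 429 or attempt >= max_attempts:
            return status, headers
        reset_at = float(headers.get("X-RateLimit-Reset", 0))
        wait = min(max_wait_s, max(0.0, reset_at - clock()))
        sleep(wait)
        attempt += 1


# Demo with canned responses: the first call is throttled, the second succeeds.
responses = iter([(429, {"X-RateLimit-Reset": "1005"}), (200, {})])
waits = []
status, _ = call_with_429_retry(
    send=lambda: next(responses),
    sleep=waits.append,     # record the wait instead of sleeping
    clock=lambda: 1000.0,   # fixed clock for a repeatable demo
)
print(status, waits)
```

With `requests`, `send` can be a small function that performs the call and returns `(r.status_code, r.headers)`. The attempt cap matters because a 429 caused by an exhausted monthly quota will not clear when the rate-limit window resets; in that case the helper gives up and returns the 429 to the caller.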
Billing Cycle
Usage resets monthly on your subscription anniversary date. Unused quota doesn't roll over.
Instant Upgrades
Upgrades take effect immediately with pro-rated billing. New limits apply right away.
Billing response fields
Metered operations return a set of billing fields so you can show cost and remaining quota to your users. The table below documents the meaning and stability of each one.
| Field | Type | Meaning |
|---|---|---|
| quota_remaining | int | Credits left in the current period. A non-negative number is the literal remaining count. -1 means UNLIMITED (enterprise / unmetered plans); always special-case it before doing arithmetic. |
| estimated_cost_usd | float | Estimated USD cost of the operation. 0.0 means there is no metered charge for that call, e.g. a NOOP (a duplicate or no-op memory write) or an operation included in the plan. It is an estimate, not a final invoice line. |
| billed_tokens | int | Tokens actually billed for the call, after any multiplier is applied. |
| actual_tokens | int | Raw tokens consumed before any multiplier. |
| token_multiplier | float | Plan/operation multiplier applied to actual_tokens to produce billed_tokens. |
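The fields above can be turned into a user-facing summary along the following lines. This is a minimal sketch: `summarize_billing` and `can_afford` are hypothetical helpers, the sample values are invented for illustration, and the code assumes the five fields arrive as a flat mapping (where they sit in a real response body is not specified here).

```python
from typing import Any, Mapping

UNLIMITED = -1  # documented sentinel for quota_remaining


def summarize_billing(fields: Mapping[str, Any]) -> str:
    """Build a one-line summary, special-casing the documented sentinels."""
    quota = fields["quota_remaining"]
    quota_text = "unlimited" if quota == UNLIMITED else f"{quota} credits left"

    cost = fields["estimated_cost_usd"]
    # 0.0 means no metered charge (a NOOP or an operation included in the plan).
    cost_text = "no metered charge" if cost == 0.0 else f"about ${cost:.4f} (estimate)"

    return (
        f"{quota_text}; {cost_text}; "
        f"{fields['billed_tokens']} tokens billed "
        f"({fields['actual_tokens']} actual x {fields['token_multiplier']})"
    )


def can_afford(fields: Mapping[str, Any], credits_needed: int) -> bool:
    """Check remaining quota without doing arithmetic on the -1 sentinel."""
    quota = fields["quota_remaining"]
    return quota == UNLIMITED or quota >= credits_needed


# Illustrative values only.
metered = {
    "quota_remaining": 120,
    "estimated_cost_usd": 0.002,
    "billed_tokens": 300,
    "actual_tokens": 200,
    "token_multiplier": 1.5,
}
print(summarize_billing(metered))
print(can_afford(metered, credits_needed=500))
```

Comparing against `UNLIMITED` before any subtraction or percentage keeps a value of -1 from being treated as a negative balance.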
Credit model
1 credit = 1 operation. NOOPs (no-op or duplicate writes) are free. Overage is user-controlled and feature-flagged — you are never charged beyond your plan unless you opt in.
Stability
The following fields are a stable public contract: quota_remaining, estimated_cost_usd, billed_tokens, actual_tokens, and token_multiplier. You can rely on their names and semantics across releases.
cURL Example
GET /v1/usage
curl -X GET "https://api.hebbrix.com/v1/usage" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json"