Architecture
TL;DR: Fabra creates a Context Record for every AI decision — an immutable snapshot of what data was used, where it came from, and what got dropped. DuckDB for local, Postgres + Redis + pgvector for production.
The Context Record
Everything in Fabra flows through the Context Record — the atomic unit of AI accountability.
{
"context_id": "ctx_018f3a2b-...",
"created_at": "2025-02-14T18:42:31.221Z",
"content": "You are a helpful assistant...",
"token_count": 3847,
"features": [
{"name": "user_tier", "value": "premium", "freshness_ms": 1200},
{"name": "support_tickets", "value": 3, "freshness_ms": 45000}
],
"retrieved_items": [
{"chunk_id": "doc_123_chunk_2", "similarity": 0.89, "tokens": 512},
{"chunk_id": "doc_456_chunk_1", "similarity": 0.84, "tokens": 384}
],
"assembly": {
"max_tokens": 4000,
"tokens_used": 3847,
"items_dropped": [
{"chunk_id": "doc_789_chunk_3", "reason": "budget_exceeded", "tokens": 256}
]
},
"integrity": {
"record_hash": "sha256:a1b2c3d4..."
}
}This record answers every question about an AI decision:
- What features? →
features[]with exact values and freshness - What documents? →
retrieved_items[]with similarity scores - What got dropped? →
assembly.items_dropped[]with reasons - Can we verify it? →
integrity.record_hashfor tamper detection
System Overview
┌─────────────────────────────────────────────────────────────────┐
│ Your Application │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ Fabra Server │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────────┐ │
│ │ @feature │ │ @retriever │ │ @context │ │
│ │ decorator │ │ decorator │ │ decorator │ │
│ └──────┬──────┘ └──────┬──────┘ └───────────┬─────────────┘ │
│ │ │ │ │
│ └────────────────┴─────────────────────┘ │
│ │ │
│ ▼ │
│ ┌───────────────────────┐ │
│ │ Context Record │ ← Created on every call │
│ │ (immutable) │ │
│ └───────────┬───────────┘ │
└──────────────────────────┼──────────────────────────────────────┘
│
┌───────────────┼───────────────┐
▼ ▼ ▼
┌────────────┐ ┌────────────┐ ┌────────────┐
│ Online │ │ Offline │ │ Vector │
│ Store │ │ Store │ │ Store │
│ (Redis) │ │ (Postgres) │ │ (pgvector) │
└────────────┘ └────────────┘ └────────────┘The Stack
Local Development
| Component | Technology | Purpose |
|---|---|---|
| Offline Store | DuckDB (embedded) | Feature history, Context Records |
| Online Store | Python dict | Feature cache |
| Vector Store | In-memory | Document embeddings |
| Scheduler | APScheduler | Background refresh |
Infrastructure required: None. Just pip install fabra-ai.
Production
| Component | Technology | Purpose |
|---|---|---|
| Offline Store | Postgres 13+ | Feature history, Context Records |
| Online Store | Redis 6+ | Feature cache, sub-ms reads |
| Vector Store | pgvector | Semantic search |
| Scheduler | Distributed workers | Redis-locked refresh |
Infrastructure required: 1x Postgres (with pgvector), 1x Redis, Nx API pods.
How Context Records Are Created
1. Feature Resolution
When you access a feature, Fabra tracks it:
@feature(entity=User, refresh=timedelta(hours=1))
def user_tier(user_id: str) -> str:
return db.query("SELECT tier FROM users WHERE id = ?", user_id)Every feature access is logged with:
- Value at access time
- Freshness (how old the cached value is)
- Source (cache hit vs computed)
2. Document Retrieval
When you retrieve documents, Fabra tracks them:
@retriever(index="support_docs", top_k=5)
async def search_docs(query: str) -> list[str]:
pass # Fabra handles the vector searchEvery retrieval is logged with:
- Chunk IDs and document sources
- Similarity scores
- Token counts
- Freshness (when indexed)
3. Context Assembly
When you assemble context, Fabra creates the record:
@context(store, max_tokens=4000)
async def chat_context(user_id: str, query: str) -> list[ContextItem]:
docs = await search_docs(query)
tier = await store.get_feature("user_tier", user_id)
return [
ContextItem(content="You are helpful.", priority=0, required=True),
ContextItem(content=f"User tier: {tier}", priority=1),
ContextItem(content=str(docs), priority=2),
]The @context decorator:
- Executes your function
- Tracks all feature accesses and retrievals
- Applies token budget (drops low-priority items if needed)
- Creates immutable Context Record with hash
- Stores record in Offline Store
Token Budget Management
Context assembly uses priority-based truncation:
@context(store, max_tokens=4000)
async def chat_context(...):
return [
ContextItem(content=system, priority=0, required=True), # Never dropped
ContextItem(content=user_data, priority=1, required=True), # Never dropped
ContextItem(content=docs, priority=2), # Dropped if needed
ContextItem(content=suggestions, priority=3), # Dropped first
]When budget is exceeded:
- Sort items by priority (lowest = most important)
- Drop highest-priority items first
- Record every dropped item with reason
- Raise
ContextBudgetErrorif required items can't fit
The Context Record captures what was dropped and why — critical for debugging "why didn't the AI know about X?"
Data Flow Diagrams
Feature Materialization
sequenceDiagram
participant S as Scheduler
participant F as Feature Function
participant O as Offline Store
participant C as Online Cache
S->>F: Trigger refresh
F->>F: Compute value
F->>O: Write (entity_id, value, timestamp)
F->>C: Write (entity_id, value, TTL)Context Assembly
sequenceDiagram
participant A as Application
participant C as @context
participant F as Features
participant R as Retriever
participant V as Vector Store
participant O as Offline Store
A->>C: chat_context(user_id, query)
C->>F: get_feature("user_tier")
F-->>C: "premium" (tracked)
C->>R: search_docs(query)
R->>V: Vector similarity
V-->>R: Top-K chunks
R-->>C: Documents (tracked)
C->>C: Apply token budget
C->>C: Create Context Record
C->>O: Store record
C-->>A: Context with ctx.idPoint-in-Time Correctness
Training/serving skew kills ML models. Fabra ensures consistency:
The Problem
Your model trains on Monday's features but serves Tuesday's features. The model learned patterns that no longer exist.
The Solution
Fabra's get_training_data() uses "as-of" joins:
# Get features AS THEY WERE at each event time
training_df = await store.get_training_data(
entity_df=events, # DataFrame with entity_id + timestamp
features=["user_tier", "purchase_count"],
)DuckDB (local):
SELECT e.*, f.value
FROM entity_df e
ASOF LEFT JOIN feature_table f
ON e.entity_id = f.entity_id AND e.timestamp >= f.timestampPostgres (production):
LEFT JOIN LATERAL (
SELECT value FROM feature_table f
WHERE f.entity_id = e.entity_id AND f.timestamp <= e.timestamp
ORDER BY f.timestamp DESC LIMIT 1
) f ON TRUESame logic offline (training) and online (serving). No skew.
Design Decisions
Why Redis-Only Caching?
Most feature stores use L1 (in-memory) + L2 (Redis). This creates cache coherence bugs — different pods see different values.
Fabra uses Redis-only:
- 1ms localhost latency (fast enough)
- Strong consistency (all pods see same value)
- Simpler debugging (one source of truth)
For fraud detection and financial decisions, correctness > micro-optimization.
Why No Leader Election?
Distributed schedulers typically use leader election or consistent hashing. Complex, fragile, hard to debug.
Fabra uses randomized work selection + Redis SETNX locks:
- Any worker can pick up any job
- Redis lock prevents double-execution
- Self-healing, no topology awareness needed
- "Even enough" distribution statistically
Why Explicit Over Magic?
No auto-caching hot features. No query optimization. No hidden materialization.
You write materialize=True, we cache. That's it.
Predictability > cleverness when debugging production issues.
Integrity and Verification
Every Context Record includes cryptographic integrity:
# Verify a record hasn't been tampered with
fabra context verify ctx_018f3a2b-...
# ✓ Record integrity verified (hash matches)The record_hash is computed from canonical JSON of all fields (excluding the hash itself). Any modification — even a single character — produces a different hash.
This matters for:
- Compliance: Prove records haven't been altered
- Debugging: Trust that replays show real data
- Audits: Evidence, not assertions