Fabra

Fabra: The Inference Context Ledger

Prove what your AI knew.

Fabra captures exactly what data your AI used at decision time — with full lineage, freshness guarantees, and replay. From notebook to production in 30 seconds.

Get Started → | Try in Browser →

At a Glance

  • What: Inference Context Ledger — we own the write path
  • Context Record: Immutable snapshot of AI decision context
  • Install: pip install fabra-ai
  • Features: @feature decorator for ML features
  • RAG: @retriever + @context for LLM context assembly
  • Vector DB: pgvector (Postgres extension)
  • Local: DuckDB + in-memory (zero setup)
  • Production: Postgres + Redis (one env var)
  • Deploy: fabra deploy fly|cloudrun|ecs|railway|render

The Problem

You're building an AI app. You need:

  • Structured features (user tier, purchase history) for personalization
  • Unstructured context (relevant docs, chat history) for your LLM
  • Vector search for semantic retrieval
  • Token budgets to fit your context window

Today, this means stitching together LangChain, Pinecone, a feature store, Redis, and prayer.

Fabra stores, indexes, and serves the data your AI uses — and tracks exactly what was retrieved for every decision.

This is "write path ownership": we ingest and manage your context data, not just query it. This enables replay, lineage, and traceability that read-only wrappers cannot provide.


The 30-Second Quickstart

Fastest Path

pip install fabra-ai && fabra demo

That's it. Server starts, makes a test request, and prints a context_id (your receipt). No Docker. No config files. No API keys.

Next:

fabra context show <context_id>
fabra context verify <context_id>

Build Your Own

pip install fabra-ai

# features.py
from fabra.core import FeatureStore, entity, feature
from fabra.context import context, ContextItem
from fabra.retrieval import retriever
from datetime import timedelta

store = FeatureStore()

@entity(store)
class User:
    user_id: str

@feature(entity=User, refresh=timedelta(days=1))
def user_tier(user_id: str) -> str:
    return "premium" if hash(user_id) % 2 == 0 else "free"

@retriever(index="docs", top_k=3)
async def find_docs(query: str):
    pass  # Automatic vector search via pgvector

@context(store, max_tokens=4000)
async def build_prompt(user_id: str, query: str):
    tier = await store.get_feature("user_tier", user_id)
    docs = await find_docs(query)
    return [
        ContextItem(content=f"User is {tier}.", priority=0),
        ContextItem(content=str(docs), priority=1),
    ]

fabra serve features.py
# Server running on http://localhost:8000

curl "localhost:8000/features/user_tier?entity_id=user123"
# {"value": "premium", "freshness_ms": 0, "served_from": "online"}

That's it. No infrastructure. No config files. Just Python.


Why Fabra?

                      Traditional Stack                Fabra
  Config              500 lines of YAML                Python decorators
  Infrastructure      Kubernetes + Spark + Pinecone    Your laptop (DuckDB)
  RAG Pipeline        LangChain spaghetti              @retriever + @context
  Feature Serving     Separate feature store           Same @feature decorator
  Time to Production  Weeks                            30 seconds

We Own the Write Path

LangChain and other frameworks are read-only wrappers — they query your data but don't manage it. Fabra is the system of record for inference context. Every context assembly becomes a durable Context Record with:

  • Cryptographic integrity (tamper-evident hashes)
  • Full lineage (what data was used, when, from where)
  • Point-in-time replay (reproduce any decision exactly)
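
To make "tamper-evident" concrete, here is a conceptual sketch of how a content hash catches edits. It illustrates the general technique only; the field names are made up and this is not Fabra's internal record format.

# Conceptual sketch of a tamper-evident record hash (field names are illustrative).
import hashlib
import json

def record_digest(record: dict) -> str:
    # Hash a canonical JSON encoding so the digest is stable across key order.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

record = {
    "context_id": "example-uuidv7-id",
    "items": ["User is premium.", "retrieved doc snippets"],
    "assembled_at": "2024-06-01T12:00:00Z",
}
digest = record_digest(record)

# A later verify step recomputes the digest; any edit to the record changes it.
assert digest == record_digest(record)
record["items"][0] = "User is free."
assert digest != record_digest(record)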

Infrastructure, Not a Framework

Fabra is not an orchestration layer. It's the system of record for what your AI knows: features, retrievers, and context assembly live in one infrastructure layer, with production-grade reliability built in.

Local-First, Production-Ready

FABRA_ENV=development  # DuckDB + In-Memory (default)
FABRA_ENV=production   # Postgres + Redis + pgvector

Same code. Zero changes. Just flip an environment variable.

Point-in-Time Correctness

Training ML models? We use ASOF JOIN (DuckDB) and LATERAL JOIN (Postgres) to ensure your training data reflects the world exactly as it was — no data leakage, ever.
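
As a standalone illustration of the underlying technique (sample tables, not Fabra internals), the ASOF JOIN below picks, for each training label, the latest feature value at or before the label's timestamp, so a later update can never leak into training:

# Illustration of point-in-time joins with DuckDB's ASOF JOIN (sample data only).
import duckdb

con = duckdb.connect()
con.execute("CREATE TABLE labels(user_id TEXT, ts TIMESTAMP, churned BOOLEAN)")
con.execute("CREATE TABLE user_tier(user_id TEXT, ts TIMESTAMP, tier TEXT)")
con.execute("INSERT INTO labels VALUES ('u1', '2024-06-01', false)")
con.execute("""
    INSERT INTO user_tier VALUES
        ('u1', '2024-05-01', 'free'),     -- known before the label: usable
        ('u1', '2024-07-01', 'premium')   -- from the future: must not leak
""")

rows = con.execute("""
    SELECT l.user_id, l.ts, l.churned, f.tier
    FROM labels l
    ASOF JOIN user_tier f
      ON l.user_id = f.user_id AND l.ts >= f.ts
""").fetchall()
print(rows)  # the 2024-06-01 label sees 'free'; the 2024-07-01 'premium' row never leaks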

Token Budget Management

@context(store, max_tokens=4000)
async def build_prompt(user_id: str, query: str):
    return [
        ContextItem(content=critical_info, priority=0, required=True),
        ContextItem(content=nice_to_have, priority=2),  # Dropped if over budget
    ]

Automatically assembles context that fits your LLM's window. Priority-based truncation. No more "context too long" errors.
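
Conceptually, the assembly strategy looks like the sketch below. This is a simplified stand-in, not Fabra's implementation; real token counting would use a tokenizer rather than a word count.

# Simplified sketch of priority-based context truncation (illustrative only).
from dataclasses import dataclass

@dataclass
class Item:
    content: str
    priority: int            # lower number = more important
    required: bool = False   # required items are always kept

def assemble(items: list[Item], max_tokens: int) -> list[Item]:
    def cost(item: Item) -> int:
        return len(item.content.split())   # crude stand-in for a real tokenizer
    kept: list[Item] = []
    used = 0
    # Required items first, then best priority first; drop whatever no longer fits.
    for item in sorted(items, key=lambda i: (not i.required, i.priority)):
        if item.required or used + cost(item) <= max_tokens:
            kept.append(item)
            used += cost(item)
    return kept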

Production-Grade Reliability

  • Self-Healing: fabra doctor diagnoses environment issues
  • Fallback Chain: Cache → Compute → Default (see the sketch after this list)
  • Circuit Breakers: Built-in protection against cascading failures
  • Observability: Prometheus metrics, structured JSON logging, OpenTelemetry
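
The fallback chain above, as a conceptual sketch (not Fabra's internals): try the online cache first, recompute on a miss, and fall back to a static default instead of failing the request.

# Conceptual sketch of a cache → compute → default fallback chain (illustrative).
from typing import Callable, Optional

def serve_value(
    key: str,
    cache_get: Callable[[str], Optional[str]],
    compute: Callable[[str], str],
    default: str,
) -> str:
    cached = cache_get(key)
    if cached is not None:
        return cached              # 1. fast path: serve from the online cache
    try:
        return compute(key)        # 2. cache miss: recompute the value
    except Exception:
        return default             # 3. dependency failure: degrade to a safe default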

Key Capabilities

For AI Engineers (Context Store)

  • Vector Search: Built-in pgvector with automatic chunking and embedding
  • Magic Retrievers: @retriever auto-wires to your vector index
  • Context Assembly: Token budgets, priority truncation, explainability API
  • Semantic Cache: Cache expensive LLM calls and retrieval results

For ML Engineers (Feature Store)

For Everyone

For Compliance & Debugging

  • Context Accountability: Full lineage tracking — every AI decision traces back through the data that informed it
  • Context Replay: Reproduce exactly what your AI knew at any point in time for debugging and compliance
  • Traceability: UUIDv7-based context IDs with complete data provenance
  • Freshness SLAs: Ensure data freshness with configurable thresholds and degraded mode

Use Cases


Start Here

I'm an ML Engineer: "I need to serve features without Kubernetes"
  • Feature Store Without K8s →
  • Feast vs Fabra →
  • Quickstart (ML Track) →

I'm an AI Engineer: "I need RAG with traceability"
  • Context Accountability →
  • Context Store →
  • Quickstart (AI Track) →

Building in a regulated industry? Compliance Guide →


Documentation

Getting Started

For ML Engineers

For AI Engineers

Guides

  • Comparisons — vs Feast, LangChain, Pinecone, Tecton

Tools

  • WebUI — Visual feature store & context explorer

Specifications

Reference

Blog


Quick FAQ

Q: What is Fabra? A: Fabra is context infrastructure for AI applications. It stores, indexes, and serves the data your AI uses — and tracks exactly what was retrieved for every decision. We call this "write path ownership": we manage your context data, not just query it.

Q: How is Fabra different from LangChain? A: LangChain is a framework (orchestration). Fabra is infrastructure (storage + serving). LangChain queries external stores; Fabra owns the write path with freshness tracking, replay, and full lineage. You can use both together.

Q: How is Fabra different from Feast? A: Fabra is a lightweight alternative with Python decorators instead of YAML, plus built-in context/RAG support (vector search, token budgeting, lineage) that Feast doesn't have.

Q: Do I need Kubernetes or Docker? A: No. Fabra runs locally with DuckDB and in-memory cache. For production, set FABRA_ENV=production with Postgres and Redis.

Q: What vector database does Fabra use? A: pgvector (Postgres extension). Your vectors live alongside your relational data—no separate vector database required.


Contributing

We love contributions! See CONTRIBUTING.md to get started.