Use Case: Churn Prediction
Churn prediction is the "Hello World" of Point-in-Time Correctness. If you get this wrong, your model will look perfect in training but fail in production.
The Problem: Data Leakage
Imagine you are training a model to predict if a user will churn next month. You have a features table:
| user_id | txn_count | timestamp |
|---|---|---|
| u1 | 10 | 2024-01-01 |
| u1 | 50 | 2024-02-01 |
And a labels table (churn events):
| user_id | churned | timestamp |
|---|---|---|
| u1 | True | 2024-01-15 |
If you naively join these tables on user_id, you might accidentally use the txn_count=50 feature (from Feb 1st) to predict the churn event (on Jan 15th). This is data leakage. You are using data from the future to predict the past.
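You can reproduce the leak with a plain pandas merge on `user_id` alone. This is an illustrative sketch using the two tiny tables above (column names match the tables):

```python
import pandas as pd

features = pd.DataFrame({
    "user_id": ["u1", "u1"],
    "txn_count": [10, 50],
    "timestamp": pd.to_datetime(["2024-01-01", "2024-02-01"]),
})
labels = pd.DataFrame({
    "user_id": ["u1"],
    "churned": [True],
    "timestamp": pd.to_datetime(["2024-01-15"]),
})

# Joining on user_id alone pairs the Jan 15th label with *every* feature row,
# including the Feb 1st row (txn_count=50) that did not exist yet at label time.
leaky = labels.merge(features, on="user_id", suffixes=("_label", "_feature"))
print(leaky)
```

The merged frame contains a row where the Jan 15th label sits next to txn_count=50, which is exactly the leakage described above.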
The Solution: Point-in-Time Correctness
Fabra solves this automatically using an ASOF JOIN (DuckDB) or a LATERAL JOIN (Postgres).
```python
# features.py
@feature(entity=User, sql="SELECT * FROM transactions", materialize=True)
def txn_count(user_id: str) -> int:
    return 0
```
```python
# training.py
training_df = await store.get_training_data(
    entity_df=labels_df,  # contains user_id and timestamp (Jan 15th)
    features=["txn_count"],
)
```

Fabra ensures that for the label on Jan 15th, it only sees the feature value from Jan 1st (txn_count=10). It ignores the future value.
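Under the hood, a point-in-time join answers one question per label row: what was the latest feature value at or before the label's timestamp? Here is a minimal DuckDB sketch of that ASOF JOIN, using the same toy data (illustrative only; this is not Fabra's internal query, and the table names are made up):

```python
import duckdb

con = duckdb.connect()
con.execute("CREATE TABLE features (user_id VARCHAR, txn_count INTEGER, ts TIMESTAMP)")
con.execute("INSERT INTO features VALUES ('u1', 10, '2024-01-01'), ('u1', 50, '2024-02-01')")
con.execute("CREATE TABLE labels (user_id VARCHAR, churned BOOLEAN, ts TIMESTAMP)")
con.execute("INSERT INTO labels VALUES ('u1', TRUE, '2024-01-15')")

# For each label row, ASOF JOIN keeps only the most recent feature row whose
# timestamp is <= the label timestamp -- here the Jan 1st row (txn_count=10).
pit = con.execute("""
    SELECT l.user_id, l.churned, l.ts AS label_ts, f.txn_count
    FROM labels l
    ASOF JOIN features f
      ON l.user_id = f.user_id AND l.ts >= f.ts
""").df()
print(pit)
```

On Postgres, the same semantics are expressed with a LATERAL JOIN that picks the newest feature row at or before each label's timestamp.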
Why This Matters
- Correctness: Your offline metrics (AUC/F1) will match online performance.
- Simplicity: You don't need to write complex window functions or temporal joins manually.
- Consistency: The same logic applies whether you are using DuckDB locally or Postgres in production.
Next Steps
- Feature Store Without K8s — Get started in 30 seconds
- Feast vs Fabra — Why we're simpler
- Quickstart — Full setup guide