Full Quality, 13× Faster Queries

Inference-free SPLADE nearly matches full SPLADE quality at 13× lower query latency, no GPU needed at query time. Benchmarked against full SPLADE and BM25 on Qdrant and pyserini.

Inference-Free SPLADE

Read and Write in Your Agent's Language

Agents read and write in tokens (BPE). But our DB engines are still designed for humans (UTF-8). Persist payloads as BPE token IDs with a static entropy coder and you get ~3.3× lossless compression, zero-cost persistence of LLM output, and a representation agents already speak.

Token-Native Storage

Search is everywhere and is one of the hardest problems in CS. Here's why I'm obsessed with it.

Why I'm Obsessed With Search

Year in Review

Reflections on a year of growth, travel, and pursuing mastery

2025

I Reverse-Engineered Exa.ai Infrastructure Cost with Napkin Math

Exploring the JSON embeddings to for matching new products with existing ones