Latest

Kumar Shivendu's blog

  • A friend said they expand SPLADE terms into an OpenSearch BM25 text field and it works. We tested it and got 0.36 NDCG@10, worse than plain BM25. Here is why, and how to mostly fix it.
  • Vector databases compress text payloads with LZ4 by default. Tokenize the text first and entropy-code the token IDs, and you get ~3x lossless compression, 2.6x what LZ4 manages, using pieces every ML stack already ships.
  • Search is everywhere and is one of the hardest problems in CS. Here's why I'm obsessed with it.
  • Reflections on a year of growth, travel, and pursuing mastery