jakozaur 13 hours ago

This feels similar to TurboBuffer, which is also built on top of S3 storage. TurboBuffer has been a leader in this space, powering vector search for major companies like Cursor, Linear, and Notion.

It seems AWS is leveraging its strong S3 brand to compete directly in the vector database market.

For more context, check out the TurboBuffer architecture docs and Notion’s presentation at Data Council:

https://turbopuffer.com/docs/architecture

https://www.youtube.com/watch?v=_yb6Nw21QxA

(anti-disclaimer: I'm not affiliated with TurboBuffer in any way.)

  • benji1009 11 hours ago

    turbopuffer is the name btw

bob1029 11 hours ago

I still contend that what most people want is traditional full text search, not another layer of black box weirdness behind the LLM.

You already have a model with incredibly powerful semantic understanding. Why do we need the document store to also be a smartass? The model can project multiple OR clauses into the search term based upon its interpretation of the context.

If you are using something like Lucene, queries are extremely fast, and the maximum number of documents supported in one index far exceeds what AWS says they can support here.
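To make the "multiple OR clauses" idea concrete, here is a minimal sketch, assuming the model has already returned a list of alternative phrasings. The function name `build_or_query` and the example terms are hypothetical. The output follows the classic Lucene query-parser convention of double-quoted phrases joined with `OR`, and only backslashes and double quotes are escaped.

```python
def build_or_query(terms):
    """Join terms into a Lucene-style OR query, quoting each term as a phrase."""
    clauses = []
    for term in terms:
        term = term.strip()
        if not term:
            continue  # skip empty suggestions
        # Escape the two characters that would break a quoted phrase.
        escaped = term.replace("\\", "\\\\").replace('"', '\\"')
        clauses.append(f'"{escaped}"')
    return " OR ".join(clauses)


# Hypothetical expansions a model might propose for the user's intent.
expansions = ["invoice", "billing statement", "receipt"]
print(build_or_query(expansions))
```

The resulting string can be handed to any full-text engine that accepts this syntax. The interpretation step stays in the model, and the index only does exact matching.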

  • pilotneko 8 hours ago

    Keyword alone sucks for negation. Searching a set of patient documents for “Which of my patients has COPD?” and getting back a set of responses that state “COPD not indicated” will not endear you to the physician using your tool. Hybrid (keyword + semantic) is much more all-encompassing.

    • bob1029 7 hours ago

      Forwarding the user's query directly to the document store seems ridiculous to me. The whole point is for the LLM to interpret the context and issue multiple targeted queries based upon the interpretation(s) arrived at.

      The LLM is the semantic part. FTS is the keyword part. This is the hybrid you're looking for.

      • pilotneko 4 hours ago

        Sometimes you are searching for supporting evidence that is semantically related. COPD was just an example; you won’t get a direct keyword match if the physician is searching for lung disease.
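Both failure modes raised in this subthread (negated mentions matching, and related terms not matching) can be reproduced with a toy keyword matcher. This is only a sketch: `keyword_search` and the three sample notes are made up for illustration, and it does not model any real search engine's scoring.

```python
import re


def keyword_search(query_terms, docs):
    """Return ids of docs containing every query term as a whole word."""
    hits = []
    for doc_id, text in docs.items():
        words = set(re.findall(r"[a-z0-9]+", text.lower()))
        if all(term.lower() in words for term in query_terms):
            hits.append(doc_id)
    return hits


# Hypothetical patient notes.
docs = {
    "a": "Patient diagnosed with COPD in 2019.",
    "b": "COPD not indicated.",
    "c": "History of chronic bronchitis and emphysema.",
}

print(keyword_search(["COPD"], docs))             # ['a', 'b']
print(keyword_search(["lung", "disease"], docs))  # []
```

The first query returns the negated note "b" alongside the true positive, and the second finds nothing even though all three notes concern lung disease. Whether that gap is better closed by query expansion in the model or by a semantic index is the point under debate above.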