thirdtrigger 4 days ago

Might be interesting to add an optional embedded Weaviate [1] with a flat index [2] to the project. It wouldn't use external services and is fully disk-based. It would let you search the whole filesystem, at about 1.5 KB per file (384 dimensions), which would be added to the metadata as well.
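As a rough check on the 1.5 KB figure, and to show what a flat index boils down to, here is a minimal pure-Python sketch. It assumes 4-byte float32 components and cosine similarity; `vector_size_bytes`, `cosine`, and `flat_search` are hypothetical names for illustration, not Weaviate API.

```python
import math
import struct


def vector_size_bytes(dimensions: int) -> int:
    # Assumption: each component is stored as a 4-byte float32.
    return dimensions * struct.calcsize("<f")


def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def flat_search(index, query, k=3):
    # A flat index has no graph or tree: every stored vector is
    # compared against the query (brute-force scan).
    scored = [(cosine(vec, query), path) for path, vec in index.items()]
    scored.sort(reverse=True)
    return [path for _, path in scored[:k]]


# Toy 3-dimensional vectors standing in for 384-dimensional embeddings.
index = {
    "notes/todo.txt": [1.0, 0.0, 0.0],
    "src/main.py": [0.0, 1.0, 0.0],
    "docs/readme.md": [0.9, 0.1, 0.0],
}

print(vector_size_bytes(384))  # 1536 bytes, i.e. 1.5 KiB per file
print(flat_search(index, [1.0, 0.0, 0.0], k=2))
```

The scan is linear in the number of files, which is the trade a flat index makes: no index build and low memory, at the cost of touching every vector per query.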

1. https://weaviate.io/developers/weaviate/installation/embedde...
2. https://weaviate.io/developers/academy/py/vector_index/flat

binarymax 4 days ago

Why Weaviate and not FAISS? The latter is faster and lighter.

bobvanluijt 3 days ago

It depends on additional filters and whether you want to use vector search only. The upside of using Faiss would be storing the ID as file metadata and embedding it in the Faiss index. However, if you need any other filters or data, you would need to store it somewhere else.
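To make the trade-off concrete, a minimal sketch assuming an ID-only vector index paired with a separate metadata store. Everything below is a pure-Python stand-in: `search_ids`, `metadata`, and `search_with_filter` are hypothetical names, not Faiss or Weaviate calls.

```python
def search_ids(vectors, query, k):
    # Stand-in for an ID-only index: returns integer IDs ranked by
    # squared L2 distance to the query, nearest first.
    def dist(v):
        return sum((a - b) ** 2 for a, b in zip(v, query))

    return sorted(vectors, key=lambda i: dist(vectors[i]))[:k]


# The index only knows IDs and vectors.
vectors = {0: [0.0, 0.0], 1: [0.1, 0.0], 2: [1.0, 1.0]}

# Anything you want to filter on has to live in a second store,
# keyed by the same ID.
metadata = {
    0: {"path": "a.txt", "ext": "txt"},
    1: {"path": "b.py", "ext": "py"},
    2: {"path": "c.txt", "ext": "txt"},
}


def search_with_filter(query, k, ext):
    # Post-filtering: fetch all candidates from the index, then drop
    # those whose metadata does not match, and keep the first k.
    candidates = search_ids(vectors, query, k=len(vectors))
    hits = [i for i in candidates if metadata[i]["ext"] == ext]
    return [metadata[i]["path"] for i in hits[:k]]


print(search_with_filter([0.0, 0.0], k=2, ext="txt"))
```

With vector search only, the first function is all you need; once filters come in, you end up maintaining the second store and the join yourself.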

lysp 4 days ago

I think they are associated with the project.