I am building an index of text files in which each document contains an embedding vector and a file path, among other attributes.
I want to run kNN queries on these embeddings and use a directory path for pre-filtering.
For example, filtering by /data/folder1 would only consider files under this directory as candidates.
The files may sit directly under folder1 or within its subdirectories.
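
To make the intended filter semantics concrete, here is a minimal Python sketch of the check I have in mind. The helper name `is_under` and every sample path other than /data/folder1 are made up for illustration, and this only describes the desired behaviour, not how the index would evaluate the filter.

```python
from pathlib import PurePosixPath


def is_under(file_path: str, directory: str) -> bool:
    """Return True if file_path lies directly in directory or in any subdirectory.

    Hypothetical helper, for illustration only.
    """
    # Compare whole path components rather than raw string prefixes,
    # so /data/folder10 is not treated as being inside /data/folder1.
    return PurePosixPath(directory) in PurePosixPath(file_path).parents


# Made-up sample paths.
candidates = [
    "/data/folder1/a.txt",           # directly under folder1
    "/data/folder1/sub/deep/b.txt",  # nested in subdirectories
    "/data/folder10/c.txt",          # shares a string prefix, but is a different folder
    "/data/other/d.txt",
]

# Only these files should be kNN candidates when filtering by /data/folder1.
matching = [p for p in candidates if is_under(p, "/data/folder1")]
```

Only the first two paths should pass the filter. The third one shows why a raw string prefix of /data/folder1 is not quite the same thing as "under this directory".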
I thought about indexing the file paths as strings and using prefix or wildcard queries, but this would be an expensive operation.
What is the most efficient way of achieving this?