Improve Performance when filtering collection as part of Atlas Vector Search
Adding filters to the $vectorSearch aggregation step should improve vectorSearch performance, since the search then runs over a smaller subset of the collection, especially when the filters are on fields that are included in the index.
In our vectorSearch performance tests, however, this is not the case:
Our setup:
Index exists with type: vectorSearch, Index Fields: fieldA, fieldB, fieldC. Status is Active.
Add this filter to the $vectorSearch aggregation step:
'filter': {
    '$and': [
        {
            'fieldA': value1,
            'fieldB': False,
        }
    ]
},
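For readers who want to reproduce the comparison, here is a minimal Python sketch of how a filter like this might sit inside a complete $vectorSearch stage. The index name, embedding path, numCandidates, and limit values are hypothetical placeholders, not the values from our setup; only the filter shape comes from the post.

```python
def build_vector_search_pipeline(
    query_vector,
    field_a_value,
    *,
    index_name="vector_index",  # hypothetical index name
    path="embedding",           # hypothetical vector field path
    num_candidates=200,         # hypothetical value
    limit=10,                   # hypothetical value
    use_filter=True,
):
    """Return an aggregation pipeline with a single $vectorSearch stage."""
    stage = {
        "index": index_name,
        "path": path,
        "queryVector": list(query_vector),
        "numCandidates": num_candidates,
        "limit": limit,
    }
    if use_filter:
        # Same shape as the filter above: both conditions must hold.
        stage["filter"] = {
            "$and": [
                {"fieldA": field_a_value, "fieldB": False},
            ]
        }
    return [{"$vectorSearch": stage}]


# Build the filtered and unfiltered variants for a side-by-side timing run.
filtered_pipeline = build_vector_search_pipeline([0.1, 0.2, 0.3], "value1")
unfiltered_pipeline = build_vector_search_pipeline(
    [0.1, 0.2, 0.3], "value1", use_filter=False
)
```

With PyMongo, either pipeline could then be passed to `collection.aggregate(...)` and timed; building both from one function keeps every parameter other than the filter identical between the two runs.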
The original query times are as follows:
P50 Total Query Time: 2.0953004360198975
P95 Total Query Time: 3.579429221153259
P99 Total Query Time: 3.8835108208656313
The query times when filtering is applied are as follows:
P50 Total Query Time: 3.05288302898407
P95 Total Query Time: 4.789729547500606
P99 Total Query Time: 8.064464559555052
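To put the two sets of numbers side by side, the following small Python sketch computes the slowdown ratio at each percentile. It uses only the figures reported above; the post does not state the time unit, and the ratios do not depend on it.

```python
# Reported total query times without the filter.
baseline = {
    "P50": 2.0953004360198975,
    "P95": 3.579429221153259,
    "P99": 3.8835108208656313,
}

# Reported total query times with the filter applied.
filtered = {
    "P50": 3.05288302898407,
    "P95": 4.789729547500606,
    "P99": 8.064464559555052,
}


def slowdown(before, after):
    """Return after/before for each percentile present in `before`."""
    return {name: after[name] / before[name] for name in before}


ratios = slowdown(baseline, filtered)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
```

The filtered queries come out roughly 1.46x slower at P50, 1.34x at P95, and 2.08x at P99, so the tail latency is affected most.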
-
AdminHenry (Admin, MongoDB) commented
If you have a filter that is not very selective, it's possible that the approximate search executes in cycles for a while: each greedy search through the HNSW layers leads to a candidate at the bottom-most layer that does not meet the prefilter, causing wasteful similarity comparisons.
Can you share more about how many unique values exist for fieldA?
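One way to gather that information is sketched below in Python, as an illustration rather than an official diagnostic. `field_cardinality_pipeline` builds a standard `$group`/`$sort` aggregation that counts documents per distinct value of a field, and `match_fraction` estimates what share of a document sample passes the filter from the post. Both helper names are hypothetical.

```python
def field_cardinality_pipeline(field):
    """Aggregation pipeline: document count per distinct value of `field`."""
    return [
        {"$group": {"_id": f"${field}", "count": {"$sum": 1}}},
        {"$sort": {"count": -1}},
    ]


def match_fraction(docs, field_a_value):
    """Fraction of `docs` with fieldA == field_a_value and fieldB == False."""
    docs = list(docs)
    if not docs:
        return 0.0
    matches = sum(
        1
        for doc in docs
        if doc.get("fieldA") == field_a_value and doc.get("fieldB") is False
    )
    return matches / len(docs)


# Tiny in-memory sample standing in for documents fetched from the collection.
sample = [
    {"fieldA": "x", "fieldB": False},
    {"fieldA": "x", "fieldB": True},
    {"fieldA": "y", "fieldB": False},
    {"fieldA": "z", "fieldB": False},
]
pipeline = field_cardinality_pipeline("fieldA")
fraction = match_fraction(sample, "x")
```

With PyMongo, the pipeline could be run as `collection.aggregate(pipeline)`; the number of result documents is the number of unique fieldA values, and the counts show how evenly the documents are spread across them.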