@jupyterjazz jupyterjazz commented Jun 12, 2023

#1640

Hybrid search (find + filter) on InMemoryExactNNIndex returned the matches with the lowest similarity scores first. This PR fixes that by adding an option to sort matches by score in descending order.

# prepare a query
q_doc = MyDoc(embedding=np.random.rand(128), text='query')

query = (
    db.build_query()
    .find(query=q_doc, search_field='embedding')
    .filter(filter_query={'text': {'$exists': True}})
    .build()
)

results = db.execute_query(query)
# Before: results were sorted from worst to best match
# Now: results are sorted correctly, with the best matches first
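
The ordering change can be illustrated without DocArray. The following is a minimal sketch, not the code merged in this PR: `sort_matches` and its arguments are hypothetical names, and it assumes the scores are similarities, where a higher value means a closer match.

```python
from typing import List, Sequence, Tuple


def sort_matches(
    docs: Sequence[str],
    scores: Sequence[float],
    reverse_order: bool = False,
) -> Tuple[List[str], List[float]]:
    """Sort documents by score (hypothetical helper, not DocArray's API).

    With reverse_order=False the result is ascending (lowest score first);
    with reverse_order=True it is descending (highest score first).
    """
    if len(docs) != len(scores):
        raise ValueError('docs and scores must have the same length')
    # Sort indices rather than (score, doc) pairs so that the documents
    # themselves never need to be comparable.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=reverse_order)
    return [docs[i] for i in order], [scores[i] for i in order]


docs = ['a', 'b', 'c']
similarities = [0.2, 0.9, 0.5]  # assumed: higher means more similar

# Ascending order puts the weakest match first, which is the reported bug.
ascending_docs, _ = sort_matches(docs, similarities)
# Descending order puts the strongest match first.
descending_docs, descending_scores = sort_matches(docs, similarities, reverse_order=True)
```

Keeping `reverse_order=False` as the default leaves callers that work with distances (lower is better) unaffected, while similarity-based callers opt in to descending order.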

Signed-off-by: jupyterjazz <saba.sturua@jina.ai>
Comment on lines 31 to 40
Args:
    doc_index: Document index instance.
        Either InMemoryExactNNIndex or HnswDocumentIndex.
    query: Dictionary containing search and filtering configuration.
    reverse_order: Flag indicating whether to sort in descending order. If set to
        False (default), the sorting will be in ascending order.
Returns:
    Sorted documents and their corresponding scores.
Member

This is not the right style for a docstring.

Contributor Author

I got too used to langchain docstrings lol
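
For comparison, here is how the docstring under review might look in reStructuredText field-list style (`:param:` / `:return:`). This is only a sketch: the thread does not say which style the project expects, so the choice of style is an assumption, and the function name `find_and_filter`, its summary line, and its stub body are hypothetical stand-ins rather than the code in the diff.

```python
import inspect
from typing import Any, Dict


def find_and_filter(doc_index: Any, query: Dict[str, Any], reverse_order: bool = False):
    """
    Run a combined find and filter query and sort the matches by score.

    :param doc_index: Document index instance.
        Either InMemoryExactNNIndex or HnswDocumentIndex.
    :param query: Dictionary containing search and filtering configuration.
    :param reverse_order: Flag indicating whether to sort in descending order.
        If set to False (default), the sorting will be in ascending order.
    :return: Sorted documents and their corresponding scores.
    """
    # Body omitted on purpose: this sketch is only about docstring layout.
    raise NotImplementedError


# inspect.getdoc strips the common indentation, which makes the layout easy to check.
print(inspect.getdoc(find_and_filter))
```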

Member

@samsja samsja left a comment

Code looks good.

Though I have to say, it is unclear why adding this reverse sorting fixes the original problem. Can you add a comment somewhere in the code to explain it?

FYI: the docstrings are not in the right format.
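
On the question of why reversing the sort changes the outcome: one plausible reading, which the thread itself does not confirm, is that the index scores matches by similarity (higher is better) while the sorting step orders scores ascending, which suits distances (lower is better). The sketch below uses plain Python and made-up vectors to show the two conventions disagreeing on direction. All names and values are illustrative.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity: higher means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance: lower means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


query = [1.0, 0.0]
candidates = {
    'near': [0.9, 0.1],
    'mid': [0.5, 0.5],
    'far': [0.0, 1.0],
}

similarities = {name: cosine_similarity(query, vec) for name, vec in candidates.items()}
distances = {name: euclidean_distance(query, vec) for name, vec in candidates.items()}

# Ascending order is correct for distances ...
by_distance_asc = sorted(candidates, key=distances.get)
# ... but puts the worst match first when the scores are similarities.
by_similarity_asc = sorted(candidates, key=similarities.get)
# Reversing the sort restores best-first order for similarities.
by_similarity_desc = sorted(candidates, key=similarities.get, reverse=True)
```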

Signed-off-by: jupyterjazz <saba.sturua@jina.ai>
@jupyterjazz jupyterjazz requested a review from samsja June 13, 2023 07:42
@github-actions

📝 Docs are deployed on https://ft-fix-filter-and-find--jina-docs.netlify.app 🎉

@jupyterjazz jupyterjazz merged commit f36c621 into main Jun 13, 2023
@jupyterjazz jupyterjazz deleted the fix-filter-and-find branch June 13, 2023 08:06