Rank by Vector Similarity

Description

Ranks the objects in SOURCE [OBJ, VECTOR] by the relevance score between each object's VECTOR and the query vectors in QVECS [VECTOR]. The relevance score is computed from the Euclidean distance between the vectors.
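As a rough illustration of distance-based ranking, the sketch below orders object-vector pairs by their Euclidean distance to a single query vector. It assumes that a smaller distance means a higher rank; the function names are hypothetical, and how the block combines several query vectors or converts distances into relevance scores is not specified here.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_by_distance(source, query):
    """Sort (obj, vector) pairs by distance to `query`, closest first.

    Assumption: closer vectors are treated as more relevant.
    """
    return sorted(source, key=lambda pair: euclidean(pair[1], query))

source = [
    ("doc_a", [0.0, 1.0]),
    ("doc_b", [1.0, 1.0]),
    ("doc_c", [5.0, 5.0]),
]
ranked = [obj for obj, _ in rank_by_distance(source, [1.0, 0.9])]
print(ranked)  # ['doc_b', 'doc_a', 'doc_c']
```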

Input

SOURCE [OBJ, VECTOR]: a two-column input of object-vector pairs. Typically obtained with the Extract vectors block.
QVECS [VECTOR]: a list of vectors to rank the SOURCE objects against.

Note: Vectors can differ in length, embedding model, and pooling method. For meaningful rankings, the SOURCE and QVECS vectors should all be encoded by the same embedding model, ideally with the same pooling method.
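A simple sanity check before ranking is to confirm that all vectors share one length. The helper below is a hypothetical sketch and not part of the block. Matching lengths do not prove that the same model or pooling method was used, but a mismatch is a sure sign that the inputs are incompatible.

```python
def check_same_dimension(source_vectors, query_vectors):
    """Return the shared vector length, or raise if lengths differ."""
    dims = {len(v) for v in source_vectors} | {len(v) for v in query_vectors}
    if len(dims) > 1:
        raise ValueError(f"Vectors have mixed lengths: {sorted(dims)}")
    # An empty input has no dimension to report.
    return dims.pop() if dims else 0

dim = check_same_dimension([[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0]])
print(dim)  # 2
```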

Output

RETRIEVE [OBJ]: a list of ranked objects

Parameters

Search type: the method used for vector similarity search.
- EXACT: computes the exact distance between the query and every source vector. Recommended only for a small number of source vectors (roughly 100,000 or fewer).
- HNSW: finds approximate nearest neighbours using a Hierarchical Navigable Small World graph index, rather than comparing the query against every source vector.
K value: the number of objects to retrieve when using an approximate Search type. This value greatly affects search time.
Index name: the name under which the graph-based index used for approximate search is stored. It must be unique for each set of source data.
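To see why EXACT is only practical for smaller inputs, the sketch below scans every source vector for a query, so its cost grows linearly with the number of source vectors. It is an illustrative plain-Python version with a hypothetical function name, not the block's implementation; the `k` argument is used here only to show what retrieving the top K objects means.

```python
import heapq
import math

def exact_top_k(source, query, k):
    """Return the k objects closest to `query` by scanning all of `source`."""
    # One distance computation per source vector: O(N) work per query.
    scored = ((math.dist(vec, query), obj) for obj, vec in source)
    return [obj for _, obj in heapq.nsmallest(k, scored)]

source = [
    ("a", [0.0, 0.0]),
    ("b", [3.0, 4.0]),
    ("c", [1.0, 1.0]),
    ("d", [10.0, 0.0]),
]
top2 = exact_top_k(source, [0.0, 0.0], 2)
print(top2)  # ['a', 'c']
```

An HNSW index avoids this full scan by walking a precomputed graph of neighbouring vectors, trading a small loss in accuracy for much faster queries on large inputs.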

Output scores can be normalised.
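The normalisation method is not stated here. One common choice, shown purely as an assumption, is min-max scaling of the distances so that the closest object scores 1.0 and the farthest scores 0.0.

```python
def normalise_scores(distances):
    """Min-max scale distances to [0, 1], with 1.0 for the closest vector.

    Assumption: this is one possible scheme, not necessarily the block's.
    """
    lo, hi = min(distances), max(distances)
    if hi == lo:
        # All distances equal: every object is equally relevant.
        return [1.0 for _ in distances]
    return [(hi - d) / (hi - lo) for d in distances]

scores = normalise_scores([0.0, 5.0, 10.0])
print(scores)  # [1.0, 0.5, 0.0]
```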

Note: When using HNSW, the index is not updated automatically if the SOURCE vectors are changed or updated. Change Index name to create a new index and see the changes.
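The toy model below illustrates this behaviour under the assumption that an index is built once per name and then reused. The `index_store` dictionary and `get_or_build_index` function are hypothetical stand-ins; the block's actual storage mechanism is not described here.

```python
# Hypothetical in-memory store standing in for persisted HNSW indices.
index_store = {}

def get_or_build_index(index_name, source):
    """Build an index the first time a name is seen; reuse it afterwards."""
    if index_name not in index_store:
        # A real implementation would build an HNSW graph here; a snapshot
        # of the source is enough to show the caching behaviour.
        index_store[index_name] = list(source)
    return index_store[index_name]

source = [("doc_a", [0.0, 1.0])]
first = get_or_build_index("docs_v1", source)

source.append(("doc_b", [1.0, 1.0]))           # SOURCE changes
stale = get_or_build_index("docs_v1", source)  # same name: old index reused
fresh = get_or_build_index("docs_v2", source)  # new name: index rebuilt

print(len(stale), len(fresh))  # 1 2
```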