$ whoami

  • Kumar Shivendu

  • Software Engineer @ Qdrant

  • I ❤️ information retrieval, performance, and building tools.

  • Topic: Improving vector search relevance with reranking and fusion

Topics to cover

  • Vectors, Vector Search, and Vector DBs
  • The HNSW Index
  • Why care about search relevance in the age of GenAI?
  • Are vector search results relevant enough?
  • Reranking
  • Fusion

Vectors

  • Points in an N-dim space
  • Compressed meaning
  • Anything -> Vector
  • Popular ways to generate:
    • Language/vision models
    • Metric learning
      • CLIP
  • Challenges with keyword search
    • Missed docs (low recall)
    • Can't handle images, audio, etc.
  • Similarity = Nearest points
  • Faster with indexing and approximation
  • Problem: Hard to scale and manage.
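
A minimal sketch (plain NumPy, embeddings are made up) of "similarity = nearest points":

import numpy as np

docs = np.random.rand(3, 384)    # 3 documents as 384-dim vectors from some embedding model
query = np.random.rand(384)      # query embedded with the same model

# Cosine similarity = dot product of L2-normalized vectors
docs_n = docs / np.linalg.norm(docs, axis=1, keepdims=True)
query_n = query / np.linalg.norm(query)
scores = docs_n @ query_n

# Nearest points = highest scores; indexing (e.g. HNSW) makes this fast at scale
print(scores.argsort()[::-1])    # document indices, most similar first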

What is Qdrant?

  • Vector Search Engine (aka Vector DB)

  • 18k+ stars on GitHub

  • Written in Rust 🦀

  • SDKs for Python, JS, Go, Java, etc

  • Twitter, Canva, Meesho, Flipkart

The HNSW Index

  • Skip Lists + Graphs
  • Approximate and Tunable
  • Filter during search
  • Quantization
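
As a sketch with the Python client (qdrant_client; the parameter values are illustrative, not recommendations), the HNSW index is tunable both at build time and per query:

from qdrant_client import QdrantClient, models

client = QdrantClient(url="http://localhost:6333")

# Build-time HNSW knobs: more edges (m) / larger construction beam (ef_construct)
# -> better recall, at the cost of memory and indexing time
client.create_collection(
    collection_name="rentals",
    vectors_config=models.VectorParams(size=512, distance=models.Distance.COSINE),
    hnsw_config=models.HnswConfigDiff(m=16, ef_construct=128),
)

# Query-time knob: hnsw_ef trades latency for recall on each search
hits = client.search(
    collection_name="rentals",
    query_vector=[0.2] * 512,
    search_params=models.SearchParams(hnsw_ef=64),
    limit=10,
)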

Indexing:

PUT /collections/rentals/points
{
  "batch": {
    "ids": [1, 2],
    "vectors": [
      [0.9, -0.5, ..., 0.0], // generated from rental1.jpg using ML model
      [0.1, 0.4, ..., 0.3]
    ],
    "payload": [
      {"city": "Bangalore", "sqft": 990, "img_path": "img/rental1.jpg", "tags": ["..."]},
      {"city": "Hyderabad", "sqft": 1550, "img_path": "img/rental2.jpg", "description": "..."},
    ]
  }
}
Searching:

POST /collections/rentals/points/search
{
  "query": [0.2, 0.3, ..., 0.4], // generated from user query (text) using same model
  "filter": { "must": [{"key": "city", "match": {"value": "Bangalore"}}] },
  "limit": 10
}
Response:

[
  {"id": 4, "score": 0.56, "payload": {...}},
  {"id": 2, "score": 0.40, "payload": {...}},
  {"id": 5, "score": 0.23, "payload": {...}}
]

Why care about search (relevance)?

  • R.A.G.
  • Results are only as good as the data (context)
  • Context is limited and costly (latency + $ + quality)
  • Search can be your MOAT
    • Perplexity.ai
    • "Cheating is all you need" by Cody, Sourcegraph

Are vector search results relevant enough?

  • Depends
  • Vector search relevance is constrained by 2 factors:
    • Embedding model
      • Domain
      • Compression
    • Distance metric
  • Tradeoff: Compute
    • Finetune
    • Get 2nd opinion

Reranking for RAG:

  • Popular in Information Retrieval.
  • Two stages:
    • Retrieval:
      • Fast but coarse
      • Search millions (or billions) of items
      • AKA Candidate generation
    • (Re)ranking:
      • Slow but precise
      • Rank hundreds of candidates. Pick top K (e.g. 10)
| Model         | Type    | Quality | Cost      | Example                    |
|---------------|---------|---------|-----------|----------------------------|
| Multi-vector  | OSS     | Good    | Low       | ColBERT                    |
| Cross encoder | OSS     | Great   | Medium    | BGE, sentence transformers |
| Rerank API    | Private | Great   | Medium    | Cohere, Mixedbread, Jina   |
| LLM API       | Private | Best    | Very High | GPT, Claude                |
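
Put together, the two stages look roughly like this (a sketch: embed() stands in for whatever embedding model you use, reranker for any cross encoder such as the BGE example below, and the "text" payload field is assumed):

# Stage 1 - retrieval: fast ANN search over millions of items (candidate generation)
candidates = client.search(
    collection_name="docs",
    query_vector=embed(query),   # embed() is a placeholder for your embedding model
    limit=100,                   # over-fetch ~100 candidates
)

# Stage 2 - reranking: slow but precise scoring of just those candidates
pairs = [[query, hit.payload["text"]] for hit in candidates]
scores = reranker.compute_score(pairs, normalize=True)

# Keep the top K (e.g. 10) after reranking
top_k = sorted(zip(candidates, scores), key=lambda x: x[1], reverse=True)[:10]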

ColBERT: Contextualized Late Interaction BERT

  • Embedding for each token
  • MaxSim
  • Surpasses single-vector representations
  • Scales efficiently to large documents
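
A minimal sketch of MaxSim (NumPy, made-up token embeddings): each query token picks its best-matching document token, and the document score is the sum of those maxima.

import numpy as np

query_toks = np.random.rand(5, 128)    # ColBERT keeps one embedding per token: 5 query tokens
doc_toks = np.random.rand(40, 128)     # 40 document tokens

# Normalize so that dot product = cosine similarity
q = query_toks / np.linalg.norm(query_toks, axis=1, keepdims=True)
d = doc_toks / np.linalg.norm(doc_toks, axis=1, keepdims=True)

sim = q @ d.T                    # (5, 40) token-to-token similarities
score = sim.max(axis=1).sum()    # MaxSim: best doc token per query token, summed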

Cross encoder rerankers:

  • BGE flag embedding reranker
  • Can be finetuned
  • from FlagEmbedding import FlagReranker
    reranker = FlagReranker('BAAI/bge-reranker-v2-m3', use_fp16=True) # use fp16 for speed
    
    query_passage_1 = ["What is vector search?", "Vector search is cool"]
    query_passage_2 = [
      "What is vector search?",
      ("Vector search is an ML technique that uses vectors to represent data and find "
      "similar items based on their mathematical similarity in a high-dimensional space.")
    ]
    
    scores = reranker.compute_score([query_passage_1, query_passage_2], normalize=True)
    print(scores) # [0.22970796268567764, 0.9996698472749022]
    
  • Sort based on score

Cohere reranker:

import cohere
co = cohere.Client(API_KEY)

query = "Who is the father of Elon Musk?"
docs = [
  {"Title":"Father (Wikipedia)","Content":"A father is the male parent of a child"},
  {"Title":"Paypal founders","Content":"Elon Musk, Peter Thiel"},
  {"Title":"Musk family","Content":"Elon is son of Errol and Maye Musk"},  # answer
  {"Title":"Peter Thiel","Content":"Peter and Elon built Paypal"},
  {"Title":"Elon musk family","Content":"Elon has 11 children"},
]
results = co.rerank(
  model="rerank-english-v3.0", query=query, documents=docs,
  rank_fields=['Title', 'Content'], top_n=3, return_documents=True
)

Cohere response:

# question: Who's the father of Elon Musk?
{
  "id": "...",
  "results": [
    {
      "document": {"Title": "Musk family", "Content": "Elon is son of Errol and Maye Musk"},
      "index": 2,
      "relevance_score": 0.9908034 # 99%
    },
    {
      "document": {"Title": "Elon musk family", "Content": "Elon has 11 children"},
      "index": 4,
      "relevance_score": 0.104294725 # 10%
    },
    {
      "document": {"Title": "Paypal founders", "Content": "Elon Musk, Peter Thiel"},
      "index": 1,
      "relevance_score": 0.02630528 # 2%
    }
  ],
  "meta": { ... }
}

Reranker finetuning:

  • Similar to embedding finetuning
  • But the reranker must learn to pick up on subtle differences.
  • So hard negatives are important.
  • {
      "query": "How many people live in London?",
      "pos": "9M people live in London.",
      "neg": [
        "Around 1.5M people live in Cambridge, which is close to London",
        "..."
      ]
    }
    

Fusion: 2nd way to get 2nd opinion

  • Combine with any other search results:
    • Vector search with different modalities (image, text, etc)
    • Sparse vector search or BM25
    • Results from any IR system
  • Popular algos:
    • Reciprocal Ranked Fusion (RRF)
    • Relative Score Fusion (RSF)
  • Python lib ranx

RRF: Reciprocal Ranked Fusion

  • Ignores relevance scores
  • Dampening based on rank
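
A sketch of RRF over two (or more) ranked lists; only ranks matter, and k=60 is the commonly used damping constant:

def rrf(result_lists, k=60):
    # score(d) = sum over lists of 1 / (k + rank of d in that list)
    scores = {}
    for results in result_lists:                         # each list is ordered best-first
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# e.g. fuse dense vector hits with BM25 hits (IDs are illustrative)
fused = rrf([["d3", "d1", "d7"], ["d1", "d9", "d3"]])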

RSF: Relative Score Fusion

  • Normalize and combine
  • Great when score distributions are similar
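
And a sketch of RSF: min-max normalize each list's scores to [0, 1], then sum per document (per-list weights omitted for brevity):

def rsf(score_dicts):
    fused = {}
    for scores in score_dicts:                           # each dict: doc_id -> raw score
        lo, hi = min(scores.values()), max(scores.values())
        for doc_id, s in scores.items():
            norm = (s - lo) / (hi - lo) if hi > lo else 1.0
            fused[doc_id] = fused.get(doc_id, 0.0) + norm
    return sorted(fused, key=fused.get, reverse=True)

# e.g. dense cosine scores vs BM25 scores (values illustrative)
fused = rsf([{"d1": 0.92, "d3": 0.81}, {"d1": 7.4, "d9": 6.1}])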

Summary

  • Relevance matters because context is important and costly.

    • Reranking (Serial)
    • Fusion (Parallel)
  • Slides: kshivendu.dev/talks

  • Find me at

    • kshivendu.dev/bio
    • kshivendu.dev/twitter

* RAG = Retrieval-Augmented Generation

Image showing vector search


* [Metrics](https://docs.cohere.com/docs/rerank-understanding-the-results): nDCG@k, MRR, Accuracy

Further reading:

- https://medium.com/llamaindex-blog/boosting-rag-picking-the-best-embedding-reranker-models-42d079022e83
- https://www.rungalileo.io/blog/mastering-rag-how-to-select-a-reranking-model