I’ve worked a little more with Qdrant’s hybrid queries and noticed that they are useful beyond what I described in my last article. When building a recommendation system, we usually start with vector search to retrieve promising candidates — for example, the top-M videos for a user based on their past behavior in a shared embedding space. This stage focuses on recall, ensuring that relevant items make it into the shortlist. The next step, re-ranking, then improves precision by surfacing the most relevant items using richer signals, such as user affinities toward certain item types.

This often involves multiple systems — a vector database like Qdrant for candidate generation and a separate model (such as XGBoost or a neural network) for re-ranking — leading to additional complexity.

It turns out that you can use Qdrant’s hybrid queries to push model-based re-ranking into the vector database itself! One can implement lightweight models, such as logistic regression, directly inside the query, performing candidate generation and re-ranking through a single API call. It’s useful in early-stage projects where iteration speed and going live quickly are more important than using the most sophisticated model possible. Let’s take a look at an example.

Recall the definition of the probability function of a logistic regression model:

\[ P(y = 1 \mid \mathbf{x}) = \sigma(\mathbf{w}^\top \mathbf{x} + b) = \frac{1}{1 + e^{-(\mathbf{w}^\top \mathbf{x} + b)}} \]

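where x is the feature vector of a candidate, w and b are the learned weights and bias, and P(y = 1 | x) models the probability that the candidate is relevant. This can be expressed via the Qdrant API as follows: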

from qdrant_client import models

w = [0.1, 0.5, 0.3, 0.1] # example weights
b = 0.2 # example bias
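user_features = {...}  # per-user features, keyed by user ID
# user_id (used below) is the ID of the user being served, defined elsewhere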

# z = w^T x + b
z_linear = models.SumExpression(sum=[
    models.MultExpression(mult=[w[0], "$score"]),  # similarity score from the prefetch
    models.MultExpression(mult=[w[1], "x1"]), # from Qdrant payload
    models.MultExpression(mult=[w[2], "x2"]), # from Qdrant payload
    models.MultExpression(mult=[w[3], user_features[user_id]["item_affinity"]]),
    b,
])
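
As a sanity check, the same score can be mirrored in plain Python. The sketch below is not part of the Qdrant API: the candidate values (score, x1, x2) and the item_affinity value are made up for illustration. It computes z for a few hypothetical candidates and compares the order obtained by sorting on z with the order obtained by sorting on the sigmoid of z.

```python
import math

w = [0.1, 0.5, 0.3, 0.1]  # same example weights as in the formula
b = 0.2                   # same example bias
item_affinity = 0.7       # hypothetical user feature value

# Hypothetical candidates: vector similarity score plus two payload features.
candidates = {
    "a": {"score": 0.91, "x1": 0.2, "x2": 0.5},
    "b": {"score": 0.85, "x1": 0.9, "x2": 0.1},
    "c": {"score": 0.60, "x1": 0.4, "x2": 0.8},
}


def linear_score(c):
    """z = w^T x + b, mirroring the SumExpression term by term."""
    return (
        w[0] * c["score"]
        + w[1] * c["x1"]
        + w[2] * c["x2"]
        + w[3] * item_affinity
        + b
    )


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


z = {name: linear_score(c) for name, c in candidates.items()}

# Rank once by the raw linear score and once by the sigmoid of it.
by_z = sorted(z, key=z.get, reverse=True)
by_p = sorted(z, key=lambda name: sigmoid(z[name]), reverse=True)
print(by_z, by_p)
```

Both lists come out in the same order, because the sigmoid is strictly increasing and therefore never swaps two candidates.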

Note that we can drop the sigmoid: it is strictly increasing, so ranking candidates by z gives exactly the same order as ranking by σ(z). This formula can then be used as part of a Qdrant query:

from qdrant_client import QdrantClient

client = QdrantClient(url="http://localhost:6333")

result = client.query_points(
    collection_name="mycollection",
    prefetch=models.Prefetch(
        query=[0.2, 0.8, 0.6],
        limit=50
    ),
    query=models.FormulaQuery(
        formula=z_linear
    )
)

That’s it! This single query now performs both candidate generation (via prefetch) and re-ranking (via the formula). Note that the weights of the logistic regression model must still be learned separately, for example using an implementation from scikit-learn. This approach isn’t meant to replace a full-fledged ranking system, but I think it could be an excellent starting point for rapid experimentation and early-stage deployments.
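
To make the point about learning the weights concrete, here is a minimal, dependency-free sketch. It uses plain batch gradient descent as a stand-in for a library implementation; with scikit-learn's LogisticRegression, the same numbers would be read from coef_ and intercept_ after fitting. The training data, the labeling rule, and the function name fit_logistic_regression are all made up for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_regression(X, y, lr=0.5, epochs=1000):
    """Batch gradient descent on the log loss; returns (w, b)."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this example: sigma(w^T x + b) - y
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


# Synthetic training data: each row is [score, x1, x2, item_affinity].
rng = random.Random(0)
X = [[rng.random() for _ in range(4)] for _ in range(200)]
# Made-up labeling rule (1 = relevant), driven mostly by x1 and the score.
y = [1 if 0.5 * x[0] + 2.0 * x[1] - 1.2 > 0 else 0 for x in X]

w, b = fit_logistic_regression(X, y)
print(w, b)
```

The resulting w and b are the constants that would go into the MultExpression terms and the bias of the formula. In a real setup, the rows of X would be logged feature values and y the observed user feedback.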