[django] How to combine two or more querysets in a Django view?

I am trying to build the search for a Django site I am building, and in that search, I am searching in 3 different models. And to get pagination on the search result list, I would like to use a generic object_list view to display the results. But to do that, I have to merge 3 querysets into one.

How can I do that? I've tried this:

result_list = []            
page_list = Page.objects.filter(
    Q(title__icontains=cleaned_search_term) | 
    Q(body__icontains=cleaned_search_term))
article_list = Article.objects.filter(
    Q(title__icontains=cleaned_search_term) | 
    Q(body__icontains=cleaned_search_term) | 
    Q(tags__icontains=cleaned_search_term))
post_list = Post.objects.filter(
    Q(title__icontains=cleaned_search_term) | 
    Q(body__icontains=cleaned_search_term) | 
    Q(tags__icontains=cleaned_search_term))

for x in page_list:
    result_list.append(x)
for x in article_list:
    result_list.append(x)
for x in post_list:
    result_list.append(x)

return object_list(
    request, 
    queryset=result_list, 
    template_object_name='result',
    paginate_by=10, 
    extra_context={
        'search_term': search_term},
    template_name="search/result_list.html")

But this doesn't work. I get an error when I try to use that list in the generic view. The list is missing the clone attribute.

Does anybody know how I can merge the three lists, page_list, article_list and post_list?

This question is related to django search django-queryset django-q

The answer is


The big downside of your current approach is its inefficiency with large search result sets, as you have to pull down the entire result set from the database each time, even though you only intend to display one page of results.

In order to only pull down the objects you actually need from the database, you have to use pagination on a QuerySet, not a list. If you do this, Django actually slices the QuerySet before the query is executed, so the SQL query will use OFFSET and LIMIT to only get the records you will actually display. But you can't do this unless you can cram your search into a single query somehow.

Given that all three of your models have title and body fields, why not use model inheritance? Just have all three models inherit from a common ancestor that has title and body, and perform the search as a single query on the ancestor model.


This recursive function concatenates array of querysets into one queryset.

def merge_query(ar):
    if len(ar) ==0:
        return [ar]
    while len(ar)>1:
        tmp=ar[0] | ar[1]
        ar[0]=tmp
        ar.pop(1)
        return ar

You can use the QuerySetChain class below. When using it with Django's paginator, it should only hit the database with COUNT(*) queries for all querysets and SELECT() queries only for those querysets whose records are displayed on the current page.

Note that you need to specify template_name= if using a QuerySetChain with generic views, even if the chained querysets all use the same model.

from itertools import islice, chain

class QuerySetChain(object):
    """
    Chains multiple subquerysets (possibly of different models) and behaves as
    one queryset.  Supports minimal methods needed for use with
    django.core.paginator.
    """

    def __init__(self, *subquerysets):
        self.querysets = subquerysets

    def count(self):
        """
        Performs a .count() for all subquerysets and returns the number of
        records as an integer.
        """
        return sum(qs.count() for qs in self.querysets)

    def _clone(self):
        "Returns a clone of this queryset chain"
        return self.__class__(*self.querysets)

    def _all(self):
        "Iterates records in all subquerysets"
        return chain(*self.querysets)

    def __getitem__(self, ndx):
        """
        Retrieves an item or slice from the chained set of results from all
        subquerysets.
        """
        if type(ndx) is slice:
            return list(islice(self._all(), ndx.start, ndx.stop, ndx.step or 1))
        else:
            return islice(self._all(), ndx, ndx+1).next()

In your example, the usage would be:

pages = Page.objects.filter(Q(title__icontains=cleaned_search_term) |
                            Q(body__icontains=cleaned_search_term))
articles = Article.objects.filter(Q(title__icontains=cleaned_search_term) |
                                  Q(body__icontains=cleaned_search_term) |
                                  Q(tags__icontains=cleaned_search_term))
posts = Post.objects.filter(Q(title__icontains=cleaned_search_term) |
                            Q(body__icontains=cleaned_search_term) | 
                            Q(tags__icontains=cleaned_search_term))
matches = QuerySetChain(pages, articles, posts)

Then use matches with the paginator like you used result_list in your example.

The itertools module was introduced in Python 2.3, so it should be available in all Python versions Django runs on.


This can be achieved by two ways either.

1st way to do this

Use union operator for queryset | to take union of two queryset. If both queryset belongs to same model / single model than it is possible to combine querysets by using union operator.

For an instance

pagelist1 = Page.objects.filter(
    Q(title__icontains=cleaned_search_term) | 
    Q(body__icontains=cleaned_search_term))
pagelist2 = Page.objects.filter(
    Q(title__icontains=cleaned_search_term) | 
    Q(body__icontains=cleaned_search_term))
combined_list = pagelist1 | pagelist2 # this would take union of two querysets

2nd way to do this

One other way to achieve combine operation between two queryset is to use itertools chain function.

from itertools import chain
combined_results = list(chain(pagelist1, pagelist2))

This will do the work without using any other libs

result_list = page_list | article_list | post_list

here's an idea... just pull down one full page of results from each of the three and then throw out the 20 least useful ones... this eliminates the large querysets and that way you only sacrifice a little performance instead of a lot


In case you want to chain a lot of querysets, try this:

from itertools import chain
result = list(chain(*docs))

where: docs is a list of querysets


Try this:

matches = pages | articles | posts

It retains all the functions of the querysets which is nice if you want to order_by or similar.

Please note: this doesn't work on querysets from two different models.


This recursive function concatenates array of querysets into one queryset.

def merge_query(ar):
    if len(ar) ==0:
        return [ar]
    while len(ar)>1:
        tmp=ar[0] | ar[1]
        ar[0]=tmp
        ar.pop(1)
        return ar

The big downside of your current approach is its inefficiency with large search result sets, as you have to pull down the entire result set from the database each time, even though you only intend to display one page of results.

In order to only pull down the objects you actually need from the database, you have to use pagination on a QuerySet, not a list. If you do this, Django actually slices the QuerySet before the query is executed, so the SQL query will use OFFSET and LIMIT to only get the records you will actually display. But you can't do this unless you can cram your search into a single query somehow.

Given that all three of your models have title and body fields, why not use model inheritance? Just have all three models inherit from a common ancestor that has title and body, and perform the search as a single query on the ancestor model.


You can use the QuerySetChain class below. When using it with Django's paginator, it should only hit the database with COUNT(*) queries for all querysets and SELECT() queries only for those querysets whose records are displayed on the current page.

Note that you need to specify template_name= if using a QuerySetChain with generic views, even if the chained querysets all use the same model.

from itertools import islice, chain

class QuerySetChain(object):
    """
    Chains multiple subquerysets (possibly of different models) and behaves as
    one queryset.  Supports minimal methods needed for use with
    django.core.paginator.
    """

    def __init__(self, *subquerysets):
        self.querysets = subquerysets

    def count(self):
        """
        Performs a .count() for all subquerysets and returns the number of
        records as an integer.
        """
        return sum(qs.count() for qs in self.querysets)

    def _clone(self):
        "Returns a clone of this queryset chain"
        return self.__class__(*self.querysets)

    def _all(self):
        "Iterates records in all subquerysets"
        return chain(*self.querysets)

    def __getitem__(self, ndx):
        """
        Retrieves an item or slice from the chained set of results from all
        subquerysets.
        """
        if type(ndx) is slice:
            return list(islice(self._all(), ndx.start, ndx.stop, ndx.step or 1))
        else:
            return islice(self._all(), ndx, ndx+1).next()

In your example, the usage would be:

pages = Page.objects.filter(Q(title__icontains=cleaned_search_term) |
                            Q(body__icontains=cleaned_search_term))
articles = Article.objects.filter(Q(title__icontains=cleaned_search_term) |
                                  Q(body__icontains=cleaned_search_term) |
                                  Q(tags__icontains=cleaned_search_term))
posts = Post.objects.filter(Q(title__icontains=cleaned_search_term) |
                            Q(body__icontains=cleaned_search_term) | 
                            Q(tags__icontains=cleaned_search_term))
matches = QuerySetChain(pages, articles, posts)

Then use matches with the paginator like you used result_list in your example.

The itertools module was introduced in Python 2.3, so it should be available in all Python versions Django runs on.


Requirements: Django==2.0.2, django-querysetsequence==0.8

In case you want to combine querysets and still come out with a QuerySet, you might want to check out django-queryset-sequence.

But one note about it. It only takes two querysets as it's argument. But with python reduce you can always apply it to multiple querysets.

from functools import reduce
from queryset_sequence import QuerySetSequence

combined_queryset = reduce(QuerySetSequence, list_of_queryset)

And that's it. Below is a situation I ran into and how I employed list comprehension, reduce and django-queryset-sequence

from functools import reduce
from django.shortcuts import render    
from queryset_sequence import QuerySetSequence

class People(models.Model):
    user = models.OneToOneField(User, on_delete=models.CASCADE)
    mentor = models.ForeignKey('self', null=True, on_delete=models.SET_NULL, related_name='my_mentees')

class Book(models.Model):
    name = models.CharField(max_length=20)
    owner = models.ForeignKey(Student, on_delete=models.CASCADE)

# as a mentor, I want to see all the books owned by all my mentees in one view.
def mentee_books(request):
    template = "my_mentee_books.html"
    mentor = People.objects.get(user=request.user)
    my_mentees = mentor.my_mentees.all() # returns QuerySet of all my mentees
    mentee_books = reduce(QuerySetSequence, [each.book_set.all() for each in my_mentees])

    return render(request, template, {'mentee_books' : mentee_books})

You can use the QuerySetChain class below. When using it with Django's paginator, it should only hit the database with COUNT(*) queries for all querysets and SELECT() queries only for those querysets whose records are displayed on the current page.

Note that you need to specify template_name= if using a QuerySetChain with generic views, even if the chained querysets all use the same model.

from itertools import islice, chain

class QuerySetChain(object):
    """
    Chains multiple subquerysets (possibly of different models) and behaves as
    one queryset.  Supports minimal methods needed for use with
    django.core.paginator.
    """

    def __init__(self, *subquerysets):
        self.querysets = subquerysets

    def count(self):
        """
        Performs a .count() for all subquerysets and returns the number of
        records as an integer.
        """
        return sum(qs.count() for qs in self.querysets)

    def _clone(self):
        "Returns a clone of this queryset chain"
        return self.__class__(*self.querysets)

    def _all(self):
        "Iterates records in all subquerysets"
        return chain(*self.querysets)

    def __getitem__(self, ndx):
        """
        Retrieves an item or slice from the chained set of results from all
        subquerysets.
        """
        if type(ndx) is slice:
            return list(islice(self._all(), ndx.start, ndx.stop, ndx.step or 1))
        else:
            return islice(self._all(), ndx, ndx+1).next()

In your example, the usage would be:

pages = Page.objects.filter(Q(title__icontains=cleaned_search_term) |
                            Q(body__icontains=cleaned_search_term))
articles = Article.objects.filter(Q(title__icontains=cleaned_search_term) |
                                  Q(body__icontains=cleaned_search_term) |
                                  Q(tags__icontains=cleaned_search_term))
posts = Post.objects.filter(Q(title__icontains=cleaned_search_term) |
                            Q(body__icontains=cleaned_search_term) | 
                            Q(tags__icontains=cleaned_search_term))
matches = QuerySetChain(pages, articles, posts)

Then use matches with the paginator like you used result_list in your example.

The itertools module was introduced in Python 2.3, so it should be available in all Python versions Django runs on.


This can be achieved by two ways either.

1st way to do this

Use union operator for queryset | to take union of two queryset. If both queryset belongs to same model / single model than it is possible to combine querysets by using union operator.

For an instance

pagelist1 = Page.objects.filter(
    Q(title__icontains=cleaned_search_term) | 
    Q(body__icontains=cleaned_search_term))
pagelist2 = Page.objects.filter(
    Q(title__icontains=cleaned_search_term) | 
    Q(body__icontains=cleaned_search_term))
combined_list = pagelist1 | pagelist2 # this would take union of two querysets

2nd way to do this

One other way to achieve combine operation between two queryset is to use itertools chain function.

from itertools import chain
combined_results = list(chain(pagelist1, pagelist2))

DATE_FIELD_MAPPING = {
    Model1: 'date',
    Model2: 'pubdate',
}

def my_key_func(obj):
    return getattr(obj, DATE_FIELD_MAPPING[type(obj)])

And then sorted(chain(Model1.objects.all(), Model2.objects.all()), key=my_key_func)

Quoted from https://groups.google.com/forum/#!topic/django-users/6wUNuJa4jVw. See Alex Gaynor


Related, for mixing querysets from the same model, or for similar fields from a few models, Starting with Django 1.11 a QuerySet.union() method is also available:

union()

union(*other_qs, all=False)

New in Django 1.11. Uses SQL’s UNION operator to combine the results of two or more QuerySets. For example:

>>> qs1.union(qs2, qs3)

The UNION operator selects only distinct values by default. To allow duplicate values, use the all=True argument.

union(), intersection(), and difference() return model instances of the type of the first QuerySet even if the arguments are QuerySets of other models. Passing different models works as long as the SELECT list is the same in all QuerySets (at least the types, the names don’t matter as long as the types in the same order).

In addition, only LIMIT, OFFSET, and ORDER BY (i.e. slicing and order_by()) are allowed on the resulting QuerySet. Further, databases place restrictions on what operations are allowed in the combined queries. For example, most databases don’t allow LIMIT or OFFSET in the combined queries.


Requirements: Django==2.0.2, django-querysetsequence==0.8

In case you want to combine querysets and still come out with a QuerySet, you might want to check out django-queryset-sequence.

But one note about it. It only takes two querysets as it's argument. But with python reduce you can always apply it to multiple querysets.

from functools import reduce
from queryset_sequence import QuerySetSequence

combined_queryset = reduce(QuerySetSequence, list_of_queryset)

And that's it. Below is a situation I ran into and how I employed list comprehension, reduce and django-queryset-sequence

from functools import reduce
from django.shortcuts import render    
from queryset_sequence import QuerySetSequence

class People(models.Model):
    user = models.OneToOneField(User, on_delete=models.CASCADE)
    mentor = models.ForeignKey('self', null=True, on_delete=models.SET_NULL, related_name='my_mentees')

class Book(models.Model):
    name = models.CharField(max_length=20)
    owner = models.ForeignKey(Student, on_delete=models.CASCADE)

# as a mentor, I want to see all the books owned by all my mentees in one view.
def mentee_books(request):
    template = "my_mentee_books.html"
    mentor = People.objects.get(user=request.user)
    my_mentees = mentor.my_mentees.all() # returns QuerySet of all my mentees
    mentee_books = reduce(QuerySetSequence, [each.book_set.all() for each in my_mentees])

    return render(request, template, {'mentee_books' : mentee_books})

You can use the QuerySetChain class below. When using it with Django's paginator, it should only hit the database with COUNT(*) queries for all querysets and SELECT() queries only for those querysets whose records are displayed on the current page.

Note that you need to specify template_name= if using a QuerySetChain with generic views, even if the chained querysets all use the same model.

from itertools import islice, chain

class QuerySetChain(object):
    """
    Chains multiple subquerysets (possibly of different models) and behaves as
    one queryset.  Supports minimal methods needed for use with
    django.core.paginator.
    """

    def __init__(self, *subquerysets):
        self.querysets = subquerysets

    def count(self):
        """
        Performs a .count() for all subquerysets and returns the number of
        records as an integer.
        """
        return sum(qs.count() for qs in self.querysets)

    def _clone(self):
        "Returns a clone of this queryset chain"
        return self.__class__(*self.querysets)

    def _all(self):
        "Iterates records in all subquerysets"
        return chain(*self.querysets)

    def __getitem__(self, ndx):
        """
        Retrieves an item or slice from the chained set of results from all
        subquerysets.
        """
        if type(ndx) is slice:
            return list(islice(self._all(), ndx.start, ndx.stop, ndx.step or 1))
        else:
            return islice(self._all(), ndx, ndx+1).next()

In your example, the usage would be:

pages = Page.objects.filter(Q(title__icontains=cleaned_search_term) |
                            Q(body__icontains=cleaned_search_term))
articles = Article.objects.filter(Q(title__icontains=cleaned_search_term) |
                                  Q(body__icontains=cleaned_search_term) |
                                  Q(tags__icontains=cleaned_search_term))
posts = Post.objects.filter(Q(title__icontains=cleaned_search_term) |
                            Q(body__icontains=cleaned_search_term) | 
                            Q(tags__icontains=cleaned_search_term))
matches = QuerySetChain(pages, articles, posts)

Then use matches with the paginator like you used result_list in your example.

The itertools module was introduced in Python 2.3, so it should be available in all Python versions Django runs on.


In case you want to chain a lot of querysets, try this:

from itertools import chain
result = list(chain(*docs))

where: docs is a list of querysets


here's an idea... just pull down one full page of results from each of the three and then throw out the 20 least useful ones... this eliminates the large querysets and that way you only sacrifice a little performance instead of a lot


You can use Union

qs = qs1.union(qs2, qs3)

But if you want to apply order_by on the foreign models of the combined queryset.. then you need to Select them before hand this way.. otherwise it won't work

Example

qs = qs1.union(qs2.select_related("foreignModel"), qs3.select_related("foreignModel"))
qs.order_by("foreignModel__prop1")

where prop1 is a property in the foreign model


This will do the work without using any other libs

result_list = page_list | article_list | post_list

here's an idea... just pull down one full page of results from each of the three and then throw out the 20 least useful ones... this eliminates the large querysets and that way you only sacrifice a little performance instead of a lot


The big downside of your current approach is its inefficiency with large search result sets, as you have to pull down the entire result set from the database each time, even though you only intend to display one page of results.

In order to only pull down the objects you actually need from the database, you have to use pagination on a QuerySet, not a list. If you do this, Django actually slices the QuerySet before the query is executed, so the SQL query will use OFFSET and LIMIT to only get the records you will actually display. But you can't do this unless you can cram your search into a single query somehow.

Given that all three of your models have title and body fields, why not use model inheritance? Just have all three models inherit from a common ancestor that has title and body, and perform the search as a single query on the ancestor model.


Try this:

matches = pages | articles | posts

It retains all the functions of the querysets which is nice if you want to order_by or similar.

Please note: this doesn't work on querysets from two different models.


Related, for mixing querysets from the same model, or for similar fields from a few models, Starting with Django 1.11 a QuerySet.union() method is also available:

union()

union(*other_qs, all=False)

New in Django 1.11. Uses SQL’s UNION operator to combine the results of two or more QuerySets. For example:

>>> qs1.union(qs2, qs3)

The UNION operator selects only distinct values by default. To allow duplicate values, use the all=True argument.

union(), intersection(), and difference() return model instances of the type of the first QuerySet even if the arguments are QuerySets of other models. Passing different models works as long as the SELECT list is the same in all QuerySets (at least the types, the names don’t matter as long as the types in the same order).

In addition, only LIMIT, OFFSET, and ORDER BY (i.e. slicing and order_by()) are allowed on the resulting QuerySet. Further, databases place restrictions on what operations are allowed in the combined queries. For example, most databases don’t allow LIMIT or OFFSET in the combined queries.


here's an idea... just pull down one full page of results from each of the three and then throw out the 20 least useful ones... this eliminates the large querysets and that way you only sacrifice a little performance instead of a lot


The big downside of your current approach is its inefficiency with large search result sets, as you have to pull down the entire result set from the database each time, even though you only intend to display one page of results.

In order to only pull down the objects you actually need from the database, you have to use pagination on a QuerySet, not a list. If you do this, Django actually slices the QuerySet before the query is executed, so the SQL query will use OFFSET and LIMIT to only get the records you will actually display. But you can't do this unless you can cram your search into a single query somehow.

Given that all three of your models have title and body fields, why not use model inheritance? Just have all three models inherit from a common ancestor that has title and body, and perform the search as a single query on the ancestor model.


You can use Union

qs = qs1.union(qs2, qs3)

But if you want to apply order_by on the foreign models of the combined queryset.. then you need to Select them before hand this way.. otherwise it won't work

Example

qs = qs1.union(qs2.select_related("foreignModel"), qs3.select_related("foreignModel"))
qs.order_by("foreignModel__prop1")

where prop1 is a property in the foreign model


DATE_FIELD_MAPPING = {
    Model1: 'date',
    Model2: 'pubdate',
}

def my_key_func(obj):
    return getattr(obj, DATE_FIELD_MAPPING[type(obj)])

And then sorted(chain(Model1.objects.all(), Model2.objects.all()), key=my_key_func)

Quoted from https://groups.google.com/forum/#!topic/django-users/6wUNuJa4jVw. See Alex Gaynor


Examples related to django

How to fix error "ERROR: Command errored out with exit status 1: python." when trying to install django-heroku using pip Pylint "unresolved import" error in Visual Studio Code Is it better to use path() or url() in urls.py for django 2.0? Unable to import path from django.urls Error loading MySQLdb Module 'Did you install mysqlclient or MySQL-python?' ImportError: Couldn't import Django Django - Reverse for '' not found. '' is not a valid view function or pattern name Class has no objects member Getting TypeError: __init__() missing 1 required positional argument: 'on_delete' when trying to add parent table after child table with entries How to switch Python versions in Terminal? Find a file by name in Visual Studio Code Search all the occurrences of a string in the entire project in Android Studio Java List.contains(Object with field value equal to x) Trigger an action after selection select2 How can I search for a commit message on GitHub? SQL search multiple values in same field Find a string by searching all tables in SQL Server Management Studio 2008 Search File And Find Exact Match And Print Line? Java - Search for files in a directory How to put a delay on AngularJS instant search?

Examples related to django-queryset

Django values_list vs values Get the latest record with filter in Django How to do a less than or equal to filter in Django queryset? How can I filter a Django query with a list of values? How to obtain a QuerySet of all rows, with specific fields for each one of them? How to perform OR condition in django queryset? Getting a count of objects in a queryset in django How do I filter query objects by date range in Django? How to remove all of the data in a table using Django Select DISTINCT individual columns in django?

Examples related to django-q

How to combine two or more querysets in a Django view?