What's the difference between select_related and prefetch_related in Django ORM?

Question

In the Django docs:

> select_related() "follows" foreign-key relationships, selecting additional related-object data when it executes its query.

> prefetch_related() does a separate lookup for each relationship, and does the "joining" in Python.

What does it mean by "doing the joining in Python"? Can someone illustrate with an example?

My understanding is that for a foreign key relationship, use select_related; and for an M2M relationship, use prefetch_related. Is this correct?

User · Accepted Answer

Your understanding is mostly correct. You use select_related when the object that you're going to be selecting is a single object, so a OneToOneField or a ForeignKey. You use prefetch_related when you're going to get a "set" of things, so ManyToManyFields as you stated, or reverse ForeignKeys. Just to clarify what I mean by "reverse ForeignKeys", here's an example:

    from django.db import models

    class ModelA(models.Model):
        pass

    class ModelB(models.Model):
        a = models.ForeignKey(ModelA, on_delete=models.CASCADE)

    ModelB.objects.select_related('a').all()             # Forward ForeignKey relationship
    ModelA.objects.prefetch_related('modelb_set').all()  # Reverse ForeignKey relationship

The difference is that select_related does an SQL join and therefore gets the results back as part of the table from the SQL server. prefetch_related, on the other hand, executes another query and therefore reduces the redundant columns in the original object (ModelA in the above example). You may use prefetch_related for anything that you can use select_related for.

The tradeoff is that prefetch_related has to create and send a list of IDs to select back to the server, and this can take a while. I'm not sure if there's a nice way of doing this in a transaction, but my understanding is that Django always just sends a list and says SELECT ... WHERE pk IN (..., ..., ...), basically. If the prefetched data is sparse (let's say U.S. State objects linked to people's addresses), this can be very good. However, if it's closer to one-to-one, this can waste a lot of communication. If in doubt, try both and see which performs better.

Everything discussed above is basically about the communication with the database. On the Python side, however, prefetch_related has the extra benefit that a single object is used to represent each object in the database. With select_related, duplicate objects will be created in Python for each "parent" object. Since objects in Python have a decent bit of memory overhead, this can also be a consideration.
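To make the last point concrete, here is a rough simulation using only the standard library's sqlite3 module. It is not Django's actual implementation. The `state` and `person` tables, the `State` and `Person` classes, and both loader functions are hypothetical names invented for this sketch. One loader mimics the select_related strategy (a single JOIN), the other mimics the prefetch_related strategy (two queries stitched together in Python).

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class State:  # hypothetical "parent" row
    id: int
    name: str


@dataclass
class Person:  # hypothetical row holding a foreign key to State
    id: int
    name: str
    state: Optional[State] = None


def make_db():
    """Build a tiny in-memory database: two states, three people."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE state (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT,
                             state_id INTEGER REFERENCES state(id));
        INSERT INTO state VALUES (1, 'Ohio'), (2, 'Utah');
        INSERT INTO person VALUES (1, 'Ann', 1), (2, 'Bob', 1), (3, 'Cy', 2);
    """)
    return conn


def load_with_join(conn):
    """select_related style: one query, parent columns repeated on every row."""
    rows = conn.execute(
        "SELECT p.id, p.name, s.id, s.name FROM person p "
        "JOIN state s ON s.id = p.state_id ORDER BY p.id"
    ).fetchall()
    # A fresh State is built per row, even when the same state repeats.
    return [Person(pid, pname, State(sid, sname)) for pid, pname, sid, sname in rows]


def load_with_prefetch(conn):
    """prefetch_related style: two queries, joined in Python via a dict."""
    rows = conn.execute(
        "SELECT id, name, state_id FROM person ORDER BY id"
    ).fetchall()
    state_ids = sorted({state_id for _, _, state_id in rows})
    placeholders = ", ".join("?" for _ in state_ids)
    states = {
        sid: State(sid, sname)
        for sid, sname in conn.execute(
            f"SELECT id, name FROM state WHERE id IN ({placeholders})", state_ids
        )
    }
    # Each distinct state is built once and shared by every person that uses it.
    return [Person(pid, name, states[state_id]) for pid, name, state_id in rows]
```

Both loaders return the same data. The difference is that in the JOIN version Ann and Bob each carry their own copy of the Ohio object, whereas in the two-query version they share a single instance.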

User · Answer

Both methods achieve the same purpose: to forgo unnecessary database queries. But they use different approaches for efficiency.

The only reason to use either of these methods is when a single large query is preferable to many small queries. Django uses the large query to create models in memory preemptively, rather than performing on-demand queries against the database.

select_related performs a join with each lookup, and extends the SELECT to include the columns of all joined tables. However, this approach has a caveat.

Joins have the potential to multiply the number of rows in a query. When you perform a join over a foreign key or one-to-one field, the number of rows won't increase. However, many-to-many joins do not have this guarantee. So Django restricts select_related to relations that won't unexpectedly result in a massive join.

The "join in Python" for prefetch_related sounds a little more alarming than it should. It creates a separate query for each table to be joined, and filters each of these tables with a WHERE IN clause, like:

    SELECT "credential"."id",
           "credential"."uuid",
           "credential"."identity_id"
    FROM   "credential"
    WHERE  "credential"."identity_id" IN
        (84706, 48746, 871441, 84713, 76492, 84621, 51472)

Rather than performing a single join with potentially too many rows, each table is split into a separate query.
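The row-multiplication caveat can be checked with a small sqlite3 sketch. The `author`, `book`, `tag`, and `book_tag` tables are made up for illustration and have nothing to do with Django's own schema. The foreign-key join keeps one row per book, while the many-to-many join returns one row per (book, tag) pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT,
                       author_id INTEGER REFERENCES author(id));
    CREATE TABLE tag (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE book_tag (book_id INTEGER, tag_id INTEGER);
    INSERT INTO author VALUES (1, 'A'), (2, 'B');
    INSERT INTO book VALUES (1, 'X', 1), (2, 'Y', 1), (3, 'Z', 2);
    INSERT INTO tag VALUES (1, 'red'), (2, 'green'), (3, 'blue');
    INSERT INTO book_tag VALUES (1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (3, 3);
""")


def row_count(sql):
    """Run a query and return how many rows came back."""
    return len(conn.execute(sql).fetchall())


# Baseline: one row per book.
book_rows = row_count("SELECT * FROM book")

# Join over a foreign key: each book has exactly one author,
# so the row count does not grow.
fk_join_rows = row_count(
    "SELECT * FROM book JOIN author ON author.id = book.author_id"
)

# Join across a many-to-many link table: one row per (book, tag) pair,
# with the book columns repeated on each of them.
m2m_join_rows = row_count(
    "SELECT * FROM book "
    "JOIN book_tag ON book_tag.book_id = book.id "
    "JOIN tag ON tag.id = book_tag.tag_id"
)

print(book_rows, fk_join_rows, m2m_join_rows)
```

With this sample data the many-to-many join returns twice as many rows as there are books, and the gap grows with the number of tags per book.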

User · Answer

As the Django documentation says:

> prefetch_related()
>
> Returns a QuerySet that will automatically retrieve, in a single batch, related objects for each of the specified lookups.
>
> This has a similar purpose to select_related, in that both are designed to stop the deluge of database queries that is caused by accessing related objects, but the strategy is quite different.
>
> select_related works by creating an SQL join and including the fields of the related object in the SELECT statement. For this reason, select_related gets the related objects in the same database query. However, to avoid the much larger result set that would result from joining across a "many" relationship, select_related is limited to single-valued relationships - foreign key and one-to-one.
>
> prefetch_related, on the other hand, does a separate lookup for each relationship, and does the "joining" in Python. This allows it to prefetch many-to-many and many-to-one objects, which cannot be done using select_related, in addition to the foreign key and one-to-one relationships that are supported by select_related. It also supports prefetching of GenericRelation and GenericForeignKey, however, it must be restricted to a homogeneous set of results. For example, prefetching objects referenced by a GenericForeignKey is only supported if the query is restricted to one ContentType.

More information about this: https://docs.djangoproject.com/en/2.2/ref/models/querysets/#prefetch-related

User · Answer

I've gone through the already posted answers and thought it would be better to add an answer with an actual example.

Let's say you have 3 Django models which are related:

    class M1(models.Model):
        name = models.CharField(max_length=10)

    class M2(models.Model):
        name = models.CharField(max_length=10)
        select_relation = models.ForeignKey(M1, on_delete=models.CASCADE)
        prefetch_relation = models.ManyToManyField(to='M3')

    class M3(models.Model):
        name = models.CharField(max_length=10)

Here you can query the M2 model and its related M1 objects using the select_relation field, and its M3 objects using the prefetch_relation field.

As mentioned, M1's relation from M2 is a ForeignKey, so it returns only 1 record for any M2 object. The same applies to a OneToOneField.

But M3's relation from M2 is a ManyToManyField, which might return any number of M3 objects.

Consider a case where you have 2 M2 objects, m21 and m22, which have the same 5 associated M3 objects with IDs 1, 2, 3, 4, 5. If you fetch the associated M3 objects for each of those M2 objects one at a time, without prefetching (select_related cannot follow a many-to-many relation), this is how it works:

Steps:

1. Find the m21 object.
2. Query all the M3 objects related to the m21 object, whose IDs are 1, 2, 3, 4, 5.
3. Repeat the same thing for the m22 object and all other M2 objects.

As m21 and m22 both have the same IDs 1, 2, 3, 4, 5, this approach queries the DB twice for the same IDs, which were already fetched.

If you use prefetch_related instead, then when you fetch the M2 objects, Django makes a note of all the IDs that your objects returned (only the IDs) while querying the M2 table. As a last step, Django makes one query to the M3 table with the set of all IDs that your M2 objects returned, and joins the results to the M2 objects in Python instead of in the database.

This way you query all the M3 objects only once, which improves performance.
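The query counts in the m21 / m22 scenario can be sketched with plain sqlite3. This is only a simulation of the two access patterns, not what Django runs internally. The `CountingConnection` wrapper, the `m2`, `m3`, and `m2_m3` table names, and both fetch functions are assumptions made for this example.

```python
import sqlite3
from collections import defaultdict


class CountingConnection:
    """Hypothetical wrapper that counts the SELECT queries sent through it."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.queries = 0

    def query(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()


def make_db():
    """Two M2 rows (m21, m22), each linked to the same five M3 rows."""
    db = CountingConnection()
    db.conn.executescript("""
        CREATE TABLE m2 (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE m3 (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE m2_m3 (m2_id INTEGER, m3_id INTEGER);
        INSERT INTO m2 VALUES (1, 'm21'), (2, 'm22');
        INSERT INTO m3 VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (5, 'e');
        INSERT INTO m2_m3 SELECT m2.id, m3.id FROM m2, m3;
    """)
    return db


def fetch_one_by_one(db):
    """No prefetching: one query for M2, then one more query per M2 object."""
    result = {}
    for m2_id, name in db.query("SELECT id, name FROM m2 ORDER BY id"):
        rows = db.query(
            "SELECT m3.id FROM m3 JOIN m2_m3 ON m2_m3.m3_id = m3.id "
            "WHERE m2_m3.m2_id = ? ORDER BY m3.id",
            (m2_id,),
        )
        result[name] = [m3_id for (m3_id,) in rows]
    return result


def fetch_batched(db):
    """Prefetch style: one query for M2, one batched query, grouping in Python."""
    m2_rows = db.query("SELECT id, name FROM m2 ORDER BY id")
    ids = [m2_id for m2_id, _ in m2_rows]
    placeholders = ", ".join("?" for _ in ids)
    links = db.query(
        "SELECT m2_m3.m2_id, m3.id FROM m3 "
        "JOIN m2_m3 ON m2_m3.m3_id = m3.id "
        f"WHERE m2_m3.m2_id IN ({placeholders}) ORDER BY m3.id",
        ids,
    )
    grouped = defaultdict(list)
    for m2_id, m3_id in links:  # the "join in Python" step
        grouped[m2_id].append(m3_id)
    return {name: grouped[m2_id] for m2_id, name in m2_rows}
```

With two M2 rows the first pattern sends three queries and the second sends two. The first grows by one query for every extra M2 row, while the second stays at two.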
