[django] When saving, how can you check if a field has changed?

In my model I have :

class Alias(MyBaseModel):
    remote_image = models.URLField(max_length=500, null=True, help_text="A URL that is downloaded and cached for the image. Only
 used when the alias is made")
    image = models.ImageField(upload_to='alias', default='alias-default.png', help_text="An image representing the alias")


    def save(self, *args, **kw):
        if (not self.image or self.image.name == 'alias-default.png') and self.remote_image :
            try :
                data = utils.fetch(self.remote_image)
                image = StringIO.StringIO(data)
                image = Image.open(image)
                buf = StringIO.StringIO()
                image.save(buf, format='PNG')
                self.image.save(hashlib.md5(self.string_id).hexdigest() + ".png", ContentFile(buf.getvalue()))
            except IOError :
                pass

Which works great for the first time the remote_image changes.

How can I fetch a new image when someone has modified the remote_image on the alias? And secondly, is there a better way to cache a remote image?

This question is related to django image caching django-models

The answer is


The mixin from @ivanlivski is great.

I've extended it to

  • Ensure it works with Decimal fields.
  • Expose properties to simplify usage

The updated code is available here: https://github.com/sknutsonsf/python-contrib/blob/master/src/django/utils/ModelDiffMixin.py

To help people new to Python or Django, I'll give a more complete example. This particular usage is to take a file from a data provider and ensure the records in the database reflect the file.

My model object:

class Station(ModelDiffMixin.ModelDiffMixin, models.Model):
    station_name = models.CharField(max_length=200)
    nearby_city = models.CharField(max_length=200)

    precipitation = models.DecimalField(max_digits=5, decimal_places=2)
    # <list of many other fields>

   def is_float_changed (self,v1, v2):
        ''' Compare two floating values to just two digit precision
        Override Default precision is 5 digits
        '''
        return abs (round (v1 - v2, 2)) > 0.01

The class that loads the file has these methods:

class UpdateWeather (object)
    # other methods omitted

    def update_stations (self, filename):
        # read all existing data 
        all_stations = models.Station.objects.all()
        self._existing_stations = {}

        # insert into a collection for referencing while we check if data exists
        for stn in all_stations.iterator():
            self._existing_stations[stn.id] = stn

        # read the file. result is array of objects in known column order
        data = read_tabbed_file(filename)

        # iterate rows from file and insert or update where needed
        for rownum in range(sh.nrows):
            self._update_row(sh.row(rownum));

        # now anything remaining in the collection is no longer active
        # since it was not found in the newest file
        # for now, delete that record
        # there should never be any of these if the file was created properly
        for stn in self._existing_stations.values():
            stn.delete()
            self._num_deleted = self._num_deleted+1


    def _update_row (self, rowdata):
        stnid = int(rowdata[0].value) 
        name = rowdata[1].value.strip()

        # skip the blank names where data source has ids with no data today
        if len(name) < 1:
            return

        # fetch rest of fields and do sanity test
        nearby_city = rowdata[2].value.strip()
        precip = rowdata[3].value

        if stnid in self._existing_stations:
            stn = self._existing_stations[stnid]
            del self._existing_stations[stnid]
            is_update = True;
        else:
            stn = models.Station()
            is_update = False;

        # object is new or old, don't care here            
        stn.id = stnid
        stn.station_name = name;
        stn.nearby_city = nearby_city
        stn.precipitation = precip

        # many other fields updated from the file 

        if is_update == True:

            # we use a model mixin to simplify detection of changes
            # at the cost of extra memory to store the objects            
            if stn.has_changed == True:
                self._num_updated = self._num_updated + 1;
                stn.save();
        else:
            self._num_created = self._num_created + 1;
            stn.save()

And now for direct answer: one way to check if the value for the field has changed is to fetch original data from database before saving instance. Consider this example:

class MyModel(models.Model):
    f1 = models.CharField(max_length=1)

    def save(self, *args, **kw):
        if self.pk is not None:
            orig = MyModel.objects.get(pk=self.pk)
            if orig.f1 != self.f1:
                print 'f1 changed'
        super(MyModel, self).save(*args, **kw)

The same thing applies when working with a form. You can detect it at the clean or save method of a ModelForm:

class MyModelForm(forms.ModelForm):

    def clean(self):
        cleaned_data = super(ProjectForm, self).clean()
        #if self.has_changed():  # new instance or existing updated (form has data to save)
        if self.instance.pk is not None:  # new instance only
            if self.instance.f1 != cleaned_data['f1']:
                print 'f1 changed'
        return cleaned_data

    class Meta:
        model = MyModel
        exclude = []

improving @josh answer for all fields:

class Person(models.Model):
  name = models.CharField()

def __init__(self, *args, **kwargs):
    super(Person, self).__init__(*args, **kwargs)
    self._original_fields = dict([(field.attname, getattr(self, field.attname))
        for field in self._meta.local_fields if not isinstance(field, models.ForeignKey)])

def save(self, *args, **kwargs):
  if self.id:
    for field in self._meta.local_fields:
      if not isinstance(field, models.ForeignKey) and\
        self._original_fields[field.name] != getattr(self, field.name):
        # Do Something    
  super(Person, self).save(*args, **kwargs)

just to clarify, the getattr works to get fields like person.name with strings (i.e. getattr(person, "name")


I had this situation before my solution was to override the pre_save() method of the target field class it will be called only if the field has been changed
useful with FileField example:

class PDFField(FileField):
    def pre_save(self, model_instance, add):
        # do some operations on your file 
        # if and only if you have changed the filefield

disadvantage:
not useful if you want to do any (post_save) operation like using the created object in some job (if certain field has changed)


Another late answer, but if you're just trying to see if a new file has been uploaded to a file field, try this: (adapted from Christopher Adams's comment on the link http://zmsmith.com/2010/05/django-check-if-a-field-has-changed/ in zach's comment here)

Updated link: https://web.archive.org/web/20130101010327/http://zmsmith.com:80/2010/05/django-check-if-a-field-has-changed/

def save(self, *args, **kw):
    from django.core.files.uploadedfile import UploadedFile
    if hasattr(self.image, 'file') and isinstance(self.image.file, UploadedFile) :
        # Handle FileFields as special cases, because the uploaded filename could be
        # the same as the filename that's already there even though there may
        # be different file contents.

        # if a file was just uploaded, the storage model with be UploadedFile
        # Do new file stuff here
        pass

You can use django-model-changes to do this without an additional database lookup:

from django.dispatch import receiver
from django_model_changes import ChangesMixin

class Alias(ChangesMixin, MyBaseModel):
   # your model

@receiver(pre_save, sender=Alias)
def do_something_if_changed(sender, instance, **kwargs):
    if 'remote_image' in instance.changes():
        # do something

If you do not find interest in overriding save method, you can do

  model_fields = [f.name for f in YourModel._meta.get_fields()]
  valid_data = {
        key: new_data[key]
        for key in model_fields
        if key in new_data.keys()
  }

  for (key, value) in valid_data.items():
        if getattr(instance, key) != value:
           print ('Data has changed')

        setattr(instance, key, value)

 instance.save()

There is an attribute __dict__ which have all the fields as the keys and value as the field values. So we can just compare two of them

Just change the save function of model to the function below

def save(self, force_insert=False, force_update=False, using=None, update_fields=None):
    if self.pk is not None:
        initial = A.objects.get(pk=self.pk)
        initial_json, final_json = initial.__dict__.copy(), self.__dict__.copy()
        initial_json.pop('_state'), final_json.pop('_state')
        only_changed_fields = {k: {'final_value': final_json[k], 'initial_value': initial_json[k]} for k in initial_json if final_json[k] != initial_json[k]}
        print(only_changed_fields)
    super(A, self).save(force_insert=False, force_update=False, using=None, update_fields=None)

Example Usage:

class A(models.Model):
    name = models.CharField(max_length=200, null=True, blank=True)
    senior = models.CharField(choices=choices, max_length=3)
    timestamp = models.DateTimeField(null=True, blank=True)

    def save(self, force_insert=False, force_update=False, using=None, update_fields=None):
        if self.pk is not None:
            initial = A.objects.get(pk=self.pk)
            initial_json, final_json = initial.__dict__.copy(), self.__dict__.copy()
            initial_json.pop('_state'), final_json.pop('_state')
            only_changed_fields = {k: {'final_value': final_json[k], 'initial_value': initial_json[k]} for k in initial_json if final_json[k] != initial_json[k]}
            print(only_changed_fields)
        super(A, self).save(force_insert=False, force_update=False, using=None, update_fields=None)

yields output with only those fields that have been changed

{'name': {'initial_value': '1234515', 'final_value': 'nim'}, 'senior': {'initial_value': 'no', 'final_value': 'yes'}}

Here is another way of doing it.

class Parameter(models.Model):

    def __init__(self, *args, **kwargs):
        super(Parameter, self).__init__(*args, **kwargs)
        self.__original_value = self.value

    def clean(self,*args,**kwargs):
        if self.__original_value == self.value:
            print("igual")
        else:
            print("distinto")

    def save(self,*args,**kwargs):
        self.full_clean()
        return super(Parameter, self).save(*args, **kwargs)
        self.__original_value = self.value

    key = models.CharField(max_length=24, db_index=True, unique=True)
    value = models.CharField(max_length=128)

As per documentation: validating objects

"The second step full_clean() performs is to call Model.clean(). This method should be overridden to perform custom validation on your model. This method should be used to provide custom model validation, and to modify attributes on your model if desired. For instance, you could use it to automatically provide a value for a field, or to do validation that requires access to more than a single field:"


Best way is with a pre_save signal. May not have been an option back in '09 when this question was asked and answered, but anyone seeing this today should do it this way:

@receiver(pre_save, sender=MyModel)
def do_something_if_changed(sender, instance, **kwargs):
    try:
        obj = sender.objects.get(pk=instance.pk)
    except sender.DoesNotExist:
        pass # Object is new, so field hasn't technically changed, but you may want to do something else here.
    else:
        if not obj.some_field == instance.some_field: # Field has changed
            # do something

I use following mixin:

from django.forms.models import model_to_dict


class ModelDiffMixin(object):
    """
    A model mixin that tracks model fields' values and provide some useful api
    to know what fields have been changed.
    """

    def __init__(self, *args, **kwargs):
        super(ModelDiffMixin, self).__init__(*args, **kwargs)
        self.__initial = self._dict

    @property
    def diff(self):
        d1 = self.__initial
        d2 = self._dict
        diffs = [(k, (v, d2[k])) for k, v in d1.items() if v != d2[k]]
        return dict(diffs)

    @property
    def has_changed(self):
        return bool(self.diff)

    @property
    def changed_fields(self):
        return self.diff.keys()

    def get_field_diff(self, field_name):
        """
        Returns a diff for field if it's changed and None otherwise.
        """
        return self.diff.get(field_name, None)

    def save(self, *args, **kwargs):
        """
        Saves model and set initial state.
        """
        super(ModelDiffMixin, self).save(*args, **kwargs)
        self.__initial = self._dict

    @property
    def _dict(self):
        return model_to_dict(self, fields=[field.name for field in
                             self._meta.fields])

Usage:

>>> p = Place()
>>> p.has_changed
False
>>> p.changed_fields
[]
>>> p.rank = 42
>>> p.has_changed
True
>>> p.changed_fields
['rank']
>>> p.diff
{'rank': (0, 42)}
>>> p.categories = [1, 3, 5]
>>> p.diff
{'categories': (None, [1, 3, 5]), 'rank': (0, 42)}
>>> p.get_field_diff('categories')
(None, [1, 3, 5])
>>> p.get_field_diff('rank')
(0, 42)
>>>

Note

Please note that this solution works well in context of current request only. Thus it's suitable primarily for simple cases. In concurrent environment where multiple requests can manipulate the same model instance at the same time, you definitely need a different approach.


As of Django 1.8, there's the from_db method, as Serge mentions. In fact, the Django docs include this specific use case as an example:

https://docs.djangoproject.com/en/dev/ref/models/instances/#customizing-model-loading

Below is an example showing how to record the initial values of fields that are loaded from the database


A modification to @ivanperelivskiy's answer:

@property
def _dict(self):
    ret = {}
    for field in self._meta.get_fields():
        if isinstance(field, ForeignObjectRel):
            # foreign objects might not have corresponding objects in the database.
            if hasattr(self, field.get_accessor_name()):
                ret[field.get_accessor_name()] = getattr(self, field.get_accessor_name())
            else:
                ret[field.get_accessor_name()] = None
        else:
            ret[field.attname] = getattr(self, field.attname)
    return ret

This uses django 1.10's public method get_fields instead. This makes the code more future proof, but more importantly also includes foreign keys and fields where editable=False.

For reference, here is the implementation of .fields

@cached_property
def fields(self):
    """
    Returns a list of all forward fields on the model and its parents,
    excluding ManyToManyFields.

    Private API intended only to be used by Django itself; get_fields()
    combined with filtering of field properties is the public API for
    obtaining this field list.
    """
    # For legacy reasons, the fields property should only contain forward
    # fields that are not private or with a m2m cardinality. Therefore we
    # pass these three filters as filters to the generator.
    # The third lambda is a longwinded way of checking f.related_model - we don't
    # use that property directly because related_model is a cached property,
    # and all the models may not have been loaded yet; we don't want to cache
    # the string reference to the related_model.
    def is_not_an_m2m_field(f):
        return not (f.is_relation and f.many_to_many)

    def is_not_a_generic_relation(f):
        return not (f.is_relation and f.one_to_many)

    def is_not_a_generic_foreign_key(f):
        return not (
            f.is_relation and f.many_to_one and not (hasattr(f.remote_field, 'model') and f.remote_field.model)
        )

    return make_immutable_fields_list(
        "fields",
        (f for f in self._get_fields(reverse=False)
         if is_not_an_m2m_field(f) and is_not_a_generic_relation(f) and is_not_a_generic_foreign_key(f))
    )

How about using David Cramer's solution:

http://cramer.io/2010/12/06/tracking-changes-to-fields-in-django/

I've had success using it like this:

@track_data('name')
class Mode(models.Model):
    name = models.CharField(max_length=5)
    mode = models.CharField(max_length=5)

    def save(self, *args, **kwargs):
        if self.has_changed('name'):
            print 'name changed'

    # OR #

    @classmethod
    def post_save(cls, sender, instance, created, **kwargs):
        if instance.has_changed('name'):
            print "Hooray!"

This works for me in Django 1.8

def clean(self):
    if self.cleaned_data['name'] != self.initial['name']:
        # Do something

The optimal solution is probably one that does not include an additional database read operation prior to saving the model instance, nor any further django-library. This is why laffuste's solutions is preferable. In the context of an admin site, one can simply override the save_model-method, and invoke the form's has_changed method there, just as in Sion's answer above. You arrive at something like this, drawing on Sion's example setting but using changed_data to get every possible change:

class ModelAdmin(admin.ModelAdmin):
   fields=['name','mode']
   def save_model(self, request, obj, form, change):
     form.changed_data #output could be ['name']
     #do somethin the changed name value...
     #call the super method
     super(self,ModelAdmin).save_model(request, obj, form, change)
  • Override save_model:

https://docs.djangoproject.com/en/1.10/ref/contrib/admin/#django.contrib.admin.ModelAdmin.save_model

  • Built-in changed_data-method for a Field:

https://docs.djangoproject.com/en/1.10/ref/forms/api/#django.forms.Form.changed_data


I have extended the mixin of @livskiy as follows:

class ModelDiffMixin(models.Model):
    """
    A model mixin that tracks model fields' values and provide some useful api
    to know what fields have been changed.
    """
    _dict = DictField(editable=False)
    def __init__(self, *args, **kwargs):
        super(ModelDiffMixin, self).__init__(*args, **kwargs)
        self._initial = self._dict

    @property
    def diff(self):
        d1 = self._initial
        d2 = self._dict
        diffs = [(k, (v, d2[k])) for k, v in d1.items() if v != d2[k]]
        return dict(diffs)

    @property
    def has_changed(self):
        return bool(self.diff)

    @property
    def changed_fields(self):
        return self.diff.keys()

    def get_field_diff(self, field_name):
        """
        Returns a diff for field if it's changed and None otherwise.
        """
        return self.diff.get(field_name, None)

    def save(self, *args, **kwargs):
        """
        Saves model and set initial state.
        """
        object_dict = model_to_dict(self,
               fields=[field.name for field in self._meta.fields])
        for field in object_dict:
            # for FileFields
            if issubclass(object_dict[field].__class__, FieldFile):
                try:
                    object_dict[field] = object_dict[field].path
                except :
                    object_dict[field] = object_dict[field].name

            # TODO: add other non-serializable field types
        self._dict = object_dict
        super(ModelDiffMixin, self).save(*args, **kwargs)

    class Meta:
        abstract = True

and the DictField is:

class DictField(models.TextField):
    __metaclass__ = models.SubfieldBase
    description = "Stores a python dict"

    def __init__(self, *args, **kwargs):
        super(DictField, self).__init__(*args, **kwargs)

    def to_python(self, value):
        if not value:
            value = {}

        if isinstance(value, dict):
            return value

        return json.loads(value)

    def get_prep_value(self, value):
        if value is None:
            return value
        return json.dumps(value)

    def value_to_string(self, obj):
        value = self._get_val_from_obj(obj)
        return self.get_db_prep_value(value)

it can be used by extending it in your models a _dict field will be added when you sync/migrate and that field will store the state of your objects


Very late to the game, but this is a version of Chris Pratt's answer that protects against race conditions while sacrificing performance, by using a transaction block and select_for_update()

@receiver(pre_save, sender=MyModel)
@transaction.atomic
def do_something_if_changed(sender, instance, **kwargs):
    try:
        obj = sender.objects.select_for_update().get(pk=instance.pk)
    except sender.DoesNotExist:
        pass # Object is new, so field hasn't technically changed, but you may want to do something else here.
    else:
        if not obj.some_field == instance.some_field: # Field has changed
            # do something

I am a bit late to the party but I found this solution also: Django Dirty Fields


If you are using a form, you can use Form's changed_data (docs):

class AliasForm(ModelForm):

    def save(self, commit=True):
        if 'remote_image' in self.changed_data:
            # do things
            remote_image = self.cleaned_data['remote_image']
            do_things(remote_image)
        super(AliasForm, self).save(commit)

    class Meta:
        model = Alias

Since Django 1.8 released, you can use from_db classmethod to cache old value of remote_image. Then in save method you can compare old and new value of field to check if the value has changed.

@classmethod
def from_db(cls, db, field_names, values):
    new = super(Alias, cls).from_db(db, field_names, values)
    # cache value went from the base
    new._loaded_remote_image = values[field_names.index('remote_image')]
    return new

def save(self, force_insert=False, force_update=False, using=None,
         update_fields=None):
    if (self._state.adding and self.remote_image) or \
        (not self._state.adding and self._loaded_remote_image != self.remote_image):
        # If it is first save and there is no cached remote_image but there is new one, 
        # or the value of remote_image has changed - do your stuff!

Note that field change tracking is available in django-model-utils.

https://django-model-utils.readthedocs.org/en/latest/index.html


While this doesn't actually answer your question, I'd go about this in a different way.

Simply clear the remote_image field after successfully saving the local copy. Then in your save method you can always update the image whenever remote_image isn't empty.

If you'd like to keep a reference to the url, you could use an non-editable boolean field to handle the caching flag rather than remote_image field itself.


as an extension of SmileyChris' answer, you can add a datetime field to the model for last_updated, and set some sort of limit for the max age you'll let it get to before checking for a change


Examples related to django

How to fix error "ERROR: Command errored out with exit status 1: python." when trying to install django-heroku using pip Pylint "unresolved import" error in Visual Studio Code Is it better to use path() or url() in urls.py for django 2.0? Unable to import path from django.urls Error loading MySQLdb Module 'Did you install mysqlclient or MySQL-python?' ImportError: Couldn't import Django Django - Reverse for '' not found. '' is not a valid view function or pattern name Class has no objects member Getting TypeError: __init__() missing 1 required positional argument: 'on_delete' when trying to add parent table after child table with entries How to switch Python versions in Terminal?

Examples related to image

Reading images in python Numpy Resize/Rescale Image Convert np.array of type float64 to type uint8 scaling values Extract a page from a pdf as a jpeg How do I stretch an image to fit the whole background (100% height x 100% width) in Flutter? Angular 4 img src is not found How to make a movie out of images in python Load local images in React.js How to install "ifconfig" command in my ubuntu docker image? How do I display local image in markdown?

Examples related to caching

Disable nginx cache for JavaScript files How to prevent Browser cache on Angular 2 site? Curl command without using cache Notepad++ cached files location Laravel 5 Clear Views Cache Write-back vs Write-Through caching? Tomcat 8 throwing - org.apache.catalina.webresources.Cache.getResource Unable to add the resource Chrome - ERR_CACHE_MISS How do I use disk caching in Picasso? How to clear gradle cache?

Examples related to django-models

Getting TypeError: __init__() missing 1 required positional argument: 'on_delete' when trying to add parent table after child table with entries "Post Image data using POSTMAN" What does on_delete do on Django models? Django values_list vs values What's the difference between select_related and prefetch_related in Django ORM? Django Model() vs Model.objects.create() 'NOT NULL constraint failed' after adding to models.py django - get() returned more than one topic How to convert Django Model object to dict with its fields and values? How do I make an auto increment integer field in Django?