[python] Understanding __get__ and __set__ and Python descriptors

I am trying to understand what Python's descriptors are and what they can be useful for.

Descriptors are class attributes (like properties or methods) with any of the following special methods:

  • __get__ (non-data descriptor method, for example on a method/function)
  • __set__ (data descriptor method, for example on a property instance)
  • __delete__ (data descriptor method)

These descriptor objects can be used as attributes on other object class definitions. (That is, they live in the __dict__ of the class object.)

Descriptor objects can be used to programmatically manage the results of a dotted lookup (e.g. foo.descriptor) in a normal expression, an assignment, and even a deletion.

Functions/methods, bound methods, property, classmethod, and staticmethod all use these special methods to control how they are accessed via the dotted lookup.

A data descriptor, like property, can allow for lazy evaluation of attributes based on a simpler state of the object, allowing instances to use less memory than if you precomputed each possible attribute.

Another data descriptor, a member_descriptor, created by __slots__, allow memory savings by allowing the class to store data in a mutable tuple-like datastructure instead of the more flexible but space-consuming __dict__.

Non-data descriptors, usually instance, class, and static methods, get their implicit first arguments (usually named cls and self, respectively) from their non-data descriptor method, __get__.

Most users of Python need to learn only the simple usage, and have no need to learn or understand the implementation of descriptors further.

In Depth: What Are Descriptors?

A descriptor is an object with any of the following methods (__get__, __set__, or __delete__), intended to be used via dotted-lookup as if it were a typical attribute of an instance. For an owner-object, obj_instance, with a descriptor object:

  • obj_instance.descriptor invokes
    descriptor.__get__(self, obj_instance, owner_class) returning a value
    This is how all methods and the get on a property work.

  • obj_instance.descriptor = value invokes
    descriptor.__set__(self, obj_instance, value) returning None
    This is how the setter on a property works.

  • del obj_instance.descriptor invokes
    descriptor.__delete__(self, obj_instance) returning None
    This is how the deleter on a property works.

obj_instance is the instance whose class contains the descriptor object's instance. self is the instance of the descriptor (probably just one for the class of the obj_instance)

To define this with code, an object is a descriptor if the set of its attributes intersects with any of the required attributes:

def has_descriptor_attrs(obj):
    return set(['__get__', '__set__', '__delete__']).intersection(dir(obj))

def is_descriptor(obj):
    """obj can be instance of descriptor or the descriptor class"""
    return bool(has_descriptor_attrs(obj))

A Data Descriptor has a __set__ and/or __delete__.
A Non-Data-Descriptor has neither __set__ nor __delete__.

def has_data_descriptor_attrs(obj):
    return set(['__set__', '__delete__']) & set(dir(obj))

def is_data_descriptor(obj):
    return bool(has_data_descriptor_attrs(obj))

Builtin Descriptor Object Examples:

  • classmethod
  • staticmethod
  • property
  • functions in general

Non-Data Descriptors

We can see that classmethod and staticmethod are Non-Data-Descriptors:

>>> is_descriptor(classmethod), is_data_descriptor(classmethod)
(True, False)
>>> is_descriptor(staticmethod), is_data_descriptor(staticmethod)
(True, False)

Both only have the __get__ method:

>>> has_descriptor_attrs(classmethod), has_descriptor_attrs(staticmethod)
(set(['__get__']), set(['__get__']))

Note that all functions are also Non-Data-Descriptors:

>>> def foo(): pass
... 
>>> is_descriptor(foo), is_data_descriptor(foo)
(True, False)

Data Descriptor, property

However, property is a Data-Descriptor:

>>> is_data_descriptor(property)
True
>>> has_descriptor_attrs(property)
set(['__set__', '__get__', '__delete__'])

Dotted Lookup Order

These are important distinctions, as they affect the lookup order for a dotted lookup.

obj_instance.attribute
  1. First the above looks to see if the attribute is a Data-Descriptor on the class of the instance,
  2. If not, it looks to see if the attribute is in the obj_instance's __dict__, then
  3. it finally falls back to a Non-Data-Descriptor.

The consequence of this lookup order is that Non-Data-Descriptors like functions/methods can be overridden by instances.

Recap and Next Steps

We have learned that descriptors are objects with any of __get__, __set__, or __delete__. These descriptor objects can be used as attributes on other object class definitions. Now we will look at how they are used, using your code as an example.


Analysis of Code from the Question

Here's your code, followed by your questions and answers to each:

class Celsius(object):
    def __init__(self, value=0.0):
        self.value = float(value)
    def __get__(self, instance, owner):
        return self.value
    def __set__(self, instance, value):
        self.value = float(value)

class Temperature(object):
    celsius = Celsius()
  1. Why do I need the descriptor class?

Your descriptor ensures you always have a float for this class attribute of Temperature, and that you can't use del to delete the attribute:

>>> t1 = Temperature()
>>> del t1.celsius
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: __delete__

Otherwise, your descriptors ignore the owner-class and instances of the owner, instead, storing state in the descriptor. You could just as easily share state across all instances with a simple class attribute (so long as you always set it as a float to the class and never delete it, or are comfortable with users of your code doing so):

class Temperature(object):
    celsius = 0.0

This gets you exactly the same behavior as your example (see response to question 3 below), but uses a Pythons builtin (property), and would be considered more idiomatic:

class Temperature(object):
    _celsius = 0.0
    @property
    def celsius(self):
        return type(self)._celsius
    @celsius.setter
    def celsius(self, value):
        type(self)._celsius = float(value)
  1. What is instance and owner here? (in get). What is the purpose of these parameters?

instance is the instance of the owner that is calling the descriptor. The owner is the class in which the descriptor object is used to manage access to the data point. See the descriptions of the special methods that define descriptors next to the first paragraph of this answer for more descriptive variable names.

  1. How would I call/use this example?

Here's a demonstration:

>>> t1 = Temperature()
>>> t1.celsius
0.0
>>> t1.celsius = 1
>>> 
>>> t1.celsius
1.0
>>> t2 = Temperature()
>>> t2.celsius
1.0

You can't delete the attribute:

>>> del t2.celsius
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: __delete__

And you can't assign a variable that can't be converted to a float:

>>> t1.celsius = '0x02'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 7, in __set__
ValueError: invalid literal for float(): 0x02

Otherwise, what you have here is a global state for all instances, that is managed by assigning to any instance.

The expected way that most experienced Python programmers would accomplish this outcome would be to use the property decorator, which makes use of the same descriptors under the hood, but brings the behavior into the implementation of the owner class (again, as defined above):

class Temperature(object):
    _celsius = 0.0
    @property
    def celsius(self):
        return type(self)._celsius
    @celsius.setter
    def celsius(self, value):
        type(self)._celsius = float(value)

Which has the exact same expected behavior of the original piece of code:

>>> t1 = Temperature()
>>> t2 = Temperature()
>>> t1.celsius
0.0
>>> t1.celsius = 1.0
>>> t2.celsius
1.0
>>> del t1.celsius
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: can't delete attribute
>>> t1.celsius = '0x02'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 8, in celsius
ValueError: invalid literal for float(): 0x02

Conclusion

We've covered the attributes that define descriptors, the difference between data- and non-data-descriptors, builtin objects that use them, and specific questions about use.

So again, how would you use the question's example? I hope you wouldn't. I hope you would start with my first suggestion (a simple class attribute) and move on to the second suggestion (the property decorator) if you feel it is necessary.