I am trying to understand what Python's descriptors are and what they can be useful for.
Descriptors are class attributes (like properties or methods) with any of the following special methods:
- __get__ (non-data descriptor method, for example on a method/function)
- __set__ (data descriptor method, for example on a property instance)
- __delete__ (data descriptor method)
These descriptor objects can be used as attributes on other object class definitions. (That is, they live in the __dict__ of the class object.)
Descriptor objects can be used to programmatically manage the results of a dotted lookup (e.g. foo.descriptor) in a normal expression, an assignment, and even a deletion.
Functions/methods, bound methods, property, classmethod, and staticmethod all use these special methods to control how they are accessed via the dotted lookup.
A data descriptor, like property, can allow for lazy evaluation of attributes based on a simpler state of the object, allowing instances to use less memory than if you precomputed each possible attribute.
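As a minimal sketch of that idea (the Circle class and its attribute names are hypothetical, not from the question), a property can derive a value from simpler stored state instead of keeping it on the instance:

```python
import math


class Circle:
    """Hypothetical class: stores only the radius, derives the area on access."""

    def __init__(self, radius):
        self.radius = radius

    @property
    def area(self):
        # Computed each time it is read; never stored on the instance.
        return math.pi * self.radius ** 2


c = Circle(2)
print(c.area)             # computed from c.radius at lookup time
print('area' in vars(c))  # False: only 'radius' lives in the instance __dict__
```

Because the value is recomputed on each access, it also stays in sync if radius changes later.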
Another data descriptor, a member_descriptor, created by __slots__, allows memory savings by letting the class store data in a mutable tuple-like data structure instead of the more flexible but space-consuming __dict__.
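To see the descriptor that __slots__ creates, here is a small sketch. The Point class is hypothetical, and the type name shown is what CPython reports; other implementations may name it differently.

```python
class Point:
    """Hypothetical class that uses __slots__ instead of a per-instance __dict__."""

    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
slot = Point.__dict__['x']      # fetch the raw descriptor from the class

print(type(slot).__name__)      # member_descriptor (on CPython)
print(hasattr(p, '__dict__'))   # False: no per-instance __dict__
print(slot.__get__(p, Point))   # 1, the same value as p.x
```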
Non-data descriptors, usually instance and class methods, get their implicit first arguments (usually named self and cls, respectively) from their non-data descriptor method, __get__. Static methods use the same method to return the underlying function with no implicit first argument.
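The binding can be reproduced by hand. In this sketch (the Greeter class is hypothetical), the raw objects are pulled out of the class __dict__ and their __get__ is called directly, which is roughly what a dotted lookup does:

```python
class Greeter:
    """Hypothetical class with one method of each kind."""

    def hello(self):
        return self

    @classmethod
    def make(cls):
        return cls

    @staticmethod
    def plain():
        return 'no implicit argument'


g = Greeter()

# A plain function's __get__ returns a method bound to the instance.
raw_function = Greeter.__dict__['hello']
bound = raw_function.__get__(g, Greeter)
assert bound() is g

# A classmethod's __get__ binds the owner class, even via an instance.
raw_classmethod = Greeter.__dict__['make']
assert raw_classmethod.__get__(g, Greeter)() is Greeter

# A staticmethod's __get__ hands back the function with nothing bound.
raw_static = Greeter.__dict__['plain']
assert raw_static.__get__(g, Greeter)() == 'no implicit argument'
```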
Most users of Python need to learn only the simple usage, and have no need to learn or understand the implementation of descriptors further.
A descriptor is an object with any of the following methods (__get__, __set__, or __delete__), intended to be used via dotted-lookup as if it were a typical attribute of an instance. For an owner-object, obj_instance, with a descriptor object:
- obj_instance.descriptor invokes descriptor.__get__(self, obj_instance, owner_class), returning a value. This is how all methods and the get on a property work.
- obj_instance.descriptor = value invokes descriptor.__set__(self, obj_instance, value), returning None. This is how the setter on a property works.
- del obj_instance.descriptor invokes descriptor.__delete__(self, obj_instance), returning None. This is how the deleter on a property works.
obj_instance is the instance whose class contains the descriptor object's instance. self is the instance of the descriptor (probably just one for the class of the obj_instance).
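The three invocations above can be observed with a small tracing descriptor. This is only an illustrative sketch; Traced and Owner are hypothetical names.

```python
class Traced:
    """Hypothetical descriptor that records which special method was invoked."""

    def __init__(self):
        self.calls = []

    def __get__(self, obj_instance, owner_class=None):
        self.calls.append(('__get__', obj_instance, owner_class))
        return 'value'

    def __set__(self, obj_instance, value):
        self.calls.append(('__set__', obj_instance, value))

    def __delete__(self, obj_instance):
        self.calls.append(('__delete__', obj_instance))


class Owner:
    descriptor = Traced()


obj_instance = Owner()
obj_instance.descriptor          # dotted lookup -> __get__
obj_instance.descriptor = 42     # assignment    -> __set__
del obj_instance.descriptor      # deletion      -> __delete__

# Read the descriptor from the class __dict__ so that __get__ is not triggered again.
traced = Owner.__dict__['descriptor']
print([call[0] for call in traced.calls])  # ['__get__', '__set__', '__delete__']
```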
To define this with code, an object is a descriptor if the set of its attributes intersects with any of the required attributes:
def has_descriptor_attrs(obj):
    return set(['__get__', '__set__', '__delete__']).intersection(dir(obj))

def is_descriptor(obj):
    """obj can be instance of descriptor or the descriptor class"""
    return bool(has_descriptor_attrs(obj))
A Data Descriptor has a __set__ and/or __delete__.
A Non-Data-Descriptor has neither __set__ nor __delete__.
def has_data_descriptor_attrs(obj):
    return set(['__set__', '__delete__']) & set(dir(obj))

def is_data_descriptor(obj):
    return bool(has_data_descriptor_attrs(obj))
Builtin descriptor object examples include classmethod, staticmethod, and property.
We can see that classmethod and staticmethod are Non-Data-Descriptors:
>>> is_descriptor(classmethod), is_data_descriptor(classmethod)
(True, False)
>>> is_descriptor(staticmethod), is_data_descriptor(staticmethod)
(True, False)
Both only have the __get__ method:
>>> has_descriptor_attrs(classmethod), has_descriptor_attrs(staticmethod)
(set(['__get__']), set(['__get__']))
Note that all functions are also Non-Data-Descriptors:
>>> def foo(): pass
...
>>> is_descriptor(foo), is_data_descriptor(foo)
(True, False)
However, property is a Data-Descriptor:
>>> is_data_descriptor(property)
True
>>> has_descriptor_attrs(property)
set(['__set__', '__get__', '__delete__'])
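These are important distinctions, as they affect the lookup order for a dotted lookup such as obj_instance.attribute:
- First, the lookup checks whether the attribute is a Data-Descriptor on the class of the instance.
- If not, it looks in obj_instance's __dict__, then
- it finally falls back to a Non-Data-Descriptor.
The consequence of this lookup order is that Non-Data-Descriptors like functions/methods can be overridden by instances.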
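A short sketch makes the precedence visible. The NonData, Data, and Owner classes are hypothetical; the instance __dict__ is written to directly so that no __set__ is involved.

```python
class NonData:
    """Hypothetical non-data descriptor: defines only __get__."""

    def __get__(self, obj, owner=None):
        return 'from non-data descriptor'


class Data:
    """Hypothetical data descriptor: defines __get__ and __set__."""

    def __get__(self, obj, owner=None):
        return 'from data descriptor'

    def __set__(self, obj, value):
        raise AttributeError('read-only')


class Owner:
    non_data = NonData()
    data = Data()


o = Owner()
# Write straight into the instance __dict__, bypassing any __set__.
vars(o)['non_data'] = 'from instance __dict__'
vars(o)['data'] = 'from instance __dict__'

print(o.non_data)  # from instance __dict__  (instance wins over non-data descriptor)
print(o.data)      # from data descriptor    (data descriptor wins over instance)
```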
We have learned that descriptors are objects with any of __get__, __set__, or __delete__. These descriptor objects can be used as attributes on other object class definitions. Now we will look at how they are used, using your code as an example.
Here's your code, followed by your questions and answers to each:
class Celsius(object):
    def __init__(self, value=0.0):
        self.value = float(value)
    def __get__(self, instance, owner):
        return self.value
    def __set__(self, instance, value):
        self.value = float(value)

class Temperature(object):
    celsius = Celsius()
- Why do I need the descriptor class?
Your descriptor ensures you always have a float for this class attribute of Temperature, and that you can't use del to delete the attribute:
>>> t1 = Temperature()
>>> del t1.celsius
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: __delete__
Otherwise, your descriptor ignores the owner class and its instances, storing state in the descriptor itself instead. You could just as easily share state across all instances with a simple class attribute (so long as you always set it as a float on the class and never delete it, or are comfortable with users of your code doing so):
class Temperature(object):
    celsius = 0.0
The following gets you exactly the same behavior as your example (see response to question 3 below), but uses a Python builtin (property), and would be considered more idiomatic:
class Temperature(object):
    _celsius = 0.0
    @property
    def celsius(self):
        return type(self)._celsius
    @celsius.setter
    def celsius(self, value):
        type(self)._celsius = float(value)
- What is instance and owner here? (in __get__). What is the purpose of these parameters?
instance is the instance of the owner that is calling the descriptor. The owner is the class in which the descriptor object is used to manage access to the data point. See the descriptions of the special methods that define descriptors next to the first paragraph of this answer for more descriptive variable names.
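To make the two parameters concrete, here is a sketch with a hypothetical descriptor, Show, that simply returns whatever it receives. Note that instance is None when the attribute is looked up on the class itself:

```python
class Show:
    """Hypothetical descriptor that returns the arguments passed to __get__."""

    def __get__(self, instance, owner=None):
        return (instance, owner)


class Temperature:
    celsius = Show()


t = Temperature()
assert t.celsius == (t, Temperature)               # lookup through an instance
assert Temperature.celsius == (None, Temperature)  # lookup through the class
```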
- How would I call/use this example?
Here's a demonstration:
>>> t1 = Temperature()
>>> t1.celsius
0.0
>>> t1.celsius = 1
>>>
>>> t1.celsius
1.0
>>> t2 = Temperature()
>>> t2.celsius
1.0
You can't delete the attribute:
>>> del t2.celsius
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: __delete__
And you can't assign a value that can't be converted to a float:
>>> t1.celsius = '0x02'
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 7, in __set__
ValueError: invalid literal for float(): 0x02
Otherwise, what you have here is a global state for all instances, that is managed by assigning to any instance.
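For contrast, a descriptor that kept one value per instance would have to store it on the instance rather than on itself. The following is only a sketch of one way to do that, assuming Python 3.6+ for __set_name__; the storage_name attribute is a hypothetical helper, not part of the question's code.

```python
class Celsius:
    """Variant of the question's descriptor that keeps one value per instance."""

    def __set_name__(self, owner, name):
        # Called when the owner class is created; remember where to store data.
        self.storage_name = '_' + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class: return the descriptor itself
        return getattr(instance, self.storage_name, 0.0)

    def __set__(self, instance, value):
        setattr(instance, self.storage_name, float(value))


class Temperature:
    celsius = Celsius()


t1, t2 = Temperature(), Temperature()
t1.celsius = 1
print(t1.celsius, t2.celsius)  # 1.0 0.0
```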
The expected way that most experienced Python programmers would accomplish this outcome would be to use the property decorator, which makes use of the same descriptors under the hood, but brings the behavior into the implementation of the owner class (again, as defined above):
class Temperature(object):
    _celsius = 0.0
    @property
    def celsius(self):
        return type(self)._celsius
    @celsius.setter
    def celsius(self, value):
        type(self)._celsius = float(value)
This has exactly the same behavior as the original piece of code:
>>> t1 = Temperature()
>>> t2 = Temperature()
>>> t1.celsius
0.0
>>> t1.celsius = 1.0
>>> t2.celsius
1.0
>>> del t1.celsius
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: can't delete attribute
>>> t1.celsius = '0x02'
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 8, in celsius
ValueError: invalid literal for float(): 0x02
We've covered the attributes that define descriptors, the difference between data- and non-data-descriptors, builtin objects that use them, and specific questions about use.
So again, how would you use the question's example? I hope you wouldn't. I hope you would start with my first suggestion (a simple class attribute) and move on to the second suggestion (the property decorator) if you feel it is necessary.