What is the purpose of __slots__
in Python — especially with respect to when I would want to use it, and when not?
This question is related to
python
oop
python-internals
slots
Quoting Jacob Hallen:
The proper use of
__slots__
is to save space in objects. Instead of having a dynamic dict that allows adding attributes to objects at anytime, there is a static structure which does not allow additions after creation. [This use of__slots__
eliminates the overhead of one dict for every object.] While this is sometimes a useful optimization, it would be completely unnecessary if the Python interpreter was dynamic enough so that it would only require the dict when there actually were additions to the object.Unfortunately there is a side effect to slots. They change the behavior of the objects that have slots in a way that can be abused by control freaks and static typing weenies. This is bad, because the control freaks should be abusing the metaclasses and the static typing weenies should be abusing decorators, since in Python, there should be only one obvious way of doing something.
Making CPython smart enough to handle saving space without
__slots__
is a major undertaking, which is probably why it is not on the list of changes for P3k (yet).
Each python object has a __dict__
atttribute which is a dictionary containing all other attributes. e.g. when you type self.attr
python is actually doing self.__dict__['attr']
. As you can imagine using a dictionary to store attribute takes some extra space & time for accessing it.
However, when you use __slots__
, any object created for that class won't have a __dict__
attribute. Instead, all attribute access is done directly via pointers.
So if want a C style structure rather than a full fledged class you can use __slots__
for compacting size of the objects & reducing attribute access time. A good example is a Point class containing attributes x & y. If you are going to have a lot of points, you can try using __slots__
in order to conserve some memory.
Another somewhat obscure use of __slots__
is to add attributes to an object proxy from the ProxyTypes package, formerly part of the PEAK project. Its ObjectWrapper
allows you to proxy another object, but intercept all interactions with the proxied object. It is not very commonly used (and no Python 3 support), but we have used it to implement a thread-safe blocking wrapper around an async implementation based on tornado that bounces all access to the proxied object through the ioloop, using thread-safe concurrent.Future
objects to synchronise and return results.
By default any attribute access to the proxy object will give you the result from the proxied object. If you need to add an attribute on the proxy object, __slots__
can be used.
from peak.util.proxies import ObjectWrapper
class Original(object):
def __init__(self):
self.name = 'The Original'
class ProxyOriginal(ObjectWrapper):
__slots__ = ['proxy_name']
def __init__(self, subject, proxy_name):
# proxy_info attributed added directly to the
# Original instance, not the ProxyOriginal instance
self.proxy_info = 'You are proxied by {}'.format(proxy_name)
# proxy_name added to ProxyOriginal instance, since it is
# defined in __slots__
self.proxy_name = proxy_name
super(ProxyOriginal, self).__init__(subject)
if __name__ == "__main__":
original = Original()
proxy = ProxyOriginal(original, 'Proxy Overlord')
# Both statements print "The Original"
print "original.name: ", original.name
print "proxy.name: ", proxy.name
# Both statements below print
# "You are proxied by Proxy Overlord", since the ProxyOriginal
# __init__ sets it to the original object
print "original.proxy_info: ", original.proxy_info
print "proxy.proxy_info: ", proxy.proxy_info
# prints "Proxy Overlord"
print "proxy.proxy_name: ", proxy.proxy_name
# Raises AttributeError since proxy_name is only set on
# the proxy object
print "original.proxy_name: ", proxy.proxy_name
A very simple example of __slot__
attribute.
__slots__
If I don't have __slot__
attribute in my class, I can add new attributes to my objects.
class Test:
pass
obj1=Test()
obj2=Test()
print(obj1.__dict__) #--> {}
obj1.x=12
print(obj1.__dict__) # --> {'x': 12}
obj1.y=20
print(obj1.__dict__) # --> {'x': 12, 'y': 20}
obj2.x=99
print(obj2.__dict__) # --> {'x': 99}
If you look at example above, you can see that obj1 and obj2 have their own x and y attributes and python has also created a dict
attribute for each object (obj1 and obj2).
Suppose if my class Test has thousands of such objects? Creating an additional attribute dict
for each object will cause lot of overhead (memory, computing power etc.) in my code.
__slots__
Now in the following example my class Test contains __slots__
attribute. Now I can't add new attributes to my objects (except attribute x
) and python doesn't create a dict
attribute anymore. This eliminates overhead for each object, which can become significant if you have many objects.
class Test:
__slots__=("x")
obj1=Test()
obj2=Test()
obj1.x=12
print(obj1.x) # --> 12
obj2.x=99
print(obj2.x) # --> 99
obj1.y=28
print(obj1.y) # --> AttributeError: 'Test' object has no attribute 'y'
Each python object has a __dict__
atttribute which is a dictionary containing all other attributes. e.g. when you type self.attr
python is actually doing self.__dict__['attr']
. As you can imagine using a dictionary to store attribute takes some extra space & time for accessing it.
However, when you use __slots__
, any object created for that class won't have a __dict__
attribute. Instead, all attribute access is done directly via pointers.
So if want a C style structure rather than a full fledged class you can use __slots__
for compacting size of the objects & reducing attribute access time. A good example is a Point class containing attributes x & y. If you are going to have a lot of points, you can try using __slots__
in order to conserve some memory.
A very simple example of __slot__
attribute.
__slots__
If I don't have __slot__
attribute in my class, I can add new attributes to my objects.
class Test:
pass
obj1=Test()
obj2=Test()
print(obj1.__dict__) #--> {}
obj1.x=12
print(obj1.__dict__) # --> {'x': 12}
obj1.y=20
print(obj1.__dict__) # --> {'x': 12, 'y': 20}
obj2.x=99
print(obj2.__dict__) # --> {'x': 99}
If you look at example above, you can see that obj1 and obj2 have their own x and y attributes and python has also created a dict
attribute for each object (obj1 and obj2).
Suppose if my class Test has thousands of such objects? Creating an additional attribute dict
for each object will cause lot of overhead (memory, computing power etc.) in my code.
__slots__
Now in the following example my class Test contains __slots__
attribute. Now I can't add new attributes to my objects (except attribute x
) and python doesn't create a dict
attribute anymore. This eliminates overhead for each object, which can become significant if you have many objects.
class Test:
__slots__=("x")
obj1=Test()
obj2=Test()
obj1.x=12
print(obj1.x) # --> 12
obj2.x=99
print(obj2.x) # --> 99
obj1.y=28
print(obj1.y) # --> AttributeError: 'Test' object has no attribute 'y'
You would want to use __slots__
if you are going to instantiate a lot (hundreds, thousands) of objects of the same class. __slots__
only exists as a memory optimization tool.
It's highly discouraged to use __slots__
for constraining attribute creation.
Pickling objects with __slots__
won't work with the default (oldest) pickle protocol; it's necessary to specify a later version.
Some other introspection features of python may also be adversely affected.
You have — essentially — no use for __slots__
.
For the time when you think you might need __slots__
, you actually want to use Lightweight or Flyweight design patterns. These are cases when you no longer want to use purely Python objects. Instead, you want a Python object-like wrapper around an array, struct, or numpy array.
class Flyweight(object):
def get(self, theData, index):
return theData[index]
def set(self, theData, index, value):
theData[index]= value
The class-like wrapper has no attributes — it just provides methods that act on the underlying data. The methods can be reduced to class methods. Indeed, it could be reduced to just functions operating on the underlying array of data.
You would want to use __slots__
if you are going to instantiate a lot (hundreds, thousands) of objects of the same class. __slots__
only exists as a memory optimization tool.
It's highly discouraged to use __slots__
for constraining attribute creation.
Pickling objects with __slots__
won't work with the default (oldest) pickle protocol; it's necessary to specify a later version.
Some other introspection features of python may also be adversely affected.
Slots are very useful for library calls to eliminate the "named method dispatch" when making function calls. This is mentioned in the SWIG documentation. For high performance libraries that want to reduce function overhead for commonly called functions using slots is much faster.
Now this may not be directly related to the OPs question. It is related more to building extensions than it does to using the slots syntax on an object. But it does help complete the picture for the usage of slots and some of the reasoning behind them.
The original question was about general use cases not only about memory. So it should be mentioned here that you also get better performance when instantiating large amounts of objects - interesting e.g. when parsing large documents into objects or from a database.
Here is a comparison of creating object trees with a million entries, using slots and without slots. As a reference also the performance when using plain dicts for the trees (Py2.7.10 on OSX):
********** RUN 1 **********
1.96036410332 <class 'css_tree_select.element.Element'>
3.02922606468 <class 'css_tree_select.element.ElementNoSlots'>
2.90828204155 dict
********** RUN 2 **********
1.77050495148 <class 'css_tree_select.element.Element'>
3.10655999184 <class 'css_tree_select.element.ElementNoSlots'>
2.84120798111 dict
********** RUN 3 **********
1.84069895744 <class 'css_tree_select.element.Element'>
3.21540498734 <class 'css_tree_select.element.ElementNoSlots'>
2.59615707397 dict
********** RUN 4 **********
1.75041103363 <class 'css_tree_select.element.Element'>
3.17366290092 <class 'css_tree_select.element.ElementNoSlots'>
2.70941114426 dict
Test classes (ident, appart from slots):
class Element(object):
__slots__ = ['_typ', 'id', 'parent', 'childs']
def __init__(self, typ, id, parent=None):
self._typ = typ
self.id = id
self.childs = []
if parent:
self.parent = parent
parent.childs.append(self)
class ElementNoSlots(object): (same, w/o slots)
testcode, verbose mode:
na, nb, nc = 100, 100, 100
for i in (1, 2, 3, 4):
print '*' * 10, 'RUN', i, '*' * 10
# tree with slot and no slot:
for cls in Element, ElementNoSlots:
t1 = time.time()
root = cls('root', 'root')
for i in xrange(na):
ela = cls(typ='a', id=i, parent=root)
for j in xrange(nb):
elb = cls(typ='b', id=(i, j), parent=ela)
for k in xrange(nc):
elc = cls(typ='c', id=(i, j, k), parent=elb)
to = time.time() - t1
print to, cls
del root
# ref: tree with dicts only:
t1 = time.time()
droot = {'childs': []}
for i in xrange(na):
ela = {'typ': 'a', id: i, 'childs': []}
droot['childs'].append(ela)
for j in xrange(nb):
elb = {'typ': 'b', id: (i, j), 'childs': []}
ela['childs'].append(elb)
for k in xrange(nc):
elc = {'typ': 'c', id: (i, j, k), 'childs': []}
elb['childs'].append(elc)
td = time.time() - t1
print td, 'dict'
del droot
You have — essentially — no use for __slots__
.
For the time when you think you might need __slots__
, you actually want to use Lightweight or Flyweight design patterns. These are cases when you no longer want to use purely Python objects. Instead, you want a Python object-like wrapper around an array, struct, or numpy array.
class Flyweight(object):
def get(self, theData, index):
return theData[index]
def set(self, theData, index, value):
theData[index]= value
The class-like wrapper has no attributes — it just provides methods that act on the underlying data. The methods can be reduced to class methods. Indeed, it could be reduced to just functions operating on the underlying array of data.
The original question was about general use cases not only about memory. So it should be mentioned here that you also get better performance when instantiating large amounts of objects - interesting e.g. when parsing large documents into objects or from a database.
Here is a comparison of creating object trees with a million entries, using slots and without slots. As a reference also the performance when using plain dicts for the trees (Py2.7.10 on OSX):
********** RUN 1 **********
1.96036410332 <class 'css_tree_select.element.Element'>
3.02922606468 <class 'css_tree_select.element.ElementNoSlots'>
2.90828204155 dict
********** RUN 2 **********
1.77050495148 <class 'css_tree_select.element.Element'>
3.10655999184 <class 'css_tree_select.element.ElementNoSlots'>
2.84120798111 dict
********** RUN 3 **********
1.84069895744 <class 'css_tree_select.element.Element'>
3.21540498734 <class 'css_tree_select.element.ElementNoSlots'>
2.59615707397 dict
********** RUN 4 **********
1.75041103363 <class 'css_tree_select.element.Element'>
3.17366290092 <class 'css_tree_select.element.ElementNoSlots'>
2.70941114426 dict
Test classes (ident, appart from slots):
class Element(object):
__slots__ = ['_typ', 'id', 'parent', 'childs']
def __init__(self, typ, id, parent=None):
self._typ = typ
self.id = id
self.childs = []
if parent:
self.parent = parent
parent.childs.append(self)
class ElementNoSlots(object): (same, w/o slots)
testcode, verbose mode:
na, nb, nc = 100, 100, 100
for i in (1, 2, 3, 4):
print '*' * 10, 'RUN', i, '*' * 10
# tree with slot and no slot:
for cls in Element, ElementNoSlots:
t1 = time.time()
root = cls('root', 'root')
for i in xrange(na):
ela = cls(typ='a', id=i, parent=root)
for j in xrange(nb):
elb = cls(typ='b', id=(i, j), parent=ela)
for k in xrange(nc):
elc = cls(typ='c', id=(i, j, k), parent=elb)
to = time.time() - t1
print to, cls
del root
# ref: tree with dicts only:
t1 = time.time()
droot = {'childs': []}
for i in xrange(na):
ela = {'typ': 'a', id: i, 'childs': []}
droot['childs'].append(ela)
for j in xrange(nb):
elb = {'typ': 'b', id: (i, j), 'childs': []}
ela['childs'].append(elb)
for k in xrange(nc):
elc = {'typ': 'c', id: (i, j, k), 'childs': []}
elb['childs'].append(elc)
td = time.time() - t1
print td, 'dict'
del droot
Each python object has a __dict__
atttribute which is a dictionary containing all other attributes. e.g. when you type self.attr
python is actually doing self.__dict__['attr']
. As you can imagine using a dictionary to store attribute takes some extra space & time for accessing it.
However, when you use __slots__
, any object created for that class won't have a __dict__
attribute. Instead, all attribute access is done directly via pointers.
So if want a C style structure rather than a full fledged class you can use __slots__
for compacting size of the objects & reducing attribute access time. A good example is a Point class containing attributes x & y. If you are going to have a lot of points, you can try using __slots__
in order to conserve some memory.
Quoting Jacob Hallen:
The proper use of
__slots__
is to save space in objects. Instead of having a dynamic dict that allows adding attributes to objects at anytime, there is a static structure which does not allow additions after creation. [This use of__slots__
eliminates the overhead of one dict for every object.] While this is sometimes a useful optimization, it would be completely unnecessary if the Python interpreter was dynamic enough so that it would only require the dict when there actually were additions to the object.Unfortunately there is a side effect to slots. They change the behavior of the objects that have slots in a way that can be abused by control freaks and static typing weenies. This is bad, because the control freaks should be abusing the metaclasses and the static typing weenies should be abusing decorators, since in Python, there should be only one obvious way of doing something.
Making CPython smart enough to handle saving space without
__slots__
is a major undertaking, which is probably why it is not on the list of changes for P3k (yet).
You would want to use __slots__
if you are going to instantiate a lot (hundreds, thousands) of objects of the same class. __slots__
only exists as a memory optimization tool.
It's highly discouraged to use __slots__
for constraining attribute creation.
Pickling objects with __slots__
won't work with the default (oldest) pickle protocol; it's necessary to specify a later version.
Some other introspection features of python may also be adversely affected.
Each python object has a __dict__
atttribute which is a dictionary containing all other attributes. e.g. when you type self.attr
python is actually doing self.__dict__['attr']
. As you can imagine using a dictionary to store attribute takes some extra space & time for accessing it.
However, when you use __slots__
, any object created for that class won't have a __dict__
attribute. Instead, all attribute access is done directly via pointers.
So if want a C style structure rather than a full fledged class you can use __slots__
for compacting size of the objects & reducing attribute access time. A good example is a Point class containing attributes x & y. If you are going to have a lot of points, you can try using __slots__
in order to conserve some memory.
Quoting Jacob Hallen:
The proper use of
__slots__
is to save space in objects. Instead of having a dynamic dict that allows adding attributes to objects at anytime, there is a static structure which does not allow additions after creation. [This use of__slots__
eliminates the overhead of one dict for every object.] While this is sometimes a useful optimization, it would be completely unnecessary if the Python interpreter was dynamic enough so that it would only require the dict when there actually were additions to the object.Unfortunately there is a side effect to slots. They change the behavior of the objects that have slots in a way that can be abused by control freaks and static typing weenies. This is bad, because the control freaks should be abusing the metaclasses and the static typing weenies should be abusing decorators, since in Python, there should be only one obvious way of doing something.
Making CPython smart enough to handle saving space without
__slots__
is a major undertaking, which is probably why it is not on the list of changes for P3k (yet).
Slots are very useful for library calls to eliminate the "named method dispatch" when making function calls. This is mentioned in the SWIG documentation. For high performance libraries that want to reduce function overhead for commonly called functions using slots is much faster.
Now this may not be directly related to the OPs question. It is related more to building extensions than it does to using the slots syntax on an object. But it does help complete the picture for the usage of slots and some of the reasoning behind them.
An attribute of a class instance has 3 properties: the instance, the name of the attribute, and the value of the attribute.
In regular attribute access, the instance acts as a dictionary and the name of the attribute acts as the key in that dictionary looking up value.
instance(attribute) --> value
In __slots__ access, the name of the attribute acts as the dictionary and the instance acts as the key in the dictionary looking up value.
attribute(instance) --> value
In flyweight pattern, the name of the attribute acts as the dictionary and the value acts as the key in that dictionary looking up the instance.
attribute(value) --> instance
Another somewhat obscure use of __slots__
is to add attributes to an object proxy from the ProxyTypes package, formerly part of the PEAK project. Its ObjectWrapper
allows you to proxy another object, but intercept all interactions with the proxied object. It is not very commonly used (and no Python 3 support), but we have used it to implement a thread-safe blocking wrapper around an async implementation based on tornado that bounces all access to the proxied object through the ioloop, using thread-safe concurrent.Future
objects to synchronise and return results.
By default any attribute access to the proxy object will give you the result from the proxied object. If you need to add an attribute on the proxy object, __slots__
can be used.
from peak.util.proxies import ObjectWrapper
class Original(object):
def __init__(self):
self.name = 'The Original'
class ProxyOriginal(ObjectWrapper):
__slots__ = ['proxy_name']
def __init__(self, subject, proxy_name):
# proxy_info attributed added directly to the
# Original instance, not the ProxyOriginal instance
self.proxy_info = 'You are proxied by {}'.format(proxy_name)
# proxy_name added to ProxyOriginal instance, since it is
# defined in __slots__
self.proxy_name = proxy_name
super(ProxyOriginal, self).__init__(subject)
if __name__ == "__main__":
original = Original()
proxy = ProxyOriginal(original, 'Proxy Overlord')
# Both statements print "The Original"
print "original.name: ", original.name
print "proxy.name: ", proxy.name
# Both statements below print
# "You are proxied by Proxy Overlord", since the ProxyOriginal
# __init__ sets it to the original object
print "original.proxy_info: ", original.proxy_info
print "proxy.proxy_info: ", proxy.proxy_info
# prints "Proxy Overlord"
print "proxy.proxy_name: ", proxy.proxy_name
# Raises AttributeError since proxy_name is only set on
# the proxy object
print "original.proxy_name: ", proxy.proxy_name
An attribute of a class instance has 3 properties: the instance, the name of the attribute, and the value of the attribute.
In regular attribute access, the instance acts as a dictionary and the name of the attribute acts as the key in that dictionary looking up value.
instance(attribute) --> value
In __slots__ access, the name of the attribute acts as the dictionary and the instance acts as the key in the dictionary looking up value.
attribute(instance) --> value
In flyweight pattern, the name of the attribute acts as the dictionary and the value acts as the key in that dictionary looking up the instance.
attribute(value) --> instance
In addition to the other answers, here is an example of using __slots__
:
>>> class Test(object): #Must be new-style class!
... __slots__ = ['x', 'y']
...
>>> pt = Test()
>>> dir(pt)
['__class__', '__delattr__', '__doc__', '__getattribute__', '__hash__',
'__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__',
'__repr__', '__setattr__', '__slots__', '__str__', 'x', 'y']
>>> pt.x
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: x
>>> pt.x = 1
>>> pt.x
1
>>> pt.z = 2
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'Test' object has no attribute 'z'
>>> pt.__dict__
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'Test' object has no attribute '__dict__'
>>> pt.__slots__
['x', 'y']
So, to implement __slots__
, it only takes an extra line (and making your class a new-style class if it isn't already). This way you can reduce the memory footprint of those classes 5-fold, at the expense of having to write custom pickle code, if and when that becomes necessary.
You have — essentially — no use for __slots__
.
For the time when you think you might need __slots__
, you actually want to use Lightweight or Flyweight design patterns. These are cases when you no longer want to use purely Python objects. Instead, you want a Python object-like wrapper around an array, struct, or numpy array.
class Flyweight(object):
def get(self, theData, index):
return theData[index]
def set(self, theData, index, value):
theData[index]= value
The class-like wrapper has no attributes — it just provides methods that act on the underlying data. The methods can be reduced to class methods. Indeed, it could be reduced to just functions operating on the underlying array of data.
In addition to the other answers, here is an example of using __slots__
:
>>> class Test(object): #Must be new-style class!
... __slots__ = ['x', 'y']
...
>>> pt = Test()
>>> dir(pt)
['__class__', '__delattr__', '__doc__', '__getattribute__', '__hash__',
'__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__',
'__repr__', '__setattr__', '__slots__', '__str__', 'x', 'y']
>>> pt.x
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: x
>>> pt.x = 1
>>> pt.x
1
>>> pt.z = 2
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'Test' object has no attribute 'z'
>>> pt.__dict__
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'Test' object has no attribute '__dict__'
>>> pt.__slots__
['x', 'y']
So, to implement __slots__
, it only takes an extra line (and making your class a new-style class if it isn't already). This way you can reduce the memory footprint of those classes 5-fold, at the expense of having to write custom pickle code, if and when that becomes necessary.
You have — essentially — no use for __slots__
.
For the time when you think you might need __slots__
, you actually want to use Lightweight or Flyweight design patterns. These are cases when you no longer want to use purely Python objects. Instead, you want a Python object-like wrapper around an array, struct, or numpy array.
class Flyweight(object):
def get(self, theData, index):
return theData[index]
def set(self, theData, index, value):
theData[index]= value
The class-like wrapper has no attributes — it just provides methods that act on the underlying data. The methods can be reduced to class methods. Indeed, it could be reduced to just functions operating on the underlying array of data.
Quoting Jacob Hallen:
The proper use of
__slots__
is to save space in objects. Instead of having a dynamic dict that allows adding attributes to objects at anytime, there is a static structure which does not allow additions after creation. [This use of__slots__
eliminates the overhead of one dict for every object.] While this is sometimes a useful optimization, it would be completely unnecessary if the Python interpreter was dynamic enough so that it would only require the dict when there actually were additions to the object.Unfortunately there is a side effect to slots. They change the behavior of the objects that have slots in a way that can be abused by control freaks and static typing weenies. This is bad, because the control freaks should be abusing the metaclasses and the static typing weenies should be abusing decorators, since in Python, there should be only one obvious way of doing something.
Making CPython smart enough to handle saving space without
__slots__
is a major undertaking, which is probably why it is not on the list of changes for P3k (yet).
Source: Stackoverflow.com