Skip to content Skip to sidebar Skip to footer

Why Does My Class Cost So Much Memory?

from guppy import hpy hp = hpy() class Demo(object): __slots__ = ('v0', 'v1') def __init__(self, v0, v1): self.v0 = v0 self.v1 = v1 from array import

Solution 1:

sys.getsizeof() doesn't recurse into sub-objects, and you only took the size of the class, not of an instance. Each instance takes up 64 bytes, plus 24 bytes per float object (on OS X, using Python 2.7.12):

>>>d = Demo(1.0, 2.0)>>>sys.getsizeof(d)
64
>>>sys.getsizeof(d.v0)
24
>>>sys.getsizeof(d) + sys.getsizeof(d.v0) + sys.getsizeof(d.v1)
112

Each slot only reserves memory for a pointer in the instance object; on my machine that's 8 bytes per pointer.

There are several differences between your Demo() instances and the array:

  • Instances have a minimal overhead to support reference counting and weak references, as well as contain a pointer to their class. The arrays store the values directly, without any of that overhead.
  • The instance stores Python floats. These are full-fledged objects, including reference counting and weak reference support. The array stores single precision floats as C values, while the Python float object models double precision floats. So the instance uses 2 * 24 bytes (on my Mac) just for those floats, vs. just 4 bytes per single-precision 'f' value in an array.
  • To track 5 million Demo instances, you also needed to create a list object, which is sized to handle at least 5 million object references. The array stores the C single-precision floats directly.

The hp.heap() output only counts the instance footprint, not the referenced float values on each line, but the totals match up:

  • 5 million times 64 bytes is 320.000.000 bytes of memory for the Demo instances.
  • 10 million times 24 bytes is 240.000.000 bytes of memory for the float instances, plus a further 108 floats referenced elsewhere.

Together, these two groups make up the majority of the 15 million Python objects on the heap.

  • The list object you created to hold the instances contains 5 million pointers, that's 40.000.000 bytes just to point to all the Demo instances, plus the accounting overhead for that object. There are a further 367 lists on the heap, referenced by other Python code.
  • 2 array instances with each 5 million 4-byte floats is 40.000.000 bytes, plus 56 bytes per array overhead.

So array objects are vastly more efficient to store a large number of numeric values, because it stores these as primitive C values. However, the disadvantage is that Python has to box each value you try to access; so accessing ar[10] returns a Python float object.

Post a Comment for "Why Does My Class Cost So Much Memory?"