Better way to convert file sizes in Python

Question

I am using a library that reads a file and returns its size in bytes.

This file size is then displayed to the end user; to make it easier for them to understand, I am explicitly converting the file size to MB by dividing it by 1024.0 * 1024.0. Of course this works, but I am wondering: is there a better way to do this in Python?

By better, I mean perhaps a stdlib function that can manipulate sizes according to the unit I want. For example, if I specify MB, it automatically divides the size by 1024.0 * 1024.0. Something along these lines.

User · Answer

Here is a compact function to calculate the size:

def GetHumanReadable(size,precision=2):
    suffixes=['B','KB','MB','GB','TB']
    suffixIndex = 0
    while size >= 1024 and suffixIndex < 4:
        suffixIndex += 1 #increment the index of the suffix
        size = size/1024.0 #apply the division
    return "%.*f%s"%(precision,size,suffixes[suffixIndex])

For more detailed output, and for the reverse operation, please refer to: http://code.activestate.com/recipes/578019-bytes-to-human-human-to-bytes-converter/

User · Answer

Here's a version that matches the output of ls -lh.

def human_size(num: int) -> str:
    base = 1
    for unit in ['B', 'K', 'M', 'G', 'T', 'P', 'E', 'Z', 'Y']:
        n = num / base
        if n < 9.95 and unit != 'B':
            # Less than 10 then keep 1 decimal place
            value = "{:.1f}{}".format(n, unit)
            return value
        if round(n) < 1000:
            # Less than 4 digits so use this
            value = "{}{}".format(round(n), unit)
            return value
        base *= 1024
    value = "{}{}".format(round(n), unit)
    return value

User · Answer

Just in case anyone's searching for the reverse of this problem (as I sure did), here's what works for me:

def get_bytes(size, suffix):
    size = int(float(size))
    suffix = suffix.lower()

    if suffix == 'kb' or suffix == 'kib':
        return size << 10
    elif suffix == 'mb' or suffix == 'mib':
        return size << 20
    elif suffix == 'gb' or suffix == 'gib':
        return size << 30

    return False
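Note that the function above truncates the size to an integer before shifting, so 1.5 MB is treated as 1 MB, and it returns False for a suffix it does not recognise. A minimal sketch of one way around both, assuming you want fractional input kept and an exception on bad suffixes. The names `to_bytes` and `_SHIFTS` are hypothetical and not from the answer, and like the answer it treats "kb" as 1024 bytes:

```python
# Hypothetical helper, not part of the answer above: bit shift per suffix.
_SHIFTS = {"b": 0, "kb": 10, "kib": 10, "mb": 20, "mib": 20,
           "gb": 30, "gib": 30, "tb": 40, "tib": 40}


def to_bytes(size, suffix):
    """Return size (int, float, or numeric string) in bytes as an int."""
    try:
        shift = _SHIFTS[suffix.lower()]
    except KeyError:
        raise ValueError("unknown size suffix: %r" % (suffix,))
    # Multiply before truncating so that fractional input such as 1.5 is kept.
    return int(float(size) * (1 << shift))


print(to_bytes("1.5", "MB"))   # 1572864
print(to_bytes(2, "kib"))      # 2048
```

Raising ValueError instead of returning False means a typo in the suffix cannot silently be used as the number 0 later on.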

User · Answer

Here it is:

def convert_bytes(size):
    for x in ['bytes', 'KB', 'MB', 'GB', 'TB']:
        if size < 1024.0:
            return "%3.1f %s" % (size, x)
        size /= 1024.0

    return size

User · Answer

Here is what I use:

import math

def convert_size(size_bytes):
    if size_bytes == 0:
        return "0B"
    size_name = ("B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB")
    i = int(math.floor(math.log(size_bytes, 1024)))
    p = math.pow(1024, i)
    s = round(size_bytes / p, 2)
    return "%s %s" % (s, size_name[i])

NB: size should be sent in Bytes.
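One thing to be aware of with the logarithm approach: floating-point logs can be slightly off near exact powers (for example, math.log(1000, 10) evaluates to 2.9999999999999996). If the sizes are always integers, a possible alternative is to pick the unit from the integer's bit length instead. This is only a sketch; `convert_size_int` and `SIZE_NAMES` are names made up for the illustration, and it also clamps to the largest unit rather than running off the end of the tuple:

```python
SIZE_NAMES = ("B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB")


def convert_size_int(size_bytes):
    """Format a non-negative integer byte count using 1024-based units."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes == 0:
        return "0 B"
    # bit_length() - 1 is floor(log2(n)); every 10 bits is one factor of 1024.
    i = min((size_bytes.bit_length() - 1) // 10, len(SIZE_NAMES) - 1)
    value = round(size_bytes / (1 << (10 * i)), 2)
    return "%s %s" % (value, SIZE_NAMES[i])


print(convert_size_int(1023))   # 1023.0 B
print(convert_size_int(1024))   # 1.0 KB
print(convert_size_int(1536))   # 1.5 KB
```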

User · Answer

Here is my implementation:

from bisect import bisect

def to_filesize(bytes_num, si=True):
    decade = 1000 if si else 1024
    partitions = tuple(decade ** n for n in range(1, 6))
    suffixes = tuple('BKMGTP')

    i = bisect(partitions, bytes_num)
    s = suffixes[i]

    for n in range(i):
        bytes_num /= decade

    f = '{:.3f}'.format(bytes_num)

    return '{}{}'.format(f.rstrip('0').rstrip('.'), s)

It will print up to three decimals and it strips trailing zeros and periods. The boolean parameter si will toggle usage of 10-based vs. 2-based size magnitude.

This is its counterpart. It allows you to write clean configuration files like {'maximum_filesize': from_filesize('10M')}. It returns an integer that approximates the intended filesize. I am not using bit shifting because the source value is a floating point number (it will accept from_filesize('2.15M') just fine). Converting it to an integer/decimal would work but makes the code more complicated, and it already works as it is.

def from_filesize(spec, si=True):
    decade = 1000 if si else 1024
    suffixes = tuple('BKMGTP')

    num = float(spec[:-1])
    s = spec[-1]
    i = suffixes.index(s)

    for n in range(i):
        num *= decade

    return int(num)

User · Answer

Instead of a size divisor of 1024 * 1024 you could use the << bitwise shifting operator, i.e. 1<<20 to get megabytes, 1<<30 to get gigabytes, etc.

In the simplest scenario you can have e.g. a constant MBFACTOR = float(1<<20) which can then be used with bytes, i.e.: megas = size_in_bytes/MBFACTOR.

Megabytes are usually all that you need, or otherwise something like this can be used:

# bytes pretty-printing
UNITS_MAPPING = [
    (1<<50, ' PB'),
    (1<<40, ' TB'),
    (1<<30, ' GB'),
    (1<<20, ' MB'),
    (1<<10, ' KB'),
    (1, (' byte', ' bytes')),
]


def pretty_size(bytes, units=UNITS_MAPPING):
    """Get human-readable file sizes.
    simplified version of https://pypi.python.org/pypi/hurry.filesize/
    """
    for factor, suffix in units:
        if bytes >= factor:
            break
    amount = int(bytes / factor)

    if isinstance(suffix, tuple):
        singular, multiple = suffix
        if amount == 1:
            suffix = singular
        else:
            suffix = multiple
    return str(amount) + suffix

print(pretty_size(1))
print(pretty_size(42))
print(pretty_size(4096))
print(pretty_size(238048577))
print(pretty_size(334073741824))
print(pretty_size(96995116277763))
print(pretty_size(3125899904842624))

## [Out] ###########################
1 byte
42 bytes
4 KB
227 MB
311 GB
88 TB
2 PB
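To make the equivalence concrete, here is a small sketch showing that shifting 1 left by 10 bits per unit step gives the same integers as powers of 1024, and that a right shift is floor division for integers. The value of `size_in_bytes` is just an example picked for the illustration:

```python
# Each shift by 10 bits multiplies by 1024, so these pairs are the same integers.
for power in range(1, 5):          # KB, MB, GB, TB
    assert 1 << (10 * power) == 1024 ** power

size_in_bytes = 5 * (1 << 20) + 512 * (1 << 10)   # 5.5 MB, an example value

MBFACTOR = float(1 << 20)
megas = size_in_bytes / MBFACTOR
print(megas)          # 5.5

# For integers, shifting right is floor division by the same power of two.
whole_megas = size_in_bytes >> 20
print(whole_megas)    # 5
```

The float constant matters mainly on Python 2, where dividing two ints truncates; on Python 3 the / operator already returns a float.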

User · Answer

I wanted 2 way conversion, and I wanted to use Python 3 format() support to be most pythonic. Maybe try the datasize library module? https://pypi.org/project/datasize/

$ pip install -qqq datasize
$ python
...
>>> from datasize import DataSize
>>> 'My new {:GB} SSD really only stores {:.2GiB} of data.'.format(DataSize('750GB'),DataSize(DataSize('750GB') * 0.8))
'My new 750GB SSD really only stores 558.79GiB of data.'

User · Answer

Here are my two cents, which permit casting up and down, and add customizable precision:

def convertFloatToDecimal(f=0.0, precision=2):
    '''
    Convert a float to string of decimal.
    precision: by default 2.
    If no arg provided, return "0.00".
    '''
    return ("%." + str(precision) + "f") % f

def formatFileSize(size, sizeIn, sizeOut, precision=0):
    '''
    Convert file size to a string representing its value in B, KB, MB and GB.
    The convention is based on sizeIn as original unit and sizeOut
    as final unit.
    '''
    assert sizeIn.upper() in {"B", "KB", "MB", "GB"}, "sizeIn type error"
    assert sizeOut.upper() in {"B", "KB", "MB", "GB"}, "sizeOut type error"
    if sizeIn == "B":
        if sizeOut == "KB":
            return convertFloatToDecimal((size/1024.0), precision)
        elif sizeOut == "MB":
            return convertFloatToDecimal((size/1024.0**2), precision)
        elif sizeOut == "GB":
            return convertFloatToDecimal((size/1024.0**3), precision)
    elif sizeIn == "KB":
        if sizeOut == "B":
            return convertFloatToDecimal((size*1024.0), precision)
        elif sizeOut == "MB":
            return convertFloatToDecimal((size/1024.0), precision)
        elif sizeOut == "GB":
            return convertFloatToDecimal((size/1024.0**2), precision)
    elif sizeIn == "MB":
        if sizeOut == "B":
            return convertFloatToDecimal((size*1024.0**2), precision)
        elif sizeOut == "KB":
            return convertFloatToDecimal((size*1024.0), precision)
        elif sizeOut == "GB":
            return convertFloatToDecimal((size/1024.0), precision)
    elif sizeIn == "GB":
        if sizeOut == "B":
            return convertFloatToDecimal((size*1024.0**3), precision)
        elif sizeOut == "KB":
            return convertFloatToDecimal((size*1024.0**2), precision)
        elif sizeOut == "MB":
            return convertFloatToDecimal((size*1024.0), precision)

Add TB, etc. as you wish!
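The branches above all follow one pattern: the factor is 1024 raised to the difference between the two units' positions. A table-driven sketch of the same idea, where `UNIT_POWERS` and `convert_file_size` are hypothetical names for this illustration rather than anything from the answer, and where same-unit conversion simply returns the value unchanged:

```python
# Exponent of 1024 for each unit; extend with "TB": 4 and so on as needed.
UNIT_POWERS = {"B": 0, "KB": 1, "MB": 2, "GB": 3}


def convert_file_size(size, size_in, size_out, precision=0):
    """Convert size from unit size_in to unit size_out, returned as a string."""
    try:
        power_in = UNIT_POWERS[size_in.upper()]
        power_out = UNIT_POWERS[size_out.upper()]
    except KeyError as exc:
        raise ValueError("unknown unit: %s" % exc)
    # A positive difference scales up (e.g. MB -> KB), a negative one scales down.
    value = size * 1024.0 ** (power_in - power_out)
    return "%.*f" % (precision, value)


print(convert_file_size(2048, "B", "KB"))      # 2
print(convert_file_size(1.5, "GB", "MB", 1))   # 1536.0
```

Adding a unit is then a one-line change to the dictionary instead of a new block of elif branches.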

User · Answer

UNITS = {1000: ['KB', 'MB', 'GB'],
         1024: ['KiB', 'MiB', 'GiB']}

def approximate_size(size, flag_1024_or_1000=True):
    mult = 1024 if flag_1024_or_1000 else 1000
    for unit in UNITS[mult]:
        size = size / mult
        if size < mult:
            return '{0:.3f} {1}'.format(size, unit)

approximate_size(2123, False)

User · Answer

Here are some easy-to-copy one liners to use if you already know what unit size you want. If you're looking for a more generic function with a few nice options, see my FEB 2021 update further on.

import os

# Bytes
print("{:,.0f}".format(os.path.getsize(filepath))+" B")

# Kilobits
print("{:,.0f}".format(os.path.getsize(filepath)/float(1<<7))+" kb")

# Kilobytes
print("{:,.0f}".format(os.path.getsize(filepath)/float(1<<10))+" KB")

# Megabits
print("{:,.0f}".format(os.path.getsize(filepath)/float(1<<17))+" mb")

# Megabytes
print("{:,.0f}".format(os.path.getsize(filepath)/float(1<<20))+" MB")

# Gigabits
print("{:,.0f}".format(os.path.getsize(filepath)/float(1<<27))+" gb")

# Gigabytes
print("{:,.0f}".format(os.path.getsize(filepath)/float(1<<30))+" GB")

# Terabytes
print("{:,.0f}".format(os.path.getsize(filepath)/float(1<<40))+" TB")

UPDATE FEB 2021

Here are my updated and fleshed-out functions to a) get file/folder size, b) convert into desired units:

from pathlib import Path

def get_path_size(path = Path('.'), recursive=False):
    """
    Gets file size, or total directory size

    Parameters
    ----------
    path: str | pathlib.Path
        File path or directory/folder path

    recursive: bool
        True -> use .rglob i.e. include nested files and directories
        False -> use .glob i.e. only process current directory/folder

    Returns
    -------
    int:
        File size or recursive directory size in bytes
        Use cleverutils.format_bytes to convert to other units e.g. MB
    """
    path = Path(path)
    if path.is_file():
        size = path.stat().st_size
    elif path.is_dir():
        # Note: the '*.*' pattern only matches names that contain a dot
        path_glob = path.rglob('*.*') if recursive else path.glob('*.*')
        size = sum(file.stat().st_size for file in path_glob)
    return size


def format_bytes(bytes, unit, SI=False):
    """
    Converts bytes to common units such as kb, kib, KB, mb, mib, MB

    Parameters
    ---------
    bytes: int
        Number of bytes to be converted

    unit: str
        Desired unit of measure for output

    SI: bool
        True -> Use SI standard e.g. KB = 1000 bytes
        False -> Use JEDEC standard e.g. KB = 1024 bytes

    Returns
    -------
    str:
        E.g. "7 MiB" where MiB is the original unit abbreviation supplied
    """
    if unit.lower() in "b bit bits".split():
        return f"{bytes*8} {unit}"
    unitN = unit[0].upper()+unit[1:].replace("s","")  # Normalised
    reference = {"Kb Kib Kibibit Kilobit": (7, 1),
                 "KB KiB Kibibyte Kilobyte": (10, 1),
                 "Mb Mib Mebibit Megabit": (17, 2),
                 "MB MiB Mebibyte Megabyte": (20, 2),
                 "Gb Gib Gibibit Gigabit": (27, 3),
                 "GB GiB Gibibyte Gigabyte": (30, 3),
                 "Tb Tib Tebibit Terabit": (37, 4),
                 "TB TiB Tebibyte Terabyte": (40, 4),
                 "Pb Pib Pebibit Petabit": (47, 5),
                 "PB PiB Pebibyte Petabyte": (50, 5),
                 "Eb Eib Exbibit Exabit": (57, 6),
                 "EB EiB Exbibyte Exabyte": (60, 6),
                 "Zb Zib Zebibit Zettabit": (67, 7),
                 "ZB ZiB Zebibyte Zettabyte": (70, 7),
                 "Yb Yib Yobibit Yottabit": (77, 8),
                 "YB YiB Yobibyte Yottabyte": (80, 8),
                 }
    key_list = '\n'.join(["     b Bit"] + [x for x in reference.keys()]) +"\n"
    if unitN not in key_list:
        raise IndexError(f"\n\nConversion unit must be one of:\n\n{key_list}")
    units, divisors = [(k,v) for k,v in reference.items() if unitN in k][0]
    if SI:
        divisor = 1000**divisors[1]/8 if "bit" in units else 1000**divisors[1]
    else:
        divisor = float(1 << divisors[0])
    value = bytes / divisor
    if value != 1 and len(unitN) > 3:
        unitN += "s"  # Create plural unit of measure
    return "{:,.0f}".format(value) + " " + unitN


# Tests
>>> assert format_bytes(1,"b") == '8 b'
>>> assert format_bytes(1,"bits") == '8 bits'
>>> assert format_bytes(1024, "kilobyte") == "1 Kilobyte"
>>> assert format_bytes(1024, "kB") == "1 KB"
>>> assert format_bytes(7141000, "mb") == '54 Mb'
>>> assert format_bytes(7141000, "mib") == '54 Mib'
>>> assert format_bytes(7141000, "Mb") == '54 Mb'
>>> assert format_bytes(7141000, "MB") == '7 MB'
>>> assert format_bytes(7141000, "mebibytes") == '7 Mebibytes'
>>> assert format_bytes(7141000, "gb") == '0 Gb'
>>> assert format_bytes(1000000, "kB") == '977 KB'
>>> assert format_bytes(1000000, "kB", SI=True) == '1,000 KB'
>>> assert format_bytes(1000000, "kb") == '7,812 Kb'
>>> assert format_bytes(1000000, "kb", SI=True) == '8,000 Kb'
>>> assert format_bytes(125000, "kb") == '977 Kb'
>>> assert format_bytes(125000, "kb", SI=True) == '1,000 Kb'
>>> assert format_bytes(125*1024, "kb") == '1,000 Kb'
>>> assert format_bytes(125*1024, "kb", SI=True) == '1,024 Kb'

User · Answer

There is hurry.filesize that will take the size in bytes and make a nice string out of it.

>>> from hurry.filesize import size
>>> size(11000)
'10K'
>>> size(198283722)
'189M'

Or if you want 1K == 1000 (which is what most users assume):

>>> from hurry.filesize import size, si
>>> size(11000, system=si)
'11K'
>>> size(198283722, system=si)
'198M'

It has IEC support as well (but that wasn't documented):

>>> from hurry.filesize import size, iec
>>> size(11000, system=iec)
'10Ki'
>>> size(198283722, system=iec)
'189Mi'

Because it's written by the Awesome Martijn Faassen, the code is small, clear and extensible. Writing your own systems is dead easy.

Here is one:

mysystem = [
    (1024 ** 5, ' Megamanys'),
    (1024 ** 4, ' Lotses'),
    (1024 ** 3, ' Tons'),
    (1024 ** 2, ' Heaps'),
    (1024 ** 1, ' Bunches'),
    (1024 ** 0, ' Thingies'),
    ]

Used like so:

>>> from hurry.filesize import size
>>> size(11000, system=mysystem)
'10 Bunches'
>>> size(198283722, system=mysystem)
'189 Heaps'
