[python] How to pad zeroes to a string?

What is a Pythonic way to pad a numeric string with zeroes to the left, i.e. so the numeric string has a specific length?

This question is related to python string zero-padding

The answer is


Its ok too:

 h = 2
 m = 7
 s = 3
 print("%02d:%02d:%02d" % (h, m, s))

so output will be: "02:07:03"


Just use the rjust method of the string object.

This example will make a string of 10 characters long, padding as necessary.

>>> t = 'test'
>>> t.rjust(10, '0')
>>> '000000test'

For zip codes saved as integers:

>>> a = 6340
>>> b = 90210
>>> print '%05d' % a
06340
>>> print '%05d' % b
90210

Besides zfill, you can use general string formatting:

print(f'{number:05d}') # (since Python 3.6), or
print('{:05d}'.format(number)) # or
print('{0:05d}'.format(number)) # or (explicit 0th positional arg. selection)
print('{n:05d}'.format(n=number)) # or (explicit `n` keyword arg. selection)
print(format(number, '05d'))

Documentation for string formatting and f-strings.


For the ones who came here to understand and not just a quick answer. I do these especially for time strings:

hour = 4
minute = 3
"{:0>2}:{:0>2}".format(hour,minute)
# prints 04:03

"{:0>3}:{:0>5}".format(hour,minute)
# prints '004:00003'

"{:0<3}:{:0<5}".format(hour,minute)
# prints '400:30000'

"{:$<3}:{:#<5}".format(hour,minute)
# prints '4$$:3####'

"0" symbols what to replace with the "2" padding characters, the default is an empty space

">" symbols allign all the 2 "0" character to the left of the string

":" symbols the format_spec


For Python 3.6+ using f-strings:

>>> i = 1
>>> f"{i:0>2}"  # Works for both numbers and strings.
'01'
>>> f"{i:02}"  # Works only for numbers.
'01'

For Python 2 to Python 3.5:

>>> "{:0>2}".format("1")  # Works for both numbers and strings.
'01'
>>> "{:02}".format(1)  # Works only for numbers.
'01'

Besides zfill, you can use general string formatting:

print(f'{number:05d}') # (since Python 3.6), or
print('{:05d}'.format(number)) # or
print('{0:05d}'.format(number)) # or (explicit 0th positional arg. selection)
print('{n:05d}'.format(n=number)) # or (explicit `n` keyword arg. selection)
print(format(number, '05d'))

Documentation for string formatting and f-strings.


When using Python >= 3.6, the cleanest way is to use f-strings with string formatting:

>>> s = f"{1:08}"  # inline with int
>>> s
'00000001'
>>> s = f"{'1':0>8}"  # inline with str
>>> s
'00000001'
>>> n = 1
>>> s = f"{n:08}"  # int variable
>>> s
'00000001'
>>> c = "1"
>>> s = f"{c:0>8}"  # str variable
>>> s
'00000001'

I would prefer formatting with an int, since only then the sign is handled correctly:

>>> f"{-1:08}"
'-0000001'

>>> f"{1:+08}"
'+0000001'

>>> f"{'-1':0>8}"
'000000-1'

Besides zfill, you can use general string formatting:

print(f'{number:05d}') # (since Python 3.6), or
print('{:05d}'.format(number)) # or
print('{0:05d}'.format(number)) # or (explicit 0th positional arg. selection)
print('{n:05d}'.format(n=number)) # or (explicit `n` keyword arg. selection)
print(format(number, '05d'))

Documentation for string formatting and f-strings.


You could also repeat "0", prepend it to str(n) and get the rightmost width slice. Quick and dirty little expression.

def pad_left(n, width, pad="0"):
    return ((pad * width) + str(n))[-width:]

>>> '99'.zfill(5)
'00099'
>>> '99'.rjust(5,'0')
'00099'

if you want the opposite:

>>> '99'.ljust(5,'0')
'99000'

width = 10
x = 5
print "%0*d" % (width, x)
> 0000000005

See the print documentation for all the exciting details!

Update for Python 3.x (7.5 years later)

That last line should now be:

print("%0*d" % (width, x))

I.e. print() is now a function, not a statement. Note that I still prefer the Old School printf() style because, IMNSHO, it reads better, and because, um, I've been using that notation since January, 1980. Something ... old dogs .. something something ... new tricks.


You could also repeat "0", prepend it to str(n) and get the rightmost width slice. Quick and dirty little expression.

def pad_left(n, width, pad="0"):
    return ((pad * width) + str(n))[-width:]

str(n).zfill(width) will work with strings, ints, floats... and is Python 2.x and 3.x compatible:

>>> n = 3
>>> str(n).zfill(5)
'00003'
>>> n = '3'
>>> str(n).zfill(5)
'00003'
>>> n = '3.0'
>>> str(n).zfill(5)
'003.0'

Another approach would be to use a list comprehension with a condition checking for lengths. Below is a demonstration:

# input list of strings that we want to prepend zeros
In [71]: list_of_str = ["101010", "10101010", "11110", "0000"]

# prepend zeros to make each string to length 8, if length of string is less than 8
In [83]: ["0"*(8-len(s)) + s if len(s) < desired_len else s for s in list_of_str]
Out[83]: ['00101010', '10101010', '00011110', '00000000']

Quick timing comparison:

setup = '''
from random import randint
def test_1():
    num = randint(0,1000000)
    return str(num).zfill(7)
def test_2():
    num = randint(0,1000000)
    return format(num, '07')
def test_3():
    num = randint(0,1000000)
    return '{0:07d}'.format(num)
def test_4():
    num = randint(0,1000000)
    return format(num, '07d')
def test_5():
    num = randint(0,1000000)
    return '{:07d}'.format(num)
def test_6():
    num = randint(0,1000000)
    return '{x:07d}'.format(x=num)
def test_7():
    num = randint(0,1000000)
    return str(num).rjust(7, '0')
'''
import timeit
print timeit.Timer("test_1()", setup=setup).repeat(3, 900000)
print timeit.Timer("test_2()", setup=setup).repeat(3, 900000)
print timeit.Timer("test_3()", setup=setup).repeat(3, 900000)
print timeit.Timer("test_4()", setup=setup).repeat(3, 900000)
print timeit.Timer("test_5()", setup=setup).repeat(3, 900000)
print timeit.Timer("test_6()", setup=setup).repeat(3, 900000)
print timeit.Timer("test_7()", setup=setup).repeat(3, 900000)


> [2.281613943830961, 2.2719342631547077, 2.261691106209631]
> [2.311480238815406, 2.318420542148333, 2.3552384305184493]
> [2.3824197456864304, 2.3457239951596485, 2.3353268829498646]
> [2.312442972404032, 2.318053102249902, 2.3054072168069872]
> [2.3482314132374853, 2.3403386400002475, 2.330108825844775]
> [2.424549090688892, 2.4346475296851438, 2.429691196530058]
> [2.3259756401716487, 2.333549212826732, 2.32049893822186]

I've made different tests of different repetitions. The differences are not huge, but in all tests, the zfill solution was fastest.


Besides zfill, you can use general string formatting:

print(f'{number:05d}') # (since Python 3.6), or
print('{:05d}'.format(number)) # or
print('{0:05d}'.format(number)) # or (explicit 0th positional arg. selection)
print('{n:05d}'.format(n=number)) # or (explicit `n` keyword arg. selection)
print(format(number, '05d'))

Documentation for string formatting and f-strings.


What is the most pythonic way to pad a numeric string with zeroes to the left, i.e., so the numeric string has a specific length?

str.zfill is specifically intended to do this:

>>> '1'.zfill(4)
'0001'

Note that it is specifically intended to handle numeric strings as requested, and moves a + or - to the beginning of the string:

>>> '+1'.zfill(4)
'+001'
>>> '-1'.zfill(4)
'-001'

Here's the help on str.zfill:

>>> help(str.zfill)
Help on method_descriptor:

zfill(...)
    S.zfill(width) -> str

    Pad a numeric string S with zeros on the left, to fill a field
    of the specified width. The string S is never truncated.

Performance

This is also the most performant of alternative methods:

>>> min(timeit.repeat(lambda: '1'.zfill(4)))
0.18824880896136165
>>> min(timeit.repeat(lambda: '1'.rjust(4, '0')))
0.2104538488201797
>>> min(timeit.repeat(lambda: f'{1:04}'))
0.32585487607866526
>>> min(timeit.repeat(lambda: '{:04}'.format(1)))
0.34988890308886766

To best compare apples to apples for the % method (note it is actually slower), which will otherwise pre-calculate:

>>> min(timeit.repeat(lambda: '1'.zfill(0 or 4)))
0.19728074967861176
>>> min(timeit.repeat(lambda: '%04d' % (0 or 1)))
0.2347015216946602

Implementation

With a little digging, I found the implementation of the zfill method in Objects/stringlib/transmogrify.h:

static PyObject *
stringlib_zfill(PyObject *self, PyObject *args)
{
    Py_ssize_t fill;
    PyObject *s;
    char *p;
    Py_ssize_t width;

    if (!PyArg_ParseTuple(args, "n:zfill", &width))
        return NULL;

    if (STRINGLIB_LEN(self) >= width) {
        return return_self(self);
    }

    fill = width - STRINGLIB_LEN(self);

    s = pad(self, fill, 0, '0');

    if (s == NULL)
        return NULL;

    p = STRINGLIB_STR(s);
    if (p[fill] == '+' || p[fill] == '-') {
        /* move sign to beginning of string */
        p[0] = p[fill];
        p[fill] = '0';
    }

    return s;
}

Let's walk through this C code.

It first parses the argument positionally, meaning it doesn't allow keyword arguments:

>>> '1'.zfill(width=4)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: zfill() takes no keyword arguments

It then checks if it's the same length or longer, in which case it returns the string.

>>> '1'.zfill(0)
'1'

zfill calls pad (this pad function is also called by ljust, rjust, and center as well). This basically copies the contents into a new string and fills in the padding.

static inline PyObject *
pad(PyObject *self, Py_ssize_t left, Py_ssize_t right, char fill)
{
    PyObject *u;

    if (left < 0)
        left = 0;
    if (right < 0)
        right = 0;

    if (left == 0 && right == 0) {
        return return_self(self);
    }

    u = STRINGLIB_NEW(NULL, left + STRINGLIB_LEN(self) + right);
    if (u) {
        if (left)
            memset(STRINGLIB_STR(u), fill, left);
        memcpy(STRINGLIB_STR(u) + left,
               STRINGLIB_STR(self),
               STRINGLIB_LEN(self));
        if (right)
            memset(STRINGLIB_STR(u) + left + STRINGLIB_LEN(self),
                   fill, right);
    }

    return u;
}

After calling pad, zfill moves any originally preceding + or - to the beginning of the string.

Note that for the original string to actually be numeric is not required:

>>> '+foo'.zfill(10)
'+000000foo'
>>> '-foo'.zfill(10)
'-000000foo'

For Python 3.6+ using f-strings:

>>> i = 1
>>> f"{i:0>2}"  # Works for both numbers and strings.
'01'
>>> f"{i:02}"  # Works only for numbers.
'01'

For Python 2 to Python 3.5:

>>> "{:0>2}".format("1")  # Works for both numbers and strings.
'01'
>>> "{:02}".format(1)  # Works only for numbers.
'01'

>>> '99'.zfill(5)
'00099'
>>> '99'.rjust(5,'0')
'00099'

if you want the opposite:

>>> '99'.ljust(5,'0')
'99000'

width = 10
x = 5
print "%0*d" % (width, x)
> 0000000005

See the print documentation for all the exciting details!

Update for Python 3.x (7.5 years later)

That last line should now be:

print("%0*d" % (width, x))

I.e. print() is now a function, not a statement. Note that I still prefer the Old School printf() style because, IMNSHO, it reads better, and because, um, I've been using that notation since January, 1980. Something ... old dogs .. something something ... new tricks.


When using Python >= 3.6, the cleanest way is to use f-strings with string formatting:

>>> s = f"{1:08}"  # inline with int
>>> s
'00000001'
>>> s = f"{'1':0>8}"  # inline with str
>>> s
'00000001'
>>> n = 1
>>> s = f"{n:08}"  # int variable
>>> s
'00000001'
>>> c = "1"
>>> s = f"{c:0>8}"  # str variable
>>> s
'00000001'

I would prefer formatting with an int, since only then the sign is handled correctly:

>>> f"{-1:08}"
'-0000001'

>>> f"{1:+08}"
'+0000001'

>>> f"{'-1':0>8}"
'000000-1'

I am adding how to use a int from a length of a string within an f-string because it didn't appear to be covered:

>>> pad_number = len("this_string")
11
>>> s = f"{1:0{pad_number}}" }
>>> s
'00000000001'


Quick timing comparison:

setup = '''
from random import randint
def test_1():
    num = randint(0,1000000)
    return str(num).zfill(7)
def test_2():
    num = randint(0,1000000)
    return format(num, '07')
def test_3():
    num = randint(0,1000000)
    return '{0:07d}'.format(num)
def test_4():
    num = randint(0,1000000)
    return format(num, '07d')
def test_5():
    num = randint(0,1000000)
    return '{:07d}'.format(num)
def test_6():
    num = randint(0,1000000)
    return '{x:07d}'.format(x=num)
def test_7():
    num = randint(0,1000000)
    return str(num).rjust(7, '0')
'''
import timeit
print timeit.Timer("test_1()", setup=setup).repeat(3, 900000)
print timeit.Timer("test_2()", setup=setup).repeat(3, 900000)
print timeit.Timer("test_3()", setup=setup).repeat(3, 900000)
print timeit.Timer("test_4()", setup=setup).repeat(3, 900000)
print timeit.Timer("test_5()", setup=setup).repeat(3, 900000)
print timeit.Timer("test_6()", setup=setup).repeat(3, 900000)
print timeit.Timer("test_7()", setup=setup).repeat(3, 900000)


> [2.281613943830961, 2.2719342631547077, 2.261691106209631]
> [2.311480238815406, 2.318420542148333, 2.3552384305184493]
> [2.3824197456864304, 2.3457239951596485, 2.3353268829498646]
> [2.312442972404032, 2.318053102249902, 2.3054072168069872]
> [2.3482314132374853, 2.3403386400002475, 2.330108825844775]
> [2.424549090688892, 2.4346475296851438, 2.429691196530058]
> [2.3259756401716487, 2.333549212826732, 2.32049893822186]

I've made different tests of different repetitions. The differences are not huge, but in all tests, the zfill solution was fastest.


Just use the rjust method of the string object.

This example will make a string of 10 characters long, padding as necessary.

>>> t = 'test'
>>> t.rjust(10, '0')
>>> '000000test'

I am adding how to use a int from a length of a string within an f-string because it didn't appear to be covered:

>>> pad_number = len("this_string")
11
>>> s = f"{1:0{pad_number}}" }
>>> s
'00000000001'


What is the most pythonic way to pad a numeric string with zeroes to the left, i.e., so the numeric string has a specific length?

str.zfill is specifically intended to do this:

>>> '1'.zfill(4)
'0001'

Note that it is specifically intended to handle numeric strings as requested, and moves a + or - to the beginning of the string:

>>> '+1'.zfill(4)
'+001'
>>> '-1'.zfill(4)
'-001'

Here's the help on str.zfill:

>>> help(str.zfill)
Help on method_descriptor:

zfill(...)
    S.zfill(width) -> str

    Pad a numeric string S with zeros on the left, to fill a field
    of the specified width. The string S is never truncated.

Performance

This is also the most performant of alternative methods:

>>> min(timeit.repeat(lambda: '1'.zfill(4)))
0.18824880896136165
>>> min(timeit.repeat(lambda: '1'.rjust(4, '0')))
0.2104538488201797
>>> min(timeit.repeat(lambda: f'{1:04}'))
0.32585487607866526
>>> min(timeit.repeat(lambda: '{:04}'.format(1)))
0.34988890308886766

To best compare apples to apples for the % method (note it is actually slower), which will otherwise pre-calculate:

>>> min(timeit.repeat(lambda: '1'.zfill(0 or 4)))
0.19728074967861176
>>> min(timeit.repeat(lambda: '%04d' % (0 or 1)))
0.2347015216946602

Implementation

With a little digging, I found the implementation of the zfill method in Objects/stringlib/transmogrify.h:

static PyObject *
stringlib_zfill(PyObject *self, PyObject *args)
{
    Py_ssize_t fill;
    PyObject *s;
    char *p;
    Py_ssize_t width;

    if (!PyArg_ParseTuple(args, "n:zfill", &width))
        return NULL;

    if (STRINGLIB_LEN(self) >= width) {
        return return_self(self);
    }

    fill = width - STRINGLIB_LEN(self);

    s = pad(self, fill, 0, '0');

    if (s == NULL)
        return NULL;

    p = STRINGLIB_STR(s);
    if (p[fill] == '+' || p[fill] == '-') {
        /* move sign to beginning of string */
        p[0] = p[fill];
        p[fill] = '0';
    }

    return s;
}

Let's walk through this C code.

It first parses the argument positionally, meaning it doesn't allow keyword arguments:

>>> '1'.zfill(width=4)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: zfill() takes no keyword arguments

It then checks if it's the same length or longer, in which case it returns the string.

>>> '1'.zfill(0)
'1'

zfill calls pad (this pad function is also called by ljust, rjust, and center as well). This basically copies the contents into a new string and fills in the padding.

static inline PyObject *
pad(PyObject *self, Py_ssize_t left, Py_ssize_t right, char fill)
{
    PyObject *u;

    if (left < 0)
        left = 0;
    if (right < 0)
        right = 0;

    if (left == 0 && right == 0) {
        return return_self(self);
    }

    u = STRINGLIB_NEW(NULL, left + STRINGLIB_LEN(self) + right);
    if (u) {
        if (left)
            memset(STRINGLIB_STR(u), fill, left);
        memcpy(STRINGLIB_STR(u) + left,
               STRINGLIB_STR(self),
               STRINGLIB_LEN(self));
        if (right)
            memset(STRINGLIB_STR(u) + left + STRINGLIB_LEN(self),
                   fill, right);
    }

    return u;
}

After calling pad, zfill moves any originally preceding + or - to the beginning of the string.

Note that for the original string to actually be numeric is not required:

>>> '+foo'.zfill(10)
'+000000foo'
>>> '-foo'.zfill(10)
'-000000foo'

Its ok too:

 h = 2
 m = 7
 s = 3
 print("%02d:%02d:%02d" % (h, m, s))

so output will be: "02:07:03"


I made a function :

def PadNumber(number, n_pad, add_prefix=None):
    number_str = str(number)
    paded_number = number_str.zfill(n_pad)
    if add_prefix:
        paded_number = add_prefix+paded_number
    print(paded_number)

PadNumber(99, 4)
PadNumber(1011, 8, "b'")
PadNumber('7BEF', 6, "#")

The output :

0099
b'00001011
#007BEF

width = 10
x = 5
print "%0*d" % (width, x)
> 0000000005

See the print documentation for all the exciting details!

Update for Python 3.x (7.5 years later)

That last line should now be:

print("%0*d" % (width, x))

I.e. print() is now a function, not a statement. Note that I still prefer the Old School printf() style because, IMNSHO, it reads better, and because, um, I've been using that notation since January, 1980. Something ... old dogs .. something something ... new tricks.


For the ones who came here to understand and not just a quick answer. I do these especially for time strings:

hour = 4
minute = 3
"{:0>2}:{:0>2}".format(hour,minute)
# prints 04:03

"{:0>3}:{:0>5}".format(hour,minute)
# prints '004:00003'

"{:0<3}:{:0<5}".format(hour,minute)
# prints '400:30000'

"{:$<3}:{:#<5}".format(hour,minute)
# prints '4$$:3####'

"0" symbols what to replace with the "2" padding characters, the default is an empty space

">" symbols allign all the 2 "0" character to the left of the string

":" symbols the format_spec


str(n).zfill(width) will work with strings, ints, floats... and is Python 2.x and 3.x compatible:

>>> n = 3
>>> str(n).zfill(5)
'00003'
>>> n = '3'
>>> str(n).zfill(5)
'00003'
>>> n = '3.0'
>>> str(n).zfill(5)
'003.0'

For zip codes saved as integers:

>>> a = 6340
>>> b = 90210
>>> print '%05d' % a
06340
>>> print '%05d' % b
90210

I made a function :

def PadNumber(number, n_pad, add_prefix=None):
    number_str = str(number)
    paded_number = number_str.zfill(n_pad)
    if add_prefix:
        paded_number = add_prefix+paded_number
    print(paded_number)

PadNumber(99, 4)
PadNumber(1011, 8, "b'")
PadNumber('7BEF', 6, "#")

The output :

0099
b'00001011
#007BEF

Just use the rjust method of the string object.

This example will make a string of 10 characters long, padding as necessary.

>>> t = 'test'
>>> t.rjust(10, '0')
>>> '000000test'

width = 10
x = 5
print "%0*d" % (width, x)
> 0000000005

See the print documentation for all the exciting details!

Update for Python 3.x (7.5 years later)

That last line should now be:

print("%0*d" % (width, x))

I.e. print() is now a function, not a statement. Note that I still prefer the Old School printf() style because, IMNSHO, it reads better, and because, um, I've been using that notation since January, 1980. Something ... old dogs .. something something ... new tricks.