I am using tqdm
to print progress in a script I'm running in a Jupyter notebook. I am printing all messages to the console via tqdm.write()
. However, this still gives me a skewed output like so:
That is, each time a new line has to be printed, a new progress bar is printed on the next line. This does not happen when I run the script via terminal. How can I solve this?
This question is related to
python
jupyter-notebook
tqdm
To complete oscarbranson's answer: it's possible to automatically pick console or notebook versions of progress bar depending on where it's being run from:
from tqdm.autonotebook import tqdm
More info can be found here
None of the above works for me. I find that running the following sorts this issue after error (It just clears all the instances of progress bars in the background):
from tqdm import tqdm
# blah blah your code errored
tqdm._instances.clear()
This is an alternative answer for the case where tqdm_notebook doesn't work for you.
from time import sleep
from tqdm import tqdm
values = range(3)
with tqdm(total=len(values)) as pbar:
for i in values:
pbar.write('processed: %d' %i)
pbar.update(1)
sleep(1)
The output would look something like this (progress would show up red):
0%| | 0/3 [00:00<?, ?it/s]
processed: 1
67%|¦¦¦¦¦¦? | 2/3 [00:01<00:00, 1.99it/s]
processed: 2
100%|¦¦¦¦¦¦¦¦¦¦| 3/3 [00:02<00:00, 1.53it/s]
processed: 3
The problem is that the output to stdout and stderr are processed asynchronously and separately in terms of new lines.
If say Jupyter receives on stderr the first line and then the "processed" output on stdout. Then once it receives an output on stderr to update the progress, it wouldn't go back and update the first line as it would only update the last line. Instead it will have to write a new line.
One workaround would be to output both to stdout instead:
import sys
from time import sleep
from tqdm import tqdm
values = range(3)
with tqdm(total=len(values), file=sys.stdout) as pbar:
for i in values:
pbar.write('processed: %d' % (1 + i))
pbar.update(1)
sleep(1)
The output will change to (no more red):
processed: 1 | 0/3 [00:00<?, ?it/s]
processed: 2 | 0/3 [00:00<?, ?it/s]
processed: 3 | 2/3 [00:01<00:00, 1.99it/s]
100%|¦¦¦¦¦¦¦¦¦¦| 3/3 [00:02<00:00, 1.53it/s]
Here we can see that Jupyter doesn't seem to clear until the end of the line. We could add another workaround for that by adding spaces. Such as:
import sys
from time import sleep
from tqdm import tqdm
values = range(3)
with tqdm(total=len(values), file=sys.stdout) as pbar:
for i in values:
pbar.write('processed: %d%s' % (1 + i, ' ' * 50))
pbar.update(1)
sleep(1)
Which gives us:
processed: 1
processed: 2
processed: 3
100%|¦¦¦¦¦¦¦¦¦¦| 3/3 [00:02<00:00, 1.53it/s]
It might in general be more straight forward not to have two outputs but update the description instead, e.g.:
import sys
from time import sleep
from tqdm import tqdm
values = range(3)
with tqdm(total=len(values), file=sys.stdout) as pbar:
for i in values:
pbar.set_description('processed: %d' % (1 + i))
pbar.update(1)
sleep(1)
With the output (description updated while it's processing):
processed: 3: 100%|¦¦¦¦¦¦¦¦¦¦| 3/3 [00:02<00:00, 1.53it/s]
You can mostly get it to work fine with plain tqdm. But if tqdm_notebook works for you, just use that (but then you'd probably not read that far).
Use tqdm_notebook
from tqdm import tqdm_notebook as tqdm
x=[1,2,3,4,5]
for i in tqdm(range(0,len(x))):
print(x[i])
Most of the answers are outdated now. Better if you import tqdm correctly.
from tqdm import tqdm_notebook as tqdm
If the other tips here don't work and - just like me - you're using the pandas
integration through progress_apply
, you can let tqdm
handle it:
from tqdm.autonotebook import tqdm
tqdm.pandas()
df.progress_apply(row_function, axis=1)
The main point here lies in the tqdm.autonotebook
module. As stated in their instructions for use in IPython Notebooks, this makes tqdm
choose between progress bar formats used in Jupyter notebooks and Jupyter consoles - for a reason still lacking further investigations on my side, the specific format chosen by tqdm.autonotebook
works smoothly in pandas
, while all others didn't, for progress_apply
specifically.
For everyone who is on windows and couldn't solve the duplicating bars issue with any of the solutions mentioned here. I had to install the colorama
package as stated in tqdm's known issues which fixed it.
pip install colorama
Try it with this example:
from tqdm import tqdm
from time import sleep
for _ in tqdm(range(5), "All", ncols = 80, position = 0):
for _ in tqdm(range(100), "Sub", ncols = 80, position = 1, leave = False):
sleep(0.01)
Which will produce something like:
All: 60%|¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦ | 3/5 [00:03<00:02, 1.02s/it]
Sub: 50%|¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦ | 50/100 [00:00<00:00, 97.88it/s]
Source: Stackoverflow.com