[python] Pytesseract : "TesseractNotFound Error: tesseract is not installed or it's not in your path", how do I fix this?

I'm trying to run a very simple piece of code in Python.

from PIL import Image
import pytesseract

im = Image.open("sample1.jpg")

text = pytesseract.image_to_string(im, lang = 'eng')

print(text)

This is what it looks like. I have installed Tesseract for Windows through the installer, but I still get the error. I'm very new to Python and unsure how to proceed.

Any guidance here would be very helpful. I've tried restarting my Spyder application, but to no avail.


The answer is


The steps are scattered across different answers. Based on my recent experience with this pytesseract error on Windows, here they are in sequence to make the error easier to resolve:

1. Install tesseract using windows installer available at: https://github.com/UB-Mannheim/tesseract/wiki

2. Note the tesseract path from the installation. Default installation path at the time of this edit was: C:\Users\USER\AppData\Local\Tesseract-OCR. It may change so please check the installation path.

3. pip install pytesseract

4. Set the tesseract path in the script before calling image_to_string:

pytesseract.pytesseract.tesseract_cmd = r'C:\Users\USER\AppData\Local\Tesseract-OCR\tesseract.exe'
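The hard-coded path in step 4 can be made a little more forgiving. What follows is only a minimal sketch, not part of the original answer: the helper name find_tesseract and the candidate list are illustrative assumptions. It looks on PATH first with shutil.which and then tries a few install locations mentioned in this thread.

```python
import os
import shutil

# Candidate locations are assumptions taken from paths mentioned in this thread;
# adjust them to match your own installation.
CANDIDATES = [
    r"C:\Program Files\Tesseract-OCR\tesseract.exe",
    r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe",
    os.path.expanduser(r"~\AppData\Local\Tesseract-OCR\tesseract.exe"),
]


def find_tesseract(candidates=CANDIDATES):
    """Return a path to the tesseract executable, or None if none is found."""
    # shutil.which searches the directories listed in PATH.
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Fall back to the explicit candidate locations.
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


if __name__ == "__main__":
    print(find_tesseract())
```

If the function returns a path, it can be assigned to pytesseract.pytesseract.tesseract_cmd before calling image_to_string. If it returns None, the binary is most likely not installed at all.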


First you should install binary:

On Linux

sudo apt-get update
sudo apt-get install libleptonica-dev 
sudo apt-get install tesseract-ocr tesseract-ocr-dev
sudo apt-get install libtesseract-dev

On Mac

brew install tesseract

On Windows

Download the binary from https://github.com/UB-Mannheim/tesseract/wiki, then add pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files (x86)\Tesseract-OCR\tesseract.exe' to your script. (The r prefix stops Python from reading \t as a tab character.)

Then you should install the Python package using pip:

pip install pytesseract

references: https://pypi.org/project/pytesseract/ (INSTALLATION section) and https://github.com/tesseract-ocr/tesseract/wiki#installation
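Before touching any Python settings, it can help to confirm that the binary itself runs. The sketch below is an illustration only (the function name tesseract_version is made up for this example). It calls tesseract --version through subprocess and returns None when the executable cannot be started, which is the same situation that leads pytesseract to raise TesseractNotFoundError.

```python
import subprocess


def tesseract_version(cmd="tesseract"):
    """Return the first line of `<cmd> --version`, or None if cmd cannot be run."""
    try:
        completed = subprocess.run(
            [cmd, "--version"],
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError:
        # Raised when the executable is missing or is not on PATH.
        return None
    # Check both streams, since the version text may arrive on either one.
    output = (completed.stdout or completed.stderr).strip()
    return output.splitlines()[0] if output else None


if __name__ == "__main__":
    print(tesseract_version())
```

A None result means the installation or the PATH needs fixing first. A version string means the binary is reachable and the problem lies elsewhere.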


For Windows Only

1 - You need to have Tesseract OCR installed on your computer.

get it from here. https://github.com/UB-Mannheim/tesseract/wiki

Download the suitable version.

2 - Add Tesseract path to your System Environment. i.e. Edit system variables.

3 - Run pip install pytesseract and pip install tesseract

4 - Add this line to your python script every time

pytesseract.pytesseract.tesseract_cmd = 'C:/OCR/Tesseract-OCR/tesseract.exe'  # your path may be different

5 - Run the code.


From https://pypi.org/project/pytesseract/ :

pytesseract.pytesseract.tesseract_cmd = '<full_path_to_your_tesseract_executable>'
# Include the above line, if you don't have tesseract executable in your PATH
# Example tesseract_cmd: 'C:\\Program Files (x86)\\Tesseract-OCR\\tesseract'

In windows:

pip install tesseract

pip install tesseract-ocr

Then check the pytesseract.py file stored on your system under usr/appdata/local/programs/site-packages/python/python36/lib/pytesseract/ and compile the file.


This error is because tesseract is not installed on your computer.

If you are using Ubuntu install tesseract using following command:

sudo apt-get install tesseract-ocr

For mac:

brew install tesseract

On Mac, you can install it like shown below. This works for me.

brew install tesseract

You can install this package: https://github.com/UB-Mannheim/tesseract/wiki. After that, go to C:\Program Files (x86)\Tesseract-OCR\tesseract.exe and run the tesseract file. I think this will help you.


On Windows 64 bits, just add the following to the PATH environment variable: "C:\Program Files\Tesseract-OCR" and it will work.
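If editing the system PATH is not possible, the same effect can be approximated for a single Python process. This is a sketch under that assumption, with a hypothetical helper name prepend_to_path. It changes PATH only for the running interpreter and any child processes it starts, not for the whole system.

```python
import os


def prepend_to_path(directory, environ=os.environ):
    """Put `directory` at the front of PATH for this process and its children."""
    parts = environ.get("PATH", "").split(os.pathsep)
    if directory not in parts:
        # Drop empty entries so an unset PATH does not leave a stray separator.
        environ["PATH"] = os.pathsep.join([directory] + [p for p in parts if p])
    return environ["PATH"]


if __name__ == "__main__":
    prepend_to_path(r"C:\Program Files\Tesseract-OCR")
    print(os.environ["PATH"].split(os.pathsep)[0])
```

A permanent change made through the Windows environment variable dialog is only seen by programs started afterwards, so an IDE that was already open needs to be restarted to pick it up.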


I solved it by updating the tesseract_cmd variable in the pytesseract.py file with the bin/tesseract path.


I had the same issue on Windows. I tried to update the environment variables for the path of tesseract which did not work.

What worked for me was to modify pytesseract.py, which can be found at C:\Program Files\Python37\Lib\site-packages\pytesseract or, usually, somewhere under C:\Users\YOUR USER\APPDATA\Python

I changed one line as shown below:

#tesseract_cmd = 'tesseract'
tesseract_cmd = 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'

Note that I had to put an extra \ before tesseract because Python was interpreting \t as a tab character. Without it you get the error message below:

pytesseract.pytesseract.TesseractNotFoundError: C:\Program Files\Tesseract-OCR esseract.exe is not installed or it's not in your path
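The missing "t" in that message comes from Python's string escaping, not from Tesseract. As a small illustration (the variable names are only for this example), the snippet below compares a literal in which \t becomes a tab with three spellings that keep the path intact.

```python
# In a normal string literal "\t" is a tab character, so this path is corrupted.
broken = "C:\\Program Files\\Tesseract-OCR\tesseract.exe"

# Three spellings of the intended path.
raw = r"C:\Program Files\Tesseract-OCR\tesseract.exe"        # raw string
escaped = "C:\\Program Files\\Tesseract-OCR\\tesseract.exe"  # doubled backslashes
forward = "C:/Program Files/Tesseract-OCR/tesseract.exe"     # forward slashes

print("\t" in broken)   # True
print(raw == escaped)   # True
print(repr(broken))     # shows the \t escape where "\tesseract" was intended
```

Any of the three working spellings can be assigned to tesseract_cmd. The raw string is usually the easiest to read when a path is copied from Windows Explorer.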


For Windows users only:

Install the pytesseract wrapper using:

pip install pytesseract

and then add this line to your code, mind the "\\"

pytesseract.pytesseract.tesseract_cmd = "C:\\Program Files (x86)\\Tesseract-OCR\\tesseract.exe"

You also need to install Tesseract itself.

https://github.com/tesseract-ocr/tesseract/wiki

Check out the above documentation on the installation.


On Windows, tesseract_cmd must point to the executable of a default Windows Tesseract installation.

  1. On a 32-bit system, add this line after the import statements.
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files (x86)\Tesseract-OCR\tesseract.exe'
  2. On a 64-bit system, add this line instead.
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

Step 1:

Install tesseract on your system as per the OS. Latest installers can be found at https://github.com/UB-Mannheim/tesseract/wiki

Step 2: Install the following dependency libraries using:

pip install pytesseract
pip install opencv-python
pip install numpy

Step 3: Sample code

import cv2
import numpy as np
import pytesseract
from PIL import Image
from pytesseract import image_to_string

# Path of the working folder on disk; replace with your working folder
src_path = "C:\\Users\\<user>\\PycharmProjects\\ImageToText\\input\\"
# If you don't have the tesseract executable in your PATH, include the following:
pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract'
# Note: this is only a Python variable; Tesseract reads TESSDATA_PREFIX from the environment, not from here
TESSDATA_PREFIX = 'C:/Program Files (x86)/Tesseract-OCR'

def get_string(img_path):
    # Read image with opencv
    img = cv2.imread(img_path)

    # Convert to gray
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Apply dilation and erosion to remove some noise
    kernel = np.ones((1, 1), np.uint8)
    img = cv2.dilate(img, kernel, iterations=1)
    img = cv2.erode(img, kernel, iterations=1)

    # Write the image after noise removal
    cv2.imwrite(src_path + "removed_noise.png", img)

    #  Apply threshold to get image with only black and white
    #img = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 31, 2)

    # Write the image after applying opencv to do some ...
    cv2.imwrite(src_path + "thres.png", img)

    # Recognize text with tesseract for python
    result = pytesseract.image_to_string(Image.open(src_path + "thres.png"))

    # Remove template file
    #os.remove(temp)

    return result


print('--- Start recognize text from image ---')
print(get_string(src_path + "image.png") )

print("------ Done -------")

Perhaps this is happening because, even if Tesseract is correctly installed, you have not installed your language, as was my case. Fortunately this is very easy to fix, and I did not even need to mess with tesseract_cmd.

sudo apt-get install tesseract-ocr -y
sudo apt-get install tesseract-ocr-spa -y
tesseract --list-langs

Note that in the second line we have specified -spa for Spanish.

If installation has been successful, you should get a list of your available languages, like:

List of available languages (3):
eng
osd
spa
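The same check can be scripted so that a missing language is spotted before OCR is attempted. This is a rough sketch with made-up helper names (parse_langs, installed_langs). It assumes the output has the shape shown above, a header line followed by one language code per line.

```python
import subprocess


def parse_langs(output):
    """Extract language codes from the text printed by `tesseract --list-langs`."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    # Skip the header line, e.g. "List of available languages (3):".
    return [line for line in lines if not line.lower().startswith("list of")]


def installed_langs(cmd="tesseract"):
    """Run tesseract and return its language codes, or [] if it cannot be run."""
    try:
        done = subprocess.run([cmd, "--list-langs"], capture_output=True, text=True)
    except OSError:
        return []
    # Use whichever stream actually carries the listing.
    return parse_langs(done.stdout or done.stderr)


sample = "List of available languages (3):\neng\nosd\nspa\n"
print(parse_langs(sample))  # ['eng', 'osd', 'spa']
```

A script could then test, for example, whether "spa" is in installed_langs() before passing lang="spa" to image_to_string.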

I found this at this blog post (Spanish). There is also a post for installation of Spanish language in Windows (not as easy apparently).

Note: since the question uses lang = 'eng', it is likely this is not the answer in that specific case. But the same error may happen in this other situation, which is why I posted the answer here.


Solution for Ubuntu that worked for me:

I installed Tesseract on Ubuntu by following the link below

https://medium.com/quantrium-tech/installing-tesseract-4-on-ubuntu-18-04-b6fcd0cbd78f

Later I added the traineddata language files to tessdata by following the link below

Tesseract running error


There are already many nice answers to this problem, but I would like to share a site that I came across when I couldn't solve the "TesseractNotFound Error: tesseract is not installed or it's not in your path" error. Please refer to this site: https://www.thetopsites.net/article/50655738.shtml

I realised that I got this error because I installed pytesseract with pip but forgot to install the binary. You are probably missing tesseract-ocr from your machine. Check the installation instructions here: https://github.com/tesseract-ocr/tesseract/wiki

On a Mac, you can just install using homebrew:

brew install tesseract

It should run fine after that!
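Since the pip package and the binary are two separate installs, a quick check can tell which one is missing. The following is a sketch only: the function name diagnose and its messages are invented for illustration, and it relies on nothing beyond the standard library.

```python
import importlib.util
import shutil


def diagnose(binary="tesseract", wrapper="pytesseract"):
    """Report which of the two separate pieces is missing."""
    # The Python wrapper is a normal importable package installed with pip.
    has_wrapper = importlib.util.find_spec(wrapper) is not None
    # The OCR engine is an external program that must be reachable via PATH.
    has_binary = shutil.which(binary) is not None
    if not has_wrapper:
        return "python wrapper missing: pip install pytesseract"
    if not has_binary:
        return "tesseract binary missing or not on PATH"
    return "both found"


if __name__ == "__main__":
    print(diagnose())
```

The second message corresponds to the error in the question. In that case either install the binary or point tesseract_cmd at it.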

Under Windows 10 OS environment, the following method works for me:

  1. Go to this link and Download tesseract and install it. Windows version is available here: https://github.com/UB-Mannheim/tesseract/wiki

  2. Find script file pytesseract.py from C:\Users\User\Anaconda3\Lib\site-packages\pytesseract and open it. Change the following code from tesseract_cmd = 'tesseract' to: tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract.exe' (This is the path where you install Tesseract-OCR so please check where you install it and accordingly update the path)

  3. You may also need to add C:/Program Files (x86)/Tesseract-OCR/ to the PATH environment variable

Hope it works for you!
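Editing pytesseract.py inside site-packages works, but the change is lost whenever the package is reinstalled or upgraded. One possible alternative, sketched here under assumptions, is to keep the path outside the code and read it at runtime. The variable name TESSERACT_CMD and the helper resolve_tesseract_cmd are hypothetical and chosen only for this example.

```python
import os


def resolve_tesseract_cmd(environ=os.environ, default="tesseract"):
    """Pick the executable path from an environment variable if it is set."""
    # TESSERACT_CMD is a hypothetical name used by this sketch only;
    # the script has to read it itself, as done here.
    return environ.get("TESSERACT_CMD") or default


if __name__ == "__main__":
    print(resolve_tesseract_cmd())
```

The result can be assigned in your own script with pytesseract.pytesseract.tesseract_cmd = resolve_tesseract_cmd(), which leaves the installed package untouched.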


For Ubuntu 18.04

If you are getting an error like

 tesseract is not installed or it's not in your path

 and 

 OSError: [Errno 12] Cannot allocate memory

That might be an issue with swap memory allocation.

You can check this answer on allocating more swap memory. Hope that helps :)

https://askubuntu.com/questions/920595/fallocate-fallocate-failed-text-file-busy-in-ubuntu-17-04?answertab=active#tab-top


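Use the following command to install the pytesseract wrapper (the Tesseract binary itself must be installed separately, as described in the answers above)

pip install pytesseract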


# {Windows 10 instructions}
# Before you use the script you need to install the dependencies
# 1. Download tesseract from the official link:
#   https://github.com/UB-Mannheim/tesseract/wiki
# 2. Install tesseract
#   I chose this path
#       *replace the user string in the path below with the user name you are using on your current machine
#       C:\Users\user\AppData\Local\Tesseract-OCR\
# 3. Install pillow for your Python version
# * the best way for me is to install it this way (I'm using Python 3.7 and in my CMD I run this version of Python by typing py -3.7):
# * if you are using another version of Python, first check how you start Python from your CMD
# * on some machines the way to run Python from the CMD is different
    # [examples]
    # =================================
    # PYTHON VERSION 3.7
    # python
    # python3.7
    # python -3.7
    # python 3.7
    # python3
    # python -3
    # python 3
    # py3.7
    # py -3.7
    # py 3.7
    # py3
    # py -3
    # py 3
    # PYTHON VERSION 3.6
    # python
    # python3.6
    # python -3.6
    # python 3.6
    # python3
    # python -3
    # python 3
    # py3.6
    # py -3.6
    # py 3.6
    # py3
    # py -3
    # py 3
    # PYTHON VERSION 2.7
    # python
    # python2.7
    # python -2.7
    # python 2.7
    # python2
    # python -2
    # python 2
    # py2.7
    # py -2.7
    # py 2.7
    # py2
    # py -2
    # py 2
    # ================================
# we are using pip to install the dependencies
# because I start Python 3.7 with the following line
    # py -3.7
# open the CMD on your Windows machine and type the following line:
    # py -3.7 -m pip install pillow
# 4. Install pytesseract and tesseract for your Python version
# * the best way for me is to install it this way (I'm using Python 3.7 and in my CMD I run this version of Python by typing py -3.7):
# we are using pip to install the dependencies
# open the CMD on your Windows machine and type the following lines:
    # py -3.7 -m pip install pytesseract
    # py -3.7 -m pip install tesseract


#!/usr/bin/python
from PIL import Image
import pytesseract
import os
import getpass

def extract_text_from_image(image_file_name_arg):

    # IMPORTANT
    # This path assumes you followed the installation instructions in the text above
    # (it is the path on my machine).
    # If you don't put the right path for tesseract.exe the script will not work
    username = getpass.getuser()
    # the line above gets the username for your machine automatically
    tesseract_exe_path_installation="C:\\Users\\"+username+"\\AppData\\Local\\Tesseract-OCR\\tesseract.exe"
    pytesseract.pytesseract.tesseract_cmd=tesseract_exe_path_installation

    # specify the directory of your image files manually, or use the line below if the images are in the "images" folder in the script directory
    # image_dir="D:\\GIT\\ai_example\\extract_text_from_image\\images"
    image_dir=os.getcwd()+"\\images"
    dir_seperator="\\"
    image_file_name=image_file_name_arg
    # if your image are in different format change the extension(ex. ".png")
    image_ext=".jpg"
    image_path_dir=image_dir+dir_seperator+image_file_name+image_ext

    print("=============================================================================")
    print("image used is in the following path dir:")
    print("\t"+image_path_dir)
    print("=============================================================================")

    img=Image.open(image_path_dir)
    text=pytesseract.image_to_string(img, lang="eng")
    print(text)

# change the name "image_1" to the name of your image, without the extension
# image_file_name_arg="image_1"
image_file_name_arg="image_2"
# image_file_name_arg="image_3"
# image_file_name_arg="image_4"
# image_file_name_arg="image_5"
extract_text_from_image(image_file_name_arg)

# ==================================
# CREATED BY: SHERIFI
# e-mail: sherif_co@yahoo.com
# git-link for script: https://github.com/sherifi/ai_example.git
# ==================================
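The script above builds its paths by concatenating strings with "\\". As a possible variation, the same paths can be assembled with pathlib, which handles the separators itself. This is a sketch, not the author's code: the helper names user_tesseract_path and image_path are invented, and the per-user install location is the one assumed by the script.

```python
from pathlib import Path


def user_tesseract_path(home=None):
    """Per-user install location assumed by the script above."""
    home = Path(home) if home is not None else Path.home()
    return home / "AppData" / "Local" / "Tesseract-OCR" / "tesseract.exe"


def image_path(name, image_dir=None, ext=".jpg"):
    """Build <image_dir>/<name><ext>; defaults to ./images like the script."""
    base = Path(image_dir) if image_dir is not None else Path.cwd() / "images"
    return base / f"{name}{ext}"


if __name__ == "__main__":
    print(user_tesseract_path())
    print(image_path("image_2"))
```

Image.open accepts a Path object directly. Wrapping the tesseract path in str() before assigning it to tesseract_cmd is a cautious choice in case a plain string is expected.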

There appears to be an issue with pytesseract 0.3.7, the latest version on pip at the time of writing. I downgraded to pytesseract 0.3.6 and no longer see the error.
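To see which version is actually installed before deciding whether to downgrade, the standard library can report it. The helper name installed_version below is illustrative, and the sketch assumes Python 3.8 or newer, where importlib.metadata is available.

```python
from importlib import metadata


def installed_version(package="pytesseract"):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(installed_version())
```

A specific release can then be pinned with pip install pytesseract==0.3.6 (note the double equals sign).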

