[python] Scraping: SSL: CERTIFICATE_VERIFY_FAILED error for http://en.wikipedia.org

I'm practicing the code from 'Web Scraping with Python', and I keep having this certificate problem:

from urllib.request import urlopen 
from bs4 import BeautifulSoup 
import re

pages = set()
def getLinks(pageUrl):
    global pages
    html = urlopen("http://en.wikipedia.org"+pageUrl)
    bsObj = BeautifulSoup(html)
    for link in bsObj.findAll("a", href=re.compile("^(/wiki/)")):
        if 'href' in link.attrs:
            if link.attrs['href'] not in pages:
                #We have encountered a new page
                newPage = link.attrs['href'] 
                print(newPage) 
                pages.add(newPage) 
                getLinks(newPage)
getLinks("")

The error is:

  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 1319, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1049)>

Btw,I was also practicing scrapy, but kept getting the problem: command not found: scrapy (I tried all sorts of solutions online but none works... really frustrating)

The answer is


I'm a relative novice compared to all the experts on Stack Overflow.

I have 2 versions of jupyter notebook running (one through a fresh Anaconda Navigator installation and one through ????). I think this is because Anaconda was installed as a local installation on my Mac (per Anaconda instructions).

I already had python 3.7 installed. After that, I used my terminal to open jupyter notebook and I think that it put another version globally onto my Mac.

However, I'm not sure because I'm just learning through trial and error!

I did the terminal command:

conda install -c anaconda certifi 

(as directed above, but it didn't work.)

My python 3.7 is installed on OS Catalina10.15.3 in:

  • /Library/Python/3.7/site-packages AND
  • ~/Library/Python/3.7/lib/python/site-packages

The certificate is at:

  • ~/Library/Python/3.7/lib/python/site-packages/certifi-2019.11.28.dist-info

I tried to find the Install Certificate.command ... but couldn't find it through looking through the file structures...not in Applications...not in links above.

I finally installed it by finding it through Spotlight (as someone suggested above). And it double clicked automatically and installed ANOTHER certificate in the same folder as:

  • ~/Library/Python/3.7/lib/python/site-packages/

NONE of the above solved anything for me...I still got the same error.

So, I solved the problem by:

  1. closing my jupyter notebook.
  2. opening Anaconda Navigator.
  3. opening jupyter notebook through the Navigator GUI (instead of through Terminal).
  4. opening my notebook and running the code.

I can't tell you why this worked. But it solved the problem for me.

I just want to save someone the hassle next time. If someone can tell my why it worked, that would be terrific.

I didn't try the other terminal commands because of the 2 versions of jupyter notebook that I knew were a problem. I just don't know how to fix that.


Use requests library. Try this solution, or just add https:// before the URL:

import requests
from bs4 import BeautifulSoup
import re

pages = set()
def getLinks(pageUrl):
    global pages
    html = requests.get("http://en.wikipedia.org"+pageUrl, verify=False).text
    bsObj = BeautifulSoup(html)
    for link in bsObj.findAll("a", href=re.compile("^(/wiki/)")):
        if 'href' in link.attrs:
            if link.attrs['href'] not in pages:
                #We have encountered a new page
                newPage = link.attrs['href']
                print(newPage)
                pages.add(newPage)
                getLinks(newPage)
getLinks("")

Check if this works for you


If you're running on a Mac you could just search for Install Certificates.command on the spotlight and hit enter.


For anyone who is using anaconda, you would install the certifi package, see more at:

https://anaconda.org/anaconda/certifi

To install, type this line in your terminal:

conda install -c anaconda certifi

I could find this solution and is working fine:

cd /Applications/Python\ 3.7/
./Install\ Certificates.command

To solve this:

All you need to do is to install Python certificates! A common issue on macOS.

Open these files:

Install Certificates.command
Update Shell Profile.command

Simply Run these two scripts and you wont have this issue any more.

Hope this helps!


Take a look at this post, it seems like for later versions of Python, certificates are not pre installed which seems to cause this error. You should be able to run the following command to install the certifi package: /Applications/Python\ 3.6/Install\ Certificates.command

Post 1: urllib and "SSL: CERTIFICATE_VERIFY_FAILED" Error

Post 2: Airbrake error: urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate


Two steps worked for me : - going Macintosh HD > Applications > Python3.7 folder - click on "Install Certificates.command"


Install Certificates.command on your mac.


This terminal command:

open /Applications/Python\ 3.7/Install\ Certificates.command

Found here: https://stackoverflow.com/a/57614113/6207266

Resolved it for me. With my config

pip install --upgrade certifi

had no impact.


For novice users, you can go in the Applications folder and expand the Python 3.7 folder. Now first run (or double click) the Install Certificates.command and then Update Shell Profile.command

enter image description here


i didn't solve the problem, sadly. but managed to make to codes work (almost all of my codes have this probelm btw) the local issuer certificate problem happens under python3.7 so i changed back to python2.7 QAQ and all that needed to change including "from urllib2 import urlopen" instead of "from urllib.request import urlopen" so sad...


I am using Debian 10 buster and try download a file with youtube-dl and get this error: sudo youtube-dl -k https://youtu.be/uscis0CnDjk

[youtube] uscis0CnDjk: Downloading webpage ERROR: Unable to download webpage: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1056)> (caused by URLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1056)')))

Certificates with python2 and python3.8 are installed correctly, but i persistent receive the same error. finally (which is not the best solution, but works for me was to eliminate the certificate check as it is given as an option in youtube-dl) whith this command sudo youtube-dl -k --no-check-certificate https://youtu.be/uscis0CnDjk


For me the problem was that I was setting REQUESTS_CA_BUNDLE in my .bash_profile

/Users/westonagreene/.bash_profile:
...
export REQUESTS_CA_BUNDLE=/usr/local/etc/openssl/cert.pem
...

Once I set REQUESTS_CA_BUNDLE to blank (i.e. removed from .bash_profile), requests worked again.

export REQUESTS_CA_BUNDLE=""

The problem only exhibited when executing python requests via a CLI (Command Line Interface). If I ran requests.get(URL, CERT) it resolved just fine.

Mac OS Catalina (10.15.6). Pyenv of 3.6.11. Error message I was getting: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1056)

My answer elsewhere: https://stackoverflow.com/a/64151964/4420657


to use unverified ssl you can add this to your code:

import ssl
ssl._create_default_https_context = ssl._create_unverified_context

This will work. Set the environment variable PYTHONHTTPSVERIFY to 0.

  • By typing linux command:
export PYTHONHTTPSVERIFY = 0

OR

  • Using in python code:
import os
os.environ["PYTHONHTTPSVERIFY"] = "0"

I had the same error and solved the problem by running the program code below:

# install_certifi.py
#
# sample script to install or update a set of default Root Certificates
# for the ssl module.  Uses the certificates provided by the certifi package:
#       https://pypi.python.org/pypi/certifi

import os
import os.path
import ssl
import stat
import subprocess
import sys

STAT_0o775 = ( stat.S_IRUSR | stat.S_IWUSR | stat.S_IXUSR
             | stat.S_IRGRP | stat.S_IWGRP | stat.S_IXGRP
             | stat.S_IROTH |                stat.S_IXOTH )


def main():
    openssl_dir, openssl_cafile = os.path.split(
        ssl.get_default_verify_paths().openssl_cafile)

    print(" -- pip install --upgrade certifi")
    subprocess.check_call([sys.executable,
        "-E", "-s", "-m", "pip", "install", "--upgrade", "certifi"])

    import certifi

    # change working directory to the default SSL directory
    os.chdir(openssl_dir)
    relpath_to_certifi_cafile = os.path.relpath(certifi.where())
    print(" -- removing any existing file or link")
    try:
        os.remove(openssl_cafile)
    except FileNotFoundError:
        pass
    print(" -- creating symlink to certifi certificate bundle")
    os.symlink(relpath_to_certifi_cafile, openssl_cafile)
    print(" -- setting permissions")
    os.chmod(openssl_cafile, STAT_0o775)
    print(" -- update complete")

if __name__ == '__main__':
    main()

Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to web-scraping

Scraping: SSL: CERTIFICATE_VERIFY_FAILED error for http://en.wikipedia.org How to print an exception in Python 3? What should I use to open a url instead of urlopen in urllib3 Use Excel VBA to click on a button in Internet Explorer, when the button has no "name" associated How to use Python requests to fake a browser visit a.k.a and generate User Agent? Scraping data from website using vba Using BeautifulSoup to extract text without tags Is it ok to scrape data from Google results? What's the best way of scraping data from a website? Use getElementById on HTMLElement instead of HTMLDocument

Examples related to beautifulsoup

Scraping: SSL: CERTIFICATE_VERIFY_FAILED error for http://en.wikipedia.org What should I use to open a url instead of urlopen in urllib3 TypeError: a bytes-like object is required, not 'str' in python and CSV UnicodeEncodeError: 'ascii' codec can't encode character at special name UnicodeEncodeError: 'charmap' codec can't encode characters bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library? Using BeautifulSoup to extract text without tags python BeautifulSoup parsing table install beautiful soup using pip Python BeautifulSoup extract text between element

Examples related to scrapy

Scraping: SSL: CERTIFICATE_VERIFY_FAILED error for http://en.wikipedia.org TypeError: Object of type 'bytes' is not JSON serializable "OSError: [Errno 1] Operation not permitted" when installing Scrapy in OSX 10.11 (El Capitan) (System Integrity Protection) pip is not able to install packages correctly: Permission denied error Can scrapy be used to scrape dynamic content from websites that are using AJAX?

Examples related to ssl-certificate

How to install OpenSSL in windows 10? Scraping: SSL: CERTIFICATE_VERIFY_FAILED error for http://en.wikipedia.org Not able to install Python packages [SSL: TLSV1_ALERT_PROTOCOL_VERSION] Letsencrypt add domain to existing certificate javax.net.ssl.SSLHandshakeException: Received fatal alert: handshake_failure bypass invalid SSL certificate in .net core How to add Certificate Authority file in CentOS 7 How to use a client certificate to authenticate and authorize in a Web API This certificate has an invalid issuer Apple Push Services iOS9 getting error “an SSL error has occurred and a secure connection to the server cannot be made”