[python] Python Web Crawlers and "getting" html source code

Use Python 2.7, is has more 3rd party libs at the moment. (Edit: see below).

I recommend you using the stdlib module urllib2, it will allow you to comfortably get web resources. Example:

import urllib2

response = urllib2.urlopen("http://google.de")
page_source = response.read()

For parsing the code, have a look at BeautifulSoup.

BTW: what exactly do you want to do:

Just for background, I need to download a page and replace any img with ones I have

Edit: It's 2014 now, most of the important libraries have been ported, and you should definitely use Python 3 if you can. python-requests is a very nice high-level library which is easier to use than urllib2.

Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to get

Getting "TypeError: failed to fetch" when the request hasn't actually failed java, get set methods For Restful API, can GET method use json data? Swift GET request with parameters Sending a JSON to server and retrieving a JSON in return, without JQuery Retrofit and GET using parameters Correct way to pass multiple values for same parameter name in GET request How to download HTTP directory with all files and sub-directories as they appear on the online files/folders list? Curl and PHP - how can I pass a json through curl by PUT,POST,GET Making href (anchor tag) request POST instead of GET?

Examples related to web-crawler

TypeError: can't use a string pattern on a bytes-like object in re.findall() Finding the layers and layer sizes for each Docker image Sending "User-agent" using Requests library in Python How to find sitemap.xml path on websites? How to request Google to re-crawl my website? python: [Errno 10054] An existing connection was forcibly closed by the remote host How to get a web page's source code from Java Python: maximum recursion depth exceeded while calling a Python object Python Web Crawlers and "getting" html source code How do I make a simple crawler in PHP?