[python] How can I take a screenshot/image of a website using Python?

What I want to achieve is to get a website screenshot from any website in python.

Env: Linux

Tags: python, screenshot, webpage, backend

The answer is


You don't mention what environment you're running in, which makes a big difference because there isn't a pure Python web browser that's capable of rendering HTML.

But if you're using a Mac, I've used webkit2png with great success. If not, as others have pointed out there are plenty of options.


import subprocess

def screenshots(url, name):
    subprocess.run('webkit2png -F -o {} {} -D ./screens'.format(name, url), 
      shell=True)
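
A hypothetical call might look like the following (assuming webkit2png is installed and the ./screens directory exists):

screenshots('http://www.example.com', 'example')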

Here is my solution, put together with help from various sources. It takes a full web page screen capture, optionally crops it, and also generates a thumbnail from the cropped image.

Requirements:

  1. Install NodeJS
  2. Using Node's package manager, install phantomjs: npm -g install phantomjs
  3. Install selenium (in your virtualenv, if you are using one)
  4. Install ImageMagick
  5. Add phantomjs to the system path (on Windows)

import os
from subprocess import Popen, PIPE
from selenium import webdriver

abspath = lambda *p: os.path.abspath(os.path.join(*p))
ROOT = abspath(os.path.dirname(__file__))


def execute_command(command):
    result = Popen(command, shell=True, stdout=PIPE).stdout.read()
    if len(result) > 0 and not result.isspace():
        raise Exception(result)


def do_screen_capturing(url, screen_path, width, height):
    print("Capturing screen..")
    driver = webdriver.PhantomJS()
    # By default the PhantomJS service log file is saved in the current directory.
    # To store the log file elsewhere, initialize the driver as:
    # driver = webdriver.PhantomJS(service_log_path='/var/log/phantomjs/ghostdriver.log')
    driver.set_script_timeout(30)
    if width and height:
        driver.set_window_size(width, height)
    driver.get(url)
    driver.save_screenshot(screen_path)


def do_crop(params):
    print("Cropping captured image..")
    command = [
        'convert',
        params['screen_path'],
        '-crop', '%sx%s+0+0' % (params['width'], params['height']),
        params['crop_path']
    ]
    execute_command(' '.join(command))


def do_thumbnail(params):
    print("Generating thumbnail from cropped captured image..")
    command = [
        'convert',
        params['crop_path'],
        '-filter', 'Lanczos',
        '-thumbnail', '%sx%s' % (params['width'], params['height']),
        params['thumbnail_path']
    ]
    execute_command(' '.join(command))


def get_screen_shot(**kwargs):
    url = kwargs['url']
    width = int(kwargs.get('width', 1024)) # screen width to capture
    height = int(kwargs.get('height', 768)) # screen height to capture
    filename = kwargs.get('filename', 'screen.png') # file name e.g. screen.png
    path = kwargs.get('path', ROOT) # directory path to store screen

    crop = kwargs.get('crop', False) # crop the captured screen
    crop_width = int(kwargs.get('crop_width', width)) # the width of crop screen
    crop_height = int(kwargs.get('crop_height', height)) # the height of crop screen
    crop_replace = kwargs.get('crop_replace', False) # does crop image replace original screen capture?

    thumbnail = kwargs.get('thumbnail', False) # generate thumbnail from screen, requires crop=True
    thumbnail_width = int(kwargs.get('thumbnail_width', width)) # the width of thumbnail
    thumbnail_height = int(kwargs.get('thumbnail_height', height)) # the height of thumbnail
    thumbnail_replace = kwargs.get('thumbnail_replace', False) # does thumbnail image replace crop image?

    screen_path = abspath(path, filename)
    crop_path = thumbnail_path = screen_path

    if thumbnail and not crop:
        raise Exception('Thumbnail generation requires a cropped image; set crop=True')

    do_screen_capturing(url, screen_path, width, height)

    if crop:
        if not crop_replace:
            crop_path = abspath(path, 'crop_'+filename)
        params = {
            'width': crop_width, 'height': crop_height,
            'crop_path': crop_path, 'screen_path': screen_path}
        do_crop(params)

        if thumbnail:
            if not thumbnail_replace:
                thumbnail_path = abspath(path, 'thumbnail_'+filename)
            params = {
                'width': thumbnail_width, 'height': thumbnail_height,
                'thumbnail_path': thumbnail_path, 'crop_path': crop_path}
            do_thumbnail(params)
    return screen_path, crop_path, thumbnail_path


if __name__ == '__main__':
    '''
        Requirements:
        Install NodeJS
        Using Node's package manager install phantomjs: npm -g install phantomjs
        install selenium (in your virtualenv, if you are using that)
        install imageMagick
        add phantomjs to system path (on windows)
    '''

    url = 'http://stackoverflow.com/questions/1197172/how-can-i-take-a-screenshot-image-of-a-website-using-python'
    screen_path, crop_path, thumbnail_path = get_screen_shot(
        url=url, filename='sof.png',
        crop=True, crop_replace=False,
        thumbnail=True, thumbnail_replace=False,
        thumbnail_width=200, thumbnail_height=150,
    )

These are the generated images (not reproduced here).


You can do this using Selenium:

from selenium import webdriver

DRIVER = 'chromedriver'
driver = webdriver.Chrome(DRIVER)
driver.get('https://www.spotify.com')
screenshot = driver.save_screenshot('my_screenshot.png')
driver.quit()

https://sites.google.com/a/chromium.org/chromedriver/getting-started
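
Note that newer Selenium 4 releases no longer accept the driver path as a positional argument. A minimal sketch of the equivalent call, assuming a recent Selenium (4.6+) that can locate chromedriver itself via Selenium Manager:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument('--headless')  # render without opening a visible browser window
driver = webdriver.Chrome(options=options)  # Selenium Manager resolves chromedriver automatically
driver.get('https://www.spotify.com')
driver.save_screenshot('my_screenshot.png')
driver.quit()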


I created a library called pywebcapture that wraps Selenium and does just that:

pip install pywebcapture

Once you install it with pip, you can do the following to easily get full-size screenshots:

# import modules
from pywebcapture import loader, driver

# load csv with urls
csv_file = loader.CSVLoader("csv_file_with_urls.csv", has_header_bool, url_column, optional_filename_column)
uri_dict = csv_file.get_uri_dict()

# create instance of the driver and run
d = driver.Driver("path/to/webdriver/", output_filepath, delay, uri_dict)
d.run()

Enjoy!

https://pypi.org/project/pywebcapture/


Here is a simple solution using webkit: http://webscraping.com/blog/Webpage-screenshots-with-webkit/

import sys
import time
from PyQt4.QtCore import *
from PyQt4.QtGui import *
from PyQt4.QtWebKit import *

class Screenshot(QWebView):
    def __init__(self):
        self.app = QApplication(sys.argv)
        QWebView.__init__(self)
        self._loaded = False
        self.loadFinished.connect(self._loadFinished)

    def capture(self, url, output_file):
        self.load(QUrl(url))
        self.wait_load()
        # set to webpage size
        frame = self.page().mainFrame()
        self.page().setViewportSize(frame.contentsSize())
        # render image
        image = QImage(self.page().viewportSize(), QImage.Format_ARGB32)
        painter = QPainter(image)
        frame.render(painter)
        painter.end()
        print 'saving', output_file
        image.save(output_file)

    def wait_load(self, delay=0):
        # process app events until page loaded
        while not self._loaded:
            self.app.processEvents()
            time.sleep(delay)
        self._loaded = False

    def _loadFinished(self, result):
        self._loaded = True

s = Screenshot()
s.capture('http://webscraping.com', 'website.png')
s.capture('http://webscraping.com/blog', 'blog.png')

This uses the web service s-shot.ru (so it is not that fast), but it is quite easy to configure what you need through the link, and you can easily capture full-page screenshots.

import requests
import urllib.parse

BASE = 'https://mini.s-shot.ru/1024x0/JPEG/1024/Z100/?'  # you can modify the size, format and zoom
url = 'https://stackoverflow.com/'  # or whatever link you need
url = urllib.parse.quote_plus(url)  # the service needs the link to be URL-encoded
print(url)

path = 'target1.jpg'
response = requests.get(BASE + url, stream=True)

if response.status_code == 200:
    with open(path, 'wb') as file:
        for chunk in response:
            file.write(chunk)

You can use the Google PageSpeed API to achieve your task easily. In my current project, I have used a Google PageSpeed API query written in Python to capture a screenshot of any web URL provided and save it to a location. Have a look.

import urllib2
import json
import base64
import sys
import requests
import os
import errno

#   The website's URL as an Input
site = sys.argv[1]
imagePath = sys.argv[2]

#   The Google API.  Remove "&strategy=mobile" for a desktop screenshot
api = "https://www.googleapis.com/pagespeedonline/v1/runPagespeed?screenshot=true&strategy=mobile&url=" + urllib2.quote(site)

#   Get the results from Google
try:
    site_data = json.load(urllib2.urlopen(api))
except urllib2.URLError:
    print "Unable to retrieve data"
    sys.exit()

try:
    screenshot_encoded =  site_data['screenshot']['data']
except ValueError:
    print "Invalid JSON encountered."
    sys.exit()

#   Google has a weird way of encoding the Base64 data
screenshot_encoded = screenshot_encoded.replace("_", "/")
screenshot_encoded = screenshot_encoded.replace("-", "+")

#   Decode the Base64 data
screenshot_decoded = base64.b64decode(screenshot_encoded)

if not os.path.exists(os.path.dirname(imagePath)):
    try:
        os.makedirs(os.path.dirname(imagePath))
    except OSError as exc:
        if exc.errno != errno.EEXIST:
            raise

#   Save the file
with open(imagePath, 'wb') as file_:
    file_.write(screenshot_decoded)

Unfortunately, the following are the drawbacks. If these do not matter to you, you can proceed with the Google PageSpeed API; it works well.

  • The maximum width is 320px
  • According to Google API Quota, there is a limit of 25,000 requests per day

This is an old question and most answers are a bit dated. Currently, I would do one of two things.

1. Create a program that takes the screenshots

I would use Pyppeteer to take screenshots of websites. It is based on the Puppeteer package. Puppeteer spins up a headless Chrome browser, so the screenshots will look exactly like they would in a normal browser.

This is taken from the pyppeteer documentation:

import asyncio
from pyppeteer import launch

async def main():
    browser = await launch()
    page = await browser.newPage()
    await page.goto('https://example.com')
    await page.screenshot({'path': 'example.png'})
    await browser.close()

asyncio.get_event_loop().run_until_complete(main())
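
Puppeteer's screenshot call accepts a fullPage option, and pyppeteer mirrors that API, so a small variation of the snippet above should capture the entire scrollable page rather than just the viewport:

import asyncio
from pyppeteer import launch

async def main():
    browser = await launch()
    page = await browser.newPage()
    await page.goto('https://example.com')
    # 'fullPage' asks for the whole scrollable page, not just the current viewport
    await page.screenshot({'path': 'example_full.png', 'fullPage': True})
    await browser.close()

asyncio.get_event_loop().run_until_complete(main())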

2. Use a screenshot API

You could also use a screenshot API such as this one. The nice thing is that you don't have to set everything up yourself but can simply call an API endpoint.

This is taken from the screenshot API's documentation:

import urllib.parse
import urllib.request
import ssl

ssl._create_default_https_context = ssl._create_unverified_context

# The parameters.
token = "YOUR_API_TOKEN"
url = urllib.parse.quote_plus("https://example.com")
width = 1920
height = 1080
output = "image"

# Create the query URL.
query = "https://screenshotapi.net/api/v1/screenshot"
query += "?token=%s&url=%s&width=%d&height=%d&output=%s" % (token, url, width, height, output)

# Call the API.
urllib.request.urlretrieve(query, "./example.png")

Try this (note that it periodically captures the desktop root window via GTK, rather than rendering a given URL):

#!/usr/bin/env python

import gtk.gdk

import time

import random

while True:
    # generate a random time between 120 and 300 sec
    random_time = random.randrange(120,300)

    # wait between 120 and 300 seconds (or between 2 and 5 minutes)
    print "Next picture in: %.2f minutes" % (float(random_time) / 60)

    time.sleep(random_time)

    w = gtk.gdk.get_default_root_window()
    sz = w.get_size()

    print "The size of the window is %d x %d" % sz

    pb = gtk.gdk.Pixbuf(gtk.gdk.COLORSPACE_RGB,False,8,sz[0],sz[1])
    pb = pb.get_from_drawable(w,w.get_colormap(),0,0,0,0,sz[0],sz[1])

    ts = time.time()
    filename = "screenshot"
    filename += str(ts)
    filename += ".png"

    if pb is not None:
        pb.save(filename,"png")
        print "Screenshot saved to "+filename
    else:
        print "Unable to get the screenshot."

11 years later...
Taking a website screenshot using Python 3.6+ and the Google PageSpeed Insights API v5:

import base64
import requests
import traceback
import urllib.parse as ul

# It's possible to make requests without the api key, but the number of requests is very limited  

url = "https://duckgo.com"
urle = ul.quote_plus(url)
image_path = "duckgo.jpg"

key = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
strategy = "desktop" # "mobile"
u = f"https://www.googleapis.com/pagespeedonline/v5/runPagespeed?key={key}&strategy={strategy}&url={urle}"

try:
    j = requests.get(u).json()
    ss_encoded = j['lighthouseResult']['audits']['final-screenshot']['details']['data'].replace("data:image/jpeg;base64,", "")
    ss_decoded = base64.b64decode(ss_encoded)
    with open(image_path, 'wb+') as f:
        f.write(ss_decoded) 
except Exception:
    print(traceback.format_exc())
    exit(1)

Notes:

  • Live Demo
  • Pros: Free
  • Cons: Low resolution
  • Get API Key
  • Docs
  • Limits:
    • Queries per day = 25,000
    • Queries per 100 seconds = 400

I can't comment on ars's answer, but I actually got Roland Tapken's code running using QtWebkit and it works quite well.

Just wanted to confirm that what Roland posts on his blog works great on Ubuntu. Our production version ended up not using any of what he wrote but we are using the PyQt/QtWebKit bindings with much success.

Note: The URL used to be: http://www.blogs.uni-osnabrueck.de/rotapken/2008/12/03/create-screenshots-of-a-web-page-using-python-and-qtwebkit/ I've updated it with a working copy.


Using Rendertron is an option. Under the hood, this is a headless Chrome exposing the following endpoints:

  • /render/:url: Access this route e.g. with requests.get if you are interested in the DOM.
  • /screenshot/:url: Access this route if you are interested in a screenshot.

You would install Rendertron with npm, run it in one terminal, access http://localhost:3000/screenshot/:url, and save the file. However, a demo is available at render-tron.appspot.com, which makes it possible to run the following Python 3 snippet locally without installing the npm package:

import requests

BASE = 'https://render-tron.appspot.com/screenshot/'
url = 'https://google.com'
path = 'target.jpg'
response = requests.get(BASE + url, stream=True)
# save file, see https://stackoverflow.com/a/13137873/7665691
if response.status_code == 200:
    with open(path, 'wb') as file:
        for chunk in response:
            file.write(chunk)
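
If you run Rendertron locally with npm as described above, the same snippet should work by pointing BASE at the local endpoint (assuming the default port 3000):

import requests

BASE = 'http://localhost:3000/screenshot/'  # local Rendertron instance, default port
url = 'https://google.com'
path = 'target.jpg'
response = requests.get(BASE + url, stream=True)
if response.status_code == 200:
    with open(path, 'wb') as file:
        for chunk in response:
            file.write(chunk)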
