Changing user agent on urllib2.urlopen

Question

How can I download a webpage with a user agent other than the default one on urllib2.urlopen?

User · Answer

There are two properties of urllib.URLopener(), namely:

    addheaders = [('User-Agent', 'Python-urllib/1.17'), ('Accept', '*/*')]
    version = 'Python-urllib/1.17'

To fool the website you need to change both of these values to an accepted User-Agent, e.g.:

    Chrome browser: 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.149 Safari/537.36'
    Google bot: 'Googlebot/2.1'

like this:

    import urllib

    page_extractor = urllib.URLopener()
    # Both the header list and the version attribute carry the user agent.
    page_extractor.addheaders = [('User-Agent', 'Googlebot/2.1'), ('Accept', '*/*')]
    page_extractor.version = 'Googlebot/2.1'
    page_extractor.retrieve(<url>, <file path>)

Changing just one property does not work, because the website marks it as a suspicious request.

User · Answer

For urllib you can use:

    from urllib import FancyURLopener

    class MyOpener(FancyURLopener, object):
        # FancyURLopener sends this class attribute as the User-Agent.
        version = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11'

    myopener = MyOpener()
    myopener.retrieve('https://www.google.com/search?q=test', 'useragent.html')

User · Answer

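    import urllib2

    headers = {'User-Agent': 'Mozilla/5.0'}
    # The URL needs a scheme, otherwise urllib2 raises "unknown url type".
    req = urllib2.Request('http://www.example.com', None, headers)
    html = urllib2.urlopen(req).read()

Or, a bit shorter:

    req = urllib2.Request('http://www.example.com', headers={'User-Agent': 'Mozilla/5.0'})
    html = urllib2.urlopen(req).read()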
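
For readers on Python 3, the same idea can be sketched with urllib.request, where urllib2's Request now lives. This is only a minimal sketch: build_request is a hypothetical helper name, and nothing is sent over the network until the request is passed to urlopen.

```python
import urllib.request


def build_request(url, user_agent):
    """Return a Request carrying a custom User-Agent (hypothetical helper)."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


req = build_request("http://www.example.com/", "Mozilla/5.0")

# Request normalizes header names with str.capitalize(), so the stored
# key is "User-agent" even though "User-Agent" was passed in.
print(req.header_items())

# To actually fetch the page:
#     html = urllib.request.urlopen(req).read()
```

Because of that normalization, req.get_header('User-agent') finds the value while req.get_header('User-Agent') does not.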

User · Answer

All these should work in theory, but (with Python 2.7.2 on Windows at least) any time you send a custom User-agent header, urllib2 doesn't send that header. If you don't try to send a User-agent header, it sends the default Python / urllib2. None of these methods seem to work for adding User-agent, but they work for other headers:

    opener = urllib2.build_opener(proxy)
    opener.addheaders = {'User-agent': 'Custom user agent'}
    urllib2.install_opener(opener)

    request = urllib2.Request(url, headers={'User-agent': 'Custom user agent'})

    request.headers['User-agent'] = 'Custom user agent'

    request.add_header('User-agent', 'Custom user agent')
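
One way to check which User-Agent really leaves your machine is to point the request at a throwaway local server that echoes the header back. The sketch below does this with Python 3's standard library; EchoUserAgent and observed_user_agent are hypothetical names, and it assumes a loopback socket can be opened. It says nothing about the Python 2.7.2 behavior described above, it only gives a way to test a given setup.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoUserAgent(BaseHTTPRequestHandler):
    """Reply to GET with the User-Agent header the client sent."""

    def do_GET(self):
        body = self.headers.get("User-Agent", "").encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the console quiet


def observed_user_agent(user_agent):
    """Send one request to a local echo server and return what it saw."""
    server = HTTPServer(("127.0.0.1", 0), EchoUserAgent)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = "http://127.0.0.1:%d/" % server.server_port
        req = urllib.request.Request(url, headers={"User-Agent": user_agent})
        # An empty ProxyHandler bypasses any proxy configured in the environment.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(req, timeout=5) as resp:
            return resp.read().decode("utf-8")
    finally:
        server.shutdown()
        server.server_close()
```

Calling observed_user_agent('Custom user agent') returns the string the server received, which can be compared with what was intended.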

User · Answer

Another solution in urllib2 and Python 2.7:

    req = urllib2.Request('http://www.example.com/')
    req.add_unredirected_header('User-Agent', 'Custom User-Agent')
    urllib2.urlopen(req)
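
As a rough illustration of how this differs from add_header, the Python 3 sketch below inspects where the header ends up. Note that headers and unredirected_hdrs are attributes of CPython's Request object and are used here only for inspection; treat relying on them as an assumption rather than documented API.

```python
import urllib.request

req = urllib.request.Request("http://www.example.com/")
req.add_unredirected_header("User-Agent", "Custom User-Agent")

# The header is known to the request...
print(req.has_header("User-agent"))

# ...but it is kept apart from the regular headers, so it is not
# copied onto the follow-up request when a redirect is followed.
print(req.headers)
print(req.unredirected_hdrs)
```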

User · Answer

Try this:

    import requests

    html_source_code = requests.get("http://www.example.com/",
                                    headers={'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.107 Safari/537.36',
                                             'Upgrade-Insecure-Requests': '1',
                                             'x-runtime': '148ms'},
                                    allow_redirects=True).content

User · Answer

I answered a similar question a couple weeks ago. There is example code in that question, but basically you can do something like this (note the capitalization of User-Agent as of RFC 2616, section 14.43):

    opener = urllib2.build_opener()
    opener.addheaders = [('User-Agent', 'Mozilla/5.0')]
    response = opener.open('http://www.stackoverflow.com')
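
The opener approach carries over to Python 3's urllib.request. The sketch below first copies the opener's default header list, which is where the Python-urllib string comes from, and then replaces it; default_headers is just an illustrative variable name, and no request is sent.

```python
import urllib.request

opener = urllib.request.build_opener()

# A fresh opener starts with a single default header of the form
# ('User-agent', 'Python-urllib/X.Y'), X.Y being the running Python version.
default_headers = list(opener.addheaders)
print(default_headers)

# addheaders is a list of (name, value) tuples, not a dict.
opener.addheaders = [("User-Agent", "Mozilla/5.0")]

# opener.open(url) would now send the replacement header, and
# urllib.request.install_opener(opener) would make urlopen() use it too.
```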

User · Answer

For Python 3, urllib is split into several submodules, and requests are made through urllib.request:

    import urllib.request

    req = urllib.request.Request(url="http://localhost/", headers={'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:12.0) Gecko/20100101 Firefox/12.0'})
    handler = urllib.request.urlopen(req)

User · Answer

Setting the User-Agent, from everyone's favorite Dive Into Python.

The short story: you can use Request.add_header to do this.

You can also pass the headers as a dictionary when creating the Request itself, as the docs note:

    headers should be a dictionary, and will be treated as if add_header() was called with each key and value as arguments. This is often used to "spoof" the User-Agent header, which is used by a browser to identify itself; some HTTP servers only allow requests coming from common browsers as opposed to scripts. For example, Mozilla Firefox may identify itself as "Mozilla/5.0 (X11; U; Linux i686) Gecko/20071127 Firefox/2.0.0.11", while urllib2's default user agent string is "Python-urllib/2.6" (on Python 2.6).
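
The equivalence described in the quoted docs can be checked directly. This is a small Python 3 sketch using urllib.request; URL, UA, via_dict, via_method and same are illustrative names, and no request is sent.

```python
import urllib.request

URL = "http://www.example.com/"
UA = "Mozilla/5.0 (X11; U; Linux i686) Gecko/20071127 Firefox/2.0.0.11"

# Variant 1: headers passed as a dictionary at construction time.
via_dict = urllib.request.Request(URL, headers={"User-Agent": UA})

# Variant 2: header added afterwards with add_header().
via_method = urllib.request.Request(URL)
via_method.add_header("User-Agent", UA)

# Both requests end up carrying the same header items.
same = via_dict.header_items() == via_method.header_items()
print(same)
```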
