[http] Sites not accepting wget user agent header

When I run this command:

wget --user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:21.0) Gecko/20100101 Firefox/21.0"  http://yahoo.com

...I get this result (with nothing else in the file):

<!-- hw147.fp.gq1.yahoo.com uncompressed/chunked Wed Jun 19 03:42:44 UTC 2013 -->

But when I run wget http://yahoo.com with no --user-agent option, I get the full page.

The user agent is the same header that my current browser sends. Why does this happen? Is there a way to make sure the user agent doesn't get blocked when using wget?

This question is related to http wget user-agent

The answer is


I created a ~/.wgetrc file with the following content (obtained from askapache.com but with a newer user agent, because otherwise it didn’t work always):

header = Accept-Language: en-us,en;q=0.5
header = Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
header = Connection: keep-alive
user_agent = Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:40.0) Gecko/20100101 Firefox/40.0
referer = /
robots = off

Now I’m able to download from most (all?) file-sharing (streaming video) sites.


You need to set both the user-agent and the referer:

 wget  --header="Accept: text/html" --user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:21.0) Gecko/20100101 Firefox/21.0" --referrer  connect.wso2.com http://dist.wso2.org/products/carbon/4.2.0/wso2carbon-4.2.0.zip

Examples related to http

Access blocked by CORS policy: Response to preflight request doesn't pass access control check Axios Delete request with body and headers? Read response headers from API response - Angular 5 + TypeScript Android 8: Cleartext HTTP traffic not permitted Angular 4 HttpClient Query Parameters Load json from local file with http.get() in angular 2 Angular 2: How to access an HTTP response body? What is HTTP "Host" header? Golang read request body Angular 2 - Checking for server errors from subscribe

Examples related to wget

How to `wget` a list of URLs in a text file? How to install wget in macOS? wget ssl alert handshake failure How to run wget inside Ubuntu Docker image? Unable to establish SSL connection upon wget on Ubuntu 14.04 LTS How to use Python requests to fake a browser visit a.k.a and generate User Agent? wget/curl large file from google drive wget: unable to resolve host address `http' Python equivalent of a given wget command How to download HTTP directory with all files and sub-directories as they appear on the online files/folders list?

Examples related to user-agent

Change user-agent for Selenium web-driver How to use curl to get a GET request exactly same as using Chrome? How to use Python requests to fake a browser visit a.k.a and generate User Agent? Get operating system info Sites not accepting wget user agent header How does HTTP_USER_AGENT work? What is the iOS 6 user agent string? Detect IE version (prior to v9) in JavaScript How to get user agent in PHP How to find the operating system version using JavaScript?