[linux] Get final URL after curl is redirected

I need to get the final URL after a page redirect, preferably with curl or wget.

For example http://google.com may redirect to http://www.google.com.

The contents are easy to get (e.g. curl --max-redirs 10 http://google.com -L), but I'm only interested in the final URL (in the former case http://www.google.com).

Is there any way of doing this by using only Linux built-in tools? (command line only)

This question is related to linux, redirect, curl and wget.

The answer is:

curl's -w option and the variable url_effective are what you are looking for.

Something like

curl -Ls -o /dev/null -w %{url_effective} http://google.com

More info

-L         Follow redirects
-s         Silent mode. Don't output anything
-o FILE    Write output to FILE instead of stdout
-w FORMAT  What to output after completion

More

You might want to add -I (that is an uppercase i) as well, which will make the command not download any "body". Note, though, that -I switches to the HEAD method, which is not what the question asked about, and it risks changing what the server does: some servers don't respond well to HEAD even when they respond fine to GET.
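
For example, a variant with -I and a trailing newline in the format string (a minimal sketch; mind the HEAD caveat above):

curl -sLI -o /dev/null -w '%{url_effective}\n' http://google.com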


curl can only follow HTTP redirects. To also follow meta refresh directives and JavaScript redirects, you need a full-blown browser like headless Chrome:

#!/bin/bash
real_url () {
    # Ask the headless browser's REPL for location.href, strip the
    # ">>> " prompt characters, and pull the value out of the JSON reply
    printf 'location.href\nquit\n' | \
    chromium-browser --headless --disable-gpu --disable-software-rasterizer \
    --disable-dev-shm-usage --no-sandbox --repl "$@" 2> /dev/null \
    | tr -d '>>> ' | jq -r '.result.value'
}

If you don't have Chrome installed, you can use it from a Docker container:

#!/bin/bash
real_url () {
    printf 'location.href\nquit\n' | \
    docker run -i --rm --user "$(id -u "$USER")" --volume "$(pwd)":/usr/src/app \
    zenika/alpine-chrome --no-sandbox --repl "$@" 2> /dev/null \
    | tr -d '>>> ' | jq -r '.result.value'
}

Like so:

$ real_url http://dx.doi.org/10.1016/j.pgeola.2020.06.005 
https://www.sciencedirect.com/science/article/abs/pii/S0016787820300638?via%3Dihub

I'm not sure how to do it with curl, but libwww-perl installs the GET alias.

$ GET -S -d -e http://google.com
GET http://google.com --> 301 Moved Permanently
GET http://www.google.com/ --> 302 Found
GET http://www.google.ca/ --> 200 OK
Cache-Control: private, max-age=0
Connection: close
Date: Sat, 19 Jun 2010 04:11:01 GMT
Server: gws
Content-Type: text/html; charset=ISO-8859-1
Expires: -1
Client-Date: Sat, 19 Jun 2010 04:11:01 GMT
Client-Peer: 74.125.155.105:80
Client-Response-Num: 1
Set-Cookie: PREF=ID=a1925ca9f8af11b9:TM=1276920661:LM=1276920661:S=ULFrHqOiFDDzDVFB; expires=Mon, 18-Jun-2012 04:11:01 GMT; path=/; domain=.google.ca
Title: Google
X-XSS-Protection: 1; mode=block

Thanks, that helped me. I made some improvements and wrapped that in a helper script "finalurl":

#!/bin/bash
curl "$1" -s -L -I -o /dev/null -w '%{url_effective}'
  • -o send the output to /dev/null
  • -I use HEAD so nothing is actually downloaded; we just discover the final URL
  • -s silent mode, no progress bars

This made it possible to call the command from other scripts like this:

echo `finalurl http://someurl/`
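
Or capture it in a variable with $( ) command substitution (assuming the script is saved as finalurl somewhere on your PATH):

final=$(finalurl http://someurl/)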

You can usually do this with wget. wget --content-disposition "url" follows redirects, and if you additionally add -O /dev/null you will not actually save the file.

wget -O /dev/null --content-disposition example.com
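
wget logs each Location: hop as it follows redirects, so one way to pull out the final URL is to keep the last one. A sketch that relies on wget's default English log format, which may differ between versions:

wget -O /dev/null http://google.com 2>&1 | awk '/^Location: /{u=$2} END{print u}'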


This would work (note it only captures the first Location header, i.e. a single hop):

curl -I somesite.com | perl -n -e '/^Location: (.*)$/ && print "$1\n"'

Thank you. I ended up implementing your suggestions: curl -i + grep

curl -i http://google.com -L | egrep -A 10 '301 Moved Permanently|302 Found' | grep 'Location' | awk -F': ' '{print $2}' | tail -1

It returns blank if the website doesn't redirect, but that's good enough for me, as it works with consecutive redirects.

Could be buggy, but at a glance it works OK.
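
Note that HTTP/2 servers send lowercase header names, so a case-insensitive match is more robust. A variant along the same lines (a sketch, not from the original answer; the tr strips the trailing carriage return from the header value):

curl -siL http://google.com | grep -i '^location:' | tail -1 | awk '{print $2}' | tr -d '\r'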


As another option:

$ curl -i http://google.com
HTTP/1.1 301 Moved Permanently
Location: http://www.google.com/
Content-Type: text/html; charset=UTF-8
Date: Sat, 19 Jun 2010 04:15:10 GMT
Expires: Mon, 19 Jul 2010 04:15:10 GMT
Cache-Control: public, max-age=2592000
Server: gws
Content-Length: 219
X-XSS-Protection: 1; mode=block

<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
<A HREF="http://www.google.com/">here</A>.
</BODY></HTML>

But it doesn't go past the first one.


The parameters -L (--location) and -I (--head) together still do an unnecessary HEAD request to the location URL.

If you are sure that you will have no more than one redirect, it is better to disable following the location and use the curl variable %{redirect_url}.

This command does only one HEAD request to the specified URL and takes redirect_url from the Location header:

curl --head --silent --write-out "%{redirect_url}\n" --output /dev/null "https://goo.gl/QeJeQ4"

Speed test

all_videos_link.txt - 50 goo.gl and bit.ly links which redirect to YouTube

1. With follow location

time while read -r line; do
    curl -kIsL -w "%{url_effective}\n" -o /dev/null "$line"
done < all_videos_link.txt

Results:

real    1m40.832s
user    0m9.266s
sys     0m15.375s

2. Without follow location

time while read -r line; do
    curl -kIs -w "%{redirect_url}\n" -o /dev/null "$line"
done < all_videos_link.txt

Results:

real    0m51.037s
user    0m5.297s
sys     0m8.094s
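
If there may be several hops, you can keep the per-request savings by following %{redirect_url} manually in a loop (a minimal sketch; resolve_url is just an illustrative name, and the hop counter guards against redirect cycles):

resolve_url () {
    local url="$1" next hops=0
    while [ "$hops" -lt 10 ]; do
        # one HEAD request per hop; an empty redirect_url means we arrived
        next=$(curl -sI -o /dev/null -w '%{redirect_url}' "$url")
        [ -z "$next" ] && break
        url="$next"
        hops=$((hops + 1))
    done
    printf '%s\n' "$url"
}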

You could use grep. Doesn't wget tell you where it's redirecting to? Just grep that out.


Can you try this?

#!/bin/bash 
LOCATION=`curl -I 'http://your-domain.com/url/redirect?r=something&a=values-VALUES_FILES&e=zip' | perl -n -e '/^Location: (.*)$/ && print "$1\n"'` 
echo "$LOCATION"

Note: when the URL contains shell metacharacters such as ? and & (as in the example above), you have to put it in single quotes, like curl -I 'http://your-domain.com/url/redirect?r=something&a=b'

