[php] How can one check to see if a remote file exists using PHP?

The best I could find, an if fclose fopen type thing, makes the page load really slowly.

Basically what I'm trying to do is the following: I have a list of websites, and I want to display their favicons next to them. However, if a site doesn't have one, I'd like to replace it with another image rather than display a broken image.

This question is related to php file networking testing

The answer is


You could use the following:

$file = 'http://mysite.co.za/images/favicon.ico';
$file_exists = (@fopen($file, "r")) ? true : false;

Worked for me when trying to check if an image exists on the URL


If you're using the Symfony framework, there is also a much simpler way using the HttpClientInterface:

private function remoteFileExists(string $url, HttpClientInterface $client): bool {
    $response = $client->request(
        'GET',
        $url //e.g. http://example.com/file.txt
    );

    return $response->getStatusCode() == 200;
}

The docs for the HttpClient are also very good and maybe worth looking into if you need a more specific approach: https://symfony.com/doc/current/http_client.html


all the answers here that use get_headers() are doing a GET request. It's much faster/cheaper to just do a HEAD request.

To make sure that get_headers() does a HEAD request instead of a GET you should add this:

stream_context_set_default(
    array(
        'http' => array(
            'method' => 'HEAD'
        )
    )
);

so to check if a file exists, your code would look something like this:

stream_context_set_default(
    array(
        'http' => array(
            'method' => 'HEAD'
        )
    )
);
$headers = get_headers('http://website.com/dir/file.jpg', 1);
$file_found = stristr($headers[0], '200');

$file_found will return either false or true, obviously.


This works for me to check if a remote file exist in PHP:

$url = 'https://cdn.sstatic.net/Sites/stackoverflow/img/favicon.ico';
    $header_response = get_headers($url, 1);

    if ( strpos( $header_response[0], "404" ) !== false ) {
        echo 'File does NOT exist';
        } else {
        echo 'File exists';
        }

A radical solution would be to display the favicons as background images in a div above your default icon. That way, all overhead would be placed on the client while still not displaying broken images (missing background images are ignored in all browsers AFAIK).


Don't know if this one is any faster when the file does not exist remotely, is_file(), but you could give it a shot.

$favIcon = 'default FavIcon';
if(is_file($remotePath)) {
   $favIcon = file_get_contents($remotePath);
}

This is not an answer to your original question, but a better way of doing what you're trying to do:

Instead of actually trying to get the site's favicon directly (which is a royal pain given it could be /favicon.png, /favicon.ico, /favicon.gif, or even /path/to/favicon.png), use google:

<img src="http://www.google.com/s2/favicons?domain=[domain]">

Done.


You can use the filesystem: use Symfony\Component\Filesystem\Filesystem; use Symfony\Component\Filesystem\Exception\IOExceptionInterface;

and check $fileSystem = new Filesystem(); if ($fileSystem->exists('path_to_file')==true) {...


function remote_file_exists($url){
   return(bool)preg_match('~HTTP/1\.\d\s+200\s+OK~', @current(get_headers($url)));
}  
$ff = "http://www.emeditor.com/pub/emed32_11.0.5.exe";
    if(remote_file_exists($ff)){
        echo "file exist!";
    }
    else{
        echo "file not exist!!!";
    }

This can be done by obtaining the HTTP Status code (404 = not found) which is possible with file_get_contentsDocs making use of context options. The following code takes redirects into account and will return the status code of the final destination (Demo):

$url = 'http://example.com/';
$code = FALSE;

$options['http'] = array(
    'method' => "HEAD",
    'ignore_errors' => 1
);

$body = file_get_contents($url, NULL, stream_context_create($options));

foreach($http_response_header as $header)
    sscanf($header, 'HTTP/%*d.%*d %d', $code);

echo "Status code: $code";

If you don't want to follow redirects, you can do it similar (Demo):

$url = 'http://example.com/';
$code = FALSE;

$options['http'] = array(
    'method' => "HEAD",
    'ignore_errors' => 1,
    'max_redirects' => 0
);

$body = file_get_contents($url, NULL, stream_context_create($options));

sscanf($http_response_header[0], 'HTTP/%*d.%*d %d', $code);

echo "Status code: $code";

Some of the functions, options and variables in use are explained with more detail on a blog post I've written: HEAD first with PHP Streams.


A complete function of the most voted answer:

function remote_file_exists($url)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_NOBODY, 1);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); # handles 301/2 redirects
    curl_exec($ch);
    $httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    if( $httpCode == 200 ){return true;}
}

You can use it like this:

if(remote_file_exists($url))
{
    //file exists, do something
}

There's an even more sophisticated alternative. You can do the checking all client-side using a JQuery trick.

$('a[href^="http://"]').filter(function(){
     return this.hostname && this.hostname !== location.hostname;
}).each(function() {
    var link = jQuery(this);
    var faviconURL =
      link.attr('href').replace(/^(http:\/\/[^\/]+).*$/, '$1')+'/favicon.ico';
    var faviconIMG = jQuery('<img src="favicon.png" alt="" />')['appendTo'](link);
    var extImg = new Image();
    extImg.src = faviconURL;
    if (extImg.complete)
      faviconIMG.attr('src', faviconURL);
    else
      extImg.onload = function() { faviconIMG.attr('src', faviconURL); };
});

From http://snipplr.com/view/18782/add-a-favicon-near-external-links-with-jquery/ (the original blog is presently down)


You should issue HEAD requests, not GET one, because you don't need the URI contents at all. As Pies said above, you should check for status code (in 200-299 ranges, and you may optionally follow 3xx redirects).

The answers question contain a lot of code examples which may be helpful: PHP / Curl: HEAD Request takes a long time on some sites


As Pies say you can use cURL. You can get cURL to only give you the headers, and not the body, which might make it faster. A bad domain could always take a while because you will be waiting for the request to time-out; you could probably change the timeout length using cURL.

Here is example:

function remoteFileExists($url) {
    $curl = curl_init($url);

    //don't fetch the actual page, you only want to check the connection is ok
    curl_setopt($curl, CURLOPT_NOBODY, true);

    //do request
    $result = curl_exec($curl);

    $ret = false;

    //if request did not fail
    if ($result !== false) {
        //if request was ok, check response code
        $statusCode = curl_getinfo($curl, CURLINFO_HTTP_CODE);  

        if ($statusCode == 200) {
            $ret = true;   
        }
    }

    curl_close($curl);

    return $ret;
}

$exists = remoteFileExists('http://stackoverflow.com/favicon.ico');
if ($exists) {
    echo 'file exists';
} else {
    echo 'file does not exist';   
}

You can use :

$url=getimagesize(“http://www.flickr.com/photos/27505599@N07/2564389539/”);

if(!is_array($url))
{
   $default_image =”…/directoryFolder/junal.jpg”;
}

PHP's inbuilt functions may not work for checking URL if allow_url_fopen setting is set to off for security reasons. Curl is a better option as we would not need to change our code at later stage. Below is the code I used to verify a valid URL:

$url = str_replace(' ', '%20', $url);
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); 
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_exec($ch);
$httpcode = curl_getinfo($ch, CURLINFO_HTTP_CODE);  
curl_close($ch);
if($httpcode>=200 && $httpcode<300){  return true; } else { return false; } 

Kindly note the CURLOPT_SSL_VERIFYPEER option which also verify the URL's starting with HTTPS.


CoolGoose's solution is good but this is faster for large files (as it only tries to read 1 byte):

if (false === file_get_contents("http://example.com/path/to/image",0,null,0,1)) {
    $image = $default_image;
}

To check for the existence of images, exif_imagetype should be preferred over getimagesize, as it is much faster.

To suppress the E_NOTICE, just prepend the error control operator (@).

if (@exif_imagetype($filename)) {
  // Image exist
}

As a bonus, with the returned value (IMAGETYPE_XXX) from exif_imagetype we could also get the mime-type or file-extension with image_type_to_mime_type / image_type_to_extension.


If you are dealing with images, use getimagesize. Unlike file_exists, this built-in function supports remote files. It will return an array that contains the image information (width, height, type..etc). All you have to do is to check the first element in the array (the width). use print_r to output the content of the array

$imageArray = getimagesize("http://www.example.com/image.jpg");
if($imageArray[0])
{
    echo "it's an image and here is the image's info<br>";
    print_r($imageArray);
}
else
{
    echo "invalid image";
}

if (false === file_get_contents("http://example.com/path/to/image")) {
    $image = $default_image;
}

Should work ;)


If the file is not hosted external you might translate the remote URL to an absolute Path on your webserver. That way you don't have to call CURL or file_get_contents, etc.

function remoteFileExists($url) {

    $root = realpath($_SERVER["DOCUMENT_ROOT"]);
    $urlParts = parse_url( $url );

    if ( !isset( $urlParts['path'] ) )
        return false;

    if ( is_file( $root . $urlParts['path'] ) )
        return true;
    else
        return false;

}

remoteFileExists( 'https://www.yourdomain.com/path/to/remote/image.png' );

Note: Your webserver must populate DOCUMENT_ROOT to use this function


You can instruct curl to use the HTTP HEAD method via CURLOPT_NOBODY.

More or less

$ch = curl_init("http://www.example.com/favicon.ico");

curl_setopt($ch, CURLOPT_NOBODY, true);
curl_exec($ch);
$retcode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
// $retcode >= 400 -> not found, $retcode = 200, found.
curl_close($ch);

Anyway, you only save the cost of the HTTP transfer, not the TCP connection establishment and closing. And being favicons small, you might not see much improvement.

Caching the result locally seems a good idea if it turns out to be too slow. HEAD checks the time of the file, and returns it in the headers. You can do like browsers and get the CURLINFO_FILETIME of the icon. In your cache you can store the URL => [ favicon, timestamp ]. You can then compare the timestamp and reload the favicon.


Examples related to php

I am receiving warning in Facebook Application using PHP SDK Pass PDO prepared statement to variables Parse error: syntax error, unexpected [ Preg_match backtrack error Removing "http://" from a string How do I hide the PHP explode delimiter from submitted form results? Problems with installation of Google App Engine SDK for php in OS X Laravel 4 with Sentry 2 add user to a group on Registration php & mysql query not echoing in html with tags? How do I show a message in the foreach loop?

Examples related to file

Gradle - Move a folder from ABC to XYZ Difference between opening a file in binary vs text Angular: How to download a file from HttpClient? Python error message io.UnsupportedOperation: not readable java.io.FileNotFoundException: class path resource cannot be opened because it does not exist Writing JSON object to a JSON file with fs.writeFileSync How to read/write files in .Net Core? How to write to a CSV line by line? Writing a dictionary to a text file? What are the pros and cons of parquet format compared to other formats?

Examples related to networking

Access HTTP response as string in Go Communication between multiple docker-compose projects Can't access 127.0.0.1 How do I delete virtual interface in Linux? ConnectivityManager getNetworkInfo(int) deprecated Bridged networking not working in Virtualbox under Windows 10 Difference between PACKETS and FRAMES How to communicate between Docker containers via "hostname" java.net.ConnectException: failed to connect to /192.168.253.3 (port 2468): connect failed: ECONNREFUSED (Connection refused) wget: unable to resolve host address `http'

Examples related to testing

Test process.env with Jest How to configure "Shorten command line" method for whole project in IntelliJ Jest spyOn function called Simulate a button click in Jest Mockito - NullpointerException when stubbing Method toBe(true) vs toBeTruthy() vs toBeTrue() How-to turn off all SSL checks for postman for a specific site What is the difference between smoke testing and sanity testing? ReferenceError: describe is not defined NodeJs How to properly assert that an exception gets raised in pytest?