How can I check if a URL exists via PHP?

Question

How do I check if a URL exists (not 404) in PHP?

User · Answer

To check if a URL is online or offline:

    function get_http_response_code($theURL) {
        $headers = @get_headers($theURL);
        return substr($headers[0], 9, 3);
    }
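The call substr($headers[0], 9, 3) assumes the status line starts with a nine-character prefix such as "HTTP/1.1 ". A minimal sketch of a parser that does not depend on that fixed offset follows; the name parseStatusCode is hypothetical and not part of the answer above.

```php
<?php
// Hypothetical helper: extract the numeric status code from an HTTP status line.
function parseStatusCode(string $statusLine): ?int
{
    // Matches "HTTP/1.1 404 Not Found" as well as a line like "HTTP/2 200".
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})\b#', $statusLine, $m) === 1) {
        return (int) $m[1];
    }
    return null; // not a status line
}
```

Returning null for anything that is not a status line lets the caller tell "no response" apart from a real code.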

User · Answer

All above solutions + extra sugar. (Ultimate AIO solution)

    /**
     * Check that given URL is valid and exists.
     * @param string $url URL to check
     * @return bool TRUE when valid | FALSE anyway
     */
    function urlExists($url) {
        // Remove all illegal characters from a url
        $url = filter_var($url, FILTER_SANITIZE_URL);

        // Validate URI
        if (filter_var($url, FILTER_VALIDATE_URL) === FALSE
            // check only for http/https schemes.
            || !in_array(strtolower(parse_url($url, PHP_URL_SCHEME)), ['http', 'https'], true)
        ) {
            return false;
        }

        // Check that URL exists
        $file_headers = @get_headers($url);
        return !(!$file_headers || $file_headers[0] === 'HTTP/1.1 404 Not Found');
    }

Example:

    var_dump(urlExists('http://stackoverflow.com/'));
    // Output: true
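The validation half of that function needs no network access, so it can be exercised on its own. The following is a small sketch under that assumption; isHttpUrl is a hypothetical name, and it only checks syntax and scheme, not whether the URL responds.

```php
<?php
// Hypothetical helper: offline syntax and scheme check, no request is sent.
function isHttpUrl(string $url): bool
{
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }
    $scheme = parse_url($url, PHP_URL_SCHEME);
    // Accept only http and https, compared case-insensitively.
    return is_string($scheme)
        && in_array(strtolower($scheme), ['http', 'https'], true);
}
```

Running this cheap check first avoids waiting on a network timeout for input that could never be a web URL.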

User · Answer

karim79's get_headers() solution didn't work for me, as I got crazy results with Pinterest:

    get_headers(): SSL operation failed with code 1. OpenSSL Error messages:
    error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
    Array ( [url] => https://www.pinterest.com/jonathan_parl/ [exists] => )

    get_headers(): Failed to enable crypto
    Array ( [url] => https://www.pinterest.com/jonathan_parl/ [exists] => )

    get_headers(https://www.pinterest.com/jonathan_parl/): failed to open stream: operation failed
    Array ( [url] => https://www.pinterest.com/jonathan_parl/ [exists] => )

Anyway, this developer demonstrates that cURL is way faster than get_headers(): http://php.net/manual/fr/function.get-headers.php#104723

Since many people asked karim79 to fix his cURL solution, here's the solution I built today.

    /**
     * Send an HTTP request to the $url and check the header posted back.
     *
     * @param $url String url to which we must send the request.
     * @param $failCodeList Int array list of codes for which the page is considered invalid.
     *
     * @return Boolean
     */
    public static function isUrlExists($url, array $failCodeList = array(404)) {
        $exists = false;

        if (!StringManager::stringStartWith($url, "http") and !StringManager::stringStartWith($url, "ftp")) {
            $url = "https://" . $url;
        }

        if (preg_match(RegularExpression::URL, $url)) {
            $handle = curl_init($url);

            curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
            curl_setopt($handle, CURLOPT_SSL_VERIFYPEER, false);
            curl_setopt($handle, CURLOPT_HEADER, true);
            curl_setopt($handle, CURLOPT_NOBODY, true);
            curl_setopt($handle, CURLOPT_USERAGENT, true);

            $headers = curl_exec($handle);
            curl_close($handle);

            if (empty($failCodeList) or !is_array($failCodeList)) {
                $failCodeList = array(404);
            }

            if (!empty($headers)) {
                $exists = true;
                $headers = explode(PHP_EOL, $headers);

                foreach ($failCodeList as $code) {
                    if (is_numeric($code) and strpos($headers[0], strval($code)) !== false) {
                        $exists = false;
                        break;
                    }
                }
            }
        }

        return $exists;
    }

Let me explain the curl options:

- CURLOPT_RETURNTRANSFER: return a string instead of displaying the calling page on the screen.
- CURLOPT_SSL_VERIFYPEER: cURL won't verify the certificate.
- CURLOPT_HEADER: include the header in the string.
- CURLOPT_NOBODY: don't include the body in the string.
- CURLOPT_USERAGENT: some sites need that to function properly (for example: https://plus.google.com).

Additional note: In this function I'm using Diego Perini's regex for validating the URL before sending the request:

    const URL = "%^(?:(?:https?|ftp)://)(?:\S+(?::\S*)?@|\d{1,3}(?:\.\d{1,3}){3}|(?:(?:[a-z\d\x{00a1}-\x{ffff}]+-?)*[a-z\d\x{00a1}-\x{ffff}]+)(?:\.(?:[a-z\d\x{00a1}-\x{ffff}]+-?)*[a-z\d\x{00a1}-\x{ffff}]+)*(?:\.[a-z\x{00a1}-\x{ffff}]{2,6}))(?::\d+)?(?:[^\s]*)?$%iu"; //@copyright Diego Perini

Additional note 2: I explode the header string and use $headers[0] to be sure to validate only the return code and message (example: 200, 404, 405, etc.).

Additional note 3: Sometimes validating only the code 404 is not enough (see the unit test), so there's an optional $failCodeList parameter to supply the full list of codes to reject.

And, of course, here's the unit test (including all the popular social networks) to back up my code:

    public function testIsUrlExists() {
        // invalid
        $this->assertFalse(ToolManager::isUrlExists("woot"));
        $this->assertFalse(ToolManager::isUrlExists("https://www.facebook.com/jonathan.parentlevesque4545646456"));
        $this->assertFalse(ToolManager::isUrlExists("https://plus.google.com/+JonathanParentL%C3%A9vesque890800"));
        $this->assertFalse(ToolManager::isUrlExists("https://instagram.com/mariloubiz1232132/", array(404, 405)));
        $this->assertFalse(ToolManager::isUrlExists("https://www.pinterest.com/jonathan_parl1231/"));
        $this->assertFalse(ToolManager::isUrlExists("https://regex101.com/546465465456"));
        $this->assertFalse(ToolManager::isUrlExists("https://twitter.com/arcadefire4566546"));
        $this->assertFalse(ToolManager::isUrlExists("https://vimeo.com/**($%?%$", array(400, 405)));
        $this->assertFalse(ToolManager::isUrlExists("https://www.youtube.com/user/Darkjo666456456456"));

        // valid
        $this->assertTrue(ToolManager::isUrlExists("www.google.ca"));
        $this->assertTrue(ToolManager::isUrlExists("https://www.facebook.com/jonathan.parentlevesque"));
        $this->assertTrue(ToolManager::isUrlExists("https://plus.google.com/+JonathanParentL%C3%A9vesque"));
        $this->assertTrue(ToolManager::isUrlExists("https://instagram.com/mariloubiz/"));
        $this->assertTrue(ToolManager::isUrlExists("https://www.facebook.com/jonathan.parentlevesque"));
        $this->assertTrue(ToolManager::isUrlExists("https://www.pinterest.com/"));
        $this->assertTrue(ToolManager::isUrlExists("https://regex101.com"));
        $this->assertTrue(ToolManager::isUrlExists("https://twitter.com/arcadefire"));
        $this->assertTrue(ToolManager::isUrlExists("https://vimeo.com/"));
        $this->assertTrue(ToolManager::isUrlExists("https://www.youtube.com/user/Darkjo666"));
    }

Great success to all,

Jonathan Parent-Lévesque from Montreal

User · Answer

The simple way is curl (and FASTER too):

    <?php
    $mylinks = "http://site.com/page.html";
    $handlerr = curl_init($mylinks);
    curl_setopt($handlerr, CURLOPT_RETURNTRANSFER, TRUE);
    $resp = curl_exec($handlerr);
    $ht = curl_getinfo($handlerr, CURLINFO_HTTP_CODE);

    // Anything other than 404 is reported as OK.
    if ($ht != 404) {
        echo 'OK';
    } else {
        echo 'NO';
    }
    ?>

User · Answer

I run some tests to see if links on my site are valid; they alert me when third parties change their links. I was having an issue with a site that had a poorly configured certificate, which meant that PHP's get_headers didn't work.

So, I read that curl was faster and decided to give that a go. Then I had an issue with LinkedIn, which gave me a 999 error that turned out to be a user agent issue.

I didn't care if the certificate was not valid for this test, and I didn't care if the response was a redirect.

Then I figured I'd use get_headers anyway if curl was failing.

Give it a go:

    /**
     * returns true/false if the $url is present.
     *
     * @param string $url assumes this is a valid url.
     *
     * @return bool
     */
    private function url_exists(string $url): bool
    {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_NOBODY, TRUE);             // this does a head request to make it faster.
        curl_setopt($ch, CURLOPT_HEADER, TRUE);             // just the headers
        curl_setopt($ch, CURLOPT_SSL_VERIFYSTATUS, FALSE);  // turn off that pesky ssl stuff - some sys admins can't get it right.
        curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
        // set a real user agent to stop linkedin getting upset.
        curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36');
        curl_exec($ch);
        $http_code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        // HTTP_OK and HTTP_BAD_REQUEST are not PHP built-ins; define them as 200 and 400.
        if (($http_code >= HTTP_OK && $http_code < HTTP_BAD_REQUEST) || $http_code === 999)
        {
            curl_close($ch);
            return TRUE;
        }
        $error = curl_error($ch); // used for debugging.
        curl_close($ch);
        // just try the get_headers - it might work!
        stream_context_set_default(array('http' => array('method' => 'HEAD')));
        $file_headers = @get_headers($url);
        if ($file_headers)
        {
            $response_code = substr($file_headers[0], 9, 3);
            return $response_code >= 200 && $response_code < 400;
        }
        return FALSE;
    }

User · Answer

cURL can return the HTTP code. I don't think all that extra code is necessary.

    function urlExists($url = NULL)
    {
        if ($url == NULL) return false;
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_TIMEOUT, 5);
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        $data = curl_exec($ch);
        $httpcode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        curl_close($ch);
        if ($httpcode >= 200 && $httpcode < 300) {
            return true;
        } else {
            return false;
        }
    }

User · Answer

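Here:

    $file = 'http://www.example.com/somefile.jpg';
    $file_headers = @get_headers($file);
    if (!$file_headers || $file_headers[0] == 'HTTP/1.1 404 Not Found') {
        $exists = false;
    }
    else {
        $exists = true;
    }

From here, right below the above post, there's a curl solution:

    function url_exists($url) {
        return curl_init($url) !== false;
    }

Note that curl_init() only creates a handle and sends no request, so on its own this check does not tell you whether the URL responds.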

User · Answer

I use this function:

    /**
     * @param $url
     * @param array $options
     * @return string
     * @throws Exception
     */
    function checkURL($url, array $options = array()) {
        if (empty($url)) {
            throw new Exception('URL is empty');
        }

        // list of HTTP status codes
        $httpStatusCodes = array(
            100 => 'Continue',
            101 => 'Switching Protocols',
            102 => 'Processing',
            200 => 'OK',
            201 => 'Created',
            202 => 'Accepted',
            203 => 'Non-Authoritative Information',
            204 => 'No Content',
            205 => 'Reset Content',
            206 => 'Partial Content',
            207 => 'Multi-Status',
            208 => 'Already Reported',
            226 => 'IM Used',
            300 => 'Multiple Choices',
            301 => 'Moved Permanently',
            302 => 'Found',
            303 => 'See Other',
            304 => 'Not Modified',
            305 => 'Use Proxy',
            306 => 'Switch Proxy',
            307 => 'Temporary Redirect',
            308 => 'Permanent Redirect',
            400 => 'Bad Request',
            401 => 'Unauthorized',
            402 => 'Payment Required',
            403 => 'Forbidden',
            404 => 'Not Found',
            405 => 'Method Not Allowed',
            406 => 'Not Acceptable',
            407 => 'Proxy Authentication Required',
            408 => 'Request Timeout',
            409 => 'Conflict',
            410 => 'Gone',
            411 => 'Length Required',
            412 => 'Precondition Failed',
            413 => 'Payload Too Large',
            414 => 'Request-URI Too Long',
            415 => 'Unsupported Media Type',
            416 => 'Requested Range Not Satisfiable',
            417 => 'Expectation Failed',
            418 => 'I\'m a teapot',
            422 => 'Unprocessable Entity',
            423 => 'Locked',
            424 => 'Failed Dependency',
            425 => 'Unordered Collection',
            426 => 'Upgrade Required',
            428 => 'Precondition Required',
            429 => 'Too Many Requests',
            431 => 'Request Header Fields Too Large',
            449 => 'Retry With',
            450 => 'Blocked by Windows Parental Controls',
            500 => 'Internal Server Error',
            501 => 'Not Implemented',
            502 => 'Bad Gateway',
            503 => 'Service Unavailable',
            504 => 'Gateway Timeout',
            505 => 'HTTP Version Not Supported',
            506 => 'Variant Also Negotiates',
            507 => 'Insufficient Storage',
            508 => 'Loop Detected',
            509 => 'Bandwidth Limit Exceeded',
            510 => 'Not Extended',
            511 => 'Network Authentication Required',
            599 => 'Network Connect Timeout Error'
        );

        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_NOBODY, true);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);

        if (isset($options['timeout'])) {
            $timeout = (int) $options['timeout'];
            curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
        }

        curl_exec($ch);
        $returnedStatusCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        curl_close($ch);

        if (array_key_exists($returnedStatusCode, $httpStatusCodes)) {
            return "URL: '{$url}' - Error code: {$returnedStatusCode} - Definition: {$httpStatusCodes[$returnedStatusCode]}";
        } else {
            return "'{$url}' does not exist";
        }
    }
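A lookup table like this cannot list every value curl may report: non-standard codes such as 999 exist, and CURLINFO_HTTP_CODE is 0 when no response was received. As a rough complement, the sketch below classifies a code by its first digit; statusClass and its labels are hypothetical and not part of the answer.

```php
<?php
// Hypothetical helper: map a status code to its class using the first digit.
function statusClass(int $code): string
{
    $classes = [
        1 => 'informational',
        2 => 'success',
        3 => 'redirection',
        4 => 'client error',
        5 => 'server error',
    ];
    // intdiv(404, 100) is 4; anything outside 1xx-5xx falls through to 'unknown'.
    return $classes[intdiv($code, 100)] ?? 'unknown';
}
```

A caller could treat 'success' and 'redirection' as "exists" and everything else, including 'unknown', as a failure worth logging.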

User · Answer

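    function urlIsOk($url)
    {
        $headers = @get_headers($url);
        // get_headers() returns false on failure; without this check the
        // status would be read as 0 and the URL reported as OK.
        if ($headers === false)
        {
            return false;
        }
        $httpStatus = intval(substr($headers[0], 9, 3));
        if ($httpStatus < 400)
        {
            return true;
        }
        return false;
    }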

User · Answer

    function url_exists($url) {
        $headers = @get_headers($url);
        return (strpos($headers[0], '200') === false) ? false : true;
    }

User · Answer

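    function URLIsValid($URL)
    {
        $exists = true;
        $file_headers = @get_headers($URL);
        // No headers at all means the request failed.
        if ($file_headers === false)
        {
            return false;
        }
        $InvalidHeaders = array('404', '403', '500');
        foreach ($InvalidHeaders as $HeaderVal)
        {
            if (strstr($file_headers[0], $HeaderVal))
            {
                $exists = false;
                break;
            }
        }
        return $exists;
    }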

User · Answer

    $headers = @get_headers($this->_value);
    if (strpos($headers[0], '200') === false) return false;

So any time you contact a website and get something other than 200 OK, this returns false.

User · Answer

Kind of an old thread, but I do this:

    $file = 'http://www.google.com';
    $file_headers = @get_headers($file);
    if ($file_headers) {
        $exists = true;
    } else {
        $exists = false;
    }

User · Answer

The best and simplest answer so far using get_headers(). The best thing is to check for the string "200 ok". It's far better than checking

    $file_headers = @get_headers($file_path);
    $file_headers[0];

because sometimes the array key numbers vary. So the best thing is to check for "200 ok". Any URL which is up will have "200 ok" somewhere in the get_headers() response.

    function url_exist($url) {
        $urlheaders = get_headers($url);
        //print_r($urlheaders);
        $urlmatches = preg_grep('/200 ok/i', $urlheaders);
        if (!empty($urlmatches)) {
            return true;
        } else {
            return false;
        }
    }

Now check whether the function returns true or false:

    if (url_exist($url)) {
        echo "URL exists";
    } else {
        echo "URL doesn't exist";
    }

User · Answer

When figuring out whether a URL exists from PHP, there are a few things to pay attention to:

- Is the URL itself valid (a string, not empty, good syntax)? This is quick to check server side.
- Waiting for a response might take time and block code execution.
- Not all headers returned by get_headers() are well formed.
- Use curl (if you can).
- Prevent fetching the entire body content; request only the headers.
- Consider redirecting URLs: do you want the first code returned, or do you want to follow all redirects and return the last code? You might end up with a 200, but it could redirect using meta tags or JavaScript. Figuring out what happens after is tough.

Keep in mind that whatever method you use, it takes time to wait for a response. All code might (and probably will) halt until you either know the result or the requests have timed out.

For example, the code below could take a LONG time to display the page if the URLs are invalid or unreachable:

    <?php
    $urls = getUrls(); // some function getting say 10 or more external links

    foreach ($urls as $k => $url) {
        // this could potentially take 0-30 seconds each
        // (more or less depending on connection, target site, timeout settings...)
        if (!isValidUrl($url)) {
            unset($urls[$k]);
        }
    }

    echo "yay all done! now show my site";
    foreach ($urls as $url) {
        echo "<a href=\"{$url}\">{$url}</a><br/>";
    }

The functions below could be helpful; you probably want to modify them to suit your needs:

    function isValidUrl($url) {
        // first do some quick sanity checks:
        if (!$url || !is_string($url)) {
            return false;
        }
        // quick check url is roughly a valid http request: ( http://blah/... )
        if (!preg_match('/^http(s)?:\/\/[a-z0-9-]+(\.[a-z0-9-]+)*(:[0-9]+)?(\/.*)?$/i', $url)) {
            return false;
        }
        // the next bit could be slow:
        if (getHttpResponseCode_using_curl($url) != 200) {
    //  if (getHttpResponseCode_using_getheaders($url) != 200) { // use this one if you can't use curl
            return false;
        }
        // all good!
        return true;
    }

    function getHttpResponseCode_using_curl($url, $followredirects = true) {
        // returns int responsecode, or false (if url does not exist or connection timeout occurs)
        // NOTE: could potentially take up to 0-30 seconds, blocking further code execution
        // (more or less depending on connection, target site, and local timeout settings)
        // if $followredirects == false: return the FIRST known httpcode (ignore redirects)
        // if $followredirects == true : return the LAST  known httpcode (when redirected)
        if (!$url || !is_string($url)) {
            return false;
        }
        $ch = curl_init($url);
        if ($ch === false) {
            return false;
        }
        curl_setopt($ch, CURLOPT_HEADER, true);          // we want headers
        curl_setopt($ch, CURLOPT_NOBODY, true);          // don't need body
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // catch output (do NOT print!)
        if ($followredirects) {
            curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
            curl_setopt($ch, CURLOPT_MAXREDIRS, 10);     // fairly random number, but could prevent unwanted endless redirects with followlocation=true
        } else {
            curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);
        }
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);     // fairly random number (seconds)... but could prevent waiting forever to get a result
        curl_setopt($ch, CURLOPT_TIMEOUT, 6);            // fairly random number (seconds)... but could prevent waiting forever to get a result
        curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.0) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.89 Safari/537.1"); // pretend we're a regular browser
        curl_exec($ch);
        if (curl_errno($ch)) {   // should be 0
            curl_close($ch);
            return false;
        }
        $code = curl_getinfo($ch, CURLINFO_HTTP_CODE); // note: php.net documentation shows this returns a string, but really it returns an int
        curl_close($ch);
        return $code;
    }

    function getHttpResponseCode_using_getheaders($url, $followredirects = true) {
        // returns string responsecode, or false if no responsecode found in headers (or url does not exist)
        // NOTE: could potentially take up to 0-30 seconds, blocking further code execution
        // (more or less depending on connection, target site, and local timeout settings)
        // if $followredirects == false: return the FIRST known httpcode (ignore redirects)
        // if $followredirects == true : return the LAST  known httpcode (when redirected)
        if (!$url || !is_string($url)) {
            return false;
        }
        $headers = @get_headers($url);
        if ($headers && is_array($headers)) {
            if ($followredirects) {
                // we want the last errorcode, reverse array so we start at the end:
                $headers = array_reverse($headers);
            }
            foreach ($headers as $hline) {
                // search for things like "HTTP/1.1 200 OK", "HTTP/1.0 200 OK", "HTTP/1.1 301 PERMANENTLY MOVED", "HTTP/1.1 404 Not Found", etc.
                // note that the exact syntax/version/output differs, so there is some string magic involved here
                if (preg_match('/^HTTP\/\S+\s+([1-9][0-9][0-9])\s+.*/', $hline, $matches)) { // "HTTP/*** ### ***"
                    $code = $matches[1];
                    return $code;
                }
            }
            // no HTTP/xxx found in headers:
            return false;
        }
        // no headers:
        return false;
    }
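One way to keep a page from stalling on a long list of links is to give the whole loop a time budget. The sketch below is only an illustration of that idea: filterUrlsWithinBudget is a hypothetical name, and the checker is passed in as a callable so any function that takes a URL and returns a boolean can be used.

```php
<?php
// Hypothetical helper: check URLs until a time budget is used up.
// $check is any callable that takes a URL and returns bool.
function filterUrlsWithinBudget(array $urls, callable $check, float $budgetSeconds): array
{
    $deadline = microtime(true) + $budgetSeconds;
    $result = ['valid' => [], 'invalid' => [], 'unchecked' => []];

    foreach ($urls as $url) {
        if (microtime(true) >= $deadline) {
            // Out of time: leave the remaining URLs unchecked instead of blocking.
            $result['unchecked'][] = $url;
            continue;
        }
        $result[$check($url) ? 'valid' : 'invalid'][] = $url;
    }
    return $result;
}
```

The budget is only tested between checks, so a single slow request can still overrun it; per-request timeouts remain necessary.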

User · Answer

Pretty fast:

    function http_response($url) {
        $resURL = curl_init();
        curl_setopt($resURL, CURLOPT_URL, $url);
        curl_setopt($resURL, CURLOPT_BINARYTRANSFER, 1);
        // 'curlHeaderCallback' must be a function defined elsewhere;
        // drop this line if you have no header callback.
        curl_setopt($resURL, CURLOPT_HEADERFUNCTION, 'curlHeaderCallback');
        curl_setopt($resURL, CURLOPT_FAILONERROR, 1);
        curl_exec($resURL);
        $intReturnCode = curl_getinfo($resURL, CURLINFO_HTTP_CODE);
        curl_close($resURL);
        if ($intReturnCode != 200 && $intReturnCode != 302 && $intReturnCode != 304) {
            return 0;
        } else {
            return 1;
        }
    }

    echo 'google:';
    echo http_response('http://www.google.com');
    echo '/ ogogle:';
    echo http_response('http://www.ogogle.com');

User · Answer

get_headers() returns an array with the headers sent by the server in response to an HTTP request.

    $image_path = 'https://your-domain.com/assets/img/image.jpg';

    $file_headers = @get_headers($image_path);
    // Prints the response out in an array
    //print_r($file_headers);

    if ($file_headers[0] == 'HTTP/1.1 404 Not Found') {
        echo 'Failed because path does not exist.<br>';
    } else {
        echo 'It works. You\'re good to go!<br>';
    }

User · Answer

One thing to take into consideration when you check the header for a 404 is the case where a site does not generate a 404 immediately.

A lot of sites check whether a page exists in the PHP/ASP (et cetera) source and forward you to a 404 page. In those cases the header is basically extended by the header of the 404 that is generated, so the 404 is not in the first line of the header but in the tenth.

    $array = get_headers($url);
    $string = $array[0];
    print_r($array);

would generate:

    Array
    (
        [0] => HTTP/1.0 301 Moved Permanently
        [1] => Date: Fri, 09 Nov 2018 16:12:29 GMT
        [2] => Server: Apache/2.4.34 (FreeBSD) LibreSSL/2.7.4 PHP/7.0.31
        [3] => X-Powered-By: PHP/7.0.31
        [4] => Set-Cookie: landing=%2Freed-diffuser-fig-pudding-50; path=/; HttpOnly
        [5] => Location: /reed-diffuser-fig-pudding-50/
        [6] => Content-Length: 0
        [7] => Connection: close
        [8] => Content-Type: text/html; charset=utf-8
        [9] => HTTP/1.0 404 Not Found
        [10] => Date: Fri, 09 Nov 2018 16:12:29 GMT
        [11] => Server: Apache/2.4.34 (FreeBSD) LibreSSL/2.7.4 PHP/7.0.31
        [12] => X-Powered-By: PHP/7.0.31
        [13] => Set-Cookie: landing=%2Freed-diffuser-fig-pudding-50%2F; path=/; HttpOnly
        [14] => Connection: close
        [15] => Content-Type: text/html; charset=utf-8
    )
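Given a header array like the one above, the status that matters is the last status line, not the first. A minimal sketch of that scan follows; lastStatusCode is a hypothetical helper, and the sample array is a shortened copy of the output shown in this answer.

```php
<?php
// Hypothetical helper: return the status code of the last response in a
// get_headers()-style array, or null if no status line is present.
function lastStatusCode(array $headers): ?int
{
    $code = null;
    foreach ($headers as $line) {
        if (is_string($line) && preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m) === 1) {
            $code = (int) $m[1]; // later status lines overwrite earlier ones
        }
    }
    return $code;
}

$headers = [
    'HTTP/1.0 301 Moved Permanently',
    'Location: /reed-diffuser-fig-pudding-50/',
    'Content-Length: 0',
    'HTTP/1.0 404 Not Found',
    'Content-Type: text/html; charset=utf-8',
];
var_dump(lastStatusCode($headers)); // int(404)
```

Checking only $headers[0] on this input would see the 301 and miss the 404 that follows the redirect.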

User · Answer

Here is a solution that reads only the first byte of the source, returning false if file_get_contents fails. This will also work for remote files like images.

    function urlExists($url)
    {
        // Compare against false explicitly: a first byte of "0" is falsy in PHP.
        if (@file_get_contents($url, false, NULL, 0, 1) !== false)
        {
            return true;
        }
        return false;
    }
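One detail is worth seeing in isolation: file_get_contents() returns the bytes it read, and the string "0" is falsy in PHP, so the result is best compared against false explicitly. The sketch below shows the difference on a local temporary file so that no network access is needed; firstByteReadable is a hypothetical name.

```php
<?php
// Hypothetical helper: true if at least the first byte of $path can be read.
function firstByteReadable(string $path): bool
{
    $byte = @file_get_contents($path, false, null, 0, 1);
    // A plain truthiness test would reject the valid byte "0".
    return $byte !== false && $byte !== '';
}

// Local temporary file whose content starts with the character "0".
$tmp = tempnam(sys_get_temp_dir(), 'url');
file_put_contents($tmp, '0123');
var_dump((bool) @file_get_contents($tmp, false, null, 0, 1)); // bool(false)
var_dump(firstByteReadable($tmp));                            // bool(true)
unlink($tmp);
```

The same comparison applies when the first argument is a URL, provided allow_url_fopen is enabled.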

User · Answer

    $url = 'http://google.com';
    $not_url = 'stp://google.com';

    if (@file_get_contents($url)): echo "Found '$url'!";
    else: echo "Can't find '$url'.";
    endif;
    if (@file_get_contents($not_url)): echo "Found '$not_url'!";
    else: echo "Can't find '$not_url'.";
    endif;

    // Found 'http://google.com'!Can't find 'stp://google.com'.

User · Answer

Another way to check whether a URL is valid:

    <?php

    if (isValidURL("http://www.gimepix.com")) {
        echo "URL is valid...";
    } else {
        echo "URL is not valid...";
    }

    function isValidURL($url) {
        $file_headers = @get_headers($url);
        if (strpos($file_headers[0], "200 OK") > 0) {
            return true;
        } else {
            return false;
        }
    }
    ?>

User · Answer

You cannot use curl on certain servers; in that case you can use this code:

    <?php
    $url = 'http://www.example.com';
    $array = get_headers($url);
    $string = $array[0];
    if (strpos($string, "200")) {
        echo 'url exists';
    } else {
        echo 'url does not exist';
    }
    ?>
