PHP validation/regex for URL

Question

I've been looking for a simple regex for URLs. Does anybody have one handy that works well? I didn't find one in the Zend Framework validation classes, and I have seen several implementations.

User · Answer

Just in case you want to know if the URL really exists:

    function url_exist($url) { // if this passes, the URL exists
        $c = curl_init();
        curl_setopt($c, CURLOPT_URL, $url);
        curl_setopt($c, CURLOPT_HEADER, 1);         // get the header
        curl_setopt($c, CURLOPT_NOBODY, 1);         // and *only* get the header
        curl_setopt($c, CURLOPT_RETURNTRANSFER, 1); // get the response as a string from curl_exec(), rather than echoing it
        curl_setopt($c, CURLOPT_FRESH_CONNECT, 1);  // don't use a cached version of the url
        if (!curl_exec($c)) {
            // echo $url . ' inexists';
            return false;
        } else {
            // echo $url . ' exists';
            return true;
        }
        // $httpcode = curl_getinfo($c, CURLINFO_HTTP_CODE);
        // return ($httpcode < 400);
    }

User · Answer

There is a PHP native function for that:

    $url = 'http://www.yoururl.co.uk/sub1/sub2/?param=1&param2/';

    if ( ! filter_var( $url, FILTER_VALIDATE_URL ) ) {
        // Wrong
    }
    else {
        // Valid
    }

Returns the filtered data, or FALSE if the filter fails. Check it in the PHP manual.
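Since FILTER_VALIDATE_URL accepts any scheme, a common follow-up is to restrict the result to the schemes you actually want. The sketch below is only an illustration: the helper name is_http_url() is hypothetical and the http/https whitelist is an assumption, not something the answer prescribes.

```php
<?php
// Hypothetical helper (not part of PHP): accept only absolute http/https URLs.
function is_http_url(string $url): bool
{
    // filter_var() returns the filtered string on success and false on failure.
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }

    // FILTER_VALIDATE_URL does not limit the scheme, so check it explicitly.
    $scheme = parse_url($url, PHP_URL_SCHEME);

    return in_array(strtolower((string) $scheme), ['http', 'https'], true);
}

var_dump(is_http_url('http://www.example.com/sub1/?param=1')); // bool(true)
var_dump(is_http_url('example.com'));                          // bool(false), no scheme
var_dump(is_http_url('ftp://example.com/file.txt'));           // bool(false), scheme not whitelisted
```

The strict comparison with false matters because filter_var() returns the URL string itself on success, not true.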

User · Answer

And there is your answer =) Try to break it, you can't!!!

    function link_validate_url($text) {
        $LINK_DOMAINS = 'aero|arpa|asia|biz|com|cat|coop|edu|gov|info|int|jobs|mil|museum|name|nato|net|org|pro|travel|mobi|local';

        $LINK_ICHARS_DOMAIN = (string) html_entity_decode(implode("", array( // @TODO completing letters ...
            "&#x00E6;", "&#x00C6;", "&#x00C0;", "&#x00E0;", "&#x00C1;", "&#x00E1;", "&#x00C2;", "&#x00E2;",
            "&#x00E5;", "&#x00C5;", "&#x00E4;", "&#x00C4;", "&#x00C7;", "&#x00E7;", "&#x00D0;", "&#x00F0;",
            "&#x00C8;", "&#x00E8;", "&#x00C9;", "&#x00E9;", "&#x00CA;", "&#x00EA;", "&#x00CB;", "&#x00EB;",
            "&#x00CE;", "&#x00EE;", "&#x00CF;", "&#x00EF;", "&#x00F8;", "&#x00D8;", "&#x00F6;", "&#x00D6;",
            "&#x00D4;", "&#x00F4;", "&#x00D5;", "&#x00F5;", "&#x0152;", "&#x0153;", "&#x00FC;", "&#x00DC;",
            "&#x00D9;", "&#x00F9;", "&#x00DB;", "&#x00FB;", "&#x0178;", "&#x00FF;", "&#x00D1;", "&#x00F1;",
            "&#x00FE;", "&#x00DE;", "&#x00FD;", "&#x00DD;", "&#x00BF;",
        )), ENT_QUOTES, 'UTF-8');

        $LINK_ICHARS = $LINK_ICHARS_DOMAIN . (string) html_entity_decode(implode("", array(
            "&#x00DF;",
        )), ENT_QUOTES, 'UTF-8');
        $allowed_protocols = array('http', 'https', 'ftp', 'news', 'nntp', 'telnet', 'mailto', 'irc', 'ssh', 'sftp', 'webcal');

        // Starting a parenthesis group with (?: means that it is grouped, but is not captured
        $protocol = '((?:' . implode("|", $allowed_protocols) . '):\/\/)';
        $authentication = "(?:(?:(?:[\w\.\-\+!$&'\(\)*\+,;=" . $LINK_ICHARS . "]|%[0-9a-f]{2})+(?::(?:[\w" . $LINK_ICHARS . "\.\-\+%!$&'\(\)*\+,;=]|%[0-9a-f]{2})*)?)?@)";
        $domain = '(?:(?:[a-z0-9' . $LINK_ICHARS_DOMAIN . ']([a-z0-9' . $LINK_ICHARS_DOMAIN . '\-_\[\]])*)(\.(([a-z0-9' . $LINK_ICHARS_DOMAIN . '\-_\[\]])+\.)*(' . $LINK_DOMAINS . '|[a-z]{2}))?)';
        $ipv4 = '(?:[0-9]{1,3}(\.[0-9]{1,3}){3})';
        $ipv6 = '(?:[0-9a-fA-F]{1,4}(\:[0-9a-fA-F]{1,4}){7})';
        $port = '(?::([0-9]{1,5}))';

        // Pattern specific to external links.
        $external_pattern = '/^' . $protocol . '?' . $authentication . '?(' . $domain . '|' . $ipv4 . '|' . $ipv6 . '|localhost)' . $port . '?';

        // Pattern specific to internal links.
        $internal_pattern = "/^(?:[a-z0-9" . $LINK_ICHARS . "_\-+\[\]]+)";
        $internal_pattern_file = "/^(?:[a-z0-9" . $LINK_ICHARS . "_\-+\[\]\.]+)$/i";

        $directories = "(?:\/[a-z0-9" . $LINK_ICHARS . "_\-\.~+%=&,$'#!():;*@\[\]]*)*";
        // Yes, four backslashes == a single backslash.
        $query = "(?:\/?\?([?a-z0-9" . $LINK_ICHARS . "+_|\-\.~\/\\\\%=&,$'():;*@\[\]{} ]*))";
        $anchor = "(?:#[a-z0-9" . $LINK_ICHARS . "_\-\.~+%=&,$'():;*@\[\]\/\?]*)";

        // The rest of the path for a standard URL.
        $end = $directories . '?' . $query . '?' . $anchor . '?' . '$/i';

        $message_id = '[^@].*@' . $domain;
        $newsgroup_name = '(?:[0-9a-z+-]*\.)*[0-9a-z+-]*';
        $news_pattern = '/^news:(' . $newsgroup_name . '|' . $message_id . ')$/i';

        $user = '[a-zA-Z0-9' . $LINK_ICHARS . '_\-\.\+\^!#\$%&*+\/\=\?\`\|\{\}~\'\[\]]+';
        $email_pattern = '/^mailto:' . $user . '@' . '(?:' . $domain . '|' . $ipv4 . '|' . $ipv6 . '|localhost)' . $query . '?$/';

        if (strpos($text, '<front>') === 0) {
            return false;
        }
        if (in_array('mailto', $allowed_protocols) && preg_match($email_pattern, $text)) {
            return false;
        }
        if (in_array('news', $allowed_protocols) && preg_match($news_pattern, $text)) {
            return false;
        }
        if (preg_match($internal_pattern . $end, $text)) {
            return false;
        }
        if (preg_match($external_pattern . $end, $text)) {
            return false;
        }
        if (preg_match($internal_pattern_file, $text)) {
            return false;
        }

        return true;
    }

User · Answer

For anyone developing with WordPress, just use

    esc_url_raw($url) === $url

to validate a URL (here's WordPress' documentation on esc_url_raw). It handles URLs much better than filter_var($url, FILTER_VALIDATE_URL) because it is unicode and XSS-safe. (Here is a good article mentioning all the problems with filter_var.)

User · Answer

    function is_valid_url ($url = "") {

        if ($url == "") {
            $url = $this->url;
        }

        $url = @parse_url($url);

        if ( ! $url) {
            return false;
        }

        $url = array_map('trim', $url);
        $url['port'] = ( ! isset($url['port'])) ? 80 : (int) $url['port'];
        $path = (isset($url['path'])) ? $url['path'] : '';

        if ($path == '') {
            $path = '/';
        }

        $path .= (isset($url['query'])) ? "?$url[query]" : '';

        if (isset($url['host']) AND $url['host'] != gethostbyname($url['host'])) {
            if (PHP_VERSION >= 5) {
                $headers = get_headers("$url[scheme]://$url[host]:$url[port]$path");
            }
            else {
                $fp = fsockopen($url['host'], $url['port'], $errno, $errstr, 30);

                if ( ! $fp) {
                    return false;
                }
                fputs($fp, "HEAD $path HTTP/1.1\r\nHost: $url[host]\r\n\r\n");
                $headers = fread($fp, 128);
                fclose($fp);
            }
            $headers = (is_array($headers)) ? implode("\n", $headers) : $headers;
            return (bool) preg_match('#^HTTP/.*\s+[(200|301|302)]+\s#i', $headers);
        }

        return false;
    }

User · Answer

Use the filter_var() function to validate whether a string is a URL or not:

    var_dump(filter_var('example.com', FILTER_VALIDATE_URL));

It is bad practice to use regular expressions when not necessary.

EDIT: Be careful, this solution is not unicode-safe and not XSS-safe. If you need a complex validation, maybe it's better to look somewhere else.

User · Answer

Peter's regex doesn't look right to me for many reasons. It allows all kinds of special characters in the domain name and doesn't test for much.

Frankie's function looks good to me, and you can build a good regex from the components if you don't want a function, like so:

    ^(http://|https://)(([a-z0-9]([-a-z0-9]*[a-z0-9]+)?){1,63}\.)+[a-z]{2,6}

Untested, but I think that should work.

Also, Owen's answer doesn't look 100% either. I took the domain part of the regex and tested it on a regex tester tool: http://erik.eae.net/playground/regexp/regexp.html

I put the following line:

    (\S*?\.\S*?)

in the "regexp" section and the following line:

    -hello.com

under the "sample text" section.

The result allowed the minus character through, because \S means any non-space character.

Note that the regex from Frankie handles the minus because it has this part for the first character:

    [a-z0-9]

which won't allow the minus or any other special character.
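The point about the first character of each domain label can be checked in isolation. Below is a minimal sketch of a per-label hostname check; the function name is_valid_hostname() is hypothetical, and the rules it encodes (labels of 1 to 63 characters, alphanumeric at both ends, an alphabetic last label of two or more letters) are assumptions chosen for illustration rather than a full implementation of the DNS rules.

```php
<?php
// Hypothetical helper: validate a hostname label by label.
function is_valid_hostname(string $host): bool
{
    $labels = explode('.', $host);
    if (count($labels) < 2) {
        return false; // assumption: require at least "name.tld"
    }

    foreach ($labels as $label) {
        // Each label: at most 63 chars, alphanumeric first and last, hyphens only inside.
        if (strlen($label) > 63
            || !preg_match('/\A[a-z0-9](?:[-a-z0-9]*[a-z0-9])?\z/i', $label)) {
            return false;
        }
    }

    // Assumption: the last label is alphabetic and at least two letters long.
    return (bool) preg_match('/\A[a-z]{2,}\z/i', end($labels));
}

var_dump(is_valid_hostname('hello.com'));  // bool(true)
var_dump(is_valid_hostname('-hello.com')); // bool(false), label starts with a hyphen
```

Splitting on dots keeps each regex small, which makes it easier to see why "-hello.com" is rejected here while a bare \S-based pattern lets it through.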

User · Answer

I've used this one with good success; I don't remember where I got it from:

    $pattern = "/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i";

User · Answer

I've found this to be the most useful for matching a URL:

    ^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$

User · Answer

Here is the way I did it. I want to mention that I am not so sure about the regex, but it should work:

    $pattern = "#((http|https)://(\S*?\.\S*?))(\s|\;|\)|\]|\[|\{|\}|,|\"|'|:|\<|$|\.\s)#i";

    $text = preg_replace_callback($pattern, function ($m) {
            return "<a href=\"$m[1]\" target=\"_blank\">$m[1]</a>$m[4]";
        },
        $text);

This way you won't need the eval marker on your pattern.

Hope it helps.

User · Answer

    function validateURL($URL) {
        $pattern_1 = "/^(http|https|ftp):\/\/(([A-Z0-9][A-Z0-9_-]*)(\.[A-Z0-9][A-Z0-9_-]*)+.(com|org|net|dk|at|us|tv|info|uk|co.uk|biz|se)$)(:(\d+))?\/?/i";
        $pattern_2 = "/^(www)((\.[A-Z0-9][A-Z0-9_-]*)+.(com|org|net|dk|at|us|tv|info|uk|co.uk|biz|se)$)(:(\d+))?\/?/i";

        if (preg_match($pattern_1, $URL) || preg_match($pattern_2, $URL)) {
            return true;
        } else {
            return false;
        }
    }

User · Answer

I used this on a few projects. I don't believe I've run into issues, but I'm sure it's not exhaustive:

    $text = preg_replace(
        '#((https?|ftp)://(\S*?\.\S*?))([\s)\[\]{},;"\':<]|\.\s|$)#i',
        '<a href="$1" target="_blank">$3</a>$4',
        $text
    );

Most of the random junk at the end is to deal with situations like http://domain.com. in a sentence (to avoid matching the trailing period). I'm sure it could be cleaned up, but since it worked, I've more or less just copied it over from project to project.
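The trailing-period problem can also be handled after matching instead of inside the pattern. The following is a rough sketch of that alternative, not the answer's own code: linkify() is a hypothetical name, the pattern is deliberately loose, and the set of characters stripped from the end is an assumption.

```php
<?php
// Hypothetical helper: turn http(s) URLs in plain text into links,
// leaving sentence punctuation that follows a URL outside the link.
function linkify(string $text): string
{
    $result = preg_replace_callback(
        '#\bhttps?://[^\s<>"]+#i',
        function (array $m): string {
            // Peel trailing sentence punctuation off the raw match.
            $url  = rtrim($m[0], '.,;:!?)');
            $tail = substr($m[0], strlen($url));

            // Escape before placing the URL into HTML.
            $safe = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');

            return '<a href="' . $safe . '" target="_blank">' . $safe . '</a>' . $tail;
        },
        $text
    );

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $text;
}

echo linkify('See http://domain.com. It works.');
// See <a href="http://domain.com" target="_blank">http://domain.com</a>. It works.
```

Using a callback also gives a natural place to escape the URL, which a plain replacement string does not.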

User · Answer

OK, so this is a little bit more complex than a simple regex, but it allows for different types of URLs.

Examples:

    google.com
    www.microsoft.com/
    http://www.yahoo.com/
    https://www.bandcamp.com/artist/#!someone-special!

All of which should be marked as valid.

    function is_valid_url($url) {
        // First check: is the url just a domain name? (allow a slash at the end)
        $_domain_regex = "|^[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)*(\.[A-Za-z]{2,})/?$|";
        if (preg_match($_domain_regex, $url)) {
            return true;
        }

        // Second: Check if it's a url with a scheme and all
        $_regex = '#^([a-z][\w-]+:(?:/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+$#';
        if (preg_match($_regex, $url, $matches)) {
            // pull out the domain name, and make sure that the domain is valid.
            $_parts = parse_url($url);
            if (!in_array($_parts['scheme'], array('http', 'https')))
                return false;

            // Check the domain using the regex, stops domains like "-example.com" passing through
            if (!preg_match($_domain_regex, $_parts['host']))
                return false;

            // This domain looks pretty valid. Only way to check it now is to download it!
            return true;
        }

        return false;
    }

Note that there is an in_array check for the protocols that you want to allow (currently only http and https are in that list).

    var_dump(is_valid_url('google.com'));         // true
    var_dump(is_valid_url('google.com/'));        // true
    var_dump(is_valid_url('http://google.com'));  // true
    var_dump(is_valid_url('http://google.com/')); // true
    var_dump(is_valid_url('https://google.com')); // true

User · Answer

Here's a simple class for URL validation that uses a regex and then cross-references the domain against popular RBL (Realtime Blackhole List) servers.

Install:

    require 'URLValidation.php';

Usage:

    require 'URLValidation.php';
    $urlVal = new UrlValidation(); // Create Object Instance

Add a URL as the parameter of the domain() method and check the return value.

    $urlArray = ['http://www.bokranzr.com/test.php?test=foo&test=dfdf', 'https://en-gb.facebook.com', 'https://www.google.com'];
    foreach ($urlArray as $k => $v) {
        echo var_dump($urlVal->domain($v)) . ' URL: ' . $v . '<br>';
    }

Output:

    bool(false) URL: http://www.bokranzr.com/test.php?test=foo&test=dfdf
    bool(true) URL: https://en-gb.facebook.com
    bool(true) URL: https://www.google.com

As you can see above, www.bokranzr.com is listed as a malicious website via an RBL, so the domain was returned as false.

User · Answer

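As per John Gruber (Daring Fireball):

Regex:

    (?i)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))

Using it in preg_match():

    preg_match("/(?i)\b((?:https?:\/\/|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'\".,<>?«»“”‘’]))/", $url)

Here is the extended regex pattern (with comments):

    (?xi)
    \b
    (                           # Capture 1: entire matched URL
      (?:
        https?://               # http or https protocol
        |                       #   or
        www\d{0,3}[.]           # "www.", "www1.", "www2." ... "www999."
        |                       #   or
        [a-z0-9.\-]+[.][a-z]{2,4}/  # looks like domain name followed by a slash
      )
      (?:                       # One or more:
        [^\s()<>]+              # Run of non-space, non-()<>
        |                       #   or
        \(([^\s()<>]+|(\([^\s()<>]+\)))*\)  # balanced parens, up to 2 levels
      )+
      (?:                       # End with:
        \(([^\s()<>]+|(\([^\s()<>]+\)))*\)  # balanced parens, up to 2 levels
        |                       #   or
        [^\s`!()\[\]{};:'".,<>?«»“”‘’]      # not a space or one of these punct chars
      )
    )

For more details please look at: http://daringfireball.net/2010/07/improved_regex_for_matching_urls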

User · Answer

As per the PHP manual, parse_url() should not be used to validate a URL.

Unfortunately, it seems that filter_var('example.com', FILTER_VALIDATE_URL) does not perform any better.

Both parse_url() and filter_var() will pass malformed URLs such as http://...

Therefore, in this case, a regex is the better method.

User · Answer

    "/(http(s?):\/\/)([a-z0-9\-]+\.)+[a-z]{2,4}(\.[a-z]{2,4})*(\/[^ ]+)*/i"

1. (http(s?):\/\/) matches http:// or https://.
2. ([a-z0-9\-]+\.)+ breaks down as follows:
   2.0 [a-z0-9\-] matches any letter a-z, any digit 0-9, or a hyphen (-).
   2.1 + means one or more of those characters (e.g. a1w, a9-, c559s, f).
   2.2 \. matches a literal dot (.).
   2.3 The + after ([a-z0-9\-]+\.) repeats 2.0 to 2.2 at least once (e.g. abc.defgh0.ig, aa.b.ced.f.gh), which also covers cases like www.yyy.com.
3. [a-z]{2,4} matches at least 2 but no more than 4 letters, which rules out cases like https://www.google.co.kr.asdsdagfsdfsf.
4. (\.[a-z]{2,4})*(\/[^ ]+)* breaks down as follows:
   4.1 \.[a-z]{2,4} is the same as item 3, but preceded by a dot (.).
   4.2 * means (\.[a-z]{2,4}) may appear any number of times, or not at all.
   4.3 \/ matches a literal slash (/).
   4.4 [^ ] matches any character except a space.
   4.5 + means 4.4 repeats at least once.
   4.6 The * after (\/[^ ]+) means 4.3 to 4.5 may appear any number of times, or not at all. This handles cases like https://stackoverflow.com/posts/51441301/edit.
5. In PHP the regex is written between "/ /" delimiters, so it becomes "/(http(s?):\/\/)([a-z0-9\-]+\.)+[a-z]{2,4}(\.[a-z]{2,4})*(\/[^ ]+)*/i".
6. Almost forgot: the trailing i means the match ignores case (A is the same as a, SoRRy the same as sorry).

Note: Sorry for my bad English. It is not widely used in my country.
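One caveat worth illustrating with a pattern like the one explained above: without anchors, preg_match() reports a hit whenever a URL appears anywhere inside the input, so it finds URLs rather than validating them. The sketch below compares the pattern as I read it from the answer with an anchored variant; the \A and \z anchors and the variable names are my additions, not part of the answer.

```php
<?php
// The pattern from the explanation above (unanchored), as read from the answer.
$loose = '/(http(s?):\/\/)([a-z0-9\-]+\.)+[a-z]{2,4}(\.[a-z]{2,4})*(\/[^ ]+)*/i';

// Same pattern with \A and \z added (an assumption for this sketch),
// so the whole input has to be a URL.
$strict = '/\A(http(s?):\/\/)([a-z0-9\-]+\.)+[a-z]{2,4}(\.[a-z]{2,4})*(\/[^ ]+)*\z/i';

$input = 'not a url http://example.com <script>';

var_dump(preg_match($loose, $input));  // int(1): a URL occurs somewhere in the text
var_dump(preg_match($strict, $input)); // int(0): the text as a whole is not a URL
var_dump(preg_match($strict, 'https://stackoverflow.com/posts/51441301/edit')); // int(1)
```

For validating user input, the anchored form is usually what is wanted; the unanchored form is better suited to finding links inside free text.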

User · Answer

I don't think that using regular expressions is a smart thing to do in this case. It is impossible to match all of the possibilities, and even if you did, there is still a chance that the URL simply doesn't exist.

Here is a very simple way to test if a URL actually exists and is readable:

    if (preg_match("#^https?://.+#", $link) and @fopen($link, "r")) echo "OK";

(If there were no preg_match, this would also validate all filenames on your server.)
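A variation on the same idea is to look at the HTTP status line rather than only whether the stream opens. This is a minimal sketch under stated assumptions: both function names are hypothetical, get_headers() needs allow_url_fopen to be enabled, and treating any first-response status from 200 to 399 as "exists" is a choice made for illustration.

```php
<?php
// Hypothetical helper: extract the numeric status from a line such as "HTTP/1.1 200 OK".
function status_code_from_line(string $statusLine): ?int
{
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})\b#i', $statusLine, $m)) {
        return (int) $m[1];
    }

    return null;
}

// Hypothetical wrapper: true when the first response status is 2xx or 3xx.
function url_responds(string $url): bool
{
    // Same guard as above: never hand local paths to the stream wrappers.
    if (!preg_match('#^https?://.+#i', $url)) {
        return false;
    }

    $headers = @get_headers($url); // performs a network request
    if ($headers === false || !isset($headers[0])) {
        return false;
    }

    $code = status_code_from_line($headers[0]);

    return $code !== null && $code >= 200 && $code < 400;
}

var_dump(status_code_from_line('HTTP/1.1 404 Not Found')); // int(404)
```

Keeping the status-line parsing in its own function means that part can be tested without touching the network.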

User · Answer

Inspired by this .NET StackOverflow question and by the article referenced from that question, there is this URI validator (URI means it validates both URLs and URNs).

    if( ! preg_match( "/^([a-z][a-z0-9+.-]*):(?:\\/\\/((?:(?=((?:[a-z0-9-._~!$&'()*+,;=:]|%[0-9A-F]{2})*))(\\3)@)?(?=(\\[[0-9A-F:.]{2,}\\]|(?:[a-z0-9-._~!$&'()*+,;=]|%[0-9A-F]{2})*))\\5(?::(?=(\\d*))\\6)?)(\\/(?=((?:[a-z0-9-._~!$&'()*+,;=:@\\/]|%[0-9A-F]{2})*))\\8)?|(\\/?(?!\\/)(?=((?:[a-z0-9-._~!$&'()*+,;=:@\\/]|%[0-9A-F]{2})*))\\10)?)(?:\\?(?=((?:[a-z0-9-._~!$&'()*+,;=:@\\/?]|%[0-9A-F]{2})*))\\11)?(?:#(?=((?:[a-z0-9-._~!$&'()*+,;=:@\\/?]|%[0-9A-F]{2})*))\\12)?$/i", $uri ) )
    {
        throw new \RuntimeException( "URI has not a valid format." );
    }

I have successfully unit-tested this function inside a ValueObject I made named Uri and tested by UriTest.

UriTest.php (contains valid and invalid cases for both URLs and URNs)

    <?php

    declare( strict_types = 1 );

    namespace XaviMontero\ThrasherPortage\Tests\Tour;

    use XaviMontero\ThrasherPortage\Tour\Uri;

    class UriTest extends \PHPUnit_Framework_TestCase
    {
        private $sut;

        public function testCreationIsOfProperClassWhenUriIsValid()
        {
            $sut = new Uri( 'http://example.com' );
            $this->assertInstanceOf( 'XaviMontero\\ThrasherPortage\\Tour\\Uri', $sut );
        }

        /**
         * @dataProvider urlIsValidProvider
         * @dataProvider urnIsValidProvider
         */
        public function testGetUriAsStringWhenUriIsValid( string $uri )
        {
            $sut = new Uri( $uri );
            $actual = $sut->getUriAsString();

            $this->assertInternalType( 'string', $actual );
            $this->assertEquals( $uri, $actual );
        }

        public function urlIsValidProvider()
        {
            return
                [
                    [ 'http://example-server' ],
                    [ 'http://example.com' ],
                    [ 'http://example.com/' ],
                    [ 'http://subdomain.example.com/path/?parameter1=value1&parameter2=value2' ],
                    [ 'random-protocol://example.com' ],
                    [ 'http://example.com:80' ],
                    [ 'http://example.com?no-path-separator' ],
                    [ 'http://example.com/pa%20th/' ],
                    [ 'ftp://example.org/resource.txt' ],
                    [ 'file://../../../relative/path/needs/protocol/resource.txt' ],
                    [ 'http://example.com/#one-fragment' ],
                    [ 'http://example.edu:8080#one-fragment' ],
                ];
        }

        public function urnIsValidProvider()
        {
            return
                [
                    [ 'urn:isbn:0-486-27557-4' ],
                    [ 'urn:example:mammal:monotreme:echidna' ],
                    [ 'urn:mpeg:mpeg7:schema:2001' ],
                    [ 'urn:uuid:6e8bc430-9c3a-11d9-9669-0800200c9a66' ],
                    [ 'rare-urn:uuid:6e8bc430-9c3a-11d9-9669-0800200c9a66' ],
                    [ 'urn:FOO:a123,456' ],
                ];
        }

        /**
         * @dataProvider urlIsNotValidProvider
         * @dataProvider urnIsNotValidProvider
         */
        public function testCreationThrowsExceptionWhenUriIsNotValid( string $uri )
        {
            $this->expectException( \RuntimeException::class );
            $this->sut = new Uri( $uri );
        }

        public function urlIsNotValidProvider()
        {
            return
                [
                    [ 'only-text' ],
                    [ 'http//missing.colon.example.com/path/?parameter1=value1&parameter2=value2' ],
                    [ 'missing.protocol.example.com/path/' ],
                    [ 'http://example.com\\bad-separator' ],
                    [ 'http://example.com|bad-separator' ],
                    [ 'ht tp://example.com' ],
                    [ 'http://exampl e.com' ],
                    [ 'http://example.com/pa th/' ],
                    [ '../../../relative/path/needs/protocol/resource.txt' ],
                    [ 'http://example.com/#two-fragments#not-allowed' ],
                    [ 'http://example.edu:portMustBeANumber#one-fragment' ],
                ];
        }

        public function urnIsNotValidProvider()
        {
            return
                [
                    [ 'urn:mpeg:mpeg7:sch ema:2001' ],
                    [ 'urn|mpeg:mpeg7:schema:2001' ],
                    [ 'urn?mpeg:mpeg7:schema:2001' ],
                    [ 'urn%mpeg:mpeg7:schema:2001' ],
                    [ 'urn#mpeg:mpeg7:schema:2001' ],
                ];
        }
    }

Uri.php (Value Object)

    <?php

    declare( strict_types = 1 );

    namespace XaviMontero\ThrasherPortage\Tour;

    class Uri
    {
        /** @var string */
        private $uri;

        public function __construct( string $uri )
        {
            $this->assertUriIsCorrect( $uri );
            $this->uri = $uri;
        }

        public function getUriAsString()
        {
            return $this->uri;
        }

        private function assertUriIsCorrect( string $uri )
        {
            // https://stackoverflow.com/questions/30847/regex-to-validate-uris
            // http://snipplr.com/view/6889/regular-expressions-for-uri-validationparsing/

            if( ! preg_match( "/^([a-z][a-z0-9+.-]*):(?:\\/\\/((?:(?=((?:[a-z0-9-._~!$&'()*+,;=:]|%[0-9A-F]{2})*))(\\3)@)?(?=(\\[[0-9A-F:.]{2,}\\]|(?:[a-z0-9-._~!$&'()*+,;=]|%[0-9A-F]{2})*))\\5(?::(?=(\\d*))\\6)?)(\\/(?=((?:[a-z0-9-._~!$&'()*+,;=:@\\/]|%[0-9A-F]{2})*))\\8)?|(\\/?(?!\\/)(?=((?:[a-z0-9-._~!$&'()*+,;=:@\\/]|%[0-9A-F]{2})*))\\10)?)(?:\\?(?=((?:[a-z0-9-._~!$&'()*+,;=:@\\/?]|%[0-9A-F]{2})*))\\11)?(?:#(?=((?:[a-z0-9-._~!$&'()*+,;=:@\\/?]|%[0-9A-F]{2})*))\\12)?$/i", $uri ) )
            {
                throw new \RuntimeException( "URI has not a valid format." );
            }
        }
    }

Running UnitTests

There are 65 assertions in 46 tests. Caution: there are 2 data providers for valid and 2 more for invalid expressions. One is for URLs and the other for URNs. If you are using PhpUnit v5.6* or earlier, then you need to join the two data providers into a single one.

    xavi@bromo:~/custom_www/hello-trip/mutant-migrant$ vendor/bin/phpunit
    PHPUnit 5.7.3 by Sebastian Bergmann and contributors.

    ..............................................                    46 / 46 (100%)

    Time: 82 ms, Memory: 4.00MB

    OK (46 tests, 65 assertions)

Code coverage

There is 100% code coverage in this sample URI checker.

.............................................. 46 / 46 (100%)

Time: 82 ms, Memory: 4.00MB

OK (46 tests, 65 assertions)

Code coverage: there is 100% code coverage in this sample URI checker.

User · Answer

Edit: As incidence pointed out, this code relies on eregi(), which has been DEPRECATED since the release of PHP 5.3.0 (2009-06-30), so it should be used accordingly.

Just my two cents, but I've developed this function and have been using it for a while with success. It's well documented and separated so you can easily change it.

    /**
     * Checks if string is a URL
     * @param string $url
     * @return bool
     */
    function isURL($url = NULL) {
        if ($url == NULL) return false;

        $protocol = '(http://|https://)';
        $allowed = '([a-z0-9]([-a-z0-9]*[a-z0-9]+)?)';

        $regex = "^" . $protocol .               // must include the protocol
                 '(' . $allowed . '{1,63}\.)+' . // 1 or several sub domains with a max of 63 chars
                 '[a-z]' . '{2,6}';              // followed by a TLD
        if (eregi($regex, $url) == true) return true;
        else return false;
    }
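Since eregi() was removed in PHP 7.0, the function above no longer runs on current PHP versions. What follows is only a sketch of how the same components could be ported to preg_match(); the name isUrlPreg() is hypothetical and not part of the original answer. It keeps the original behaviour, including the missing end anchor and the `{1,63}` quantifier, which repeats the label group rather than strictly capping a label at 63 characters.

```php
<?php
// Hypothetical preg_match() port of isURL(); same components, PCRE syntax.
function isUrlPreg(?string $url = null): bool
{
    if ($url === null) {
        return false;
    }

    $protocol = '(?:http://|https://)';                    // must include the protocol
    $allowed  = '(?:[a-z0-9](?:[-a-z0-9]*[a-z0-9]+)?)';    // label starts and ends alphanumeric

    // "~" is used as the delimiter so the slashes need no escaping;
    // the "i" flag mirrors the case-insensitivity of eregi().
    $regex = '~^' . $protocol
           . '(?:' . $allowed . '{1,63}\.)+'               // one or more sub domains
           . '[a-z]{2,6}~i';                               // followed by a TLD

    return preg_match($regex, $url) === 1;
}

var_dump(isUrlPreg('http://example.com'));   // protocol + domain + TLD
var_dump(isUrlPreg('example.com'));          // no protocol
```

As in the original, anything after the TLD is accepted unchecked, so a path or query string does not affect the result.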

User · Answer

I don't think that using regular expressions is a smart thing to do in this case. It is impossible to match all of the possibilities, and even if you did, there is still a chance that the URL simply doesn't exist.

Here is a very simple way to test if the URL actually exists and is readable:

    if (preg_match("#^https?://.+#", $link) and @fopen($link, "r")) echo "OK";

(If there were no preg_match, this would also validate all filenames on your server.)
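To make the role of the scheme check explicit, here is a minimal sketch that separates the guard from the network call. The helper names hasHttpScheme() and isReadableUrl() are made up for illustration, and the fopen() call assumes allow_url_fopen is enabled.

```php
<?php
// Hypothetical helpers; a sketch of the guard described above.

// True only when the string starts with http:// or https://.
function hasHttpScheme(string $link): bool
{
    return preg_match('#^https?://.+#', $link) === 1;
}

// Tries to open the URL for reading, but only after the scheme guard passed.
function isReadableUrl(string $link): bool
{
    if (!hasHttpScheme($link)) {
        return false; // local paths such as /etc/passwd never reach fopen()
    }

    $handle = @fopen($link, 'r'); // assumes allow_url_fopen=On
    if ($handle === false) {
        return false;
    }

    fclose($handle);
    return true;
}

var_dump(hasHttpScheme('/etc/passwd'));   // a readable local file is still rejected
```

Without the guard, fopen() would happily open any readable local path, which is the problem the closing remark above points out.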

User · Answer

The best URL Regex that worked for me:

    function valid_URL($url) {
        return preg_match('%^(?:(?:https?|ftp)://)(?:\S+(?::\S*)?@|\d{1,3}(?:\.\d{1,3}){3}|(?:(?:[a-z\d\x{00a1}-\x{ffff}]+-?)*[a-z\d\x{00a1}-\x{ffff}]+)(?:\.(?:[a-z\d\x{00a1}-\x{ffff}]+-?)*[a-z\d\x{00a1}-\x{ffff}]+)*(?:\.[a-z\x{00a1}-\x{ffff}]{2,6}))(?::\d+)?(?:[^\s]*)?$%iu', $url);
    }

Examples:

    valid_URL('https://twitter.com'); // true
    valid_URL('http://twitter.com');  // true
    valid_URL('http://twitter.co');   // true
    valid_URL('http://t.co');         // true
    valid_URL('http://twitter.c');    // false
    valid_URL('htt://twitter.com');   // false

    valid_URL('http://example.com/?a=1&b=2&c=3'); // true
    valid_URL('http://127.0.0.1');    // true
    valid_URL('');                    // false
    valid_URL(1);                     // false

Source: http://urlregex.com

User · Answer

Peter's Regex doesn't look right to me for many reasons. It allows all kinds of special characters in the domain name and doesn't test for much.

Frankie's function looks good to me, and you can build a good regex from the components if you don't want a function, like so:

    ^(http://|https://)(([a-z0-9]([-a-z0-9]*[a-z0-9]+)?){1,63}\.)+[a-z]{2,6}

Untested, but I think that should work.

Also, Owen's answer doesn't look 100% either. I took the domain part of the regex and tested it on a Regex tester tool, http://erik.eae.net/playground/regexp/regexp.html

I put the following line:

    (\S*?\.\S*?)

in the "regexp" section and the following line:

    -hello.com

under the "sample text" section.

The result allowed the minus character through, because \S means any non-space character.

Note that the regex from Frankie handles the minus because it has this part for the first character:

    [a-z0-9]

which won't allow the minus or any other special character.
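The difference can be checked directly in PHP. This is a rough sketch, not taken from any of the answers: the function names are hypothetical, both patterns are anchored with ^ and $ so the whole host is tested, and the strict variant uses only the label component of Frankie's regex without the `{1,63}` quantifier.

```php
<?php
// Hypothetical helpers comparing the two host patterns discussed above.

// Domain part of Owen's pattern: any run of non-space characters around a dot.
function matchesLooseHost(string $host): bool
{
    return preg_match('~^\S*?\.\S*?$~', $host) === 1;
}

// Label component from Frankie's pattern: each label must start with [a-z0-9].
function matchesStrictHost(string $host): bool
{
    return preg_match('~^(?:[a-z0-9](?:[-a-z0-9]*[a-z0-9]+)?\.)+[a-z]{2,6}$~i', $host) === 1;
}

foreach (['-hello.com', 'hello.com', 'sub.hello.com'] as $host) {
    printf(
        "%-14s loose=%s strict=%s\n",
        $host,
        var_export(matchesLooseHost($host), true),
        var_export(matchesStrictHost($host), true)
    );
}
```

Only the strict pattern rejects the host with the leading minus; both accept the two well-formed hosts.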

User · Answer

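As per John Gruber (Daring Fireball):

Regex:

    (?i)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))

Using it in preg_match():

    preg_match("/(?i)\b((?:https?:\/\/|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s\`!()\[\]{};:\'\".,<>?«»“”‘’]))/", $url)

Here is the extended regex pattern (with comments):

    (?xi)
    \b
    (                           # Capture 1: entire matched URL
      (?:
        https?://               # http or https protocol
        |                       #   or
        www\d{0,3}[.]           # "www.", "www1.", "www2." ... "www999."
        |                       #   or
        [a-z0-9.\-]+[.][a-z]{2,4}/  # looks like domain name followed by a slash
      )
      (?:                       # One or more:
        [^\s()<>]+              # Run of non-space, non-()<>
        |                       #   or
        \(([^\s()<>]+|(\([^\s()<>]+\)))*\)  # balanced parens, up to 2 levels
      )+
      (?:                       # End with:
        \(([^\s()<>]+|(\([^\s()<>]+\)))*\)  # balanced parens, up to 2 levels
        |                       #   or
        [^\s`!()\[\]{};:'".,<>?«»“”‘’]      # not a space or one of these punct chars
      )
    )

For more details, please look at: http://daringfireball.net/2010/07/improved_regex_for_matching_urls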
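Because this pattern is designed to find URLs inside running text rather than to validate a whole string, a natural use is extraction with preg_match_all(). The sketch below assumes a hypothetical helper extractUrls(); it uses "~" as the delimiter so the slashes need no escaping, and it deliberately leaves out the non-ASCII quote characters from the final character class to keep the example ASCII-only.

```php
<?php
// Hypothetical helper: pull URL-like substrings out of free text.
function extractUrls(string $text): array
{
    // ASCII-only variant of the pattern above (curly quotes and guillemets omitted).
    $pattern = '~(?i)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)'
             . '(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+'
             . '(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:\'".,<>?]))~';

    preg_match_all($pattern, $text, $matches);

    return $matches[1]; // capture 1 holds the entire matched URL
}

print_r(extractUrls('See http://example.com/foo_(bar). Also www.example.org, ok.'));
```

The balanced parentheses stay part of the first URL, while the sentence punctuation after each URL is left out.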

User · Answer

"/(http(s?):\/\/)([a-z0-9\-]+\.)+[a-z]{2,4}(\.[a-z]{2,4})*(\/[^ ]+)*/i"

1. (http(s?):\/\/) means http:// or https://
2. ([a-z0-9\-]+\.)+ means:
   2.0 [a-z0-9\-] is any character a-z, any digit 0-9, or the (-) sign.
   2.1 (+) means one or more of those characters, e.g. a1w, a9-, c559s, f
   2.2 \. is the (.) sign.
   2.3 The (+) after ([a-z0-9\-]+\.) means repeat 2.0 to 2.2 at least once, e.g. abc.defgh0.ig. or aa.b.ced.f.gh. (this also covers www.yyy.com).
3. [a-z]{2,4} means at least 2 but no more than 4 letters a-z. This is meant to rule out cases like https://www.google.co.kr.asdsdagfsdfsf
4. (\.[a-z]{2,4})*(\/[^ ]+)* means:
   4.1 \.[a-z]{2,4} is like number 3, but starts with the (.) sign.
   4.2 (*) means (\.[a-z]{2,4}) may be present or absent.
   4.3 \/ is the (/) sign.
   4.4 [^ ] means any character except a blank.
   4.5 (+) means repeat 4.4 at least once.
   4.6 (*) after (\/[^ ]+) means 4.3 to 4.5 may be present or absent.
   This covers cases like https://stackoverflow.com/posts/51441301/edit
5. When you use the regex, write it between "/ /", so it becomes:
   "/(http(s?):\/\/)([a-z0-9\-]+\.)+[a-z]{2,4}(\.[a-z]{2,4})*(\/[^ ]+)*/i"
6. Almost forgot: the letter i at the end means ignore case, so A is the same as a, and SoRRy is the same as sorry.
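One caveat worth illustrating: the pattern as written has no ^ and $ anchors, so preg_match() reports a match whenever a URL-like substring occurs anywhere in the input. The sketch below compares it with an anchored variant; the anchors and the function names are my additions, not part of the answer.

```php
<?php
// Pattern from the answer, unchanged: matches anywhere in the subject.
function containsUrl(string $text): bool
{
    $pattern = '/(http(s?):\/\/)([a-z0-9\-]+\.)+[a-z]{2,4}(\.[a-z]{2,4})*(\/[^ ]+)*/i';
    return preg_match($pattern, $text) === 1;
}

// Hypothetical anchored variant: the whole string must be a URL.
function isWholeUrl(string $text): bool
{
    $pattern = '/^(http(s?):\/\/)([a-z0-9\-]+\.)+[a-z]{2,4}(\.[a-z]{2,4})*(\/[^ ]+)*$/i';
    return preg_match($pattern, $text) === 1;
}

$long = 'https://www.google.co.kr.asdsdagfsdfsf';
var_dump(containsUrl($long)); // the unanchored pattern matches a prefix
var_dump(isWholeUrl($long));  // the anchored pattern rejects the over-long last label
```

With the anchors in place, the length limit from step 3 actually rules out the example given there.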

User · Answer

As per the PHP manual, parse_url() should not be used to validate a URL.

Unfortunately, it seems that filter_var('example.com', FILTER_VALIDATE_URL) does not perform any better.

Both parse_url() and filter_var() will pass malformed URLs such as http://...

Therefore, in this case, regex is the better method.
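One way to act on this observation is to treat filter_var() as a first pass only and add explicit regex checks on top of it. The following is a sketch under my own assumptions (the function name isValidHttpUrl(), the http/https whitelist, and the host pattern are all illustrative choices), and how filter_var() itself treats a given malformed URL may differ between PHP versions.

```php
<?php
// Hypothetical helper: filter_var() as a first pass, regex checks as a second.
function isValidHttpUrl(string $url): bool
{
    // First pass: reject anything PHP's own filter already refuses.
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }

    // Only accept http and https (an assumption of this sketch).
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    if (!in_array($scheme, ['http', 'https'], true)) {
        return false;
    }

    // Second pass: the host must be dot-separated labels that start and end
    // with a letter or digit, followed by an alphabetic TLD.
    $host = (string) parse_url($url, PHP_URL_HOST);

    return preg_match('~^(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z]{2,}$~i', $host) === 1;
}

var_dump(isValidHttpUrl('http://...'));   // rejected by one of the two passes
```

Note that this host pattern also rejects IP addresses and single-label hosts such as localhost, which may or may not be what you want.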

User · Answer

Here's a simple class for URL validation that uses RegEx and then cross-references the domain against popular RBL (Realtime Blackhole List) servers.

Install:

    require 'URLValidation.php';

Usage:

    require 'URLValidation.php';
    $urlVal = new UrlValidation(); // Create Object Instance

Add a URL as the parameter of the domain() method and check the return value.

    $urlArray = ['http://www.bokranzr.com/test.php?test=foo&test=dfdf', 'https://en-gb.facebook.com', 'https://www.google.com'];
    foreach ($urlArray as $k => $v) {
        echo var_dump($urlVal->domain($v)) . ' URL: ' . $v . '<br>';
    }

Output:

    bool(false) URL: http://www.bokranzr.com/test.php?test=foo&test=dfdf
    bool(true) URL: https://en-gb.facebook.com
    bool(true) URL: https://www.google.com

As you can see above, www.bokranzr.com is listed as a malicious website via an RBL, so the domain was returned as false.

User · Answer

I used this on a few projects. I don't believe I've run into issues, but I'm sure it's not exhaustive:

    $text = preg_replace(
        '#((https?|ftp)://(\S*?\.\S*?))([\s)\[\]{},;"\':<]|\.\s|$)#i',
        '<a href="$1" target="_blank">$3</a>$4',
        $text
    );

Most of the random junk at the end is there to deal with situations like http://domain.com. in a sentence (to avoid matching the trailing period). I'm sure it could be cleaned up, but since it worked, I've more or less just copied it over from project to project.
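To see what the trailing group does, here is a small sketch that wraps the same pattern in a function and runs it on a sentence ending in a URL plus period. The function name linkify() and the use of preg_replace_callback() with htmlspecialchars() are my additions for illustration; the pattern itself is unchanged.

```php
<?php
// Hypothetical wrapper around the pattern above, escaping the URL for HTML output.
function linkify(string $text): string
{
    $pattern = '#((https?|ftp)://(\S*?\.\S*?))([\s)\[\]{},;"\':<]|\.\s|$)#i';

    return preg_replace_callback(
        $pattern,
        function (array $m): string {
            $href  = htmlspecialchars($m[1], ENT_QUOTES, 'UTF-8'); // full URL
            $label = htmlspecialchars($m[3], ENT_QUOTES, 'UTF-8'); // URL without the scheme
            // $m[4] is the delimiter that ended the URL; it is put back unchanged.
            return '<a href="' . $href . '" target="_blank">' . $label . '</a>' . $m[4];
        },
        $text
    );
}

echo linkify('Visit http://domain.com. Thanks'), "\n";
// Visit <a href="http://domain.com" target="_blank">domain.com</a>. Thanks
```

The period and the space after domain.com are captured by the fourth group, so they end up outside the link.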
