Match the path of a URL, minus the filename extension

Question

What would be the best regular expression for this scenario?

Given this URL:

    http://php.net/manual/en/function.preg-match.php

How should I go about selecting everything between (but not including) http://php.net and .php:

    /manual/en/function.preg-match

This is for an Nginx configuration file.

User · Accepted Answer

Like this:

    if (preg_match('/(?<=net).*(?=\.php)/', $subject, $regs)) {
        $result = $regs[0];
    }

Explanation:

    (?<=      # Assert that the regex below can be matched, with the match ending at this position (positive lookbehind)
      net     # Match the characters "net" literally
    )
    .         # Match any single character that is not a line break character
      *       # Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
    (?=       # Assert that the regex below can be matched, starting at this position (positive lookahead)
      \.      # Match the character "." literally
      php     # Match the characters "php" literally
    )

User · Answer

Simple:

    $url = "http://php.net/manual/en/function.preg-match.php";
    preg_match("/http:\/\/php\.net(.+)\.php/", $url, $matches);
    echo $matches[1];

$matches[0] is your full URL, $matches[1] is the part you want.

See for yourself: http://codepad.viper-7.com/hHmwI2

User · Answer

There's no need to use a regular expression to dissect a URL. PHP has built-in functions for this: pathinfo() and parse_url().

User · Answer

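Regular expression for matching everything after "net" and before ".php":

    $pattern = "/net([a-zA-Z0-9_\/.-]*)\.php/";

In the above regular expression, the group of characters enclosed by "()" is what you are looking for. Note that the character class has to include "/", "." and "-" for the example URL to match, and that preg_match() needs the pattern wrapped in delimiters.

Hope it's useful.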

User · Answer

Just for the fun of it, here are two ways that have not been explored:

    substr($url, strpos($url, '/', 8), -4)

Or:

    substr($url, strpos($url, '/', 8), -strlen($url) + strrpos($url, '.'))

Both are based on the idea that the HTTP schemes http:// and https:// are at most 8 characters long, so typically it suffices to find the first slash from the 9th position onwards. If the extension is always .php the first snippet will work; otherwise the second one is required.

For a pure regular expression solution you can break the string down like this:

    ~^(?:[^:/?#]+:)?(?://[^/?#]*)?([^?#]*)~
                                  ^

The path portion would be inside the first memory group (i.e. index 1), indicated by the ^ in the line underneath the expression. Removing the extension can be done using pathinfo():

    $parts = pathinfo($matches[1]);
    echo $parts['dirname'] . '/' . $parts['filename'];

You can also tweak the expression so that it leaves out the extension itself. Such an expression is not very optimal though, because it involves some backtracking. In the end I would go for something less custom:

    $parts = pathinfo(parse_url($url, PHP_URL_PATH));
    echo $parts['dirname'] . '/' . $parts['filename'];
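The offset-based idea above can be sketched as a small helper. This is only a sketch: the function name pathWithoutExtension is made up for illustration, it assumes the URL starts with http:// or https:// and contains a path, and it treats the last dot in the string as the start of the extension. There is no error handling for URLs that break those assumptions.

```php
<?php
// Hypothetical helper (not part of the answer): returns the URL path
// without its filename extension, using only string offsets.
function pathWithoutExtension(string $url): string
{
    // "http://" and "https://" are at most 8 characters long, so the
    // first slash found from offset 8 onwards is where the path starts.
    $start = strpos($url, '/', 8);

    // A negative length tells substr() to stop that many characters
    // before the end; here that is everything from the last dot on.
    $cut = strrpos($url, '.') - strlen($url);

    return substr($url, $start, $cut);
}

echo pathWithoutExtension('http://php.net/manual/en/function.preg-match.php'), "\n";
// prints /manual/en/function.preg-match
```

Because the cut point is computed from the last dot, the same helper also works for extensions other than .php.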

User · Answer

(?<=\w)/.+(?=\.\w+$)

Select everything from the first literal / preceded by (lookbehind) a word (\w) character, until followed by (lookahead) a literal . and one or more word (\w) characters before the end ($).

    re> ~(?<=\w)/.+(?=\.\w+$)~
    Compile time 0.0011 milliseconds
    Memory allocation (code space): 32
    Study time 0.0002 milliseconds
    Capturing subpattern count = 0
    No options
    First char = '/'
    No need char
    Max lookbehind = 1
    Subject length lower bound = 2
    No set of starting bytes
    data> http://php.net/manual/en/function.preg-match.php
    Execute time 0.0007 milliseconds
     0: /manual/en/function.preg-match

//[^/]+(.+)\.\w+$

Find two literal / characters followed by anything but a literal /, then select everything until a literal . that is followed only by word (\w) characters before the end ($).

    re> ~//[^/]+(.+)\.\w+$~
    Compile time 0.0010 milliseconds
    Memory allocation (code space): 28
    Study time 0.0002 milliseconds
    Capturing subpattern count = 1
    No options
    First char = '/'
    Need char = '.'
    Subject length lower bound = 4
    No set of starting bytes
    data> http://php.net/manual/en/function.preg-match.php
    Execute time 0.0005 milliseconds
     0: //php.net/manual/en/function.preg-match.php
     1: /manual/en/function.preg-match

/[^/]+(.+)\.

Find a literal / followed by one or more characters that are not a literal /, then greedily select everything before the last literal . character.

    re> ~/[^/]+(.+)\.~
    Compile time 0.0008 milliseconds
    Memory allocation (code space): 23
    Study time 0.0002 milliseconds
    Capturing subpattern count = 1
    No options
    First char = '/'
    Need char = '.'
    Subject length lower bound = 3
    No set of starting bytes
    data> http://php.net/manual/en/function.preg-match.php
    Execute time 0.0005 milliseconds
     0: /php.net/manual/en/function.preg-match.
     1: /manual/en/function.preg-match

/[^/]+\K.+(?=\.)

Find a literal / followed by one or more characters that are not a literal /, reset the start of the match (\K), then greedily select everything before (lookahead) the last literal . character.

    re> ~/[^/]+\K.+(?=\.)~
    Compile time 0.0009 milliseconds
    Memory allocation (code space): 22
    Study time 0.0002 milliseconds
    Capturing subpattern count = 0
    No options
    First char = '/'
    No need char
    Subject length lower bound = 2
    No set of starting bytes
    data> http://php.net/manual/en/function.preg-match.php
    Execute time 0.0005 milliseconds
     0: /manual/en/function.preg-match

\w+\K/.+(?=\.)

Find one or more word (\w) characters before a literal /, reset the start of the match (\K), then select the literal / followed by anything before (lookahead) the last literal . character.

    re> ~\w+\K/.+(?=\.)~
    Compile time 0.0009 milliseconds
    Memory allocation (code space): 22
    Study time 0.0003 milliseconds
    Capturing subpattern count = 0
    No options
    No first char
    Need char = '/'
    Subject length lower bound = 2
    Starting byte set: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P
      Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z
    data> http://php.net/manual/en/function.preg-match.php
    Execute time 0.0011 milliseconds
     0: /manual/en/function.preg-match

User · Answer

A regular expression might not be the most effective tool for this job.

Try using parse_url(), combined with pathinfo():

    $url      = 'http://php.net/manual/en/function.preg-match.php';
    $path     = parse_url($url, PHP_URL_PATH);
    $pathinfo = pathinfo($path);

    echo $pathinfo['dirname'], '/', $pathinfo['filename'];

The above code outputs:

    /manual/en/function.preg-match

User · Answer

(?:https?:\/{2}[^\/]+)(.+)\.\w+

Let's see what it does:

    (?:https?:\/{2}[^\/]+)  - non-capturing group for http://php.net
    (.+)                    - captures the part up to the last dot: /manual/en/function.preg-match
    \.\w+                   - matches the file extension, like this: .php

User · Answer

Try this:

    preg_match("/net(.*)\.php$/", "http://php.net/manual/en/function.preg-match.php", $matches);
    echo $matches[1];
    // prints /manual/en/function.preg-match

User · Answer

Here's a regex solution better than what most have provided so far, if you ask me: http://regex101.com/r/nQ8rH5

    /http:\/\/[^\/]+\K.*(?=\.[^.]+$)/i
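A quick way to try a \K-based pattern of this kind in PHP is sketched below. The `#` delimiter and the optional `s` in `https?` are choices made here for illustration, not part of the answer. Since \K discards everything matched before it, the whole match in $matches[0] is already the wanted path and no capture group is needed.

```php
<?php
$url = 'http://php.net/manual/en/function.preg-match.php';

// https?://[^/]+  scheme and host: matched, then dropped from the result by \K
// .*              the rest of the URL, matched greedily
// (?=\.[^.]+$)    lookahead: stop right before the final ".extension"
$pattern = '#https?://[^/]+\K.*(?=\.[^.]+$)#i';

if (preg_match($pattern, $url, $matches)) {
    echo $matches[0], "\n"; // prints /manual/en/function.preg-match
}
```

The lookahead keeps the extension out of the match without consuming it, which is what lets the greedy `.*` back off only as far as the last dot.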

User · Answer

This general URL match allows you to select parts of a URL:

    if (preg_match('%\b(?P<protocol>https?|ftp)://(?P<domain>[-A-Z0-9.]+)(?P<file>/[-A-Z0-9+&@#/\%=~_|!:,.;]*)?(?P<parameters>\?[-A-Z0-9+&@#/\%=~_|!:,.;]*)?%i', $subject, $regs)) {
        $result = $regs['file'];
        // or you can append $regs['parameters'] too
    } else {
        $result = "";
    }

Note that $regs['file'] still includes the filename extension.
