[php] PHP Regex to get youtube video ID?

Can someone show me how to get the youtube id out of a url regardless of what other GET variables are in the URL.

Use this video for example: http://www.youtube.com/watch?v=C4kxS1ksqtw&feature=related
So between v= and before the next &

This question is related to php regex

The answer is


Use this code :

$url = "http://www.youtube.com/watch?v=C4kxS1ksqtw&feature=related"; 
$parse = parse_url($url, PHP_URL_QUERY); 
parse_str($parse, $output); 
echo $output['watch'];

result : C4kxS1ksqtw


This can be very easily accomplished using parse_str and parse_url and is more reliable in my opinion.

My function supports the following urls:

Also includes the test below the function.

/**
 * Get Youtube video ID from URL
 *
 * @param string $url
 * @return mixed Youtube video ID or FALSE if not found
 */
function getYoutubeIdFromUrl($url) {
    $parts = parse_url($url);
    if(isset($parts['query'])){
        parse_str($parts['query'], $qs);
        if(isset($qs['v'])){
            return $qs['v'];
        }else if(isset($qs['vi'])){
            return $qs['vi'];
        }
    }
    if(isset($parts['path'])){
        $path = explode('/', trim($parts['path'], '/'));
        return $path[count($path)-1];
    }
    return false;
}
// Test
$urls = array(
    'http://youtube.com/v/dQw4w9WgXcQ?feature=youtube_gdata_player',
    'http://youtube.com/vi/dQw4w9WgXcQ?feature=youtube_gdata_player',
    'http://youtube.com/?v=dQw4w9WgXcQ&feature=youtube_gdata_player',
    'http://www.youtube.com/watch?v=dQw4w9WgXcQ&feature=youtube_gdata_player',
    'http://youtube.com/?vi=dQw4w9WgXcQ&feature=youtube_gdata_player',
    'http://youtube.com/watch?v=dQw4w9WgXcQ&feature=youtube_gdata_player',
    'http://youtube.com/watch?vi=dQw4w9WgXcQ&feature=youtube_gdata_player',
    'http://youtu.be/dQw4w9WgXcQ?feature=youtube_gdata_player'
);
foreach($urls as $url){
    echo $url . ' : ' . getYoutubeIdFromUrl($url) . "\n";
}

and what if I want to extract a youtue url from a string full of other characters? like this:

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco https://www.youtube.com/watch?v=cPW9Y94BJI0 laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

and get https://www.youtube.com/watch?v=cPW9Y94BJI0 from that string?


(?<=\?v=)([a-zA-Z0-9_-]){11}

This should do it as well.


I know that the title of the thread refers to the use of a regex, but just as the Zawinski quote says, I really think that avoiding regexes is best here. I'd recommend this function instead:

function get_youtube_id($url)
{
    if (strpos( $url,"v=") !== false)
    {
        return substr($url, strpos($url, "v=") + 2, 11);
    }
    elseif(strpos( $url,"embed/") !== false)
    {
        return substr($url, strpos($url, "embed/") + 6, 11);
    }

}

I recommend this because the ID of YouTube videos is always the same, independent from the style of the URL, e.g.

  • http://www.youtube.com/watch?v=t_uW44Bsezg
  • http://www.youtube.com/watch?feature=endscreen&v=Id3xG4xnOfA&NR=1
  • `And Other Ulr Form In Which The Word "embed/" Is Placed Before The Id ... !!

and that might be the case for embedded and iframe-ed stuff.


preg_match("#(?<=v=)[a-zA-Z0-9-]+(?=&)|(?<=v\/)[^&\n]+|(?<=v=)[^&\n]+|(?<=youtu.be/)[^&\n]+#", $url, $matches);

This will account for

youtube.com/v/{vidid}
youtube.com/vi/{vidid}
youtube.com/?v={vidid}
youtube.com/?vi={vidid}
youtube.com/watch?v={vidid}
youtube.com/watch?vi={vidid}
youtu.be/{vidid}

I improved it slightly to support: http://www.youtube.com/v/5xADESocujo?feature=autoshare&version=3&autohide=1&autoplay=1

The line I use now is:

preg_match("#(?<=v=)[a-zA-Z0-9-]+(?=&)|(?<=v\/)[^&\n]+(?=\?)|(?<=v=)[^&\n]+|(?<=youtu.be/)[^&\n]+#", $link, $matches);

fixed based on How to validate youtube video ids?

<?php

$links = [
    "youtube.com/v/tFad5gHoBjY",
    "youtube.com/vi/tFad5gHoBjY",
    "youtube.com/?v=tFad5gHoBjY",
    "youtube.com/?vi=tFad5gHoBjY",
    "youtube.com/watch?v=tFad5gHoBjY",
    "youtube.com/watch?vi=tFad5gHoBjY",
    "youtu.be/tFad5gHoBjY",
    "http://youtu.be/qokEYBNWA_0?t=30m26s",
    "youtube.com/v/vidid",
    "youtube.com/vi/vidid",
    "youtube.com/?v=vidid",
    "youtube.com/?vi=vidid",
    "youtube.com/watch?v=vidid",
    "youtube.com/watch?vi=vidid",
    "youtu.be/vidid",
    "youtube.com/embed/vidid",
    "http://youtube.com/v/vidid",
    "http://www.youtube.com/v/vidid",
    "https://www.youtube.com/v/vidid",
    "youtube.com/watch?v=vidid&wtv=wtv",
    "http://www.youtube.com/watch?dev=inprogress&v=vidid&feature=related",
    "youtube.com/watch?v=7HCZvhRAk-M"
];

foreach($links as $link){
    preg_match("#([\/|\?|&]vi?[\/|=]|youtu\.be\/|embed\/)([a-zA-Z0-9_-]+)#", $link, $matches);
    var_dump(end($matches));
}

For extracting the id in a capturing group, the following expression or some derivative of that might be an option too:

(?im)\b(?:https?:\/\/)?(?:w{3}\.)?youtu(?:be)?\.(?:com|be)\/(?:(?:\??v=?i?=?\/?)|watch\?vi?=|watch\?.*?&v=|embed\/|)([A-Z0-9_-]{11})\S*(?=\s|$)

Demo

Test

$re = '/(?im)\b(?:https?:\/\/)?(?:w{3}\.)?youtu(?:be)?\.(?:com|be)\/(?:(?:\??v=?i?=?\/?)|watch\?vi?=|watch\?.*?&v=|embed\/|)([A-Z0-9_-]{11})\S*(?=\s|$)/';
$str = 'http://youtube.com/v/tFad5gHoBjY
https://youtube.com/vi/tFad5gHoBjY
http://www.youtube.com/?v=tFad5gHoBjY
http://www.youtube.com/?vi=tFad5gHoBjY
https://www.youtube.com/watch?v=tFad5gHoBjY
youtube.com/watch?vi=tFad5gHoBjY
youtu.be/tFad5gHoBjY
http://youtu.be/qokEYBNWA_0?t=30m26s
youtube.com/v/7HCZvhRAk-M
youtube.com/vi/7HCZvhRAk-M
youtube.com/?v=7HCZvhRAk-M
youtube.com/?vi=7HCZvhRAk-M
youtube.com/watch?v=7HCZvhRAk-M
youtube.com/watch?vi=7HCZvhRAk-M
youtu.be/7HCZvhRAk-M
youtube.com/embed/7HCZvhRAk-M
http://youtube.com/v/7HCZvhRAk-M
http://www.youtube.com/v/7HCZvhRAk-M
https://www.youtube.com/v/7HCZvhRAk-M
youtube.com/watch?v=7HCZvhRAk-M&wtv=wtv
http://www.youtube.com/watch?dev=inprogress&v=7HCZvhRAk-M&feature=related
youtube.com/watch?v=7HCZvhRAk-M
http://youtube.com/v/dQw4w9WgXcQ?feature=youtube_gdata_player
http://youtube.com/vi/dQw4w9WgXcQ?feature=youtube_gdata_player
http://youtube.com/?v=dQw4w9WgXcQ&feature=youtube_gdata_player
http://www.youtube.com/watch?v=dQw4w9WgXcQ&feature=youtube_gdata_player
http://youtube.com/?vi=dQw4w9WgXcQ&feature=youtube_gdata_player
http://youtube.com/watch?v=dQw4w9WgXcQ&feature=youtube_gdata_player
http://youtube.com/watch?vi=dQw4w9WgXcQ&feature=youtube_gdata_player
http://youtu.be/dQw4w9WgXcQ?feature=youtube_gdata_player';

preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);

var_dump($matches);

Output

array(30) {
  [0]=>
  array(2) {
    [0]=>
    string(32) "http://youtube.com/v/tFad5gHoBjY"
    [1]=>
    string(11) "tFad5gHoBjY"
  }
  [1]=>
  array(2) {
    [0]=>
    string(34) "https://youtube.com/vi/tFad5gHoBjY"
    [1]=>
    string(11) "tFad5gHoBjY"
  }
  [2]=>
  array(2) {
    [0]=>
    string(37) "http://www.youtube.com/?v=tFad5gHoBjY"
    [1]=>
    string(11) "tFad5gHoBjY"
  }
  [3]=>
  array(2) {
    [0]=>
    string(38) "http://www.youtube.com/?vi=tFad5gHoBjY"
    [1]=>
    string(11) "tFad5gHoBjY"
  }
  [4]=>
  array(2) {
    [0]=>
    string(43) "https://www.youtube.com/watch?v=tFad5gHoBjY"
    [1]=>
    string(11) "tFad5gHoBjY"
  }
  [5]=>
  array(2) {
    [0]=>
    string(32) "youtube.com/watch?vi=tFad5gHoBjY"
    [1]=>
    string(11) "tFad5gHoBjY"
  }
  [6]=>
  array(2) {
    [0]=>
    string(20) "youtu.be/tFad5gHoBjY"
    [1]=>
    string(11) "tFad5gHoBjY"
  }
  [7]=>
  array(2) {
    [0]=>
    string(27) "http://youtu.be/qokEYBNWA_0"
    [1]=>
    string(11) "qokEYBNWA_0"
  }
  [8]=>
  array(2) {
    [0]=>
    string(25) "youtube.com/v/7HCZvhRAk-M"
    [1]=>
    string(11) "7HCZvhRAk-M"
  }
  [9]=>
  array(2) {
    [0]=>
    string(26) "youtube.com/vi/7HCZvhRAk-M"
    [1]=>
    string(11) "7HCZvhRAk-M"
  }
  [10]=>
  array(2) {
    [0]=>
    string(26) "youtube.com/?v=7HCZvhRAk-M"
    [1]=>
    string(11) "7HCZvhRAk-M"
  }
  [11]=>
  array(2) {
    [0]=>
    string(27) "youtube.com/?vi=7HCZvhRAk-M"
    [1]=>
    string(11) "7HCZvhRAk-M"
  }
  [12]=>
  array(2) {
    [0]=>
    string(31) "youtube.com/watch?v=7HCZvhRAk-M"
    [1]=>
    string(11) "7HCZvhRAk-M"
  }
  [13]=>
  array(2) {
    [0]=>
    string(32) "youtube.com/watch?vi=7HCZvhRAk-M"
    [1]=>
    string(11) "7HCZvhRAk-M"
  }
  [14]=>
  array(2) {
    [0]=>
    string(20) "youtu.be/7HCZvhRAk-M"
    [1]=>
    string(11) "7HCZvhRAk-M"
  }
  [15]=>
  array(2) {
    [0]=>
    string(29) "youtube.com/embed/7HCZvhRAk-M"
    [1]=>
    string(11) "7HCZvhRAk-M"
  }
  [16]=>
  array(2) {
    [0]=>
    string(32) "http://youtube.com/v/7HCZvhRAk-M"
    [1]=>
    string(11) "7HCZvhRAk-M"
  }
  [17]=>
  array(2) {
    [0]=>
    string(36) "http://www.youtube.com/v/7HCZvhRAk-M"
    [1]=>
    string(11) "7HCZvhRAk-M"
  }
  [18]=>
  array(2) {
    [0]=>
    string(37) "https://www.youtube.com/v/7HCZvhRAk-M"
    [1]=>
    string(11) "7HCZvhRAk-M"
  }
  [19]=>
  array(2) {
    [0]=>
    string(31) "youtube.com/watch?v=7HCZvhRAk-M"
    [1]=>
    string(11) "7HCZvhRAk-M"
  }
  [20]=>
  array(2) {
    [0]=>
    string(57) "http://www.youtube.com/watch?dev=inprogress&v=7HCZvhRAk-M"
    [1]=>
    string(11) "7HCZvhRAk-M"
  }
  [21]=>
  array(2) {
    [0]=>
    string(31) "youtube.com/watch?v=7HCZvhRAk-M"
    [1]=>
    string(11) "7HCZvhRAk-M"
  }
  [22]=>
  array(2) {
    [0]=>
    string(32) "http://youtube.com/v/dQw4w9WgXcQ"
    [1]=>
    string(11) "dQw4w9WgXcQ"
  }
  [23]=>
  array(2) {
    [0]=>
    string(33) "http://youtube.com/vi/dQw4w9WgXcQ"
    [1]=>
    string(11) "dQw4w9WgXcQ"
  }
  [24]=>
  array(2) {
    [0]=>
    string(33) "http://youtube.com/?v=dQw4w9WgXcQ"
    [1]=>
    string(11) "dQw4w9WgXcQ"
  }
  [25]=>
  array(2) {
    [0]=>
    string(42) "http://www.youtube.com/watch?v=dQw4w9WgXcQ"
    [1]=>
    string(11) "dQw4w9WgXcQ"
  }
  [26]=>
  array(2) {
    [0]=>
    string(34) "http://youtube.com/?vi=dQw4w9WgXcQ"
    [1]=>
    string(11) "dQw4w9WgXcQ"
  }
  [27]=>
  array(2) {
    [0]=>
    string(38) "http://youtube.com/watch?v=dQw4w9WgXcQ"
    [1]=>
    string(11) "dQw4w9WgXcQ"
  }
  [28]=>
  array(2) {
    [0]=>
    string(39) "http://youtube.com/watch?vi=dQw4w9WgXcQ"
    [1]=>
    string(11) "dQw4w9WgXcQ"
  }
  [29]=>
  array(2) {
    [0]=>
    string(27) "http://youtu.be/dQw4w9WgXcQ"
    [1]=>
    string(11) "dQw4w9WgXcQ"
  }
}

If you wish to simplify/modify/explore the expression, it's been explained on the top right panel of regex101.com. If you'd like, you can also watch in this link, how it would match against some sample inputs.


RegEx Circuit

jex.im visualizes regular expressions:

enter image description here



The following will work for all youtube links

<?php
    // Here is a sample of the URLs this regex matches: (there can be more content after the given URL that will be ignored)
    // http://youtu.be/dQw4w9WgXcQ
    // http://www.youtube.com/embed/dQw4w9WgXcQ
    // http://www.youtube.com/watch?v=dQw4w9WgXcQ
    // http://www.youtube.com/?v=dQw4w9WgXcQ
    // http://www.youtube.com/v/dQw4w9WgXcQ
    // http://www.youtube.com/e/dQw4w9WgXcQ
    // http://www.youtube.com/user/username#p/u/11/dQw4w9WgXcQ
    // http://www.youtube.com/sandalsResorts#p/c/54B8C800269D7C1B/0/dQw4w9WgXcQ
    // http://www.youtube.com/watch?feature=player_embedded&v=dQw4w9WgXcQ
    // http://www.youtube.com/?feature=player_embedded&v=dQw4w9WgXcQ
    // It also works on the youtube-nocookie.com URL with the same above options.
    // It will also pull the ID from the URL in an embed code (both iframe and object tags)

$url = "https://www.youtube.com/watch?v=v2_MLFVdlQM";

    preg_match('%(?:youtube(?:-nocookie)?\.com/(?:[^/]+/.+/|(?:v|e(?:mbed)?)/|.*[?&]v=)|youtu\.be/)([^"&?/ ]{11})%i', $url, $match);

    $youtube_id = $match[1];

echo $youtube_id;
    ?>

I think you are trying to do this.

<?php
  $video = 'https://www.youtube.com/watch?v=u00FY9vADfQ';
  $parsed_video = parse_url($video, PHP_URL_QUERY);
  parse_str($parsed_video, $arr);
?>
<iframe
src="https://www.youtube.com/embed/<?php echo $arr['v'];  ?>"
frameborder="0">
</iframe>

We know that video ID is 11 chars length and can be preceded by v= or vi= or v/ or vi/ or youtu.be/. So the simplest way to do this:

<?php
$youtube = 'http://youtube.com/v/dQw4w9WgXcQ?feature=youtube_gdata_player
http://youtube.com/vi/dQw4w9WgXcQ?feature=youtube_gdata_player
http://youtube.com/?v=dQw4w9WgXcQ&feature=youtube_gdata_player
http://www.youtube.com/watch?v=dQw4w9WgXcQ&feature=youtube_gdata_player
http://youtube.com/?vi=dQw4w9WgXcQ&feature=youtube_gdata_player
http://youtube.com/watch?v=dQw4w9WgXcQ&feature=youtube_gdata_player
http://youtube.com/watch?vi=dQw4w9WgXcQ&feature=youtube_gdata_player
http://youtu.be/dQw4w9WgXcQ?feature=youtube_gdata_player';

preg_match_all("#(?<=v=|v\/|vi=|vi\/|youtu.be\/)[a-zA-Z0-9_-]{11}#", $youtube, $matches);

var_dump($matches[0]);

And output:

array(8) {
  [0]=>
  string(11) "dQw4w9WgXcQ"
  [1]=>
  string(11) "dQw4w9WgXcQ"
  [2]=>
  string(11) "dQw4w9WgXcQ"
  [3]=>
  string(11) "dQw4w9WgXcQ"
  [4]=>
  string(11) "dQw4w9WgXcQ"
  [5]=>
  string(11) "dQw4w9WgXcQ"
  [6]=>
  string(11) "dQw4w9WgXcQ"
  [7]=>
  string(11) "dQw4w9WgXcQ"
}

Just found this online at http://snipplr.com/view/62238/get-youtube-video-id-very-robust/

function getYouTubeId($url) {
// Format all domains to http://domain for easier URL parsing
str_replace('https://', 'http://', $url);
if (!stristr($url, 'http://') && (strlen($url) != 11)) {
    $url = 'http://' . $url;
}
$url = str_replace('http://www.', 'http://', $url);

if (strlen($url) == 11) {
    $code = $url;
} else if (preg_match('/http:\/\/youtu.be/', $url)) {
    $url = parse_url($url, PHP_URL_PATH);
    $code = substr($url, 1, 11);
} else if (preg_match('/watch/', $url)) {
    $arr = parse_url($url);
    parse_str($url);
    $code = isset($v) ? substr($v, 0, 11) : false;
} else if (preg_match('/http:\/\/youtube.com\/v/', $url)) {
    $url = parse_url($url, PHP_URL_PATH);
    $code = substr($url, 3, 11);
} else if (preg_match('/http:\/\/youtube.com\/embed/', $url, $matches)) {
    $url = parse_url($url, PHP_URL_PATH);
    $code = substr($url, 7, 11);
} else if (preg_match("#(?<=v=)[a-zA-Z0-9-]+(?=&)|(?<=[0-9]/)[^&\n]+|(?<=v=)[^&\n]+#", $url, $matches) ) {
    $code = substr($matches[0], 0, 11);
} else {
    $code = false;
}

if ($code && (strlen($code) < 11)) {
    $code = false;
}

return $code;
}

I used the data from Shawn's answer but generalized and shortened the regex a little. The key difference with this one is that it won't test for a valid Youtube URL, it'll just look for a video ID. Meaning it will still return a video ID for www.facebook.com?wtv=youtube.com/v/vidid. Works for all the test cases, but is a bit more lax. Consequently, it will output a false positive for something like https://www.twitter.com/watch?v=vidid. Use this method if the data is super inconsistent, otherwise use a more specific regex or parse_url() and parse_str().

preg_match("/([\?&\/]vi?|embed|\.be)[\/=]([\w-]+)/",$url,$matches);
print($matches[2]);

I had some post content I had to cipher throughout to get the Youtube ID out of. It happened to be in the form of the <iframe> embed code Youtube provides.

 <iframe src="http://www.youtube.com/embed/Zpk8pMz_Kgw?rel=0" frameborder="0" width="620" height="360"></iframe>

The following pattern I got from @rob above. The snippet does a foreach loop once the matches are found, and for a added bonus I linked it to the preview image found on Youtube. It could potentially match more types of Youtube embed types and urls:

$pattern = '#(?<=(?:v|i)=)[a-zA-Z0-9-]+(?=&)|(?<=(?:v|i)\/)[^&\n]+|(?<=embed\/)[^"&\n]+|(?<=??(?:v|i)=)[^&\n]+|(?<=youtu.be\/)[^&\n]+#';

preg_match_all($pattern, $post_content, $matches);

foreach ($matches as $match) {
    $img = "<img src='http://img.youtube.com/vi/".str_replace('?rel=0','', $match[0])."/0.jpg' />";
    break;
}

Rob's profile: https://stackoverflow.com/users/149615/rob


if (preg_match('![?&]{1}v=([^&]+)!', $url . '&', $m))
    $video_id = $m[1];

Based on bokor's comment on Anthony's answer:

preg_match("/^(?:http(?:s)?:\/\/)?(?:www\.)?(?:m\.)?(?:youtu\.be\/|youtube\.com\/(?:(?:watch)?\?(?:.*&)?v(?:i)?=|(?:embed|v|vi|user)\/))([^\?&\"'>]+)/", $url, $matches);

$matches[1] contains the vidid

Matches:

Does not match:

  • www.facebook.com?wtv=youtube.com/v/vidid

SOLTUION For any link type!!:

<?php
function get_youtube_id_from_url($url)  {
     preg_match('/(http(s|):|)\/\/(www\.|)yout(.*?)\/(embed\/|watch.*?v=|)([a-z_A-Z0-9\-]{11})/i', $url, $results);    return $results[6];
}


echo get_youtube_id_from_url('http://www.youtube.com/watch?var1=blabla#v=GvJehZx3eQ1$var2=bla');
      // or                   http://youtu.be/GvJehZx3eQ1 
      // or                   http://www.youtube.com/embed/GvJehZx3eQ1
      // or                   http://www.youtu.be/GvJehZx3eQ1/blabla?xyz
?>

outputs: GvJehZx3eQ1


$vid = preg_replace('/^.*(\?|\&)v\=/', '', $url);  // Strip all meuk before and including '?v=' or '&v='.

$vid = preg_replace('/[^\w\-\_].*$/', '', $vid);  // Strip trailing meuk.