[php] PHP/regex: How to get the string value of HTML tag?

I need help on regex or preg_match because I am not that experienced yet with regards to those so here is my problem.

I need to get the value "get me" but I think my function has an error. The number of html tags are dynamic. It can contain many nested html tag like a bold tag. Also, the "get me" value is dynamic.

<?php
function getTextBetweenTags($string, $tagname) {
    $pattern = "/<$tagname>(.*?)<\/$tagname>/";
    preg_match($pattern, $string, $matches);
    return $matches[1];
}

$str = '<textformat leading="2"><p align="left"><font size="10">get me</font></p></textformat>';
$txt = getTextBetweenTags($str, "font");
echo $txt;
?>

This question is related to php html regex

The answer is


$userinput = "http://www.example.vn/";
//$url = urlencode($userinput);
$input = @file_get_contents($userinput) or die("Could not access file: $userinput");
$regexp = "<tagname\s[^>]*>(.*)<\/tagname>";
//==Example:
//$regexp = "<div\s[^>]*>(.*)<\/div>";

if(preg_match_all("/$regexp/siU", $input, $matches, PREG_SET_ORDER)) {
    foreach($matches as $match) {
        // $match[2] = link address 
        // $match[3] = link text
    }
}

Since attribute values may contain a plain > character, try this regular expression:

$pattern = '/<'.preg_quote($tagname, '/').'(?:[^"'>]*|"[^"]*"|\'[^\']*\')*>(.*?)<\/'.preg_quote($tagname, '/').'>/s';

But regular expressions are not suitable for parsing non-regular languages like HTML. You should better use a parser like SimpleXML or DOMDocument.


The following php snippets would return the text between html tags/elements.

regex : "/tagname(.*)endtag/" will return text between tags.

i.e.

$regex="/[start_tag_name](.*)[/end_tag_name]/";
$content="[start_tag_name]SOME TEXT[/end_tag_name]";
preg_replace($regex,$content); 

It will return "SOME TEXT".


Try this

$str = '<option value="123">abc</option>
        <option value="123">aabbcc</option>';

preg_match_all("#<option.*?>([^<]+)</option>#", $str, $foo);

print_r($foo[1]);

try $pattern = "<($tagname)\b.*?>(.*?)</\1>" and return $matches[2]


In your pattern, you simply want to match all text between the two tags. Thus, you could use for example a [\w\W] to match all characters.

function getTextBetweenTags($string, $tagname) {
    $pattern = "/<$tagname>([\w\W]*?)<\/$tagname>/";
    preg_match($pattern, $string, $matches);
    return $matches[1];
}

this might be old but my answer might help someone

You can simply use

$str = '<textformat leading="2"><p align="left"><font size="10">get me</font></p></textformat>';
echo strip_tags($str);

https://www.php.net/manual/en/function.strip-tags.php


Your HTML

$html='<ul id="main">
    <li>
        <h1><a href="[link]">My Title</a></h1>
        <span class="date">Date</span>
        <div class="section">
            [content]
        </div>
    </li>
</ul>';

//function call you can change the tag name

echo contentBetweenTags($html,"span");

// this function will help you to fetch the data from a specific tag

function contentBetweenTags($content, $tagname){
    $pattern = "#<\s*?$tagname\b[^>]*>(.*?)</$tagname\b[^>]*>#s";
    preg_match($pattern, $content, $matches);
    
    if(empty($matches))
        return;
    
    $str = "<$tagname>".html_entity_decode($matches[1])."</$tagname>";
    return $str;
}

Examples related to php

I am receiving warning in Facebook Application using PHP SDK Pass PDO prepared statement to variables Parse error: syntax error, unexpected [ Preg_match backtrack error Removing "http://" from a string How do I hide the PHP explode delimiter from submitted form results? Problems with installation of Google App Engine SDK for php in OS X Laravel 4 with Sentry 2 add user to a group on Registration php & mysql query not echoing in html with tags? How do I show a message in the foreach loop?

Examples related to html

Embed ruby within URL : Middleman Blog Please help me convert this script to a simple image slider Generating a list of pages (not posts) without the index file Why there is this "clear" class before footer? Is it possible to change the content HTML5 alert messages? Getting all files in directory with ajax DevTools failed to load SourceMap: Could not load content for chrome-extension How to set width of mat-table column in angular? How to open a link in new tab using angular? ERROR Error: Uncaught (in promise), Cannot match any routes. URL Segment

Examples related to regex

Why my regexp for hyphenated words doesn't work? grep's at sign caught as whitespace Preg_match backtrack error regex match any single character (one character only) re.sub erroring with "Expected string or bytes-like object" Only numbers. Input number in React Visual Studio Code Search and Replace with Regular Expressions Strip / trim all strings of a dataframe return string with first match Regex How to capture multiple repeated groups?