[java] Find everything between two XML tags with RegEx

In RegEx, I want to find the tag and everything between two XML tags, like the following:

<primaryAddress>
    <addressLine>280 Flinders Mall</addressLine>
    <geoCodeGranularity>PROPERTY</geoCodeGranularity>
    <latitude>-19.261365</latitude>
    <longitude>146.815585</longitude>
    <postcode>4810</postcode>
    <state>QLD</state>
    <suburb>Townsville</suburb>
    <type>PHYSICAL</type>
</primaryAddress>

I want to find the tag and everything between primaryAddress, and erase that.

Everything between the primaryAddress tag is a variable, but I want to remove the entire tag and sub-tags whenever I get primaryAddress.

Anyone have any idea how to do that?

This question is related to java php regex xml perl

The answer is


It is not good to use this method but if you really want to split it with regex

<primaryAddress.*>((.|\n)*?)<\/primaryAddress>

the verified answer returns the tags but this just return the value between tags.


You should be able to match it with: /<primaryAddress>(.+?)<\/primaryAddress>/

The content between the tags will be in the matched group.


In our case, we receive an XML as a String and need to get rid of the values that have some "special" characters, like &<> etc. Basically someone can provide an XML to us in this form:

<notes>
  <note>
     <to>jenice & carl </to>
     <from>your neighbor <; </from>
  </note>
</notes>

So I need to find in that String the values jenice & carl and your neighbor <; and properly escape & and < (otherwise this is an invalid xml if you later pass it to an engine that shall rename unnamed).

Doing this with regex is a rather dumb idea to begin with, but it's cheap and easy. So the brave ones that would like to do the same thing I did, here you go:

    String xml = ...
    Pattern p = Pattern.compile("<(.+)>(?!\\R<)(.+)</(\\1)>");
    Matcher m = p.matcher(xml);
    String result = m.replaceAll(mr -> {
        if (mr.group(2).contains("&")) {
            return "<" + m.group(1) + ">" + m.group(2) + "+ some change" + "</" + m.group(3) + ">";
        }
        return "<" + m.group(1) + ">" + mr.group(2) + "</" + m.group(3) + ">";
    });

this can capture most outermost layer pair of tags, even with attribute in side or without end tags

(<!--((?!-->).)*-->|<\w*((?!\/<).)*\/>|<(?<tag>\w+)[^>]*>(?>[^<]|(?R))*<\/\k<tag>\s*>)

edit: as mentioned in comment above, regex is always not enough to parse xml, trying to modify the regex to fit more situation only makes it longer but still useless


Examples related to java

Under what circumstances can I call findViewById with an Options Menu / Action Bar item? How much should a function trust another function How to implement a simple scenario the OO way Two constructors How do I get some variable from another class in Java? this in equals method How to split a string in two and store it in a field How to do perspective fixing? String index out of range: 4 My eclipse won't open, i download the bundle pack it keeps saying error log

Examples related to php

I am receiving warning in Facebook Application using PHP SDK Pass PDO prepared statement to variables Parse error: syntax error, unexpected [ Preg_match backtrack error Removing "http://" from a string How do I hide the PHP explode delimiter from submitted form results? Problems with installation of Google App Engine SDK for php in OS X Laravel 4 with Sentry 2 add user to a group on Registration php & mysql query not echoing in html with tags? How do I show a message in the foreach loop?

Examples related to regex

Why my regexp for hyphenated words doesn't work? grep's at sign caught as whitespace Preg_match backtrack error regex match any single character (one character only) re.sub erroring with "Expected string or bytes-like object" Only numbers. Input number in React Visual Studio Code Search and Replace with Regular Expressions Strip / trim all strings of a dataframe return string with first match Regex How to capture multiple repeated groups?

Examples related to xml

strange error in my Animation Drawable How do I POST XML data to a webservice with Postman? PHP XML Extension: Not installed How to add a Hint in spinner in XML Generating Request/Response XML from a WSDL Manifest Merger failed with multiple errors in Android Studio How to set menu to Toolbar in Android How to add colored border on cardview? Android: ScrollView vs NestedScrollView WARNING: Exception encountered during context initialization - cancelling refresh attempt

Examples related to perl

The program can't start because api-ms-win-crt-runtime-l1-1-0.dll is missing while starting Apache server on my computer "End of script output before headers" error in Apache Perl - Multiple condition if statement without duplicating code? How to decrypt hash stored by bcrypt Split a string into array in Perl Turning multiple lines into one comma separated line String compare in Perl with "eq" vs "==" how to remove the first two columns in a file using shell (awk, sed, whatever) Find everything between two XML tags with RegEx Difference between \w and \b regular expression meta characters