[regex] How to use a variable in the replacement side of the Perl substitution operator?

I would like to do the following:

$find="start (.*) end";
$replace="foo \1 bar";

$var = "start middle end";
$var =~ s/$find/$replace/;

I would expect $var to contain "foo middle bar", but it does not work. Neither does:

$replace='foo \1 bar';

Somehow I am missing something regarding the escaping.


I fixed the missing 's'

This question is related to regex perl substitution

The answer is


On the replacement side, you must use $1, not \1.

And you can only do what you want by making replace an evalable expression that gives the result you want and telling s/// to eval it with the /ee modifier like so:

$find="start (.*) end";
$replace='"foo $1 bar"';

$var = "start middle end";
$var =~ s/$find/$replace/ee;

print "var: $var\n";

To see why the "" and double /e are needed, see the effect of the double eval here:

$ perl
$foo = "middle";
$replace='"foo $foo bar"';
print eval('$replace'), "\n";
print eval(eval('$replace')), "\n";
__END__
"foo $foo bar"
foo middle bar

(Though as ikegami notes, a single /e or the first /e of a double e isn't really an eval(); rather, it tells the compiler that the substitution is code to compile, not a string. Nonetheless, eval(eval(...)) still demonstrates why you need to do what you need to do to get /ee to work as desired.)


#!/usr/bin/perl

$sub = "\\1";
$str = "hi1234";
$res = $str;
$match = "hi(.*)";
$res =~ s/$match/$1/g;

print $res

This got me the '1234'.


I did not manage to make the most popular answers work.

  • The ee method complained when my replacement string contained several consecutive backreferences.
  • Kent Fredric's answer only replaced the first match, and I need my search and replace to be global. I did not figure out a way to make it replace all matches that didn't cause other issues. For example, I tried running the method recursively until it no longer caused the string to change, but that causes an infinite loop if the replacement string contains the search string, whereas a regular global replacement does not do that.

I attempted to come up with a solution of my own using plain old eval:

eval '$var =~ s/' . $find . '/' . $replace . '/gsu;';

Of course, this allows for code injection. But as far as I know, the only way to escape the regex query and inject code is to insert two forward slashes in $find or one in $replace, followed by a semi-colon, after which you can add add code. For example, if I set the variables this way:

my $find = 'foo';
my $replace = 'bar/; print "You\'ve just been hacked!\n"; #';

The evaluated code is this:

$var =~ s/foo/bar/; print "You've just been hacked!\n"; #/gsu;';

So what I do is make sure the strings don't contain any unescaped forward slashes.

First, I copy the strings into dummy strings.

my $findTest = $find;
my $replaceTest = $replace;

Then, I remove all escaped backslashes (backslash pairs) from the dummy strings. This allows me to find forward slashes that are not escaped, without falling into the trap of considering a forward slash escaped if it's preceded by an escaped backslash. For example: \/ contains an escaped forward slash, but \\/ contains a literal forward slash, because the backslash is escaped.

$findTest =~ s/\\\\//gmu;
$replaceTest =~ s/\\\\//gmu;

Now if any forward slash that is not preceded by a backslash remains in the strings, I throw a fatal error, as that would allow the user to insert arbitrary code.

if ($findTest =~ /(?<!\\)\// || $replaceTest =~ /(?<!\\)\//)
{
  print "String must not contain unescaped slashes.\n";
  exit 1;
}

Then I eval.

eval '$var =~ s/' . $find . '/' . $replace . '/gsu;';

I'm not an expert at preventing code injection, but I'm the only one using my script, so I'm content using this solution without fully knowing if it's vulnerable. But as far as I know, it may be, so if anyone knows if there is or isn't any way to inject code into this, please provide your insight in a comment.


See THIS previous SO post on using a variable on the replacement side of s///in Perl. Look both at the accepted answer and the rebuttal answer.

What you are trying to do is possible with the s///ee form that performs a double eval on the right hand string. See perlop quote like operators for more examples.

Be warned that there are security impilcations of evaland this will not work in taint mode.


# perl -de 0
$match="hi(.*)"
$sub='$1'
$res="hi1234"
$res =~ s/$match/$sub/gee
p $res
  1234

Be careful, though. This causes two layers of eval to occur, one for each e at the end of the regex:

  1. $sub --> $1
  2. $1 --> final value, in the example, 1234

I would suggest something like:

$text =~ m{(.*)$find(.*)};
$text = $1 . $replace . $2;

It is quite readable and seems to be safe. If multiple replace is needed, it is easy:

while ($text =~ m{(.*)$find(.*)}){
     $text = $1 . $replace . $2;
}

As others have suggested, you could use the following:

my $find = 'start (.*) end';
my $replace = 'foo $1 bar';   # 'foo \1 bar' is an error.
my $var = "start middle end";
$var =~ s/$find/$replace/ee;

The above is short for the following:

my $find = 'start (.*) end';
my $replace = 'foo $1 bar';
my $var = "start middle end";
$var =~ s/$find/ eval($replace) /e;

I prefer the second to the first since it doesn't hide the fact that eval(EXPR) is used. However, both of the above silence errors, so the following would be better:

my $find = 'start (.*) end';
my $replace = 'foo $1 bar';
my $var = "start middle end";
$var =~ s/$find/ my $r = eval($replace); die $@ if $@; $r /e;

But as you can see, all of the above allow for the execution of arbitrary Perl code. The following would be far safer:

use String::Substitution qw( sub_modify );

my $find = 'start (.*) end';
my $replace = 'foo $1 bar';
my $var = "start middle end";
sub_modify($var, $find, $replace);

I'm not certain on what it is you're trying to achieve. But maybe you can use this:

$var =~ s/^start/foo/;
$var =~ s/end$/bar/;

I.e. just leave the middle alone and replace the start and end.


See THIS previous SO post on using a variable on the replacement side of s///in Perl. Look both at the accepted answer and the rebuttal answer.

What you are trying to do is possible with the s///ee form that performs a double eval on the right hand string. See perlop quote like operators for more examples.

Be warned that there are security impilcations of evaland this will not work in taint mode.


# perl -de 0
$match="hi(.*)"
$sub='$1'
$res="hi1234"
$res =~ s/$match/$sub/gee
p $res
  1234

Be careful, though. This causes two layers of eval to occur, one for each e at the end of the regex:

  1. $sub --> $1
  2. $1 --> final value, in the example, 1234

Deparse tells us this is what is being executed:

$find = 'start (.*) end';
$replace = "foo \cA bar";
$var = 'start middle end';
$var =~ s/$find/$replace/;

However,

 /$find/foo \1 bar/

Is interpreted as :

$var =~ s/$find/foo $1 bar/;

Unfortunately it appears there is no easy way to do this.

You can do it with a string eval, but thats dangerous.

The most sane solution that works for me was this:

$find = "start (.*) end"; 
$replace = 'foo \1 bar';

$var = "start middle end"; 

sub repl { 
    my $find = shift; 
    my $replace = shift; 
    my $var = shift;

    # Capture first 
    my @items = ( $var =~ $find ); 
    $var =~ s/$find/$replace/; 
    for( reverse 0 .. $#items ){ 
        my $n = $_ + 1; 
        #  Many More Rules can go here, ie: \g matchers  and \{ } 
        $var =~ s/\\$n/${items[$_]}/g ;
        $var =~ s/\$$n/${items[$_]}/g ;
    }
    return $var; 
}

print repl $find, $replace, $var; 

A rebuttal against the ee technique:

As I said in my answer, I avoid evals for a reason.

$find="start (.*) end";
$replace='do{ print "I am a dirty little hacker" while 1; "foo $1 bar" }';

$var = "start middle end";
$var =~ s/$find/$replace/ee;

print "var: $var\n";

this code does exactly what you think it does.

If your substitution string is in a web application, you just opened the door to arbitrary code execution.

Good Job.

Also, it WON'T work with taints turned on for this very reason.

$find="start (.*) end";
$replace='"' . $ARGV[0] . '"';

$var = "start middle end";
$var =~ s/$find/$replace/ee;

print "var: $var\n"


$ perl /tmp/re.pl  'foo $1 bar'
var: foo middle bar
$ perl -T /tmp/re.pl 'foo $1 bar' 
Insecure dependency in eval while running with -T switch at /tmp/re.pl line 10.

However, the more careful technique is sane, safe, secure, and doesn't fail taint. ( Be assured tho, the string it emits is still tainted, so you don't lose any security. )


I did not manage to make the most popular answers work.

  • The ee method complained when my replacement string contained several consecutive backreferences.
  • Kent Fredric's answer only replaced the first match, and I need my search and replace to be global. I did not figure out a way to make it replace all matches that didn't cause other issues. For example, I tried running the method recursively until it no longer caused the string to change, but that causes an infinite loop if the replacement string contains the search string, whereas a regular global replacement does not do that.

I attempted to come up with a solution of my own using plain old eval:

eval '$var =~ s/' . $find . '/' . $replace . '/gsu;';

Of course, this allows for code injection. But as far as I know, the only way to escape the regex query and inject code is to insert two forward slashes in $find or one in $replace, followed by a semi-colon, after which you can add add code. For example, if I set the variables this way:

my $find = 'foo';
my $replace = 'bar/; print "You\'ve just been hacked!\n"; #';

The evaluated code is this:

$var =~ s/foo/bar/; print "You've just been hacked!\n"; #/gsu;';

So what I do is make sure the strings don't contain any unescaped forward slashes.

First, I copy the strings into dummy strings.

my $findTest = $find;
my $replaceTest = $replace;

Then, I remove all escaped backslashes (backslash pairs) from the dummy strings. This allows me to find forward slashes that are not escaped, without falling into the trap of considering a forward slash escaped if it's preceded by an escaped backslash. For example: \/ contains an escaped forward slash, but \\/ contains a literal forward slash, because the backslash is escaped.

$findTest =~ s/\\\\//gmu;
$replaceTest =~ s/\\\\//gmu;

Now if any forward slash that is not preceded by a backslash remains in the strings, I throw a fatal error, as that would allow the user to insert arbitrary code.

if ($findTest =~ /(?<!\\)\// || $replaceTest =~ /(?<!\\)\//)
{
  print "String must not contain unescaped slashes.\n";
  exit 1;
}

Then I eval.

eval '$var =~ s/' . $find . '/' . $replace . '/gsu;';

I'm not an expert at preventing code injection, but I'm the only one using my script, so I'm content using this solution without fully knowing if it's vulnerable. But as far as I know, it may be, so if anyone knows if there is or isn't any way to inject code into this, please provide your insight in a comment.


I'm not certain on what it is you're trying to achieve. But maybe you can use this:

$var =~ s/^start/foo/;
$var =~ s/end$/bar/;

I.e. just leave the middle alone and replace the start and end.


Deparse tells us this is what is being executed:

$find = 'start (.*) end';
$replace = "foo \cA bar";
$var = 'start middle end';
$var =~ s/$find/$replace/;

However,

 /$find/foo \1 bar/

Is interpreted as :

$var =~ s/$find/foo $1 bar/;

Unfortunately it appears there is no easy way to do this.

You can do it with a string eval, but thats dangerous.

The most sane solution that works for me was this:

$find = "start (.*) end"; 
$replace = 'foo \1 bar';

$var = "start middle end"; 

sub repl { 
    my $find = shift; 
    my $replace = shift; 
    my $var = shift;

    # Capture first 
    my @items = ( $var =~ $find ); 
    $var =~ s/$find/$replace/; 
    for( reverse 0 .. $#items ){ 
        my $n = $_ + 1; 
        #  Many More Rules can go here, ie: \g matchers  and \{ } 
        $var =~ s/\\$n/${items[$_]}/g ;
        $var =~ s/\$$n/${items[$_]}/g ;
    }
    return $var; 
}

print repl $find, $replace, $var; 

A rebuttal against the ee technique:

As I said in my answer, I avoid evals for a reason.

$find="start (.*) end";
$replace='do{ print "I am a dirty little hacker" while 1; "foo $1 bar" }';

$var = "start middle end";
$var =~ s/$find/$replace/ee;

print "var: $var\n";

this code does exactly what you think it does.

If your substitution string is in a web application, you just opened the door to arbitrary code execution.

Good Job.

Also, it WON'T work with taints turned on for this very reason.

$find="start (.*) end";
$replace='"' . $ARGV[0] . '"';

$var = "start middle end";
$var =~ s/$find/$replace/ee;

print "var: $var\n"


$ perl /tmp/re.pl  'foo $1 bar'
var: foo middle bar
$ perl -T /tmp/re.pl 'foo $1 bar' 
Insecure dependency in eval while running with -T switch at /tmp/re.pl line 10.

However, the more careful technique is sane, safe, secure, and doesn't fail taint. ( Be assured tho, the string it emits is still tainted, so you don't lose any security. )


As others have suggested, you could use the following:

my $find = 'start (.*) end';
my $replace = 'foo $1 bar';   # 'foo \1 bar' is an error.
my $var = "start middle end";
$var =~ s/$find/$replace/ee;

The above is short for the following:

my $find = 'start (.*) end';
my $replace = 'foo $1 bar';
my $var = "start middle end";
$var =~ s/$find/ eval($replace) /e;

I prefer the second to the first since it doesn't hide the fact that eval(EXPR) is used. However, both of the above silence errors, so the following would be better:

my $find = 'start (.*) end';
my $replace = 'foo $1 bar';
my $var = "start middle end";
$var =~ s/$find/ my $r = eval($replace); die $@ if $@; $r /e;

But as you can see, all of the above allow for the execution of arbitrary Perl code. The following would be far safer:

use String::Substitution qw( sub_modify );

my $find = 'start (.*) end';
my $replace = 'foo $1 bar';
my $var = "start middle end";
sub_modify($var, $find, $replace);

I'm not certain on what it is you're trying to achieve. But maybe you can use this:

$var =~ s/^start/foo/;
$var =~ s/end$/bar/;

I.e. just leave the middle alone and replace the start and end.


Deparse tells us this is what is being executed:

$find = 'start (.*) end';
$replace = "foo \cA bar";
$var = 'start middle end';
$var =~ s/$find/$replace/;

However,

 /$find/foo \1 bar/

Is interpreted as :

$var =~ s/$find/foo $1 bar/;

Unfortunately it appears there is no easy way to do this.

You can do it with a string eval, but thats dangerous.

The most sane solution that works for me was this:

$find = "start (.*) end"; 
$replace = 'foo \1 bar';

$var = "start middle end"; 

sub repl { 
    my $find = shift; 
    my $replace = shift; 
    my $var = shift;

    # Capture first 
    my @items = ( $var =~ $find ); 
    $var =~ s/$find/$replace/; 
    for( reverse 0 .. $#items ){ 
        my $n = $_ + 1; 
        #  Many More Rules can go here, ie: \g matchers  and \{ } 
        $var =~ s/\\$n/${items[$_]}/g ;
        $var =~ s/\$$n/${items[$_]}/g ;
    }
    return $var; 
}

print repl $find, $replace, $var; 

A rebuttal against the ee technique:

As I said in my answer, I avoid evals for a reason.

$find="start (.*) end";
$replace='do{ print "I am a dirty little hacker" while 1; "foo $1 bar" }';

$var = "start middle end";
$var =~ s/$find/$replace/ee;

print "var: $var\n";

this code does exactly what you think it does.

If your substitution string is in a web application, you just opened the door to arbitrary code execution.

Good Job.

Also, it WON'T work with taints turned on for this very reason.

$find="start (.*) end";
$replace='"' . $ARGV[0] . '"';

$var = "start middle end";
$var =~ s/$find/$replace/ee;

print "var: $var\n"


$ perl /tmp/re.pl  'foo $1 bar'
var: foo middle bar
$ perl -T /tmp/re.pl 'foo $1 bar' 
Insecure dependency in eval while running with -T switch at /tmp/re.pl line 10.

However, the more careful technique is sane, safe, secure, and doesn't fail taint. ( Be assured tho, the string it emits is still tainted, so you don't lose any security. )


I would suggest something like:

$text =~ m{(.*)$find(.*)};
$text = $1 . $replace . $2;

It is quite readable and seems to be safe. If multiple replace is needed, it is easy:

while ($text =~ m{(.*)$find(.*)}){
     $text = $1 . $replace . $2;
}

#!/usr/bin/perl

$sub = "\\1";
$str = "hi1234";
$res = $str;
$match = "hi(.*)";
$res =~ s/$match/$1/g;

print $res

This got me the '1234'.


I'm not certain on what it is you're trying to achieve. But maybe you can use this:

$var =~ s/^start/foo/;
$var =~ s/end$/bar/;

I.e. just leave the middle alone and replace the start and end.


Examples related to regex

Why my regexp for hyphenated words doesn't work? grep's at sign caught as whitespace Preg_match backtrack error regex match any single character (one character only) re.sub erroring with "Expected string or bytes-like object" Only numbers. Input number in React Visual Studio Code Search and Replace with Regular Expressions Strip / trim all strings of a dataframe return string with first match Regex How to capture multiple repeated groups?

Examples related to perl

The program can't start because api-ms-win-crt-runtime-l1-1-0.dll is missing while starting Apache server on my computer "End of script output before headers" error in Apache Perl - Multiple condition if statement without duplicating code? How to decrypt hash stored by bcrypt Split a string into array in Perl Turning multiple lines into one comma separated line String compare in Perl with "eq" vs "==" how to remove the first two columns in a file using shell (awk, sed, whatever) Find everything between two XML tags with RegEx Difference between \w and \b regular expression meta characters

Examples related to substitution

How can I combine multiple nested Substitute functions in Excel? bash: Bad Substitution Remove all newlines from inside a string JavaScript - Replace all commas in a string Substitute multiple whitespace with single whitespace in Python How to use a variable in the replacement side of the Perl substitution operator?