I've seen this example:
hello=ho02123ware38384you443d34o3434ingtod38384day
echo ${hello//[0-9]/}
Which follows this syntax: ${variable//pattern/replacement}
Unfortunately the pattern field doesn't seem to support full regex syntax (if I use . or \s, for example, it tries to match the literal characters).
How can I search/replace a string using full regex syntax?
In this example, the input is hello ugly world; it searches for the regex bad|ugly and replaces the match with nice.
#!/bin/bash
# THIS FUNCTION NEEDS THREE PARAMETERS
# arg1 = input Example: hello ugly world
# arg2 = search regex Example: bad|ugly
# arg3 = replace Example: nice
function regex_replace()
{
# $1 = hello ugly world
# $2 = bad|ugly
# $3 = nice
# REGEX (POSIX ERE has no lazy quantifiers, so the leading group is greedy)
re="(.*)($2)(.*)"
if [[ $1 =~ $re ]]; then
# if there is a match
# ${BASH_REMATCH[0]} = hello ugly world
# ${BASH_REMATCH[1]} = hello
# ${BASH_REMATCH[2]} = ugly
# ${BASH_REMATCH[3]} = world
# hello + nice + world
echo "${BASH_REMATCH[1]}$3${BASH_REMATCH[3]}"
else
# if no match return original input hello ugly world
echo "$1"
fi
}
# prints 'hello nice world'
regex_replace 'hello ugly world' 'bad|ugly' 'nice'
# to save output to a variable
x=$(regex_replace 'hello ugly world' 'bad|ugly' 'nice')
echo "output of replacement is: $x"
exit
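The function above replaces only one occurrence per call. A global variant could be sketched as follows; regex_replace_all is a hypothetical name, the replacement is treated as literal text (no backreferences), and the regex is assumed to be a POSIX ERE that does not match the empty string.

```bash
#!/bin/bash
# Sketch only: regex_replace_all is a hypothetical helper, not part of the answer above.
# arg1 = input, arg2 = search regex (POSIX ERE), arg3 = literal replacement
regex_replace_all() {
    local input=$1 repl=$3 out="" prefix_len
    # Wrap the regex in a group and capture everything after the match.
    local re="($2)(.*)"
    while [[ $input =~ $re ]]; do
        # An empty match would never consume input, so stop instead of looping forever.
        [[ -n ${BASH_REMATCH[1]} ]] || break
        # Text before the match = input minus the matched tail.
        prefix_len=$(( ${#input} - ${#BASH_REMATCH[0]} ))
        out+=${input:0:prefix_len}$repl
        # The last capture group always holds the remainder after the match.
        input=${BASH_REMATCH[${#BASH_REMATCH[@]}-1]}
    done
    printf '%s\n' "$out$input"
}

regex_replace_all 'hello ugly bad world' 'bad|ugly' 'nice'
```

Each pass consumes the leftmost match and continues on the remainder, so the loop ends once no match is left.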
Use [[:digit:]] (note the double brackets) as the pattern:
$ hello=ho02123ware38384you443d34o3434ingtod38384day
$ echo ${hello//[[:digit:]]/}
howareyoudoingtodday
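The same idea extends to the other tokens from the question. The pattern field takes glob patterns, where ? matches any single character (the role . plays in a regex) and POSIX classes such as [[:space:]] stand in for \s. A minimal sketch, with a sample string invented for illustration:

```bash
#!/bin/bash
# Sketch: glob-pattern stand-ins for regex tokens in ${var//pattern/replacement}.
# The sample string is made up for this example.
s=$'tab\there and  spaces'

# [[:space:]] plays the role of \s
no_space=${s//[[:space:]]/}
echo "$no_space"

# extglob adds +(...), which behaves like the regex (...)+
shopt -s extglob
squeezed=${s//+([[:space:]])/ }
echo "$squeezed"
```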
Just wanted to summarize the answers (especially @nickl-'s https://stackoverflow.com/a/22261334/2916086).
I know this is an ancient thread, but it was my first hit on Google, and I wanted to share the following resub that I put together, which adds support for multiple $1, $2, etc. backreferences...
#!/usr/bin/env bash
############################################
### resub - regex substitution in bash ###
############################################
resub() {
local match="$1" subst="$2" tmp
if [[ -z $match ]]; then
echo "Usage: echo \"some text\" | resub '(.*) (.*)' '\$2 me \${1}time'" >&2
return 1
fi
### First, convert "$1" to "$BASH_REMATCH[1]" and 'single-quote' for later eval-ing...
### Utility function to 'single-quote' a list of strings
squot() { local a=(); for i in "$@"; do a+=( $(echo \'${i//\'/\'\"\'\"\'}\' )); done; echo "${a[@]}"; }
tmp=""
while [[ $subst =~ (.*)\${([0-9]+)}(.*) ]] || [[ $subst =~ (.*)\$([0-9]+)(.*) ]]; do
tmp="\${BASH_REMATCH[${BASH_REMATCH[2]}]}$(squot "${BASH_REMATCH[3]}")${tmp}"
subst="${BASH_REMATCH[1]}"
done
subst="$(squot "${subst}")${tmp}"
### Now start (globally) substituting
while IFS= read -r line; do
tmp=""  # reset per input line so earlier lines do not leak into later output
while [[ $line =~ $match(.*) ]]; do
eval tmp='"${tmp}${line%${BASH_REMATCH[0]}}"'"${subst}"
line="${BASH_REMATCH[$(( ${#BASH_REMATCH[@]} - 1 ))]}"
done
echo "${tmp}${line}"
done
}
resub "$@"
##################
### EXAMPLES ###
##################
### % echo "The quick brown fox jumps quickly over the lazy dog" | resub quick slow
### The slow brown fox jumps slowly over the lazy dog
### % echo "The quick brown fox jumps quickly over the lazy dog" | resub 'quick ([^ ]+) fox' 'slow $1 sheep'
### The slow brown sheep jumps quickly over the lazy dog
### % animal="sheep"
### % echo "The quick brown fox 'jumps' quickly over the \"lazy\" \$dog" | resub 'quick ([^ ]+) fox' "\"\$low\" \${1} '$animal'"
### The "$low" brown 'sheep' 'jumps' quickly over the "lazy" $dog
### % echo "one two three four five" | resub "one ([^ ]+) three ([^ ]+) five" 'one $2 three $1 five'
### one four three two five
### % echo "one two one four five" | resub "one ([^ ]+) " 'XXX $1 '
### XXX two XXX four five
### % echo "one two three four five one six three seven eight" | resub "one ([^ ]+) three ([^ ]+) " 'XXX $1 YYY $2 '
### XXX two YYY four five XXX six YYY seven eight
H/T to @Charles Duffy re: (.*)$match(.*)
These examples also work in bash; no need to use sed:
#!/bin/bash
MYVAR=ho02123ware38384you443d34o3434ingtod38384day
MYVAR=${MYVAR//[a-zA-Z]/X}
echo ${MYVAR//[0-9]/N}
You can also use the character class bracket expressions:
#!/bin/bash
MYVAR=ho02123ware38384you443d34o3434ingtod38384day
MYVAR=${MYVAR//[[:alpha:]]/X}
echo ${MYVAR//[[:digit:]]/N}
output
XXNNNNNXXXXNNNNNXXXNNNXNNXNNNNXXXXXXNNNNNXXX
What @Lanaru wanted to know, however, if I understand the question correctly, is why the "full" or PCRE extensions \s \S \w \W \d \D etc. don't work the way they do in PHP, Ruby, Python, etc. These extensions come from Perl-compatible regular expressions (PCRE) and may not be supported by other, shell-based forms of regular expressions.
These don't work:
#!/bin/bash
hello=ho02123ware38384you443d34o3434ingtod38384day
echo ${hello//\d/}
#!/bin/bash
hello=ho02123ware38384you443d34o3434ingtod38384day
echo $hello | sed 's/\d//g'
output with all literal "d" characters removed
ho02123ware38384you44334o3434ingto38384ay
but the following does work as expected
#!/bin/bash
hello=ho02123ware38384you443d34o3434ingtod38384day
echo $hello | perl -pe 's/\d//g'
output
howareyoudoingtodday
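If Perl is not available, the POSIX bracket class can take the place of \d in sed and in bash's own =~ operator. A minimal sketch, assuming any POSIX-compatible sed:

```bash
#!/bin/bash
hello=ho02123ware38384you443d34o3434ingtod38384day

# POSIX bracket class instead of the Perl-style \d
stripped=$(printf '%s\n' "$hello" | sed 's/[[:digit:]]//g')
echo "$stripped"

# The same class works with bash's regex operator
if [[ $hello =~ [[:digit:]]+ ]]; then
    first_digits=${BASH_REMATCH[0]}
    echo "first run of digits: $first_digits"
fi
```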
Hope that clarifies things a bit more, but if you are not confused yet, why don't you try this on Mac OS X, which has the REG_ENHANCED flag enabled:
#!/bin/bash
MYVAR=ho02123ware38384you443d34o3434ingtod38384day;
echo $MYVAR | grep -o -E '\d'
On most flavours of *nix you will only see the following output:
d
d
d
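For output that does not depend on the platform's regex flavour, the bracket class can be used with grep as well. A short sketch, assuming a grep that supports -o (GNU and BSD grep both do):

```bash
#!/bin/bash
MYVAR=ho02123ware38384you443d34o3434ingtod38384day

# [[:digit:]] means "a digit" in every POSIX regex flavour, unlike \d
digits=$(printf '%s\n' "$MYVAR" | grep -o -E '[[:digit:]]+')
echo "$digits"
```

This prints each run of digits on its own line.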
nJoy!
This actually can be done in pure bash:
hello=ho02123ware38384you443d34o3434ingtod38384day
re='(.*)[0-9]+(.*)'
while [[ $hello =~ $re ]]; do
hello=${BASH_REMATCH[1]}${BASH_REMATCH[2]}
done
echo "$hello"
...yields...
howareyoudoingtodday
If you are making repeated calls and are concerned with performance, this test reveals the bash method is ~15x faster than forking to sed, and likely faster than forking to any other external process.
hello=123456789X123456789X123456789X123456789X123456789X123456789X123456789X123456789X123456789X123456789X123456789X
P1=$(date +%s)
for i in {1..10000}
do
echo $hello | sed s/X//g > /dev/null
done
P2=$(date +%s)
echo $((P2 - P1))
for i in {1..10000}
do
echo ${hello//X/} > /dev/null
done
P3=$(date +%s)
echo $((P3 - P2))
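date +%s only resolves whole seconds, so a variant using bash's time keyword may give a finer picture. This is a sketch, not the original measurement: via_sed, via_bash and bench are hypothetical helper names, and the shorter test string and the iteration count of 200 are arbitrary choices.

```bash
#!/bin/bash
# Sketch: the same comparison with bash's `time` keyword for sub-second resolution.
hello=123456789X123456789X123456789X

via_sed()  { printf '%s\n' "$1" | sed 's/X//g'; }   # forks sed on every call
via_bash() { printf '%s\n' "${1//X/}"; }            # stays inside the shell

# Run the named function n times, discarding its output.
bench() {
    local fn=$1 n=$2 i
    for (( i = 0; i < n; i++ )); do
        "$fn" "$hello" > /dev/null
    done
}

TIMEFORMAT='%R seconds'   # report elapsed wall-clock time only
time bench via_sed 200
time bench via_bash 200
```

The timings go to stderr and will vary by machine; both helpers should produce identical output.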
Source: Stackoverflow.com