I agree with Charles Duffy that a proper XML parser is the right way to go.
But as to what's wrong with your sed command (or did you do it on purpose?): $data was not quoted, so it is subject to the shell's word splitting and filename expansion, among other things. One consequence is that the spacing in the XML snippet, including its line breaks, is not preserved.

So, given your specific XML structure, this modified sed command should work:
title=$(sed -ne '/title/{s/.*<title>\(.*\)<\/title>.*/\1/p;q;}' <<< "$data")
Basically, for the first line that contains title, extract the text between the tags and print it, then quit (so you don't extract the 2nd <title>).
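
To see the effect of the quoting, here is a minimal sketch. The XML in it is made-up sample data (the snippet from the question is not reproduced here), and the variable name collapsed is only for illustration. It assumes bash, since the command uses a here-string.

```bash
#!/usr/bin/env bash
# Hypothetical sample data, not the XML from the question.
data='<item>
  <title>First   title</title>
  <title>Second title</title>
</item>'

# Unquoted expansion: the shell splits $data into words, so echo
# receives them as separate arguments and joins them with single spaces.
# Line breaks and runs of spaces are lost.
collapsed=$(echo $data)

# Quoted expansion: sed sees the original lines, so it can stop at the
# first line containing "title".
title=$(sed -ne '/title/{s/.*<title>\(.*\)<\/title>.*/\1/p;q;}' <<< "$data")

printf 'collapsed: %s\n' "$collapsed"
printf 'title:     %s\n' "$title"
```

With the unquoted form everything ends up on one line, so a line-oriented command like the sed one above can no longer tell the first <title> from the second.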