For reasonably modern versions of sed, edit the standard input to yield the standard output with
$ echo 't???? ß?ß??? ?? ??p??' | sed -E -e 's/[[:blank:]]+/\n/g'
t????
ß?ß???
??
??p??
If your vocabulary words are in files named lesson1 and lesson2, redirect sed’s standard output to the file all-vocab with
sed -E -e 's/[[:blank:]]+/\n/g' lesson1 lesson2 > all-vocab
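To try the file-based command without touching real data, a minimal sketch such as the following may help. It assumes GNU sed (for -E and for \n in the replacement); the scratch directory and the sample words are made up for illustration, and only the names lesson1, lesson2, and all-vocab come from the text above.

```shell
# Work in a scratch directory so no real files are overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample lessons: words separated by spaces and a tab.
printf 'uno dos\ttres\n' > lesson1
printf 'cuatro  cinco\n' > lesson2

# Same command as above: each run of blanks becomes one newline.
sed -E -e 's/[[:blank:]]+/\n/g' lesson1 lesson2 > all-vocab

# Show the combined word list, one word per line.
cat all-vocab
```

With this sample input, all-vocab should end up with one word per line, in the order the files were given on the command line.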
What it means:
The character class [[:blank:]] matches either a single space character or a single tab character.
Use [[:space:]] instead to match any single whitespace character (commonly space, tab, newline, carriage return, form-feed, and vertical tab).
The + quantifier means match one or more of the previous pattern.
So [[:blank:]]+ is a sequence of one or more characters that are all space or tab.
The \n in the replacement is the newline that you want.
The /g modifier on the end means perform the substitution as many times as possible rather than just once.
The -E option tells sed to use POSIX extended regex syntax, which in this case provides the + quantifier. Without -E, your sed command becomes sed -e 's/[[:blank:]]\+/\n/g'. (Note the use of \+ rather than simple +.)
For those familiar with Perl-compatible regexes and a sed that accepts the \s shorthand (GNU sed does, as an extension), use \s+ to match runs of at least one whitespace character, as in
sed -E -e 's/\s+/\n/g' old > new
or
sed -e 's/\s\+/\n/g' old > new
These commands read input from the file old and write the result to a file named new in the current directory.
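To see what the + quantifier and the /g modifier each contribute, it can help to run the command with one of them removed. The sketch below assumes GNU sed; the variable names and the sample input are hypothetical.

```shell
# Hypothetical input: two spaces between "a" and "b", one tab before "c".
input=$(printf 'a  b\tc')

# Full command: every run of blanks collapses to a single newline.
all=$(printf '%s\n' "$input" | sed -E -e 's/[[:blank:]]+/\n/g')

# Without /g, only the first run of blanks on the line is replaced.
first_only=$(printf '%s\n' "$input" | sed -E -e 's/[[:blank:]]+/\n/')

# Without +, each blank becomes its own newline, leaving an empty line.
per_blank=$(printf '%s\n' "$input" | sed -E -e 's/[[:blank:]]/\n/g')

# Print the three results separated by marker lines.
printf '%s\n---\n%s\n---\n%s\n' "$all" "$first_only" "$per_blank"
```

Only the first variant gives exactly one word per line with no empty lines, which is why the commands above use both + and /g.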
Going back to almost any version of sed since Version 7 Unix, the command invocation is a bit more baroque.
$ echo 't???? ß?ß??? ?? ??p??' | sed -e 's/[ \t][ \t]*/\
/g'
t????
ß?ß???
??
??p??
Notes:
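We do not assume the + quantifier and simulate it with a single space-or-tab ([ \t]) followed by zero or more of them ([ \t]*).
Likewise, we do not assume that sed understands \n for newline, so we have to include it on the command line verbatim.
The \ at the end of the first line of the command is a continuation marker that escapes the immediately following newline, and the remainder of the command is on the next line.
Inside a bracket expression, POSIX treats a backslash as a literal character, so some seds read [ \t] as space, backslash, or the letter t. GNU sed reads \t there as a tab; on other seds, type a literal tab character in its place.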
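On a sed that does not read \t inside brackets as a tab, one workaround is to splice a literal tab into the script from a shell variable. This is only a sketch: the variable names tab and words and the sample input are hypothetical, and the rest of the command follows the old-style invocation shown above.

```shell
# Hypothetical helper variable holding one literal tab character.
tab=$(printf '\t')

# The single-quoted pieces keep the backslash safe from the shell;
# "$tab" is spliced in between them so sed sees a real tab.
words=$(printf 'one two\tthree  four\n' |
  sed -e 's/[ '"$tab"'][ '"$tab"']*/\
/g')

# Print the result, one word per line.
printf '%s\n' "$words"
```

Because the bracket expression now holds only a space and a real tab, the letter t in words such as "two" and "three" is left alone.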
The commands above all used single quotes ('') rather than double quotes (""). Consider:
$ echo '\\\\' "\\\\"
\\\\ \\
That is, the shell applies different escaping rules to single-quoted strings as compared with double-quoted strings. You typically want to protect all the backslashes common in regexes with single quotes.
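As a rough illustration of why this matters for sed, the sketch below passes the same script text through both kinds of quotes and captures what the shell actually hands over. It uses printf '%s\n' rather than echo because echo's treatment of backslashes differs between shells; the variable names and the sample input are hypothetical.

```shell
# Single quotes: both backslashes reach the command unchanged.
single=$(printf '%s\n' 's/x/\\/')

# Double quotes: the shell collapses \\ into a single backslash first.
double=$(printf '%s\n' "s/x/\\/")

printf 'single-quoted script: %s\n' "$single"
printf 'double-quoted script: %s\n' "$double"

# The single-quoted script replaces x with one literal backslash.
out=$(printf 'axb\n' | sed -e 's/x/\\/')
printf '%s\n' "$out"
```

The double-quoted form reaches sed as s/x/\/, where the backslash escapes the closing delimiter, so sed would typically reject it as an unterminated command.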