Am trying to grep pattern from dozen files .tar.gz but its very slow
tar -ztf file.tar.gz | while read FILENAME do if tar -zxf file.tar.gz "$FILENAME" -O | grep "string" > /dev/null then echo "$FILENAME contains string" fi done
That's actually very easy with ugrep option -z
:
-z, --decompress
Decompress files to search, when compressed. Archives (.cpio,
.pax, .tar, and .zip) and compressed archives (e.g. .taz, .tgz,
.tpz, .tbz, .tbz2, .tb2, .tz2, .tlz, and .txz) are searched and
matching pathnames of files in archives are output in braces. If
-g, -O, -M, or -t is specified, searches files within archives
whose name matches globs, matches file name extensions, matches
file signature magic bytes, or matches file types, respectively.
Supported compression formats: gzip (.gz), compress (.Z), zip,
bzip2 (requires suffix .bz, .bz2, .bzip2, .tbz, .tbz2, .tb2, .tz2),
lzma and xz (requires suffix .lzma, .tlz, .xz, .txz).
Which requires just one command to search file.tar.gz
as follows:
ugrep -z "string" file.tar.gz
This greps each of the archived files to display matches. Archived filenames are shown in braces to distinguish them from ordinary filenames. For example:
$ ugrep -z "Hello" archive.tgz
{Hello.bat}:echo "Hello World!"
Binary file archive.tgz{Hello.class} matches
{Hello.java}:public class Hello // prints a Hello World! greeting
{Hello.java}: { System.out.println("Hello World!");
{Hello.pdf}:(Hello)
{Hello.sh}:echo "Hello World!"
{Hello.txt}:Hello
If you just want the file names, use option -l
(--files-with-matches
) and customize the filename output with option --format="%z%~"
to get rid of the braces:
$ ugrep -z Hello -l --format="%z%~" archive.tgz
Hello.bat
Hello.class
Hello.java
Hello.pdf
Hello.sh
Hello.txt