Possible Duplicate:
How can I search for a multiline pattern in a file ? Use pcregrep
I'm running a grep
to find any *.sql file that has the word select
followed by the word customerName
followed by the word from
. This select statement can span many lines and can contain tabs and newlines.
I've tried a few variations on the following:
$ grep -liIr --include="*.sql" --exclude-dir="\.svn*" --regexp="select[a-zA-Z0-
9+\n\r]*customerName[a-zA-Z0-9+\n\r]*from"
This, however, just runs forever. Can anyone help me with the correct syntax please?
Without the need to install the grep variant pcregrep, you can do multiline search with grep.
$ grep -Pzo "(?s)^(\s*)\N*main.*?{.*?^\1}" *.c
Explanation:
-P
activate perl-regexp for grep (a powerful extension of regular expressions)
-z
suppress newline at the end of line, substituting it for null character. That is, grep knows where end of line is, but sees the input as one big line.
-o
print only matching. Because we're using -z
, the whole file is like a single big line, so if there is a match, the entire file would be printed; this way it won't do that.
In regexp:
(?s)
activate PCRE_DOTALL
, which means that .
finds any character or newline
\N
find anything except newline, even with PCRE_DOTALL
activated
.*?
find .
in non-greedy mode, that is, stops as soon as possible.
^
find start of line
\1
backreference to the first group (\s*
). This is a try to find the same indentation of method.
As you can imagine, this search prints the main method in a C (*.c
) source file.
Your fundamental problem is that grep
works one line at a time - so it cannot find a SELECT statement spread across lines.
Your second problem is that the regex you are using doesn't deal with the complexity of what can appear between SELECT and FROM - in particular, it omits commas, full stops (periods) and blanks, but also quotes and anything that can be inside a quoted string.
I would likely go with a Perl-based solution, having Perl read 'paragraphs' at a time and applying a regex to that. The downside is having to deal with the recursive search - there are modules to do that, of course, including the core module File::Find.
In outline, for a single file:
$/ = "\n\n"; # Paragraphs
while (<>)
{
if ($_ =~ m/SELECT.*customerName.*FROM/mi)
{
printf file name
go to next file
}
}
That needs to be wrapped into a sub that is then invoked by the methods of File::Find.
I am not very good in grep. But your problem can be solved using AWK command. Just see
awk '/select/,/from/' *.sql
The above code will result from first occurence of select
till first sequence of from
. Now you need to verify whether returned statements are having customername
or not. For this you can pipe the result. And can use awk or grep again.
Source: Stackoverflow.com