Regex (grep) for multi-line search needed

Question

Possible duplicate: How can I search for a multiline pattern in a file? Use pcregrep

I'm running a grep to find any *.sql file that has the word select followed by the word customerName followed by the word from. This select statement can span many lines and can contain tabs and newlines.

I've tried a few variations on the following:

$ grep -liIr --include="*.sql" --exclude-dir="\.svn*" --regexp="select[a-zA-Z0-9+\n\r]*customerName[a-zA-Z0-9+\n\r]*from"

This, however, just runs forever. Can anyone help me with the correct syntax, please?

User · Accepted Answer

Without the need to install the grep variant pcregrep, you can do multiline search with grep.

$ grep -Pzo "(?s)^(\s*)\N*main.*?{.*?^\1}" *.c

Explanation:

-P activates Perl-compatible regular expressions (PCRE) in grep, a powerful extension of regular expressions.

-z treats the input as a sequence of records terminated by a NUL character instead of a newline. A text file normally contains no NUL bytes, so grep sees the whole file as one big line and a pattern can match across newlines.

-o prints only the matching part. Because of -z the whole file is a single big line, so without -o a match would print the entire file.

In regexp:

(?s) activates PCRE_DOTALL, which means that . matches any character, including newline.

\N matches any character except newline, even with PCRE_DOTALL activated.

.*? matches any run of characters in non-greedy mode, that is, it stops as soon as possible.

^ matches the start of a line.

\1 is a backreference to the first group (\s*). This is an attempt to match the closing brace at the same indentation as the start of the function.

As you can imagine, this search prints the main function in a C (*.c) source file.
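The same flags can be pointed at the search from the question. What follows is only a sketch, assuming GNU grep built with PCRE support; the temporary directory and the file names (match.sql, nomatch.sql, ignored.sql) are made up for the demonstration, and the pattern is one plausible reading of "select, then customerName, then from".

```shell
# Build a small throwaway tree to search (all names here are hypothetical).
demo=$(mktemp -d)
mkdir -p "$demo/.svn"
printf 'SELECT id,\n\tcustomerName\nFROM customers;\n' > "$demo/match.sql"
printf 'SELECT id\nFROM orders;\n' > "$demo/nomatch.sql"
printf 'select customerName from x;\n' > "$demo/.svn/ignored.sql"

# -r recurse, -l list matching file names only, -i ignore case,
# -P Perl-compatible regex, -z read each file as one NUL-terminated record.
# (?s) lets . cross newlines; \b keeps the three words from matching
# inside longer identifiers.
result=$(cd "$demo" && grep -rliPz --include='*.sql' --exclude-dir='.svn' \
    '(?s)\bselect\b.*?\bcustomerName\b.*?\bfrom\b' .)
printf '%s\n' "$result"

rm -rf "$demo"
```

Note that this pattern only checks the order of the three words within a file. It does not know where one SQL statement ends, so a select in one statement and a from in a later one can still produce a match. On very large files the lazy .*? may also run into PCRE backtracking limits.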

User · Answer

I am not very good with grep, but your problem can be solved with awk:

$ awk '/select/,/from/' *.sql

The above command prints each block of lines from a line containing select through the next line containing from. You then need to verify whether the returned statements contain customerName. For this you can pipe the result into awk or grep again.
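The two-step idea (extract the select ... from range, then check it for the column name) might look like the sketch below. The sample file a.sql and its contents are invented for illustration. awk patterns are case-sensitive, so the sample uses lowercase keywords.

```shell
# Create a hypothetical sample file.
demo=$(mktemp -d)
cat > "$demo/a.sql" <<'EOF'
/* report */
select id,
  customerName
from customers;
/* end */
EOF

# Range pattern: print from a line matching "select"
# through the next line matching "from".
block=$(awk '/select/,/from/' "$demo/a.sql")
printf '%s\n' "$block"

# Second step: check the extracted lines for the column name.
if printf '%s\n' "$block" | grep -qi 'customerName'; then
    found=yes
else
    found=no
fi
echo "$found"

rm -rf "$demo"
```

Because the range is line-based, a select with no later from keeps the range open until the end of the file, and the output of several files run together loses track of which file each block came from.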

User · Answer

Your fundamental problem is that grep works one line at a time, so it cannot find a SELECT statement spread across lines.

Your second problem is that the regex you are using doesn't deal with the complexity of what can appear between SELECT and FROM. In particular, it omits commas, full stops (periods), and blanks, but also quotes and anything that can be inside a quoted string.

I would likely go with a Perl-based solution, having Perl read 'paragraphs' at a time and applying a regex to that. The downside is having to deal with the recursive search; there are modules to do that, of course, including the core module File::Find.

In outline, for a single file:

    $/ = "\n\n";    # Paragraphs
    while (<>)
    {
        if ($_ =~ m/SELECT.*customerName.*FROM/si)
        {
            print "$ARGV\n";    # print the file name
            close ARGV;         # go to the next file
        }
    }

That needs to be wrapped into a sub that is then invoked by the methods of File::Find.
