What are the undocumented features and limitations of the Windows FINDSTR command?

Question

The Windows FINDSTR command is horribly documented. There is very basic command line help available through FINDSTR /?, or HELP FINDSTR, but it is woefully inadequate. There is a wee bit more documentation online at https://docs.microsoft.com/en-us/windows-server/administration/windows-commands/findstr.

There are many FINDSTR features and limitations that are not even hinted at in the documentation. Nor could they be anticipated without prior knowledge and/or careful experimentation.

So the question is - What are the undocumented FINDSTR features and limitations?

The purpose of this question is to provide a one stop repository of the many undocumented features so that:

A) Developers can take full advantage of the features that are there.

B) Developers don't waste their time wondering why something doesn't work when it seems like it should.

Please make sure you know the existing documentation before responding. If the information is covered by the HELP, then it does not belong here.

Neither is this a place to show interesting uses of FINDSTR. If a logical person could anticipate the behavior of a particular usage of FINDSTR based on the documentation, then it does not belong here.

Along the same lines, if a logical person could anticipate the behavior of a particular usage based on information contained in any existing answers, then again, it does not belong here.

User · Answer

The findstr command sets the ErrorLevel (or exit code) to one of the following values, given that there are no invalid or incompatible switches and no search string exceeds the applicable length limit:

- 0 when at least a single match is encountered in one line throughout all specified files;
- 1 otherwise.

A line is considered to contain a match when:

- no /V option is given and the search expression occurs at least once;
- the /V option is given and the search expression does not occur.

This means that the /V option also changes the returned ErrorLevel, but it does not simply invert it!

For example, when you have got a file test.txt with two lines, one of which contains the string text but the other one does not, both findstr "text" "test.txt" and findstr /V "text" "test.txt" return an ErrorLevel of 0.

Basically you can say: if findstr returns at least one line, ErrorLevel is set to 0, else to 1.

Note that the /M option does not affect the ErrorLevel value, it just alters the output.

(Just for the sake of completeness: the find command behaves exactly the same way with respect to the /V option and ErrorLevel; the /C option does not affect ErrorLevel.)
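The rule above can be sketched as a small Python model. This is not FINDSTR itself: the function name `findstr_exit_code` and the `invert` flag (standing in for /V) are hypothetical, and plain case-sensitive substring matching is assumed in place of FINDSTR's real search.

```python
def findstr_exit_code(lines, needle, invert=False):
    """Model of the rule above: 0 if at least one line would be output, else 1."""
    for line in lines:
        found = needle in line
        # With /V (invert=True) a line counts as a match when the needle is absent.
        if found != invert:
            return 0
    return 1


# Two-line stand-in for test.txt: one line contains "text", the other does not.
test_txt = ["some text here", "nothing relevant"]

print(findstr_exit_code(test_txt, "text"))               # without /V
print(findstr_exit_code(test_txt, "text", invert=True))  # with /V
print(findstr_exit_code(test_txt, "absent"))             # no line matches
```

Both of the first two calls yield 0 because each of them selects at least one line, which is why /V does not simply flip the exit code.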

User · Answer

I'd like to report a bug regarding the section Source of data to search in the first answer, when using an en dash (–) or em dash (—) within the filename.

More specifically, if you use the first option (filenames specified as arguments), the file won't be found. As soon as you use either option 2 (stdin via redirection) or option 3 (data stream from a pipe), findstr will find the file.

For example, this simple batch script:

    @echo off
    chcp 1250 > nul
    set INTEXTFILE1=filename with – dash.txt
    set INTEXTFILE2=filename with — dash.txt

    rem 3 way of findstr use with en dashed filename
    echo.
    echo Filename with en dash:
    echo.
    echo 1. As argument
    findstr . "%INTEXTFILE1%"
    echo.
    echo 2. As stdin via redirection
    findstr . < "%INTEXTFILE1%"
    echo.
    echo 3. As datastream from a pipe
    type "%INTEXTFILE1%" | findstr .
    echo.
    echo.
    rem The same set of operations with em dashed filename
    echo Filename with em dash:
    echo.
    echo 1. As argument
    findstr . "%INTEXTFILE2%"
    echo.
    echo 2. As stdin via redirection
    findstr . < "%INTEXTFILE2%"
    echo.
    echo 3. As datastream from a pipe
    type "%INTEXTFILE2%" | findstr .
    echo.

    pause

will print:

    Filename with en dash:

    1. As argument
    FINDSTR: Cannot open filename with - dash.txt

    2. As stdin via redirection
    I am the file with an en dash.

    3. As datastream from a pipe
    I am the file with an en dash.


    Filename with em dash:

    1. As argument
    FINDSTR: Cannot open filename with - dash.txt

    2. As stdin via redirection
    I am the file with an em dash.

    3. As datastream from a pipe
    I am the file with an em dash.

Hope it helps.

M.

User · Answer

When several commands are enclosed in parentheses and files are redirected to the whole block:

    < input.txt (
       command1
       command2
       . . .
    ) > output.txt

... then the files remain open as long as the commands in the block are active, so the commands may move the file pointer of the redirected files. Both the MORE and FIND commands move the Stdin file pointer to the beginning of the file before processing it, so the same file may be processed several times inside the block. For example, this code:

    more < input.txt >  output.txt
    more < input.txt >> output.txt

... produces the same result as this one:

    < input.txt (
       more
       more
    ) > output.txt

This code:

    find    "search string" < input.txt > matchedLines.txt
    find /V "search string" < input.txt > unmatchedLines.txt

... produces the same result as this one:

    < input.txt (
       find    "search string" > matchedLines.txt
       find /V "search string" > unmatchedLines.txt
    )

FINDSTR is different; it does not move the Stdin file pointer from its current position. For example, this code inserts a new line after a search line:

    call :ProcessFile < input.txt
    goto :EOF

    :ProcessFile
       rem Read the next line from Stdin and copy it
       set /P line=
       echo %line%
       rem Test if it is the search line
       if "%line%" neq "search line" goto ProcessFile
    rem Insert the new line at this point
    echo New line
    rem And copy the rest of lines
    findstr "^"
    exit /B

We may make good use of this feature with the aid of an auxiliary program that allows us to move the file pointer of a redirected file, as shown in this example.

This behavior was first reported by jeb at this post.

EDIT 2018-08-18: New FINDSTR bug reported

The FINDSTR command has a strange bug that happens when the command is used to show characters in color AND the output of such a command is redirected to the CON device. For details on how to use the FINDSTR command to show text in color, see this topic.

When the output
of this form of FINDSTR command is redirected to CON, something strange happens after the text is output in the desired color: all the text after it is output as "invisible" characters, although a more precise description is that the text is output as black text over a black background. The original text will appear if you use the COLOR command to reset the foreground and background colors of the entire screen. However, while the text is "invisible" we can execute a SET /P command, so that none of the characters entered appear on the screen. This behavior may be used to enter passwords.

    @echo off
    setlocal

    set /P "=_" < NUL > "Enter password"
    findstr /A:1E /V "^$" "Enter password" NUL > CON
    del "Enter password"
    set /P "password="
    cls
    color 07
    echo The password read is: "%password%"

User · Answer

Preface

Much of the information in this answer has been gathered based on experiments run on a Vista machine. Unless explicitly stated otherwise, I have not confirmed whether the information applies to other Windows versions.

FINDSTR output

The documentation never bothers to explain the output of FINDSTR. It alludes to the fact that matching lines are printed, but nothing more.

The format of matching line output is as follows:

    fileName:lineNumber:lineOffset:text

where

fileName: = The name of the file containing the matching line. The file name is not printed if the request was explicitly for a single file, or if searching piped input or redirected input. When printed, the fileName will always include any path information provided. Additional path information will be added if the /S option is used. The printed path is always relative to the provided path, or relative to the current directory if none provided.

Note - The filename prefix can be avoided when searching multiple files by using the non-standard (and poorly documented) wildcards < and >. The exact rules for how these wildcards work can be found here. Finally, you can look at this example of how the non-standard wildcards work with FINDSTR.

lineNumber: = The line number of the matching line represented as a decimal value with 1 representing the 1st line of the input. Only printed if /N option is specified.

lineOffset: = The decimal byte offset of the start of the matching line, with 0 representing the 1st character of the 1st line. Only printed if /O option is specified. This is not the offset of the match within the line. It is the number of bytes from the beginning of the file to the beginning of the line.

text = The binary representation of the matching line, including any <CR> and/or <LF>. Nothing is left out of the binary output, such that this example that matches all lines will produce an exact binary copy of the original file.

    FINDSTR "^" FILE >FILE_COPY

The /A option
sets the color of the fileName:, lineNumber:, and lineOffset: output only. The text of the matching line is always output with the current console color. The /A option only has effect when output is displayed directly to the console. The /A option has no effect if the output is redirected to a file or piped. See the 2018-08-18 edit in Aacini's answer for a description of the buggy behavior when output is redirected to CON.

Most control characters and many extended ASCII characters display as dots on XP

FINDSTR on XP displays most non-printable control characters from matching lines as dots (periods) on the screen. The following control characters are exceptions; they display as themselves: 0x09 Tab, 0x0A LineFeed, 0x0B Vertical Tab, 0x0C Form Feed, 0x0D Carriage Return.

XP FINDSTR also converts a number of extended ASCII characters to dots as well. The extended ASCII characters that display as dots on XP are the same as those that are transformed when supplied on the command line. See the "Character limits for command line parameters - Extended ASCII transformation" section, later in this post.

Control characters and extended ASCII are not converted to dots on XP if the output is piped, redirected to a file, or within a FOR IN() clause.

Vista and Windows 7 always display all characters as themselves, never as dots.

Return Codes (ERRORLEVEL)

- 0 (success)
  - Match was found in at least one line of at least one file.
- 1 (failure)
  - No match was found in any line of any file.
  - Invalid color specified by /A:xx option
- 2 (error)
  - Incompatible options /L and /R both specified
  - Missing argument after /A:, /F:, /C:, /D:, or /G:
  - File specified by /F:file or /G:file not found
- 255 (error)
  - Too many regular expression character class terms (see Regex character class term limit and BUG in part 2 of answer)

Source of data to search (Updated based on tests with Windows 7)

Findstr can search data from only one of the following sources:

- filenames specified as arguments and/or using the /F:file option.
- stdin via redirection: findstr "searchString" <file
- data stream from a pipe: type file | findstr "searchString"

Arguments/options take precedence over redirection, which takes precedence over piped data.

File name arguments and /F:file may be combined. Multiple file name arguments may be used. If multiple /F:file options are specified, then only the last one is used. Wild cards are allowed in filename arguments, but not within the file pointed to by /F:file.

Source of search strings (Updated based on tests with Windows 7)

The /G:file and /C:string options may be combined. Multiple /C:string options may be specified. If multiple /G:file options are specified, then only the last one is used. If either /G:file or /C:string is used, then all non-option arguments are assumed to be files to search. If neither /G:file nor /C:string is used, then the first non-option argument is treated as a space delimited list of search terms.

File names must not be quoted within the file when using the /F:FILE option.

File names may contain spaces and other special characters. Most commands require that such file names are quoted. But the FINDSTR /F:files.txt option requires that filenames within files.txt must NOT be quoted. The file will not be found if the name is quoted.

BUG - Short 8.3 filenames can break the /D and /S options

As with all Windows commands, FINDSTR will attempt to match both the long name and the short 8.3 name when looking for files to search. Assume the current folder contains the following non-empty files:

    b1.txt
    b.txt2
    c.txt

The following command will successfully find all 3 files:

    findstr /m "^" *.txt

b.txt2 matches because the corresponding short name B9F64~1.TXT matches. This is consistent with the behavior of all other Windows commands.

But a bug with the /D and /S options causes the following commands to only find b1.txt

    findstr /m /d:. "^" *.txt
    findstr /m /s "^" *.txt

The bug prevents b.txt2 from being found, as well as all file names that sort after b.txt2 within the same directory. Additional files that sort before, like a.txt, are found. Additional files that sort later, like d.txt, are missed once the bug has been triggered.

Each directory searched is treated independently. For example, the /S option would successfully begin searching in a child folder after failing to find files in the parent, but once the bug causes a short file name to be missed in the child, then all subsequent files in that child folder would also be missed.

The commands work bug free if the same file names are created on a machine that has NTFS 8.3 name generation disabled. Of course b.txt2 would not be found, but c.txt would be found properly.

Not all short names trigger the bug. All instances of bugged behavior I have seen involve an extension that is longer than 3 characters with a short 8.3 name that begins the same as a normal name that does not require an 8.3 name.

The bug has been confirmed on XP, Vista, and Windows 7.

Non-Printable characters and the /P option

The /P option causes FINDSTR to skip any file that contains any of the following decimal byte codes: 0-7, 14-25, 27-31.

Put another way, the /P option will only skip files that contain non-printable control characters. Control characters are codes less than or equal to 31 (0x1F). FINDSTR treats the following control characters as printable:

     8  0x08  backspace
     9  0x09  horizontal tab
    10  0x0A  line feed
    11  0x0B  vertical tab
    12  0x0C  form feed
    13  0x0D  carriage return
    26  0x1A  substitute (end of text)

All other control characters are treated as non-printable, the presence of which causes the /P option to skip the file.

Piped and Redirected input may have <CR><LF> appended

If the input is piped in and the last character of the stream is not <LF>, then FINDSTR will automatically append <CR><LF> to the input. This has been confirmed on XP, Vista and
Windows 7. (I used to think that the Windows pipe was responsible for modifying the input, but I have since discovered that FINDSTR is actually doing the modification.)

The same is true for redirected input on Vista. If the last character of a file used as redirected input is not <LF>, then FINDSTR will automatically append <CR><LF> to the input. However, XP and Windows 7 do not alter redirected input.

FINDSTR hangs on XP and Windows 7 if redirected input does not end with <LF>

This is a nasty "feature" on XP and Windows 7. If the last character of a file used as redirected input is not <LF>, then FINDSTR will hang indefinitely once it reaches the end of the redirected file.

Last line of Piped data may be ignored if it consists of a single character

If the input is piped in and the last line consists of a single character that is not followed by <LF>, then FINDSTR completely ignores the last line.

Example - The first command with a single character and no <LF> fails to match, but the second command with 2 characters works fine, as does the third command that has one character with terminating newline.

    > set /p "=x" <nul | findstr "^"

    > set /p "=xx" <nul | findstr "^"
    xx

    > echo x| findstr "^"
    x

Reported by DosTips user Sponge Belly at new findstr bug. Confirmed on XP, Windows 7 and Windows 8. Haven't heard about Vista yet. (I no longer have Vista to test.)

Option syntax

Option letters are not case sensitive, so /i and /I are equivalent.

Options can be prefixed with either / or -. Options may be concatenated after a single / or -. However, the concatenated option list may contain at most one multi-character option such as OFF or F:, and the multi-character option must be the last option in the list.

The following are all equivalent ways of expressing a case insensitive regex search for any line that contains both "hello" and "goodbye" in any order:

    /i /r /c:"hello.*goodbye" /c:"goodbye.*hello"
    -i -r -c:"hello.*goodbye" /c:"goodbye.*hello"
    /irc:"hello.*goodbye" /c:"goodbye.*hello"

Options may also be quoted. So /i, -i, "/i" and "-i" are all equivalent. Likewise, /c:string, "/c":string, "/c:"string and "/c:string" are all equivalent.

If a search string begins with a / or - literal, then the /C or /G option must be used. Thanks to Stephan for reporting this in a comment (since deleted).

Search String length limits

On Vista the maximum allowed length for a single search string is 511 bytes. If any search string exceeds 511 then the result is a FINDSTR: Search string too long. error with ERRORLEVEL 2.

When doing a regular expression search, the maximum search string length is 254. A regular expression with length between 255 and 511 will result in a FINDSTR: Out of memory error with ERRORLEVEL 2. A regular expression length >511 results in the FINDSTR: Search string too long. error.

On Windows XP the search string length is apparently shorter. See Findstr error: "Search string too long": How to extract and match substring in "for" loop? The XP limit is 127 bytes for both literal and regex searches.

Line Length limits

Files specified as a command line argument or via the /F:FILE option have no known line length limit. Searches were successfully run against a 128MB file that did not contain a single <LF>.

Piped data and Redirected input is limited to 8191 bytes per line. This limit is a "feature" of FINDSTR. It is not inherent to pipes or redirection. FINDSTR using redirected stdin or piped input will never match any line that is >=8k bytes. Lines >=8k generate an error message to stderr, but ERRORLEVEL is still 0 if the search string is found in at least one line of at least one file.

Default type of search: Literal vs Regular Expression

/C:"string" - The default is /L literal. Explicitly combining the /L option with /C:"string" certainly works but is redundant.

"string argument" - The default depends on the content of the very first search string. (Remember that <space> is used to delimit search strings.) If the first search string is a valid regular expression that contains at least one un-escaped meta-character, then all search strings are treated as regular expressions. Otherwise all search strings are treated as literals. For example, "51.4 200" will be treated as two regular expressions because the first string contains an un-escaped dot, whereas "200 51.4" will be treated as two literals because the first string does not contain any meta-characters.

/G:file - The default depends on the content of the first non-empty line in the file. If the first search string is a valid regular expression that contains at least one un-escaped meta-character, then all search strings are treated as regular expressions. Otherwise all search strings are treated as literals.

Recommendation - Always explicitly specify /L literal option or /R regular expression option when using "string argument" or /G:file.

BUG - Specifying multiple literal search strings can give unreliable results

The following simple FINDSTR example fails to find a match, even though it should.

    echo ffffaaa|findstr /l "ffffaaa faffaffddd"

This bug has been confirmed on Windows Server 2003, Windows XP, Vista, and Windows 7.

Based on experiments, FINDSTR may fail if all of the following conditions are met:

- The search is using multiple literal search strings
- The search strings are of different lengths
- A short search string has some amount of overlap with a longer search string
- The search is case sensitive (no /I option)

In every failure I have seen, it is always one of the shorter search strings that fails.

For more info see Why doesn't this FINDSTR example
with multiple literal search strings find a match?

Quotes and backslashes within command line arguments

Note - User MC ND's comments reflect the actual horrifically complicated rules for this section. There are 3 distinct parsing phases involved:

- First cmd.exe may require some quotes to be escaped as ^" (really nothing to do with FINDSTR)
- Next FINDSTR uses the pre 2008 MS C/C++ argument parser, which has special rules for " and \
- After the argument parser finishes, FINDSTR additionally treats \ followed by an alpha-numeric character as literal, but \ followed by non-alpha-numeric character as an escape character

The remainder of this highlighted section is not 100% correct. It can serve as a guide for many situations, but the above rules are required for total understanding.

Escaping Quote within command line search strings

Quotes within command line search strings must be escaped with backslash like \". This is true for both literal and regex search strings. This information has been confirmed on XP, Vista, and Windows 7.

Note: The quote may also need to be escaped for the CMD.EXE parser, but this has nothing to do with FINDSTR. For example, to search for a single quote you could use:

    FINDSTR \^" file && echo found || echo not found

Escaping Backslash within command line literal search strings

Backslash in a literal search string can normally be represented as \ or as \\. They are typically equivalent. (There may be unusual cases in Vista where the backslash must always be escaped, but I no longer have a Vista machine to test.)

But there are some special cases:

When searching for consecutive backslashes, all but the last must be escaped. The last backslash may optionally be escaped.

    \\ can be coded as \\\ or \\\\
    \\\ can be coded as \\\\\ or \\\\\\

Searching for one or more backslashes before a quote is bizarre. Logic would suggest that the quote must be escaped, and each of the leading backslashes would need to be escaped, but
this does not work. Instead, each of the leading backslashes must be double escaped, and the quote is escaped normally:

    \" must be coded as \\\\\"
    \\" must be coded as \\\\\\\\\"

As previously noted, one or more escaped quotes may also require escaping with ^ for the CMD parser.

The info in this section has been confirmed on XP and Windows 7.

Escaping Backslash within command line regex search strings

Vista only: Backslash in a regex must be either double escaped like \\\\, or else single escaped within a character class set like [\\]

XP and Windows 7: Backslash in a regex can always be represented as [\\]. It can normally be represented as \\. But this never works if the backslash precedes an escaped quote.

One or more backslashes before an escaped quote must either be double escaped, or else coded as [\\]

    \" may be coded as \\\\\" or [\\]\"
    \\" may be coded as \\\\\\\\\" or [\\][\\]\" or \\[\\]\"

Escaping Quote and Backslash within /G:FILE literal search strings

Standalone quotes and backslashes within a literal search string file specified by /G:file need not be escaped, but they can be.

" and \" are equivalent.

\ and \\ are equivalent.

If the intent is to find \\, then at least the leading backslash must be escaped. Both \\\ and \\\\ work.

If the intent is to find \", then at least the leading backslash must be escaped. Both \\" and \\\" work.

Escaping Quote and Backslash within /G:FILE regex search strings

This is the one case where the escape sequences work as expected based on the documentation. Quote is not a regex metacharacter, so it need not be escaped (but can be). Backslash is a regex metacharacter, so it must be escaped.

Character limits for command line parameters - Extended ASCII transformation

The null character (0x00) cannot appear in any string on the command line. Any other single byte character can appear in the string (0x01 - 0xFF). However, FINDSTR converts
many extended ASCII characters it finds within command line parameters into other characters. This has a major impact in two ways:

1) Many extended ASCII characters will not match themselves if used as a search string on the command line. This limitation is the same for literal and regex searches. If a search string must contain extended ASCII, then the /G:FILE option should be used instead.

2) FINDSTR may fail to find a file if the name contains extended ASCII characters and the file name is specified on the command line. If a file to be searched contains extended ASCII in the name, then the /F:FILE option should be used instead.

Here is a complete list of extended ASCII character transformations that FINDSTR performs on command line strings. Each character is represented as the decimal byte code value. The first code represents the character as supplied on the command line, and the second code represents the character it is transformed into. Note - this list was compiled on a U.S. machine. I do not know what impact other languages may have on this list.

    158 treated as 080     199 treated as 221     226 treated as 071
    169 treated as 170     200 treated as 043     227 treated as 112
    176 treated as 221     201 treated as 043     228 treated as 083
    177 treated as 221     202 treated as 045     229 treated as 115
    178 treated as 221     203 treated as 045     231 treated as 116
    179 treated as 221     204 treated as 221     232 treated as 070
    180 treated as 221     205 treated as 045     233 treated as 084
    181 treated as 221     206 treated as 043     234 treated as 079
    182 treated as 221     207 treated as 045     235 treated as 100
    183 treated as 043     208 treated as 045     236 treated as 056
    184 treated as 043     209 treated as 045     237 treated as 102
    185 treated as 221     210 treated as 045     238 treated as 101
    186 treated as 221     211 treated as 043     239 treated as 110
    187 treated as 043     212 treated as 043     240 treated as 061
    188 treated as 043     213 treated as 043     242 treated as 061
    189 treated as 043     214 treated as 043     243 treated as 061
    190 treated as 043     215 treated as 043     244 treated as 040
    191 treated as 043     216 treated as 043     245 treated as 041
    192 treated as 043     217 treated as 043     247 treated as 126
    193 treated as 045     218 treated as 043     249 treated as 250
    194 treated as 045     219 treated as 221     251 treated as 118
    195 treated as 043     220 treated as 095     252 treated as 110
    196 treated as 045     222 treated as 221     254 treated as 221
    197 treated as 043     223 treated as 095
    198 treated as 221     224 treated as 097

Any character >0 not in the list above is treated as itself, including <CR> and <LF>. The easiest way to include odd characters like <CR> and <LF> is to get them into an environment variable and use delayed expansion within the command line argument.

Character limits for strings found in files specified by /G:FILE and /F:FILE options

The nul (0x00) character can appear in the file, but it functions like the C string terminator. Any characters after a nul character are treated as a different string as if they were on another line.

The <CR> and <LF> characters are treated as line terminators that terminate a string, and are not included in the string.

All other single byte characters are included perfectly within a string.

Searching Unicode files

FINDSTR cannot properly search most Unicode (UTF-16, UTF-16LE, UTF-16BE, UTF-32) because it cannot search for nul bytes and Unicode typically contains many nul bytes.

However, the TYPE command converts UTF-16LE with BOM to a single byte character set, so a command like the following will work with UTF-16LE with BOM.

    type unicode.txt|findstr "search"

Note that Unicode code points that are not supported by your active code page will be converted to ? characters.

It is possible to search UTF-8 as long as your search string contains only ASCII.
However, the console output of any multi-byte UTF-8 characters will not be correct. But if you redirect the output to a file, then the result will be correctly encoded UTF-8. Note that if the UTF-8 file contains a BOM, then the BOM will be considered as part of the first line, which could throw off a search that matches the beginning of a line.

It is possible to search multi-byte UTF-8 characters if you put your search string in a UTF-8 encoded search file (without BOM), and use the /G option.

End Of Line

FINDSTR breaks lines immediately after every <LF>. The presence or absence of <CR> has no impact on line breaks.

Searching across line breaks

As expected, the . regex metacharacter will not match <CR> or <LF>. But it is possible to search across a line break using a command line search string. Both the <CR> and <LF> characters must be matched explicitly. If a multi-line match is found, only the 1st line of the match is printed. FINDSTR then doubles back to the 2nd line in the source and begins the search all over again - sort of a "look ahead" type feature.

Assume TEST.TXT has these contents (could be Unix or Windows style)

    A
    A
    A
    B
    A
    A

Then this script

    @echo off
    setlocal
    ::Define LF variable containing a linefeed (0x0A)
    set LF=^


    ::Above 2 blank lines are critical - do not remove

    ::Define CR variable containing a carriage return (0x0D)
    for /f %%a in ('copy /Z "%~dpf0" nul') do set "CR=%%a"

    setlocal enableDelayedExpansion
    ::regex "!CR!*!LF!" will match both Unix and Windows style End-Of-Line
    findstr /n /r /c:"A!CR!*!LF!A" TEST.TXT

gives these results

    1:A
    2:A
    5:A

Searching across line breaks using the /G:FILE option is imprecise because the only way to match <CR> or <LF> is via a regex character class range expression that sandwiches the EOL characters.

- [<TAB>-<0x0B>] matches <LF>, but it also matches <TAB> and <0x0B>
- [<0x0C>-!] matches <CR>, but it also matches <0x0C> and !

Note - the above are symbolic representations of the regex byte stream since I can't graphically represent the characters.

Answer continued in part 2 below...
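As a rough illustration of the lineNumber and lineOffset prefixes and the End Of Line rule described in this answer, here is a minimal Python model. It is only a sketch of the stated rules (a line ends immediately after each <LF>, numbering starts at 1, offsets count bytes from the start of the file); the function name `findstr_lines` is hypothetical and nothing here calls FINDSTR.

```python
def findstr_lines(data: bytes):
    """Yield (line_number, byte_offset, raw_line) under the rules stated above.

    A line ends immediately after each LF; CR plays no part in splitting.
    A trailing piece without LF is still reported as a line.
    """
    offset = 0
    number = 1
    while offset < len(data):
        end = data.find(b"\n", offset)
        end = len(data) if end == -1 else end + 1  # keep the LF with its line
        yield number, offset, data[offset:end]
        offset = end
        number += 1


# Mixed line endings: Windows style, Unix style, and no terminator at all.
sample = b"A\r\nBB\nCCC"
for number, offset, raw in findstr_lines(sample):
    print(f"{number}:{offset}:{raw!r}")
```

Because every byte, including <CR> and <LF>, stays attached to its line, joining the raw lines reproduces the input exactly, which matches the claim that matching all lines yields a binary copy of the file.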

User · Answer

findstr sometimes hangs unexpectedly when searching large files.

I haven't confirmed the exact conditions or boundary sizes. I suspect any file larger than 2GB may be at risk.

I have had mixed experiences with this, so it is more than just file size. This looks like it may be a variation on "FINDSTR hangs on XP and Windows 7 if redirected input does not end with LF", but as demonstrated, this particular problem manifests when input is not redirected.

The following command line session (Windows 7) demonstrates how findstr can hang when searching a 3GB file.

    C:\Data\Temp\2014-04>echo 1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890> T100B.txt

    C:\Data\Temp\2014-04>for /L %i in (1,1,10) do @type T100B.txt >> T1KB.txt

    C:\Data\Temp\2014-04>for /L %i in (1,1,1000) do @type T1KB.txt >> T1MB.txt

    C:\Data\Temp\2014-04>for /L %i in (1,1,1000) do @type T1MB.txt >> T1GB.txt

    C:\Data\Temp\2014-04>echo find this line>> T1GB.txt

    C:\Data\Temp\2014-04>copy T1GB.txt + T1GB.txt + T1GB.txt T3GB.txt
    T1GB.txt
    T1GB.txt
    T1GB.txt
            1 file(s) copied.

    C:\Data\Temp\2014-04>dir
     Volume in drive C has no label.
     Volume Serial Number is D2B2-FFDF

     Directory of C:\Data\Temp\2014-04

    2014/04/08  04:28 PM    <DIR>          .
    2014/04/08  04:28 PM    <DIR>          ..
    2014/04/08  04:22 PM               102 T100B.txt
    2014/04/08  04:28 PM     1 020 000 016 T1GB.txt
    2014/04/08  04:23 PM             1 020 T1KB.txt
    2014/04/08  04:23 PM         1 020 000 T1MB.txt
    2014/04/08  04:29 PM     3 060 000 049 T3GB.txt
                   5 File(s)  4 081 021 187 bytes
                   2 Dir(s)  51 881 050 112 bytes free

    C:\Data\Temp\2014-04>rem Findstr on the 1GB file does not hang.

    C:\Data\Temp\2014-04>findstr "this" T1GB.txt
    find this line

    C:\Data\Temp\2014-04>rem On the 3GB file, findstr hangs and must be aborted... even though it clearly reaches end of file.

    C:\Data\Temp\2014-04>findstr "this" T3GB.txt
    find this line
    find this line
    find this line
    ^C
    C:\Data\Temp\2014-04>

Note: I've verified in a hex editor that all lines are terminated with CRLF. The only anomaly is that the file is terminated with 0x1A due to the way copy works. Note, however, that this anomaly doesn't cause a problem on "small" files.

With additional testing I have confirmed the following:

- Using copy with the /b option for binary files prevents the addition of the 0x1A character, and findstr doesn't hang on the 3GB file.
- Terminating the 3GB file with a different character also causes findstr to hang.
- The 0x1A character doesn't cause any problems on a "small" file. (Similarly for other terminating characters.)
- Adding CRLF after 0x1A resolves the problem. (LF by itself would probably suffice.)
- Using type to pipe the file into findstr works without hanging. (This might be due to a side effect of either type or | that inserts an additional End Of Line.)
- Using redirected input < also causes findstr to hang. But this is expected, as explained in dbenham's post: "redirected input must end in LF".
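The three workarounds confirmed above can be sketched as a short batch file. This is a minimal sketch, not a tested recipe: it assumes the T1GB.txt and T3GB.txt files from the session already exist in the current directory, and T3GB_bin.txt is a hypothetical output name introduced only for illustration.

```batch
@echo off
rem Option 1: concatenate in binary mode so COPY does not append 0x1A.
rem T3GB_bin.txt is a hypothetical name, not a file from the original session.
copy /b T1GB.txt + T1GB.txt + T1GB.txt T3GB_bin.txt >nul
findstr "this" T3GB_bin.txt

rem Option 2: the existing file already ends in 0x1A, so append CRLF after it.
rem "echo(" writes an empty line, which is just CRLF.
>>T3GB.txt echo(
findstr "this" T3GB.txt

rem Option 3: pipe the file through TYPE instead of naming it on the command line.
type T3GB.txt | findstr "this"
```

Option 1 avoids the problem at the source, while options 2 and 3 are useful when the large file already exists and cannot be rebuilt.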

User · Answer

Answer continued from part 1 above (I've run into the 30,000 character answer limit).

Limited Regular Expressions (regex) Support

FINDSTR support for regular expressions is extremely limited. If it is not in the HELP documentation, it is not supported.

Beyond that, the regex expressions that are supported are implemented in a completely non-standard manner, such that results can be different than would be expected coming from something like grep or perl.

Regex Line Position anchors ^ and $

^ matches the beginning of the input stream as well as any position immediately following a <LF>. Since FINDSTR also breaks lines after <LF>, a simple regex of "^" will always match all lines within a file, even a binary file.

$ matches any position immediately preceding a <CR>. This means that a regex search string containing $ will never match any lines within a Unix style text file, nor will it match the last line of a Windows text file if it is missing the EOL marker of <CR><LF>.

Note - As previously discussed, piped and redirected input to FINDSTR may have <CR><LF> appended that is not in the source. Obviously this can impact a regex search that uses $.

Any search string with characters before ^ or after $ will always fail to find a match.

Positional Options /B /E /X

The positional options work the same as ^ and $, except they also work for literal search strings.

/B functions the same as ^ at the start of a regex search string.

/E functions the same as $ at the end of a regex search string.

/X functions the same as having both ^ at the beginning and $ at the end of a regex search string.

Regex word boundary

\< must be the very first term in the regex. The regex will not match anything if any other characters precede it. \< corresponds to either the very beginning of the input, the beginning of a line (the position immediately following a <LF>), or the position immediately following any "non-word" character. The next character need not be a "word" character.

\> must be the very last term in the regex. The regex will not match anything if any other characters follow it. \> corresponds to either the end of input, the position immediately prior to a <CR>, or the position immediately preceding any "non-word" character. The preceding character need not be a "word" character.

Here is a complete list of "non-word" characters, represented as the decimal byte code. Note - this list was compiled on a U.S. machine. I do not know what impact other languages may have on this list.

    001   028   063   179   204   230
    002   029   064   180   205   231
    003   030   091   181   206   232
    004   031   092   182   207   233
    005   032   093   183   208   234
    006   033   094   184   209   235
    007   034   096   185   210   236
    008   035   123   186   211   237
    009   036   124   187   212   238
    011   037   125   188   213   239
    012   038   126   189   214   240
    014   039   127   190   215   241
    015   040   155   191   216   242
    016   041   156   192   217   243
    017   042   157   193   218   244
    018   043   158   194   219   245
    019   044   168   195   220   246
    020   045   169   196   221   247
    021   046   170   197   222   248
    022   047   173   198   223   249
    023   058   174   199   224   250
    024   059   175   200   226   251
    025   060   176   201   227   254
    026   061   177   202   228   255
    027   062   178   203   229

Regex character class ranges [x-y]

Character class ranges do not work as expected. See this question: "Why does findstr not handle case properly (in some circumstances)?", along with this answer: https://stackoverflow.com/a/8767815/1012053

The problem is FINDSTR does not collate the characters by their byte code value (commonly thought of as the ASCII code, but ASCII is only defined from 0x00 - 0x7F). Most regex implementations would treat [A-Z] as all upper case English letters. But FINDSTR uses a collation sequence that roughly corresponds to how SORT works. So [A-Z] includes the complete English alphabet, both upper and lower case (except for "a"), as well as non-English alpha characters with diacriticals.

Below is a complete list of all characters supported by FINDSTR, sorted in the collation sequence used by FINDSTR to establish regex character class ranges. The characters are represented as their decimal byte code value, and the list reads left to right, then top to bottom. I believe the collation sequence makes the most sense if the characters are viewed using code page 437. Note - this list was compiled on a U.S. machine. I do not know what impact other languages may have on this list.

    001 002 003 004 005 006 007 008 014 015 016 017 018 019 020 021
    022 023 024 025 026 027 028 029 030 031 127 039 045 032 255 009
    010 011 012 013 033 034 035 036 037 038 040 041 042 044 046 047
    058 059 063 064 091 092 093 094 095 096 123 124 125 126 173 168
    155 156 157 158 043 249 060 061 062 241 174 175 246 251 239 247
    240 243 242 169 244 245 254 196 205 179 186 218 213 214 201 191
    184 183 187 192 212 211 200 217 190 189 188 195 198 199 204 180
    181 182 185 194 209 210 203 193 207 208 202 197 216 215 206 223
    220 221 222 219 176 177 178 170 248 230 250 048 172 171 049 050
    253 051 052 053 054 055 056 057 236 097 065 166 160 133 131 132
    142 134 143 145 146 098 066 099 067 135 128 100 068 101 069 130
    144 138 136 137 102 070 159 103 071 104 072 105 073 161 141 140
    139 106 074 107 075 108 076 109 077 110 252 078 164 165 111 079
    167 162 149 147 148 153 112 080 113 081 114 082 115 083 225 116
    084 117 085 163 151 150 129 154 118 086 119 087 120 088 121 089
    152 122 090 224 226 235 238 233 227 229 228 231 237 232 234

Regex character class term limit and BUG

Not only is FINDSTR limited to a maximum of 15 character class terms within a regex, it fails to properly handle an attempt to exceed the limit. Using 16 or more character class terms results in an interactive Windows pop up stating "Find String (QGREP) Utility has encountered a problem and needs to close. We are sorry for the inconvenience." The message text varies slightly depending on the Windows version. Here is one example of a FINDSTR that will fail:

    echo 01234567890123456|findstr [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]

This bug was reported by DosTips user Judago. It has been confirmed on XP, Vista, and Windows 7.

Regex searches fail (and may hang indefinitely) if they include byte code 0xFF (decimal 255)

Any regex search that includes byte code 0xFF (decimal 255) will fail. It fails if byte code 0xFF is included directly, or if it is implicitly included within a character class range. Remember that FINDSTR character class ranges do not collate characters based on the byte code value. Character <0xFF> appears relatively early in the collation sequence, between the <space> and <tab> characters. So any character class range that includes both <space> and <tab> will fail.

The exact behavior changes slightly depending on the Windows version. Windows 7 hangs indefinitely if 0xFF is included. XP doesn't hang, but it always fails to find a match, and occasionally prints the following error message: "The process tried to write to a nonexistent pipe."

I no longer have access to a Vista machine, so I haven't been able to test on Vista.

Regex bug: . and [^anySet] can match End-Of-File

The regex . meta-character should only match any character other than <CR> or <LF>. There is a bug that allows it to match the End-Of-File if the last line in the file is not terminated by <CR> or <LF>. However, the . will not match an empty file.

For example, a file named "test.txt" containing a single line of x, without terminating <CR> or <LF>, will match the following:

    findstr /r x......... test.txt

This bug has been confirmed on XP and Win7.

The same seems to be true for negative character sets. Something like [^abc] will match End-Of-File. Positive character sets like [abc] seem to work fine. I have only tested this on Win7.
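The character class range behavior described in this answer is easy to trip over, so a quick check may help. The following is a minimal sketch, assuming a U.S. machine that uses the collation sequence listed in this answer. The expectations in the comments are derived from that list and are not confirmed for other locales or Windows versions.

```batch
@echo off
rem Per the collation list, "b" (098) sorts between "A" (065) and "Z" (090),
rem so this range is expected to match the lowercase line.
echo b| findstr /r "^[A-Z]$"

rem "a" (097) sorts just before "A" (065), so it is expected to fall outside the range.
echo a| findstr /r "^[A-Z]$"

rem Listing each letter explicitly does not depend on the collation order.
echo b| findstr /r "^[ABCDEFGHIJKLMNOPQRSTUVWXYZ]$"
```

Enumerating the characters is more verbose than a range, but it sidesteps the collation sequence entirely, which also keeps 0xFF out of the class.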

User · Answer

FINDSTR has a color bug that I described and solved at https://superuser.com/questions/1535810/is-there-a-better-way-to-mitigate-this-obscure-color-bug-when-piping-to-findstr/1538802?noredirect=1#comment2339443_1538802

To summarize that thread: if input is piped to FINDSTR within a parenthesized block of code, inline ANSI escape colorcodes stop working in commands executed later. An example of inline colorcodes is:

    echo %magenta%Alert: Something bad happened%yellow%

(where magenta and yellow are vars defined earlier in the .bat file as the corresponding ANSI escape colorcodes).

My initial solution was to call a do-nothing subroutine after the FINDSTR. Somehow the call or the return "resets" whatever needs to be reset.

Later I discovered another solution that presumably is more efficient: place the FINDSTR phrase within parentheses, as in the following example:

    echo success | ( FINDSTR /R success )

Placing the FINDSTR phrase within a nested block of code appears to isolate FINDSTR's colorcode bug so it won't affect what's outside the nested block. Perhaps this technique will solve some other undesired FINDSTR side effects too.
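For context, here is a minimal sketch of how the parenthesized workaround might sit inside a .bat file. Several parts are assumptions rather than details from the post above: the FOR /F line is a commonly used trick for capturing the ESC character, the SGR codes 35 and 33 stand in for whatever the magenta and yellow variables actually held, and the console is assumed to support ANSI escape sequences (Windows 10 or later).

```batch
@echo off
rem Assumption: capture the ESC character with the PROMPT $E trick.
for /f %%a in ('echo prompt $E ^| cmd') do set "ESC=%%a"

rem Assumption: standard SGR codes for magenta and yellow foreground.
set "magenta=%ESC%[35m"
set "yellow=%ESC%[33m"

rem FINDSTR is wrapped in its own nested block, as suggested in the answer,
rem so the colored ECHO that follows in the same outer block is not affected.
(
  echo success | ( findstr /R success )
  echo %magenta%Alert: Something bad happened%yellow%
)
```

Removing the inner parentheses around findstr should reproduce the layout in which the bug was reported, which makes this a convenient before-and-after comparison.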

User · Answer

/D tip for multiple directories: put your directory list before the search string. These all work:

    findstr /D:dir1;dir2 "searchString" *.*
    findstr /D:"dir1;dir2" "searchString" *.*
    findstr /D:"\path\dir1\;\path\dir2\" "searchString" *.*

As expected, the path is relative to the current location if you don't start the directories with \. Surrounding the path with " is optional if there are no spaces in the directory names. The ending \ is optional. The location shown in the output will include whatever path you give it. It will work with or without surrounding the directory list with ".
