[batch-file] What are the undocumented features and limitations of the Windows FINDSTR command?

The Windows FINDSTR command is horribly documented. There is very basic command line help available through FINDSTR /?, or HELP FINDSTR, but it is woefully inadequate. There is a wee bit more documentation online at https://docs.microsoft.com/en-us/windows-server/administration/windows-commands/findstr.

There are many FINDSTR features and limitations that are not even hinted at in the documentation. Nor could they be anticipated without prior knowledge and/or careful experimentation.

So the question is - What are the undocumented FINDSTR features and limitations?

The purpose of this question is to provide a one stop repository of the many undocumented features so that:

A) Developers can take full advantage of the features that are there.

B) Developers don't waste their time wondering why something doesn't work when it seems like it should.

Please make sure you know the existing documentation before responding. If the information is covered by the HELP, then it does not belong here.

Neither is this a place to show interesting uses of FINDSTR. If a logical person could anticipate the behavior of a particular usage of FINDSTR based on the documentation, then it does not belong here.

Along the same lines, if a logical person could anticipate the behavior of a particular usage based on information contained in any existing answers, then again, it does not belong here.

This question is related to batch-file cmd findstr

The answer is


I'd like to report a bug regarding the section Source of data to search in the first answer when using en dash (–) or em dash (—) within the filename.

More specifically, if you are about to use the first option - filenames specified as arguments, the file won't be found. As soon as you use either option 2 - stdin via redirection or 3 - data stream from a pipe, findstr will find the file.

For example, this simple batch script:

echo off
chcp 1250 > nul
set INTEXTFILE1=filename with – dash.txt
set INTEXTFILE2=filename with — dash.txt

rem 3 way of findstr use with en dashed filename
echo.
echo Filename with en dash:
echo.
echo 1. As argument
findstr . "%INTEXTFILE1%"
echo.
echo 2. As stdin via redirection
findstr . < "%INTEXTFILE1%"
echo.
echo 3. As datastream from a pipe
type "%INTEXTFILE1%" | findstr .
echo.
echo.
rem The same set of operations with em dashed filename
echo Filename with em dash:
echo.
echo 1. As argument
findstr . "%INTEXTFILE2%"
echo.
echo 2. As stdin via redirection
findstr . < "%INTEXTFILE2%"
echo.
echo 3. As datastream from a pipe
type "%INTEXTFILE2%" | findstr .
echo.

pause

will print:

Filename with en dash:

  1. As argument
    FINDSTR: Cannot open filename with - dash.txt

  2. As stdin via redirection
    I am the file with an en dash.

  3. As datastream from a pipe
    I am the file with an en dash.

Filename with em dash:

  1. As argument
    FINDSTR: Cannot open filename with - dash.txt

  2. As stdin via redirection
    I am the file with an em dash.

  3. As datastream from a pipe
    I am the file with an em dash.

Hope it helps.

M.


/D tip for multiple directories: put your directory list before the search string. These all work:

findstr /D:dir1;dir2 "searchString" *.*
findstr /D:"dir1;dir2" "searchString" *.*
findstr /D:"\path\dir1\;\path\dir2\" "searchString" *.*

As expected, the path is relative to location if you don't start the directories with \. Surrounding the path with " is optional if there are no spaces in the directory names. The ending \ is optional. The output of location will include whatever path you give it. It will work with or without surrounding the directory list with ".


The findstr command sets the ErrorLevel (or exit code) to one of the following values, given that there are no invalid or incompatible switches and no search string exceeds the applicable length limit:

  • 0 when at least a single match is encountered in one line throughout all specified files;
  • 1 otherwise;

A line is considered to contain a match when:

  • no /V option is given and the search expression occurs at least once;
  • the /V option is given and the search expression does not occur;

This means that the /V option also changes the returned ErrorLevel, but it does not just revert it!

For example, when you have got a file test.txt with two lines, one of which contains the string text but the other one does not, both findstr "text" "test.txt" and findstr /V "text" "test.txt" return an ErrorLevel of 0.

Basically you can say: if findstr returns at least a line, ErrorLevel is set to 0, else to 1.

Note that the /M option does not affect the ErrorLevel value, it just alters the output.

(Just for the sake of completeness: the find command behaves exactly the same way with respect to the /V option and ErrorLevel; the /C option does not affect ErrorLevel.)


Answer continued from part 1 above - I've run into the 30,000 character answer limit :-(

Limited Regular Expressions (regex) Support
FINDSTR support for regular expressions is extremely limited. If it is not in the HELP documentation, it is not supported.

Beyond that, the regex expressions that are supported are implemented in a completely non-standard manner, such that results can be different then would be expected coming from something like grep or perl.

Regex Line Position anchors ^ and $
^ matches beginning of input stream as well as any position immediately following a <LF>. Since FINDSTR also breaks lines after <LF>, a simple regex of "^" will always match all lines within a file, even a binary file.

$ matches any position immediately preceding a <CR>. This means that a regex search string containing $ will never match any lines within a Unix style text file, nor will it match the last line of a Windows text file if it is missing the EOL marker of <CR><LF>.

Note - As previously discussed, piped and redirected input to FINDSTR may have <CR><LF> appended that is not in the source. Obviously this can impact a regex search that uses $.

Any search string with characters before ^ or after $ will always fail to find a match.

Positional Options /B /E /X
The positional options work the same as ^ and $, except they also work for literal search strings.

/B functions the same as ^ at the start of a regex search string.

/E functions the same as $ at the end of a regex search string.

/X functions the same as having both ^ at the beginning and $ at the end of a regex search string.

Regex word boundary
\< must be the very first term in the regex. The regex will not match anything if any other characters precede it. \< corresponds to either the very beginning of the input, the beginning of a line (the position immediately following a <LF>), or the position immediately following any "non-word" character. The next character need not be a "word" character.

\> must be the very last term in the regex. The regex will not match anything if any other characters follow it. \> corresponds to either the end of input, the position immediately prior to a <CR>, or the position immediately preceding any "non-word" character. The preceding character need not be a "word" character.

Here is a complete list of "non-word" characters, represented as the decimal byte code. Note - this list was compiled on a U.S machine. I do not know what impact other languages may have on this list.

001   028   063   179   204   230
002   029   064   180   205   231
003   030   091   181   206   232
004   031   092   182   207   233
005   032   093   183   208   234
006   033   094   184   209   235
007   034   096   185   210   236
008   035   123   186   211   237
009   036   124   187   212   238
011   037   125   188   213   239
012   038   126   189   214   240
014   039   127   190   215   241
015   040   155   191   216   242
016   041   156   192   217   243
017   042   157   193   218   244
018   043   158   194   219   245
019   044   168   195   220   246
020   045   169   196   221   247
021   046   170   197   222   248
022   047   173   198   223   249
023   058   174   199   224   250
024   059   175   200   226   251
025   060   176   201   227   254
026   061   177   202   228   255
027   062   178   203   229

Regex character class ranges [x-y]
Character class ranges do not work as expected. See this question: Why does findstr not handle case properly (in some circumstances)?, along with this answer: https://stackoverflow.com/a/8767815/1012053.

The problem is FINDSTR does not collate the characters by their byte code value (commonly thought of as the ASCII code, but ASCII is only defined from 0x00 - 0x7F). Most regex implementations would treat [A-Z] as all upper case English capital letters. But FINDSTR uses a collation sequence that roughly corresponds to how SORT works. So [A-Z] includes the complete English alphabet, both upper and lower case (except for "a"), as well as non-English alpha characters with diacriticals.

Below is a complete list of all characters supported by FINDSTR, sorted in the collation sequence used by FINDSTR to establish regex character class ranges. The characters are represented as their decimal byte code value. I believe the collation sequence makes the most sense if the characters are viewed using code page 437. Note - this list was compiled on a U.S machine. I do not know what impact other languages may have on this list.

001
002
003
004
005
006
007
008
014
015
016
017
018           
019
020
021
022
023
024
025
026
027
028
029
030
031
127
039
045
032
255
009
010
011
012
013
033
034
035
036
037
038
040
041
042
044
046
047
058
059
063
064
091
092
093
094
095
096
123
124
125
126
173
168
155
156
157
158
043
249
060
061
062
241
174
175
246
251
239
247
240
243
242
169
244
245
254
196
205
179
186
218
213
214
201
191
184
183
187
192
212
211
200
217
190
189
188
195
198
199
204
180
181
182
185
194
209
210
203
193
207
208
202
197
216
215
206
223
220
221
222
219
176
177
178
170
248
230
250
048
172
171
049
050
253
051
052
053
054
055
056
057
236
097
065
166
160
133
131
132
142
134
143
145
146
098
066
099
067
135
128
100
068
101
069
130
144
138
136
137
102
070
159
103
071
104
072
105
073
161
141
140
139
106
074
107
075
108
076
109
077
110
252
078
164
165
111
079
167
162
149
147
148
153
112
080
113
081
114
082
115
083
225
116
084
117
085
163
151
150
129
154
118
086
119
087
120
088
121
089
152
122
090
224
226
235
238
233
227
229
228
231
237
232
234

Regex character class term limit and BUG
Not only is FINDSTR limited to a maximum of 15 character class terms within a regex, it fails to properly handle an attempt to exceed the limit. Using 16 or more character class terms results in an interactive Windows pop up stating "Find String (QGREP) Utility has encountered a problem and needs to close. We are sorry for the inconvenience." The message text varies slightly depending on the Windows version. Here is one example of a FINDSTR that will fail:

echo 01234567890123456|findstr [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]

This bug was reported by DosTips user Judago here. It has been confirmed on XP, Vista, and Windows 7.

Regex searches fail (and may hang indefinitely) if they include byte code 0xFF (decimal 255)
Any regex search that includes byte code 0xFF (decimal 255) will fail. It fails if byte code 0xFF is included directly, or if it is implicitly included within a character class range. Remember that FINDSTR character class ranges do not collate characters based on the byte code value. Character <0xFF> appears relatively early in the collation sequence between the <space> and <tab> characters. So any character class range that includes both <space> and <tab> will fail.

The exact behavior changes slightly depending on the Windows version. Windows 7 hangs indefinitely if 0xFF is included. XP doesn't hang, but it always fails to find a match, and occasionally prints the following error message - "The process tried to write to a nonexistent pipe."

I no longer have access to a Vista machine, so I haven't been able to test on Vista.

Regex bug: . and [^anySet] can match End-Of-File
The regex . meta-character should only match any character other than <CR> or <LF>. There is a bug that allows it to match the End-Of-File if the last line in the file is not terminated by <CR> or <LF>. However, the . will not match an empty file.

For example, a file named "test.txt" containing a single line of x, without terminating <CR> or <LF>, will match the following:

findstr /r x......... test.txt

This bug has been confirmed on XP and Win7.

The same seems to be true for negative character sets. Something like [^abc] will match End-Of-File. Positive character sets like [abc] seem to work fine. I have only tested this on Win7.


When several commands are enclosed in parentheses and there are redirected files to the whole block:

< input.txt (
   command1
   command2
   . . .
) > output.txt

... then the files remains open as long as the commands in the block be active, so the commands may move the file pointer of the redirected files. Both MORE and FIND commands move the Stdin file pointer to the beginning of the file before process it, so the same file may be processed several times inside the block. For example, this code:

more < input.txt >  output.txt
more < input.txt >> output.txt

... produce the same result than this one:

< input.txt (
   more
   more
) > output.txt

This code:

find    "search string" < input.txt > matchedLines.txt
find /V "search string" < input.txt > unmatchedLines.txt

... produce the same result than this one:

< input.txt (
   find    "search string" > matchedLines.txt
   find /V "search string" > unmatchedLines.txt
)

FINDSTR is different; it does not move the Stdin file pointer from its current position. For example, this code insert a new line after a search line:

call :ProcessFile < input.txt
goto :EOF

:ProcessFile
   rem Read the next line from Stdin and copy it
   set /P line=
   echo %line%
   rem Test if it is the search line
   if "%line%" neq "search line" goto ProcessFile
rem Insert the new line at this point
echo New line
rem And copy the rest of lines
findstr "^"
exit /B

We may make good use of this feature with the aid of an auxiliary program that allow us to move the file pointer of a redirected file, as shown in this example.

This behavior was first reported by jeb at this post.


EDIT 2018-08-18: New FINDSTR bug reported

The FINDSTR command have a strange bug that happen when this command is used to show characters in color AND the output of such a command is redirected to CON device. For details on how use FINDSTR command to show text in color, see this topic.

When the output of this form of FINDSTR command is redirected to CON, something strange happens after the text is output in the desired color: all the text after it is output as "invisible" characters, although a more precise description is that the text is output as black text over black background. The original text will appear if you use COLOR command to reset the foreground and background colors of the entire screen. However, when the text is "invisible" we could execute a SET /P command, so all characters entered will not appear on the screen. This behavior may be used to enter passwords.

@echo off
setlocal

set /P "=_" < NUL > "Enter password"
findstr /A:1E /V "^$" "Enter password" NUL > CON
del "Enter password"
set /P "password="
cls
color 07
echo The password read is: "%password%"

FINDSTR has a color bug that I described and solved at https://superuser.com/questions/1535810/is-there-a-better-way-to-mitigate-this-obscure-color-bug-when-piping-to-findstr/1538802?noredirect=1#comment2339443_1538802

To summarize that thread, the bug is that if input is piped to FINDSTR within a parenthesized block of code, inline ANSI escape colorcodes stop working in commands executed later. An example of inline colorcodes is: echo %magenta%Alert: Something bad happened%yellow% (where magenta and yellow are vars defined earlier in the .bat file as the corresponding ANSI escape colorcodes).

My initial solution was to call a do-nothing subroutine after the FINDSTR. Somehow the call or the return "resets" whatever needs to be reset.

Later I discovered another solution that presumably is more efficient: place the FINDSTR phrase within parentheses, as in the following example: echo success | ( FINDSTR /R success ) Placing the FINDSTR phrase within a nested block of code appears to isolate FINDSTR's colorcode bug so it won't affect what's outside the nested block. Perhaps this technique will solve some other undesired FINDSTR side effects too.


findstr sometimes hangs unexpectedly when searching large files.

I haven't confirmed the exact conditions or boundary sizes. I suspect any file larger 2GB may be at risk.

I have had mixed experiences with this, so it is more than just file size. This looks like it may be a variation on FINDSTR hangs on XP and Windows 7 if redirected input does not end with LF, but as demonstrated this particular problem manifests when input is not redirected.

The following command line session (Windows 7) demonstrates how findstr can hang when searching a 3GB file.

C:\Data\Temp\2014-04>echo 1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890> T100B.txt

C:\Data\Temp\2014-04>for /L %i in (1,1,10) do @type T100B.txt >> T1KB.txt

C:\Data\Temp\2014-04>for /L %i in (1,1,1000) do @type T1KB.txt >> T1MB.txt

C:\Data\Temp\2014-04>for /L %i in (1,1,1000) do @type T1MB.txt >> T1GB.txt

C:\Data\Temp\2014-04>echo find this line>> T1GB.txt

C:\Data\Temp\2014-04>copy T1GB.txt + T1GB.txt + T1GB.txt T3GB.txt
T1GB.txt
T1GB.txt
T1GB.txt
        1 file(s) copied.

C:\Data\Temp\2014-04>dir
 Volume in drive C has no label.
 Volume Serial Number is D2B2-FFDF

 Directory of C:\Data\Temp\2014-04

2014/04/08  04:28 PM    <DIR>          .
2014/04/08  04:28 PM    <DIR>          ..
2014/04/08  04:22 PM               102 T100B.txt
2014/04/08  04:28 PM     1 020 000 016 T1GB.txt
2014/04/08  04:23 PM             1 020 T1KB.txt
2014/04/08  04:23 PM         1 020 000 T1MB.txt
2014/04/08  04:29 PM     3 060 000 049 T3GB.txt
               5 File(s)  4 081 021 187 bytes
               2 Dir(s)  51 881 050 112 bytes free
C:\Data\Temp\2014-04>rem Findstr on the 1GB file does not hang

C:\Data\Temp\2014-04>findstr "this" T1GB.txt
find this line

C:\Data\Temp\2014-04>rem On the 3GB file, findstr hangs and must be aborted... even though it clearly reaches end of file

C:\Data\Temp\2014-04>findstr "this" T3GB.txt
find this line
find this line
find this line
^C
C:\Data\Temp\2014-04>

Note, I've verified in a hex editor that all lines are terminated with CRLF. The only anomaly is that the file is terminated with 0x1A due to the way copy works. Note however, that this anomaly doesn't cause a problem on "small" files.

With additional testing I have confirmed the following:

  • Using copy with the /b option for binary files prevents the addition of the 0x1A character, and findstr doesn't hang on the 3GB file.
  • Terminating the 3GB file with a different character also causes a findstr to hang.
  • The 0x1A character doesn't cause any problems on a "small" file. (Similarly for other terminating characters.)
  • Adding CRLF after 0x1A resolves the problem. (LF by itself would probably suffice.)
  • Using type to pipe the file into findstr works without hanging. (This might be due to a side effect of either type or | that inserts an additional End Of Line.)
  • Use redirected input < also causes findstr to hang. But this is expected; as explained in dbenham's post: "redirected input must end in LF".

Examples related to batch-file

'ls' is not recognized as an internal or external command, operable program or batch file '' is not recognized as an internal or external command, operable program or batch file XCOPY: Overwrite all without prompt in BATCH Can´t run .bat file under windows 10 Execute a batch file on a remote PC using a batch file on local PC Windows batch - concatenate multiple text files into one How do I create a shortcut via command-line in Windows? Getting Error:JRE_HOME variable is not defined correctly when trying to run startup.bat of Apache-Tomcat Curl not recognized as an internal or external command, operable program or batch file Best way to script remote SSH commands in Batch (Windows)

Examples related to cmd

'ls' is not recognized as an internal or external command, operable program or batch file '' is not recognized as an internal or external command, operable program or batch file XCOPY: Overwrite all without prompt in BATCH VSCode Change Default Terminal How to install pandas from pip on windows cmd? 'ls' in CMD on Windows is not recognized Command to run a .bat file VMware Workstation and Device/Credential Guard are not compatible How do I kill the process currently using a port on localhost in Windows? how to run python files in windows command prompt?

Examples related to findstr

What are the undocumented features and limitations of the Windows FINDSTR command?