The Windows FINDSTR command is horribly documented. There is very basic command line help available through FINDSTR /?, or HELP FINDSTR, but it is woefully inadequate. There is a wee bit more documentation online at https://docs.microsoft.com/en-us/windows-server/administration/windows-commands/findstr.
There are many FINDSTR features and limitations that are not even hinted at in the documentation. Nor could they be anticipated without prior knowledge and/or careful experimentation.
So the question is - What are the undocumented FINDSTR features and limitations?
The purpose of this question is to provide a one stop repository of the many undocumented features so that:
A) Developers can take full advantage of the features that are there.
B) Developers don't waste their time wondering why something doesn't work when it seems like it should.
Please make sure you know the existing documentation before responding. If the information is covered by the HELP, then it does not belong here.
Neither is this a place to show interesting uses of FINDSTR. If a logical person could anticipate the behavior of a particular usage of FINDSTR based on the documentation, then it does not belong here.
Along the same lines, if a logical person could anticipate the behavior of a particular usage based on information contained in any existing answers, then again, it does not belong here.
I'd like to report a bug related to the section "Source of data to search" in the first answer. It occurs when the filename contains an en dash (–) or em dash (—).
More specifically, with the first option (filenames specified as arguments), the file won't be found. With option 2 (stdin via redirection) or option 3 (data stream from a pipe), findstr will find the file.
For example, this simple batch script:
@echo off
chcp 1250 > nul
set INTEXTFILE1=filename with – dash.txt
set INTEXTFILE2=filename with — dash.txt
rem Three ways to use findstr with the en-dashed filename
echo.
echo Filename with en dash:
echo.
echo 1. As argument
findstr . "%INTEXTFILE1%"
echo.
echo 2. As stdin via redirection
findstr . < "%INTEXTFILE1%"
echo.
echo 3. As datastream from a pipe
type "%INTEXTFILE1%" | findstr .
echo.
echo.
rem The same set of operations with em dashed filename
echo Filename with em dash:
echo.
echo 1. As argument
findstr . "%INTEXTFILE2%"
echo.
echo 2. As stdin via redirection
findstr . < "%INTEXTFILE2%"
echo.
echo 3. As datastream from a pipe
type "%INTEXTFILE2%" | findstr .
echo.
pause
will print:
Filename with en dash:
1. As argument
FINDSTR: Cannot open filename with - dash.txt
2. As stdin via redirection
I am the file with an en dash.
3. As datastream from a pipe
I am the file with an en dash.
Filename with em dash:
1. As argument
FINDSTR: Cannot open filename with - dash.txt
2. As stdin via redirection
I am the file with an em dash.
3. As datastream from a pipe
I am the file with an em dash.
Hope it helps.
M.
/D tip for multiple directories: put your directory list before the search string. These all work:
findstr /D:dir1;dir2 "searchString" *.*
findstr /D:"dir1;dir2" "searchString" *.*
findstr /D:"\path\dir1\;\path\dir2\" "searchString" *.*
As expected, the paths are relative to the current location if the directories do not start with \. Surrounding the directory list with " is optional if there are no spaces in the directory names. The trailing \ is optional. The file locations in the output include whatever path you give.
The findstr command sets the ErrorLevel (or exit code) to one of the following values, given that there are no invalid or incompatible switches and no search string exceeds the applicable length limit:
0 when at least a single match is encountered in one line throughout all specified files;
1 otherwise.
A line is considered to contain a match when:
no /V option is given and the search expression occurs at least once;
the /V option is given and the search expression does not occur.
This means that the /V option also changes the returned ErrorLevel, but it does not simply invert it!
For example, when you have got a file test.txt with two lines, one of which contains the string text but the other one does not, both findstr "text" "test.txt" and findstr /V "text" "test.txt" return an ErrorLevel of 0.
Basically you can say: if findstr returns at least a line, ErrorLevel is set to 0, else to 1.
Note that the /M option does not affect the ErrorLevel value, it just alters the output.
(Just for the sake of completeness: the find command behaves exactly the same way with respect to the /V option and ErrorLevel; the /C option does not affect ErrorLevel.)
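The rule above can be checked with a short batch sketch. It assumes a writable %TEMP% folder, and the scratch file name findstr_errorlevel_demo.txt is made up for illustration. Going by the description above, the first two commands should report 0 and the last one 1.

```batch
@echo off
setlocal
rem Hypothetical scratch file: one line contains "text", the other does not.
set "demo=%TEMP%\findstr_errorlevel_demo.txt"
> "%demo%" (
  echo some text here
  echo nothing to see
)

rem Without /V: the first line is output, so at least one line is returned.
findstr "text" "%demo%" > nul
echo Without /V: ErrorLevel=%errorlevel%

rem With /V: the second line is output, so at least one line is returned again.
findstr /V "text" "%demo%" > nul
echo With /V:    ErrorLevel=%errorlevel%

rem A search string that occurs nowhere returns no lines at all.
findstr "absent" "%demo%" > nul
echo No match:   ErrorLevel=%errorlevel%

del "%demo%"
```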
Answer continued from part 1 above - I've run into the 30,000 character answer limit :-(
Limited Regular Expressions (regex) Support
FINDSTR support for regular expressions is extremely limited. If it is not in the HELP documentation, it is not supported.
Beyond that, the regular expressions that are supported are implemented in a completely non-standard manner, such that results can be different than would be expected coming from something like grep or perl.
Regex Line Position anchors ^ and $
^ matches beginning of input stream as well as any position immediately following a <LF>. Since FINDSTR also breaks lines after <LF>, a simple regex of "^" will always match all lines within a file, even a binary file.
$ matches any position immediately preceding a <CR>. This means that a regex search string containing $ will never match any lines within a Unix style text file, nor will it match the last line of a Windows text file if it is missing the EOL marker of <CR><LF>.
Note - As previously discussed, piped and redirected input to FINDSTR may have <CR><LF> appended that is not in the source. Obviously this can impact a regex search that uses $.
Any search string with characters before ^ or after $ will always fail to find a match.
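A minimal sketch of the $ behavior, assuming a writable %TEMP% folder (both scratch file names are made up for illustration). The files are passed as arguments rather than piped, so nothing is appended to them. Going by the description above, the first search should return ErrorLevel 0 and the second ErrorLevel 1.

```batch
@echo off
setlocal
set "f1=%TEMP%\findstr_eol_crlf.txt"
set "f2=%TEMP%\findstr_eol_none.txt"

rem ECHO terminates the line with <CR><LF>.
> "%f1%" echo abc
rem SET /P with NUL as input writes the prompt text without any line terminator.
> "%f2%" <nul set /p "=abc"

rem The line is followed by <CR>, so $ has a position to match.
findstr /R "abc$" "%f1%" > nul
echo Terminated line:   ErrorLevel=%errorlevel%

rem No <CR> follows the last line, so $ has nothing to match.
findstr /R "abc$" "%f2%" > nul
echo Unterminated line: ErrorLevel=%errorlevel%

del "%f1%" "%f2%"
```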
Positional Options /B /E /X
The positional options work the same as ^ and $, except they also work for literal search strings.
/B functions the same as ^ at the start of a regex search string.
/E functions the same as $ at the end of a regex search string.
/X functions the same as having both ^ at the beginning and $ at the end of a regex search string.
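The following sketch pairs the positional options with literal searches (/L). The sample strings are arbitrary. Note that there is no space before the pipe, because ECHO would otherwise pass a trailing space on to FINDSTR and defeat /E and /X.

```batch
@echo off
rem /X: the whole line must equal the search string.
echo abc| findstr /L /X "abc" > nul
echo Option X, input abc:  ErrorLevel=%errorlevel%
echo abcd| findstr /L /X "abc" > nul
echo Option X, input abcd: ErrorLevel=%errorlevel%

rem /B: the search string must sit at the beginning of the line.
echo abcd| findstr /L /B "abc" > nul
echo Option B, input abcd: ErrorLevel=%errorlevel%

rem /E: the search string must sit at the end of the line.
echo xabc| findstr /L /E "abc" > nul
echo Option E, input xabc: ErrorLevel=%errorlevel%
echo abcd| findstr /L /E "abc" > nul
echo Option E, input abcd: ErrorLevel=%errorlevel%
```

ErrorLevel 0 means the line was matched, 1 means it was not.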
Regex word boundary
\< must be the very first term in the regex. The regex will not match anything if any other characters precede it. \< corresponds to either the very beginning of the input, the beginning of a line (the position immediately following a <LF>), or the position immediately following any "non-word" character. The next character need not be a "word" character.
\> must be the very last term in the regex. The regex will not match anything if any other characters follow it. \> corresponds to either the end of input, the position immediately prior to a <CR>, or the position immediately preceding any "non-word" character. The preceding character need not be a "word" character.
Here is a complete list of "non-word" characters, represented as the decimal byte code. Note - this list was compiled on a U.S. machine. I do not know what impact other languages may have on this list.
001 028 063 179 204 230
002 029 064 180 205 231
003 030 091 181 206 232
004 031 092 182 207 233
005 032 093 183 208 234
006 033 094 184 209 235
007 034 096 185 210 236
008 035 123 186 211 237
009 036 124 187 212 238
011 037 125 188 213 239
012 038 126 189 214 240
014 039 127 190 215 241
015 040 155 191 216 242
016 041 156 192 217 243
017 042 157 193 218 244
018 043 158 194 219 245
019 044 168 195 220 246
020 045 169 196 221 247
021 046 170 197 222 248
022 047 173 198 223 249
023 058 174 199 224 250
024 059 175 200 226 251
025 060 176 201 227 254
026 061 177 202 228 255
027 062 178 203 229
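A small sketch of the placement rule for \< and \>, using arbitrary sample text. The searches that contain a space use /C: so that FINDSTR treats them as one search string instead of splitting them at the space. According to the description above, the first and third searches should find the line (ErrorLevel 0), and the second and fourth should not (ErrorLevel 1).

```batch
@echo off
rem \< as the very first term: "two" follows a space, which is a non-word character.
echo one two| findstr /R "\<two" > nul
echo Boundary first:            ErrorLevel=%errorlevel%

rem Characters precede \<, so the regex is expected never to match.
echo one two| findstr /R /C:"one \<two" > nul
echo Characters before boundary: ErrorLevel=%errorlevel%

rem \> as the very last term: "one" is followed by a space.
echo one two| findstr /R "one\>" > nul
echo Boundary last:             ErrorLevel=%errorlevel%

rem Characters follow \>, so the regex is expected never to match.
echo one two| findstr /R /C:"one\> two" > nul
echo Characters after boundary:  ErrorLevel=%errorlevel%
```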
Regex character class ranges [x-y]
Character class ranges do not work as expected. See this question: Why does findstr not handle case properly (in some circumstances)?, along with this answer: https://stackoverflow.com/a/8767815/1012053.
The problem is FINDSTR does not collate the characters by their byte code value (commonly thought of as the ASCII code, but ASCII is only defined from 0x00 - 0x7F). Most regex implementations would treat [A-Z] as all upper case English letters. But FINDSTR uses a collation sequence that roughly corresponds to how SORT works. So [A-Z] includes the complete English alphabet, both upper and lower case (except for "a"), as well as non-English alpha characters with diacriticals.
Below is a complete list of all characters supported by FINDSTR, sorted in the collation sequence used by FINDSTR to establish regex character class ranges. The characters are represented as their decimal byte code value. I believe the collation sequence makes the most sense if the characters are viewed using code page 437. Note - this list was compiled on a U.S. machine. I do not know what impact other languages may have on this list.
001
002
003
004
005
006
007
008
014
015
016
017
018
019
020
021
022
023
024
025
026
027
028
029
030
031
127
039
045
032
255
009
010
011
012
013
033
034
035
036
037
038
040
041
042
044
046
047
058
059
063
064
091
092
093
094
095
096
123
124
125
126
173
168
155
156
157
158
043
249
060
061
062
241
174
175
246
251
239
247
240
243
242
169
244
245
254
196
205
179
186
218
213
214
201
191
184
183
187
192
212
211
200
217
190
189
188
195
198
199
204
180
181
182
185
194
209
210
203
193
207
208
202
197
216
215
206
223
220
221
222
219
176
177
178
170
248
230
250
048
172
171
049
050
253
051
052
053
054
055
056
057
236
097
065
166
160
133
131
132
142
134
143
145
146
098
066
099
067
135
128
100
068
101
069
130
144
138
136
137
102
070
159
103
071
104
072
105
073
161
141
140
139
106
074
107
075
108
076
109
077
110
252
078
164
165
111
079
167
162
149
147
148
153
112
080
113
081
114
082
115
083
225
116
084
117
085
163
151
150
129
154
118
086
119
087
120
088
121
089
152
122
090
224
226
235
238
233
227
229
228
231
237
232
234
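The effect of this collation sequence on [A-Z] can be probed with a few one-character inputs. This is only a sketch, and it assumes a machine whose collation matches the list above. In that list, 097 (a) sorts before 065 (A), while 098 (b) falls between 065 (A) and 090 (Z), so the lower case b is expected to match and the lower case a is not.

```batch
@echo off
rem Upper case B lies inside the range under any interpretation.
echo B| findstr /R "[A-Z]" > nul
echo Upper case B against [A-Z]: ErrorLevel=%errorlevel%

rem Lower case b collates between A and Z, so it is expected to match.
echo b| findstr /R "[A-Z]" > nul
echo Lower case b against [A-Z]: ErrorLevel=%errorlevel%

rem Lower case a collates just before A, so it is expected not to match.
echo a| findstr /R "[A-Z]" > nul
echo Lower case a against [A-Z]: ErrorLevel=%errorlevel%
```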
Regex character class term limit and BUG
Not only is FINDSTR limited to a maximum of 15 character class terms within a regex, it fails to properly handle an attempt to exceed the limit. Using 16 or more character class terms results in an interactive Windows pop up stating "Find String (QGREP) Utility has encountered a problem and needs to close. We are sorry for the inconvenience." The message text varies slightly depending on the Windows version. Here is one example of a FINDSTR that will fail:
echo 01234567890123456|findstr [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]
This bug was reported by DosTips user Judago here. It has been confirmed on XP, Vista, and Windows 7.
Regex searches fail (and may hang indefinitely) if they include byte code 0xFF (decimal 255)
Any regex search that includes byte code 0xFF (decimal 255) will fail. It fails if byte code 0xFF is included directly, or if it is implicitly included within a character class range. Remember that FINDSTR character class ranges do not collate characters based on the byte code value. Character <0xFF> appears relatively early in the collation sequence between the <space> and <tab> characters. So any character class range that includes both <space> and <tab> will fail.
The exact behavior changes slightly depending on the Windows version. Windows 7 hangs indefinitely if 0xFF is included. XP doesn't hang, but it always fails to find a match, and occasionally prints the following error message - "The process tried to write to a nonexistent pipe."
I no longer have access to a Vista machine, so I haven't been able to test on Vista.
Regex bug: . and [^anySet] can match End-Of-File
The regex . meta-character should only match any character other than <CR> or <LF>. There is a bug that allows it to match the End-Of-File if the last line in the file is not terminated by <CR> or <LF>. However, the . will not match an empty file.
For example, a file named "test.txt" containing a single line of x, without terminating <CR> or <LF>, will match the following:
findstr /r x......... test.txt
This bug has been confirmed on XP and Win7.
The same seems to be true for negative character sets. Something like [^abc] will match End-Of-File. Positive character sets like [abc] seem to work fine. I have only tested this on Win7.
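To reproduce the test file for the . case, the unterminated line can be written with SET /P. This is a sketch only: the scratch file name is made up, and the outcome presumably depends on the Windows version, since the bug is confirmed above only for XP and Win7. An ErrorLevel of 0 would indicate that the bug is present.

```batch
@echo off
setlocal
set "f=%TEMP%\findstr_eof_demo.txt"

rem SET /P with NUL as input writes "x" without any <CR> or <LF>.
> "%f%" <nul set /p "=x"

rem The regex asks for x followed by nine more characters, yet the file holds one byte.
findstr /R "x........." "%f%"
echo ErrorLevel=%errorlevel%

del "%f%"
```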
When several commands are enclosed in parentheses and there are redirected files to the whole block:
< input.txt (
command1
command2
. . .
) > output.txt
... then the files remain open as long as the commands in the block are active, so the commands may move the file pointer of the redirected files. Both the MORE and FIND commands move the Stdin file pointer to the beginning of the file before processing it, so the same file may be processed several times inside the block. For example, this code:
more < input.txt > output.txt
more < input.txt >> output.txt
... produces the same result as this one:
< input.txt (
more
more
) > output.txt
This code:
find "search string" < input.txt > matchedLines.txt
find /V "search string" < input.txt > unmatchedLines.txt
... produces the same result as this one:
< input.txt (
find "search string" > matchedLines.txt
find /V "search string" > unmatchedLines.txt
)
FINDSTR is different; it does not move the Stdin file pointer from its current position. For example, this code inserts a new line after a search line:
call :ProcessFile < input.txt
goto :EOF
:ProcessFile
rem Read the next line from Stdin and copy it
set /P line=
echo %line%
rem Test if it is the search line
if "%line%" neq "search line" goto ProcessFile
rem Insert the new line at this point
echo New line
rem And copy the rest of lines
findstr "^"
exit /B
We may make good use of this feature with the aid of an auxiliary program that allows us to move the file pointer of a redirected file, as shown in this example.
This behavior was first reported by jeb at this post.
EDIT 2018-08-18: New FINDSTR bug reported
The FINDSTR command has a strange bug that happens when the command is used to show characters in color AND its output is redirected to the CON device. For details on how to use the FINDSTR command to show text in color, see this topic.
When the output of this form of the FINDSTR command is redirected to CON, something strange happens after the text is output in the desired color: all the text after it is output as "invisible" characters, although a more precise description is that the text is output as black text over a black background. The original text will appear if you use the COLOR command to reset the foreground and background colors of the entire screen. However, while the text is "invisible" we can execute a SET /P command, and none of the characters entered will appear on the screen. This behavior may be used to enter passwords.
@echo off
setlocal
set /P "=_" < NUL > "Enter password"
findstr /A:1E /V "^$" "Enter password" NUL > CON
del "Enter password"
set /P "password="
cls
color 07
echo The password read is: "%password%"
FINDSTR has a color bug that I described and solved at https://superuser.com/questions/1535810/is-there-a-better-way-to-mitigate-this-obscure-color-bug-when-piping-to-findstr/1538802?noredirect=1#comment2339443_1538802
To summarize that thread, the bug is that if input is piped to FINDSTR within a parenthesized block of code, inline ANSI escape colorcodes stop working in commands executed later. An example of inline colorcodes is: echo %magenta%Alert: Something bad happened%yellow% (where magenta and yellow are vars defined earlier in the .bat file as the corresponding ANSI escape colorcodes).
My initial solution was to call a do-nothing subroutine after the FINDSTR. Somehow the call or the return "resets" whatever needs to be reset.
Later I discovered another solution that presumably is more efficient: place the FINDSTR phrase within parentheses, as in the following example:
echo success | ( FINDSTR /R success )
Placing the FINDSTR phrase within a nested block of code appears to isolate FINDSTR's colorcode bug so it won't affect what's outside the nested block. Perhaps this technique will solve some other undesired FINDSTR side effects too.
findstr sometimes hangs unexpectedly when searching large files.
I haven't confirmed the exact conditions or boundary sizes. I suspect any file larger than 2GB may be at risk.
I have had mixed experiences with this, so it is more than just file size. This looks like it may be a variation on FINDSTR hangs on XP and Windows 7 if redirected input does not end with LF, but as demonstrated this particular problem manifests when input is not redirected.
The following command line session (Windows 7) demonstrates how findstr can hang when searching a 3GB file.
C:\Data\Temp\2014-04>echo 1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890> T100B.txt
C:\Data\Temp\2014-04>for /L %i in (1,1,10) do @type T100B.txt >> T1KB.txt
C:\Data\Temp\2014-04>for /L %i in (1,1,1000) do @type T1KB.txt >> T1MB.txt
C:\Data\Temp\2014-04>for /L %i in (1,1,1000) do @type T1MB.txt >> T1GB.txt
C:\Data\Temp\2014-04>echo find this line>> T1GB.txt
C:\Data\Temp\2014-04>copy T1GB.txt + T1GB.txt + T1GB.txt T3GB.txt
T1GB.txt
T1GB.txt
T1GB.txt
1 file(s) copied.
C:\Data\Temp\2014-04>dir
Volume in drive C has no label.
Volume Serial Number is D2B2-FFDF
Directory of C:\Data\Temp\2014-04
2014/04/08 04:28 PM <DIR> .
2014/04/08 04:28 PM <DIR> ..
2014/04/08 04:22 PM 102 T100B.txt
2014/04/08 04:28 PM 1 020 000 016 T1GB.txt
2014/04/08 04:23 PM 1 020 T1KB.txt
2014/04/08 04:23 PM 1 020 000 T1MB.txt
2014/04/08 04:29 PM 3 060 000 049 T3GB.txt
5 File(s) 4 081 021 187 bytes
2 Dir(s) 51 881 050 112 bytes free
C:\Data\Temp\2014-04>rem Findstr on the 1GB file does not hang
C:\Data\Temp\2014-04>findstr "this" T1GB.txt
find this line
C:\Data\Temp\2014-04>rem On the 3GB file, findstr hangs and must be aborted... even though it clearly reaches end of file
C:\Data\Temp\2014-04>findstr "this" T3GB.txt
find this line
find this line
find this line
^C
C:\Data\Temp\2014-04>
Note, I've verified in a hex editor that all lines are terminated with CRLF. The only anomaly is that the file is terminated with 0x1A due to the way copy works. Note however, that this anomaly doesn't cause a problem on "small" files.
With additional testing I have confirmed the following:
Using copy with the /b option for binary files prevents the addition of the 0x1A character, and findstr doesn't hang on the 3GB file.
[...] findstr to hang.
The 0x1A character doesn't cause any problems on a "small" file. (Similarly for other terminating characters.)
Adding CRLF after 0x1A resolves the problem. (LF by itself would probably suffice.)
Using type to pipe the file into findstr works without hanging. (This might be due to a side effect of either type or | that inserts an additional End Of Line.)
Redirected input via < also causes findstr to hang. But this is expected; as explained in dbenham's post: "redirected input must end in LF".
Source: Stackoverflow.com