How do you extract IP addresses from files using a regex in a Linux shell?

Question

How can I extract part of a text with a regexp in a Linux shell? Let's say I have a file where every line contains an IP address, but in a different position. What is the simplest way to extract those IP addresses using common Unix command-line tools?

User · Answer

If you are not given a specific file and need to extract IP addresses from a whole directory tree, search recursively. The grep command searches files for text matching a given pattern and prints what it finds.

grep -roE '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' | grep -oE '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}'

-r Search recursively: the current directory and all levels of sub-directories.

-o Print only the matching string.

-E Use extended regular expressions.

Without the second grep after the pipe, each IP address would be printed together with the path of the file it was found in.
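As a possible simplification, grep's -h option suppresses the file name prefix, so a single recursive grep may be enough. The sketch below assumes a grep that supports -r, -h, -o and -E together (GNU and BSD grep do); the function name extract_ips_recursive and the sample files are made up for illustration.

```shell
# Hypothetical helper: print every dotted quad found under a directory.
# -r recurse, -h omit file names, -o print only the match, -E extended regex.
extract_ips_recursive() {
    grep -rhoE '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' "${1:-.}"
}

# Demo on a throwaway directory tree (sample data, not from the thread).
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/sub"
printf 'host 10.0.0.1 up\n' > "$demo_dir/a.log"
printf 'from 192.168.1.20 to 172.16.0.5\n' > "$demo_dir/sub/b.log"
extract_ips_recursive "$demo_dir" | sort
rm -rf "$demo_dir"
```

This also avoids a side effect of the two-step pipeline: a file path that itself contains something shaped like an IP address would be picked up by the second grep.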

User · Answer

For those who want a ready solution for getting IP addresses from an Apache log and listing how many times each IP address has visited the website, use this line:

grep -Eo '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' error.log | sort | uniq -c | sort -nr > occurences.txt

Nice method to ban hackers. Next you can:

1. Delete lines with fewer than 20 visits.
2. Using a regexp, cut everything up to the single space so you have only IP addresses.
3. Using a regexp, cut the last 1-3 digits of each IP address so you have only network addresses.
4. Add "deny from" and a space at the beginning of each line.
5. Put the result file in place as .htaccess.
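The follow-up steps listed in this answer could be chained into one pipeline. This is only a sketch: the function name deny_rules, the default threshold of 20 and the sample log lines are assumptions, and it emits Apache 2.2-style "deny from" lines with the last octet (and its dot) removed.

```shell
# Hypothetical helper: read a log on stdin and print "deny from <network>"
# for every address seen at least $1 times (default 20).
deny_rules() {
    grep -Eo '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' |
        sort | uniq -c |
        awk -v min="${1:-20}" '$1 >= min {
            sub(/\.[0-9]+$/, "", $2)   # drop the last octet
            print "deny from " $2
        }' |
        sort -u
}

# Demo with made-up log lines and a threshold of 3.
{
    for i in 1 2 3; do echo '203.0.113.9 - - "GET / HTTP/1.1" 200'; done
    echo '198.51.100.4 - - "GET / HTTP/1.1" 200'
} | deny_rules 3
```

Review the output before installing it anywhere: blocking a whole network because of one busy address may be broader than intended.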

User · Answer

I'd suggest perl. /(\d+\.\d+\.\d+\.\d+)/ should probably do the trick.

EDIT: Just to make it more like a complete program, you could do something like the following (not tested):

#!/usr/bin/perl -w
use strict;

while (<>) {
    if (/(\d+\.\d+\.\d+\.\d+)/) {
        print "$1\n";
    }
}

This handles one IP per line. If you have more than one IP per line, you need to use the /g option. man perlretut gives you a more detailed tutorial on regular expressions.

User · Answer

I have tried all the answers, but all of them had one or more problems. To list a few:

1. Some detected 123.456.789.111 as a valid IP.
2. Some don't detect 127.0.00.1 as a valid IP.
3. Some don't detect IPs that start with zero, like 08.8.8.8.

So here I post a regex that works on all the above conditions.

Note: I have extracted more than 2 million IPs without any problem with the following regex:

(?:(?:1\d\d|2[0-5][0-5]|2[0-4]\d|0?[1-9]\d|0?0?\d)\.){3}(?:1\d\d|2[0-5][0-5]|2[0-4]\d|0?[1-9]\d|0?0?\d)

User · Answer

cat ip_address.txt | grep '^[0-9]\{1,3\}[.][0-9]\{1,3\}[.][0-9]\{1,3\}[.][0-9]\{1,3\}[,].*$\|^.*[,][0-9]\{1,3\}[.][0-9]\{1,3\}[.][0-9]\{1,3\}[.][0-9]\{1,3\}[,].*$\|^.*[,][0-9]\{1,3\}[.][0-9]\{1,3\}[.][0-9]\{1,3\}[.][0-9]\{1,3\}$'

Let's assume the file is comma delimited and the IP address can be at the beginning, at the end, or somewhere in the middle of the line.

The first regexp looks for an exact match of an IP address at the beginning of the line. The second regexp, after the "or", looks for an IP address in the middle; it is matched in such a way that the number following the "," must be exactly 1 to 3 digits, so false IPs like 12345.12.34.1 are excluded. The third regexp looks for an IP address at the end of the line.
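For comma-delimited input, a shorter route might be to split the fields onto separate lines first and then keep only the lines that are entirely an address. A minimal sketch, assuming the data arrives on stdin; the function name extract_ip_csv is hypothetical. Unlike the command above, it prints only the addresses rather than the whole matching lines.

```shell
# Hypothetical helper: one field per line, then keep whole-line matches only.
# grep -x requires the pattern to match the entire line, so a field such as
# 12345.12.34.1 is rejected without any extra anchoring.
extract_ip_csv() {
    tr ',' '\n' | grep -Ex '[0-9]{1,3}(\.[0-9]{1,3}){3}'
}

# Demo with made-up rows: address first, in the middle (malformed), and last.
printf '10.0.0.1,foo,bar\nfoo,12345.12.34.1,bar\nfoo,bar,8.8.8.8\n' | extract_ip_csv
```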

User · Answer

I usually start with grep, to get the regexp right.

# [multiple failed attempts here]
grep    '[0-9]*\.[0-9]*\.[0-9]*\.[0-9]*'                 file  # good?
grep -E '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' file  # good enough

Then I'd try and convert it to sed to filter out the rest of the line. (After reading this thread, you and I aren't going to do that anymore: we're going to use grep -o instead.)

sed -ne 's/.*\([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\).*/\1/p'  # FAIL

That's when I usually get annoyed with sed for not using the same regexes as anyone else. So I move to perl.

perl -nle '/[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}/ and print $&'

Perl's good to know in any case. If you've got a teeny bit of CPAN installed, you can even make it more reliable at little cost:

perl -MRegexp::Common=net -nE '/$RE{net}{IPV4}/ and say $&' file(s)
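For completeness, here is one way the sed attempt could be made to work. It is a sketch, not the answer author's command: it assumes a sed that accepts -E for extended regular expressions (GNU and BSD sed do), and the function name last_ip_per_line is made up. The leading group forces the address to start at the beginning of the line or right after a non-digit, so the greedy .* cannot swallow the first digits of the address.

```shell
# Hypothetical helper: print one address per input line that contains one.
# Because .* is greedy, a line with several addresses yields the last one.
last_ip_per_line() {
    sed -nE 's/(^|.*[^0-9])([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).*/\2/p' "$@"
}

# Demo with made-up lines; the middle line has no address and prints nothing.
printf 'abc 192.168.1.123 x\nno ip here\n10.0.0.1 foo\n' | last_ip_per_line
```

grep -o remains the simpler tool when every address on a line is wanted.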

User · Answer

You could use awk as well. Something like:

awk '{i=1; if (NF > 0) do {if ($i ~ /regexp/) print $i; i++;} while (i <= NF);}' file

May require cleaning; just a quick and dirty response to show basically how to do it with awk.
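To make the awk idea concrete, the placeholder regexp can be filled in and the do/while loop written as a for loop. This sketch assumes the addresses are whitespace-separated fields of their own; the function name extract_ip_fields is hypothetical. It uses + rather than {1,3} because some older awk implementations do not support interval expressions.

```shell
# Hypothetical helper: print every whitespace-separated field that has the
# shape of a dotted quad. Reads the named files, or stdin if none are given.
extract_ip_fields() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/)
                print $i
    }' "$@"
}

# Demo with made-up input.
printf 'a 10.0.0.1 b\nfoo\n192.168.0.1 x 8.8.8.8\n' | extract_ip_fields
```

An address glued to punctuation, such as "10.0.0.1," would not match here, since the whole field has to be the address.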

User · Answer

You could use grep to pull them out:

grep -o '[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}' file.txt

User · Answer

You can use sed. But if you know perl, that might be easier, and more useful to know in the long run:

perl -n '/(\d+\.\d+\.\d+\.\d+)/ && print "$1\n"' < file

User · Answer

This works fine for me in access logs:

cat access_log | egrep -o '([0-9]{1,3}\.){3}[0-9]{1,3}'

Let's break it down part by part.

[0-9]{1,3} means one to three occurrences of the range given in []. In this case it is 0-9, so it matches patterns like 10 or 183.

It is followed by a '.'. We need to escape this, because '.' is a metacharacter in regular expressions and would otherwise match any character.

So now we are at patterns like '123.', '12.' etc.

This pattern repeats itself three times (with the '.'), so we enclose it in parentheses: ([0-9]{1,3}\.){3}

Lastly the pattern repeats itself once more, but this time without the '.'. That is why we kept it separate in the third step: [0-9]{1,3}

If the IPs are at the beginning of each line, as in my case, use:

egrep -o '^([0-9]{1,3}\.){3}[0-9]{1,3}'

where '^' is an anchor that tells grep to search at the start of a line.
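The point about escaping the dot can be checked directly by running the same pattern with and without the backslash. A small sketch; the variable names LOOSE_RE and STRICT_RE and the sample text are made up for illustration.

```shell
# Dot left unescaped: '.' matches any single character between the numbers.
LOOSE_RE='([0-9]{1,3}.){3}[0-9]{1,3}'
# Dot escaped: only a literal '.' may separate the numbers.
STRICT_RE='([0-9]{1,3}\.){3}[0-9]{1,3}'

sample='id 10x20y30z40 from 192.168.0.7'

echo 'unescaped dot:'
printf '%s\n' "$sample" | grep -Eo "$LOOSE_RE"
echo 'escaped dot:'
printf '%s\n' "$sample" | grep -Eo "$STRICT_RE"
```

The unescaped version also reports the string 10x20y30z40, which is clearly not an address.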

User · Answer

I wanted to get only IP addresses that began with "10." from any file in a directory:

grep -o -nr '10\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}' /var/www

User · Answer

I wrote a little script to see my log files better. It's nothing special, but it might help a lot of the people who are learning perl. It does DNS lookups on the IP addresses after it extracts them.

User · Answer

All of the previous answers have one or more problems. The accepted answer allows IP numbers like 999.999.999.999. The currently second most upvoted answer requires prefixing with 0, such as 127.000.000.001 or 008.008.008.008 instead of 127.0.0.1 or 8.8.8.8. Apama has it almost right, but that expression requires that the IP number is the only thing on the line: no leading or trailing space is allowed, nor can it select IPs from the middle of a line.

I think the correct regex can be found on http://www.regextester.com/22

So if you want to extract all IP addresses from a file, use:

grep -Eo "(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])" file.txt

If you don't want duplicates, use:

grep -Eo "(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])" file.txt | sort | uniq

Please comment if there are still problems in this regex. It is easy to find many wrong regexes for this problem; I hope this one has no real issues.
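When the goal is to validate one candidate string rather than to pull addresses out of running text, the same expression can be combined with grep -x, which only accepts a match covering the whole line. A sketch under that assumption; the names IPV4_RE and is_ipv4 and the test values are made up.

```shell
# The range-checked pattern from the answer, stored once for reuse.
IPV4_RE='(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])'

# Hypothetical helper: succeed only if $1 is, in its entirety, a dotted quad
# with every octet in 0-255 and no leading zeros.
is_ipv4() {
    printf '%s\n' "$1" | grep -Eqx "$IPV4_RE"
}

for candidate in 8.8.8.8 127.0.0.1 255.255.255.255 999.999.999.999 256.1.1.1; do
    if is_ipv4 "$candidate"; then
        echo "$candidate: accepted"
    else
        echo "$candidate: rejected"
    fi
done
```

The whole-line requirement matters: with -o and no boundary check, an input such as 256.1.1.1 can still yield the substring 56.1.1.1, because that substring satisfies the pattern on its own.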

User · Answer

Everyone here is using really long-handed regular expressions, but actually understanding POSIX regexes will allow you to use a small grep command like this for printing IP addresses:

grep -Eo "([0-9]{1,3}\.){3}[0-9]{1,3}"

Side note: this doesn't ignore invalid IPs, but it is very simple.

User · Answer

grep -E -o "([0-9]{1,3}[\.]){3}[0-9]{1,3}"

User · Answer

You can use some shell helpers I made (https://github.com/philpraxis/ipextract), included here for convenience:

#!/bin/sh
ipextract ()
{
egrep --only-matching -E '(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'
}

ipextractnet ()
{
egrep --only-matching -E '(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)/[[:digit:]]+'
}

ipextracttcp ()
{
egrep --only-matching -E '[[:digit:]]+/tcp'
}

ipextractudp ()
{
egrep --only-matching -E '[[:digit:]]+/udp'
}

ipextractsctp ()
{
egrep --only-matching -E '[[:digit:]]+/sctp'
}

ipextractfqdn ()
{
egrep --only-matching -E '[a-zA-Z0-9]+[a-zA-Z0-9.-]*\.[a-zA-Z]{2,}'
}

Load it (source it) from the shell, when stored in a file named ipextract:

$ . ipextract

Use them:

$ ipextract < /etc/hosts
127.0.0.1
255.255.255.255
$

For some examples of real use:

ipextractfqdn < /var/log/snort/alert | sort -u
dmesg | ipextractudp
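Recent GNU grep releases print an obsolescence warning when invoked as egrep, so the helpers may be better written with grep -E. As an illustration, here is how the network (CIDR) extractor could look; this is a sketch that keeps the function name ipextractnet from the answer but folds the four repeated octet groups into one repeated group, which is an editorial simplification and not the original code.

```shell
# Sketch of ipextractnet using grep -E instead of egrep.
# Reads stdin and prints every address/prefix pair such as 10.1.0.0/16.
ipextractnet() {
    grep -Eo '((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)/[0-9]+'
}

# Demo with a made-up routing line; the gateway has no prefix and is skipped.
printf 'route 10.1.0.0/16 via 10.1.0.1\n' | ipextractnet
```

The prefix length is not range-checked here; a value such as /99 would still be printed.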

User · Answer

Most of the examples here will match on 999.999.999.999, which is not technically a valid IP address.

The following will match only valid IP addresses (including network and broadcast addresses):

grep -E -o '(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)' file.txt

Omit the -o if you want to see the entire line that matched.
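An alternative to encoding the 0-255 range in the regex is to extract loosely and then check the numbers arithmetically. The sketch below is one way to do that, not the method of this answer: grep pulls out maximal runs of digits and dots, and awk keeps only those with exactly four parts, each at most 255. The function name extract_valid_ipv4 and the sample text are made up.

```shell
# Hypothetical helper: tokenize first, validate second.
# Step 1: grep -o prints each maximal "digits(.digits)+" run on its own line.
# Step 2: awk splits on '.', and prints the token only if it has four fields
#         that are all <= 255 (leading zeros such as 08 are tolerated).
extract_valid_ipv4() {
    grep -Eo '[0-9]+(\.[0-9]+)+' |
        awk -F. 'NF == 4 && $1 <= 255 && $2 <= 255 && $3 <= 255 && $4 <= 255'
}

# Demo with made-up text containing good and bad candidates.
printf 'a 256.1.1.1 b 10.0.0.1, v1.2.3.4.5 and 1.2.3.999 end 8.8.8.8.\n' |
    extract_valid_ipv4
```

Because each run of digits and dots is taken whole, 256.1.1.1 is rejected outright instead of being trimmed to a shorter substring that happens to be valid.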

User · Answer

For CentOS 6.3:

ifconfig eth0 | grep 'inet addr' | awk '{print $2}' | awk 'BEGIN {FS=":"} {print $2}'
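The grep and the two awk calls could probably be folded into a single awk program. A sketch, assuming the older ifconfig output format in which the address appears as "inet addr:ADDRESS"; the function name inet_addr_from_ifconfig and the sample line are made up, and the function reads ifconfig-style text from stdin so it can be tried without a real interface.

```shell
# Hypothetical helper: print the address from "inet addr:..." lines.
# With default field splitting, $2 is "addr:192.168.1.5"; sub() strips the
# "addr:" label and leaves the bare address.
inet_addr_from_ifconfig() {
    awk '/inet addr:/ { sub(/^addr:/, "", $2); print $2 }'
}

# Demo with a made-up line in the old ifconfig format.
printf '          inet addr:192.168.1.5  Bcast:192.168.1.255  Mask:255.255.255.0\n' |
    inet_addr_from_ifconfig
```

On a machine that still prints this format, the equivalent call would be: ifconfig eth0 | inet_addr_from_ifconfig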

User · Answer

I wrote an informative blog article about this topic: How to Extract IPv4 and IPv6 IP Addresses from Plain Text Using Regex.

In the article there's a detailed guide to the most common patterns for IPs that often need to be extracted and isolated from plain text using regular expressions. This guide is based on CodVerter's IP Extractor source code tool for handling IP address extraction and detection when necessary.

If you wish to validate and capture an IPv4 address, this pattern can do the job:

\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b

or, to validate and capture an IPv4 address with prefix ("slash notation"):

\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\/[0-9]{1,2}\b

or, to capture a subnet mask or wildcard mask:

(255|254|252|248|240|224|192|128|0)\.(255|254|252|248|240|224|192|128|0)\.(255|254|252|248|240|224|192|128|0)\.(255|254|252|248|240|224|192|128|0)

or, to filter out subnet mask addresses, use a regex negative lookahead:

\b((?!(255|254|252|248|240|224|192|128|0)\.(255|254|252|248|240|224|192|128|0)\.(255|254|252|248|240|224|192|128|0)\.(255|254|252|248|240|224|192|128|0)))(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b

For IPv6 validation you can go to the article linked at the top of this answer.
