Take nth column in a text file

Question

I have a text file:

    1 Q0 1657 1 19.6117 Exp
    1 Q0 1410 2 18.8302 Exp
    2 Q0 3078 1 18.6695 Exp
    2 Q0 2434 2 14.0508 Exp
    2 Q0 3129 3 13.5495 Exp

I want to take the 2nd and 4th word of every line, like this:

    1657 19.6117
    1410 18.8302
    3078 18.6695
    2434 14.0508
    3129 13.5495

I'm using this code:

    nol=$(cat "/path/of/my/text" | wc -l)
    x=1
    while [ $x -le "$nol" ]
    do
        line=($(sed -n "$x"p /path/of/my/text))
        echo ""${line[1]}" "${line[3]}"" >> out.txt
        x=$(( $x + 1 ))
    done

It works, but it is very complicated and takes a long time to process long text files.

Is there a simpler way to do this?

User · Accepted Answer

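IIRC:

    cat filename.txt | awk '{ print $2, $4 }'

or, as mentioned in the comments:

    awk '{ print $2, $4 }' filename.txt

The comma makes awk print the two fields separated by a space; without it they are printed run together.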
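As a minimal sketch of the awk approach on a few of the sample rows from the question: the desired output shown there corresponds to the 3rd and 5th whitespace-separated fields, so this example uses $3 and $5. The variable names `sample` and `result` are illustrative only, and the values are assumed to be decimals such as 19.6117.

```shell
# Sample rows from the question (decimal points are an assumption).
sample='1 Q0 1657 1 19.6117 Exp
1 Q0 1410 2 18.8302 Exp
2 Q0 3078 1 18.6695 Exp'

# awk splits each line on runs of blanks; the comma joins the two
# fields with the output field separator (a single space by default).
result=$(printf '%s\n' "$sample" | awk '{ print $3, $5 }')
printf '%s\n' "$result"
```

Because awk splits on runs of blanks by default, this also copes with columns separated by several spaces or tabs.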

User · Answer

For the sake of completeness:

    while read _ _ one _ two _; do
        echo "$one $two"
    done < file.txt

Instead of _, an arbitrary variable (such as junk) can be used as well. The point is just to extract the columns.

Demo:

    $ while read _ _ one _ two _; do echo "$one $two"; done < /tmp/file.txt
    1657 19.6117
    1410 18.8302
    3078 18.6695
    2434 14.0508
    3129 13.5495

User · Answer

You can use the cut command:

    cut -d' ' -f3,5 < datafile.txt

prints

    1657 19.6117
    1410 18.8302
    3078 18.6695
    2434 14.0508
    3129 13.5495

where

    -d' '   means: use a space as the delimiter
    -f3,5   means: take and print the 3rd and 5th columns

cut is much faster on large files than a pure shell solution. If your file is delimited with multiple whitespace characters, you can collapse them first, like:

    sed 's/[\t ][\t ]*/ /g' < datafile.txt | cut -d' ' -f3,5

where (GNU) sed replaces each run of tab or space characters with a single space.

For a variant, here is a Perl solution too:

    perl -lanE 'say "$F[2] $F[4]"' < datafile.txt
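To illustrate why runs of whitespace matter to cut: it treats every single delimiter as a field boundary, so two consecutive spaces produce an empty field. The sketch below uses `tr -s ' '` as an alternative to the sed command above for squeezing repeated spaces (it does not touch tabs). The sample line is made up for illustration.

```shell
# A hypothetical line with two spaces between some of the columns.
line='1 Q0  1657 1  19.6117 Exp'

# Without squeezing, field 3 is the empty string between the two spaces,
# and field 5 is the lone "1".
raw=$(printf '%s\n' "$line" | cut -d' ' -f3,5)

# tr -s ' ' collapses each run of spaces into one before cut splits the line.
squeezed=$(printf '%s\n' "$line" | tr -s ' ' | cut -d' ' -f3,5)

printf 'raw: [%s]\nsqueezed: [%s]\n' "$raw" "$squeezed"
```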

User · Answer

If your file contains n lines, then your script has to read the file n times; so if you double the length of the file, you quadruple the amount of work your script does, and almost all of that work is simply thrown away, since all you want to do is loop over the lines in order.

Instead, the best way to loop over the lines of a file is to use a while loop, with the condition-command being the read builtin:

    while IFS= read -r line ; do
        # $line is a single line of the file, as a single string
        : ... commands that use $line ...
    done < input_file.txt

In your case, since you want to split the line into an array, and the read builtin has special support for populating an array variable, you can write:

    while read -r -a line ; do
        echo ""${line[1]}" "${line[3]}"" >> out.txt
    done < /path/of/my/text

or better yet:

    while read -r -a line ; do
        echo "${line[1]} ${line[3]}"
    done < /path/of/my/text > out.txt

However, for what you're doing you can just use the cut utility:

    cut -d' ' -f2,4 < /path/of/my/text > out.txt

(or awk, as Tom van der Woerdt suggests, or perl, or even sed).
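A minimal sketch of the single-pass loop with `read -a`, run against two of the sample rows. The function name `take_columns` and its zero-based index arguments are hypothetical, introduced only for illustration, and `read -a` requires bash rather than a plain POSIX sh.

```shell
# take_columns I J: print the I-th and J-th words (zero-based)
# of every line read from standard input. Requires bash.
take_columns() {
    local first=$1 second=$2
    local -a words
    # read -r -a splits each line on whitespace into the array "words".
    while read -r -a words; do
        printf '%s %s\n' "${words[first]}" "${words[second]}"
    done
}

result=$(printf '%s\n' '1 Q0 1657 1 19.6117 Exp' '2 Q0 3078 1 18.6695 Exp' \
    | take_columns 2 4)
printf '%s\n' "$result"
```

Each input line is read exactly once, so the work grows linearly with the length of the file rather than quadratically.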

User · Answer

If you are using structured data, this has the added benefit of not invoking an extra shell process to run tr and/or cut or something.

(Of course, you will want to guard against bad inputs with conditionals and sane alternatives.)

    while read line ; do
        lineCols=( $line )
        echo "${lineCols[0]}"
        echo "${lineCols[1]}"
    done < "$myFQFileToRead"
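One way the guard mentioned above might look, as a sketch only: skip any line that has fewer words than the highest index needed. The function name `first_two_columns` and the choice to report skipped lines on stderr are assumptions for illustration, not part of the original answer. It requires bash.

```shell
# Print the first two words of each stdin line, one per output line,
# skipping lines that contain fewer than two words. Requires bash.
first_two_columns() {
    local line
    local -a lineCols
    while IFS= read -r line; do
        # read -a splits on whitespace without the filename expansion
        # that an unquoted ( $line ) would trigger.
        read -r -a lineCols <<< "$line"
        if (( ${#lineCols[@]} < 2 )); then
            printf 'skipping short line: %s\n' "$line" >&2
            continue
        fi
        printf '%s\n' "${lineCols[0]}" "${lineCols[1]}"
    done
}

result=$(printf '%s\n' 'a b c' 'only' '' 'x y' | first_two_columns)
printf '%s\n' "$result"
```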

User · Answer

One more simple variant:

    while read line ; do
        set $line          # assigns words in line to positional parameters
        echo "$3 $5"
    done < file
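A slightly more defensive sketch of the same idea: `set -- $line` stops a line that begins with a dash from being parsed as options to set, and `set -f` turns off filename expansion so that a word such as * is not replaced by file names. The function name `print_3_and_5` and the sample input are made up for illustration.

```shell
# Print the 3rd and 5th words of each stdin line via positional parameters.
print_3_and_5() {
    set -f                  # disable globbing while $line is expanded unquoted
    while read -r line; do
        set -- $line        # split on whitespace into $1, $2, ...
        printf '%s %s\n' "$3" "$5"
    done
    set +f                  # re-enable globbing
}

result=$(printf '%s\n' '1 Q0 1657 1 19.6117 Exp' '-n * 3078 1 18.6695 Exp' \
    | print_3_and_5)
printf '%s\n' "$result"
```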
