[perl] how to remove the first two columns in a file using shell (awk, sed, whatever)

I have a file with many lines. Each line has many columns (fields) separated by a blank " ", and the number of columns differs from line to line. How can I remove the first two columns?

This question is related to perl shell awk sed cut

The answer is


Using awk, and based on some of the options below, a for loop makes this a bit more flexible: sometimes I may want to delete the first 9 columns (after an "ls -lrt", for example), so I change the 2 to a 9 and that's it:

awk '{ for(i=0;i++<2;){$i=""}; print $0 }' your_file.txt
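
As a rough sketch of the same idea, the count can be passed in with awk's -v option instead of editing the script each time. The function name drop_fields and the variable n are made up for illustration, and the sub() call is an addition that trims the separators the blanked fields leave behind.

```shell
# drop_fields N: blank the first N fields of each input line (hypothetical helper).
drop_fields() {
    awk -v n="$1" '{
        # Blank fields 1..n, never going past the last field on a short line.
        for (i = 1; i <= n && i <= NF; i++) $i = ""
        # The blanked fields leave their separators behind; trim them.
        sub(/^ +/, "")
        print
    }'
}

printf 'a b c d\n1 2 3\n' | drop_fields 2
# prints:
# c d
# 3
```

Keep in mind that assigning to a field makes awk rebuild the line with single spaces, so any runs of blanks between the remaining fields are collapsed.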


You can use sed:

sed 's/^[^ ][^ ]* [^ ][^ ]* //'

This looks for lines starting with one-or-more non-blanks, a blank, another set of one-or-more non-blanks and another blank, and deletes the matched material, aka the first two fields. The [^ ][^ ]* is marginally shorter than the equivalent but more explicit [^ ]\{1,\} notation, and the second might run into issues with GNU sed (though if you use --posix as an option, even GNU sed can't screw it up). OTOH, if the character class to be repeated was more complex, the numbered notation wins for brevity. It is easy to extend this to handle 'blank or tab' as separator, or 'multiple blanks' or 'multiple blanks or tabs'. It could also be modified to handle optional leading blanks (or tabs) before the first field, etc.
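
One possible way to write the "multiple blanks or tabs, with optional leading blanks" variant mentioned above is sketched here. It assumes a sed that accepts -E for extended regular expressions (GNU and BSD sed both do); the function name strip_two_blank_sep is only there for the demo.

```shell
# strip_two_blank_sep: hypothetical wrapper name, used only for this demo.
strip_two_blank_sep() {
    # ^[[:blank:]]*                     optional leading blanks or tabs
    # ([^[:blank:]]+[[:blank:]]+){2}    two fields, each followed by its separators
    sed -E 's/^[[:blank:]]*([^[:blank:]]+[[:blank:]]+){2}//'
}

printf '  a\tb   c d\n' | strip_two_blank_sep
# prints: c d
```

A line with fewer than three fields does not match the pattern and is passed through unchanged, which may or may not be what you want.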

For awk and cut, see Sampson-Chen's answer. There are other ways to write the awk script, but they're not materially better than the answer given. Note that awk's default field separator (which is also what -F" " selects) treats any run of blanks and tabs as one separator; if you do not want tabs treated as separators, or you want each blank to count on its own, set the separator explicitly with -F'[ ]'. The POSIX standard cut does not support multiple separators between fields: every single delimiter character starts a new field, so runs of blanks produce empty fields. GNU cut behaves the same way, so squeeze repeated blanks first (for example with tr -s ' ') if the input has them.
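
Because cut counts every single delimiter, one common workaround (an assumption on my part, not something the answers here spell out) is to squeeze runs of spaces with tr -s before cutting. The helper name squeeze_and_cut is hypothetical.

```shell
# squeeze_and_cut: hypothetical helper name for this sketch.
squeeze_and_cut() {
    # tr -s collapses each run of spaces into one space,
    # so cut sees exactly one delimiter between fields.
    tr -s ' ' | cut -d ' ' -f 3-
}

# Without squeezing, the empty fields between repeated spaces are counted.
printf 'a  b   c d\n' | cut -d ' ' -f 3-
# prints: b   c d

printf 'a  b   c d\n' | squeeze_and_cut
# prints: c d
```

This also squeezes spaces in the part of the line you keep, and a line that starts with blanks still gets an empty first field.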

You can also do it in pure shell:

while read -r junk1 junk2 residue
do printf '%s\n' "$residue"
done < in-file > out-file

Use kscript

kscript 'lines.split().select(-1,-2).print()' file

You can do it with cut:

cut -d " " -f 3- input_filename > output_filename

Explanation:

  • cut: invoke the cut command
  • -d " ": use a single space as the delimiter (cut uses TAB by default)
  • -f: specify fields to keep
  • 3-: all the fields starting with field 3
  • input_filename: use this file as the input
  • > output_filename: write the output to this file.
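
A small demo of the options above on made-up input may help. It also shows one behaviour that is easy to miss: cut passes a line through untouched when it contains no delimiter at all, unless -s is given.

```shell
# Field 3 onward; the second line has no space, so cut passes it through untouched.
printf 'a b c d\nsolo\n' | cut -d ' ' -f 3-
# prints:
# c d
# solo

# With -s, lines that contain no delimiter are suppressed instead.
printf 'a b c d\nsolo\n' | cut -s -d ' ' -f 3-
# prints: c d
```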

Alternatively, you can do it with awk:

awk '{$1=""; $2=""; sub(/^  /, ""); print}' input_filename > output_filename

Explanation:

  • awk: invoke the awk command
  • $1=""; $2="";: set field 1 and 2 to the empty string
  • sub(...);: clean up the output, because the emptied fields 1 & 2 still leave their two " " separators at the start of the line
  • print: print the modified line
  • input_filename > output_filename: same as above.

It's pretty straightforward to do it with only the shell:

while read -r A B C; do
    printf '%s\n' "$C"
done < oldfile > newfile

Thanks for posting the question. I'd also like to add the script that helped me; note that it blanks only the first column:

awk '{ $1=""; print $0 }' file

perl:

perl -lane 'print join(" ", @F[2..$#F])' File

awk:

awk '{$1=$2=""}1' File

Here's one way to do it with Awk that's relatively easy to understand:

awk '{print substr($0, index($0, $3))}'

This is a simple awk command with no pattern, so the action inside {} is run for every input line.

The action simply prints the substring starting at the position of the 3rd field.

  • $0: the whole input line
  • $3: 3rd field
  • index(in, find): returns the position of find in string in
  • substr(string, start): return a substring starting at index start
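
Here is a quick sketch of how this behaves on made-up input, including one caveat: index() finds the first place the text of $3 occurs, which is not necessarily the 3rd field. The second function is an alternative of my own, not part of the answer above, that strips the two leading fields by pattern instead. Both function names are hypothetical.

```shell
# Approach from the answer above, wrapped in a hypothetical helper.
from_third_index() {
    awk '{ print substr($0, index($0, $3)) }'
}

# Alternative sketch (not from the answer): strip two leading fields by pattern,
# so the result does not depend on where the text of $3 first appears.
from_third_sub() {
    awk '{ sub(/^[ \t]*[^ \t]+[ \t]+[^ \t]+[ \t]+/, ""); print }'
}

printf 'x y z   w\n' | from_third_index   # prints: z   w
printf 'a b a d\n'   | from_third_index   # prints: a b a d (index finds the "a" in field 1)
printf 'a b a d\n'   | from_third_sub     # prints: a d
```

Both variants keep the original spacing in the rest of the line, since neither assigns to a field.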

If you want to use a different delimiter, such as comma, you can specify it with the -F option:

awk -F"," '{print substr($0, index($0, $3))}'

You can also apply this to a subset of the input lines by specifying a pattern before the action in {}. Only lines matching the pattern will have the action run.

awk 'pattern{print substr($0, index($0, $3))}'

Where pattern can be something such as:

  • /abcdef/: use a regular expression; it operates on $0 by default.
  • $1 ~ /abcdef/: match against a specific field.
  • $1 == "blabla": use string comparison (the quotes matter; a bare blabla would be an uninitialized awk variable).
  • NR > 1: use the record/line number
  • NF > 0: use the number of fields/columns
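
To make the pattern forms concrete, here is a small sketch using two of them. The sample data, including the header row, is invented for the example.

```shell
# Sample input with a header row; the contents are made up for illustration.
sample='h1 h2 h3
k1 v1 data one
k2 v2 data two'

# NR > 1: skip the header line, strip two columns from the rest.
printf '%s\n' "$sample" | awk 'NR > 1 { print substr($0, index($0, $3)) }'
# prints:
# data one
# data two

# $1 == "k2": act only on lines whose first field is exactly k2.
printf '%s\n' "$sample" | awk '$1 == "k2" { print substr($0, index($0, $3)) }'
# prints: data two
```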

This might work for you (GNU sed):

sed -r 's/^([^ ]+ ){2}//' file

or for columns separated by one or more whitespace characters:

sed -r 's/^(\S+\s+){2}//' file

awk '{$1=$2="";$0=$0;$1=$1}1'

Input

a b c d

Output

c d
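
For anyone wondering why that one-liner needs three assignments, here is my reading of it, spelled out step by step as a sketch. The wrapper name strip_two_rebuild is hypothetical, and the input has extra spaces to show that they get normalised.

```shell
# strip_two_rebuild: hypothetical wrapper around the one-liner, one step per line.
strip_two_rebuild() {
    awk '{
        $1 = $2 = ""   # blank both fields; $0 is rebuilt as "  c d" (two leading separators)
        $0 = $0        # re-split the record: leading blanks are ignored, so c becomes $1
        $1 = $1        # assigning a field rebuilds $0 from the fields, dropping the leading blanks
        print          # same effect as the bare 1 pattern in the one-liner
    }'
}

printf 'a b   c d\n' | strip_two_rebuild
# prints: c d
```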
