[linux] Split output of command by columns using Bash?

I want to do this:

  1. run a command
  2. capture the output
  3. select a line
  4. select a column of that line

Just as an example, let's say I want to get the command name from a $PID (please note this is just an example, I'm not suggesting this is the easiest way to get a command name from a process id - my real problem is with another command whose output format I can't control).

If I run ps I get:


  PID TTY          TIME CMD
11383 pts/1    00:00:00 bash
11771 pts/1    00:00:00 ps

Now I do ps | egrep 11383 and get

11383 pts/1    00:00:00 bash

Next step: ps | egrep 11383 | cut -d" " -f 4. Output is:

<absolutely nothing/>

The problem is that cut cuts the output by single spaces, and as ps adds some spaces between the 2nd and 3rd columns to keep some resemblance of a table, cut picks an empty string. Of course, I could use cut to select the 7th and not the 4th field, but how can I know, specially when the output is variable and unknown on beforehand.

This question is related to linux bash pipe

The answer is


Getting the correct line (example for line no. 6) is done with head and tail and the correct word (word no. 4) can be captured with awk:

command|head -n 6|tail -n 1|awk '{print $4}'

Your command

ps | egrep 11383 | cut -d" " -f 4

misses a tr -s to squeeze spaces, as unwind explains in his answer.

However, you maybe want to use awk, since it handles all of these actions in a single command:

ps | awk '/11383/ {print $4}'

This prints the 4th column in those lines containing 11383. If you want this to match 11383 if it appears in the beginning of the line, then you can say ps | awk '/^11383/ {print $4}'.


I think the simplest way is to use awk. Example:

$ echo "11383 pts/1    00:00:00 bash" | awk '{ print $4; }'
bash

Instead of doing all these greps and stuff, I'd advise you to use ps capabilities of changing output format.

ps -o cmd= -p 12345

You get the cmmand line of a process with the pid specified and nothing else.

This is POSIX-conformant and may be thus considered portable.


try

ps |&
while read -p first second third fourth etc ; do
   if [[ $first == '11383' ]]
   then
       echo got: $fourth
   fi       
done

Using array variables

set $(ps | egrep "^11383 "); echo $4

or

A=( $(ps | egrep "^11383 ") ) ; echo ${A[3]}

Please note that the tr -s ' ' option will not remove any single leading spaces. If your column is right-aligned (as with ps pid)...

$ ps h -o pid,user -C ssh,sshd | tr -s " "
 1543 root
19645 root
19731 root

Then cutting will result in a blank line for some of those fields if it is the first column:

$ <previous command> | cut -d ' ' -f1

19645
19731

Unless you precede it with a space, obviously

$ <command> | sed -e "s/.*/ &/" | tr -s " "

Now, for this particular case of pid numbers (not names), there is a function called pgrep:

$ pgrep ssh


Shell functions

However, in general it is actually still possible to use shell functions in a concise manner, because there is a neat thing about the read command:

$ <command> | while read a b; do echo $a; done

The first parameter to read, a, selects the first column, and if there is more, everything else will be put in b. As a result, you never need more variables than the number of your column +1.

So,

while read a b c d; do echo $c; done

will then output the 3rd column. As indicated in my comment...

A piped read will be executed in an environment that does not pass variables to the calling script.

out=$(ps whatever | { read a b c d; echo $c; })

arr=($(ps whatever | { read a b c d; echo $c $b; }))
echo ${arr[1]}     # will output 'b'`


The Array Solution

So we then end up with the answer by @frayser which is to use the shell variable IFS which defaults to a space, to split the string into an array. It only works in Bash though. Dash and Ash do not support it. I have had a really hard time splitting a string into components in a Busybox thing. It is easy enough to get a single component (e.g. using awk) and then to repeat that for every parameter you need. But then you end up repeatedly calling awk on the same line, or repeatedly using a read block with echo on the same line. Which is not efficient or pretty. So you end up splitting using ${name%% *} and so on. Makes you yearn for some Python skills because in fact shell scripting is not a lot of fun anymore if half or more of the features you are accustomed to, are gone. But you can assume that even python would not be installed on such a system, and it wasn't ;-).


Similar to brianegge's awk solution, here is the Perl equivalent:

ps | egrep 11383 | perl -lane 'print $F[3]'

-a enables autosplit mode, which populates the @F array with the column data.
Use -F, if your data is comma-delimited, rather than space-delimited.

Field 3 is printed since Perl starts counting from 0 rather than 1


Bash's set will parse all output into position parameters.

For instance, with set $(free -h) command, echo $7 will show "Mem:"


Examples related to linux

grep's at sign caught as whitespace How to prevent Google Colab from disconnecting? "E: Unable to locate package python-pip" on Ubuntu 18.04 How to upgrade Python version to 3.7? Install Qt on Ubuntu Get first line of a shell command's output Cannot connect to the Docker daemon at unix:/var/run/docker.sock. Is the docker daemon running? Run bash command on jenkins pipeline How to uninstall an older PHP version from centOS7 How to update-alternatives to Python 3 without breaking apt?

Examples related to bash

Comparing a variable with a string python not working when redirecting from bash script Zipping a file in bash fails How do I prevent Conda from activating the base environment by default? Get first line of a shell command's output Fixing a systemd service 203/EXEC failure (no such file or directory) /bin/sh: apt-get: not found VSCode Change Default Terminal Run bash command on jenkins pipeline How to check if the docker engine and a docker container are running? How to switch Python versions in Terminal?

Examples related to pipe

Angular 4: InvalidPipeArgument: '[object Object]' for pipe 'AsyncPipe' Using Pipes within ngModel on INPUT Elements in Angular What are the parameters for the number Pipe - Angular 2 Limit to 2 decimal places with a simple pipe Pass a password to ssh in pure bash How to open every file in a folder Why does cURL return error "(23) Failed writing body"? How do I use a pipe to redirect the output of one command to the input of another? How to use `subprocess` command with pipes Pipe subprocess standard output to a variable