Select an arbitrary column of text in UNIX

Posted October 6, 2003 by Quinn McHenry in UNIX

UNIX has wonderfully powerful text processing capabilities. There are numerous ways to solve the same problem. Frequently, for example, it is necessary to extract a single column of data from a text file or output stream. This tech-recipe will present several solutions to this problem.

Many data files have data fields delimited by a single character like a tab or colon. To extract the full name field out of /etc/passwd, the fifth colon-delimited field, use:

cut -d : -f 5 /etc/passwd

The cut command allows a great deal of flexibility in cutting data. In this case, the -d : directs cut to use a colon character as the delimiter. The -f 5 parameter directs cut to extract only the fifth field. The -f parameter also accepts ranges and lists: -f 2-5 extracts fields 2 through 5, and -f 1,3,7 extracts the first, third, and seventh fields.
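As a small sketch of the range and list forms, the following runs cut against two made-up records laid out like /etc/passwd lines. The sample users and the variable names (sample, names, ids) are assumptions for illustration only, so the result does not depend on the accounts present on a particular system.

```shell
# Two made-up records in /etc/passwd layout (hypothetical users).
sample='root:x:0:0:Super User:/root:/bin/sh
quinn:x:501:20:Quinn McHenry:/home/quinn:/bin/bash'

# Fields 1 and 5: login name and full name.
names=$(printf '%s\n' "$sample" | cut -d : -f 1,5)
printf '%s\n' "$names"

# Fields 3 through 4: user ID and group ID.
ids=$(printf '%s\n' "$sample" | cut -d : -f 3-4)
printf '%s\n' "$ids"
```

When more than one field is selected, cut joins them in the output with the same delimiter it used to split the input, so the first command prints lines such as root:Super User.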

To extract a fixed set of columns, for example the column numbers 44 through 49 from a long directory listing (ls -l), use the following command:

ls -l | cut -c 44-49

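On many UNIX systems, these columns hold the modification date, though the exact positions vary with the ls implementation and the widths of the owner, group, and size columns. Like the -f parameter, the -c parameter accepts lists as well as ranges, such as -c 5,7,6,8. Note that cut always writes the selected characters in the order they appear in the input, so that list produces the same output as -c 5-8.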
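Because the columns of ls -l differ from system to system, character selection is easier to check against a fixed string. The sketch below uses a made-up line of letters (the variable names are illustrative) and shows a range, a list, and an open-ended range. It also shows that cut writes the selected characters in input order, however the list is written.

```shell
# A fixed 16-character line, so column positions are easy to verify by eye.
line='abcdefghijklmnop'

# Characters 5 through 8.
range=$(printf '%s\n' "$line" | cut -c 5-8)
printf '%s\n' "$range"    # efgh

# The same positions given as an unordered list: output stays in input order.
listed=$(printf '%s\n' "$line" | cut -c 5,7,6,8)
printf '%s\n' "$listed"   # efgh

# Character 14 through the end of the line.
rest=$(printf '%s\n' "$line" | cut -c 14-)
printf '%s\n' "$rest"     # nop
```

To emit columns in a different order than they appear in the input, a tool such as awk is a better fit than cut.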

One of the trickier column extractions involves a variable amount of whitespace between fields. To extract the process ID (second) field from a process listing (ps -ef), cut will not work reliably, because it treats every single space as a separate delimiter. Another powerful text manipulator in UNIX is awk, which by default treats a run of spaces or tabs as a single field separator. To extract the PID column from the first ten lines of ps -ef output (head simply truncates the listing to keep the example short), use:

ps -ef | head | awk '{print $2}'

Awk is an incredibly powerful tool, and this is a trivial but useful application of it.
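The difference between the two tools can be sketched on a made-up listing in the style of ps -ef (the processes and variable names here are invented for illustration). The first command uses awk as above and adds NR > 1 to skip the header line. The second shows what cut returns with a space delimiter. The third is an alternative not covered above: squeezing repeated spaces with tr -s so that cut can split the line, assuming no line begins with a space.

```shell
# Made-up process listing with space-padded columns (hypothetical data).
listing='UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 09:00 ?        00:00:01 /sbin/init
quinn     4321     1  0 09:05 pts/0    00:00:00 -bash'

# awk splits on runs of blanks; NR > 1 skips the header line.
pids=$(printf '%s\n' "$listing" | awk 'NR > 1 {print $2}')
printf '%s\n' "$pids"

# cut counts every single space as a delimiter, so "field 2" is the empty
# string between the first two spaces of the padding on each line.
cut_try=$(printf '%s\n' "$listing" | cut -d ' ' -f 2)

# Squeezing runs of spaces down to one space first lets cut find the column.
squeezed=$(printf '%s\n' "$listing" | tr -s ' ' | cut -d ' ' -f 2)
printf '%s\n' "$squeezed"
```

The awk form is usually the more robust choice, since it also tolerates tabs and leading blanks without any preprocessing.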


About Quinn McHenry

Quinn was one of the original co-founders of Tech-Recipes. He is currently crafting iOS applications as a senior developer at Small Planet Digital in Brooklyn, New York.