Appending a List of Files to One File Using xargs Command
Posted by Jimmy Selix in Korn shell
This tech-recipe explains how to use the xargs command from the Korn shell to work with multiple files. The instructions show how to combine several files into one file without manually running cat file1 > bigfile, cat file2 >> bigfile, and so on. The xargs command has many uses and is an effective way to work with large numbers of files. For example, it can help you avoid the "parameter list is too long" error when you pass a very large number of files (more than 1024, say) to grep.
The following tutorial covers basic use of xargs, a standard Unix utility that you can run from ksh.
If you work with AIX/Unix/ksh, there are likely times when you need to run the same operation on many files. This tutorial shows how to append a list of files to one large file using two commands, instead of manually running cat file1 > bigfile, cat file2 >> bigfile, and so on.
Files/Directory
In my example, I am going to take the contents of five files and combine them into one big file.
My files are: file1, file2, file3, file4, file5.
The combined file will be called bigfile.dat.
Also, all the files are in the same directory (/usr/acct/test/files/).
First, we will create a list of the files. This is extremely helpful when working with a large number of files (1500, for instance).
For my example, I would type this command:
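ls | grep '^file[0-9]' > filelist
The ls command lists the contents of the directory. grep then keeps only the names that start with file followed by a digit, and the results are written to a new file called filelist (overwriting any previous list). A plain grep file would also match filelist itself and, on a later run, bigfile.dat.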
Now, we have a list of the files we want to combine into one bigfile.dat.
To combine these files, I will type the following:
cat filelist | xargs cat >> bigfile.dat
This command takes the list of files (filelist) and, for each file listed, appends the contents to the file bigfile.dat.
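As a minimal sketch, the two steps can be tried end to end in a scratch directory. The directory (created with mktemp, assumed to be available) and the sample contents are made up for illustration and are not the article's /usr/acct/test/files/ data. The grep pattern is anchored so that filelist and bigfile.dat cannot end up in the list themselves.

```shell
#!/bin/sh
# Scratch directory and sample contents are illustrative assumptions.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

# Create five small input files: file1 ... file5.
for n in 1 2 3 4 5; do
    echo "contents of file$n" > "file$n"
done

# Step 1: build the list. The anchored pattern matches file1..file5 only,
# so filelist and bigfile.dat never appear in the list.
ls | grep '^file[0-9]$' > filelist

# Step 2: feed the names to cat through xargs and append to bigfile.dat.
cat filelist | xargs cat >> bigfile.dat

# Show the combined result, one line per input file.
cat bigfile.dat
```

Because the second step appends with >>, running it twice would add the same contents to bigfile.dat again; remove or empty bigfile.dat first if you want a fresh result.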
The xargs command is most useful at the end of a pipe ( | ): it lets you run a command on a large number of files whose names come from a list or from grep output.
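To make the way xargs hands names to a command visible, here is a small sketch using the standard -n flag, which caps how many arguments go into each invocation. echo stands in for a real command, and the five names are made up for illustration.

```shell
# Five made-up names, one per line, stand in for a long file list.
names='a.txt
b.txt
c.txt
d.txt
e.txt'

# -n 2 caps each invocation at two arguments, so echo runs three times:
# twice with two names and once with the single leftover name.
batches=$(printf '%s\n' "$names" | xargs -n 2 echo)
printf '%s\n' "$batches"
```

Without -n, xargs packs as many names into each invocation as its size limit allows, which is why a very long list does not overflow the command line. By default xargs splits its input on blanks and newlines, so file names that contain spaces need extra care.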
____________________
NOTES:
Here is a description of the xargs command, courtesy of our AIX/KornShell Reference Manual.
Description
The generated command line length is the sum of the size, in bytes, of the Command and each Argument treated as strings, including a null byte terminator for each of these strings. The xargs command limits the command line length. When the constructed command line runs, the combined Argument and environment lists can not exceed ARG_MAX bytes. Within this constraint, if you do not specify the -n or the -s flags, the default command line length is at least the value specified by LINE_MAX.
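The limits named in this description can usually be inspected with the POSIX getconf utility. The sketch below assumes getconf is installed and reports both variables as plain numbers. The values differ from system to system, so no particular output is shown.

```shell
# Query the system limits that bound an xargs command line.
arg_max=$(getconf ARG_MAX)     # max bytes for arguments plus environment
line_max=$(getconf LINE_MAX)   # max input line length for text utilities

echo "ARG_MAX:  $arg_max bytes"
echo "LINE_MAX: $line_max bytes"
```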
Here are a few examples of using the command from our manual.
To insert file names into the middle of command lines, enter the following:
ls | xargs -t -I {} mv {} {}.old
This command sequence renames all files in the current directory by adding .old to the end of each name. The -I flag tells the xargs command to insert each line of the ls directory listing where {} (braces) appear. If the current directory contains the files chap1, chap2, and chap3, this constructs the following commands:
mv chap1 chap1.old
mv chap2 chap2.old
mv chap3 chap3.old
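One way to try this safely is in a scratch directory. In this sketch the directory (from mktemp, assumed to be available) and the empty chap1, chap2, and chap3 files are created only for the demonstration. The -t flag echoes each constructed command to standard error before it runs.

```shell
# Work in a throwaway directory so no real files are renamed.
demo=$(mktemp -d) || exit 1
cd "$demo" || exit 1
touch chap1 chap2 chap3

# For each line from ls, run: mv NAME NAME.old
ls | xargs -t -I {} mv {} {}.old

# List the renamed files.
ls
```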
Here is another example. To use a command on files whose names are listed in a file, enter the following:
xargs lint -a < cfiles
If the cfiles file contains the following text:
main.c readit.c
gettoken.c
putobj.c
then the xargs command constructs and runs the following command:
lint -a main.c readit.c gettoken.c putobj.c
If the cfiles file contains more file names than fit on a single shell command line (up to LINE_MAX), the xargs command runs the lint command with the file names that fit. It then constructs and runs another lint command using the remaining file names. Depending on the names listed in the cfiles file, the commands might look like the following:
lint -a main.c readit.c gettoken.c . . .
lint -a getisx.c getprp.c getpid.c . . .
lint -a fltadd.c fltmult.c fltdiv.c . . .
This command sequence is not quite the same as running the lint command once with all the file names. The lint command checks cross-references between files. However, in this example, it cannot check between the main.c and the fltadd.c files or between any two files listed on separate command lines.
For this reason, you may want to run the command only if all the file names fit on one line. To specify this to the xargs command, use the -x flag by entering the following:
xargs -x lint -a < cfiles
Definition/Examples taken from the following sources:
AIX Version 4.3 Commands Reference, Volume 6 applies to the AIX Version 4.3, 3270 Host Connection Program 2.1 and 1.3.3 for AIX, and Distributed SMIT 2.2 for AIX licensed programs.
The Conversation
February 03, 2009 at 10:19 am, AMK said:
Well, in MS-DOS we do this from the command prompt:
c:> type "File1.dat" "File2.dat" "FileNNNN.dat" >> "My_bigfile.dat"
c:> "My_bigfile.dat"
:: This will show all files appended to one redirected file, with contents.
April 21, 2012 at 2:19 am, cowperandrewes said:
In unix, if your files are literally named:
file1 file2 file3 file4 file5
Then simply use:
cat file* >> file6
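This relies on the shell expanding file* into a lexicographically sorted list before the output is redirected into file6. If you wish to reverse the order, note that a bracket pattern such as file[54321] still expands in sorted order, so sort the names in reverse instead, one per line:
cat $(ls file* | sort -r) >> file6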
To recurse into directories and slurp only files (optionally specify a name via -name option):
find . -type f -exec cat '{}' \; >> /path/to/file # The backslash is important
It’s amazing how much work your shell will do in the background without needing to resort to scripts!