[linux] Argument list too long error for rm, cp, mv commands

I have several hundred PDFs under a directory in UNIX. The names of the PDFs are really long (approx. 60 chars).

When I try to delete all PDFs together using the following command:

rm -f *.pdf

I get the following error:

/bin/rm: cannot execute [Argument list too long]

What is the solution to this error? Does this error occur for mv and cp commands as well? If yes, how to solve for these commands?

This question is related to linux unix command-line-arguments

The answer is


I ran into this problem a few times. Many of the solutions will run the rm command for each individual file that needs to be deleted. This is very inefficient:

find . -name "*.pdf" -print0 | xargs -0 rm -rf

I ended up writing a python script to delete the files based on the first 4 characters in the file-name:

import os
filedir = '/tmp/' #The directory you wish to run rm on 
filelist = (os.listdir(filedir)) #gets listing of all files in the specified dir
newlist = [] #Makes a blank list named newlist
for i in filelist: 
    if str((i)[:4]) not in newlist: #This makes sure that the elements are unique for newlist
        newlist.append((i)[:4]) #This takes only the first 4 charcters of the folder/filename and appends it to newlist
for i in newlist:
    if 'tmp' in i:  #If statment to look for tmp in the filename/dirname
        print ('Running command rm -rf '+str(filedir)+str(i)+'* : File Count: '+str(len(os.listdir(filedir)))) #Prints the command to be run and a total file count
        os.system('rm -rf '+str(filedir)+str(i)+'*') #Actual shell command
print ('DONE')

This worked very well for me. I was able to clear out over 2 million temp files in a folder in about 15 minutes. I commented the tar out of the little bit of code so anyone with minimal to no python knowledge can manipulate this code.


I only know a way around this. The idea is to export that list of pdf files you have into a file. Then split that file into several parts. Then remove pdf files listed in each part.

ls | grep .pdf > list.txt
wc -l list.txt

wc -l is to count how many line the list.txt contains. When you have the idea of how long it is, you can decide to split it in half, forth or something. Using split -l command For example, split it in 600 lines each.

split -l 600 list.txt

this will create a few file named xaa,xab,xac and so on depends on how you split it. Now to "import" each list in those file into command rm, use this:

rm $(<xaa)
rm $(<xab)
rm $(<xac)

Sorry for my bad english.


To delete all *.pdf in a directory /path/to/dir_with_pdf_files/

mkdir empty_dir        # Create temp empty dir

rsync -avh --delete --include '*.pdf' empty_dir/ /path/to/dir_with_pdf_files/

To delete specific files via rsync using wildcard is probably the fastest solution in case you've millions of files. And it will take care of error you're getting.


(Optional Step): DRY RUN. To check what will be deleted without deleting. `

rsync -avhn --delete --include '*.pdf' empty_dir/ /path/to/dir_with_pdf_files/

. . .

Click rsync tips and tricks for more rsync hacks


You can create a temp folder, move all the files and sub-folders you want to keep into the temp folder then delete the old folder and rename the temp folder to the old folder try this example until you are confident to do it live:

mkdir testit
cd testit
mkdir big_folder tmp_folder
touch big_folder/file1.pdf
touch big_folder/file2.pdf
mv big_folder/file1,pdf tmp_folder/
rm -r big_folder
mv tmp_folder big_folder

the rm -r big_folder will remove all files in the big_folder no matter how many. You just have to be super careful you first have all the files/folders you want to keep, in this case it was file1.pdf


If you’re trying to delete a very large number of files at one time (I deleted a directory with 485,000+ today), you will probably run into this error:

/bin/rm: Argument list too long.

The problem is that when you type something like rm -rf *, the * is replaced with a list of every matching file, like “rm -rf file1 file2 file3 file4” and so on. There is a relatively small buffer of memory allocated to storing this list of arguments and if it is filled up, the shell will not execute the program.

To get around this problem, a lot of people will use the find command to find every file and pass them one-by-one to the “rm” command like this:

find . -type f -exec rm -v {} \;

My problem is that I needed to delete 500,000 files and it was taking way too long.

I stumbled upon a much faster way of deleting files – the “find” command has a “-delete” flag built right in! Here’s what I ended up using:

find . -type f -delete

Using this method, I was deleting files at a rate of about 2000 files/second – much faster!

You can also show the filenames as you’re deleting them:

find . -type f -print -delete

…or even show how many files will be deleted, then time how long it takes to delete them:

root@devel# ls -1 | wc -l && time find . -type f -delete
100000
real    0m3.660s
user    0m0.036s
sys     0m0.552s

you can try this:

for f in *.pdf
do
  rm "$f"
done

EDIT: ThiefMaster comment suggest me not to disclose such dangerous practice to young shell's jedis, so I'll add a more "safer" version (for the sake of preserving things when someone has a "-rf . ..pdf" file)

echo "# Whooooo" > /tmp/dummy.sh
for f in '*.pdf'
do
   echo "rm -i \"$f\""
done >> /tmp/dummy.sh

After running the above, just open the /tmp/dummy.sh file in your favorite editor and check every single line for dangerous filenames, commenting them out if found.

Then copy the dummy.sh script in your working dir and run it.

All this for security reasons.


Using GNU parallel (sudo apt install parallel) is super easy

It runs the commands multithreaded where '{}' is the argument passed

E.g.

ls /tmp/myfiles* | parallel 'rm {}'


Or you can try:

find . -name '*.pdf' -exec rm -f {} \;

tl;dr

It's a kernel limitation on the size of the command line argument. Use a for loop instead.

Origin of problem

This is a system issue, related to execve and ARG_MAX constant. There is plenty of documentation about that (see man execve, debian's wiki).

Basically, the expansion produce a command (with its parameters) that exceeds the ARG_MAX limit. On kernel 2.6.23, the limit was set at 128 kB. This constant has been increased and you can get its value by executing:

getconf ARG_MAX
# 2097152 # on 3.5.0-40-generic

Solution: Using for Loop

Use a for loop as it's recommended on BashFAQ/095 and there is no limit except for RAM/memory space:

Dry run to ascertain it will delete what you expect:

for f in *.pdf; do echo rm "$f"; done

And execute it:

for f in *.pdf; do rm "$f"; done

Also this is a portable approach as glob have strong and consistant behavior among shells (part of POSIX spec).

Note: As noted by several comments, this is indeed slower but more maintainable as it can adapt more complex scenarios, e.g. where one want to do more than just one action.

Solution: Using find

If you insist, you can use find but really don't use xargs as it "is dangerous (broken, exploitable, etc.) when reading non-NUL-delimited input":

find . -maxdepth 1 -name '*.pdf' -delete 

Using -maxdepth 1 ... -delete instead of -exec rm {} + allows find to simply execute the required system calls itself without using an external process, hence faster (thanks to @chepner comment).

References


You could use a bash array:

files=(*.pdf)
for((I=0;I<${#files[@]};I+=1000)); do
    rm -f "${files[@]:I:1000}"
done

This way it will erase in batches of 1000 files per step.


you can use this commend

find -name "*.pdf"  -delete

Argument list too long

As this question title for cp, mv and rm, but answer stand mostly for rm.

Un*x commands

Read carefully command's man page!

For cp and mv, there is a -t switch, for target:

find . -type f -name '*.pdf' -exec cp -ait "/path to target" {} +

and

find . -type f -name '*.pdf' -exec mv -t "/path to target" {} +

Script way

There is an overall workaroung used in script:

#!/bin/bash

folder=( "/path to folder" "/path to anther folder" )

[ "$1" = "--run" ] && exec find "${target[@]}" -type f -name '*.pdf' -exec $0 {} +

for file ;do
    printf "Doing something with '%s'.\n" "$file"
done

I'm surprised there are no ulimit answers here. Every time I have this problem I end up here or here. I understand this solution has limitations but ulimit -s 65536 seems to often do the trick for me.


If they are filenames with spaces or special characters, use:

find -maxdepth 1 -name '*.pdf' -exec rm "{}" \;

This sentence search all files in the current directory (-maxdepth 1) with extension pdf (-name '*.pdf'), and then, delete each one (-exec rm "{}").

The expression {} replace the name of the file, and, "{}" set the filename as string, including spaces or special characters.


A bit safer version than using xargs, also not recursive: ls -p | grep -v '/$' | grep '\.pdf$' | while read file; do rm "$file"; done

Filtering our directories here is a bit unnecessary as 'rm' won't delete it anyway, and it can be removed for simplicity, but why run something that will definitely return error?


And another one:

cd  /path/to/pdf
printf "%s\0" *.[Pp][Dd][Ff] | xargs -0 rm

printf is a shell builtin, and as far as I know it's always been as such. Now given that printf is not a shell command (but a builtin), it's not subject to "argument list too long ..." fatal error.

So we can safely use it with shell globbing patterns such as *.[Pp][Dd][Ff], then we pipe its output to remove (rm) command, through xargs, which makes sure it fits enough file names in the command line so as not to fail the rm command, which is a shell command.

The \0 in printf serves as a null separator for the file names wich are then processed by xargs command, using it (-0) as a separator, so rm does not fail when there are white spaces or other special characters in the file names.


The rm command has a limitation of files which you can remove simultaneous.

One possibility you can remove them using multiple times the rm command bases on your file patterns, like:

rm -f A*.pdf
rm -f B*.pdf
rm -f C*.pdf
...
rm -f *.pdf

You can also remove them through the find command:

find . -name "*.pdf" -exec rm {} \;

find has a -delete action:

find . -maxdepth 1 -name '*.pdf' -delete

I had the same problem with a folder full of temporary images that was growing day by day and this command helped me to clear the folder

find . -name "*.png" -mtime +50 -exec rm {} \;

The difference with the other commands is the mtime parameter that will take only the files older than X days (in the example 50 days)

Using that multiple times, decreasing on every execution the day range, I was able to remove all the unnecessary files


Another answer is to force xargs to process the commands in batches. For instance to delete the files 100 at a time, cd into the directory and run this:

echo *.pdf | xargs -n 100 rm


I have faced a similar problem when there were millions of useless log files created by an application which filled up all inodes. I resorted to "locate", got all the files "located"d into a text file and then removed them one by one. Took a while but did the job!


What about a shorter and more reliable one?

for i in **/*.pdf; do rm "$i"; done

I found that for extremely large lists of files (>1e6), these answers were too slow. Here is a solution using parallel processing in python. I know, I know, this isn't linux... but nothing else here worked.

(This saved me hours)

# delete files
import os as os
import glob
import multiprocessing as mp

directory = r'your/directory'
os.chdir(directory)


files_names = [i for i in glob.glob('*.{}'.format('pdf'))]

# report errors from pool

def callback_error(result):
    print('error', result)

# delete file using system command
def delete_files(file_name):
     os.system('rm -rf ' + file_name)

pool = mp.Pool(12)  
# or use pool = mp.Pool(mp.cpu_count())


if __name__ == '__main__':
    for file_name in files_names:
        print(file_name)
        pool.apply_async(delete_files,[file_name], error_callback=callback_error)

The below option seems simple to this problem. I got this info from some other thread but it helped me.

for file in /usr/op/data/Software/temp/application/openpages-storage/*; do
    cp "$file" /opt/sw/op-storage/
done

Just run the above one command and it will do the task.


Try this also If you wanna delete above 30/90 days (+) or else below 30/90(-) days files/folders then you can use the below ex commands

Ex: For 90days excludes above after 90days files/folders deletes, it means 91,92....100 days

find <path> -type f -mtime +90 -exec rm -rf {} \;

Ex: For only latest 30days files that you wanna delete then use the below command (-)

find <path> -type f -mtime -30 -exec rm -rf {} \;

If you wanna giz the files for more than 2 days files

find <path> -type f -mtime +2 -exec gzip {} \;

If you wanna see the files/folders only from past one month . Ex:

find <path> -type f -mtime -30 -exec ls -lrt {} \;

Above 30days more only then list the files/folders Ex:

find <path> -type f -mtime +30 -exec ls -lrt {} \;

find /opt/app/logs -type f -mtime +30 -exec ls -lrt {} \;

i was facing same problem while copying form source directory to destination

source directory had files ~3 lakcs

i used cp with option -r and it's worked for me

cp -r abc/ def/

it will copy all files from abc to def without giving warning of Argument list too long


For remove first 100 files:

rm -rf 'ls | head -100'


Examples related to linux

grep's at sign caught as whitespace How to prevent Google Colab from disconnecting? "E: Unable to locate package python-pip" on Ubuntu 18.04 How to upgrade Python version to 3.7? Install Qt on Ubuntu Get first line of a shell command's output Cannot connect to the Docker daemon at unix:/var/run/docker.sock. Is the docker daemon running? Run bash command on jenkins pipeline How to uninstall an older PHP version from centOS7 How to update-alternatives to Python 3 without breaking apt?

Examples related to unix

Docker CE on RHEL - Requires: container-selinux >= 2.9 What does `set -x` do? How to find files modified in last x minutes (find -mmin does not work as expected) sudo: npm: command not found How to sort a file in-place How to read a .properties file which contains keys that have a period character using Shell script gpg decryption fails with no secret key error Loop through a comma-separated shell variable Best way to find os name and version in Unix/Linux platform Resource u'tokenizers/punkt/english.pickle' not found

Examples related to command-line-arguments

How to pass arguments to Shell Script through docker run How can I pass variable to ansible playbook in the command line? Use Robocopy to copy only changed files? mkdir's "-p" option What is Robocopy's "restartable" option? How do you run a .exe with parameters using vba's shell()? Bash command line and input limit Check number of arguments passed to a Bash script Parsing boolean values with argparse Import SQL file by command line in Windows 7