James B. Byrne
2010-Jan-25 14:48 UTC
[CentOS] [Fwd: Re: The directory that I am trying to clean up is huge]
On Sat, January 23, 2010 20:21, Robert Nichols wrote:
> Robert Heller wrote:
>
> Gosh, then I guess the manpage for 'find' must be totally wrong
> where it says:
>
>    -exec command ;
>    ...
>    The specified command is run once for each matched file.

Not wrong, just not very explicit about the process. The man page does not say that find acts upon each file as it is found, only that it acts upon each file that is found. Neither does the man page speak to the limit in the kernel configuration (MAX_ARG_PAGES) implicitly relied upon by cp, find, ls, etc.

The problem you have is that the selection of all qualified files is completed before any are acted on by find. So, in your case of an overlarge collection, the page limit is exceeded before any are deleted. Taking each file as it is found and piping it to an external rm command explicitly defines the process as find, delete, and find again, and thereby avoids hitting the page limit.

CentOS-5.3 was supposed to address this issue:

> Previously, the MAX_ARG_PAGES limit that is set in the kernel
> was too low, and may have resulted in the following error:
>
>     execve: Argument list too long
>
> In this update, this limit has been increased to 25 percent of
> the stack size, which resolves this issue.

So, perhaps if you update to 5.3+ the problem might go away? Although, in my opinion, piping find results through xargs is far more reliable and portable.

Regards,

-- 
*** E-Mail is NOT a SECURE channel ***
James B. Byrne                mailto:ByrneJB at Harte-Lyne.ca
Harte & Lyne Limited          http://www.harte-lyne.ca
9 Brockley Drive              vox: +1 905 561 1241
Hamilton, Ontario             fax: +1 905 561 0757
Canada  L8E 3C3
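[Editor's note: a minimal sketch of the find-piped-through-xargs approach the message recommends. The temporary directory and the '*.tmp' pattern are illustrative, not taken from the original thread.]

```shell
# Create a throwaway directory with a few matching files.
d=$(mktemp -d)
touch "$d/a.tmp" "$d/b.tmp" "$d/c.tmp"

# A glob such as 'rm "$d"/*.tmp' expands every name onto a single
# command line and can fail with "execve: Argument list too long"
# when the directory is huge.  xargs instead reads the names from
# stdin and batches them into command lines that fit the kernel's
# argument-size limit, invoking rm as many times as needed.
find "$d" -name '*.tmp' -print0 | xargs -0 rm -f
```

The -print0/-0 pair passes names NUL-separated, so filenames containing spaces or newlines are handled safely.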
Robert Nichols
2010-Jan-25 15:33 UTC
[CentOS] [Fwd: Re: The directory that I am trying to clean up is huge]
James B. Byrne wrote:
> On Sat, January 23, 2010 20:21, Robert Nichols wrote:
>> Robert Heller wrote:
>>
>> Gosh, then I guess the manpage for 'find' must be totally wrong
>> where it says:
>>
>>    -exec command ;
>>    ...
>>    The specified command is run once for each matched file.
>
> Not wrong. The man page on find simply does not speak to the limits
> of the kernel configuration (MAX_ARG_PAGES) implicitly used by cp,
> find, ls, etc. It just lives within its means and fails when these
> do not suffice.
>
> The problem you have is that the selection of all qualified files is
> completed before any are acted on by find. So, in the case of
> overlarge collections, the page limit is exceeded before any are
> deleted. Taking each file as it is found and piping it to an
> external rm command avoids hitting the page limit.

When using the -exec action with the ";" terminator, the constructed command line always contains the path for exactly one matched file. Try it. Run "find /usr -exec echo {} \;" and see that you get one path per line, and that output begins almost instantly. Do you really believe that 'find' searched the entire /usr tree in that time?

Now, if the "{}" string appears more than once, then the command line contains that path more than once, but it is essentially impossible to exceed the kernel's MAX_ARG_PAGES this way. The only issue with using "-exec command {} ;" for a huge number of files is one of performance: if there are 100,000 matched files, the command will be invoked 100,000 times.

-- 
Bob Nichols     "NOSPAM" is really part of my email address.
                Do NOT delete it.
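[Editor's note: a small demonstration of the per-file behavior described above, using a hypothetical temporary directory rather than /usr. It also shows the POSIX "+" terminator, which batches paths per invocation the way xargs does.]

```shell
# Create a throwaway directory with three files.
d=$(mktemp -d)
touch "$d/one" "$d/two" "$d/three"

# ';' terminator: find runs echo once per matched file, so each
# invocation sees exactly one path -- three lines of output here.
find "$d" -type f -exec echo {} \; | wc -l

# '+' terminator: find packs as many paths as fit onto one command
# line, so a single echo prints all three paths on one line.  This
# avoids 100,000 separate fork/execs for 100,000 matched files.
find "$d" -type f -exec echo {} + | wc -l
```

The ";" form can never hit the argument-size limit, since each command line carries one path; the "+" form stays under the limit by construction, because find sizes its batches to fit.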