I hope someone familiar with the way Linux processes files can enlighten me on the following:

I recently replaced an old Windows 2000 server with a new machine running CentOS 5.2. It uses Samba 3.2.7 to serve a network of Windows XP clients.

We are a newspaper. We use Acrobat Distiller to batch-convert a folder of single-page PostScript files (for print) to a multipage PDF file (for electronic distribution). Running on a workstation, Distiller watches the folder on a Samba share and does the conversion, automatically creating bookmarks, indexes and other information.

On the Windows server, Distiller processes the files by filename order:

M09010901A001C.ps
M09010901A002C.ps
M09010901A003C.ps

... and so on.

On the Linux server, Distiller processes the files in an order that seems arbitrary, for example:

M09010901A021C.ps
M09010901A005C.ps
M09010901A015C.ps

... and so on.

The order Distiller uses is NOT related to the time stamp of the files. I tried to copy the files to the watched folder one by one in the correct order; the result is the same.

This creates the need to open the final PDF and reshuffle the pages by hand, which is very time-consuming and prone to error.

There is a workaround to this: use the runfilex script that comes with Acrobat: it can contain a list of files to convert, in the order you want. Unfortunately, this is not acceptable for us since the process then takes about 40 minutes (irrespective of platform or filesystem), instead of 3 or 4 minutes.

My question is: how is the order of files determined by Linux when a particular order is not explicitly required by a program?

I noted the following:

I have 4 files in a folder: file1.ps, file2.ps, file3.ps, file4.ps. When I order them by date, they appear in Windows Explorer in, say, the following order: 3, 4, 1, 2. If I copy them to a new folder one by one in the order 1, 2, 3, 4, they will still appear in the order 3, 4, 1, 2 when ordered by date. So, what information is transported with the files that makes the Linux server present them to the world in this order?

Does someone know a workaround to this situation or can someone point me to information about file ordering with Linux? By the way, I am using the EXT3 file system. I tried the same on a VFAT file system and the result is the same. It seems to be a Linux thing, not a file system thing.

Thank you for your patience.
On Thu, 22 Jan 2009 20:28:41 +0000 Miguel Medalha wrote:

> My question is: how is the order of files determined by Linux when a
> particular order is not explicitly required by a program?

http://www.linuxforums.org/forum/linux-newbie/111044-change-order-files-directory.html

I have no idea whether the script posted there works, but I found it with a quick Google search.
Miguel Medalha wrote:

> ...
> On the Linux server, Distiller processes the files in an order that
> seems arbitrary, for example:
>
> M09010901A021C.ps
> M09010901A005C.ps
> M09010901A015C.ps
>
> ... and so on.
>
> The order Distiller uses is NOT related to the time stamp of the files.
> I tried to copy the files to the watched folder one by one in the
> correct order; the result is the same.

Programs that read directories on their own normally find files in the order that they happen to appear in the directory. In a newly created directory, that would likely be the order in which the files were added, but in existing directories, slots previously used and now free may be reused in any order, and this may not be consistent across filesystem types.

If you are processing on the Linux side and not via Samba, and your program will take a list of files on the command line instead of groveling through the directory itself, you might simply start it with a wildcard filename on the command line. The shell sorts the list as it expands it, so programs see the sorted list.

> There is a workaround to this: use the runfilex script that comes with
> Acrobat: it can contain a list of files to convert, in the order you
> want. Unfortunately, this is not acceptable for us since the process
> then takes about 40 minutes (irrespective of platform or filesystem),
> instead of 3 or 4 minutes.

That's very strange. Maybe you should look for a different tool. Won't Ghostscript/psutils or OOo do this?

--
Les Mikesell
lesmikesell at gmail.com
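The difference Les describes, between reading a directory directly and receiving an already sorted list, can be sketched in Python. This is only an illustration: the function names are made up for this example, and a plain string sort stands in for the shell's wildcard expansion, which for names like these gives the same result.

```python
import os
import tempfile


def names_in_directory_order(directory):
    """Return names as the OS hands them back; the order is unspecified."""
    return os.listdir(directory)


def names_in_sorted_order(directory, suffix=".ps"):
    """Return the matching names sorted by name, roughly what a program
    would receive from a shell-expanded wildcard such as *.ps."""
    return sorted(name for name in os.listdir(directory) if name.endswith(suffix))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        # Create the files deliberately out of order.
        for page in (21, 5, 15, 1, 2, 3):
            open(os.path.join(workdir, "M09010901A%03dC.ps" % page), "w").close()
        print("directory order:", names_in_directory_order(workdir))
        print("sorted order:   ", names_in_sorted_order(workdir))
```

The first list may or may not come back sorted, depending on the filesystem; only the second is guaranteed to be in name order, because the program sorts it itself.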
Miguel Medalha wrote:

> I hope someone familiar with the way Linux processes files can enlighten
> me on the following:
> ...
> On the Windows server, Distiller processes the files by filename order:
>
> M09010901A001C.ps
> M09010901A002C.ps
> M09010901A003C.ps
>

Windows NTFS uses B-Tree for its directories so they are inherently alphabetically sorted.
On Thu, 2009-01-22 at 14:06 -0800, John R Pierce wrote:

> Miguel Medalha wrote:
> > I hope someone familiar with the way Linux processes files can enlighten
> > me on the following:
> > ...
> > On the Windows server, Distiller processes the files by filename order:
> >
> > M09010901A001C.ps
> > M09010901A002C.ps
> > M09010901A003C.ps
> >
>
> Windows NTFS uses B-Tree for its directories so they are inherently
> alphabetically sorted.

If the Linux FS is ext2, maybe the "dir_index" option of mke2fs will do what you want? See "man mke2fs". It says it uses hashed b-trees, but for speed.

> <snip>

HTH
--
Bill
On Thu, 2009-01-22 at 20:28 +0000, Miguel Medalha wrote:

> ...
> On the Windows server, Distiller processes the files by filename order:
>
> M09010901A001C.ps
> M09010901A002C.ps
> M09010901A003C.ps
>
> ... and so on.
>
> On the Linux server, Distiller processes the files in an order that
> seems arbitrary, for example:
>
> M09010901A021C.ps
> M09010901A005C.ps
> M09010901A015C.ps
>
> ... and so on.
> ...
> Does someone know a workaround to this situation or can someone point me
> to information about file ordering with Linux? By the way, I am using
> the EXT3 file system. I tried the same on a VFAT file system and the
> result is the same. It seems to be a Linux thing, not a file system thing.

----

You might want to look closely at the file names in Linux.

Windows is not case sensitive but Linux is.

In Windows, you cannot create the 2 files, TEST.DOC and test.doc in the same directory but in Linux you can. It may be that some of these files are stored differently, as in file1.ps and FILE2.PS etc.

Also, you might want to check out some alternate settings...

dos filemode = yes (share setting only)
case sensitive = no (share setting only)
default case = lower (share setting only)

Craig
> If the Linux FS is ext2, maybe the "dir_index" option of mke2fs will do
> what you want? See "man mke2fs". It says it uses hashed b-trees, but for
> speed.

That is the kind of information I am looking for. Thank you!
> You might want to look closely at the file names in Linux.
>
> Windows is not case sensitive but Linux is.
>
> In Windows, you cannot create the 2 files, TEST.DOC and test.doc in the
> same directory but in Linux you can. It may be that some of these files
> are stored differently as in file1.ps and FILE2.PS etc.
>
> Also, you might want to check out some alternate settings...
>
> dos filemode = yes (Share setting only)
> case sensitive = no (share setting only)
> default case = lower (share setting only)

I am aware of the differences in case treatment between Linux and Windows. This is not related to case. The filenames in question are automated and ALWAYS take the following form:

M09010901A001C.ps
M09010901A002C.ps
M09010901A003C.ps

etc, etc.

(By the way, the Samba share settings are not "share only". According to the man pages, share settings can be used globally. The inverse is not true: global settings can only be used globally.)

Thank you for your answer, though.
On Thu, Jan 22, 2009 at 3:20 PM, Les Mikesell <lesmikesell at gmail.com> wrote:

> The quick/dirty fix might be to cifs-mount a windows directory where the
> linux side wants to see it and let the windows side work natively if
> that gives the behavior you want. Using the automounter might help if
> the windows side is not always available.

If you go that route, this wiki has useful tips:

http://wiki.centos.org/TipsAndTricks/WindowsShares

See section "3. Even-better method"

Akemi
I just verified the filesystem features with tune2fs -l and the dir_index feature is already present. So, no luck here.
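The check described here, looking for dir_index in the output of tune2fs -l, could be scripted roughly as below. This is a sketch under assumptions: it relies on the "Filesystem features:" line that tune2fs -l prints, the helper names are invented for this example, and the sample text is illustrative rather than captured from the poster's server.

```python
import subprocess


def filesystem_features(tune2fs_listing):
    """Extract the feature names from the text printed by `tune2fs -l`."""
    for line in tune2fs_listing.splitlines():
        if line.startswith("Filesystem features:"):
            return line.split(":", 1)[1].split()
    return []


def has_dir_index(tune2fs_listing):
    """True if the listing reports the dir_index feature."""
    return "dir_index" in filesystem_features(tune2fs_listing)


def read_tune2fs_listing(device):
    """Run `tune2fs -l DEVICE` (normally needs root) and return its output."""
    result = subprocess.run(
        ["tune2fs", "-l", device],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Illustrative excerpt only; real output contains many more lines.
    sample = "Filesystem features:      has_journal dir_index filetype\n"
    print("dir_index enabled:", has_dir_index(sample))
```

On a real system one would pass the output of read_tune2fs_listing() for the device in question instead of the sample string.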
On Thu, Jan 22, 2009 at 08:28:41PM +0000, Miguel Medalha wrote:

> My question is: how is the order of files determined by Linux when a
> particular order is not explicitly required by a program?

There is no ordering in POSIX filesystems. If you want an ordered list you must sort it yourself. This isn't guaranteed in Windows either, btw.

Someone has posted a Samba VFS that will sort directory output in alphabetical order (but only for the current locale). You could examine that.

Jeremy.
> Someone has posted a Samba VFS that will sort directory
> output in alphabetical order (but only for the current
> locale). You could examine that.

http://www.mail-archive.com/samba@lists.samba.org/msg98048.html
On Thu, Jan 22, 2009 at 09:20:42PM -0500, John Drescher wrote:

> > Someone has posted a Samba VFS that will sort directory
> > output in alphabetical order (but only for the current
> > locale). You could examine that.
>
> http://www.mail-archive.com/samba@lists.samba.org/msg98048.html

FYI: This is still in my inbox to get into upstream :-)

Volker
> -----Original Message-----
> From: centos-bounces at centos.org
> [mailto:centos-bounces at centos.org] On Behalf Of Miguel Medalha
> Sent: Thursday, January 22, 2009 3:29 PM
> To: CentOS mailing list; samba at lists.samba.org
> Subject: [CentOS] OT? File order on CentOS/Samba server

http://code.google.com/p/samba-dirsort-vfs/

Did you try that? I think someone recommended it to you. If it does indeed work, which I do not think it will for your situation, send me a personal mail.

That said, I think your real problem lies in how your processing software orders the files. I would take a really good look at the software doing it. Why? Because The GIMP, which is OSS, handles file ordering with no problem.

JohnStanley
Miguel Medalha
2009-Jan-24 21:54 UTC
[CentOS] [Samba] OT? File order on CentOS/Samba server -- FINALLY SOLVED
I just turned dir_index OFF with tune2fs. Now the directory order is the same as the inode order. This makes the order of files predictable and in fact turns out to solve my problem.

With dir_index turned OFF on that filesystem, when a copy is made to another directory (even from Windows on a Samba share) the alphanumeric order is preserved. I will just ask the workstation operators to copy the PS files to a new folder when they are all ready. Distiller is watching that folder and will process the files in the normal way, using the rundirex file.

This solution is even better than the initial situation: since we can now predict the order in which the pages will be processed, we can manipulate the order at will by doing multi-phased copies to the folder, in any order we want, instead of being limited to the alphanumeric one provided by NTFS :-)

So "dir_index ON" (and my ignorance of the inner workings of EXT3) was to blame for this confusion, from the beginning!

What a trip this was (sometimes in circles)! Thank you very much to all who contributed! Great community!
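The copy step described above, placing the finished PS files into a fresh folder one at a time in name order, could be automated along these lines. This is a sketch, not the poster's actual procedure: the function name and the directory names are hypothetical, and whether the watching program then sees the files in copy order depends on the filesystem behaving as reported in this message (ext3 with dir_index turned off).

```python
import os
import shutil
import tempfile


def copy_in_name_order(source_dir, watched_dir, suffix=".ps"):
    """Copy matching files one at a time, in name order.

    Returns the list of names in the order they were copied.
    """
    os.makedirs(watched_dir, exist_ok=True)
    names = sorted(n for n in os.listdir(source_dir) if n.endswith(suffix))
    for name in names:
        shutil.copy2(os.path.join(source_dir, name),
                     os.path.join(watched_dir, name))
    return names


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as base:
        staging = os.path.join(base, "staging")   # hypothetical staging folder
        watched = os.path.join(base, "watched")   # hypothetical watched folder
        os.makedirs(staging)
        for page in (3, 1, 2):
            open(os.path.join(staging, "M09010901A%03dC.ps" % page), "w").close()
        print(copy_in_name_order(staging, watched))
```

To impose a different page order, as the message suggests is now possible, one would replace the sorted() call with whatever sequence is wanted.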
Miguel Medalha
2009-Jan-25 18:43 UTC
[CentOS] OT? File order on CentOS/Samba server -- SOLVED (kind of...)
>>> Well, I did try to compile it but make fails on all the Linux computers
>>> I have access to. They all run CentOS 5.2. It would be nice to have a
>>> .rpm... I am a sysadmin, not a programmer, I am not able to solve most
>>> compile errors.
>
> I will have a hack at compiling it later on because I am very
> interested in it. If I manage to get it rolling I will send out a mail
> to you and update the thread here on the list. I have had great
> success with the clamav vfs module.

That would be GREAT! Thank you!