Sorry to insist, but I am not aware of a small file problem with ZFS (which doesn't mean there isn't one, nor that we agree on the definition of 'problem'). So if anyone has data on this topic, I'm interested.

Also note, ZFS does a lot more than VxFS.

-r

Claude Teissedre writes:
> Hello Roch,
>
> Thanks for your reply. According to Iozone and Filebench
> (http://blogs.sun.com/dom/), ZFS is less performant than VxFS for small
> files and more performant for large files. In your blog, I don't see
> specific info related to small files - but it's a very interesting blog.
>
> Any help from CC: people related to a Perforce benchmark (not in
> techtracker) is welcome.
>
> Thanks,
> Claude
>
> Roch - PAE wrote:
> > Hi Claude.
> > For this kind of query, try zfs-discuss at opensolaris.org;
> > looks like a common workload to me.
> > I know of no small file problem with ZFS.
> > You might want to state your metric of success?
> >
> > -r
> >
> > Claude Teissedre writes:
> > > Hello,
> > >
> > > I am looking for any benchmark of Perforce on ZFS.
> > > My need here is specifically for Perforce, a source manager. At my
> > > ISV, it handles 250 users simultaneously (15 instances on average)
> > > and 16 million (small) files. That's an area not covered in the
> > > benchmarks I have seen.
> > >
> > > Thanks, Claude
Roch, what's the minimum allocation size for a file in ZFS? I get 1024B by my calculation (1 x 512B block allocation (minimum) + 1 x 512B inode/znode allocation), since we never pack file data in the inode/znode.

Is this a problem? Only if you're trying to pack a lot of small files into a limited amount of space, or if you're concerned about accessing many small files quickly. VxFS has a 96B "immediate area" for file, symlink, or directory data; NTFS can store small files in the MFT record; NetApp WAFL can also store small files in the 4KB inode (16 block pointers = 128B?). If you look at some of the more recent OSD papers and some of the Lustre/BlueArc work, you'll see that this topic comes into play for performance in pre-fetching file data, and in locality issues when optimizing heavy access of many small files.

--- .je

On Feb 20, 2007, at 05:12, Roch - PAE wrote:
> Sorry to insist, but I am not aware of a small file problem
> with ZFS (which doesn't mean there isn't one, nor that we
> agree on the definition of 'problem'). So if anyone has data on
> this topic, I'm interested.
>
> Also note, ZFS does a lot more than VxFS.
>
> -r

_______________________________________________
zfs-discuss mailing list
zfs-discuss at opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
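To put rough numbers on the allocation overhead discussed above, here is a back-of-the-envelope sketch. The 512B block and 512B dnode figures and the 16 million file count come from this thread; the "inline" comparison (data packed entirely in the inode, as VxFS/NTFS/WAFL can do for small enough files) is an idealized assumption for illustration, not a claim about any real filesystem's exact overhead:

```python
# Back-of-the-envelope space estimate for many tiny files, using the
# figures from this thread: ZFS needs at least one 512B data block
# plus a 512B dnode per file, i.e. 1024B minimum per file.
def zfs_min_footprint(n_files, block=512, dnode=512):
    """Minimum bytes consumed by n_files tiny files (thread's estimate)."""
    return n_files * (block + dnode)

def inline_footprint(n_files, dnode=512):
    """Idealized case: file data packed entirely inside the inode/dnode,
    so no separate data block is allocated."""
    return n_files * dnode

n = 16_000_000  # Claude's 16 million small files
print(f"ZFS minimum: {zfs_min_footprint(n) / 2**30:.1f} GiB")
print(f"Inline data: {inline_footprint(n) / 2**30:.1f} GiB")
```

For sub-sector files the difference is a flat 2x in this model, which is why the packing question only matters when the file count is very large.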
On Feb 20, 2007, at 15:05, Krister Johansen wrote:
>> what's the minimum allocation size for a file in ZFS? I get 1024B by
>> my calculation (1 x 512B block allocation (minimum) + 1 x 512B
>> inode/znode allocation), since we never pack file data in the
>> inode/znode. Is this a problem? [...]
>
> This is configurable on a per-dataset basis. Look in zfs(1m) for
> recordsize.

The minimum is still 512B .. (try creating a bunch of 10B files - in zdb they show up as ZFS plain files, each with a 512B data block)

>> VxFS has a 96B "immediate area" for file, symlink, or directory
>> data; NTFS can store small files in the MFT record; NetApp WAFL can
>> also store small files in the 4KB inode [...]
>
> ZFS has something similar. It's called a bonus buffer.

I see .. but currently we're only storing symbolic links there, since given the bonus buffer size of 320B minus the znode_phys struct of 264B, we've only got 56B left for data in the 512B dnode_phys struct. I'm thinking we might want to trade off some of the uint64_t meta attributes for something smaller, and maybe eat into the pad, to get a bigger data buffer. Of course that will also affect the reporting end of things, but that should be easily fixable.

just my 2p

--- .je
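The byte budget in that last paragraph can be laid out explicitly. The sizes below are the ones quoted in the thread (they may differ across ZFS versions), and the 96B figure is VxFS's immediate area mentioned earlier:

```python
# Byte budget for inlining file data in the ZFS dnode, using the
# struct sizes quoted in this thread (version-dependent in practice).
DNODE_PHYS = 512   # on-disk dnode_phys size
BONUS_BUF  = 320   # bonus buffer carried inside the dnode
ZNODE_PHYS = 264   # znode_phys stored in the bonus buffer

inline_room = BONUS_BUF - ZNODE_PHYS
print(f"Room left for inline data: {inline_room}B")  # 56B

# Even VxFS's modest 96B immediate area doesn't fit in that space,
# which is why (per the thread) only short symlinks are stored there.
print(f"Shortfall vs VxFS's 96B immediate area: {96 - inline_room}B")
```

Shrinking the uint64_t attributes or eating into the pad, as suggested above, would grow `inline_room` without changing the 512B dnode itself.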
So Jonathan, you have a concern about the on-disk space efficiency for small (more or less sub-sector) files. That is a problem we can throw rust at. I am not sure this is the basis of Claude's concern, though.

On creating small files: last week I ran a small test. With ZFS I can create 4600 files _and_ sync the pool to disk with no more than 500 I/Os. I'm no FS expert, but this looks absolutely amazing to me (OK, I'm rather enthusiastic in general). Logging UFS needs 1 I/O per file (so ~10X more for my test). I don't know where other filesystems are on that metric.

I also pointed out that ZFS is not too CPU-efficient at tiny write(2) syscalls. But this inefficiency fades out around 8K writes. This is a CPU benchmark (I/O is a non-factor):

	CHUNK	ZFS vs UFS
	1B	4X slower
	1K	2X slower
	8K	25% slower
	32K	equal
	64K	30% faster

Waiting for a more specific problem statement, I can only stick to what I said: I know of no small file problems with ZFS. If there is one, I'd just like to see the data.

-r
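A benchmark of that shape is easy to reproduce. The sketch below is not Roch's actual test, just a minimal way to measure per-syscall CPU cost at different write(2) sizes; the total size and chunk sizes are illustrative, and a serious run would keep I/O out of the picture (e.g. by writing to a memory-backed filesystem) and compare the same run on each filesystem:

```python
# Minimal chunk-size microbenchmark sketch: write the same total number
# of bytes with different write(2) sizes and compare wall time, so the
# per-syscall overhead dominates at small chunks.
import os
import tempfile
import time

TOTAL = 8 * 1024 * 1024  # bytes written per run

def time_chunked_writes(chunk):
    """Seconds to write TOTAL bytes in `chunk`-sized write(2) calls."""
    buf = b"x" * chunk
    fd, path = tempfile.mkstemp()
    try:
        start = time.perf_counter()
        for _ in range(TOTAL // chunk):
            os.write(fd, buf)  # one write(2) syscall per chunk
        return time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)

for chunk in (1024, 8192, 32768, 65536):
    print(f"{chunk:>6}B chunks: {time_chunked_writes(chunk):.3f}s")
```

Run against ZFS and UFS mount points in turn, the per-chunk ratios would give a table like the one above.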
Perforce is based upon Berkeley DB (some early version), so standard "database XXX on ZFS" techniques are relevant - for example, putting the journal file on a different disk than the table files. There are several threads about optimizing databases under ZFS.

If you need a screaming Perforce server, talk to IC Manage, Inc., who are a VAR of Perforce. They have also added the ability to do remote replication, etc., so you can have servers local to the end users in an enterprise environment.

It seems to me that the network is usually the limiting factor in Perforce transactions, though operations like "fstat" and "have" shouldn't be overused because they are very taxing on the tables. Later Perforce versions have reduced the amount of table and record locking that goes on, so you might find improvement just by upgrading both servers and clients (the server operations downgrade to match the version of the client).

All this said, I'd love to see experiments done with Perforce on ZFS. It would help us all tune ZFS for these kinds of applications.

Gary

This message posted from opensolaris.org