Robert Buck
2020-Oct-06 19:11 UTC
[Samba] Performance Question: Lots of Small Files vs One Large File
Is this a protocol issue? A decade ago I saw that writes to small files (less than 16 KB) were awful, because the cost of opening the file and other file operations dwarfed the cost of writing the actual content. So very small files should be slow, but these files are 10 MB each.

Or is this a Windows issue? If so, what's causing the problem? Just trying to understand.

On Tue, Oct 6, 2020 at 2:39 PM Ralph Boehme <slow at samba.org> wrote:

> On 10/6/20 at 7:49 PM, Robert Buck via samba wrote:
> > In an architecture where we have Samba running on RHEL exposing shares to
> > Windows, when we have 1 large 1GB file, the write performance to storage is
> > very fast, even over distances of 5000 miles. However, even writes to local
> > Samba servers, with 100 10MB files being copied onto a shared drive,
> > Windows Explorer is MUCH slower. I don't know if it's really Samba, but
> > more than likely Windows.
> >
> > Does anyone on channel have experience with copying multiple small files
> > onto shares? Are there any ways to make copying files onto a share faster?
>
> to save you a lot of rocking back and forth, yes, small files workloads
> will be slower. The smaller, the slower... :)
>
> -slow
>
> --
> Ralph Boehme, Samba Team https://samba.org/
> Samba Developer, SerNet GmbH https://sernet.de/en/samba/
> GPG-Fingerprint FAE2C6088A24252051C559E4AA1E9B7126399E46

--
BOB BUCK
SENIOR PLATFORM SOFTWARE ENGINEER
SKIDMORE, OWINGS & MERRILL
7 WORLD TRADE CENTER
250 GREENWICH STREET
NEW YORK, NY 10007
T (212) 298-9624
ROBERT.BUCK at SOM.COM
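The reasoning in this message (a fixed per-file cost that dwarfs the payload for tiny files but should matter less at 10 MB) can be written as a back-of-envelope model. The sketch below is only an illustration under a simple linear assumption: files are copied one after another, each paying a constant overhead plus its size divided by bandwidth. The function names and the parameters `per_file_overhead_s` and `bandwidth_bytes_per_s` are hypothetical, not values measured on any Samba setup.

```python
def transfer_time_s(n_files, file_size_bytes, per_file_overhead_s, bandwidth_bytes_per_s):
    """Estimated time to copy n_files sequentially.

    Assumes a linear model: every file pays a fixed overhead (open, close,
    metadata operations) plus its payload time at a constant bandwidth.
    """
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth_bytes_per_s must be positive")
    payload_time_s = file_size_bytes / bandwidth_bytes_per_s
    return n_files * (per_file_overhead_s + payload_time_s)


def overhead_fraction(file_size_bytes, per_file_overhead_s, bandwidth_bytes_per_s):
    """Share of the per-file time spent on fixed overhead instead of payload."""
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth_bytes_per_s must be positive")
    payload_time_s = file_size_bytes / bandwidth_bytes_per_s
    return per_file_overhead_s / (per_file_overhead_s + payload_time_s)
```

Under this model the overhead share shrinks as files grow, but it never reaches zero: if the per-file cost is large enough, even 10 MB files can spend a noticeable part of their time outside the actual write.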
Ralph Boehme
2020-Oct-06 19:37 UTC
[Samba] Performance Question: Lots of Small Files vs One Large File
On 10/6/20 at 9:11 PM, Robert Buck wrote:
> Is this a protocol issue? A decade ago I saw writes to small files less
> than 16k were awful, because the cost of opening and other file ops
> dwarfed writing actual content. So very small files should be slow, but
> these files are 10mb each.
>
> Or is this a Windows issue? If so, what's causing the problem?

Opening a file is expensive due to the Windows semantics emulation layer in Samba. It's also done sequentially in the server, which means there's not going to be much in-flight IO to keep the pipes full.

-slow

--
Ralph Boehme, Samba Team https://samba.org/
Samba Developer, SerNet GmbH https://sernet.de/en/samba/
GPG-Fingerprint FAE2C6088A24252051C559E4AA1E9B7126399E46
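One general way to get more IO in flight from the client side is to copy several files at once instead of one after another. The sketch below is a generic illustration of that idea using Python's standard library, not something suggested in this thread or tested against Samba; since the reply above says opens are handled sequentially in the server, any gain is uncertain and would need measuring. The function name `copy_files_concurrently` and the `max_workers` default are assumptions made for the example.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def copy_files_concurrently(src_dir, dst_dir, max_workers=8):
    """Copy every regular file in src_dir into dst_dir using a thread pool.

    Returns the list of destination paths, in sorted source order.
    Subdirectories are ignored to keep the sketch short.
    """
    src = Path(src_dir)
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    files = sorted(p for p in src.iterdir() if p.is_file())

    def copy_one(path):
        target = dst / path.name
        shutil.copyfile(path, target)  # copies file contents only
        return target

    # Each worker thread handles one file at a time, so up to max_workers
    # copies can be in progress at the same moment.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(copy_one, files))
```

Here `dst_dir` would be a path on the mounted share. Comparing `max_workers=1` against a larger value on the same set of files is a simple way to check whether concurrency helps at all in a given setup.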
Reindl Harald
2020-Oct-06 20:15 UTC
[Samba] Performance Question: Lots of Small Files vs One Large File
On 06.10.20 at 21:11, Robert Buck via samba wrote:
> Is this a protocol issue?
> Or is this a Windows issue?

No, it's a general issue even without networking, whether on Windows, Linux or Mac: a ton of small files is always a lot slower than one large chunk, and it has always been that way.

It's unavoidable overhead.

> On Tue, Oct 6, 2020 at 2:39 PM Ralph Boehme <slow at samba.org> wrote:
>
>> On 10/6/20 at 7:49 PM, Robert Buck via samba wrote:
>>> In an architecture where we have Samba running on RHEL exposing shares to
>>> Windows, when we have 1 large 1GB file, the write performance to storage is
>>> very fast, even over distances of 5000 miles. However, even writes to local
>>> Samba servers, with 100 10MB files being copied onto a shared drive,
>>> Windows Explorer is MUCH slower. I don't know if it's really Samba, but
>>> more than likely Windows.
>>>
>>> Does anyone on channel have experience with copying multiple small files
>>> onto shares? Are there any ways to make copying files onto a share faster?
>>
>> to save you a lot of rocking back and forth, yes, small files workloads
>> will be slower. The smaller, the slower... :)
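The claim that many small files are slower than one large chunk even without networking is easy to check on a local disk. The following is a minimal timing sketch, assuming a plain local filesystem and using `os.fsync` so each file is pushed to storage before the next one starts; the function names, file naming scheme and default sizes are made up for the example, and results will vary widely by filesystem and hardware.

```python
import os
import tempfile
import time
from pathlib import Path


def time_writes(directory, n_files, file_size_bytes):
    """Write n_files files of file_size_bytes each; return elapsed seconds."""
    payload = b"\0" * file_size_bytes
    start = time.perf_counter()
    for i in range(n_files):
        with open(Path(directory) / f"file_{i:05d}.bin", "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())  # force the data to storage before moving on
    return time.perf_counter() - start


def compare_one_large_vs_many_small(total_bytes=16 * 1024 * 1024, n_small=256):
    """Time one file of total_bytes against n_small files of equal total size.

    Returns (seconds_for_one_large_file, seconds_for_many_small_files).
    """
    with tempfile.TemporaryDirectory() as big_dir, \
            tempfile.TemporaryDirectory() as small_dir:
        big_s = time_writes(big_dir, 1, total_bytes)
        small_s = time_writes(small_dir, n_small, total_bytes // n_small)
    return big_s, small_s
```

Calling `compare_one_large_vs_many_small()` and printing the two values gives a rough local baseline. Running `time_writes` against a directory on the mounted share, with the same arguments, shows how much of the gap is added by the network path.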
Robert Buck
2020-Oct-06 23:08 UTC
[Samba] Performance Question: Lots of Small Files vs One Large File
We're finding the issue is ctdb. We're looking at alternatives and changes in architecture. If your team has ideas in addition to what we're thinking of, please let us know.

Thanks

On Tue, Oct 6, 2020 at 4:31 PM Reindl Harald <h.reindl at thelounge.net> wrote:

> On 06.10.20 at 21:11, Robert Buck via samba wrote:
> > Is this a protocol issue?
> > Or is this a Windows issue?
>
> no, it's a general issue even without networking, no matter windows,
> linux or mac - a ton of small files are always a lot slower than a large
> chunk and it was always that way
>
> it's unavoidable overhead
>
> > On Tue, Oct 6, 2020 at 2:39 PM Ralph Boehme <slow at samba.org> wrote:
> >
> >> On 10/6/20 at 7:49 PM, Robert Buck via samba wrote:
> >>> In an architecture where we have Samba running on RHEL exposing shares to
> >>> Windows, when we have 1 large 1GB file, the write performance to storage is
> >>> very fast, even over distances of 5000 miles. However, even writes to local
> >>> Samba servers, with 100 10MB files being copied onto a shared drive,
> >>> Windows Explorer is MUCH slower. I don't know if it's really Samba, but
> >>> more than likely Windows.
> >>>
> >>> Does anyone on channel have experience with copying multiple small files
> >>> onto shares? Are there any ways to make copying files onto a share faster?
> >>
> >> to save you a lot of rocking back and forth, yes, small files workloads
> >> will be slower. The smaller, the slower... :)