On Mon, Feb 16, 2015 at 6:47 AM, Eliezer Croitoru <eliezer at ngtech.co.il> wrote:
> I am unsure I understand what you wrote.
> "XFS will create multiple AG's across all of those devices,"
> Are you comparing md linear/concat to md raid0, and will the upper-level
> XFS run on top of them?

Yes to the first question; I don't quite understand the second question.
Allocation groups are created at mkfs time. When the workload I/O involves
a lot of concurrency, XFS over linear will beat XFS or ext4 over raid0,
whereas for streaming workloads, striped raid will work better. If
redundancy is needed, mdadm permits creating raid1 + linear as an
alternative to raid10.
http://xfs.org/docs/xfsdocs-xml-dev/XFS_Filesystem_Structure/tmp/en-US/html/Allocation_Groups.html

You can think of XFS on linear as something like raid0 at the file level
rather than at the block level. On a completely empty file system, if you
start copying dozens or more (typically hundreds or thousands) of files in
mail directories, XFS distributes them across AGs, and hence across all
drives, in parallel. ext4 would for the most part focus all writes on the
first device until it is mostly full, then the 2nd device, then the 3rd.
And on raid0 you get a bunch of disk contention that isn't really
necessary, because everyone's files are striped across all drives.

So contrary to the popular opinion that XFS is mainly useful for large
files, it's actually quite useful for concurrent read/write workloads of
small files on a many-disk linear/concat arrangement. This extends to
using raid1 + linear instead of raid10 if some redundancy is desired.

--
Chris Murphy
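For anyone who wants to try the layout described above, here is a minimal
sketch of raid1 + linear with XFS on top. The device names (/dev/sdb
through /dev/sde), array names, and mount point are hypothetical
placeholders; substitute your own hardware.

# Two raid1 mirror pairs (placeholder member devices).
mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdb /dev/sdc
mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sdd /dev/sde

# Concatenate the mirrors with md linear instead of striping them (raid10).
mdadm --create /dev/md0 --level=linear --raid-devices=2 /dev/md1 /dev/md2

# mkfs.xfs picks an allocation group count automatically; it can be raised
# with -d agcount=N so AGs land on every underlying device.
mkfs.xfs /dev/md0
mount /dev/md0 /srv/mail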
Thanks Chris for the detailed response!

I couldn't understand the complex sentence about XFS and was almost
convinced that XFS might offer a new way to spread data across multiple
disks; in this case the confusion was mainly mine, not yours. Now I
understand how an md linear/concat array can be exploited with XFS!

Not directly related, but given that XFS has commercial support, that can
be an advantage over other file systems which are built to handle lots of
small files but might not have commercial support.

Eliezer

On 16/02/2015 19:21, Chris Murphy wrote:
> So contrary to the popular opinion that XFS is mainly useful for large
> files, it's actually quite useful for concurrent read/write workloads of
> small files on a many-disk linear/concat arrangement. This extends to
> using raid1 + linear instead of raid10 if some redundancy is desired.
On Mon, Feb 16, 2015 at 10:21 AM, Chris Murphy <lists at colorremedies.com> wrote:
> On Mon, Feb 16, 2015 at 6:47 AM, Eliezer Croitoru <eliezer at ngtech.co.il> wrote:
>
>> I am unsure I understand what you wrote.
>> "XFS will create multiple AG's across all of those devices,"
>> Are you comparing md linear/concat to md raid0, and will the upper-level
>> XFS run on top of them?
>
> Yes to the first question; I don't quite understand the second question.
> Allocation groups are created at mkfs time. When the workload I/O involves
> a lot of concurrency, XFS over linear will beat XFS or ext4 over raid0,
> whereas for streaming workloads, striped raid will work better. If
> redundancy is needed, mdadm permits creating raid1 + linear as an
> alternative to raid10.
> http://xfs.org/docs/xfsdocs-xml-dev/XFS_Filesystem_Structure/tmp/en-US/html/Allocation_Groups.html
>
> You can think of XFS on linear as something like raid0 at the file level
> rather than at the block level. On a completely empty file system, if you
> start copying dozens or more (typically hundreds or thousands) of files in
> mail directories, XFS distributes them across AGs, and hence across all
> drives, in parallel. ext4 would for the most part focus all writes on the
> first device until it is mostly full, then the 2nd device, then the 3rd.
> And on raid0 you get a bunch of disk contention that isn't really
> necessary, because everyone's files are striped across all drives.
>
> So contrary to the popular opinion that XFS is mainly useful for large
> files, it's actually quite useful for concurrent read/write workloads of
> small files on a many-disk linear/concat arrangement. This extends to
> using raid1 + linear instead of raid10 if some redundancy is desired.

The other plus is that growing linear arrays is cake. The new device just
gets added to the end of the concat and xfs_growfs is run; it takes less
than a minute. Growing md raid0, by contrast, means converting to raid4,
adding the device, and then converting back to raid0. Further, a linear
grow can use a drive of any size, whereas with raid0 the drive sizes must
all be the same.

--
Chris Murphy
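A rough sketch of the grow procedure described above, assuming an existing
md linear array /dev/md0 mounted at /srv/mail and a new (hypothetical)
disk /dev/sdf:

# Append the new device to the end of the linear array.
mdadm --grow /dev/md0 --add /dev/sdf

# Grow the mounted XFS filesystem into the new space (takes the mount point).
xfs_growfs /srv/mail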
On Mon, Feb 16, 2015 at 10:48 AM, Eliezer Croitoru <eliezer at ngtech.co.il> wrote:
> Thanks Chris for the detailed response!
>
> I couldn't understand the complex sentence about XFS and was almost
> convinced that XFS might offer a new way to spread data across multiple
> disks.

It's not new in XFS; it has behaved like this forever. But it is new in
the sense that other filesystems don't allocate this way; I'm not aware of
any other filesystem that does. Btrfs could possibly be revised to do this
somewhat more easily than other filesystems, since it also has a concept
of allocation chunks. Right now its single data profile allocates in 1GB
chunks; once a chunk is full, the next 1GB chunk goes on the next device
in a sequence mainly determined by free space. This is how it's able to
use different sized devices (including in raid0, 1, 5, and 6). So it can
read files from multiple drives at the same time, but it tends to write to
only one drive at a time (unless one of the striping raid-like profiles is
used).

--
Chris Murphy
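To see this allocation behaviour, a multi-device btrfs filesystem with the
single data profile can be created and inspected; the device names and
mount point below are placeholders:

# Metadata mirrored (raid1), data in the "single" profile: one copy, with
# each data chunk allocated on the device with the most free space.
mkfs.btrfs -m raid1 -d single /dev/sdb /dev/sdc /dev/sdd
mount /dev/sdb /mnt

# After writing some data, show per-device allocation and per-profile usage.
btrfs filesystem show /mnt
btrfs filesystem df /mnt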
On 16/02/2015 22:29, Chris Murphy wrote:
> The other plus is that growing linear arrays is cake. The new device just
> gets added to the end of the concat and xfs_growfs is run; it takes less
> than a minute. Growing md raid0, by contrast, means converting to raid4,
> adding the device, and then converting back to raid0. Further, a linear
> grow can use a drive of any size, whereas with raid0 the drive sizes must
> all be the same.

Nice! I have been learning about md arrays and have seen the details of
the grow operation, but this is another aspect I hadn't thought about at
first. For now I am not planning any storage, but it might come in handy
later on.

Thanks,
Eliezer