There is various information about:
- enterprise-class drives (either SAS or just enterprise SATA)
- the SCSI/SAS protocols themselves vs SATA

having more advanced features (e.g. for dealing with error conditions) than
the average block device.

For example, Adaptec recommends that such drives will work better with their
hardware RAID cards:

http://ask.adaptec.com/cgi-bin/adaptec_tic.cfg/php/enduser/std_adp.php?p_faqid=14596
"Desktop class disk drives have an error recovery feature that will result in
a continuous retry of the drive (read or write) when an error is encountered,
such as a bad sector. In a RAID array this can cause the RAID controller to
time-out while waiting for the drive to respond."

and this blog:
http://www.adaptec.com/blog/?p=901
"major advantages to enterprise drives (TLER for one) ... opt for the
enterprise drives in a RAID environment no matter what the cost of the drive
over the desktop drive"

My questions:

- does btrfs RAID1 actively use the more advanced features of these drives,
  e.g. to work around errors without getting stuck on a bad block?
- if a non-RAID SAS card is used, does it matter which card is chosen? Does
  btrfs work equally well with all of them?
- ignoring the better MTBF and seek times of these drives, do any of the
  other features passively contribute to a better RAID experience when using
  btrfs?
- for someone using SAS or enterprise SATA drives with Linux, I understand
  btrfs gives the extra benefit of checksums; are there any other specific
  benefits over using mdadm or dmraid?

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
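The time-out scenario the Adaptec FAQ describes can be sketched as a toy model. Everything here is illustrative: the timing constants are assumptions (real controller and drive timeouts vary by vendor), and the function is not anyone's actual firmware logic. The point is only the interaction: a desktop drive that retries a bad sector for minutes exceeds the controller's patience and gets dropped from the array, while a TLER/ERC drive gives up quickly and reports an error so the controller can read the mirror.

```python
# Toy model of the TLER interaction described in the FAQ quote above.
# All timing values are assumptions for illustration, not vendor specs.

CONTROLLER_TIMEOUT_S = 8   # assumed controller patience before dropping a drive
DESKTOP_RETRY_S = 120      # desktop drives may retry a bad sector for minutes
TLER_LIMIT_S = 7           # TLER/ERC caps error recovery (commonly ~7 s)

def read_bad_sector(error_recovery_limit_s):
    """Return (seconds_spent, outcome) for a read that hits a bad sector."""
    time_spent = min(error_recovery_limit_s, DESKTOP_RETRY_S)
    if time_spent > CONTROLLER_TIMEOUT_S:
        # Drive is still retrying internally when the controller gives up:
        # a healthy-but-slow drive gets kicked from the array.
        return CONTROLLER_TIMEOUT_S, "controller timeout: drive kicked from array"
    # Drive reports the error promptly; controller can use the mirror copy.
    return time_spent, "drive reports I/O error: controller reads mirror instead"

print(read_bad_sector(DESKTOP_RETRY_S))  # desktop drive
print(read_bad_sector(TLER_LIMIT_S))     # enterprise/TLER drive
```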
On Wednesday 09 of May 2012 22:01:49 Daniel Pocock wrote:
> There is various information about
> - enterprise-class drives (either SAS or just enterprise SATA)
> - the SCSI/SAS protocols themselves vs SATA
> having more advanced features (e.g. for dealing with error conditions)
> than the average block device
>
> For example, Adaptec recommends that such drives will work better with
> their hardware RAID cards:
>
> http://ask.adaptec.com/cgi-bin/adaptec_tic.cfg/php/enduser/std_adp.php?p_faqid=14596
> "Desktop class disk drives have an error recovery feature that will
> result in a continuous retry of the drive (read or write) when an
> error is encountered, such as a bad sector. In a RAID array this can
> cause the RAID controller to time-out while waiting for the drive to
> respond."
>
> and this blog:
> http://www.adaptec.com/blog/?p=901
> "major advantages to enterprise drives (TLER for one) ... opt for the
> enterprise drives in a RAID environment no matter what the cost of the
> drive over the desktop drive"
>
> My question..
>
> - does btrfs RAID1 actively use the more advanced features of these
> drives, e.g. to work around errors without getting stuck on a bad block?

There are no (short) timeouts that I know of.

> - if a non-RAID SAS card is used, does it matter which card is chosen?
> Does btrfs work equally well with all of them?

If you're using btrfs RAID, you need an HBA, not a RAID card. If the RAID
card can work as an HBA (usually labelled as JBOD mode) then you're good to
go. For example, HP CCISS controllers can't work in JBOD mode.

If you're using the RAID feature of the card, then you need to look at
general Linux support; btrfs doesn't do anything other filesystems don't do
with the block devices.

> - ignoring the better MTBF and seek times of these drives, do any of the
> other features passively contribute to a better RAID experience when
> using btrfs?

Whether they really have high MTBF values is debatable...
Seek times do matter very much to btrfs. A fast CPU is also a good thing to
have with btrfs, especially if you want to use data compression or high node
or leaf sizes.

> - for someone using SAS or enterprise SATA drives with Linux, I
> understand btrfs gives the extra benefit of checksums, are there any
> other specific benefits over using mdadm or dmraid?

Because btrfs knows when the drive is misbehaving (because of checksums) and
is returning bad data, it can detect problems much faster than RAID (which
doesn't use the redundancy for checking whether the data it's returning is
actually correct). Both hardware and software RAID implementations depend on
the drives to return IO errors. In effect, the data is safer on btrfs than
on regular RAID.

Besides that, there is online resize (both shrinking and extending) and the
(currently not implemented) ability to set the redundancy level on a
per-file basis. In other words, with btrfs you could have a file with RAID6
redundancy and a second one with RAID10 redundancy in a single directory.

Regards,
--
Hubert Kario
QBS - Quality Business Software
02-656 Warszawa, ul. Ksawerów 30/85
tel. +48 (22) 646-61-51, 646-74-24
www.qbs.com.pl
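The difference described above can be sketched in a few lines of illustrative Python. This is not btrfs code: the function names are invented, and plain CRC32 stands in for btrfs's actual crc32c checksum. It only models the logic: a plain RAID1 read trusts whichever drive it happens to ask (so silent corruption passes through), while a checksum-verifying read detects the mismatch, falls back to the mirror, and can rewrite the bad copy.

```python
# Illustrative model (not btrfs code) of checksum-verified RAID1 reads
# versus plain RAID1 reads. CRC32 stands in for btrfs's crc32c.
import zlib

def store(data: bytes):
    """Write two mirror copies plus a checksum, as a checksumming FS would."""
    return {"copies": [bytearray(data), bytearray(data)],
            "csum": zlib.crc32(data)}

def raid1_read(blk, preferred=0):
    """Plain RAID1: trusts whatever the preferred drive returns, as long
    as the drive itself reports no I/O error."""
    return bytes(blk["copies"][preferred])

def checksummed_read(blk, preferred=0):
    """Checksum-verified read: on mismatch, fall back to the mirror and
    rewrite the bad copy with the good data."""
    for i in (preferred, 1 - preferred):
        data = bytes(blk["copies"][i])
        if zlib.crc32(data) == blk["csum"]:
            blk["copies"][preferred] = bytearray(data)  # repair bad copy
            return data
    raise IOError("both copies corrupt")

blk = store(b"important data")
blk["copies"][0][0] ^= 0xFF        # silent corruption on drive 0
print(raid1_read(blk))             # returns the corrupt copy, no error raised
print(checksummed_read(blk))       # returns the good copy from the mirror
```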
Daniel Pocock posted on Wed, 09 May 2012 22:01:49 +0000 as excerpted:

> There is various information about
> - enterprise-class drives (either SAS or just enterprise SATA)
> - the SCSI/SAS protocols themselves vs SATA having more advanced
> features (e.g. for dealing with error conditions)
> than the average block device

This isn't a direct answer to that, but expressing a bit of concern over the
implications of your question: that you're planning on using btrfs in an
enterprise-class installation.

While various Enterprise Linux distributions do now officially "support"
btrfs, it's worth checking out exactly what that means in practice.

Meanwhile, in mainline Linux kernel terms, btrfs remains very much an
experimental filesystem, as expressed by the kernel config option that turns
btrfs on. It's still under very intensive development, with an error-fixing
btrfsck only recently available and still coming with its own "may make the
problems worse instead of fixing them" warning. Testers willing to risk the
chance of data loss implied by that "experimental filesystem" label should
be running the latest stable kernel at the oldest, and preferably the rcs by
rc5 or so, as new kernels continue to fix problems in older btrfs code as
well as introduce new features; if you're running an older kernel, that
means you're running a kernel with known problems that are fixed in the
latest kernel.

Experimental also has implications in terms of backups. A good sysadmin
always has backups, but normally the working copy can be considered the
primary copy, and there's backups of that. On an experimental filesystem
under as intense continued development as btrfs, by contrast, it's best to
consider your btrfs copy an extra "throwaway" copy only intended for
testing. You still have your primary copy, along with all the usual
backups, on something less experimental, since you never know
when/where/how your btrfs testing will screw up its copy.
That's not normally the kind of filesystem "enterprise class" users are
looking for, unless of course they're doing longer-term testing, with an
intent to actually deploy perhaps a year out, if the testing proves it
robust enough by then.

And while it's still experimental ATM, btrfs /is/ fast improving. It /does/
now have a working fsck, even if it still comes with warnings, and
reasonable feature-set build-out should be within a few more kernels
(raid5/6 mode is roadmapped for 3.5, and n-way-mirroring raid1/10 is
roadmapped after that; the current "raid1" mode is only 2-way mirroring,
regardless of the number of drives). After that, the focus should turn
toward full stabilization.

So while btrfs is currently intended for testers only, by around the end of
the year or early next, it will likely be reasonably stable and ready for at
least the more adventurous conventional users. Still, enterprise-class users
tend to be a conservative bunch, and I'd be surprised if they really
consider btrfs ready before mid-year next year, at the earliest.

So if you're looking to test btrfs on enterprise-class hardware, great! But
do be aware of what you're getting into. If you have an enterprise distro
which supports it too, even greater, but know what that actually means. Does
it mean they support the same level of 9s uptime on it as they normally do,
or just that they're ready to accept payment to try and recover things if
something goes wrong?

If that hasn't scared you off, and you've not read the wiki yet, that's
probably the next thing you should look at, as it answers a lot of questions
you may have, as well as some you wouldn't think to ask. Being a wiki, of
course, your own contributions are welcome. In particular, you may well be
able to cover some of the enterprise-class viewpoint questions you're asking
based on your own testing, once you get to that point.

https://btrfs.wiki.kernel.org/

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
Martin Steigerwald
2012-May-11 16:58 UTC
Re: btrfs RAID with enterprise SATA or SAS drives
On Friday, 11 May 2012, Duncan wrote:
> Daniel Pocock posted on Wed, 09 May 2012 22:01:49 +0000 as excerpted:
> > There is various information about
> > - enterprise-class drives (either SAS or just enterprise SATA)
> > - the SCSI/SAS protocols themselves vs SATA having more advanced
> > features (e.g. for dealing with error conditions)
> > than the average block device
>
> This isn't a direct answer to that, but expressing a bit of concern
> over the implications of your question, that you're planning on using
> btrfs in an enterprise class installation.
>
> While various Enterprise Linux distributions do now officially
> "support" btrfs, it's worth checking out exactly what that means in
> practice.
>
> Meanwhile, in mainline Linux kernel terms, btrfs remains very much an
> experimental filesystem, as expressed by the kernel config option that
> turns btrfs on. It's still under very intensive development, with an
> error-fixing btrfsck only recently available and still coming with its
> own "may make the problems worse instead of fixing them" warning.
> Testers willing to risk the chance of data loss implied by that
> "experimental filesystem" label should be running the latest stable
> kernel at the oldest, and preferably the rcs by rc5 or so, as new
> kernels continue to fix problems in older btrfs code as well as
> introduce new features and if you're running an older kernel, that
> means you're running a kernel with known problems that are fixed in
> the latest kernel.
>
> Experimental also has implications in terms of backups. A good
> sysadmin always has backups, but normally, the working copy can be
> considered the primary copy, and there's backups of that. On an
> experimental filesystem under as intense continued development as
> btrfs, by contrast, it's best to consider your btrfs copy an extra
> "throwaway" copy only intended for testing.
> You still have your
> primary copy, along with all the usual backups, on something less
> experimental, since you never know when/where/how your btrfs testing
> will screw up its copy.

Duncan, did you actually test BTRFS? Theory can't replace real-life
experience.

Of all my personal BTRFS installations not one has gone corrupt - and I have
at least four, while more of them are in use at my employer. Except maybe a
scratch-data BTRFS RAID 0 over lots of SATA disks; but maybe it would have
been fixable by btrfs-zero-log, which I didn't know of back then. Another
one needed a btrfs-zero-log, but that was quite some time ago.

Some of the installations have been in use for more than a year, AFAIR.

While I would still be reluctant to deploy BTRFS for a customer for critical
data, and I think Oracle's and SUSE's move to support it officially is a bit
daring, I don't think BTRFS is in a "throwaway copy" state anymore. As
usual, regular backups are important…

--
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7
Martin Steigerwald posted on Fri, 11 May 2012 18:58:05 +0200 as excerpted:

> Am Freitag, 11. Mai 2012 schrieb Duncan:
>> Daniel Pocock posted on Wed, 09 May 2012 22:01:49 +0000 as excerpted:
>>> There is various information about - enterprise-class drives
>>
>> This isn't a direct answer to that, but expressing a bit of concern
>> over the implications of your question, that you're planning on using
>> btrfs in an enterprise class installation.
>>
>> [In] mainline Linux kernel terms, btrfs remains very much an
>> experimental filesystem
>>
>> On an experimental filesystem under as intense continued development as
>> btrfs, by contrast, it's best to consider your btrfs copy an extra
>> "throwaway" copy only intended for testing. You still have your
>> primary copy, along with all the usual backups, on something less
>> experimental, since you never know when/where/how your btrfs testing
>> will screw up its copy.
>
> Duncan, did you actually test BTRFS? Theory can't replace real-life
> experience.

I /had/ been waiting until the n-way-mirrored raid1 roadmapped for after
raid5/6 mode (which should hit 3.5, I believe), but hardware issues
intervened and I'm no longer using those older 4-way md/raid drives as
primary. And now that I have it, present personal experience does not
contradict what I posted. btrfs does indeed work reasonably well under
reasonably good, non-stressful conditions. But my experience so far aligns
quite well with the "consider the btrfs copy a throw-away copy, just in
case" recommendation.
Just because it's a throw-away copy doesn't mean you'll have to resort to
the "good" copy elsewhere, but it DOES hopefully mean that you'll have both
a "good" copy elsewhere, and a backup for that supposedly good copy, just in
case btrfs does go bad and that supposedly good primary copy ends up not
being good after all.

> From all of my personal BTRFS installations not one has gone corrupt -
> and I have at least four, while more of them are in use at my employer.
> Except maybe a scratch-data BTRFS RAID 0 over lots of SATA disks. But
> maybe it would have been fixable by btrfs-zero-log which I didn't know
> of back then. Another one needed a btrfs-zero-log, but that was quite
> some time ago.
>
> Some of the installations are in use for more than a year AFAIR.
>
> While I would still be reluctant with deploying BTRFS for a customer for
> critical data

This was actually my point in this thread. If someone's asking questions
about enterprise-quality hardware, they're not likely to run into some of
the bugs I've been having recently that have been exposed by hardware
issues. However, they're also far more likely to be considering btrfs for a
row-of-nines uptime application, which is, after all, where some of btrfs'
features are normally found.

Regardless of whether btrfs is past the "throw-away-data experimental
class" stage or not, I think we both agree it isn't ready for
row-of-nines-uptime applications just yet. If he's just testing btrfs on
such equipment for possible future row-of-nines-uptime deployment a year or
possibly two out, great. If he's looking at such a deployment two months
out, no way, and it looks like you agree.

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."
Richard Stallman
>> - if a non-RAID SAS card is used, does it matter which card is chosen?
>> Does btrfs work equally well with all of them?
>
> If you're using btrfs RAID, you need an HBA, not a RAID card. If the RAID
> card can work as an HBA (usually labelled as JBOD mode) then you're good
> to go.
>
> For example, HP CCISS controllers can't work in JBOD mode.

Would you know if they implement their own checksumming, similar to what
btrfs does? Or if someone uses SmartArray (CCISS) RAID1, do they simply not
get the full benefit of checksumming under any possible configuration?

I've had a quick look at what is on the market; here are some observations:

- in many cases, IOPS (critical for SSDs) vary wildly, e.g.:
  - SATA-3 SSDs advertise up to 85k IOPS, so RAID1 needs 170k IOPS
  - HP's standard HBAs don't support high IOPS
  - HP Gen8 SmartArray (e.g. P420) claims up to 200k IOPS
  - previous HP arrays (e.g. P212) support only 60k IOPS
- many vendors don't advertise the IOPS prominently - I had to Google the HP
  site to find those figures quoted in some PDFs; they don't quote them in
  the quickspecs or product summary tables
- Adaptec now offers an SSD caching function in hardware - supposedly you
  drop it in the machine and all disks respond faster. How would this
  interact with btrfs checksumming? E.g. I'm guessing it would be necessary
  to ensure that data from both spindles is not cached on the same SSD?
- I started thinking about the possibility that data is degraded on the
  mechanical disk but btrfs gets a good checksum read from the SSD and
  remains blissfully unaware that the real disk is failing; then the other
  disk goes completely offline one day, and for whatever reason the data is
  not in the SSD cache and the sector can't be read reliably from the
  remaining physical disk
- should such caching just be avoided, or can it be managed from btrfs
  itself in a manner that is foolproof?

How about the combination of btrfs root/boot filesystems and grub? Can they
all play nicely together?
This seems to be one compelling factor with hardware RAID: the cards have a
BIOS that can boot from any drive even if the other is offline.
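The cache-masking hazard Daniel worries about above can be sketched as a toy model. This is purely hypothetical, not any real caching product or btrfs behaviour: a read-through SSD cache keeps serving a good copy, so checksum verification keeps passing even after the backing spindle has silently degraded; only a read that bypasses the cache (as a scrub of the real media would need to) notices the rot.

```python
# Hypothetical model of SSD read-caching masking a degrading disk.
# Not real caching-product or btrfs code; names are invented.
import zlib

disk = {"sector0": bytearray(b"payload")}   # the mechanical spindle
cache = {}                                   # the SSD read cache

def cached_read(key):
    """Read-through cache: populate on first read, then serve from SSD."""
    if key not in cache:
        cache[key] = bytes(disk[key])
    return cache[key]

def direct_read(key):
    """Cache-bypassing read, as a scrub of the real media would need."""
    return bytes(disk[key])

csum = zlib.crc32(cached_read("sector0"))   # first read warms the cache

disk["sector0"][0] ^= 0xFF                  # spindle silently degrades

# Checksums still verify via the cache, masking the failing disk...
assert zlib.crc32(cached_read("sector0")) == csum
# ...but a direct read of the media reveals the corruption.
assert zlib.crc32(direct_read("sector0")) != csum
```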