Hello,

I'm quite interested in ZFS, like everybody else I suppose, and am about
to install FBSD with ZFS.

On that note, I have a different first question to start with. I
personally am a Linux fanboy, and would love to see/use ZFS on Linux. I
assume that I can use those ZFS disks later with any OS that recognizes
ZFS, correct? E.g. I can install/set up ZFS in FBSD, and later use it in
OpenSolaris or Linux FUSE (native)?

Anyway, back to business :) I have a whole bunch of different sized
disks/speeds. E.g. 3 300GB disks @ 40MB/s, a 320GB disk @ 60MB/s, 3 120GB
disks @ 50MB/s and so on.

RAID-Z and ZFS claim to be uber scalable and all that, but would it
'just work' with a setup like that too?

I used to match up partition sizes in Linux, so make the 320GB disk into
2 partitions of 300 and 20GB, then use the 4 300GB partitions as a
RAID-5, same with the 120GB disks, using the scraps on those as well,
finally stitching everything together with LVM2. I can't easily find how
this would work with RAID-Z/ZFS, e.g. can I really just put all these
disks in 1 big pool and remove/add to it at will? And do I really not
need software RAID yet still get the same reliability with RAID-Z as I
had with RAID-5? What about hardware RAID controllers: just use them as
JBOD devices, or would I use them to match up disk sizes in RAID-0
stripes (e.g. the 3x 120GB to make a 360GB RAID-0)?

Or would you recommend to just stick with raid/lvm/reiserfs and use that?

thanks,
Oliver
Oliver Schinagl wrote:
> I'm quite interested in ZFS, like everybody else I suppose, and am about
> to install FBSD with ZFS.

cool.

> On that note, i have a different first question to start with. I
> personally am a Linux fanboy, and would love to see/use ZFS on linux. I
> assume that I can use those ZFS disks later with any os that can
> work/recognizes ZFS correct? e.g. I can install/setup ZFS in FBSD, and
> later use it in OpenSolaris/Linux Fuse(native) later?

The on-disk format is an available specification and is designed to be
platform neutral. We certainly hope you will be able to access the
zpools from different OSes (one at a time).

> Anyway, back to business :)
> I have a whole bunch of different sized disks/speeds. E.g. 3 300GB disks
> @ 40mb, a 320GB disk @ 60mb/s, 3 120gb disks @ 50mb/s and so on.
>
> Raid-Z and ZFS claims to be uber scalable and all that, but would it
> 'just work' with a setup like that too?

Yes, for most definitions of 'just work.'

> I used to match up partition sizes in linux, so make the 320gb disk into
> 2 partitions of 300 and 20gb, then use the 4 300gb partitions as a
> raid5, same with the 120 gigs and use the scrap on those aswell, finally
> stiching everything together with LVM2. I can't easly find how this
> would work with raid-Z/ZFS, e.g. can I really just put all these disks
> in 1 big pool and remove/add to it at will?

Yes is the simple answer. But we generally recommend planning. To begin
your plan, decide your priority: space, performance, data protection.

ZFS is very dynamic, which has the property that for redundancy schemes
(mirror, raidz[12]) it will use as much space as possible. For example,
if you mirror a 1 GByte drive with a 2 GByte drive, then you will have
available space of 1 GByte. If you later replace the 1 GByte drive with a
4 GByte drive, then you will instantly have available space of 2 GBytes.
If you then replace the 2 GByte drive with an 8 GByte drive, you will
instantly have access to 4 GBytes of mirrored space.

> And I really don't need to
> use softwareraid yet still have the same reliablity with raid-z as I had
> with raid-5?

raidz is more reliable than software raid-5.

> What about hardware raid controllers, just use it as a JBOD
> device, or would I use it to match up disk sizes in raid0 stripes (e.g.
> the 3x 120gb to make a 360 raid0).

ZFS is dynamic.

> Or you'd recommend to just stick with raid/lvm/reiserfs and use that.

ZFS rocks!
 -- richard
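[A rough sketch of that mirror example as commands. The device names are
invented for illustration, and exactly when the extra space becomes
visible has varied between ZFS releases, so treat the "instantly" as
approximate.]

    # mirror a 1 GB and a 2 GB disk; usable space is 1 GB
    zpool create tank mirror c1t0d0 c1t1d0

    # replace the 1 GB disk with a 4 GB one; after the resilver the
    # mirror is now limited by the 2 GB disk
    zpool replace tank c1t0d0 c2t0d0
    zpool status tank        # wait for the resilver to finish

    # replace the 2 GB disk with an 8 GB one; usable space becomes 4 GB
    zpool replace tank c1t1d0 c2t1d0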
Hi,

> I'm quite interested in ZFS, like everybody else I suppose, and am about
> to install FBSD with ZFS.

welcome to ZFS!

> Anyway, back to business :)
> I have a whole bunch of different sized disks/speeds. E.g. 3 300GB disks
> @ 40mb, a 320GB disk @ 60mb/s, 3 120gb disks @ 50mb/s and so on.
>
> Raid-Z and ZFS claims to be uber scalable and all that, but would it
> 'just work' with a setup like that too?

Yes. If you dump a set of variable-size disks into a mirror or RAID-Z
configuration, you'll get the same result as if they all had the smallest
of their sizes. Then the pool will grow as you exchange smaller disks
with larger ones.

I used to run a ZFS pool on 1x250GB, 1x200GB, 1x85GB and 1x80GB the
following way:

- Set up an 80 GB slice on all 4 disks and make a 4-disk RAID-Z vdev.
- Set up a 5 GB slice on the 250, 200 and 85 GB disks and make a 3-disk
  RAID-Z.
- Set up a 115 GB slice on the 200 and the 250 GB disks and make a 2-disk
  mirror.
- Concatenate all 3 vdevs into one pool. (You need zpool add -f for that.)

Not something to be done on a professional production system, but it
worked for my home setup just fine. The remaining 50GB from the 250GB
drive then went into a scratch pool.

Kinda like playing Tetris with RAID-Z...

Later, I decided that just using paired disks as mirrors is really more
flexible and easier to expand, since disk space is cheap.

Hope this helps,
   Constantin

--
Constantin Gonzalez                        Sun Microsystems GmbH, Germany
Platform Technology Group, Global Systems Engineering  http://www.sun.de/
Tel.: +49 89/4 60 08-25 91               http://blogs.sun.com/constantin/
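[Spelled out as commands, that "Tetris" layout would look roughly like
the sketch below. The slice names are invented for illustration; the -f
overrides zpool's complaint about mixing vdevs with different replication
levels, which is the flag Constantin mentions.]

    # 4-disk RAID-Z over the 80 GB slices
    zpool create tank raidz c0t0d0s0 c0t1d0s0 c0t2d0s0 c0t3d0s0

    # add a 3-disk RAID-Z over the 5 GB slices
    zpool add -f tank raidz c0t0d0s1 c0t1d0s1 c0t2d0s1

    # add a 2-disk mirror over the 115 GB slices
    zpool add -f tank mirror c0t0d0s3 c0t1d0s3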
> On that note, i have a different first question to start with. I
> personally am a Linux fanboy, and would love to see/use ZFS on linux. I
> assume that I can use those ZFS disks later with any os that can
> work/recognizes ZFS correct? e.g. I can install/setup ZFS in FBSD, and
> later use it in OpenSolaris/Linux Fuse(native) later?

I've seen some discussions that implied adding attributes to support
non-Solaris (*BSD) uses of zfs, so that the format would remain
interoperable (i.e. free of incompatible extensions), although not all
OSes might fully support those. But I don't know if there's some firm
direction to keeping the on-disk format compatible across platforms that
zfs is ported to. Indeed, if the code is open source, I'm not sure that's
possible to _enforce_. But I suspect (and certainly hope) it's being
encouraged. If someone who works on zfs could comment on that, it might
help.

> Raid-Z and ZFS claims to be uber scalable and all that, but would it
> 'just work' with a setup like that too?
> [...]
> Or you'd recommend to just stick with raid/lvm/reiserfs and use that.

One of the advantages of zfs is said to be that if it's used end-to-end,
it can catch more potential data integrity issues (including controller,
disk, cabling glitches, misdirected writes, etc.).

As far as I understand, raid-z is like raid-5 except that the stripes are
of varying size, so all writes are full-stripe, closing the "write hole",
so no NVRAM is needed to ensure that recovery would always be possible.

Components of raid-z or raid-z2 or mirrors can AFAIK only be used up to
the size of the smallest component. However, a zpool can consist of the
aggregation (dynamic striping, I think) of various mirror or raid-z[2]
virtual devices. So you could group similar sized chunks (be it
partitions or whole disks) into redundant virtual devices, and aggregate
them all into a zpool (and add more later to grow it, too). Ideally, all
such virtual devices would have the same level of redundancy; I don't
think that's _required_, but there isn't much good excuse for doing
otherwise, since the performance of raid-z[2] is different from that of a
mirror.

There may be some advantages to giving zfs entire disks where possible;
it will handle labelling (using EFI labels) and, IIRC, may be able to
better manage the disk's write cache.

For the most part, I can't see many cases where using zfs together with
something else (like vxvm or lvm) would make much sense.
One possible exception might be AVS
(http://opensolaris.org/os/project/avs/) for geographic redundancy; see
http://blogs.sun.com/AVS/entry/avs_and_zfs_seamless for more details.

It can be quite easy to use, with only two commands (zpool and zfs);
however, you still want to know what you're doing, and there are plenty
of issues and tradeoffs to consider to get the best out of it.

Look around a little for more info; for example:
http://www.opensolaris.org/os/community/zfs/faq/
http://en.wikipedia.org/wiki/ZFS
http://docs.sun.com/app/docs/doc/817-2271 (ZFS Administration Guide)
http://www.google.com/search?hl=en&q=zpool+OR+zfs+site%3Ablogs.sun.com&btnG=Search
On Tue, Jun 19, 2007 at 07:52:28PM -0700, Richard Elling wrote:
> > On that note, i have a different first question to start with. I
> > personally am a Linux fanboy, and would love to see/use ZFS on linux. I
> > assume that I can use those ZFS disks later with any os that can
> > work/recognizes ZFS correct? e.g. I can install/setup ZFS in FBSD, and
> > later use it in OpenSolaris/Linux Fuse(native) later?
>
> The on-disk format is an available specification and is designed to be
> platform neutral. We certainly hope you will be able to access the
> zpools from different OSes (one at a time).

Will be nice to not EFI-label disks, though :) Currently there is a
problem with this - a zpool created on Solaris is not recognized by
FreeBSD, because FreeBSD claims the GPT label is corrupted. On the other
hand, a ZFS created on FreeBSD (on a raw disk) can be used under Solaris.

--
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd at FreeBSD.org                        http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
Pawel Jakub Dawidek wrote:
> On Tue, Jun 19, 2007 at 07:52:28PM -0700, Richard Elling wrote:
>> The on-disk format is an available specification and is designed to be
>> platform neutral. We certainly hope you will be able to access the
>> zpools from different OSes (one at a time).
>
> Will be nice to not EFI label disks, though:) Currently there is a
> problem with this - zpool created on Solaris is not recognized by
> FreeBSD, because FreeBSD claims GPT label is corrupted. On the other
> hand, creating ZFS on FreeBSD (on a raw disk) can be used under Solaris.

I read earlier that it's recommended to use a whole disk instead of a
partition with zfs; the thing that's holding me back, however, is the
mixture of different sized disks I have. Suppose I had a raid-z using
300GB per disk on 3 300GB disks and one 320GB disk, with only a 300GB
partition on the latter (still with me?). Could I later expand that
partition with fdisk, and would the entire raid-z then expand to 320GB
per disk (assuming the other disks magically gain 20GB, so this is a bad
example in that sense :) )?

Also, what about a full disk vs. a full partition, e.g. making 1
partition that spans the entire disk vs. using the entire disk? Is there
any significant performance penalty? (So not having a disk split into 2
partitions, but 1 disk, 1 partition.) I read that with a full raw disk
zfs will be better able to utilize the disk's write cache, but I don't
see how.
Constantin Gonzalez wrote:
> Yes. If you dump a set of variable-size disks into a mirror or RAID-Z
> configuration, you'll get the same result as if you had the smallest of
> their sizes. Then, the pool will grow when exchanging smaller disks with
> larger.
> [...]
> Later, I decided using just paired disks as mirrors are really more
> flexible and easier to expand, since disk space is cheap.

Well, I'm about to go read the entire admin manual now that I found it,
and I hope it will explain all my further questions, but before I start
doing so: how are paired mirrors more flexible?

Right now, I have a 3-disk raid 5 running with the Linux DM driver. One
of the most recent additions was raid5 expansion, so I could pop in a
matching disk and expand my raid5 to 4 disks instead of 3 (which is
always interesting, as you're cutting down on your parity loss). I think,
though, that in raid5 you shouldn't put more than 6 - 8 disks afaik, so I
wouldn't be expanding this endlessly.

So how would this translate to ZFS? I have learned so far that ZFS
basically is raid + LVM, e.g. the mirror/raid-z sets go into the pool,
just like one would use LVM to bind all the raid sets together. The
difference being, I suppose, that you can't use a zfs mirror/raid-z
without having a pool to use it from?

What I'm wondering now is if I can simply add a new disk to my raid-z and
have it 'just work', e.g. the raid-z would be expanded to use the new
disk (or partition of matching size).

Thanks,
Oliver
Hi,

> How are paired mirrors more flexiable?

well, I'm talking of a small home system. If the pool gets full, the way
to expand with RAID-Z would be to add 3+ disks (typically 4-5). With
mirrors only, you just add two. So in my case it's just about the
granularity of expansion.

The reasoning is that of the three factors reliability, performance and
space, I value them in this order. Space comes last since disk space is
cheap.

If I had a bigger number of disks (12+), I'd be using them in RAID-Z2
sets (4+2 plus 4+2 etc.). Here, the speed is ok and the reliability is ok
and so I can use RAID-Z2 instead of mirroring to get some extra space as
well.

> Right now, i have a 3 disk raid 5 running with the linux DM driver. [...]
> So how would this translate to ZFS?

ZFS does not yet support rearranging the disk configuration. Right now,
you can expand a single disk to a mirror, or an n-way mirror to an
n+1-way mirror. RAID-Z vdevs can't be changed right now.

But you can add more disks to a pool by adding more vdevs (you have a 1+1
mirror, add another 1+1 pair and get more space; have a 3+2 RAID-Z2 and
add another 5+2 RAID-Z2, etc.).

> basically is raid + LVM. e.g. the mirrored raid-z pairs go into the
> pool, just like one would use LVM to bind all the raid pairs. The
> difference being I suppose, that you can't use a zfs mirror/raid-z
> without having a pool to use it from?

Here's the basic idea:

- You first construct vdevs from disks:
  One disk can be one vdev.
  A 1+1 mirror can be a vdev, too.
  An n+1 or n+2 RAID-Z (RAID-Z2) set can be a vdev too.

- Then you concatenate vdevs to create a pool. Pools can be extended by
  adding more vdevs.

- Then you create ZFS file systems that draw their block usage from the
  resources supplied by the pool. Very flexible.

> Wondering now is if I can simply add a new disk to my raid-z and have it
> 'just work', e.g. the raid-z would be expanded to use the new
> disk (partition of matching size)

If you have a RAID-Z based pool in ZFS, you can add another group of
disks that are organized in a RAID-Z manner (a vdev) to expand the
storage capacity of the pool.

Hope this clarifies things a bit. And yes, please check out the admin
guide and the other collateral available on ZFS. It's full of new
concepts and one needs some getting used to to explore all possibilities.

Cheers,
   Constantin

--
Constantin Gonzalez                        Sun Microsystems GmbH, Germany
Platform Technology Group, Global Systems Engineering  http://www.sun.de/
Tel.: +49 89/4 60 08-25 91               http://blogs.sun.com/constantin/
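[A minimal sketch of those three steps, with made-up pool and device
names:]

    # steps 1 and 2: create a pool from one mirror vdev
    zpool create tank mirror c1t0d0 c1t1d0

    # later: grow the pool by adding another 1+1 mirror vdev
    zpool add tank mirror c2t0d0 c2t1d0

    # step 3: file systems draw blocks from the whole pool
    zfs create tank/home
    zfs create tank/media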
On 6/20/07, Constantin Gonzalez <Constantin.Gonzalez at sun.com> wrote:
>   One disk can be one vdev.
>   A 1+1 mirror can be a vdev, too.
>   An n+1 or n+2 RAID-Z (RAID-Z2) set can be a vdev too.
>
> - Then you concatenate vdevs to create a pool. Pools can be extended by
>   adding more vdevs.
>
> - Then you create ZFS file systems that draw their block usage from the
>   resources supplied by the pool. Very flexible.

This actually brings up something I was wondering about last night:

If I was to plan for a 16-disk ZFS-based system, you would probably
suggest I configure it as something like 5+1, 4+1, 4+1, all raid-z
(I don't need the double parity concept).

I would prefer something like 15+1 :) I want ZFS to be able to detect and
correct errors, but I do not need to squeeze all the performance out of
it (I'll be using it as a home storage server for my DVDs and other
audio/video stuff, so only a few clients at the most streaming off of
it).

I would be interested in hearing if there are any other configuration
options to squeeze the most space out of the drives. I have no issue with
powering down to replace a bad drive, and I expect that I'll only have
one at the most fail at a time. If I really do need room for two to fail,
then I suppose I can look for a setup with 14 drives of usable space and
use raidz-2.

Thanks,
mike
Hi Mike,

> If I was to plan for a 16 disk ZFS-based system, you would probably
> suggest me to configure it as something like 5+1, 4+1, 4+1 all raid-z
> (I don't need the double parity concept)
>
> I would prefer something like 15+1 :) I want ZFS to be able to detect
> and correct errors, but I do not need to squeeze all the performance
> out of it (I'll be using it as a home storage server for my DVDs and
> other audio/video stuff. So only a few clients at the most streaming
> off of it)

this is possible. ZFS in theory does not significantly limit the n, and
15+1 is indeed possible. But for a number of reasons (among them
performance) people generally advise to use no more than 10+1.

A lot of ZFS configuration wisdom can be found in the Solaris Internals
ZFS Best Practices Guide wiki at:

  http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide

Richard Elling has done a great job of thoroughly analyzing different
reliability concepts for ZFS in his blog. One good introduction is the
following entry:

  http://blogs.sun.com/relling/entry/zfs_raid_recommendations_space_performance

That may help you find the right tradeoff between space and reliability.

Hope this helps,
   Constantin

--
Constantin Gonzalez                        Sun Microsystems GmbH, Germany
Platform Technology Group, Global Systems Engineering  http://www.sun.de/
Tel.: +49 89/4 60 08-25 91               http://blogs.sun.com/constantin/
> From: zfs-discuss-bounces at opensolaris.org
> [mailto:zfs-discuss-bounces at opensolaris.org] On Behalf Of mike
> Sent: Wednesday, June 20, 2007 9:30 AM
>
> I would prefer something like 15+1 :) I want ZFS to be able to detect
> and correct errors, but I do not need to squeeze all the performance
> out of it (I'll be using it as a home storage server for my DVDs and
> other audio/video stuff. So only a few clients at the most streaming
> off of it)

I would not risk raidz on that many disks. A nice compromise may be 14+2
raidz2, which should perform nicely for your workload and be pretty
reliable when the disks start to fail.

--
paul
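[For what it's worth, a sketch of the two layouts being discussed, with
16 invented device names. The single 14+2 raidz2 gives the space of 14
disks and survives any two failures; splitting into narrower raidz2
vdevs, as suggested earlier in the thread, trades some space for shorter
resilvers and better random I/O.]

    # one wide 14+2 raidz2 vdev
    zpool create media raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 \
        c1t6d0 c1t7d0 c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0 c2t6d0 c2t7d0

    # or two 6+2 raidz2 vdevs in the same pool
    zpool create media \
        raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 c1t7d0 \
        raidz2 c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0 c2t6d0 c2t7d0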
On 6/20/07, Paul Fisher <pfisher at alertlogic.net> wrote:
> I would not risk raidz on that many disks. A nice compromise may be 14+2
> raidz2, which should perform nicely for your workload and be pretty
> reliable when the disks start to fail.

Would anyone on the list not recommend this setup? I could live with 2
drives being used for parity (or the "parity" concept).

I would be able to reap the benefits of ZFS - self-healing, corrupted
file reconstruction (since it has some parity to read from) - and should
have decent performance (obviously not smokin', since I am not
configuring this to try for the fastest possible).
On 20 June, 2007 - Oliver Schinagl sent me these 1,9K bytes:

> Also what about full disk vs full partition, e.g. make 1 partition to
> span the entire disk vs using the entire disk.
> Is there any significant performance penalty? (So not having a disk
> split into 2 partitions, but 1 disk, 1 partition) I read that with a
> full raw disk zfs will be beter to utilize the disks write cache, but I
> don't see how.

Because when given a whole disk, ZFS can safely play with the write cache
in the disk without jeopardizing any UFS or so that might be on some
other slice. That helps when ZFS is batch-writing a transaction group.

/Tomas
--
Tomas Ögren, stric at acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
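[If you want to check what ZFS did with the cache on a whole disk, the
expert mode of format(1M) on Solaris exposes it; roughly the following
interactive path, though the menu wording may differ between releases:]

    # format -e          # then select the disk that ZFS owns
    # format> cache
    # cache> write_cache
    # write_cache> display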
On 6/20/07, mike <mike503 at gmail.com> wrote:
> On 6/20/07, Paul Fisher <pfisher at alertlogic.net> wrote:
> > I would not risk raidz on that many disks. A nice compromise may be 14+2
> > raidz2, which should perform nicely for your workload and be pretty
> > reliable when the disks start to fail.
> Would anyone on the list not recommend this setup? I could live with 2
> drives being used for parity (or the "parity" concept)

Yes. 2 parity disks means that when one fails, you've still got an extra.
In raid 5 boxes, it's not uncommon with large arrays for one disk to die,
and then, when it's replaced, the stress on the other disks causes
another failure. Then the array is toast. I don't know if this is a
problem on ZFS... but they took the time to implement raidz2, so I'd
suggest it.

> I would be able to reap the benefits of ZFS - self-healing, corrupted
> file reconstruction (since it has some parity to read from) and should
> have decent performance (obviously not smokin' since I am not
> configuring this to try for the fastest possible)

And since you'll generally be doing full-stripe reads and writes, you get
good bandwidth anyway.

Will
On Wed, Jun 20, 2007 at 01:45:29PM +0200, Oliver Schinagl wrote:
> Also what about full disk vs full partition, e.g. make 1 partition to
> span the entire disk vs using the entire disk.
> Is there any significant performance penalty? (So not having a disk
> split into 2 partitions, but 1 disk, 1 partition) I read that with a
> full raw disk zfs will be beter to utilize the disks write cache, but I
> don't see how.

On FreeBSD (thanks to GEOM) there is no difference what you have under
ZFS. On Solaris, ZFS turns on the write cache on the disk when the whole
disk is used. On FreeBSD the write cache is enabled by default, and GEOM
consumers can send write-cache-flush (BIO_FLUSH) requests to any GEOM
providers.

--
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd at FreeBSD.org                        http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
Pawel Jakub Dawidek wrote:
> On FreeBSD (thanks to GEOM) there is no difference what do you have
> under ZFS. On Solaris, ZFS turns on write cache on disk when whole disk
> is used. On FreeBSD write cache is enabled by default and GEOM consumers
> can send write-cache-flush (BIO_FLUSH) request to any GEOM providers.

So basically, what you are saying is that on FBSD there's no performance
issue, whereas on Solaris there can be (if write caches aren't enabled)?
On Wed, Jun 20, 2007 at 12:45:52PM +0200, Pawel Jakub Dawidek wrote:
> Will be nice to not EFI label disks, though:) Currently there is a
> problem with this - zpool created on Solaris is not recognized by
> FreeBSD, because FreeBSD claims GPT label is corrupted. On the other
> hand, creating ZFS on FreeBSD (on a raw disk) can be used under Solaris.

FYI, the primary reason for using EFI labels is that they are
endian-neutral, unlike the Solaris VTOC. The secondary reason is that
they are simpler and easier to use (at least on Solaris).

I'm curious why FreeBSD claims the GPT label is corrupted. Is this
because FreeBSD doesn't understand EFI labels, our EFI label is bad, or
is there a bug in the FreeBSD EFI implementation?

Thanks,

- Eric

--
Eric Schrock, Solaris Kernel Development      http://blogs.sun.com/eschrock
mike wrote:
> I would be interested in hearing if there are any other configuration
> options to squeeze the most space out of the drives. I have no issue
> with powering down to replace a bad drive, and I expect that I'll only
> have one at the most fail at a time. If I really do need room for two
> to fail then I suppose I can look for a 14 drive space usable setup
> and use raidz-2.

Just know that, if your server/disks are up all the time, shutting down
your server whilst you wait for replacement drives might actually kill
your array, especially with consumer IDE/SATA drives. Those pesky
consumer drives aren't made for 24/7 usage; I think they spec them at
8 hrs a day?

Either way, that's me being sidetracked. The problem is, you'll have a
disk spinning normally, some access, same temperature, all the time. All
of a sudden you change the environment: you let it cool down and whatnot.
Hard disks don't like that at all! I've even heard of hard disk (cases)
cracking because of the temperature differences and such.

My requirements are the same, and I want space, but the thought of having
more disks die on me while I replace the broken one doesn't really make
me happy either. (I personally use only the WD RAID editions of HDDs;
whether it's worth it or not, I dunno, but they have better warranty and
supposedly should be able to run 24/7.)
I'm reading the administration guide PDF and noticed that it claims that
at the moment ZFS does not support shrinking of the pool. Will this
feature be added in the future? Also, expanding of raid-z is not yet
supported; will this also change?

One of the reasons I switched back from X/JFS to ReiserFS on my Linux box
was that I couldn't shrink the FS on top of my LVM, which was highly
annoying. Also, sometimes you might want to just remove a disk from your
array: say you set up a mirrored ZFS with 2 120GB disks. 4 years later,
you get some of those fancy 1TB disks, say 3 or 4 of them, and raid-z
them. Not only would those 120GB disks be insignificant, but maybe
they've become a liability: they are old, replacing them isn't that easy
anymore, who still sells disks that size, and why bother if you have
plenty of space.

Oliver
> One of the reasons i switched back from X/JFS to ReiserFS on my linux
> box was that I couldn't shrink the FS ontop of my LVM, which was highly
> annoying. Also sometimes you might wanna just remove a disk from your
> array: Say you setup up a mirrored ZFS with 2 120gb disks. 4 years
> later, you get some of those fancy 1tb disks, say 3 or 4 of em and
> raid-z them. not only would those 120gb insignificant, but maybe they've
> becom a liability, they are old, replacing them ins't that easy anymore,
> who still sells disks that size, and why bother, if you have plenty of
> space.

Your scenario is adequately covered by "zpool replace"; you can replace
disks with bigger disks or with disks of the same size.

Casper
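[A rough illustration of Casper's point; the device names are invented,
with c0t0d0/c0t1d0 standing for the old 120GB mirror and c3t0d0/c3t1d0
for the new disks:]

    # straight swap with zpool replace, one side at a time
    zpool replace tank c0t0d0 c3t0d0
    zpool status tank              # wait for the resilver to finish
    zpool replace tank c0t1d0 c3t1d0

An equivalent pattern that never drops below two copies is zpool attach /
zpool detach: attach the new disk as a third side of the mirror, wait for
the resilver, then detach the old one.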
Casper.Dik at Sun.COM wrote:
> Your scenario is adequately covered by "zpool replace"; you can
> replace disks with bigger disks or disks of the same size.

Yes, but can I replace a mirror with a raid-z? What I understood was that
I can have a 5-way mirror and remove/replace 4 disks fine, but I can't
remove the 5th disk. I imagine I can add 1 (or 4) bigger disks, let it
resync etc. and then pull the last disk, but what if I wanted to go from
a mirror to a raid-z? Would that still work with zpool replace? (Or am I
simply not far enough into the document?)

oliver
On Wed, 2007-06-20 at 12:45 +0200, Pawel Jakub Dawidek wrote:
> Will be nice to not EFI label disks, though:) Currently there is a
> problem with this - zpool created on Solaris is not recognized by
> FreeBSD, because FreeBSD claims GPT label is corrupted.

Hmm. I'd think the right answer here is to understand why FreeBSD and
Solaris disagree about EFI/GPT labels. Could be a Solaris bug, could be a
FreeBSD bug, but the intent of the label format is to permit interchange
between different platforms.

- Bill
On Wed, Jun 20, 2007 at 09:48:08AM -0700, Eric Schrock wrote:
> FYI, the primary reason for using EFI labels is that they are
> endian-neutral, unlike Solaris VTOC. The secondary reason is that they
> are simpler and easier to use (at least on Solaris).
>
> I'm curious why FreeBSD claims the GPT label is corrupted. Is this
> because FreeBSD doesn't understand EFI labels, our EFI label is bad, or
> is there a bug in the FreeBSD EFI implementation?

I haven't investigated this yet. FreeBSD should understand EFI, so it's
either one of the last two, or a bug in the Solaris EFI implementation :)
I seem to recall similar problems on Linux with ZFS/FUSE...

--
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd at FreeBSD.org                        http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
mike wrote:
> I would be interested in hearing if there are any other configuration
> options to squeeze the most space out of the drives. I have no issue
> with powering down to replace a bad drive, and I expect that I'll only
> have one at the most fail at a time.

This is what is known as "famous last words." Sorta like, "we have triple
redundant navigation computers on this fancy, new International Space
Station." If you want to know what keeps us RAS guys up at night, it is
famous last words.

[sidebar]
What is the last thing a redneck says before he dies? "Hey y'all, watch
this!" :-)
 -- richard
Oliver Schinagl wrote:
> zo basically, what you are saying is that on FBSD there's no performane
> issue, whereas on solaris there (can be if write caches aren't enabled)

Solaris plays it safe by default. You can, of course, override that
safety. Whether it is a performance win seems to be the subject of some
debate, but intuitively it seems like it should help for most cases.
 -- richard
On 20-Jun-07, at 12:23 PM, Richard L. Hamilton wrote:
> I've seen some discussions that implied adding attributes
> to support non-Solaris (*BSD) uses of zfs, so that the format would
> remain interoperable (i.e. free of incompatible extensions),
> although not all OSs might fully support those. But I don't know
> if there's some firm direction to keeping the on-disk format
> compatible across platforms that zfs is ported to. Indeed, if the
> code is open-source, I'm not sure that's possible to _enforce_. But
> I suspect (and certainly hope) it's being encouraged. If someone who
> works on zfs could comment on that, it might help.

Matt Ahrens recently did, on this list:

> ... as a leader of Sun's ZFS team, and the OpenSolaris ZFS
> community, I would do everything in my power to prevent the ZFS on-
> disk format from diverging in different implementations. Let's
> discuss the issues on this mailing list as they come up, and try to
> arrive at a conclusion which offers the best ZFS for *all* ZFS
> users, OpenSolaris or otherwise.
> ...
> FYI, we're already working with engineers on some other ports to
> ensure on-disk compatability. Those changes are going smoothly.
> So please, contact us if you want to make (or want us to make) on-
> disk changes to ZFS for your port or distro. We aren't that
> difficult to work with :-)
>
> --matt

--Toby
On Wed, Jun 20, 2007 at 12:03:02PM -0400, Will Murnane wrote:
> Yes. 2 disks means when one fails, you've still got an extra. In
> raid 5 boxes, it's not uncommon with large arrays for one disk to die,
> and when it's replaced, the stress on the other disks causes another
> failure. Then the array is toast. I don't know if this is a problem
> on ZFS... but they took the time to implement raidz2, so I'd suggest it.

If you buy all the disks at once and add them to a pool all at once, they
should all theoretically have approximately the same lifespan. When one
dies, you can almost count on others following soon after. Nothing sucks
more than your "redundant" disk array losing more disks than it can
support and you lose all your data anyway. You'd be better off doing a
giant non-parity stripe and dumping to tape on a regular basis. ;)

-brian
--
"Perl can be fast and elegant as much as J2EE can be fast and elegant. In
the hands of a skilled artisan, it can and does happen; it's just that
most of the shit out there is built by people who'd be better suited to
making sure that my burger is cooked thoroughly."  -- Jonathan Patschke
> Oliver Schinagl wrote:
> > zo basically, what you are saying is that on FBSD there's no performane
> > issue, whereas on solaris there (can be if write caches aren't enabled)
>
> Solaris plays it safe by default. You can, of course, override that safety.

FreeBSD plays it safe too. It's just that UFS, and other file systems on
FreeBSD, understand write caches and flush at appropriate times. If
Solaris UFS is updated to flush the write cache when necessary (it's not
only at sync time, of course), it too can enable the write cache! :-)
> Nothing sucks more than your "redundant" disk array
> losing more disks than it can support and you lose all your data
> anyway. You'd be better off doing a giant non-parity stripe and dumping to
> tape on a regular basis. ;)

Anyone who isn't dumping to tape (or some other reliable and *off-site*
medium) on a regular basis anyway deserves to lose their data. RAID is
not a replacement for backups....
> FreeBSD plays it safe too. It's just that UFS, and other file systems on
> FreeBSD, understand write caches and flush at appropriate times.

Do you have something to cite w.r.t. UFS here? Because as far as I know,
that is not correct. FreeBSD shipped with write caching turned off by
default for a while for this reason, but then changed it, IIRC due to the
hordes of people complaining about performance. I also have personal
experience of corruption-after-powerfail that indicates otherwise.

I also don't see any complaints about cache flushing on USB drives with
UFS, which I did get with ZFS every five seconds when it wanted to flush
the cache (which fails since the SCSI->USB bridge, or probably the USB
mass storage stuff itself, does not support it).

Also, given the design of UFS and the need for synchronous writes on
updates, I would be surprised, strictly based on performance
observations, if it actually did flush caches.

The ability to get decent performance *AND* reliability on cheap disks is
one of the major reasons why I love ZFS :)

--
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <peter.schuller at infidyne.com>'
Key retrieval: Send an E-Mail to getpgpkey at scode.org
E-Mail: peter.schuller at infidyne.com Web: http://www.scode.org
On Sat, Jun 23, 2007 at 10:21:14PM -0700, Anton B. Rang wrote:
> FreeBSD plays it safe too. It's just that UFS, and other file systems on
> FreeBSD, understand write caches and flush at appropriate times.

That's not true. None of the file systems in FreeBSD understands and
flushes the disk write cache except for ZFS and UFS+gjournal.

--
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd at FreeBSD.org                        http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
The only thing I haven't found in zfs yet is metadata etc. info.

The previous 'next best thing' in FS land was of course ReiserFS (4).
Reiser3 was quite a nice thing, fast, journaled and all that, but Reiser4
promised to bring all those things that we see emerging now, like
cross-FS search: any document, audio recording etc. could be instantly
searched. True, there is Google Desktop Search, trackerd and whatnot, but
those are 'afterthoughts', not supported by the underlying FS.

So does ZFS support features like metadata and such? Or is that for zfs2?
:)

oliver
Oliver Schinagl wrote:
> The only thing I haven't found in zfs yet, is metadata etc info.
>
> The previous 'next best thing' in FS was of course ReiserFS (4). Reiser3
> was quite a nice thing, fast, journaled and all that, but Reiser4
> promised to bring all those things that we see emerging now, like cross
> FS search, any document, audio recording etc could be instantly
> searched. True there is google desktop search, trackerd and what not,
> but those are 'afterthoughts', not supported by the underlying FS.

You could use extended attributes for this type of data - just like HFS+
does - and then build a search tool on top of that (like what MacOS X
does with Spotlight).

You can store any kind of data you like in an extended attribute;
however, I would caution you that storing the metadata of something like
an MP3 file in an extended attribute may not actually be quicker in the
long run.

Exactly what problem are you trying to solve, and what kind of metadata
are you looking for that isn't natively inside file formats like MP3 for
track info and EXIF data in JPEG etc.? Why do you believe that the file
system having knowledge of this is better somehow?

The other thing that ZFS has is user-definable properties on each
dataset.

--
Darren J Moffat
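[A quick illustration of the two mechanisms Darren mentions; the dataset,
file and property names are made up. ZFS user properties are arbitrary
name=value pairs whose name contains a colon, and per-file extended
attributes can be reached with runat(1) on Solaris.]

    # arbitrary metadata on a dataset
    zfs set com.example:category=music tank/media
    zfs get com.example:category tank/media

    # an extended attribute on a single file (runat runs the command
    # with the file's attribute directory as the working directory)
    runat /tank/media/song.mp3 cp /tmp/liner-notes.txt notes
    runat /tank/media/song.mp3 ls -l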
Wade.Stuart at fallon.com - 2007-Jun-28 14:37 UTC - [zfs-discuss] ReiserFS4 like metadata/search
zfs-discuss-bounces at opensolaris.org wrote on 06/27/2007 06:25:47 PM:

> The only thing I haven't found in zfs yet, is metadata etc info.
>
> The previous 'next best thing' in FS was of course ReiserFS (4). Reiser3
> was quite a nice thing, fast, journaled and all that, but Reiser4
> promised to bring all those things that we see emerging now, like cross
> FS search, any document, audio recording etc could be instantly
> searched. True there is google desktop search, trackerd and what not,
> but those are 'afterthoughts', not supported by the underlying FS.
>
> So does ZFS support features like metadata and such? or is that for
> zfs2? :)

Without getting too far into political/personal debates, Reiser has
promised a lot and not done very well delivering for the common case (nor
does he appear to be in a position to do so any time soon). A lot can be
said for having the code refused from the Linux core.

I much prefer the route taken by Apple on this -- Spotlight is FS
agnostic and attaches to the kernel file update poller to know when to
queue files for index/delete. It also resides in userspace with pluggable
modules for extended file types (such as home-brew files). One side
effect of this design is that the indexing is completely pulled away from
blocking any type of fs write -- it can be queued at as low a priority as
needed.

With this type of system and ZFS, a plugin could be created that indexes
things like extended attributes/compression ratio/etc.

What real advantages do you see to doing this _in_ the filesystem layer?
I can certainly see hooks being added where needed for the indexing
system to interface -- but the core indexing and searching code does not
seem to fit well in FS land.

-Wade
On 28-Jun-07, at 11:37 AM, Wade.Stuart at fallon.com wrote:
> Without getting too far into political/personal debates, Reiser has
> promised a lot and not done very well delivering for common case ...

Do you mean a simple fs? Reiser3 certainly delivers, and R4 is stable
according to the list.

> What real advantages do you see doing this _in_ the filesystem layer?
> I can certainly see hooks being added where needed for the indexing
> system to interface -- but the core indexing and searching code does
> not seem to fit well in FS land.

...except it's not really been tried. If R4 had been merged we would at
least have some experiential data to work with.

--Toby
I guess the user-definable properties are then what I'm looking for.
Well, not what *I* am looking for per se. I was reading the article on
Hans Reiser, the one over at Wired; good read btw
(http://www.wired.com/techbiz/people/magazine/15-07/ff_hansreiser?currentPage=1).

Somewhere it stated that the 'revolutionary' new thing about Reiser4 was
that it tracked metadata somehow, making desktop searches MUCH faster.
Don't ask me about the details, I'm no filesystem wiz :) but I'm sure
someone familiar with Reiser3/4 can hopefully elaborate?

Darren J Moffat schreef:
> You could use extended attributes for this type of data - just like
> HFS+ does - and then build a search tool ontop of that (like what MacOS
> X does with Spotlight).
>
> You can store any kind of data you like in an extended attribute,
> however I would caution you that storing the metadata of somethink like
> an MP3 file in metadata may not actually be quicker in the long run.
>
> Exactly what problem are you trying to solve and what kind of metadata
> are you looking for that isn't natively inside the file formats like
> MP3 for track info and EXIF data in JPEG etc ?
>
> Why do you believe that the file system having knowledge of this is
> better some how ?
>
> The other thing that ZFS has is user defineable properties on each
> dataset.
On 28-Jun-07, at 4:46 PM, Oliver Schinagl wrote:
> I guess the userdefinable properties is then what i'm looking for. Well
> not what *I* am looking for perse. i was reading the article on Hans
> Reiser, the one over at wired, good read btw,
> (http://www.wired.com/techbiz/people/magazine/15-07/ff_hansreiser?currentPage=1).
>
> Somewhere it stated that the 'revolutionairy' new thing about reiser4 was
> that it tracked meta data somehow, makeing desktop searches MUCH faster.
> Don't ask me about the details, i'm no filesystem wiz :) but i'm sure
> someone familiar with Reiser3/4 hopefully ellaborate?

The WIRED article is technically crap^W inept. Go to http://namesys.com/
for the R4 white papers.

--Toby
Wade.Stuart at fallon.com
2007-Jun-28 21:37 UTC
[zfs-discuss] ReiserFS4 like metadata/search
> I guess my real question should have been: IF it turns out that quick
> indexing and the like really are the next hot thing, would ZFS support
> it (yes, from what I gathered earlier on this list).
>
> <rant mode>Come to think of it, the biggest difference between putting
> this info in the FS layer or in a separate DB would be availability. If
> the extra metadata is placed in the FS, as 'extra info' for files, then
> you have it in a DB right then and there, quickly accessible without
> having to run any daemons.

You have also now tied the indexing and index-DB management to fs ops, along with all the baggage that brings (code and memory size for parsing documents, document parser extensions, index DB btree ops, blocking or indexing thread management, index lock spinning, ...).

> Now you have to scan files, keep them in a separate db, and keep that
> db up to date.

That DB would either be a new file in the fs or another type of data store managed by the fs -- either way you have the same overhead a userspace daemon would. I can tell you that both MS's and Google's desktop search product lines had a _ton_ of work put into (a) making sure that the default search-index priority did not "slow down" the normal end-user experience, and (b) being very tunable, to the point of being off (a no-op); a small sketch of that kind of tuning follows after this message. When coupled directly into the fs layer you are stuck doing this in threads (if not blocking), and on a heavily used system where these are tuned down it could become quite a challenge to manage.

> I can see why the guys over at namesys would want to do that, with all
> this 'searching' going on nowadays. I don't need to search that much
> myself. I usually keep my stuff organized enough to quickly find what I
> need.

Interestingly enough, I view the server/desktop search market as having already formed behind the userspace camp -- what seems to be on the horizon is the shared search space: servers and desktops replying to search requests as a grid. Looking at that level of search requires a userland daemon.

> But if you look at it from a Windows point of view: why was the 'smart
> start menu' introduced, you think? The start menu always ends up a
> mess. [...] So I can see why desktop search and the like are getting so
> popular; people make a mess out of their systems, requiring tools and
> methods to find stuff again. Anyway, this isn't really the right place
> to rant, right :) </rant mode>

Search (server and desktop) has many uses -- finding apps to quickly launch is for sure one. To see MS's attempt to closely tie the FS and search, go boot up Vista and play around with WinFS -- err, never mind. I do not hold Windows (XP or Vista) up as the bar to beat, though, and would prefer not to quibble about their failed attempts.

-Wade
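To make the tunability point concrete: a userland indexer can be throttled or switched off without touching the filesystem at all, which is much harder once indexing is welded into fs ops. A minimal sketch, using updatedb (the locate indexer mentioned elsewhere in the thread) and an invented SMF service name:

    # run the periodic index rebuild at the lowest CPU priority
    nice -n 19 updatedb

    # or temporarily take a (hypothetical) indexing service offline under SMF
    svcadm disable -t svc:/application/indexer:default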
Oh, I just meant it was generally a good read :) Nicely written and the like. It just mentions a little part about Reiser4, and that the 'new' thing introduced by Reiser4 was the ability to search through the metadata extremely fast.

Personally, I don't really use all this search nonsense. Wait, that's a lie. I love locate. The fact that it runs once a day is fine by me. It's nice and fast, and does its scanning in the background. I think trackerd (a Linux thing) is the perfect user-space replacement for those who need it. I believe it uses the inotify/dnotify/FAM thingies to be informed when a file changes or updates, so it knows when to update or delete its info (a rough sketch of that notification mechanism follows below).

I guess my real question should have been: IF it turns out that quick indexing and the like really are the next hot thing, would ZFS support it (yes, from what I gathered earlier on this list).

<rant mode>Come to think of it, the biggest difference between putting this info in the FS layer or in a separate DB would be availability. If the extra metadata is placed in the FS, as 'extra info' for files, then you have it in a DB right then and there, quickly accessible without having to run any daemons. Now you have to scan files, keep them in a separate db, and keep that db up to date. I can see why the guys over at namesys would want to do that, with all this 'searching' going on nowadays. I don't need to search that much myself. I usually keep my stuff organized enough to quickly find what I need. But if you look at it from a Windows point of view: why was the 'smart start menu' introduced, you think? The start menu always ends up a mess. Install one app and your start menu gets loaded with crap; even after a clean, fresh install it's loaded with crap, unstructured. The 'smart' start menu was supposed to solve that by only showing the most recent items and 'hiding' the rest. Then they put the most-used apps in the main menu, because the desktop clearly wasn't usable for that anymore, as it was loaded with users' icons. And quick launch seems to have gone unused/disappeared for that same reason. So I can see why desktop search and the like are getting so popular: people make a mess out of their systems, requiring tools and methods to find stuff again. Anyway, this isn't really the right place to rant, right :) </rant mode>

Toby Thain wrote:
> On 28-Jun-07, at 4:46 PM, Oliver Schinagl wrote:
>
>> I guess the user-definable properties are then what I'm looking for.
>> Well, not what *I* am looking for per se. I was reading the article on
>> Hans Reiser, the one over at Wired, a good read btw
>> (http://www.wired.com/techbiz/people/magazine/15-07/ff_hansreiser?currentPage=1).
>>
>> Somewhere it stated that the 'revolutionary' new thing about Reiser4
>> was that it tracked metadata somehow, making desktop searches MUCH
>> faster. Don't ask me about the details, I'm no filesystem wiz :) but
>> I'm sure someone familiar with Reiser3/4 can hopefully elaborate?
>
> The WIRED article is technically crap^W inept. Go to
> http://namesys.com/ for the R4 white papers.
>
> --Toby
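For the curious, the change-notification mechanism described above (inotify, which trackerd and friends rely on under Linux) can be watched from the shell with the inotify-tools package; the directory here is just an example:

    # print one line per filesystem event under ~/Music (recursively);
    # this is how a userland indexer learns what to re-index or drop
    inotifywait -m -r -e create,modify,delete,move ~/Music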