Hi. I'd like to request a feature be added to zfs. Currently, on SAN attached disk, zpool shows up with a big WWN for the disk. If ZFS (or the zpool command, in particular) had a text field for arbitrary information, it would be possible to add something that would indicate what LUN on what array the disk in question might be. This would make troubleshooting and general understanding of the actual storage layout much simpler, as you'd know something about any disks that are encountering problems.

Something like:

zpool status
  pool: local
 state: ONLINE
 scrub: scrub completed with 0 errors on Sun Sep 23 04:16:33 2007
config:

        NAME        STATE     READ WRITE CKSUM  NOTE
        local       ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c2t0d0  ONLINE       0     0     0  Internal SATA on left side
            c2t2d0  ONLINE       0     0     0  Internal SATA on right side
            c2t3d0  ONLINE       0     0     0  External SATA disk 1 in box on top
            c2t4d0  ONLINE       0     0     0  External SATA disk 2 in box on top
        spares
          c2t5d0    AVAIL                       External SATA disk 3 in box on top

errors: No known data errors

The above would be very useful, should a disk fail, to identify what device is what.

Thanks!

-----
Gregory Shaw, IT Architect
IT CTO Group, Sun Microsystems Inc.
Phone: (303)-272-8817 (x78817)
500 Eldorado Blvd, UBRM02-157        greg.shaw at sun.com (work)
Broomfield, CO 80021                 shaw at fmsoft.com (home)
"When Microsoft writes an application for Linux, I've won." - Linus Torvalds
Gregory Shaw wrote:
> Hi. I'd like to request a feature be added to zfs. Currently, on
> SAN attached disk, zpool shows up with a big WWN for the disk. If
> ZFS (or the zpool command, in particular) had a text field for
> arbitrary information, it would be possible to add something that
> would indicate what LUN on what array the disk in question might be.
> This would make troubleshooting and general understanding of the
> actual storage layout much simpler, as you'd know something about any
> disks that are encountering problems.
...
> The above would be very useful, should a disk fail, to identify what
> device is what.

How would you gather that information?
How would you ensure that it stayed accurate in a hotplug world?

James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
James C. McPherson wrote:
> Gregory Shaw wrote:
>> Hi. I'd like to request a feature be added to zfs. Currently, on
>> SAN attached disk, zpool shows up with a big WWN for the disk. If
>> ZFS (or the zpool command, in particular) had a text field for
>> arbitrary information, it would be possible to add something that
>> would indicate what LUN on what array the disk in question might be.
...
>> The above would be very useful, should a disk fail, to identify what
>> device is what.
>
> How would you gather that information?
> How would you ensure that it stayed accurate in a hotplug world?

If it is stored on the device itself it would keep the description with the
same device. In the case of iSCSI, it would be nice to keep lun info instead
of having to correlate the drive id to the iqn to the lun, especially when
working with luns in one place and drive ids in another.

-Tim
Tim Spriggs wrote:
> James C. McPherson wrote:
>> Gregory Shaw wrote:
...
>>> The above would be very useful, should a disk fail, to identify what
>>> device is what.
>> How would you gather that information?
>> How would you ensure that it stayed accurate in a hotplug world?
> If it is stored on the device itself it would keep the description with
> the same device.

What about when you replace a disk in an array? What process would go and
write the appropriate information? How about a bootstrap for this? What
would accomplish it?

What if you moved your entire system and re-cabled your arrays slightly
differently? We have the beauty of devids to make sure that the pools still
work, but the physical information Greg is talking about would almost
certainly be incorrect.

> In the case of iSCSI, it would be nice to keep lun info instead of
> having to correlate the drive id to the iqn to the lun, especially when
> working with luns in one place and drive ids in another.

When I was playing around with iscsi a few months back, I used 1 GB files
on zfs which I then exported... because I could. What sort of lun info do
you want in that sort of use case? "Position" would be quite useless in
some respects.

James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
On Wed, 2007-09-26 at 08:26 +1000, James C. McPherson wrote:
> How would you gather that information?

The tools to use would be dependent on the actual storage device in use:
luxadm for A5x00 and V8x0 internal storage, sccli for 3xxx, etc.

> How would you ensure that it stayed accurate in a hotplug world?

See above.

I'm told that with many jbod arrays, SES has the information.

- Bill
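[For readers unfamiliar with the manual lookup Bill describes, a rough sketch follows. The enclosure name is a made-up example, and output formats vary by platform and array.]

  # Hypothetical enclosure name "BOX1"; real output varies by platform/array.
  luxadm probe           # list FC-attached enclosures and disks
  luxadm display BOX1    # show per-slot detail for one enclosure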
Bill Sommerfeld wrote:
> On Wed, 2007-09-26 at 08:26 +1000, James C. McPherson wrote:
>> How would you gather that information?
>
> The tools to use would be dependent on the actual storage device in use:
> luxadm for A5x00 and V8x0 internal storage, sccli for 3xxx, etc.

No consistent interface to use, then, unless another tool or cruft gets
added to ZFS to make this happen. That would seem to defeat one of the
major wins of ZFS - storage neutrality.

>> How would you ensure that it stayed accurate in a hotplug world?
>
> See above.
>
> I'm told that with many jbod arrays, SES has the information.

True, but that's still many, not all.

James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
It would be a manual process. As with any arbitrary name, it's a useful tag,
not much more.

James C. McPherson wrote:
> Gregory Shaw wrote:
>> Hi. I'd like to request a feature be added to zfs. Currently, on
>> SAN attached disk, zpool shows up with a big WWN for the disk. If
>> ZFS (or the zpool command, in particular) had a text field for
>> arbitrary information, it would be possible to add something that
>> would indicate what LUN on what array the disk in question might be.
...
>> The above would be very useful, should a disk fail, to identify what
>> device is what.
>
> How would you gather that information?
> How would you ensure that it stayed accurate in a hotplug world?
>
> James C. McPherson
> --
> Senior Kernel Software Engineer, Solaris
> Sun Microsystems
James C. McPherson wrote:
> Bill Sommerfeld wrote:
>> On Wed, 2007-09-26 at 08:26 +1000, James C. McPherson wrote:
>>> How would you gather that information?
>> The tools to use would be dependent on the actual storage device in use:
>> luxadm for A5x00 and V8x0 internal storage, sccli for 3xxx, etc.
>
> No consistent interface to use, then, unless another tool or cruft gets
> added to ZFS to make this happen. That would seem to defeat one of the
> major wins of ZFS - storage neutrality.

I'd be happy with an arbitrary field that could be assigned via a command.
Intelligence could be added later if appropriate, but at this point,
figuring out what-is-where on a big list of disk IDs on a SAN device is
very difficult.

So, aiming low: a text field that could be assigned. In the future, perhaps
something that would associate a serial number or something similar with
that name?

>>> How would you ensure that it stayed accurate in a hotplug world?
>> See above.
>>
>> I'm told that with many jbod arrays, SES has the information.
>
> True, but that's still many, not all.
>
> James C. McPherson
> --
> Senior Kernel Software Engineer, Solaris
> Sun Microsystems
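[To make the request concrete, a purely hypothetical invocation is sketched below. No such per-vdev subcommand or "note" property exists in zpool(1M) today; the pool name, device and note text are borrowed from the example at the top of the thread.]

  # Hypothetical syntax only -- zpool(1M) has no per-vdev note today.
  zpool set note="External SATA disk 1 in box on top" local c2t3d0
  zpool status local    # the NOTE column would then carry the text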
Greg Shaw wrote:
> James C. McPherson wrote:
>> Bill Sommerfeld wrote:
...
>> No consistent interface to use, then, unless another tool or cruft gets
>> added to ZFS to make this happen. That would seem to defeat one of the
>> major wins of ZFS - storage neutrality.
>
> I'd be happy with an arbitrary field that could be assigned via a command.
> Intelligence could be added later if appropriate, but at this point,
> figuring out what-is-where on a big list of disk IDs on a SAN device is
> very difficult.
>
> So, aiming low: a text field that could be assigned. In the future,
> perhaps something that would associate a serial number or something
> similar with that name?

That sounds like an ok RFE to me.

For some of the arrays (eg HDS) that we come into contact with, it's
possible to decode the device guid into something meaningful to a human,
but that's generally closed information.

James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
On Sep 25, 2007, at 7:21 PM, James C. McPherson wrote:
> That sounds like an ok RFE to me.
>
> For some of the arrays (eg HDS) that we come into contact with, it's
> possible to decode the device guid into something meaningful to a human,
> but that's generally closed information.

To me, this RFE proposal seems akin to "volname" under format(1M).

/dale
Greg Shaw wrote:
> James C. McPherson wrote:
>> Bill Sommerfeld wrote:
...
>> No consistent interface to use, then, unless another tool or cruft gets
>> added to ZFS to make this happen. That would seem to defeat one of the
>> major wins of ZFS - storage neutrality.
>
> I'd be happy with an arbitrary field that could be assigned via a command.
> Intelligence could be added later if appropriate, but at this point,
> figuring out what-is-where on a big list of disk IDs on a SAN device is
> very difficult.

Something like a text file? :-P

The problem with this is that wrong information is much worse than no
information, there is no way to automatically validate the information, and
therefore people are involved. If people were reliable, then even a text
file would work. If it can't be automatic and reliable, then it isn't worth
doing.

There have been a number of map-this-to-that implementations over the years,
but they have all relied upon intimate knowledge of the hardware, which also
has fixed physical addressing (eg. SSA, thumper, et al.). The virtual new
world intentionally tries to hide this sort of info.

There was a time when everyone on the internet knew the IP addresses of all
the cool places. After a week or two, work began on a name service ;-)
Perhaps devices should be name-service aware and the protocols extended to
perform name lookups.

-- richard
On Sep 25, 2007, at 7:48 PM, Richard Elling wrote:
> The problem with this is that wrong information is much worse than no
> information, there is no way to automatically validate the information,
> and therefore people are involved. If people were reliable, then even
> a text file would work. If it can't be automatic and reliable, then
> it isn't worth doing.

I dunno if we have to think this far into it.

Consider the Clearview project and their implementation of vanity names for
network interfaces. Conceivably, such a feature is useful to admins as they
can set the name to be a particular vlan number, or the switch/blade/port
where the other end of the ethernet line is terminated. Or their
ex-girlfriend's name if it's a particularly troublesome interface. The point
is, it allows arbitrary naming of something so that the admin(s) can
associate with it better as an object. Most importantly, there's a
distinction here: Solaris provides the facility; it's up to the admin to
maintain it. That's as far as it should go.

I don't see this RFE proposal as anything different. To wit, a similar
facility has existed for some time in the form of the 8-character volume
names one may set via format(1M). With that, the same predicament you bring
up can arise. However, there's that distinction again: the facility to do so
is made available; it's on the onus of the admin to maintain its contents.

/dale
On Sep 25, 2007, at 5:48 PM, Richard Elling wrote:
> Greg Shaw wrote:
>> James C. McPherson wrote:
...
>> I'd be happy with an arbitrary field that could be assigned via a
>> command. Intelligence could be added later if appropriate, but at
>> this point, figuring out what-is-where on a big list of disk IDs
>> on a SAN device is very difficult.
>
> Something like a text file? :-P
>
> The problem with this is that wrong information is much worse than no
> information, there is no way to automatically validate the information,
> and therefore people are involved. If people were reliable, then even
> a text file would work. If it can't be automatic and reliable, then
> it isn't worth doing.

ZFS validates the disks as it boots, correct? If it has a way to keep track
of who-is-who for pool redundancy purposes, is it that difficult to
associate a text field with that device?

Wrong information may be bad, but it's far better than *no* information.

> There have been a number of map-this-to-that implementations over the
> years, but they have all relied upon intimate knowledge of the hardware,
> which also has fixed physical addressing (eg. SSA, thumper, et al.).
> The virtual new world intentionally tries to hide this sort of info.
>
> There was a time when everyone on the internet knew the IP addresses of
> all the cool places. After a week or two, work began on a name service ;-)
> Perhaps devices should be name-service aware and the protocols extended
> to perform name lookups.
> -- richard

I disagree. VxVM has always had names for the different pieces, including
the disk name. It was able to associate the name with the disk, and has done
so for ~10 years. I'm asking for the same ability -- let me name the disks
so that I can keep track of them using human-readable names.

-----
Gregory Shaw, IT Architect
IT CTO Group, Sun Microsystems Inc.
Phone: (303)-272-8817 (x78817)
500 Eldorado Blvd, UBRM02-157        greg.shaw at sun.com (work)
Broomfield, CO 80021                 shaw at fmsoft.com (home)
"When Microsoft writes an application for Linux, I've won." - Linus Torvalds
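[For context, the VxVM disk naming Greg refers to is assigned when a disk is added to a disk group. A minimal sketch, with the disk group, media name and device invented for illustration; the disk would need to be initialized first.]

  # "datadg", "array7_lun23" and c6t22d0 are made-up examples.
  vxdg -g datadg adddisk array7_lun23=c6t22d0
  vxdisk -g datadg list    # the media name then appears in all later output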
Dale Ghent wrote:
> On Sep 25, 2007, at 7:48 PM, Richard Elling wrote:
>> The problem with this is that wrong information is much worse than no
>> information, there is no way to automatically validate the information,
>> and therefore people are involved. If people were reliable, then even
>> a text file would work. If it can't be automatic and reliable, then
>> it isn't worth doing.
>
> I dunno if we have to think this far into it.
>
> Consider the Clearview project and their implementation of vanity names
> for network interfaces. ... The point is, it allows arbitrary naming of
> something so that the admin(s) can associate with it better as an object.
> Most importantly, there's a distinction here: Solaris provides the
> facility; it's up to the admin to maintain it. That's as far as it
> should go.

Actually, you can use the existing name space for this. By default, ZFS uses
/dev/dsk. But everything in /dev is a symlink, so you could set up your own
space, say /dev/myknowndisks, and use more descriptive names. You might need
to hack on the startup service to not look at /dev, but that shouldn't be
too hard. In other words, if the answer is "let the sysadmin do it" then it
can be considered solved. The stretch goal is to make some sort of
reasonable name service. At this point I'll note that the FC folks
envisioned something like that, but never implemented it.

# ramdiskadm -a BrownDiskWithWhiteDot 150m
/dev/ramdisk/BrownDiskWithWhiteDot
# zpool create zwimming /dev/ramdisk/BrownDiskWithWhiteDot
# zpool status zwimming
  pool: zwimming
 state: ONLINE
 scrub: none requested
config:

        NAME                                  STATE     READ WRITE CKSUM
        zwimming                              ONLINE       0     0     0
          /dev/ramdisk/BrownDiskWithWhiteDot  ONLINE       0     0     0

errors: No known data errors
# ls -l /dev/ramdisk/BrownDiskWithWhiteDot
lrwxrwxrwx 1 root root 55 Sep 25 17:59 /dev/ramdisk/BrownDiskWithWhiteDot ->
../../devices/pseudo/ramdisk@1024:BrownDiskWithWhiteDot
# zpool export zwimming
# mkdir /dev/whee
# cd /dev/whee
# ln -s ../../devices/pseudo/ramdisk@1024:BrownDiskWithWhiteDot YellowDiskUnderPinkBox
# zpool import -d /dev/whee zwimming
# zpool status zwimming
  pool: zwimming
 state: ONLINE
 scrub: none requested
config:

        NAME                                STATE     READ WRITE CKSUM
        zwimming                            ONLINE       0     0     0
          /dev/whee/YellowDiskUnderPinkBox  ONLINE       0     0     0

errors: No known data errors

-- richard
On Sep 25, 2007, at 7:09 PM, Richard Elling wrote:
> Actually, you can use the existing name space for this. By default, ZFS
> uses /dev/dsk. But everything in /dev is a symlink, so you could set up
> your own space, say /dev/myknowndisks, and use more descriptive names.
> ... In other words, if the answer is "let the sysadmin do it" then it can
> be considered solved. The stretch goal is to make some sort of reasonable
> name service. At this point I'll note that the FC folks envisioned
> something like that, but never implemented it.
...
> -- richard

But nobody would actually do this. If the process can't be condensed into a
single step (e.g. a single command), people won't bother.

Besides, who would take the chance that a 'boot -r' would keep their
elaborate symbolic link tree intact? I wouldn't. I've learned that you can't
count on anything in /dev remaining over 'boot -r', patches, driver updates,
san events, etc.

-----
Gregory Shaw, IT Architect
IT CTO Group, Sun Microsystems Inc.
Phone: (303)-272-8817 (x78817)
500 Eldorado Blvd, UBRM02-157        greg.shaw at sun.com (work)
Broomfield, CO 80021                 shaw at fmsoft.com (home)
"When Microsoft writes an application for Linux, I've won." - Linus Torvalds
On 9/25/07, Gregory Shaw <Greg.Shaw at sun.com> wrote:
> On Sep 25, 2007, at 7:09 PM, Richard Elling wrote:
>> Actually, you can use the existing name space for this. By default, ZFS
>> uses /dev/dsk. But everything in /dev is a symlink, so you could set up
>> your own space, say /dev/myknowndisks, and use more descriptive names.
...
> But nobody would actually do this. If the process can't be condensed into
> a single step (e.g. a single command), people won't bother.
>
> Besides, who would take the chance that a 'boot -r' would keep their
> elaborate symbolic link tree intact? I wouldn't.
>
> I've learned that you can't count on anything in /dev remaining over
> 'boot -r', patches, driver updates, san events, etc.

I've actually contemplated requesting such a feature for /dev (creating
vanity aliases that are persistent), as it would also be useful for things
like databases that use raw disk (e.g. Oracle ASM).
Please don't do this as a rule; it makes for horrendous support issues and
breaks a lot of health check tools.

>> Actually, you can use the existing name space for this. By default, ZFS
>> uses /dev/dsk. But everything in /dev is a symlink, so you could set up
>> your own space, say /dev/myknowndisks, and use more descriptive names.
>> ...
>>
>> But nobody would actually do this. If the process can't be condensed
>> into a single step (e.g. a single command), people won't bother.
>>
>> Besides, who would take the chance that a 'boot -r' would keep their
>> elaborate symbolic link tree intact? I wouldn't.
>>
>> I've learned that you can't count on anything in /dev remaining over
>> 'boot -r', patches, driver updates, san events, etc.
On Tue, Sep 25, 2007 at 06:09:04PM -0700, Richard Elling wrote:
> Actually, you can use the existing name space for this. By default, ZFS
> uses /dev/dsk. But everything in /dev is a symlink, so you could set up
> your own space, say /dev/myknowndisks, and use more descriptive names.
> You might need to hack on the startup service to not look at /dev, but
> that shouldn't be too hard. In other words, if the answer is "let the
> sysadmin do it" then it can be considered solved.

It seems to me that would limit the knowledge to the currently imported
machine, not keep it with the pool.

Naming of VxVM disks is valuable and powerful because the names are stored
in the diskgroup and are visible to any host that imports it.

--
Darren Dunham                                  ddunham at taos.com
Senior Technical Consultant   TAOS             http://www.taos.com/
Got some Dr Pepper?                            San Francisco, CA bay area
< This line left intentionally blank to confuse you. >
A Darren Dunham wrote:
> On Tue, Sep 25, 2007 at 06:09:04PM -0700, Richard Elling wrote:
>> Actually, you can use the existing name space for this. By default, ZFS
>> uses /dev/dsk. But everything in /dev is a symlink, so you could set up
>> your own space, say /dev/myknowndisks, and use more descriptive names.
>> ... In other words, if the answer is "let the sysadmin do it" then it
>> can be considered solved.
>
> It seems to me that would limit the knowledge to the currently imported
> machine, not keep it with the pool.

The point is that the name of the vdev doesn't really matter to ZFS.

> Naming of VxVM disks is valuable and powerful because the names are
> stored in the diskgroup and are visible to any host that imports it.

Ah, but VxVM has the concept of owned disks. You cannot share a disk across
VxVM instances (CVM excepted) and each OS can have only one VxVM instance.
OTOH, ZFS storage pools are at the vdev level, not the disk level, so ZFS
is not constrained by the disk boundary.

To take this discussion further: if a disk:vdev mapping is not 1:1, then
how would vanity naming of disks make any difference to ZFS?

-- richard
On Wed, Sep 26, 2007 at 09:53:00AM -0700, Richard Elling wrote:
> A Darren Dunham wrote:
>> It seems to me that would limit the knowledge to the currently imported
>> machine, not keep it with the pool.
>
> The point is that the name of the vdev doesn't really matter to ZFS.

I would assume the name of the disk doesn't matter to VxVM either. It's
just a visible name for administrators.

>> Naming of VxVM disks is valuable and powerful because the names are
>> stored in the diskgroup and are visible to any host that imports it.
>
> Ah, but VxVM has the concept of owned disks. You cannot share a disk
> across VxVM instances (CVM excepted) and each OS can have only one VxVM
> instance. OTOH, ZFS storage pools are at the vdev level, not the disk
> level, so ZFS is not constrained by the disk boundary.

I see no difference between ZFS and VxVM in these particulars. Neither one
by default allows simultaneous imports, both tend to use entire disks and
have advantages when used that way, and both allow a managed device (ZFS
vdev, VxVM disk) to be made out of a portion of a disk. (I will admit it is
less common to do that on VxVM than it is on ZFS.)

So amend my notion of naming a "disk" to naming a "vdev".

> To take this discussion further: if a disk:vdev mapping is not 1:1, then
> how would vanity naming of disks make any difference to ZFS?

I'm only suggesting that a common use of both is in a 1:1 situation, and
that being able to give names to storage is valuable in that case. I don't
see that the value is diminished because we can create a configuration
where it's less obvious how it would be used.

I see ZFS as having slightly less need for it at the moment only because
deallocation of storage can only happen on mirrors or by destroying a pool.
As the flexibility for moving/removing storage in a pool comes to be, I
think better ways to view information about the disks/vdevs are going to be
more important.

--
Darren Dunham                                  ddunham at taos.com
Senior Technical Consultant   TAOS             http://www.taos.com/
Got some Dr Pepper?                            San Francisco, CA bay area
< This line left intentionally blank to confuse you. >
Quick reset: Greg Shaw asked for a more descriptive output for zpool status.
I've already demonstrated how to do that. We also discussed the difficulty
in making a reliable name-to-physical-location map without involving humans.
Continuing on...

A Darren Dunham wrote:
> On Wed, Sep 26, 2007 at 09:53:00AM -0700, Richard Elling wrote:
>> A Darren Dunham wrote:
>>> It seems to me that would limit the knowledge to the currently imported
>>> machine, not keep it with the pool.
>> The point is that the name of the vdev doesn't really matter to ZFS.
>
> I would assume the name of the disk doesn't matter to VxVM either. It's
> just a visible name for administrators.

IIRC, the name also appears in the /dev/vx directory structure. Though Mark
Ashley claimed this "makes for horrendous support issues and breaks a lot
of health check tools," I disagree. zpool status will be perfectly happy,
and if someone wrote a health check tool which expects /dev/dsk/c* to mean
anything then they weren't thinking of modern (Solaris 2+) OSes. Recall
that /devices contains physical device entries; /dev is just a bunch of
symlinks to aid system administrators and applications which look for
default devices. IMHO, it is perfectly reasonable to use /dev for this
purpose, though in practice it could be any other directory of your
choosing.

>>> Naming of VxVM disks is valuable and powerful because the names are
>>> stored in the diskgroup and are visible to any host that imports it.
>> Ah, but VxVM has the concept of owned disks. You cannot share a disk
>> across VxVM instances (CVM excepted) and each OS can have only one VxVM
>> instance. OTOH, ZFS storage pools are at the vdev level, not the disk
>> level, so ZFS is not constrained by the disk boundary.
>
> I see no difference between ZFS and VxVM in these particulars.

I see a radical difference.

> Neither one by default allows simultaneous imports, both tend to use
> entire disks and have advantages when used that way, and both allow a
> managed device (ZFS vdev, VxVM disk) to be made out of a portion of a
> disk. (I will admit it is less common to do that on VxVM than it is on
> ZFS.)

AFAIK, VxVM still only expects one private region per disk. The private
region stores info on the configuration of the logical devices on the disk,
and its participation therein. ZFS places this data in the on-disk format
on the vdev, which is radically different. With ZFS you could conceivably
have a different storage pool per slice or partition.

> So amend my notion of naming a "disk" to naming a "vdev".
>
>> To take this discussion further: if a disk:vdev mapping is not 1:1, then
>> how would vanity naming of disks make any difference to ZFS?
>
> I'm only suggesting that a common use of both is in a 1:1 situation, and
> that being able to give names to storage is valuable in that case. I
> don't see that the value is diminished because we can create a
> configuration where it's less obvious how it would be used.

I think you are still thinking of the old way of doing things, where you
*had to worry* about disks. To some degree, ZFS frees you from that
restriction in that you can worry about storage pools, at a higher level of
abstraction. VxVM and SVM got us only part way down the road to
abstraction.

However, that doesn't relieve us from the serviceability issues surrounding
physical disks or vdevs. Even if we had a vdev name service in zpool(1m) to
provide human-readable lookups, we still have the issue that the rest of
the OS, especially FMA, will know the device by a different name.

> I see ZFS as having slightly less need for it at the moment only because
> deallocation of storage can only happen on mirrors or by destroying a
> pool. As the flexibility for moving/removing storage in a pool comes to
> be, I think better ways to view information about the disks/vdevs are
> going to be more important.

I think there is a use case lurking here, but it is not actually related to
ZFS. fmtopo has some knowledge of topology, but it is far from perfect for
random hardware, and seems particularly devoid of disk information. cfgadm
also has some info, also limited. Virtualization and multipathing further
abstract away physical knowledge. Is there any way to get all of the
parties involved to support a name service to perform mappings?

-- richard
ZFS should allow 31+NULL chars for a comment against each disk. This would
work well with the host name string (I assume max_hostname is 255+NULL).

If a disk fails it should report: c6t4908029d0 failed "comment from disk".
It should also remember the comment until reboot.

This would be useful for DR, or in clusters. By giving a disk a comment,
the operator can check its existence on a different server, work out which
one is missing, and fix it before doing an import. You would also need a
command to dump it out without importing a disk. In fact, it would be nice
to have a tool that checks to see if a disk is a zfs disk and prints out
its info without the need to import it.

Cheers
zdb?

Damon Atkins wrote:
> ZFS should allow 31+NULL chars for a comment against each disk. This
> would work well with the host name string (I assume max_hostname is
> 255+NULL).
>
> If a disk fails it should report: c6t4908029d0 failed "comment from
> disk". It should also remember the comment until reboot.
>
> This would be useful for DR, or in clusters. By giving a disk a comment,
> the operator can check its existence on a different server, work out
> which one is missing, and fix it before doing an import. You would also
> need a command to dump it out without importing a disk. In fact, it
> would be nice to have a tool that checks to see if a disk is a zfs disk
> and prints out its info without the need to import it.
>
> Cheers
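[On the question of inspecting a disk without importing it: zdb -l dumps the ZFS label stored on a device, including the pool name, pool guid and vdev configuration. A minimal sketch; the device path is just an example built from the disk name mentioned above.]

  # Dump the vdev labels from one disk without importing the pool
  # (device path is an example only):
  zdb -l /dev/dsk/c6t4908029d0s0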
On Wed, Sep 26, 2007 at 11:36:57AM -0700, Richard Elling wrote:
> AFAIK, VxVM still only expects one private region per disk. The private
> region stores info on the configuration of the logical devices on the
> disk, and its participation therein. ZFS places this data in the on-disk
> format on the vdev, which is radically different. With ZFS you could
> conceivably have a different storage pool per slice or partition.

Right. One private region per disk is certainly the standard way to set up
VxVM, but it is not required. With type=simple, the private and public data
share a slice.

# vxdisk -g testdg list
DEVICE       TYPE      DISK         GROUP        STATUS
c1t6d0s5     simple    disk6slice5  testdg       online
c1t6d0s6     simple    disk6slice6  testdg       online
# vxdisk -g testdg list disk6slice5
Device:    c1t6d0s5
devicetag: c1t6d0
type:      simple
[...]
info:      privoffset=1
flags:     online ready private autoimport imported
pubpaths:  block=/dev/vx/dmp/c1t6d0s5 char=/dev/vx/rdmp/c1t6d0s5
version:   2.1
iosize:    min=512 (bytes) max=2048 (blocks)
public:    slice=5 offset=2305 len=1022520 disk_offset=12510
private:   slice=5 offset=1 len=2048 disk_offset=12510
[...]

So the "Disk" name is 'disk6slice5', but that isn't really the name of a
disk on Solaris. It's just a slice, and I have two of these VM disks on
this physical disk.

>> I'm only suggesting that a common use of both is in a 1:1 situation, and
>> that being able to give names to storage is valuable in that case. I
>> don't see that the value is diminished because we can create a
>> configuration where it's less obvious how it would be used.
>
> I think you are still thinking of the old way of doing things, where you
> *had to worry* about disks. To some degree, ZFS frees you from that
> restriction in that you can worry about storage pools, at a higher level
> of abstraction. VxVM and SVM got us only part way down the road to
> abstraction.

I would agree, but I don't think we're completely away from that yet
either. When auto-identification of disks becomes possible (at whatever
level of the stack makes sense), then that will go a long way toward good
solutions. In the meantime, adding a name seems easier and possibly
helpful.

--
Darren Dunham                                  ddunham at taos.com
Senior Technical Consultant   TAOS             http://www.taos.com/
Got some Dr Pepper?                            San Francisco, CA bay area
< This line left intentionally blank to confuse you. >