Hi.

S10U2 Generic_118833-23 (SPARC). LUNs are provided by a Symmetrix box.

# zpool create t2 raidz c7t5d176 c7t5d177 c7t5d178 c7t5d179 c7t5d180 c7t5d181 c7t5d182 c7t5d183 c7t5d184 c7t5d185 c7t5d186 c7t5d187 c7t5d188 c7t5d189 c7t5d190
invalid vdev specification
use '-f' to override the following errors:
raidz contains devices of different sizes
#

Just remove the last one and try again:

# zpool create t2 raidz c7t5d176 c7t5d177 c7t5d178 c7t5d179 c7t5d180 c7t5d181 c7t5d182 c7t5d183 c7t5d184 c7t5d185 c7t5d186 c7t5d187 c7t5d188 c7t5d189
#
# zpool destroy t2
#

Hmm... let's see if the last two disks differ:

# prtvtoc /dev/rdsk/c7t5d189s0
* /dev/rdsk/c7t5d189s0 partition map
*
* Dimensions:
*      512 bytes/sector
* 17677440 sectors
* 17677373 accessible sectors
*
* Flags:
*   1: unmountable
*  10: read-only
*
*                          First     Sector    Last
* Partition  Tag  Flags    Sector     Count    Sector  Mount Directory
       0      4    00         34  17660989  17661022
       8     11    00   17661023     16384  17677406
# prtvtoc /dev/rdsk/c7t5d190s0
* /dev/rdsk/c7t5d190s0 partition map
*
* Dimensions:
*      512 bytes/sector
* 17677440 sectors
* 17677373 accessible sectors
*
* Flags:
*   1: unmountable
*  10: read-only
*
*                          First     Sector    Last
* Partition  Tag  Flags    Sector     Count    Sector  Mount Directory
       0      4    00         34  17660989  17661022
       8     11    00   17661023     16384  17677406
#

They look the same, and still d190 is somehow different.

# truss -v all zpool create ....
[...]
/1: open("/dev/dsk/c7t5d187", O_RDONLY) = 9
/1: fstat64(9, 0xFFBFB598) = 0
/1:     d=0x047C0000 i=16781665 m=0060640 l=1 u=0 g=3 rdev=0x008008AF
/1:         at = Sep 20 18:25:27 CEST 2006 [ 1158769527 ]
/1:         mt = Sep 20 18:25:27 CEST 2006 [ 1158769527 ]
/1:         ct = Sep 20 18:25:27 CEST 2006 [ 1158769527 ]
/1:     bsz=8192 blks=0 fs=devfs
/1: close(9) = 0
/1: open("/dev/dsk/c7t5d188", O_RDONLY) Err#2 ENOENT
/1: stat64("/dev/dsk/c7t5d188", 0xFFBFB598) Err#2 ENOENT
/1: open("/dev/dsk/c7t5d189", O_RDONLY) = 9
/1: fstat64(9, 0xFFBFB598) = 0
/1:     d=0x047C0000 i=16781697 m=0060640 l=1 u=0 g=3 rdev=0x008008BF
/1:         at = Sep 20 18:25:27 CEST 2006 [ 1158769527 ]
/1:         mt = Sep 20 18:25:27 CEST 2006 [ 1158769527 ]
/1:         ct = Sep 20 18:25:27 CEST 2006 [ 1158769527 ]
/1:     bsz=8192 blks=0 fs=devfs
/1: close(9) = 0
/1: open("/dev/dsk/c7t5d190", O_RDONLY) = 9
/1: fstat64(9, 0xFFBFB598) = 0
/1:     d=0x047C0000 i=16781713 m=0060640 l=1 u=0 g=3 rdev=0x008008C7
/1:         at = Sep 20 19:04:06 CEST 2006 [ 1158771846 ]
/1:         mt = Sep 20 19:04:06 CEST 2006 [ 1158771846 ]
/1:         ct = Sep 20 19:04:06 CEST 2006 [ 1158771846 ]
/1:     bsz=8192 blks=0 fs=devfs
/1: close(9) = 0
/1: fstat64(2, 0xFFBFA5C8) = 0
/1:     d=0x047C0000 i=12582920 m=0020620 l=1 u=0 g=7 rdev=0x00600002
/1:         at = Sep 21 11:21:54 CEST 2006 [ 1158830514 ]
/1:         mt = Sep 21 11:21:54 CEST 2006 [ 1158830514 ]
/1:         ct = Sep 20 18:21:33 CEST 2006 [ 1158769293 ]
/1:     bsz=8192 blks=0 fs=devfs
/1: write(2, " i n v a l i d v d e v".., 27) = 27
/1: write(2, " u s e ' - f ' t o ".., 43) = 43
/1: write(2, " r a i d z", 5) = 5
/1: write(2, " c o n t a i n s d e".., 37) = 37
/1: _exit(1)

Hmmmm... so maybe it's d188 after all.

# ls -l /dev/dsk/c7t5d187
lrwxrwxrwx 1 root other 46 Sep 20 18:25 /dev/dsk/c7t5d187 -> ../../devices/sbus@2,0/fce@2,401000/sd@5,bb:wd
# ls -l /dev/dsk/c7t5d188
/dev/dsk/c7t5d188: No such file or directory
# ls -l /dev/dsk/c7t5d189
lrwxrwxrwx 1 root other 46 Sep 20 18:25 /dev/dsk/c7t5d189 -> ../../devices/sbus@2,0/fce@2,401000/sd@5,bd:wd
# ls -l /dev/dsk/c7t5d190
lrwxrwxrwx 1 root root 46 Sep 20 19:04 /dev/dsk/c7t5d190 -> ../../devices/sbus@2,0/fce@2,401000/sd@5,be:wd
#

So with format -e I put an SMI label on it, then an EFI label again, and now there's a symlink:

# ls -l /dev/dsk/c7t5d188
lrwxrwxrwx 1 root root 46 Sep 21 11:31 /dev/dsk/c7t5d188 -> ../../devices/sbus@2,0/fce@2,401000/sd@5,bc:wd

But I still can't create a pool with d190.

# zpool create t2 raidz c7t5d176 c7t5d177 c7t5d178 c7t5d179 c7t5d180 c7t5d181 c7t5d182 c7t5d183 c7t5d184 c7t5d185 c7t5d186 c7t5d187 c7t5d188 c7t5d189 c7t5d190
invalid vdev specification
use '-f' to override the following errors:
raidz contains devices of different sizes
#

# dtrace -n syscall::fstat64:entry'{self->st=arg1;self->fd=arg0;}' -n syscall::fstat64:return'/self->st/{this->a=(struct stat64_32 *)copyin(self->st,sizeof(struct stat64_32));printf("%s size:%d", fds[self->fd].fi_pathname, (long long)this->a->st_size);self->st=0;self->pn=0;}' -c "zpool create t2 raidz c7t5d176 c7t5d177 c7t5d178 c7t5d179 c7t5d180 c7t5d181 c7t5d182 c7t5d183 c7t5d184 c7t5d185 c7t5d186 c7t5d187 c7t5d188 c7t5d189 c7t5d190" | grep ":wd"
dtrace: description 'syscall::fstat64:entry' matched 1 probe
dtrace: description 'syscall::fstat64:return' matched 1 probe
invalid vdev specification
use '-f' to override the following errors:
raidz contains devices of different sizes
dtrace: pid 13314 has exited
  8  12562  fstat64:return /devices/sbus@2,0/fce@2,401000/sd@5,b2:wd size:9050849280
  8  12562  fstat64:return /devices/sbus@2,0/fce@2,401000/sd@5,b3:wd size:9050849280
  8  12562  fstat64:return /devices/sbus@2,0/fce@2,401000/sd@5,b4:wd size:9050849280
  8  12562  fstat64:return /devices/sbus@2,0/fce@2,401000/sd@5,b6:wd size:9050849280
  8  12562  fstat64:return /devices/sbus@2,0/fce@2,401000/sd@5,b8:wd size:9050849280
  8  12562  fstat64:return /devices/sbus@2,0/fce@2,401000/sd@5,b9:wd size:9050849280
  8  12562  fstat64:return /devices/sbus@2,0/fce@2,401000/sd@5,ba:wd size:9050849280
  8  12562  fstat64:return /devices/sbus@2,0/fce@2,401000/sd@5,bb:wd size:9050849280
  8  12562  fstat64:return /devices/sbus@2,0/fce@2,401000/sd@5,bc:wd size:9050357760
  8  12562  fstat64:return /devices/sbus@2,0/fce@2,401000/sd@5,bd:wd size:9050849280
  8  12562  fstat64:return /devices/sbus@2,0/fce@2,401000/sd@5,be:wd size:9050357760
#

Hmmm... that's interesting.

Disk d190 has size 9050357760 as reported by fstat64().
Disk d189 has size 9050849280.
Disk d188 has size 9050357760.
The other disks have size 9050849280.

Now why are these two disks reported with a different size?

# ./inq.solaris -no_dots| grep d190
/dev/rdsk/c7t5d190s2 :EMC :SYMMETRIX :5568 :9419E050 :8838720
# ./inq.solaris -no_dots| grep d189
/dev/rdsk/c7t5d189s2 :EMC :SYMMETRIX :5568 :9419D050 :8838720
# ./inq.solaris -no_dots| grep d188
/dev/rdsk/c7t5d188s2 :EMC :SYMMETRIX :5568 :9419C050 :8838720
#

The last column is the reported size.

Any ideas?

PS. Of course I can force pool creation, but something is still wrong here.

This message posted from opensolaris.org
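The numbers in this post can be cross-checked with a little arithmetic. The sketch below is not part of the original message; it only reuses the values printed by prtvtoc, inq and dtrace above, and it assumes that the last inq column is in kilobytes (1024 bytes) and that the sector size is the 512 bytes prtvtoc reports. Under those assumptions, prtvtoc and inq both agree with the larger fstat64() size, and the two odd disks come out 960 sectors short.

```python
SECTOR_BYTES = 512  # "512 bytes/sector" from prtvtoc

# Values copied from the prtvtoc, inq and dtrace output in the post.
prtvtoc_total_sectors = 17677440   # "sectors" line, identical for d189 and d190
inq_capacity_kb = 8838720          # last inq column; KB is an assumption
fstat_common = 9050849280          # st_size reported for most disks
fstat_odd = 9050357760             # st_size reported for d188 and d190

prtvtoc_bytes = prtvtoc_total_sectors * SECTOR_BYTES
inq_bytes = inq_capacity_kb * 1024
missing_bytes = fstat_common - fstat_odd
missing_sectors = missing_bytes // SECTOR_BYTES

print("prtvtoc matches the common fstat size:", prtvtoc_bytes == fstat_common)
print("inq matches the common fstat size:", inq_bytes == fstat_common)
print("odd disks are short by", missing_bytes, "bytes =", missing_sectors, "sectors")
```

So the label and the array agree on the full size for every LUN, and only the fstat64() result for d188 and d190 disagrees.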
Robert Milkowski
2006-Sep-21 10:39 UTC
[zfs-discuss] Re: zpool wrongly recognizes disk size
Looking at zpool_vdev.c#663: if I remove the links to a whole disk, zpool will skip size checking for it. So I did:

# rm /dev/dsk/c7t5d190
# rm /dev/dsk/c7t5d189
# rm /dev/dsk/c7t5d188

And now:

# zpool create t2 raidz c7t5d176 c7t5d177 c7t5d178 c7t5d179 c7t5d180 c7t5d181 c7t5d182 c7t5d183 c7t5d184 c7t5d185 c7t5d186 c7t5d187 c7t5d188 c7t5d189 c7t5d190
#

OK, it worked. However, it looks like zpool skips the vdev size check if it can't get a size from fstat() or stat().

1. What would happen if the sizes actually are different, but this wasn't checked and the pool was created? Will ZFS panic or generate r/w errors when accessing nonexistent blocks?

2. What about my case: why do format() and EMC's inq (and the Symmetrix itself) show that all these devices are the same size while fstat() shows different sizes?
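The behaviour described here can be modelled in a few lines. This is only a toy model of what the post observes, not the actual code in zpool_vdev.c: the function name `check_raidz_sizes` and the use of `None` for "open()/fstat() failed" are made up for illustration. It compares every device with a known size against the first known size and silently skips devices whose size could not be obtained.

```python
from typing import Dict, List, Optional


def check_raidz_sizes(devices: List[str],
                      sizes: Dict[str, Optional[int]]) -> List[str]:
    """Return the devices whose size differs from the first known size.

    A size of None models open() or fstat() failing for that device;
    such devices are skipped rather than reported (hypothetical model).
    """
    reference = None
    mismatched = []
    for dev in devices:
        size = sizes.get(dev)
        if size is None:
            continue  # no size available: the check is skipped
        if reference is None:
            reference = size
        elif size != reference:
            mismatched.append(dev)
    return mismatched


devices = ["c7t5d187", "c7t5d188", "c7t5d189", "c7t5d190"]

# Sizes as reported by fstat64() in the first post.
with_links = {"c7t5d187": 9050849280, "c7t5d188": 9050357760,
              "c7t5d189": 9050849280, "c7t5d190": 9050357760}

# After "rm /dev/dsk/c7t5d18[89] /dev/dsk/c7t5d190" no size can be read.
without_links = {"c7t5d187": 9050849280, "c7t5d188": None,
                 "c7t5d189": None, "c7t5d190": None}

print(check_raidz_sizes(devices, with_links))
print(check_raidz_sizes(devices, without_links))
```

In this model the first call flags d188 and d190, while the second flags nothing, which matches the pool being created without -f once the links were removed.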
Robert Milkowski
2006-Sep-21 10:59 UTC
[zfs-discuss] Re: zpool wrongly recognizes disk size
I forced creation of the raidz pool.

# zpool create symm36 raidz c7t5d161 c7t5d162 c7t5d163 c7t5d164 c7t5d165 c7t5d166 c7t5d167 c7t5d168 c7t5d169 c7t5d170 c7t5d171 c7t5d172 c7t5d173 c7t5d174 c7t5d175 raidz c7t5d176 c7t5d177 c7t5d178 c7t5d179 c7t5d180 c7t5d181 c7t5d182 c7t5d183 c7t5d184 c7t5d185 c7t5d186 c7t5d187 c7t5d188 c7t5d189 c7t5d190 raidz c7t5d191 c7t5d192 c7t5d193 c7t5d194 c7t5d195 c7t5d196 c7t5d197 c7t5d198 c7t5d199 c7t5d200 c7t5d201 c7t5d202 c7t5d203 c7t5d204 c7t5d205 raidz c7t5d206 c7t5d207 c7t5d208 c7t5d209 c7t5d210 c7t5d211 c7t5d212 c7t5d213 c7t5d214 c7t5d215 c7t5d216 c7t5d217 c7t5d218 c7t5d219 c7t5d220
invalid vdev specification
use '-f' to override the following errors:
raidz contains devices of different sizes
# zpool create -f symm36 raidz c7t5d161 c7t5d162 c7t5d163 c7t5d164 c7t5d165 c7t5d166 c7t5d167 c7t5d168 c7t5d169 c7t5d170 c7t5d171 c7t5d172 c7t5d173 c7t5d174 c7t5d175 raidz c7t5d176 c7t5d177 c7t5d178 c7t5d179 c7t5d180 c7t5d181 c7t5d182 c7t5d183 c7t5d184 c7t5d185 c7t5d186 c7t5d187 c7t5d188 c7t5d189 c7t5d190 raidz c7t5d191 c7t5d192 c7t5d193 c7t5d194 c7t5d195 c7t5d196 c7t5d197 c7t5d198 c7t5d199 c7t5d200 c7t5d201 c7t5d202 c7t5d203 c7t5d204 c7t5d205 raidz c7t5d206 c7t5d207 c7t5d208 c7t5d209 c7t5d210 c7t5d211 c7t5d212 c7t5d213 c7t5d214 c7t5d215 c7t5d216 c7t5d217 c7t5d218 c7t5d219 c7t5d220
# zdb|grep asize
asize=135565148160
asize=135565148160
asize=135565148160
asize=135565148160
# bc

So while zpool complains, it looks like once the pool is created all raidz groups have the same size. Of course, the question is whether asize is assumed (based on the first vdev that reported a size in a given group), or whether all devices were actually checked and, from the kernel's point of view, all these devices are the same size and only fstat() reports a wrong size for some of them?
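One way to probe that question is to see whether the asize value can be reproduced from the slice 0 size alone. The sketch below is not from the original post. It takes asize from the zdb output above and the slice 0 sector count from the prtvtoc output in the first post, and it assumes (this is my assumption about the ZFS on-disk layout, not something stated in this thread) that each child device is rounded down to a 256 KB boundary and then loses 4 MB at the front and 512 KB at the end to labels and boot space.

```python
ASIZE = 135565148160        # per raidz group, from "zdb | grep asize"
CHILDREN = 15               # devices per raidz group in the zpool command
SLICE0_SECTORS = 17660989   # slice 0 "Sector Count" from prtvtoc
SECTOR_BYTES = 512

# Assumed label layout (not stated in the thread): align the device to
# 256 KB, then reserve 4 MB at the start and 512 KB at the end.
LABEL_ALIGN = 256 * 1024
LABEL_RESERVE = 4 * 1024 * 1024 + 512 * 1024

per_child = ASIZE // CHILDREN
slice0_bytes = SLICE0_SECTORS * SECTOR_BYTES
aligned = (slice0_bytes // LABEL_ALIGN) * LABEL_ALIGN
usable = aligned - LABEL_RESERVE

print("asize per child:      ", per_child)
print("usable slice 0 bytes: ", usable)
print("match:", per_child == usable)
```

Under these assumptions the per-child share of asize equals the usable part of slice 0, which is the same on every disk according to prtvtoc. That would suggest the kernel sized the children from slice 0 rather than from the fstat() values that zpool compared.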
On Thu, Sep 21, 2006 at 03:39:08AM -0700, Robert Milkowski wrote:
>
> 1. What would happen is size actually are different but it wasn't
>    checked and pool was created? ZFS will panic or generate r/w error
>    when accessing non existant blocks?

No, the devices will all be created with the same size in the kernel. The size check is solely enforced in userland.

> 2. What about my case - why format() or EMC's inq (and Symmetrix
>    itself) show that all these devices are the same size while
>    fstat() shows different?

There are some oddities w.r.t. specfs, devices, and size determination. We play some tricks in zpool, such as keeping the device node open, to try and get a reliable device size. But my understanding is that it's still possible to get the wrong answer for some devices. I would suggest doing a 'truss -v fstat -t open' and see what the actual values being returned are.

- Eric

--
Eric Schrock, Solaris Kernel Development
http://blogs.sun.com/eschrock
Robert Milkowski
2006-Sep-22 12:39 UTC
[zfs-discuss] Re: Re: zpool wrongly recognizes disk size
1. But the size is not checked on all devices: if open() or fstat() returns an error, the device is just skipped. So if the first device reports a size larger than the rest, and you couldn't get the size for any of the others, I guess zpool will assume the size of each vdev to be that of the first device, right?

2. At the end of my first post I used dtrace to see what sizes fstat64() returned to zpool, and as you can see some devices reported different sizes than the others. But format/inq report the same size for all these devices, so something is wrong here.

Here's the truss output you requested.

# truss -v fstat -t open,fstat zpool create t2 raidz c7t5d176 c7t5d177 c7t5d178 c7t5d179 c7t5d180 c7t5d181 c7t5d182 c7t5d183 c7t5d184 c7t5d185 c7t5d186 c7t5d187 c7t5d188 c7t5d189 c7t5d190
[...]
/1: open("/dev/dsk/c7t5d183", O_RDONLY) Err#2 ENOENT
/1: open("/dev/dsk/c7t5d184", O_RDONLY) = 8
/1: fstat64(8, 0xFFBFB598) = 0
/1:     d=0x047C0000 i=16781617 m=0060640 l=1 u=0 g=3 rdev=0x00800897
/1:         at = Sep 20 18:25:26 CEST 2006 [ 1158769526 ]
/1:         mt = Sep 20 18:25:26 CEST 2006 [ 1158769526 ]
/1:         ct = Sep 20 18:25:26 CEST 2006 [ 1158769526 ]
/1:     bsz=8192 blks=0 fs=devfs
/1: open("/dev/dsk/c7t5d185", O_RDONLY) = 8
/1: fstat64(8, 0xFFBFB598) = 0
/1:     d=0x047C0000 i=16781633 m=0060640 l=1 u=0 g=3 rdev=0x0080089F
/1:         at = Sep 20 18:25:26 CEST 2006 [ 1158769526 ]
/1:         mt = Sep 20 18:25:26 CEST 2006 [ 1158769526 ]
/1:         ct = Sep 20 18:25:26 CEST 2006 [ 1158769526 ]
/1:     bsz=8192 blks=0 fs=devfs
/1: open("/dev/dsk/c7t5d186", O_RDONLY) = 8
/1: fstat64(8, 0xFFBFB598) = 0
/1:     d=0x047C0000 i=16781649 m=0060640 l=1 u=0 g=3 rdev=0x008008A7
/1:         at = Sep 20 18:25:27 CEST 2006 [ 1158769527 ]
/1:         mt = Sep 20 18:25:27 CEST 2006 [ 1158769527 ]
/1:         ct = Sep 20 18:25:27 CEST 2006 [ 1158769527 ]
/1:     bsz=8192 blks=0 fs=devfs
/1: open("/dev/dsk/c7t5d187", O_RDONLY) = 8
/1: fstat64(8, 0xFFBFB598) = 0
/1:     d=0x047C0000 i=16781665 m=0060640 l=1 u=0 g=3 rdev=0x008008AF
/1:         at = Sep 20 18:25:27 CEST 2006 [ 1158769527 ]
/1:         mt = Sep 20 18:25:27 CEST 2006 [ 1158769527 ]
/1:         ct = Sep 20 18:25:27 CEST 2006 [ 1158769527 ]
/1:     bsz=8192 blks=0 fs=devfs
/1: open("/dev/dsk/c7t5d188", O_RDONLY) = 8
/1: fstat64(8, 0xFFBFB598) = 0
/1:     d=0x047C0000 i=16781681 m=0060640 l=1 u=0 g=3 rdev=0x008008B7
/1:         at = Sep 21 11:31:52 CEST 2006 [ 1158831112 ]
/1:         mt = Sep 21 11:31:52 CEST 2006 [ 1158831112 ]
/1:         ct = Sep 21 11:31:52 CEST 2006 [ 1158831112 ]
/1:     bsz=8192 blks=0 fs=devfs
/1: fstat64(2, 0xFFBFA5C8) = 0
/1:     d=0x047C0000 i=12582920 m=0020620 l=1 u=0 g=7 rdev=0x00600002
/1:         at = Sep 22 14:32:07 CEST 2006 [ 1158928327 ]
/1:         mt = Sep 22 14:32:19 CEST 2006 [ 1158928339 ]
/1:         ct = Sep 20 18:21:33 CEST 2006 [ 1158769293 ]
/1:     bsz=8192 blks=0 fs=devfs
invalid vdev specification
use '-f' to override the following errors:
raidz contains devices of different sizes
/1: open("/dev/dsk/c7t5d189", O_RDONLY) = 8
/1: fstat64(8, 0xFFBFB598) = 0
/1:     d=0x047C0000 i=16781697 m=0060640 l=1 u=0 g=3 rdev=0x008008BF
/1:         at = Sep 20 18:25:27 CEST 2006 [ 1158769527 ]
/1:         mt = Sep 20 18:25:27 CEST 2006 [ 1158769527 ]
/1:         ct = Sep 20 18:25:27 CEST 2006 [ 1158769527 ]
/1:     bsz=8192 blks=0 fs=devfs
/1: open("/dev/dsk/c7t5d190", O_RDONLY) = 8
/1: fstat64(8, 0xFFBFB598) = 0
/1:     d=0x047C0000 i=16781713 m=0060640 l=1 u=0 g=3 rdev=0x008008C7
/1:         at = Sep 20 19:04:06 CEST 2006 [ 1158771846 ]
/1:         mt = Sep 20 19:04:06 CEST 2006 [ 1158771846 ]
/1:         ct = Sep 20 19:04:06 CEST 2006 [ 1158771846 ]
/1:     bsz=8192 blks=0 fs=devfs
#