Luke Schwab
2006-Oct-10 13:35 UTC
[zfs-discuss] Solaris 10 / ZFS file system major/minor number
Hi,

In migrating from **VM to ZFS, am I going to have an issue with major/minor numbers with NFS mounts? Take the following scenario.

1. NFS clients are connected to an active NFS server that has SAN shared storage between the active and standby nodes in a cluster.
2. The NFS clients are using the major/minor numbers on the active node in the cluster to communicate with the active NFS server.
3. The active node fails over to the secondary node in the cluster.
4. The NFS clients can no longer access the same major/minor number for NFS shares.

Does anyone know how ZFS fixes this problem? I read something on NFSv4 in Linux that has the "fsid" option of mount that allows you to set the device name instead of the major/minor number. Does Solaris 10 have anything like this?

Thanks,

ljs

This message posted from opensolaris.org
Sanjeev Bagewadi
2006-Oct-11 10:24 UTC
[zfs-discuss] Solaris 10 / ZFS file system major/minor number
Hi Luke,

Luke Schwab wrote:
> Hi,
>
> In migrating from **VM to ZFS am I going to have an issue with Major/Minor numbers with NFS mounts? Take the following scenario.
>
> 1. NFS clients are connected to an active NFS server that has SAN shared storage between the active and standby nodes in a cluster.
> 2. The NFS clients are using the major/minor numbers on the active node in the cluster to communicate to the NFS active server.
> 3. The active node fails over to the secondary node in the cluster.
> 4. The NFS clients can no longer access the same major minor number for NFS shares.

AFAIK NFS uses the fsid provided by the underlying FS to identify the file system. Most of the underlying filesystems use the major/minor number as the fsid (as reported by stat(2))...

ZFS creates a unique FSID for every filesystem (called an object set in ZFS terminology). The following stack trace shows how the unique id is created when a new FS is created:

-- snip --
# zfs create mypool/test
-- snip --
# dtrace -n 'fbt::unique_create:entry { stack(); ustack()}'
dtrace: description 'fbt::unique_create:entry ' matched 1 probe
CPU     ID                    FUNCTION:NAME
  1  38106              unique_create:entry
              zfs`dsl_dataset_create_sync+0x10e
              zfs`dmu_objset_create_sync+0x42
              zfs`dsl_dir_sync+0x47
              zfs`dsl_pool_sync+0xd4
              zfs`spa_sync+0x110
              zfs`txg_sync_thread+0x1a5
              unix`thread_start+0x8
-- snip --

The unique id is saved (on disk) as part of dsl_dataset_phys_t in ds_fsid_guid. This id is a random number generated when the FS is created.

This id is used to populate the zfs_t structure (refer to zfs_init_fs()).

The same id is then used as the FSID for NFS. Because it is stored in the pool rather than derived from the device numbers of the host, it does not change when the pool moves to another node.

Hope that answers your questions.

-- Sanjeev.

> Does anyone know how ZFS fixes this problem. I read something on NFSv4 in Linux that has the "fsid" option of mount that allows you to set the device name instead of the major/minor number. Does Solaris 10 have anything like this?
>
> Thanks,
>
> ljs
>
> This message posted from opensolaris.org
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

--
Solaris Revenue Products Engineering,
India Engineering Center,
Sun Microsystems India Pvt Ltd.
Tel: x27521 +91 80 669 27521
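To see the two identifiers discussed above side by side, here is a minimal Python sketch. It assumes a Unix host with Python 3.7 or later (where os.statvfs exposes f_fsid); the helper name describe_fs_identity is made up for illustration. The value statvfs reports is not guaranteed to be the exact value an NFS server embeds in its file handles, so treat this as a way to observe the distinction, not as a way to inspect real handles.

```python
import os


def describe_fs_identity(path):
    """Return the device numbers and the statvfs fsid of the file system holding path.

    Hypothetical helper for illustration only.
    """
    st = os.stat(path)        # st_dev encodes the major/minor device number
    vfs = os.statvfs(path)    # f_fsid is the file system id reported by the FS
    return {
        "major": os.major(st.st_dev),
        "minor": os.minor(st.st_dev),
        "fsid": vfs.f_fsid,
    }


if __name__ == "__main__":
    info = describe_fs_identity("/")
    print("major=%d minor=%d fsid=%#x" % (info["major"], info["minor"], info["fsid"]))
```

Running this against the same file system on two hosts is one way to check whether the reported fsid follows the storage or the host.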
Darren Dunham
2006-Oct-11 16:46 UTC
[zfs-discuss] Solaris 10 / ZFS file system major/minor number
> ZFS creates a unique FSID for every filesystem (called an object set in
> ZFS terminology).
>
> The unique id is saved (ondisk) as part of dsl_dataset_phys_t in
> ds_fsid_guid.
> And this id is a random number generated when the FS is created.
>
> This id is used to populate the zfs_t structure (refer to zfs_init_fs()).
>
> And the same id would be used as FSID for NFS.

Sorry, allow me to be dense for a moment.

Does this mean that I should expect to be able to bring up any machine with the same ZFS pool and the same IP address and have it serve filehandles handed out by a previous server? Including NFSv3 and NFSv4?

How about if I have to mount the filesystem on an alternate root?

I don't really have a setup that I could move between machines to test this at the moment...

--
Darren Dunham    ddunham at taos.com
Senior Technical Consultant    TAOS    http://www.taos.com/
Got some Dr Pepper?    San Francisco, CA bay area
< This line left intentionally blank to confuse you. >
Sanjeev Bagewadi
2006-Oct-12 05:31 UTC
[zfs-discuss] Solaris 10 / ZFS file system major/minor number
Hi Darren,

Comments inline...

Darren Dunham wrote:
>> ZFS creates a unique FSID for every filesystem (called an object set in
>> ZFS terminology).
>>
>> The unique id is saved (ondisk) as part of dsl_dataset_phys_t in
>> ds_fsid_guid.
>> And this id is a random number generated when the FS is created.
>>
>> This id is used to populate the zfs_t structure (refer to zfs_init_fs()).
>>
>> And the same id would be used as FSID for NFS.
>
> Sorry, allow me to be dense for a moment.
>
> Does this mean that I should expect to be able to bring up any machine
> with the same ZFS pool and the same IP address and have it serve
> filehandles handed out by a previous server? Including NFS3 and 4?
>
> How about if I have to mount the filesystem on an alternate root?
>
> I don't really have a setup that I could move between machines to test
> this at the moment...

This works... I tried it here and things worked fine. Here is what I did:

 - Configured an additional IP address (IP1) on a host (HOST1)
 - Shared a pool over NFS
 - Automounted the filesystem on my desktop
 - Started a "tail -f" on one of the files
 - Exported the pool on HOST1
 - Unplumbed the IP address IP1 on HOST1
 - Plumbed up the IP address on HOST2
 - Imported the pool on HOST2 with an alternate root, i.e. the -R option
 - Any further appends to the file were seen by the "tail -f"

So, it works...

Thanks and regards,
Sanjeev.

PS: This is very similar to what HA-ZFS does in SunCluster 3.2

--
Solaris Revenue Products Engineering,
India Engineering Center,
Sun Microsystems India Pvt Ltd.
Tel: x27521 +91 80 669 27521
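The manual failover test above can be written down as a command plan. The following Python sketch only builds and prints the commands; it executes nothing. The pool name, IP address, logical interface name and alternate root are all hypothetical placeholders, and the exact commands Sanjeev ran are not given in the thread, so the ifconfig/zfs/zpool invocations are an assumption about one reasonable way to carry out each step on Solaris 10.

```python
def failover_plan(pool, ip, iface, altroot):
    """Return the per-host command sequence as lists of argv lists.

    Nothing is executed; all argument values are supplied by the caller
    and are placeholders in the example below.
    """
    return {
        "HOST1 (setup)": [
            ["ifconfig", iface, "plumb"],          # create the logical interface
            ["ifconfig", iface, ip, "up"],         # bring up the service address IP1
            ["zfs", "set", "sharenfs=on", pool],   # share the pool over NFS
        ],
        "HOST1 (release)": [
            ["zpool", "export", pool],             # give up the pool
            ["ifconfig", iface, "unplumb"],        # give up the service address
        ],
        "HOST2 (take over)": [
            ["ifconfig", iface, "plumb"],
            ["ifconfig", iface, ip, "up"],         # same address, new host
            ["zpool", "import", "-R", altroot, pool],  # import under an alternate root
        ],
    }


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    plan = failover_plan("mypool", "192.0.2.10", "bge0:1", "/a")
    for host, commands in plan.items():
        print(host)
        for argv in commands:
            print("  # " + " ".join(argv))
```

The ordering mirrors the list above: the pool is exported before the address is released on HOST1, and the address is brought up on HOST2 before the pool is imported.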