Hi Ben,

Benjamin Bennett wrote:
> Hi Eric,
>
> If you could give me your input on something I'd greatly appreciate
> it. Or, if I should just send this to -devel let me know...

Yes, you can always send to -devel for open discussion. I'm CCing it.

> For Lustre-WAN across TeraGrid we were hoping to distribute OSSs
> across several resource providers (sites), leveraging existing kerberos
> infrastructure, placing OSSs in each resource provider's local kerberos
> realm, and the MDS in the teragrid realm. Unfortunately, MDT -> OST
> connections will not allow the MDT and OST to be in different realms,
> since an OSS considers an MDS to be anyone holding a lustre_mds
> principal in their local realm.
>
> This also seems undesirable within a single realm where multiple
> lustre clusters may exist, as an OSS in cluster A will trust an MDS for
> cluster B, and an OSS for cluster B will trust an MDS for cluster A.
>
> My first thought was to add functionality to tell the OSS's lsvcgssd
> what the trusted MDS principals should be (local or not). Do you have
> any thoughts on this?

We simply never considered a setup in which the OSSs are located in
multiple realms. I agree with you: in this case we can make the OSS
configurable to accept only designated lustre_mds principals, local or
remote.

I have a couple of questions, just out of curiosity: 1) is the benefit of
cross-site OSSs mainly about bandwidth? 2) in the future, with CMD
(clustered metadata, multiple MDS nodes), would it be useful to distribute
the MDSs across multiple sites as well?

Thanks
--
Eric
Yes, it will be very important that we can separate OSTs/MDTs widely.

But placing them in different realms, I'm not sure about. Can PSC explain
what administrative model warrants that? Why can a remote OST not be part
of the realm of the MDS that controls it?

Peter

On 7/8/08 11:27 AM, "Eric Mei" <Eric.Mei at Sun.COM> wrote:
> [...]
Peter Braam wrote:
> Yes, it will be very important that we can separate OSTs/MDTs widely.
>
> But placing them in different realms, I'm not sure about. Can PSC explain
> what administrative model warrants that? Why can a remote OST not be part
> of the realm of the MDS that controls it?

The OSTs will be distributed among several resource provider
organizations, each with its own existing domain name space and kerberos
realm. There is also a centrally managed teragrid realm which could be
used to provide cross-realm transit between the resource provider realms.
With this kerberos authentication infrastructure already in place, the
issue comes down to authorizing a principal as an MDS, the logic of which
I believe should be reconsidered regardless of cross-realm issues.

Currently an OSS's authorization of an MDS is inherent in the name of the
principal (lustre_mds/host), so AFAICT one cannot safely run two distinct
lustre clusters within a single kerberos realm. Moreover, this assumes
that all kerberos admins are knowledgeable enough about lustre to issue
lustre_mds/host principals only to entities that should have MDS
privileges throughout the entire realm. Please do correct me if I'm wrong
here.

--ben
On Tue, 8 Jul 2008, Eric Mei wrote:
> [...]
>
> I have a couple of questions, just out of curiosity: 1) is the benefit of
> cross-site OSSs mainly about bandwidth? 2) in the future, with CMD
> (clustered metadata, multiple MDS nodes), would it be useful to
> distribute the MDSs across multiple sites as well?

At this point we only want to be able to test functionality, i.e. create a
lustre fs over the WAN with OSS contributions from various remote RPs
while kerberos auth is enabled. Eventually one will have to deal with the
performance issues of such a setup, but hopefully by that time other
mechanisms (distributed MDS) will become available to manage the remote
OSSs.

Thanks,
-j
Hmm. Perhaps there are implementation issues here that overshadow the
architecture.

To interact with MDS nodes that are part of one file system, the MDS needs
to be part of a realm. The MDS performs authorization based on a
principal-to-MDS (i.e. Lustre) user/group database. Within one Lustre file
system each MDS MUST HAVE the same user/group database. We will likely
want to place MDSs distributedly in the longer-term future, so take clear
note of this: one Kerberos realm owns the entire MDS cluster for a file
system.

There can be multiple MDS clusters, i.e. Lustre file systems, in a single
realm, each serving its own file system. Each Lustre file system can have
its own user/group database. No restrictions here.

For a given file system the MDS nodes produce capabilities which the OSS
nodes use for authorization. It is important that the MDS can make
authenticated RPCs to the OSS nodes in its file system, and for this we
use Kerberos (this is not a "must have" - it could have been done with a
different key-sharing mechanism).

==> So the first issue you have to become clear about is how you authorize
an MDS to contact one of its OSS nodes, wherever these are placed.

Similarly, the Kerberos connections are used by the clients to connect to
the OSS, but they are not used to authenticate anything (except,
optionally, the node); they are used merely to provide privacy and/or
authenticity for transporting data between the client and the OSS nodes.
With relatively little effort this could be done without Kerberos at all;
on the other hand, using Kerberos for this probably leads to a more easily
understood architecture.

So, to repeat, the authorization uses capabilities, which authenticate the
requestor and contain authorization information, independent of a server
user/group database on the OSS.

==> The second issue you need to be clear about is how you authenticate
client NODES (NOT users) to OSS nodes.

Peter

On 7/8/08 12:41 PM, "Benjamin Bennett" <ben at psc.edu> wrote:
> [...]
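To make the capability mechanism described above concrete, here is a
minimal sketch. The field names and the HMAC-SHA256 construction are
chosen only for illustration and are not Lustre's actual capability
format: the MDS signs the object id, user and permitted operations with a
key it shares with its OSSs, and the OSS verifies the signature and expiry
without consulting any user/group database of its own.

    /* Illustrative capability sketch, NOT Lustre's implementation. */
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>
    #include <time.h>
    #include <openssl/evp.h>
    #include <openssl/hmac.h>

    struct capability {
        uint64_t fid;            /* object the capability covers */
        uint32_t uid;            /* user it was issued to */
        uint32_t opc_mask;       /* permitted operations, e.g. read/write bits */
        uint64_t expiry;         /* unix time after which it is invalid */
        unsigned char hmac[32];  /* HMAC-SHA256 over the fields above */
    };

    /* MDS side: sign everything up to, but not including, the hmac field */
    static void cap_sign(struct capability *cap,
                         const unsigned char *key, size_t keylen)
    {
        unsigned int len = sizeof(cap->hmac);

        HMAC(EVP_sha256(), key, (int)keylen,
             (const unsigned char *)cap, offsetof(struct capability, hmac),
             cap->hmac, &len);
    }

    /* OSS side: returns 1 if the capability is genuine, unexpired and
     * permits the requested operation */
    static int cap_verify(const struct capability *cap, uint32_t requested_opc,
                          const unsigned char *key, size_t keylen)
    {
        unsigned char expect[32];
        unsigned int len = sizeof(expect);

        HMAC(EVP_sha256(), key, (int)keylen,
             (const unsigned char *)cap, offsetof(struct capability, hmac),
             expect, &len);

        if (memcmp(expect, cap->hmac, sizeof(expect)) != 0)
            return 0;                           /* forged or corrupted */
        if ((uint64_t)time(NULL) > cap->expiry)
            return 0;                           /* expired */
        return (cap->opc_mask & requested_opc) == requested_opc;
    }

The point of the design is visible in cap_verify(): whoever holds the
shared key can mint capabilities for the OSS, which is exactly why the
question of which MDS principals an OSS trusts matters.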
Benjamin Bennett wrote:
> For Lustre-WAN across TeraGrid we were hoping to distribute OSSs
> across several resource providers (sites), leveraging existing kerberos
> infrastructure, placing OSSs in each resource provider's local kerberos
> realm, and the MDS in the teragrid realm. Unfortunately, MDT -> OST
> connections will not allow the MDT and OST to be in different realms,
> since an OSS considers an MDS to be anyone holding a lustre_mds
> principal in their local realm.

As an FYI, you should follow the work being done on OST pools (bug 14836
and its descendants), which will allow users to specify subsets of the
OSTs on which to store files. This would allow users to save output files
only to local OSTs for jobs being run at some site, while still accessing
input files from any OSTs transparently (albeit more slowly over a WAN
connection). This feature is expected to be in the 1.8 release.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
Peter Braam wrote:
> Hmm. Perhaps there are implementation issues here that overshadow the
> architecture.
>
> To interact with MDS nodes that are part of one file system, the MDS needs
> to be part of a realm. The MDS performs authorization based on a
> principal-to-MDS (i.e. Lustre) user/group database. Within one Lustre file
> system each MDS MUST HAVE the same user/group database. We will likely
> want to place MDSs distributedly in the longer-term future, so take clear
> note of this: one Kerberos realm owns the entire MDS cluster for a file
> system.

Could you explain more about why this requires a single realm and not just
consistent mappings across all MDSs?

> There can be multiple MDS clusters, i.e. Lustre file systems, in a single
> realm, each serving its own file system. Each Lustre file system can have
> its own user/group database. No restrictions here.

Well, that's the problem with multiple clusters in a single realm, lack
of restriction... ;-)

> For a given file system the MDS nodes produce capabilities which the OSS
> nodes use for authorization. It is important that the MDS can make
> authenticated RPCs to the OSS nodes in its file system, and for this we
> use Kerberos (this is not a "must have" - it could have been done with a
> different key-sharing mechanism).

With multiple clusters in a single realm, an MDS from any cluster could
authenticate and authorize as an MDS to an OSS in any cluster. This would
allow an MDS in one cluster to change the key used for capabilities on the
OSSs in another cluster, no?

> ==> So the first issue you have to become clear about is how you authorize
> an MDS to contact one of its OSS nodes, wherever these are placed.

I've changed lsvcgssd on the OSSs to take an arbitrary number of '-M
lustre_mds/mdshost at REALM' arguments and use this list to determine MDS
authorization. Is there a way in which an OSS is already aware of its
appropriate MDSs?

> Similarly, the Kerberos connections are used by the clients to connect to
> the OSS, but they are not used to authenticate anything (except,
> optionally, the node); they are used merely to provide privacy and/or
> authenticity for transporting data between the client and the OSS nodes.
> [...]
>
> ==> The second issue you need to be clear about is how you authenticate
> client NODES (NOT users) to OSS nodes.

Client nodes are issued lustre_root/host credentials from their local
realm. This works just fine for Client -> OST since the only
[kerberos-related] authorization check is a "lustre_root" service part.

> Peter

--ben
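As a rough illustration of the allow-list behaviour Ben describes adding
to lsvcgssd: the daemon could collect the '-M' principals at startup and
accept an MDS security context only if the authenticated peer matches one
of them. This is a sketch under those assumptions, not the actual lsvcgssd
code, and the fallback behaviour is simplified:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define MAX_TRUSTED_MDS 32

    static char *trusted_mds[MAX_TRUSTED_MDS];
    static int   n_trusted_mds;

    /* called while parsing "-M <principal>" options */
    static int add_trusted_mds(const char *principal)
    {
        if (n_trusted_mds >= MAX_TRUSTED_MDS)
            return -1;
        trusted_mds[n_trusted_mds++] = strdup(principal);
        return 0;
    }

    /* called after GSS authentication with the peer's principal name,
     * e.g. "lustre_mds/mds1.site-a.org@SITE-A.ORG" */
    static int mds_is_authorized(const char *client_principal)
    {
        int i;

        if (n_trusted_mds == 0) {
            /* no -M options given: fall back to the old behaviour of
             * trusting any lustre_mds/... principal (realm check omitted
             * in this simplified sketch) */
            return strncmp(client_principal, "lustre_mds/", 11) == 0;
        }

        for (i = 0; i < n_trusted_mds; i++)
            if (strcmp(client_principal, trusted_mds[i]) == 0)
                return 1;

        fprintf(stderr, "rejecting MDS connection from %s\n",
                client_principal);
        return 0;
    }

An exact-match list like this works across realms, since the full
principal (service, host and realm) is compared rather than just the
service part in the local realm.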
On 7/8/08 2:38 PM, "Benjamin Bennett" <ben at psc.edu> wrote:
> Peter Braam wrote:
>> To interact with MDS nodes that are part of one file system, the MDS
>> needs to be part of a realm. [...] Within one Lustre file system each
>> MDS MUST HAVE the same user/group database. [...] one Kerberos realm
>> owns the entire MDS cluster for a file system.
>
> Could you explain more about why this requires a single realm and not just
> consistent mappings across all MDSs?

That MIGHT work ... But how would two domains guarantee consistent updates
to the databases? However, the server-to-server trust across domains we
need is new to me (and I am not sure if/how it works).

>> There can be multiple MDS clusters, i.e. Lustre file systems, in a single
>> realm, each serving its own file system. Each Lustre file system can have
>> its own user/group database. No restrictions here.
>
> Well, that's the problem with multiple clusters in a single realm, lack
> of restriction... ;-)

Restrict yourself, not me or Lustre :)

>> For a given file system the MDS nodes produce capabilities which the OSS
>> nodes use for authorization. It is important that the MDS can make
>> authenticated RPCs to the OSS nodes in its file system, and for this we
>> use Kerberos (this is not a "must have" - it could have been done with a
>> different key-sharing mechanism).
>
> With multiple clusters in a single realm, an MDS from any cluster could
> authenticate and authorize as an MDS to an OSS in any cluster.

Good point. If so, that should be a bug.

===> Eric Mei, what is the story here?

The key (which is manually generated) should authenticate an instance of
an MDS, not a "cluster". The only case where this might become delicate is
if one MDS node is the server for two file systems.

> This would allow an MDS in one cluster to change the key used for
> capabilities on the OSSs in another cluster, no?
>
>> ==> So the first issue you have to become clear about is how you
>> authorize an MDS to contact one of its OSS nodes, wherever these are
>> placed.
>
> I've changed lsvcgssd on the OSSs to take an arbitrary number of '-M
> lustre_mds/mdshost at REALM' arguments and use this list to determine MDS
> authorization. Is there a way in which an OSS is already aware of its
> appropriate MDSs?

As you pointed out, we need that, and Eric Mei should help you get that.

>> ==> The second issue you need to be clear about is how you authenticate
>> client NODES (NOT users) to OSS nodes.
>
> Client nodes are issued lustre_root/host credentials from their local
> realm. This works just fine for Client -> OST since the only
> [kerberos-related] authorization check is a "lustre_root" service part.

Good. Does it work across realms? Because it seems we need that in any
case.

BTW, thank you for trying this all out in detail, that is very helpful.
Perhaps Sheila could talk with you and Eric Mei and get a nice writeup
done for the manual.

Regards

Peter
Peter Braam wrote:
> On 7/8/08 2:38 PM, "Benjamin Bennett" <ben at psc.edu> wrote:
>> Peter Braam wrote:
>>> [...] Within one Lustre file system each MDS MUST HAVE the same
>>> user/group database. [...] one Kerberos realm owns the entire MDS
>>> cluster for a file system.
>>
>> Could you explain more about why this requires a single realm and not
>> just consistent mappings across all MDSs?
>
> That MIGHT work ... But how would two domains guarantee consistent
> updates to the databases? However, the server-to-server trust across
> domains we need is new to me (and I am not sure if/how it works).

Practically it's doable, of course. But as Peter pointed out, the user
database must be the same across all MDSs within a Lustre FS. If 2 MDSs
could share the user database, why bother putting them into different
kerberos realms? So we assume all MDSs should be in a single realm. Does
TeraGrid have a different requirement?

>> With multiple clusters in a single realm, an MDS from any cluster could
>> authenticate and authorize as an MDS to an OSS in any cluster.
>
> Good point. If so, that should be a bug.
>
> ===> Eric Mei, what is the story here?

Yes, Ben is right: currently, within the same realm any MDS can
authenticate with any MDS and OSS. But AFAICS the problem has nothing to
do with Kerberos. It's because Lustre currently has no configuration
information about server cluster membership; each server target has no
idea what the other targets are.

To solve this, we can either place the configuration on each MDS/OST node,
as Ben proposed in his last mail, or, probably better, have it centrally
managed by the MGS, so that MDTs/OSTs can get up-to-date server cluster
information. Would that work?

> The key (which is manually generated) should authenticate an instance of
> an MDS, not a "cluster". The only case where this might become delicate
> is if one MDS node is the server for two file systems.

GSS/Kerberos authenticates a certain kind of service on a node; we can
tell that simply from the composition of the Kerberos principal
"service_name/hostname at REALM". As for Lustre, lustre_mds/hostname at
REALM is for the MDS, not specific to an MDT. So if two MDTs on one MDS
serve two different file systems, GSS/Kerberos authentication is performed
the same way for both; further access control should be handled by each
target (MDT/OST).

>>> ==> So the first issue you have to become clear about is how you
>>> authorize an MDS to contact one of its OSS nodes, wherever these are
>>> placed.
>> I've changed lsvcgssd on the OSSs to take an arbitrary number of '-M
>> lustre_mds/mdshost at REALM' arguments and use this list to determine
>> MDS authorization. Is there a way in which an OSS is already aware of
>> its appropriate MDSs?
>
> As you pointed out, we need that, and Eric Mei should help you get that.

Yes, that works, probably as a temporary solution. As described above,
currently the OSS doesn't know that info; we may need a more complete,
centrally controlled server membership authentication, maybe independent
of GSS/Kerberos.

>>> ==> The second issue you need to be clear about is how you authenticate
>>> client NODES (NOT users) to OSS nodes.
>> Client nodes are issued lustre_root/host credentials from their local
>> realm. This works just fine for Client -> OST since the only
>> [kerberos-related] authorization check is a "lustre_root" service part.
>
> Good. Does it work across realms, because it seems we need that in any
> case?

Yes, Ben had a patch to make it work.

> BTW, thank you for trying this all out in detail, that is very helpful.
> Perhaps Sheila could talk with you and Eric Mei and get a nice writeup
> done for the manual.

--
Eric
Eric Mei wrote:
> Peter Braam wrote:
>> On 7/8/08 2:38 PM, "Benjamin Bennett" <ben at psc.edu> wrote:
>>> Could you explain more about why this requires a single realm and not
>>> just consistent mappings across all MDSs?
>>
>> That MIGHT work ... But how would two domains guarantee consistent
>> updates to the databases? However, the server-to-server trust across
>> domains we need is new to me (and I am not sure if/how it works).
>
> Practically it's doable, of course. But as Peter pointed out, the user
> database must be the same across all MDSs within a Lustre FS. If 2 MDSs
> could share the user database, why bother putting them into different
> kerberos realms? So we assume all MDSs should be in a single realm. Does
> TeraGrid have a different requirement?

TeraGrid has a central database of users which could be used to
consistently generate mappings.

The reason to bother putting MDSs in separate realms is that TeraGrid is
composed of distinct organizations. We are trying to distribute a
filesystem across several organizations, not simply implement a
centralized fs accessed by several organizations.

> Yes, Ben is right: currently, within the same realm any MDS can
> authenticate with any MDS and OSS. But AFAICS the problem has nothing to
> do with Kerberos. It's because Lustre currently has no configuration
> information about server cluster membership; each server target has no
> idea what the other targets are.
>
> To solve this, we can either place the configuration on each MDS/OST
> node, as Ben proposed in his last mail, or, probably better, have it
> centrally managed by the MGS, so that MDTs/OSTs can get up-to-date server
> cluster information. Would that work?

Sounds like a good idea. If I understand correctly...

A) An MDT/OST is explicitly given the MGS NID by a trusted entity
   (administrator) during mkfs.

B) The MGS principal name would be derived from its NID (assuming
   lustre_mgs/mgsnode at REALM). Realm is determined from the usual
   kerberos dns -> realm mapping mechanism?

C) The MDT and OST (or just MDS, OSS) list is retrieved via a secured
   MGC -> MGS connection.

D) MDS and OSS principal names are derived from MDS and OSS NIDs. Same
   realm determination as in B?

> Yes, that works, probably as a temporary solution. As described above,
> currently the OSS doesn't know that info; we may need a more complete,
> centrally controlled server membership authentication, maybe independent
> of GSS/Kerberos.

If you're interested, the patch I have is at [1].

>>>> Client nodes are issued lustre_root/host credentials from their local
>>>> realm. This works just fine for Client -> OST since the only
>>>> [kerberos-related] authorization check is a "lustre_root" service part.
>>
>> Good. Does it work across realms, because it seems we need that in any
>> case?
>
> Yes, Ben had a patch to make it work.

The foreign lustre_root principals have to be mapped on the MDS to allow
mount. What are your thoughts on authorizing [squashed] mount for all, so
as not to require mapping?

>> BTW, thank you for trying this all out in detail, that is very helpful.
>> Perhaps Sheila could talk with you and Eric Mei and get a nice writeup
>> done for the manual.

np :-)

--ben

[1] http://staff.psc.edu/ben/patches/lustre/lustre-explicit-mds-authz.patch
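Ben's points B) and D) above assume the expected server principal can be
derived from a hostname plus the standard Kerberos DNS/domain_realm
lookup. A small sketch of that derivation using the MIT krb5 library
follows; the NID-to-hostname step is assumed to have happened already, and
the function and service names are illustrative, not existing Lustre code:

    #include <stdio.h>
    #include <krb5.h>

    /* svc is "lustre_mgs", "lustre_mds" or "lustre_oss"; the result looks
     * like "lustre_mds/mds1.site-a.org@SITE-A.ORG" */
    static int build_server_principal(const char *svc, const char *hostname,
                                      char *buf, size_t buflen)
    {
        krb5_context ctx;
        char **realms = NULL;
        int rc = -1;

        if (krb5_init_context(&ctx))
            return -1;

        /* uses the [domain_realm] section of krb5.conf and/or DNS to map
         * the host to its realm */
        if (krb5_get_host_realm(ctx, hostname, &realms) == 0 &&
            realms != NULL && realms[0] != NULL && realms[0][0] != '\0') {
            snprintf(buf, buflen, "%s/%s@%s", svc, hostname, realms[0]);
            rc = 0;
        }

        if (realms != NULL)
            krb5_free_host_realm(ctx, realms);
        krb5_free_context(ctx);
        return rc;
    }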
On Jul 09, 2008 11:25 -0600, Eric Mei wrote:
> Yes, Ben is right: currently, within the same realm any MDS can
> authenticate with any MDS and OSS. But AFAICS the problem has nothing to
> do with Kerberos. It's because Lustre currently has no configuration
> information about server cluster membership; each server target has no
> idea what the other targets are.
>
> To solve this, we can either place the configuration on each MDS/OST
> node, as Ben proposed in his last mail, or, probably better, have it
> centrally managed by the MGS, so that MDTs/OSTs can get up-to-date server
> cluster information. Would that work?

I think that MDT/OST addition to the filesystem needs to be managed
properly at the MGS, regardless of whether Kerberos is in use or not.

Please see bug 15827 for some details of the problem.

For the non-kerberos case, having administrator action at the MGS is the
most secure. Enabling a shared secret key passed to mkfs.lustre like
"--mgs-key e85021aee637f7250e482a9a5b23cb0d", sent from the MDT/OST to the
MGS at first connect time, at least provides some restriction on adding
new devices to the filesystem.

With Kerberos systems there could be principals for the OSTs stored inside
their filesystems by mkfs.lustre or tunefs.lustre that can be loaded into
the keyring at mount time. Having it inside the filesystem (instead of
e.g. /etc/{something}) ensures that it is always available to the MDT/OST
if it can mount.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
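A minimal sketch of how the shared-secret proposal might look on the MGS
side, assuming the key given to mkfs.lustre is carried in the first
connect request. The function names are invented for illustration; this is
not existing Lustre code:

    #include <stddef.h>

    #define MGS_KEY_LEN 16   /* 128-bit key, e.g. e85021aee637f7250e482a9a5b23cb0d */

    /* the key the administrator configured on the MGS (placeholder value) */
    static const unsigned char configured_key[MGS_KEY_LEN] = { 0 };

    /* constant-time comparison, so a mismatch does not leak how many
     * leading bytes were correct */
    static int mgs_key_matches(const unsigned char *presented, size_t len)
    {
        unsigned char diff = 0;
        size_t i;

        if (len != MGS_KEY_LEN)
            return 0;
        for (i = 0; i < MGS_KEY_LEN; i++)
            diff |= presented[i] ^ configured_key[i];
        return diff == 0;
    }

    /* first-connect handler: reject registration of an unknown target if
     * no key, or the wrong key, was presented */
    static int mgs_handle_first_connect(const char *target_name,
                                        const unsigned char *key,
                                        size_t keylen)
    {
        (void)target_name;
        if (!mgs_key_matches(key, keylen))
            return -1;   /* e.g. -EACCES in a real implementation */
        /* ... otherwise proceed to add the target to the configuration ... */
        return 0;
    }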
On Wed, Jul 09, 2008 at 02:29:38PM -0600, Andreas Dilger wrote:
> On Jul 09, 2008 11:25 -0600, Eric Mei wrote:
> > [...]
>
> I think that MDT/OST addition to the filesystem needs to be managed
> properly at the MGS, regardless of whether Kerberos is in use or not.
>
> Please see bug 15827 for some details of the problem.
>
> For the non-kerberos case, having administrator action at the MGS is the
> most secure. Enabling a shared secret key passed to mkfs.lustre like
> "--mgs-key e85021aee637f7250e482a9a5b23cb0d", sent from the MDT/OST to
> the MGS at first connect time, at least provides some restriction on
> adding new devices to the filesystem.

Hmm, is a secret key really necessary? To me this sounds a bit like
security by obscurity. Wouldn't it be better to have a two-step MDT/OST
registration?

1.) As it is now, simply mount the filesystem on the MDT/OST. But this
puts the filesystem into a registered, but unconfirmed, state on the MGS.

2.) Introduce an lctl command for the MGS to list all
registered-but-unconfirmed systems, and another command to confirm the
registration.

On the other hand, this approach might conflict with the writeconf
concept.

Cheers,
Bernd
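A sketch of the two-step registration Bernd proposes, using invented names
(including the hypothetical lctl confirmation command mentioned in the
comments); it only illustrates the state transitions, not any existing
Lustre code:

    #include <string.h>

    enum target_state {
        TARGET_UNKNOWN = 0,
        TARGET_UNCONFIRMED,   /* registered itself, awaiting admin approval */
        TARGET_CONFIRMED,     /* admin accepted it; may serve the filesystem */
    };

    struct target_entry {
        char              name[64];   /* e.g. "lustre-OST0003" */
        enum target_state state;
    };

    #define MAX_TARGETS 128
    static struct target_entry targets[MAX_TARGETS];
    static int n_targets;

    /* step 1: the target mounts and registers; it is recorded but not
     * trusted yet */
    static int mgs_register_target(const char *name)
    {
        if (n_targets >= MAX_TARGETS)
            return -1;
        strncpy(targets[n_targets].name, name,
                sizeof(targets[n_targets].name) - 1);
        targets[n_targets].state = TARGET_UNCONFIRMED;
        n_targets++;
        return 0;
    }

    /* step 2: driven by an administrator command on the MGS (something
     * like "lctl confirm_target <name>" in Bernd's proposal; the command
     * name here is made up) */
    static int mgs_confirm_target(const char *name)
    {
        int i;

        for (i = 0; i < n_targets; i++) {
            if (strcmp(targets[i].name, name) == 0) {
                targets[i].state = TARGET_CONFIRMED;
                return 0;
            }
        }
        return -1;   /* no such registration */
    }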
Benjamin Bennett wrote:
> Eric Mei wrote:
>> Practically it's doable, of course. But as Peter pointed out, the user
>> database must be the same across all MDSs within a Lustre FS. If 2 MDSs
>> could share the user database, why bother putting them into different
>> kerberos realms? So we assume all MDSs should be in a single realm. Does
>> TeraGrid have a different requirement?
>
> TeraGrid has a central database of users which could be used to
> consistently generate mappings.
>
> The reason to bother putting MDSs in separate realms is that TeraGrid is
> composed of distinct organizations. We are trying to distribute a
> filesystem across several organizations, not simply implement a
> centralized fs accessed by several organizations.

I see, thanks for the explanation. I think once the issue of server
membership is solved, there will be no problem doing this from the
GSS/Kerberos side.

>> To solve this, we can either place the configuration on each MDS/OST
>> node, as Ben proposed in his last mail, or, probably better, have it
>> centrally managed by the MGS, so that MDTs/OSTs can get up-to-date
>> server cluster information. Would that work?
>
> Sounds like a good idea. If I understand correctly...
>
> A) An MDT/OST is explicitly given the MGS NID by a trusted entity
>    (administrator) during mkfs.
>
> B) The MGS principal name would be derived from its NID (assuming
>    lustre_mgs/mgsnode at REALM). Realm is determined from the usual
>    kerberos dns -> realm mapping mechanism?
>
> C) The MDT and OST (or just MDS, OSS) list is retrieved via a secured
>    MGC -> MGS connection.
>
> D) MDS and OSS principal names are derived from MDS and OSS NIDs. Same
>    realm determination as in B?

Well, I guess you're talking about a secure connection for MGC->MGS. Yes,
we have a plan to add that in the near future.

As for the server membership control, I meant that the sysadmin needs to
teach the MGS which MDTs/OSTs a Lustre filesystem is comprised of. And
when an MDT/OST mounts, it can get the server list from the MGS, so it
would know to refuse unwanted connections which pretend to come from an
MDT. I also think the membership management had better work both with and
without Kerberos.

> If you're interested, the patch I have is at [1].

Thanks.

>>>> Client nodes are issued lustre_root/host credentials from their local
>>>> realm. This works just fine for Client -> OST since the only
>>>> [kerberos-related] authorization check is a "lustre_root" service part.
>>>
>>> Good. Does it work across realms, because it seems we need that in any
>>> case?
>>
>> Yes, Ben had a patch to make it work.
>
> The foreign lustre_root principals have to be mapped on the MDS to allow
> mount. What are your thoughts on authorizing [squashed] mount for all, so
> as not to require mapping?

The original assumption we made was that "remote realm" means "different
user database". That's why remote-realm users have to be remapped to local
users. It seems that in the TeraGrid case that is no longer true.

The squashed mount, if I understand it correctly, can be done by setting a
mapping entry in lustre/idmap.conf to map "*@REALM" from NID "*" to a
local user "U" - I don't remember the exact syntax though.

As for the user mapping part, I have never felt confident that the current
implementation is what people really want, and it is not fully tested;
that's why I didn't put the UID mapping information on the public wiki. I
believe you are the first one outside of the Lustre Group to try it :)
Any opinions are very welcome, but decisions to change it need to be made
by Peter Braam.

--
Eric
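For the squashed-mount question, the MDS-side mapping logic might look
roughly like the sketch below: an explicit mapping is consulted first, and
an unmapped foreign principal is either squashed to an unprivileged local
uid or denied. The table layout, the wildcard form and the uid values are
all invented for illustration and are not the actual lustre/idmap.conf
syntax:

    #include <stdint.h>
    #include <string.h>

    struct idmap_entry {
        const char *principal;   /* e.g. "ben@PSC.EDU" or "*@SITE-A.ORG" */
        uint32_t    local_uid;
    };

    static const struct idmap_entry idmap[] = {
        { "ben@PSC.EDU",  1001  },
        { "*@SITE-A.ORG", 60001 },   /* whole-realm mapping */
    };

    #define SQUASH_UID  65534        /* "nobody": allow mount, squashed */
    #define SQUASH_NONE UINT32_MAX   /* squashing disabled: unmapped = denied */

    static uint32_t map_remote_principal(const char *principal,
                                         uint32_t squash_uid)
    {
        const char *realm = strchr(principal, '@');
        size_t i;

        for (i = 0; i < sizeof(idmap) / sizeof(idmap[0]); i++) {
            const char *p = idmap[i].principal;

            if (strcmp(p, principal) == 0)
                return idmap[i].local_uid;
            /* wildcard entry "*@REALM" matches any user in that realm */
            if (p[0] == '*' && realm != NULL && strcmp(p + 1, realm) == 0)
                return idmap[i].local_uid;
        }
        /* caller passes SQUASH_UID to allow squashed mounts for everyone,
         * or SQUASH_NONE to keep requiring an explicit mapping */
        return squash_uid;
    }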
On Jul 09, 2008 23:10 +0200, Bernd Schubert wrote:
> On Wed, Jul 09, 2008 at 02:29:38PM -0600, Andreas Dilger wrote:
> > Please see bug 15827 for some details of the problem.
> >
> > For the non-kerberos case, having administrator action at the MGS is
> > the most secure. Enabling a shared secret key passed to mkfs.lustre
> > like "--mgs-key e85021aee637f7250e482a9a5b23cb0d", sent from the
> > MDT/OST to the MGS at first connect time, at least provides some
> > restriction on adding new devices to the filesystem.
>
> Hmm, is a secret key really necessary? To me this sounds a bit like
> security by obscurity. Wouldn't it be better to have a two-step MDT/OST
> registration?
>
> 1.) As it is now, simply mount the filesystem on the MDT/OST. But this
> puts the filesystem into a registered, but unconfirmed, state on the MGS.
>
> 2.) Introduce an lctl command for the MGS to list all
> registered-but-unconfirmed systems, and another command to confirm the
> registration.

Yes, this is definitely the minimum requirement, and it should be the
default behaviour. For the case when completely automated configuration is
needed (e.g. during automated regression testing), having the --mgs-key
mount option would still prevent "random" OSTs from joining the filesystem
(as in bug 15827).

> On the other hand, this approach might conflict with the writeconf
> concept.

No, I think 2-step authentication is the most secure, but there needs to
be some way to circumvent it, though only if the MGS allows it, of course.
If the MGS is compromised then all bets are off...

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.