thr3ads.net - samba - Samba, nmbd, HA [Jan 1999]

Nicolas Williams

1999-Jan-27 17:41 UTC

Samba, nmbd, HA

Has anyone ever set up a pair of fileservers running Samba such that each
server can take over the other server's functionality and identity?

It seems to me that a Samba HA setup would be very similar to a virtual
server Samba setup. In other words, use the 'include' directive in
conjunction with the %L variable to make each smbd process act as the
server named by the client.

So far so good.

But in an HA setup, when one server fails and the other one takes over
its IP addresses and services, the NetBIOS identity of the failed
server must also be passed to the remaining server.

How should one accomplish this? Can it be as simple as restarting nmbd
with a different config file naming the failed server's NetBIOS name and
aliases as aliases of the server that takes over?

Or should one use the 'bind interfaces only' and 'interfaces'
configuration parameters in two configuration files and run two nmbd
processes on the server that takes over?

Looking at the source I see that SO_REUSEADDR is set before binding all
listening TCP sockets and, most importantly, all UDP sockets used by
nmbd. This should allow two nmbd processes to run in daemon mode if they
have 'bind interfaces only' set to 'true' and disjoint interface address
sets specified in their 'interfaces' parameter. [When using 'bind
interfaces only' nmbd will open a socket for each interface and a socket
bound to INADDR_ANY for receiving broadcast UDP packets, thus my
interest in SO_REUSEADDR. I.e., will the two nmbd processes co-exist?]

Thus, the takeover process for SMB services would consist simply of
starting an nmbd process with a smb.conf file specifying 'bind
interfaces only = true' and an 'interfaces' list that contains only
the IP addresses of the failover host.
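
A minimal sketch of generating such a per-host nmbd configuration file.
Everything except the four smb.conf parameter names is hypothetical: the
function name write_nmbd_conf, the NetBIOS name FS2, the alias FILES2, the
address 192.0.2.20/24 and the output path are illustrative only.

```shell
#!/bin/sh
# write_nmbd_conf FILE NAME ALIASES INTERFACES
# Writes a [global] section that ties one NetBIOS identity to one set of
# addresses. All argument values are supplied by the caller.
write_nmbd_conf() {
    conf_file="$1"; nb_name="$2"; nb_aliases="$3"; ifaces="$4"
    cat > "$conf_file" <<EOF
[global]
    netbios name = $nb_name
    netbios aliases = $nb_aliases
    interfaces = $ifaces
    bind interfaces only = true
EOF
}

# Example call (hypothetical values):
# write_nmbd_conf /tmp/smb.conf.fs2 FS2 FILES2 192.0.2.20/24
```

One such file per host keeps the name-to-address binding of each identity
intact, whichever machine happens to be running the nmbd that reads it.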

As for the smbd processes, the use of an 'include' directive that uses
'%L' substitutions in the path of the included file should suffice.

Yes? Or should the takeover server simply change nmbd's configuration
and [not strictly necessary] restart/HUP nmbd so it takes the failed
server's NetBIOS name and aliases and adds them as aliases to the
current server?

I'm leaning to having two nmbd processes. Has anyone done this before?
What platforms does it work with?

I'm using Samba 1.9.18pl10 on Solaris 2.6 systems.

Nico

PS: Which is the most appropriate list for these questions?
PPS: I did search the archives and the standard documentation, a bit.

Nicolas Williams

1999-Jan-29 23:27 UTC

Samba, nmbd, HA

The point is this:

 - When one server takes over the other, it takes over the failed host's
   disks, IP addresses and other necessary resources, then starts or
   updates various services so as to present, to the clients, the
   illusion that the failed host is still there.

 - When the failed host is brought back online one would like to have it
   take over its resources and services.

 - There's nothing to do as far as taking over a host's hostname/aliases
   when DNS is the naming service. You just take over the IP addresses.

 - But if the naming service is WINS/NetBIOS, then you have a problem,
   because each host must register with WINS and/or answer broadcast
   NetBIOS requests.

   Thus, when taking over a failed host one needs to take over its
   NetBIOS/WINS hostname and aliases but without just re-mapping them to
   the takeover host's IP addresses. The failed host's names must still
   resolve to the failed host's IP addresses so that you can later
   restore the services to that host once it's revived.

   This is very important because NT clients cache their hostname
   lookups (like any decent platform). So when an NT client has to
   reconnect to its file servers it reconnects to the same IP it had
   been connected to, unless (so I understand) its cached hostname
   lookups have expired.

At any rate. What I was thinking of *does* work. I've just tested it.

What I suggested earlier, to recap, is:

 - Keep two separate configuration files for nmbd: one for each host in
   an HA pair. Copies of both files will be available locally on both
   hosts.

 - Each of those two files is associated with the identity of one of the
   two HA hosts.

   Thus each file lists different netbios names and aliases.

 - To make this work you must specify:

   bind interfaces only = true

   And you must list the IP addresses of the host to which the given
   config file belongs with the 'interfaces' parameter.

 - Under normal circumstances each HA host runs just ONE nmbd with the
   configuration file that corresponds to that host.

 - When one host takes over the other, it starts a second nmbd with the
   failed host's configuration file.

 - When the failed host is restored, the takeover host kills that second
   nmbd process as part of the process of relinquishing the failed
   host's resources; then the failed host, now restored, takes over its
   own resources and starts its services, as normal. And thus all goes
   back to normal.
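
The takeover and release steps above could be scripted roughly as follows.
This is a sketch under stated assumptions, not the script actually used:
the function names, the default paths and the smb.conf.<peer> naming are
hypothetical. Only the nmbd options -D (run as a daemon) and -s (name the
configuration file) are real.

```shell
#!/bin/sh
# Hypothetical defaults; every path here is an assumption about the install.
NMBD="${NMBD:-/usr/local/samba/bin/nmbd}"
CONF_DIR="${CONF_DIR:-/usr/local/samba/lib}"

# Takeover: start a second nmbd using the failed peer's configuration file.
start_peer_nmbd() {
    peer="$1"
    "$NMBD" -D -s "$CONF_DIR/smb.conf.$peer"
}

# Release: stop the second nmbd whose PID was recorded in PIDFILE.
# How the PID gets recorded is site-specific and left out of this sketch.
stop_peer_nmbd() {
    pidfile="$1"
    [ -r "$pidfile" ] || return 1
    kill "$(cat "$pidfile")" && rm -f "$pidfile"
}
```

Where nmbd writes its own PID file depends on the build and configuration,
so the sketch takes the PID file path as an argument rather than guessing it.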

It works for me.

I'll stick with this unless there's a better way. You may want to include
a note about all this in the standard Samba docs...

Thanks to all that replied,

Nico


On Fri, Jan 29, 1999 at 12:12:34AM +1100, David Collier-Brown wrote:
> Nicolas Williams wrote:
> > But in an HA setup, when one server fails and the other one takes over
> > its IP addresses and services, the NetBIOS identity of the failed
> > server must also be passed to the remaining server.
> 
> 	My limited understanding is that the "new" server
> 	needs to broadcast its identity at startup time
> 	in order to "claim" its name (which samba does).
> 
> 	If the name is a duplicate, the "name server" 
> 	queries the previous holder of the name to see if it's
> 	still alive. If not, the new server gets the name.
> 	If so, it gets told to go away.
> 
> 	In the above "name server" is a deliberately vague term.
> 	It's a WINS server, a browse master or an elephant playing
> 	the trombone: I don't know which (:-))
> 
> --dave
> -- 
> David Collier-Brown,  | Always do right. This will gratify some people
> 185 Ellerslie Ave.,   | and astonish the rest.        -- Mark Twain
> Willowdale, Ontario   | http://java.science.yorku.ca/~davecb
> Work: (905) 477-0437 Home: (416) 223-8968 Email: davecb@canada.sun.com

Dan Roscigno

1999-Jan-30 04:14 UTC

Group permissions (umask)

> We need to modify access rights on the Unix server so that
> when file(s) are created in a directory the file(s) get
> the group (rw-)  access privileges. When creating files in
> Unix the files do receive the group (rw-) access rights. 
> However, files created on a PC with WINDOWS 95 within the
> same directory will not receive the group (rw-) access
> rights. 
> We have added the following to the smb.conf file:
>        [newsletter]
>                comment = newsletter
>                path = /some/newsletter
>                valid users = @staff @editors
>                read only = yes
>                write list = @editors

To do this I would add:

		force group = newsletter 
		create mask = 0760

The group would be whatever group owns the files.
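
The effect of the mask can be checked with shell arithmetic. Samba ANDs the
mode it derives from a file's DOS attributes with 'create mask'. The sketch
below assumes that derived mode is 0666 (read/write for user, group and
other) for an ordinary, non-read-only file; that starting value is an
assumption about the DOS-to-Unix mapping, and apply_create_mask is a
hypothetical helper name.

```shell
#!/bin/sh
# apply_create_mask MODE MASK
# Prints MODE bitwise-ANDed with MASK as a four-digit octal mode.
apply_create_mask() {
    printf '%04o\n' $(( $1 & $2 ))
}

apply_create_mask 0666 0760   # mask suggested above: prints 0660
apply_create_mask 0666 0744   # a mask without group write: prints 0644
```

With 0760 the group keeps rw-, while a mask lacking the group write bit
strips it no matter what the client asked for.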

Dan
---------------------------------------------------------
Dan Roscigno ddr@phys.ufl.edu (352)392-4028
Physics Dept. University of Florida 2122 New Physics Building

Robert Dahlem

1999-Feb-21 10:35 UTC

Samba, nmbd, HA

Nicolas,

On Sat, 30 Jan 1999 10:28:03 +1100, Nicolas Williams wrote:
> - When one server takes over the other, it takes over the failed host's
> disks, IP addresses and other necessary resources, then starts or
> updates various services so as to present, to the clients, the
> illusion that the failed host is still there.
That's exactly what we do in our HA cluster.

Both machines can each be in one of three states: O, B and OB, meaning
"Original", "Backup" and "Original+Backup". I'm not so happy with the terms
Original and Backup because they imply that one machine only does some kind of
warm standby, which is not the case. Instead, O has a set of functions, B has a
set of functions, and in the backup case the surviving machine has to provide
both sets of functionality.

We did not bother with all this nmbd and interface stuff.

Instead we did as follows:

file smb.conf.machine1:
...
all the shares on machine1
...
file smb.conf.machine2:
...
all the shares on machine2
...

file smb.conf.O:
include = /path/smb.conf.machine1
file smb.conf.B:
include = /path/smb.conf.machine2
file smb.conf.OB:
include = /path/smb.conf.machine1
include = /path/smb.conf.machine2

And now for the trick:

file smb.conf:
...
include = /path/smb.conf.HA.state
...

In our HA state changing procedure we do something like

echo "include = /path/smb.conf."`get_HA_status` >/path/smb.conf.HA.state
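
A slightly more defensive variant of that one-liner, as a sketch only: the
function name set_ha_state and the temporary-file naming are hypothetical,
and "/path" is the same placeholder used above. It validates the state and
replaces the file with a rename, so an smbd that starts during the switch
never reads a half-written include line.

```shell
#!/bin/sh
# set_ha_state STATE_FILE STATE
# Rewrites the one-line include file. The temporary file is created next to
# the target so that mv is a rename within the same filesystem.
set_ha_state() {
    state_file="$1"; state="$2"
    case "$state" in
        O|B|OB) ;;
        *) echo "unknown HA state: $state" >&2; return 1 ;;
    esac
    tmp="$state_file.tmp.$$"
    echo "include = /path/smb.conf.$state" > "$tmp" && mv "$tmp" "$state_file"
}

# Example call (get_HA_status is the site's own helper, as above):
# set_ha_state /path/smb.conf.HA.state "`get_HA_status`"
```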

When one machine fails and the other one takes over (state transition from O or
B to OB) the file smb.conf.HA.state gets modified.

As O or B went down, all connections between any clients and the failed machine
break (naturally :-). The clients will have to reconnect (which they do silently
in normal cases). OB comes up with the interface of the now dead machine and
does an "ARP reply broadcast" from its new interface so everyone in the segment
learns the new MAC address.

From the view of OB, clients (re)connecting to the failed machine produce new
connections. In this case smbd will re-read smb.conf, stumble over
smb.conf.HA.state and will have the shares of the failed machine available.

Now how to get back from OB to O and B on different machines?

We decided not to give back functionality within production times. The failed
machine gets repaired and waits for return of functionality. We wait until users
leave office and switch back the state.

There are issues like "who is local browse master". This one is handled the
easy way: just let them fight for it, running one with os level = 65 and the
other one with os level = 64. Other issues are handled with the
smb.conf.O|B|OB scheme.

Regards,
Robert

--
---------------------------------------------------------------
Robert.Dahlem@gmx.net
Radio Bornheim - 2:2461/332@fidonet +49-69-4930830 (ZyX, V34)
2:2461/326@fidonet +49-69-94414444 (ISDN X.75)
---------------------------------------------------------------
