Hi Jeremy, how are you doing?
We are a petroleum engineering research lab. We work specifically with
reservoir simulation, and we use this HPC cluster to run that work.
In some case studies, a user needs to simulate 100, 200 or 500 cases at
once. These simulations run in parallel and perform intensive read and
write operations.
Samba is installed on the head node of this HPC cluster, but the storage
is on a separate dedicated server; the two are connected by an NFS mount
over an InfiniBand network (100 Gbps). The clients (users' desktops) are
connected to the head node over a gigabit network.
There are not many symptoms. After these simulations have been running for
a certain time, Windows returns the error "An unexpected network error
occurred" without giving details. The Windows event logs do not offer much
to investigate either. The error appears to be a momentary network
outage.
Some Google searches suggest that it may be a Windows problem related to
UAC, but I haven't investigated that in depth yet.
I thought that Samba might have a specific configuration for this kind of
intensive read and write workload, so I decided to write to you and ask
for some advice.
Thanks and best regards
On Mon, May 24, 2021 at 1:07 PM Jeremy Allison <jra at samba.org> wrote:
> On Mon, May 24, 2021 at 09:05:05AM -0300, Daniel Lopes de Carvalho via
> samba wrote:
> >Hey guys,
> >
> >I would like to ask for help with a Samba share in an HPC environment.
> >
> >We have a pretty new HPC cluster that is running a proprietary software
> >that opens multiple Samba network connections. And these connections
> >perform intense reading and writing files.
> >
> >At a given moment, I don't know if Samba or Windows has a network
> >unavailability error.
> >
> >I would like to know if there is any configuration or tunning that can be
> >done in Samba to support these connections and these reading and writing
> >operations.
> >
> >In the Samba log, there are no error messages.
> >
> >Regarding the hardware, they are new machines, with less than 1 years of
> >manufacture and the storage of this HPC is a dedicated server. Specs:
> >
> >HPE ProLiant DL380 Gen10
> >2x Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz (total 64 cores)
> >192GB RAM
> >Centos 8.1
> >ulimit = unlimited
> >file system = XFS
>
> Can you explain exactly what you mean by "as a network
> unavailability error" ? What are the symptoms you are
> seeing ? What are you trying to do ?
>
--
Daniel Lopes de Carvalho
daniel at cepetro.unicamp.br
unisim.cepetro.unicamp.br <https://www.unisim.cepetro.unicamp.br/>
+55 19 3521-1221