Johannes Amorosa | Celluloid VFX
2016-Apr-05  07:28 UTC
[Samba] Debugging Samba4 - application sometimes fails because files are invisible/gone
Hello Samba list,
our proprietary application sometimes can't find files on our Samba share.
I'm hoping for some help from this list.
Our setup is two replicated AD domain controllers (Ubuntu 12.04.5 LTS, Version 4.1.17-SerNet-Ubuntu-10.precise),
several domain members acting as file servers, and mixed clients (~40 x Win7, Ubuntu and OS X). The DCs use internal DNS.
We run proprietary software as a cluster, and it needs a common shared network volume.
This volume is on a domain member (Ubuntu 12.04.5 LTS, Version 4.1.17-SerNet-Ubuntu-10.precise) backed by a ZFS RAID (ZFS 0.6.3).
Authentication is done via PAM and works fine. All tests described in [1] succeed,
and we have been using this setup in production for over a year.
Problem: Sometimes (1-2 times a month) our application fails with an error message like:
\\cell-dead-01\deadlinerepo\jobs\56fe4a61b9baa917e4169c31\DraftCreateMovie.py (System.IO.FileNotFoundException)
Yet the file exists and has the same ACL as everything else:
/silo/deadlinerepo/jobs/56fe4a61b9baa917e4169c31/DraftCreateMovie.py
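Because the failure only shows up once or twice a month, a small probe running on one of the cluster nodes may help pin down exactly when a file is reported as missing, so that the timestamps can be compared with the server-side logs. The following is only a sketch and not part of the setup described here: `check` and `probe` are hypothetical helper names, and it assumes a Python interpreter is available on the client and that the client reaches the share under the same UNC path as the application.

```python
import os
import time
from datetime import datetime, timezone


def check(path):
    """Return None if `path` can be stat'ed, otherwise a short error description."""
    try:
        os.stat(path)
        return None
    except OSError as exc:
        return exc.strerror or repr(exc)


def probe(path, attempts, interval=1.0, check=check, sleep=time.sleep):
    """Stat `path` `attempts` times; return a (UTC timestamp, error) pair per miss."""
    misses = []
    for n in range(attempts):
        error = check(path)
        if error is not None:
            misses.append((datetime.now(timezone.utc).isoformat(), error))
        if n < attempts - 1:
            sleep(interval)
    return misses


# Hypothetical usage on a Windows node, with the path from the error message:
#   target = r"\\cell-dead-01\deadlinerepo\jobs\56fe4a61b9baa917e4169c31\DraftCreateMovie.py"
#   for stamp, error in probe(target, attempts=3600, interval=1.0):
#       print(stamp, error)
```

The `check` and `sleep` parameters are injectable only so that the loop can be exercised without a real share. A stat call is not the same operation the application performs, so a clean run of this probe does not rule out the problem.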
We know that ZFS is maybe not production ready and should be upgraded to at least 0.6.5.6.
We should also upgrade Samba to at least 4.2.x; this will hopefully be done in May.
It's also possible that we hit a bug in the application itself.
Meanwhile I'm trying to make sense of the Samba log files and am basically failing because of how spammy they are.
I configured vfs_full_audit to get to the bottom of these issues and see who is responsible.
I'm seeing a lot of errors and want to know what to make of them. In one day audit.log grew to 35 MB.
Here are some snippets:
deadlinerepo|is_offline|fail (Operation not supported)|scripts/Submission/HServerSubmission.py
deadlinerepo|translate_name|fail (Operation not supported)
deadlinerepo|sys_acl_get_file|fail (Operation not supported)|scripts/Submission
deadlinerepo|open|ok|r|custom/scripts/Submission
deadlinerepo|realpath|fail (No such file or directory)|custom/events/Draft
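One way to cut through the volume is to tally the audit lines by operation, result and error text, and then look only at the combinations that are not routine. The sketch below assumes the pipe-separated layout seen in the snippets (prefix fields, then the operation, then `ok` or `fail (reason)`, then any arguments); the function names are hypothetical. Treating "Operation not supported" as ignorable noise is an assumption that should be verified for this setup.

```python
import re
from collections import Counter

# Matches the result field: "ok" or "fail (some reason)".
RESULT_RE = re.compile(r"^(ok|fail)(?: \((.*)\))?$")


def parse_audit_line(line):
    """Return (operation, result, reason, args), or None if the line does not parse."""
    fields = line.strip().split("|")
    for i, field in enumerate(fields):
        match = RESULT_RE.match(field)
        if match and i >= 1:
            # The operation is the field directly before the result field.
            return fields[i - 1], match.group(1), match.group(2), fields[i + 1:]
    return None


def summarize(lines):
    """Count audit lines per (operation, result, reason)."""
    counts = Counter()
    for line in lines:
        parsed = parse_audit_line(line)
        if parsed is not None:
            operation, result, reason, _args = parsed
            counts[(operation, result, reason)] += 1
    return counts


def drop_reason(counts, reason):
    """Return a copy of `counts` without entries that failed for `reason`."""
    return Counter({key: n for key, n in counts.items() if key[2] != reason})


# Hypothetical usage against the log file:
#   with open("audit.log", encoding="utf-8", errors="replace") as handle:
#       counts = drop_reason(summarize(handle), "Operation not supported")
#   for key, n in counts.most_common(20):
#       print(n, key)
```

Locating the result field instead of counting columns keeps the parser independent of how many fields `full_audit:prefix` produces.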
Interestingly enough, the app runs perfectly most of the time, but when this happens it ruins a day of computation.
Deadlines are always super tight, which means overtime for some of us.
Can someone shed some light on this? Thank you for your time.
Joe
Domain DC config:
[global]
     ...
     server role = active directory domain controller
     ...
Domain member config:
[global]
    ...
    security = ADS
    realm = MOO.NET
    encrypt passwords = yes
    ...
    full_audit:prefix = %u|%I|%S
    full_audit:success = open opendir
    full_audit:failure = all !open
    full_audit:facility = local5
    full_audit:priority = notice
    ...
[deadlinerepo]
         ...
         read only       = no
         path            = /silo/deadlinerepo
         comment         = Deadline Repository
         veto files      = /._*/.DS_Store/.Trash*/.TemporaryItems/desktop.ini/.apdisk/
         guest ok        = yes
         force user      = moo
         browseable      = yes
         vfs objects = full_audit
         ...
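Since `veto files` makes matching names neither visible nor accessible to clients, it may be worth ruling out that any component of a "missing" path matches one of the patterns. The sketch below only approximates Samba's matching with Python's `fnmatch`, compared case-insensitively; that is an assumption, and Samba's own matcher differs in details (for example, `fnmatch` also interprets `[...]`, which does not occur in these patterns). `parse_veto` and `vetoed_component` are hypothetical names.

```python
from fnmatch import fnmatchcase


def parse_veto(value):
    """Split an smb.conf `veto files` value ("/a/b*/") into its patterns."""
    return [pattern for pattern in value.split("/") if pattern]


def vetoed_component(path, patterns):
    """Return (component, pattern) for the first path component that matches, else None."""
    for part in path.replace("\\", "/").split("/"):
        if not part:
            continue
        for pattern in patterns:
            # Lower-case both sides to approximate case-insensitive matching.
            if fnmatchcase(part.lower(), pattern.lower()):
                return part, pattern
    return None


VETO = "/._*/.DS_Store/.Trash*/.TemporaryItems/desktop.ini/.apdisk/"
patterns = parse_veto(VETO)
print(vetoed_component("jobs/56fe4a61b9baa917e4169c31/DraftCreateMovie.py", patterns))
```

For the path from the error message this reports no match, which suggests the veto list is not the cause, subject to the approximation noted above.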
[1] https://wiki.samba.org/index.php/Setup_a_Samba_Active_Directory_Domain_Controller
-- 
Johannes Amorosa | Celluloid VFX
Celluloid Visual Effects GmbH & Co. KG
Paul-Lincke-Ufer 39/40, 10999 Berlin
Jeremy Allison
2016-Apr-08  00:01 UTC
[Samba] Debugging Samba4 - application sometimes fails because files are invisible/gone
On Tue, Apr 05, 2016 at 09:28:12AM +0200, Johannes Amorosa | Celluloid VFX wrote:
> Can someone shed some light on this? Thank you for your time.
Sorry, but there's not enough info for us to determine what might be the problem.
Getting it repeatable will be the first step.
Johannes Amorosa | Celluloid VFX
2016-Apr-08  09:17 UTC
[Samba] Debugging Samba4 - application sometimes fails because files are invisible/gone
On 04/08/2016 02:01 AM, Jeremy Allison wrote:
> Sorry, but there's not enough info for us to
> determine what might be the problem. Getting
> it repeatable will be the first step.
Thank you, Jeremy, for answering my post. I have upgraded all our DCs and file servers to 4.2 in the hope of not hitting that bug again.
The ZFS upgrade requires a reboot; we have a window next week.
Unfortunately, after the upgrade my audit log stays empty.
-- 
Johannes Amorosa | Celluloid VFX
Celluloid Visual Effects GmbH & Co. KG
Paul-Lincke-Ufer 39/40, 10999 Berlin
phone +49 (0)30 / 54 735 220
fax +49 (0)30 / 54 735 221