About a week and a half ago I upgraded from 4.0.12 to 4.9.6. Overall, things are functioning.

However, I have come across several strange behaviors and wondered if anyone else has noticed similar behavior on 4.9.6 or has any suggestions of what might be occurring.

As background information, I have 3 DCs (dc3, dc4 and dc5), all running the same version (4.9.6) and all with the same configuration; dc3 was the original holder of all 7 FSMO roles, but as of last night, they were all transferred to dc4.

First off, the DCs hold steady at different levels of memory utilization. dc3 hovers at about 1.5 GB used, dc4 at about 0.75 GB used and dc5 at a little less than 0.5 GB used. I think the difference in memory used might be related to the number of samba/smbd processes running; dc3 has about 250 samba/smbd processes running, dc4 has about 100 and dc5 has about 30. But why are so many more clients connecting to dc3?

Secondly, dc5 has been having quirky issues ever since the upgrade. I run various health checks on the DCs nightly and it seems that every other day "samba-tool drs kcc dc5" from one of the other two DCs fails with "ERROR(runtime): DsExecuteKCC failed - (3221356597, 'The operation cannot be performed.')". dc5 also has issues creating an online backup and intermittently errors out with: "ERROR(<type 'exceptions.IndexError'>): uncaught exception - list index out of range". I did see a note about this in the troubleshooting section of the samba backup wiki page; however, the error comes and goes, so I don't know if this means it is something else.

Lastly (and most disturbingly), I moved the FSMO roles from dc3 to dc4 last night (to see if the load on dc3 was related to owning those roles) and had huge instability this morning. All DCs looked OK last night (I did restart samba on dc3 when the system was experiencing low memory), but I came in this morning and found that dc4 and dc5 had such high loads that clients communicating with those DCs were unable to log in. Our monitoring system has shown huge CPU loads since this morning, and unstable memory usage (jumping up and down) since just after the FSMO role transfer last night. Are there known issues with transferring FSMO roles that might explain the instability? Is it best practice to restart samba after doing an FSMO transfer just in case?

I know this is a wide range of issues, but I appreciate any input on any of them.

Mike Ray
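Since the KCC failure comes and goes, it may help to record every nightly result with a timestamp so the "every other day" pattern can be confirmed or ruled out. The following is only a minimal sketch, not part of the original post: the function name `check_kcc` and the log-line format are made up for illustration, the command is the same "samba-tool drs kcc <dc>" quoted above with any credential options left out, and it assumes samba-tool exits non-zero when it prints an ERROR line.

```python
import subprocess
from datetime import datetime, timezone


def check_kcc(dc, run=subprocess.run):
    """Run `samba-tool drs kcc <dc>` once and return a timestamped result line.

    `run` is injectable so the function can be exercised without samba-tool
    installed; by default it is subprocess.run.
    """
    proc = run(["samba-tool", "drs", "kcc", dc], capture_output=True, text=True)
    # Assumption: a non-zero exit status means the KCC call failed.
    status = "OK" if proc.returncode == 0 else "FAIL"
    # Keep only the last line of output, which is where the ERROR(...) text
    # quoted in the post would appear.
    lines = (proc.stderr or proc.stdout or "").strip().splitlines()
    last = lines[-1] if lines else ""
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    return f"{stamp} {dc} {status} {last}".rstrip()
```

Calling `check_kcc("dc5")` from each of the other two DCs and appending the returned line to a file would give one record per night per source DC, which makes it easier to see whether the failures line up with a particular DC or time.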
On Tue, 23 Apr 2019 11:11:21 -0500 (CDT) Mike Ray via samba <samba at lists.samba.org> wrote:

> About a week and a half ago I upgraded from 4.0.12 to 4.9.6. Overall,
> things are functioning.
> [...]
> Are there known issues with transferring FSMO roles that might explain
> the instability? Is it best practice to restart samba after doing an
> FSMO transfer just in case?

I wonder if you are hitting this bug:

https://bugzilla.samba.org/show_bug.cgi?id=13760

I know it is supposed to be fixed, but I wonder.

Is there any way you can downgrade again, then walk your way up the versions? Upgrade to 4.7.x, then 4.8.x, then 4.9.x. A new type of indexing was introduced at 4.8.0 and this led to problems if you upgraded from 4.7.x (or earlier) directly to 4.9.x.

Can I also suggest you upgrade a bit more often in future ;-) Samba gets lots of nice things added very often.

Rowland
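The stepwise path suggested above can be written down as a small sketch. This is illustrative only: the function names are hypothetical, the list of intermediate series is taken directly from the reply (4.7.x, 4.8.x, 4.9.x), and treating series later than 4.9 the same way as 4.9 is an assumption made here, not something the reply states.

```python
# Intermediate series named in the reply above.
STEPS = [(4, 7), (4, 8), (4, 9)]


def series(version):
    """Turn a version string such as '4.9.6' into its (major, minor) series."""
    major, minor = version.strip().split(".")[:2]
    return int(major), int(minor)


def skipped_index_change(old, new):
    """True if the upgrade went from 4.7.x or earlier straight to 4.9.x.

    Assumption: later targets (4.10 and up) are treated like 4.9.x.
    """
    return series(old) <= (4, 7) and series(new) >= (4, 9)


def stepwise_path(old, new):
    """List the series to pass through, in order, going from old to new."""
    o, n = series(old), series(new)
    return [f"{a}.{b}.x" for (a, b) in STEPS if o < (a, b) <= n]
```

For the upgrade described in this thread, `skipped_index_change("4.0.12", "4.9.6")` is true and `stepwise_path("4.0.12", "4.9.6")` lists 4.7.x, 4.8.x and 4.9.x, matching the order given in the reply.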
----- On Apr 23, 2019, at 11:34 AM, samba samba at lists.samba.org wrote:

> I wonder if you are hitting this bug:
>
> https://bugzilla.samba.org/show_bug.cgi?id=13760
>
> I know it is supposed to be fixed, but I wonder.

It looks like Andrew suggests a number of commands that can be run to fix it. Do you think there is any chance that simply running them now would work? Or is there any danger in doing so?

> Is there any way you can downgrade again, then walk your way up the
> versions?

Is there a built-in way to downgrade, or official documentation on that process?

> Upgrade to 4.7.x, then 4.8.x, then 4.9.x.
> A new type of indexing was introduced at 4.8.0 and this led to
> problems if you upgraded from 4.7.x (or earlier) directly to 4.9.x.
>
> Can I also suggest you upgrade a bit more often in future ;-)
> Samba gets lots of nice things added very often.

Yes! That is the plan. :)

> Rowland