I have multiple servers running stock CentOS 7 rsyslog 7.4.7-16.el7, which are configured to log locally and over TCP to a remote logserver, also running stock CentOS 7 rsyslog. The remote server uses imptcp to receive, and pretty basic rules to parse and commit to disk.

I have several systems that log prolifically, but periodically, they stop soon after the remote log server HUPs (daily logrotate). Very soon after they stop logging (completely, even to local files), the services on these systems block, and our monitoring system starts alerting. Restarting rsyslog on the clients proves ineffectual.

The situation may clear itself without intervention after 90 minutes to several hours.

However, this does not happen on all client systems in a similar situation (CentOS 7, large volume of constant log data); nor does it happen daily.

Any ideas as to what's going on?

Thanks in advance.
On 09/07/17 18:37, John Jasen wrote:
> I have several systems that log prolifically, but periodically, they
> stop soon after the remote log server HUPs (daily logrotate).
> [...]
> Any ideas as to what's going on?

Sorry for the late answer, but can you give more details? I remember having seen that kind of issue only when sending logs other than the default ones (for example, when using the imfile plugin to track other files such as httpd logs).

What are your rules? How is the network between all those nodes? I also had an issue over an "unreliable" network with buffers/queues, and another when the receiver's main message queue size was too small.

Some parameters that can help (?):

# sender side
$WorkDirectory /var/lib/rsyslog # default location for work (spool) files
$ActionQueueType LinkedList # use asynchronous processing
$ActionQueueFileName forwardqueue # set file name, also enables disk mode
$ActionResumeRetryCount -1 # infinite retries on insert failure
$ActionQueueSaveOnShutdown on # save in-memory data if rsyslog shuts down

# receiver side
$MainMsgQueueSize 100000

--
Fabian Arrotin
The CentOS Project | http://www.centos.org
gpg key: 56BEC54E | twitter: @arrfab
The long and the short of the story was that another misconfigured client on the network was swamping the central logserver right after logrotate kicked off. The best fix was to enable client memory/file queues.

On 07/13/2017 04:40 AM, Fabian Arrotin wrote:
> Some parameters that can help (?)
> [...]
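
For anyone landing on this thread later, here is a minimal sketch of what "client memory/file queues" could look like on a sender, built from the directives suggested earlier in the thread. The hostname logserver.example.com, the port, and the 1g disk cap are placeholders and assumptions, not values from the original setup. In the legacy syntax, the $ActionQueue* directives apply to the next action that follows them, so they need to sit directly above the forwarding rule.

```
# /etc/rsyslog.conf on a client (sender side) -- illustrative sketch only

# Directory for queue spool files; it must exist and be writable by rsyslog.
$WorkDirectory /var/lib/rsyslog

# Give the forwarding action its own in-memory queue so that it runs
# asynchronously from the main message queue.
$ActionQueueType LinkedList

# Setting a file name makes the queue disk-assisted: it spills to
# $WorkDirectory when the in-memory part fills up.
$ActionQueueFileName forwardqueue

# Assumed cap on spool size; pick a value that fits the local disk.
$ActionQueueMaxDiskSpace 1g

# Retry forever instead of discarding messages when the remote is unavailable.
$ActionResumeRetryCount -1

# Write queued messages to disk if rsyslog is shut down.
$ActionQueueSaveOnShutdown on

# The forwarding action the queue settings above apply to (@@ means TCP).
# Hostname and port are placeholders.
*.* @@logserver.example.com:514
```

Without a dedicated action queue, the forwarding action is processed directly from the main queue, so a stalled TCP connection to the logserver can plausibly back up the main queue and hold up local file logging as well, which would match the symptoms described at the top of the thread.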