thr3ads.net - Puppet users - [Puppet Users] puppet-dashboard 2.0.0 (open source) and postgresql 8.4 tuning [Mar 2014]


Pete Hartman

2014-Mar-17 20:29 UTC

[Puppet Users] puppet-dashboard 2.0.0 (open source) and postgresql 8.4 tuning

I deployed the open source puppet-dashboard 2.0.0 this past weekend for our
production environment. I did a fair amount of testing in the lab to
make sure I had the deployment down, and I deployed it as a Passenger service,
knowing that we have a large environment and that WEBrick wasn't likely to
cut it. Overall, it appears to be working and behaving reasonably: I get
the summary run status graph and the rest of the UI. Load average on the
box is high-ish but nothing unreasonable, and I certainly appear to have
headroom in memory and CPU.

However, when I click the "export nodes as CSV" link, it runs forever
(it still hasn't finished).

I looked into what the database was doing, and it appears to be looping over
some unknown number of report_ids, running this for each one:

  7172 | dashboard | (query below) | 00:00:15.575955

  SELECT COUNT(*) FROM "resource_statuses"
  WHERE "resource_statuses"."report_id" = 39467
    AND "resource_statuses"."failed" = 'f'
    AND (
    IN (
      SELECT resource_statuses.id FROM resource_statuses
      INNER JOIN resource_events
        ON resource_statuses.id = resource_events.resource_status_id
      WHERE resource_events.status = 'noop'
    )

I ran the inner join by hand and it takes roughly 2 - 3 minutes each time.
The overall query appears to be running 8 minutes per report ID.
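To put "runs forever" in rough numbers, here is a back-of-the-envelope sketch. It assumes (this is only an assumption, since the number of report_ids in the loop is unknown) that the export touches one report per node the dashboard sees, and that each report really does take the observed 8 minutes. The function name is made up for illustration.

```python
from datetime import timedelta

# Figures taken from this post; everything else is an assumption.
MINUTES_PER_REPORT = 8      # observed runtime of the per-report query
NODES_SEEN = 3290           # nodes the dashboard currently shows (approximate)


def estimated_export_time(report_count, minutes_per_report=MINUTES_PER_REPORT):
    """Total wall-clock time if the reports are processed one after another."""
    return timedelta(minutes=report_count * minutes_per_report)


if __name__ == "__main__":
    # Hypothetical case: exactly one report per node.
    total = estimated_export_time(NODES_SEEN)
    print(f"{total.days} days, {total.seconds // 3600} hours")
```

Under those assumptions the export would need more than two weeks, so waiting for it to finish is not a realistic option.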

I've done a few things to tweak postgresql before this--it could have been
running longer earlier when I first noticed the problem.

I increased checkpoint_segments to 32 from the default of 3 and
checkpoint_completion_target to 0.9 from the default of 0.5, and to be able
to observe what's going on I set stats_command_string to on.

Some other details: we have 3400 nodes (dashboard is only seeing 3290 or
so, which is part of why I want this CSV report: to determine why the
number is smaller). This postgresql instance is also the instance supporting
puppetdb, though obviously in a separate database. The resource_statuses
table has 47 million rows right now, and the inner join returns 4.3 million.
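One thing the row counts suggest: the subquery is not tied to the report being counted, so the join over the whole table (4.3 million matching rows) may be evaluated again for every report_id. A minimal sketch of one possible rewrite follows, using a correlated EXISTS so only the statuses of the current report are checked. It makes two loud assumptions: that the text missing before "IN (" in the captured query is resource_statuses.id, and that the toy schema below resembles the real one. It uses an in-memory SQLite database purely as a stand-in to show the two forms return the same count; it says nothing about what the PostgreSQL 8.4 planner will actually do, which would need to be checked with EXPLAIN on the real data.

```python
import sqlite3

# Toy stand-in schema; column names follow the query in this post,
# column types and sample rows are invented for illustration.
SCHEMA = """
CREATE TABLE resource_statuses (id INTEGER PRIMARY KEY, report_id INTEGER, failed TEXT);
CREATE TABLE resource_events (id INTEGER PRIMARY KEY, resource_status_id INTEGER, status TEXT);
INSERT INTO resource_statuses VALUES
  (1, 39467, 'f'), (2, 39467, 'f'), (3, 39467, 't'), (4, 1, 'f');
INSERT INTO resource_events VALUES
  (1, 1, 'noop'), (2, 1, 'noop'), (3, 2, 'success'), (4, 3, 'noop'), (5, 4, 'noop');
"""

# Shape of the observed query, ASSUMING the elided column is resource_statuses.id.
ORIGINAL_FORM = """
SELECT COUNT(*) FROM resource_statuses
WHERE resource_statuses.report_id = ?
  AND resource_statuses.failed = 'f'
  AND resource_statuses.id IN (
    SELECT resource_statuses.id FROM resource_statuses
    INNER JOIN resource_events
      ON resource_statuses.id = resource_events.resource_status_id
    WHERE resource_events.status = 'noop')
"""

# Candidate rewrite: the subquery only looks at events of the current row.
EXISTS_FORM = """
SELECT COUNT(*) FROM resource_statuses rs
WHERE rs.report_id = ?
  AND rs.failed = 'f'
  AND EXISTS (
    SELECT 1 FROM resource_events re
    WHERE re.resource_status_id = rs.id
      AND re.status = 'noop')
"""


def make_db():
    """Create the in-memory toy database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def count_noop(conn, sql, report_id):
    """Run one of the two query forms for a single report."""
    return conn.execute(sql, (report_id,)).fetchone()[0]


if __name__ == "__main__":
    db = make_db()
    for rid in (39467, 1, 999):
        print(rid, count_noop(db, ORIGINAL_FORM, rid), count_noop(db, EXISTS_FORM, rid))
```

If the rewritten form does help, it would presumably rely on resource_events being indexed on resource_status_id; whether such an index exists in this dashboard schema is something to verify rather than assume.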

I'm curious if anyone else is running this version on postgresql with a
large environment and if there are places I ought to be looking to tune
this so it will run faster, or if I need to be doing something to shrink
those tables without losing information, etc.

Thanks

Pete
