Hello, I'm currently trying to deploy PuppetDB to my environment, but I'm having difficulties and am unsure how to proceed. I have 1300+ nodes checking in at 15-minute intervals (3.7 million resources in the population). The load is spread across 6 puppet masters. I requisitioned what I thought would be a powerful enough machine for the PuppetDB/Postgres server: 128GB of RAM, 16 physical CPU cores, and a 500GB SSD for the database. I can point one or two of my puppet masters at PuppetDB with reasonable performance, but any more than that and commands start stacking up in the PuppetDB command queue and agents start timing out. (Actually, even with just one puppet master using PuppetDB I still get occasional agent timeouts.) Is one Postgres server not going to cut it? Do I need to look into clustering? I'm sure some of you run PuppetDB in larger environments than this; any tips?

--
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to puppet-users+unsubscribe@googlegroups.com.
To post to this group, send email to puppet-users@googlegroups.com.
Visit this group at http://groups.google.com/group/puppet-users.
For more options, visit https://groups.google.com/groups/opt_out.
Darin Perusich
2013-Oct-24 16:54 UTC
Re: [Puppet Users] Help with scaling puppetdb/postgres
Have you tuned PG? You can run pgtune (http://pgfoundry.org/projects/pgtune), and it'll set sizes in postgresql.conf based on the resources available on the Postgres server.

--
Later, Darin

On Thu, Oct 24, 2013 at 11:55 AM, David Mesler <david.mesler@gmail.com> wrote:
> Hello, I'm currently trying to deploy PuppetDB to my environment but I'm having difficulties [...]
> [quoted text snipped]
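To give a feel for what pgtune does with Darin's suggestion: it picks postgresql.conf sizes from the machine's resources. The sketch below uses common rules of thumb for a dedicated Postgres box (shared_buffers around a quarter of RAM, effective_cache_size around three quarters). These are illustrative heuristics only, not pgtune's actual output; run the real tool for vetted numbers.

```python
# Illustrative approximation of pgtune-style sizing heuristics for a
# dedicated PostgreSQL server. These are common rules of thumb, NOT the
# exact values the real pgtune tool would emit.

def suggest_settings(ram_gb):
    ram_mb = ram_gb * 1024
    return {
        # ~25% of RAM for shared_buffers is a widely used starting point
        "shared_buffers": f"{ram_mb // 4}MB",
        # ~75% of RAM: roughly what the OS page cache can be expected to hold
        "effective_cache_size": f"{ram_mb * 3 // 4}MB",
        # per-sort memory; keep modest when many connections are expected
        "work_mem": f"{max(1, ram_mb // 512)}MB",
        # generous maintenance memory helps index builds and vacuums
        "maintenance_work_mem": f"{min(2048, ram_mb // 16)}MB",
    }

# The 128GB machine from the original post:
for key, value in suggest_settings(128).items():
    print(f"{key} = {value}")
```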
pgtune is probably a good place to start: https://github.com/gregs1104/pgtune ... available as an rpm/deb on the more popular distros, I believe.

Also, this is probably very premature, but I have a draft doc with notes on how to tune your DB for PuppetDB:

https://docs.google.com/document/d/1hpFbh2q0WmxAvwfWRlurdaEF70fLc6oZtdktsCq2UFU/edit?usp=sharing

Use at your own risk, as it hasn't been completely vetted. Happy to get any feedback on this, as I plan on making it part of our endorsed documentation.

Also ... there is an index that has lately been causing people problems: 'idx_catalog_resources_tags_gin'. You might want to try dropping it to see if it improves performance (thanks to Erik Dalen and his colleagues for that one):

DROP INDEX idx_catalog_resources_tags_gin;

It is easily restored if it doesn't help, but may take some time to build:

CREATE INDEX idx_catalog_resources_tags_gin
  ON catalog_resources
  USING gin
  (tags COLLATE pg_catalog."default");

ken.

On Thu, Oct 24, 2013 at 4:55 PM, David Mesler <david.mesler@gmail.com> wrote:
> Hello, I'm currently trying to deploy PuppetDB to my environment but I'm having difficulties [...]
> [quoted text snipped]
Here is the URL for the GIN index problem: http://projects.puppetlabs.com/issues/22947, so if removing it does help, please let us know either in this thread or, preferably, in the ticket as well.

ken.

On Thu, Oct 24, 2013 at 6:02 PM, Ken Barber <ken@puppetlabs.com> wrote:
> pgtune is probably a good place to start [...]
> [quoted text snipped]
I reconfigured Postgres based on the recommendations from pgtune and your document. I still had a lot of agent timeouts, and eventually, after running overnight, the command queue on the PuppetDB server was over 4000. Maybe I need a box with traditional RAID and a lot of spindles instead of the SSD. Or maybe I need a cluster of Postgres servers (if that's possible), I don't know. The PuppetDB docs said a laptop with a consumer-grade SSD was enough for 5000 virtual nodes, so I was optimistic this would be a simple setup. Oh well.

On Thursday, October 24, 2013 1:02:55 PM UTC-4, Ken Barber wrote:
> pgtune is probably a good place to start [...]
> [quoted text snipped]
Just out of curiosity, what is your catalog duplication rate?

On Tuesday, October 29, 2013 3:26:20 AM UTC+1, David Mesler wrote:
> I reconfigured Postgres based on the recommendations from pgtune and your document. [...]
> [quoted text snipped]
Hmm.

> I reconfigured postgres based on the recommendations from pgtune and your document. I still had a lot of agent timeouts [...] The puppetdb docs said a laptop with a consumer grade SSD was enough for 5000 virtual nodes so I was optimistic this would be a simple setup.

So the reality is, you are effectively running 5200 nodes in comparison with the vague statement in the docs. This is because you are running every 15 minutes, whereas the statement presumes running every hour.

Can we get a look at your dashboard? In particular, your catalog and resource duplication rates?

ken.
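The "effectively 5200 nodes" figure is simple arithmetic: submission load scales with check-ins per hour, not raw node count. A quick sketch, using the docs' 5000-nodes-hourly claim mentioned earlier as the baseline:

```python
# Catalog submissions per hour scale with run frequency, not node count.
nodes = 1300
runs_per_hour = 60 // 15          # 15-minute intervals -> 4 runs/hour
submissions_per_hour = nodes * runs_per_hour
print(submissions_per_hour)       # 5200

# The docs' laptop claim assumed hourly runs, so this load is equivalent to:
baseline_nodes_hourly = 5000
print(submissions_per_hour / baseline_nodes_hourly)  # 1.04
```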
Resource duplication is 98.7%, catalog duplication is 1.5%.

On Tuesday, October 29, 2013 9:06:37 AM UTC-4, Ken Barber wrote:
> So the reality is, you are effectively running 5200 nodes [...]
> [quoted text snipped]
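For context on how a duplication figure like this arises: PuppetDB treats an incoming catalog as a duplicate when it matches what is already stored for that node, letting it skip the expensive database writes. The sketch below shows the general hash-and-compare idea with a toy catalog; it is not PuppetDB's actual implementation, which canonicalizes catalogs more carefully than a plain JSON dump.

```python
import hashlib
import json

def catalog_hash(catalog):
    # Canonicalize (sorted keys) so logically identical catalogs
    # serialize identically, then hash the serialized form.
    canonical = json.dumps(catalog, sort_keys=True)
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()

# A volatile parameter (e.g. fact data baked into a file's content)
# changes the hash on every run, defeating deduplication:
cat_a = {"resources": [{"type": "File", "title": "/etc/motd",
                        "parameters": {"content": "hello"}}]}
cat_b = json.loads(json.dumps(cat_a))                  # identical copy
cat_c = {"resources": [{"type": "File", "title": "/etc/motd",
                        "parameters": {"content": "generated at 12:01"}}]}

print(catalog_hash(cat_a) == catalog_hash(cat_b))  # True  -> duplicate
print(catalog_hash(cat_a) == catalog_hash(cat_c))  # False -> full re-store
```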
1.5% catalog duplication is really low and, from a PuppetDB perspective, means a lot more database I/O. I think that probably explains the problems you are seeing. A more typical duplication percentage would be something over 90%.

The next step here is figuring out why the duplication percentage is so low. There's a ticket I'm working on now [1] to help in debugging these kinds of issues with catalogs, but it's not done yet. One option you have now is to query for the current catalog of a node after a few subsequent catalog updates. You can do this using curl and the catalogs API [2]. That API call will give you a JSON representation of the catalog data from PuppetDB for that node. You can then compare the JSON files and see if you maybe have a resource that is changing with each run. If you need help getting that information or want more help troubleshooting the output, head over to #puppet on IRC [3] and one of the PuppetDB folks can help you out.

1 - https://projects.puppetlabs.com/issues/22977
2 - https://docs.puppetlabs.com/puppetdb/1.5/api/query/v3/catalogs.html
3 - http://projects.puppetlabs.com/projects/1/wiki/Irc_Channel

On Tue, Oct 29, 2013 at 11:50 AM, David Mesler <david.mesler@gmail.com> wrote:
> Resource duplication is 98.7%, catalog duplication is 1.5%.
> [quoted text snipped]
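If you do pull two catalogs for the same node as described above, a small script can single out the resources whose parameters change between runs instead of eyeballing raw JSON. A hedged sketch: it assumes each catalog document contains a "resources" list of objects with "type", "title", and "parameters" keys (roughly the v3 wire format), which may need adjusting for your PuppetDB version.

```python
def changed_resources(catalog_a, catalog_b):
    """Return (type, title) keys whose parameters differ between two
    catalogs of the same node. Assumes each catalog has a 'resources'
    list of {'type', 'title', 'parameters'} dicts."""
    def index(catalog):
        return {(r["type"], r["title"]): r.get("parameters", {})
                for r in catalog["resources"]}
    a, b = index(catalog_a), index(catalog_b)
    return sorted(key for key in a.keys() | b.keys()
                  if a.get(key) != b.get(key))

# Example: one resource is stable, one embeds volatile fact data.
run1 = {"resources": [
    {"type": "File", "title": "/etc/motd", "parameters": {"content": "hi"}},
    {"type": "File", "title": "/etc/mcollective/facts.yaml",
     "parameters": {"content": "uptime: 100"}}]}
run2 = {"resources": [
    {"type": "File", "title": "/etc/motd", "parameters": {"content": "hi"}},
    {"type": "File", "title": "/etc/mcollective/facts.yaml",
     "parameters": {"content": "uptime: 115"}}]}

print(changed_resources(run1, run2))
# [('File', '/etc/mcollective/facts.yaml')]
```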
Also, looking at the reports (Foreman, PuppetDB) might give a clue about what is changing.

On Tuesday, October 29, 2013 7:32:54 PM UTC+1, Ryan Senior wrote:
> 1.5% catalog duplication is really low and from a PuppetDB perspective, means a lot more database I/O. [...]
> [quoted text snipped]
Well, I found the cause of my 1% duplication rate. I was using the recommendation from this page (http://projects.puppetlabs.com/projects/mcollective-plugins/wiki/FactsFacterYAML) to generate a facts.yaml file for mcollective. I got rid of that and my catalog duplication went up to 73%. I'm not sure what else is changing; my catalogs are huge and I don't know how to diff unsorted JSON files.

I also moved to a server with a 10-disk RAID10 and performance is better. I'm still having trouble tuning autovacuum: either vacuums never finish because they're constantly delayed, or they eat up all the I/O and things grind to a halt. And even when I/O seems low, there are still times when the PuppetDB queue swells to over 1000 before draining.

On Tuesday, October 29, 2013 2:32:54 PM UTC-4, Ryan Senior wrote:
> 1.5% catalog duplication is really low and from a PuppetDB perspective, means a lot more database I/O. [...]
> [quoted text snipped]
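On the autovacuum symptoms above (vacuums either starved by the cost delay or saturating I/O), the relevant knobs are PostgreSQL's cost-based vacuum settings. The fragment below is an illustrative starting point only, not tuned advice; the right values depend on your write rate and disk subsystem, so adjust and measure.

```ini
# postgresql.conf -- illustrative autovacuum settings, not a recommendation.
# Raising the cost limit lets each vacuum do more work per cycle (so it
# finishes sooner); raising the delay throttles its I/O (gentler on disks).
autovacuum                     = on
autovacuum_max_workers         = 3       # concurrent vacuum workers
autovacuum_naptime             = 1min    # how often to look for work
autovacuum_vacuum_cost_delay   = 10ms    # sleep between cost-limited chunks
autovacuum_vacuum_cost_limit   = 1000    # work per chunk (default inherits
                                         # vacuum_cost_limit, 200)
# Vacuum busy tables sooner by lowering the changed-row fraction trigger:
autovacuum_vacuum_scale_factor = 0.1
```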
On Thursday, November 7, 2013 6:53:25 PM UTC-6, David Mesler wrote:
> Well I found the cause of my 1% duplication rate. I was using the recommendation from this page (http://projects.puppetlabs.com/projects/mcollective-plugins/wiki/FactsFacterYAML) to generate a facts.yaml file for mcollective.

Most likely that would be because of the 'content' parameter of the resource File['/etc/mcollective/facts.yaml']. The values of resource parameters are part of the catalog, so to the extent that nodes have different facts ($::hostname, for instance), their catalogs will differ. That seems to present a fundamental problem for scaling to large numbers of nodes when you're also using PuppetDB.

> I got rid of that and my catalog duplication went up to 73%. I'm not sure what else is changing, my catalogs are huge and I don't know how to diff unsorted json files.

A quick and dirty way to compare would be simply to pass the catalogs through 'sort' before 'diff'ing them. Doing so will trash the JSON structure, but you should still get some useful information out of it. At minimum you will find out whether there are few or many differences between your catalogs, and you should get at least a general idea of what differs. This would be most effective if applied to catalogs that are distinct with the facts.yaml generation in place, but duplicate without it.

John
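The sort-before-diff trick above can be scripted. A sketch under the same spirit: decompose each catalog into one canonical line per leaf value, sort, and unified-diff the results. This stays order-insensitive like plain `sort` but keeps a path to each differing value.

```python
import difflib
import json

def flatten(obj, path="$"):
    """Yield one 'path = value' line per leaf, so the output is stable
    under reordering of dict keys and of resource lists."""
    if isinstance(obj, dict):
        for key in sorted(obj):
            yield from flatten(obj[key], f"{path}.{key}")
    elif isinstance(obj, list):
        # Sort list items by serialized form: catalog resource order is
        # not significant for comparison purposes.
        for item in sorted(obj, key=lambda x: json.dumps(x, sort_keys=True)):
            yield from flatten(item, f"{path}[]")
    else:
        yield f"{path} = {obj!r}"

def catalog_diff(cat_a, cat_b):
    a, b = sorted(flatten(cat_a)), sorted(flatten(cat_b))
    return [line for line in difflib.unified_diff(a, b, lineterm="")
            if line.startswith(("+", "-"))
            and not line.startswith(("+++", "---"))]

cat1 = {"resources": [{"title": "/etc/motd", "parameters": {"content": "hi"}}]}
cat2 = {"resources": [{"title": "/etc/motd", "parameters": {"content": "bye"}}]}
for line in catalog_diff(cat1, cat2):
    print(line)
```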