Hey,

would it be very hard to improve file's source parameter to support
downloading a file from an http server?

I've got some quite huge files (~100MB) to distribute and update on
each of the servers, so doing this with puppet:// is a mess (and mostly
fails with the known buffer error).

Yours,
Phillip
On Tue, Dec 18, 2007 at 04:55:09PM +0100, Phillip Scholz wrote:
> Hey,

Greetings!

> I've got some quite huge files (~100MB) to distribute and update on
> each of the servers, so doing this with puppet:// is a mess (and
> mostly fails with the known buffer error).

Any particular reason you don't do this by distributing packages for
your OS? We distribute nothing but config files and very simple scripts
with Puppet; everything else is packaged (in Debian packages, for our
part) and distributed along with other OS updates. Puppet can ensure
these packages get installed.

Of course, you could just execute wget with Puppet, roughly like the
sketch below, but I admit it's kind of hackish.

Cheers,
-Berge

-- 
Berge Schwebs Bjørlo
Alegría!
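A minimal sketch of both approaches in manifest syntax; the package
name, URL, and paths are invented for the example:

  # Preferred: let the OS package manager carry the big payload.
  package { "bigdata-files":
    ensure => latest,
  }

  # Hackish fallback: fetch over http, but only when the file is
  # missing (the "creates" parameter keeps wget from re-running).
  exec { "fetch-big-image":
    command => "/usr/bin/wget -q -O /opt/data/big.img http://fileserver.example.com/big.img",
    creates => "/opt/data/big.img",
  }

Note that the exec variant only checks for existence, not freshness;
the package approach gets versioning and upgrades for free.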
On Dec 18, 2007, at 9:55 AM, Phillip Scholz wrote:

> Hey,
>
> would it be very hard to improve file's source parameter to support
> downloading a file from an http server?
> I've got some quite huge files (~100MB) to distribute and update on
> each of the servers, so doing this with puppet:// is a mess (and
> mostly fails with the known buffer error).

As Berge mentioned, Puppet isn't really meant to transfer large files.
It could be extended to do so, although it will be much easier once
we've actually successfully switched to REST (which I'd love to give a
timeline for, but that's burned me recently, so....).

The big problem is that you can't really depend on md5 sums for http
transfers; you'd have to stick with modification dates, along the lines
of the conditional-GET sketch below.

I'd back-burner it until REST is complete; then, if you're interested,
I can show you where to go.

-- 
The time to repair the roof is when the sun is shining.
    -- John F. Kennedy
---------------------------------------------------------------------
Luke Kanies | http://reductivelabs.com | http://madstop.com
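For illustration, modification-date-based fetching is what HTTP
conditional requests already provide. This is only a sketch in plain
Ruby using the net/http standard library, not anything Puppet does
today; the URL and path are made up:

  require 'net/http'
  require 'time'

  # Re-fetch a file over http only if the server's copy is newer than
  # ours, using an If-Modified-Since header.
  def fetch_if_modified(uri_str, local_path)
    uri = URI.parse(uri_str)
    headers = {}
    if File.exist?(local_path)
      headers['If-Modified-Since'] = File.mtime(local_path).httpdate
    end
    response = Net::HTTP.start(uri.host, uri.port) do |http|
      http.get(uri.path, headers)
    end
    case response
    when Net::HTTPNotModified    # 304: local copy is current
      false
    when Net::HTTPSuccess        # 200: server sent a newer version
      File.open(local_path, 'wb') { |f| f.write(response.body) }
      true
    else
      raise "unexpected response: #{response.code}"
    end
  end

  # fetch_if_modified('http://fileserver.example.com/big.img',
  #                   '/opt/data/big.img')

The obvious weakness is that a modification date says nothing about
content integrity, unlike an md5 sum.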
Hey,

thank you both. So I guess I'll have a look into our package mirror...

Luke Kanies schrieb:
> The big problem is that you can't really depend on md5 sums for http
> transfers; you'd have to stick with modification dates.

Ok, that really sounds like a good reason not to use http. What about
rsync?

> I'd back-burner it until REST is complete; then, if you're interested,
> I can show you where to go.

Would be nice. Thanks.

Yours,
Phillip
Just out of curiosity, why can't you rely on MD5 sums for file
transfers over http?

If you ran a plain http server on the puppet server itself, the puppet
server could calculate the MD5 sums locally and pass them to the client
along with the URL. Sort of like:

  - Client requests file foo.txt.
  - The manifest states that foo.txt should have MD5 checksum
    <whatever>.
  - The manifest states that foo.txt resides at <URL>.
  - The client checks the local MD5 checksum and, if it doesn't match,
    retrieves the file from <URL> and re-checks, throwing a warning if
    it is still incorrect.

(A rough sketch of that client-side flow follows at the end of this
message.)

The server side would be a bit more interesting...

If the file resides on the puppet server, then the server can publish a
mask URL for the client but use a file URL internally. Example:

  file { "foo.txt":
    content => file:///files/foo.txt,
    mask    => http://puppet/files/foo.txt,
    <etc...>
  }

Then any file type that the internal ruby libs can handle can be
retrieved from anywhere. I can envision this for file://, http://,
https://, ldap://, rsync://, ftp://, etc.

You could even optionally designate an external source for MD5
checksums, or place them in the manifest directly, i.e. take the
standard output of md5sum run against the source and pass that into the
manifest. Example:

  file { "foo.txt":
    content  => file:///files/foo.txt,
    mask     => http://puppet/files/foo.txt,
    checksum => file:///data/md5sums.txt,
    <etc...>
  }

Obviously, if the files were on the same server as the puppet server,
you could scan for changes on a very rapid basis (inotify?). However,
if they were on a remote server, some politeness should ensue, since
auto-generating the checksum file would require either downloading the
file from the remote system to obtain the checksum, or relying on the
remote system to have a routine available that outputs the checksum for
the puppet server. All remote communication regarding checksums should,
of course, be done over SSL.

In the end, implementing this kind of utility should eliminate the need
for the puppet server to have its own file distribution mechanism at
all.

Let me know if that didn't make sense or if I've got something horribly
wrong with my logic; it was a bit of a brain dump.

Trevor

On Dec 18, 2007 11:57 AM, Luke Kanies <luke@madstop.com> wrote:
> The big problem is that you can't really depend on md5 sums for http
> transfers; you'd have to stick with modification dates.
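The client-side flow above, as a rough sketch in plain Ruby; every name
here is invented, and this is not existing Puppet code:

  require 'net/http'
  require 'digest/md5'

  # Compare the local file's MD5 to the checksum from the manifest,
  # re-fetch on mismatch, and warn if the downloaded copy still
  # doesn't match. (File.read slurps the whole file; fine for a
  # sketch, wasteful for ~100MB payloads.)
  def ensure_file(local_path, url, expected_md5)
    if File.exist?(local_path) &&
       Digest::MD5.hexdigest(File.read(local_path)) == expected_md5
      return :in_sync
    end
    body = Net::HTTP.get(URI.parse(url))
    File.open(local_path, 'wb') { |f| f.write(body) }
    if Digest::MD5.hexdigest(File.read(local_path)) == expected_md5
      :updated
    else
      warn "checksum mismatch for #{local_path} after download"
      :mismatch
    end
  end

  # ensure_file('/etc/foo.txt',
  #             'http://puppet/files/foo.txt',
  #             'd41d8cd98f00b204e9800998ecf8427e')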
On Dec 20, 2007, at 6:16 AM, Trevor Vaughan wrote:

> Just out of curiosity, why can't you rely on MD5 sums for file
> transfers over http?

Well, truthfully, if we split the metadata collection from the file
retrieval, we could use http for file retrieval and something else for
getting the metadata. Which, incidentally, is basically what we're
doing when we move to REST.

I've already split fileserving into metadata and content services, and
you could pretty easily support metadata over REST and content over
straight HTTP, I'd think. However, my goal is to make the REST content
service just about equivalent to straight http, so you shouldn't even
need to do so. Getting this to work has been one of the main
motivations of the REST work; it's just been a ton more work than I'd
hoped.

> If you ran a plain http server on the puppet server itself, the
> puppet server could calculate the MD5 sums locally and pass them to
> the client along with the URL. Sort of like:
>
>   - Client requests file foo.txt.
>   - The manifest states that foo.txt should have MD5 checksum
>     <whatever>.
>   - The manifest states that foo.txt resides at <URL>.
>   - The client checks the local MD5 checksum and, if it doesn't
>     match, retrieves the file from <URL> and re-checks, throwing a
>     warning if it is still incorrect.

This isn't actually how Puppet works right now, though: the manifest
only provides a URL, not a checksum for the file, so every file copy is
two connections, one to get the metadata and one to get the content.
This means that the server's file can change without forcing a
recompile, and I'm not sure if that is a feature or a bug.

I'm hoping to eventually support getting the metadata during
compilation, so that the manifest actually would include the checksum.
This would be a pretty radical departure from how fileserving works,
but I think it would make a lot of sense: we would retrieve files by
their checksums rather than by URLs, which would open up all kinds of
interesting possibilities, including things like BitTorrent for file
retrieval.
> In the end, implementing this kind of utility should eliminate the
> need for the puppet server to have its own file distribution
> mechanism at all.

I think the move to REST is the right way: get metadata over REST, and
get the files directly over http. A hand-wavy sketch of that two-step
flow is below.

-- 
Getting caught is the mother of invention.
    -- Robert Byrne
---------------------------------------------------------------------
Luke Kanies | http://reductivelabs.com | http://madstop.com
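As an illustration of "metadata over REST, content over http": the
endpoint paths, port, and payload format below are purely hypothetical,
chosen to show the shape of the idea rather than any real Puppet API:

  require 'net/http'
  require 'digest/md5'
  require 'yaml'

  server = 'puppet'
  port   = 8140

  # Step 1: fetch the file's metadata over the REST service; assume a
  # small YAML document carrying the md5 checksum.
  meta = YAML.load(
    Net::HTTP.get(server, '/file_metadata/files/foo.txt', port))

  # Step 2: fetch the content, addressed by checksum rather than by
  # path. Anything that can answer "GET /file_content/<md5>" could
  # serve this: a plain http mirror, a cache, in principle even a
  # BitTorrent gateway.
  content = Net::HTTP.get(server,
                          "/file_content/#{meta['checksum']}", port)

  # Verify before installing; a mismatch means the content source is
  # stale or corrupt.
  unless Digest::MD5.hexdigest(content) == meta['checksum']
    raise 'checksum mismatch'
  end

Because the content request is addressed by checksum, the client can
verify whatever it receives on its own, which is what makes untrusted
or third-party mirrors plausible in this scheme.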