Pino Toscano
2018-Dec-03 16:17 UTC
[Libguestfs] [supermin PATCH 0/2] Create a really empty base.tar.gz
See patch #2 for more explanation. Pino Toscano (2): prepare: keep config_files available for longer prepare: create a really empty base.tar.gz with no config files src/mode_prepare.ml | 87 +++++++++++++++++++++++++++------------------ 1 file changed, 52 insertions(+), 35 deletions(-) -- 2.17.2
Pino Toscano
2018-Dec-03 16:17 UTC
[Libguestfs] [supermin PATCH 1/2] prepare: keep config_files available for longer
This is just refactoring, with no behaviour changes. --- src/mode_prepare.ml | 40 ++++++++++++++++++++-------------------- 1 file changed, 20 insertions(+), 20 deletions(-) diff --git a/src/mode_prepare.ml b/src/mode_prepare.ml index 7c8221e..7759c58 100644 --- a/src/mode_prepare.ml +++ b/src/mode_prepare.ml @@ -128,28 +128,28 @@ let prepare debug (copy_kernel, format, host_cpu, * be missing either from the package or from the filesystem (the * latter case with --use-installed). *) - let files_from - let config_files - List.map ( - fun (_, files) -> - filter_map ( - function - | { ft_config = true; ft_path = path } -> Some path - | { ft_config = false } -> None - ) files - ) packages in - let config_files = List.flatten config_files in - - let config_files = List.filter ( - fun path -> - try close_in (open_in (dir // path)); true - with Sys_error _ -> false - ) config_files in + let config_files + List.map ( + fun (_, files) -> + filter_map ( + function + | { ft_config = true; ft_path = path } -> Some path + | { ft_config = false } -> None + ) files + ) packages in + let config_files = List.flatten config_files in + + let config_files = List.filter ( + fun path -> + try close_in (open_in (dir // path)); true + with Sys_error _ -> false + ) config_files in - if debug >= 1 then - printf "supermin: there are %d config files\n" - (List.length config_files); + if debug >= 1 then + printf "supermin: there are %d config files\n" + (List.length config_files); + let files_from (* Put the list of config files into a file, for tar to read. *) let files_from = tmpdir // "files-from.txt" in let chan = open_out files_from in -- 2.17.2
Pino Toscano
2018-Dec-03 16:17 UTC
[Libguestfs] [supermin PATCH 2/2] prepare: create a really empty base.tar.gz with no config files
tar defaults to 10K as block size, and thus it pads the created archives to multiples of that size using zero's. Hence, when creating an archive with no files, it will be 10K zero's, that is compressed by gzip, resulting in few bytes. The issue happens later, during the build phase: base.tar.gz is correctly detected as gz, and zcat is run to detect its content: since the empty tar was 10K of zero's, the buffer in get_compressed_file_content will be filled by zero's, and get_file_content will fail to detect anything. As solution, at least for our own base.tar.gz: in case there are no config files to copy, create a gzip file from /dev/null, which is still recognized as empty tar. --- src/mode_prepare.ml | 49 ++++++++++++++++++++++++++++++--------------- 1 file changed, 33 insertions(+), 16 deletions(-) diff --git a/src/mode_prepare.ml b/src/mode_prepare.ml index 7759c58..8a09315 100644 --- a/src/mode_prepare.ml +++ b/src/mode_prepare.ml @@ -149,20 +149,37 @@ let prepare debug (copy_kernel, format, host_cpu, printf "supermin: there are %d config files\n" (List.length config_files); - let files_from - (* Put the list of config files into a file, for tar to read. *) - let files_from = tmpdir // "files-from.txt" in - let chan = open_out files_from in - List.iter (fprintf chan ".%s\n") config_files; (* "./filename" *) - close_out chan; - - files_from in - - (* Write base.tar.gz. *) let base = outputdir // "base.tar.gz" in - if debug >= 1 then printf "supermin: writing %s\n%!" base; - let cmd - sprintf "tar%s -C %s -zcf %s -T %s" - (if debug >=1 then " -v" else "") - (quote dir) (quote base) (quote files_from) in - run_command cmd; + + if config_files <> [] then ( + (* There are config files to copy, so create the list with them, + * and then compress them with tar. + *) + let files_from + (* Put the list of config files into a file, for tar to read. *) + let files_from = tmpdir // "files-from.txt" in + let chan = open_out files_from in + List.iter (fprintf chan ".%s\n") config_files; (* "./filename" *) + close_out chan; + + files_from in + + (* Write base.tar.gz. *) + if debug >= 1 then printf "supermin: writing %s\n%!" base; + let cmd + sprintf "tar%s -C %s -zcf %s -T %s" + (if debug >=1 then " -v" else "") + (quote dir) (quote base) (quote files_from) in + run_command cmd; + ) + else ( + (* No config files to copy: create an gzip file from /dev/null, + * which will be recognized as empty tar. + *) + if debug >= 1 then printf "supermin: creating an empty %s\n%!" base; + let cmd + sprintf "gzip%s -c /dev/null > %s" + (if debug >=1 then " -v" else "") + (quote base) in + run_command cmd; + ) -- 2.17.2
Richard W.M. Jones
2018-Dec-03 17:06 UTC
Re: [Libguestfs] [supermin PATCH 1/2] prepare: keep config_files available for longer
On Mon, Dec 03, 2018 at 05:17:32PM +0100, Pino Toscano wrote:> This is just refactoring, with no behaviour changes. > --- > src/mode_prepare.ml | 40 ++++++++++++++++++++-------------------- > 1 file changed, 20 insertions(+), 20 deletions(-) > > diff --git a/src/mode_prepare.ml b/src/mode_prepare.ml > index 7c8221e..7759c58 100644 > --- a/src/mode_prepare.ml > +++ b/src/mode_prepare.ml > @@ -128,28 +128,28 @@ let prepare debug (copy_kernel, format, host_cpu, > * be missing either from the package or from the filesystem (the > * latter case with --use-installed). > *) > - let files_from > - let config_files > - List.map ( > - fun (_, files) -> > - filter_map ( > - function > - | { ft_config = true; ft_path = path } -> Some path > - | { ft_config = false } -> None > - ) files > - ) packages in > - let config_files = List.flatten config_files in > - > - let config_files = List.filter ( > - fun path -> > - try close_in (open_in (dir // path)); true > - with Sys_error _ -> false > - ) config_files in > + let config_files > + List.map ( > + fun (_, files) -> > + filter_map ( > + function > + | { ft_config = true; ft_path = path } -> Some path > + | { ft_config = false } -> None > + ) files > + ) packages in > + let config_files = List.flatten config_files in > + > + let config_files = List.filter ( > + fun path -> > + try close_in (open_in (dir // path)); true > + with Sys_error _ -> false > + ) config_files in > > - if debug >= 1 then > - printf "supermin: there are %d config files\n" > - (List.length config_files); > + if debug >= 1 then > + printf "supermin: there are %d config files\n" > + (List.length config_files); > > + let files_from > (* Put the list of config files into a file, for tar to read. *) > let files_from = tmpdir // "files-from.txt" in > let chan = open_out files_from inYes just a refactor, so ACK. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-p2v converts physical machines to virtual machines. Boot with a live CD or over the network (PXE) and turn machines into KVM guests. http://libguestfs.org/virt-v2v
Richard W.M. Jones
2018-Dec-03 17:07 UTC
Re: [Libguestfs] [supermin PATCH 2/2] prepare: create a really empty base.tar.gz with no config files
On Mon, Dec 03, 2018 at 05:17:33PM +0100, Pino Toscano wrote:> tar defaults to 10K as block size, and thus it pads the created archives > to multiples of that size using zero's. Hence, when creating an archive > with no files, it will be 10K zero's, that is compressed by gzip, > resulting in few bytes. The issue happens later, during the build > phase: base.tar.gz is correctly detected as gz, and zcat is run to > detect its content: since the empty tar was 10K of zero's, the buffer in > get_compressed_file_content will be filled by zero's, and > get_file_content will fail to detect anything. > > As solution, at least for our own base.tar.gz: in case there are no > config files to copy, create a gzip file from /dev/null, which is still > recognized as empty tar. > --- > src/mode_prepare.ml | 49 ++++++++++++++++++++++++++++++--------------- > 1 file changed, 33 insertions(+), 16 deletions(-) > > diff --git a/src/mode_prepare.ml b/src/mode_prepare.ml > index 7759c58..8a09315 100644 > --- a/src/mode_prepare.ml > +++ b/src/mode_prepare.ml > @@ -149,20 +149,37 @@ let prepare debug (copy_kernel, format, host_cpu, > printf "supermin: there are %d config files\n" > (List.length config_files); > > - let files_from > - (* Put the list of config files into a file, for tar to read. *) > - let files_from = tmpdir // "files-from.txt" in > - let chan = open_out files_from in > - List.iter (fprintf chan ".%s\n") config_files; (* "./filename" *) > - close_out chan; > - > - files_from in > - > - (* Write base.tar.gz. *) > let base = outputdir // "base.tar.gz" in > - if debug >= 1 then printf "supermin: writing %s\n%!" base; > - let cmd > - sprintf "tar%s -C %s -zcf %s -T %s" > - (if debug >=1 then " -v" else "") > - (quote dir) (quote base) (quote files_from) in > - run_command cmd; > + > + if config_files <> [] then ( > + (* There are config files to copy, so create the list with them, > + * and then compress them with tar. > + *) > + let files_from > + (* Put the list of config files into a file, for tar to read. *) > + let files_from = tmpdir // "files-from.txt" in > + let chan = open_out files_from in > + List.iter (fprintf chan ".%s\n") config_files; (* "./filename" *) > + close_out chan; > + > + files_from in > + > + (* Write base.tar.gz. *) > + if debug >= 1 then printf "supermin: writing %s\n%!" base; > + let cmd > + sprintf "tar%s -C %s -zcf %s -T %s" > + (if debug >=1 then " -v" else "") > + (quote dir) (quote base) (quote files_from) in > + run_command cmd; > + ) > + else ( > + (* No config files to copy: create an gzip file from /dev/null, > + * which will be recognized as empty tar. > + *) > + if debug >= 1 then printf "supermin: creating an empty %s\n%!" base; > + let cmd > + sprintf "gzip%s -c /dev/null > %s" > + (if debug >=1 then " -v" else "") > + (quote base) in > + run_command cmd; > + )I wonder if we should just create no base.tar.gz at all in this case? Anyway, it's fine, ACK. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-df lists disk usage of guests without needing to install any software inside the virtual machine. Supports Linux and Windows. http://people.redhat.com/~rjones/virt-df/
Apparently Analagous Threads
- [supermin PATCH v2 0/3] Better handle no config files
- [supermin PATCH 1/2] prepare: keep config_files available for longer
- [supermin PATCH 2/2] prepare: create a really empty base.tar.gz with no config files
- [supermin 1/3] Recognize dpkg-divert
- supermin and dpkg-divert