Linux 3.11 added respective support, so I think we should follow suit. 1: xen: add LZ4 decompression support 2: libxc: add LZ4 decompression support Signed-off-by: Jan Beulich <jbeulich@suse.com>
Add support for LZ4 decompression in Xen. LZ4 Decompression APIs for Xen are based on LZ4 implementation by Yann Collet. Benchmark Results(PATCH v3) Compiler: Linaro ARM gcc 4.6.2 1. ARMv7, 1.5GHz based board Kernel: linux 3.4 Uncompressed Kernel Size: 14MB Compressed Size Decompression Speed LZO 6.7MB 20.1MB/s, 25.2MB/s(UA) LZ4 7.3MB 29.1MB/s, 45.6MB/s(UA) 2. ARMv7, 1.7GHz based board Kernel: linux 3.7 Uncompressed Kernel Size: 14MB Compressed Size Decompression Speed LZO 6.0MB 34.1MB/s, 52.2MB/s(UA) LZ4 6.5MB 86.7MB/s - UA: Unaligned memory Access support - Latest patch set for LZO applied This patch set is for adding support for LZ4-compressed Kernel. LZ4 is a very fast lossless compression algorithm and it also features an extremely fast decoder [1]. But we have five of decompressors already and one question which does arise, however, is that of where do we stop adding new ones? This issue had been discussed and came to the conclusion [2]. Russell King said that we should have: - one decompressor which is the fastest - one decompressor for the highest compression ratio - one popular decompressor (eg conventional gzip) If we have a replacement one for one of these, then it should do exactly that: replace it. The benchmark shows that an 8% increase in image size vs a 66% increase in decompression speed compared to LZO(which has been known as the fastest decompressor in the Kernel). Therefore the "fast but may not be small" compression title has clearly been taken by LZ4 [3]. [1] http://code.google.com/p/lz4/ [2] http://thread.gmane.org/gmane.linux.kbuild.devel/9157 [3] http://thread.gmane.org/gmane.linux.kbuild.devel/9347 LZ4 homepage: http://fastcompression.blogspot.com/p/lz4.html LZ4 source repository: http://code.google.com/p/lz4/ Signed-off-by: Kyungsik Lee <kyungsik.lee@lge.com> Signed-off-by: Yann Collet <yann.collet.73@gmail.com> Adopted for Xen. Signed-off-by: Jan Beulich <jbeulich@suse.com> --- a/xen/common/Makefile +++ b/xen/common/Makefile @@ -49,7 +49,7 @@ obj-y += radix-tree.o obj-y += rbtree.o obj-y += lzo.o -obj-bin-$(CONFIG_X86) += $(foreach n,decompress bunzip2 unxz unlzma unlzo,$(n).init.o) +obj-bin-$(CONFIG_X86) += $(foreach n,decompress bunzip2 unxz unlzma unlzo unlz4,$(n).init.o) obj-$(perfc) += perfc.o obj-$(crash_debug) += gdbstub.o --- a/xen/common/decompress.c +++ b/xen/common/decompress.c @@ -29,5 +29,8 @@ int __init decompress(void *inbuf, unsig if ( len >= 5 && !memcmp(inbuf, "\x89LZO", 5) ) return unlzo(inbuf, len, NULL, NULL, outbuf, NULL, error); + if ( len >= 2 && !memcmp(inbuf, "\x02\x21", 2) ) + return unlz4(inbuf, len, NULL, NULL, outbuf, NULL, error); + return 1; } --- /dev/null +++ b/xen/common/lz4/decompress.c @@ -0,0 +1,314 @@ +/* + * LZ4 Decompressor for Linux kernel + * + * Copyright (C) 2013, LG Electronics, Kyungsik Lee <kyungsik.lee@lge.com> + * + * Based on LZ4 implementation by Yann Collet. + * + * LZ4 - Fast LZ compression algorithm + * Copyright (C) 2011-2012, Yann Collet. + * BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php) + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are + * met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following disclaimer + * in the documentation and/or other materials provided with the + * distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + * + * You can contact the author at : + * - LZ4 homepage : http://fastcompression.blogspot.com/p/lz4.html + * - LZ4 source repository : http://code.google.com/p/lz4/ + */ + +#include "defs.h" + +static int INIT lz4_uncompress(const unsigned char *source, unsigned char *dest, + int osize) +{ + const BYTE *ip = (const BYTE *) source; + const BYTE *ref; + BYTE *op = (BYTE *) dest; + BYTE * const oend = op + osize; + BYTE *cpy; + unsigned token; + size_t length; + size_t dec32table[] = {0, 3, 2, 3, 0, 0, 0, 0}; +#if LZ4_ARCH64 + size_t dec64table[] = {0, 0, 0, -1, 0, 1, 2, 3}; +#endif + + while (1) { + + /* get runlength */ + token = *ip++; + length = (token >> ML_BITS); + if (length == RUN_MASK) { + size_t len; + + len = *ip++; + for (; len == 255; length += 255) + len = *ip++; + length += len; + } + + /* copy literals */ + cpy = op + length; + if (unlikely(cpy > oend - COPYLENGTH)) { + /* + * Error: not enough place for another match + * (min 4) + 5 literals + */ + if (cpy != oend) + goto _output_error; + + memcpy(op, ip, length); + ip += length; + break; /* EOF */ + } + LZ4_WILDCOPY(ip, op, cpy); + ip -= (op - cpy); + op = cpy; + + /* get offset */ + LZ4_READ_LITTLEENDIAN_16(ref, cpy, ip); + ip += 2; + + /* Error: offset create reference outside destination buffer */ + if (unlikely(ref < (BYTE *const) dest)) + goto _output_error; + + /* get matchlength */ + length = token & ML_MASK; + if (length == ML_MASK) { + for (; *ip == 255; length += 255) + ip++; + length += *ip++; + } + + /* copy repeated sequence */ + if (unlikely((op - ref) < STEPSIZE)) { +#if LZ4_ARCH64 + size_t dec64 = dec64table[op - ref]; +#else + const int dec64 = 0; +#endif + op[0] = ref[0]; + op[1] = ref[1]; + op[2] = ref[2]; + op[3] = ref[3]; + op += 4; + ref += 4; + ref -= dec32table[op-ref]; + PUT4(ref, op); + op += STEPSIZE - 4; + ref -= dec64; + } else { + LZ4_COPYSTEP(ref, op); + } + cpy = op + length - (STEPSIZE - 4); + if (cpy > (oend - COPYLENGTH)) { + + /* Error: request to write beyond destination buffer */ + if (cpy > oend) + goto _output_error; + LZ4_SECURECOPY(ref, op, (oend - COPYLENGTH)); + while (op < cpy) + *op++ = *ref++; + op = cpy; + /* + * Check EOF (should never happen, since last 5 bytes + * are supposed to be literals) + */ + if (op == oend) + goto _output_error; + continue; + } + LZ4_SECURECOPY(ref, op, cpy); + op = cpy; /* correction */ + } + /* end of decoding */ + return (int) (((unsigned char *)ip) - source); + + /* write overflow error detected */ +_output_error: + return (int) (-(((unsigned char *)ip) - source)); +} + +#ifndef __XEN__ +static int lz4_uncompress_unknownoutputsize(const char *source, char *dest, + int isize, size_t maxoutputsize) +{ + const BYTE *ip = (const BYTE *) source; + const BYTE *const iend = ip + isize; + const BYTE *ref; + + + BYTE *op = (BYTE *) dest; + BYTE * const oend = op + maxoutputsize; + BYTE *cpy; + + size_t dec32table[] = {0, 3, 2, 3, 0, 0, 0, 0}; +#if LZ4_ARCH64 + size_t dec64table[] = {0, 0, 0, -1, 0, 1, 2, 3}; +#endif + + /* Main Loop */ + while (ip < iend) { + + unsigned token; + size_t length; + + /* get runlength */ + token = *ip++; + length = (token >> ML_BITS); + if (length == RUN_MASK) { + int s = 255; + while ((ip < iend) && (s == 255)) { + s = *ip++; + length += s; + } + } + /* copy literals */ + cpy = op + length; + if ((cpy > oend - COPYLENGTH) || + (ip + length > iend - COPYLENGTH)) { + + if (cpy > oend) + goto _output_error;/* writes beyond buffer */ + + if (ip + length != iend) + goto _output_error;/* + * Error: LZ4 format requires + * to consume all input + * at this stage + */ + memcpy(op, ip, length); + op += length; + break;/* Necessarily EOF, due to parsing restrictions */ + } + LZ4_WILDCOPY(ip, op, cpy); + ip -= (op - cpy); + op = cpy; + + /* get offset */ + LZ4_READ_LITTLEENDIAN_16(ref, cpy, ip); + ip += 2; + if (ref < (BYTE * const) dest) + goto _output_error; + /* + * Error : offset creates reference + * outside of destination buffer + */ + + /* get matchlength */ + length = (token & ML_MASK); + if (length == ML_MASK) { + while (ip < iend) { + int s = *ip++; + length += s; + if (s == 255) + continue; + break; + } + } + + /* copy repeated sequence */ + if (unlikely((op - ref) < STEPSIZE)) { +#if LZ4_ARCH64 + size_t dec64 = dec64table[op - ref]; +#else + const int dec64 = 0; +#endif + op[0] = ref[0]; + op[1] = ref[1]; + op[2] = ref[2]; + op[3] = ref[3]; + op += 4; + ref += 4; + ref -= dec32table[op - ref]; + PUT4(ref, op); + op += STEPSIZE - 4; + ref -= dec64; + } else { + LZ4_COPYSTEP(ref, op); + } + cpy = op + length - (STEPSIZE-4); + if (cpy > oend - COPYLENGTH) { + if (cpy > oend) + goto _output_error; /* write outside of buf */ + + LZ4_SECURECOPY(ref, op, (oend - COPYLENGTH)); + while (op < cpy) + *op++ = *ref++; + op = cpy; + /* + * Check EOF (should never happen, since last 5 bytes + * are supposed to be literals) + */ + if (op == oend) + goto _output_error; + continue; + } + LZ4_SECURECOPY(ref, op, cpy); + op = cpy; /* correction */ + } + /* end of decoding */ + return (int) (((char *) op) - dest); + + /* write overflow error detected */ +_output_error: + return (int) (-(((char *) ip) - source)); +} +#endif + +STATIC int INIT lz4_decompress(const unsigned char *src, size_t *src_len, + unsigned char *dest, size_t actual_dest_len) +{ + int ret = -1; + int input_len = 0; + + input_len = lz4_uncompress(src, dest, actual_dest_len); + if (input_len < 0) + goto exit_0; + *src_len = input_len; + + return 0; +exit_0: + return ret; +} + +#ifndef __XEN__ +int lz4_decompress_unknownoutputsize(const char *src, size_t src_len, + char *dest, size_t *dest_len) +{ + int ret = -1; + int out_len = 0; + + out_len = lz4_uncompress_unknownoutputsize(src, dest, src_len, + *dest_len); + if (out_len < 0) + goto exit_0; + *dest_len = out_len; + + return 0; +exit_0: + return ret; +} +#endif --- /dev/null +++ b/xen/common/lz4/defs.h @@ -0,0 +1,184 @@ +/* + * lz4defs.h -- architecture specific defines + * + * Copyright (C) 2013, LG Electronics, Kyungsik Lee <kyungsik.lee@lge.com> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + */ + +#ifdef __XEN__ +#include <asm/byteorder.h> +#endif + +#ifdef __LITTLE_ENDIAN +static inline u16 INIT get_unaligned_le16(const void *p) +{ + return le16_to_cpup(p); +} + +static inline u32 INIT get_unaligned_le32(const void *p) +{ + return le32_to_cpup(p); +} +#else +#include <asm/unaligned.h> + +static inline u16 INIT get_unaligned_le16(const void *p) +{ + return le16_to_cpu(__get_unaligned(p, 2)); +} + +static inline u32 INIT get_unaligned_le32(void *p) +{ + return le32_to_cpu(__get_unaligned(p, 4)); +} +#endif + +/* + * Detects 64 bits mode + */ +#if (defined(__x86_64__) || defined(__x86_64) || defined(__amd64__) \ + || defined(__ppc64__) || defined(__LP64__)) +#define LZ4_ARCH64 1 +#else +#define LZ4_ARCH64 0 +#endif + +/* + * Architecture-specific macros + */ +#define BYTE u8 +typedef struct _U16_S { u16 v; } U16_S; +typedef struct _U32_S { u32 v; } U32_S; +typedef struct _U64_S { u64 v; } U64_S; +#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) \ + || defined(CONFIG_ARM) && __LINUX_ARM_ARCH__ >= 6 \ + && defined(ARM_EFFICIENT_UNALIGNED_ACCESS) + +#define A16(x) (((U16_S *)(x))->v) +#define A32(x) (((U32_S *)(x))->v) +#define A64(x) (((U64_S *)(x))->v) + +#define PUT4(s, d) (A32(d) = A32(s)) +#define PUT8(s, d) (A64(d) = A64(s)) +#define LZ4_WRITE_LITTLEENDIAN_16(p, v) \ + do { \ + A16(p) = v; \ + p += 2; \ + } while (0) +#else /* CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS */ + +#define A64(x) get_unaligned((u64 *)&(((U16_S *)(x))->v)) +#define A32(x) get_unaligned((u32 *)&(((U16_S *)(x))->v)) +#define A16(x) get_unaligned((u16 *)&(((U16_S *)(x))->v)) + +#define PUT4(s, d) \ + put_unaligned(get_unaligned((const u32 *) s), (u32 *) d) +#define PUT8(s, d) \ + put_unaligned(get_unaligned((const u64 *) s), (u64 *) d) + +#define LZ4_WRITE_LITTLEENDIAN_16(p, v) \ + do { \ + put_unaligned(v, (u16 *)(p)); \ + p += 2; \ + } while (0) +#endif + +#define COPYLENGTH 8 +#define ML_BITS 4 +#define ML_MASK ((1U << ML_BITS) - 1) +#define RUN_BITS (8 - ML_BITS) +#define RUN_MASK ((1U << RUN_BITS) - 1) +#define MEMORY_USAGE 14 +#define MINMATCH 4 +#define SKIPSTRENGTH 6 +#define LASTLITERALS 5 +#define MFLIMIT (COPYLENGTH + MINMATCH) +#define MINLENGTH (MFLIMIT + 1) +#define MAXD_LOG 16 +#define MAXD (1 << MAXD_LOG) +#define MAXD_MASK (u32)(MAXD - 1) +#define MAX_DISTANCE (MAXD - 1) +#define HASH_LOG (MAXD_LOG - 1) +#define HASHTABLESIZE (1 << HASH_LOG) +#define MAX_NB_ATTEMPTS 256 +#define OPTIMAL_ML (int)((ML_MASK-1)+MINMATCH) +#define LZ4_64KLIMIT ((1<<16) + (MFLIMIT - 1)) +#define HASHLOG64K ((MEMORY_USAGE - 2) + 1) +#define HASH64KTABLESIZE (1U << HASHLOG64K) +#define LZ4_HASH_VALUE(p) (((A32(p)) * 2654435761U) >> \ + ((MINMATCH * 8) - (MEMORY_USAGE-2))) +#define LZ4_HASH64K_VALUE(p) (((A32(p)) * 2654435761U) >> \ + ((MINMATCH * 8) - HASHLOG64K)) +#define HASH_VALUE(p) (((A32(p)) * 2654435761U) >> \ + ((MINMATCH * 8) - HASH_LOG)) + +#if LZ4_ARCH64/* 64-bit */ +#define STEPSIZE 8 + +#define LZ4_COPYSTEP(s, d) \ + do { \ + PUT8(s, d); \ + d += 8; \ + s += 8; \ + } while (0) + +#define LZ4_COPYPACKET(s, d) LZ4_COPYSTEP(s, d) + +#define LZ4_SECURECOPY(s, d, e) \ + do { \ + if (d < e) { \ + LZ4_WILDCOPY(s, d, e); \ + } \ + } while (0) +#define HTYPE u32 + +#ifdef __BIG_ENDIAN +#define LZ4_NBCOMMONBYTES(val) (__builtin_clzll(val) >> 3) +#else +#define LZ4_NBCOMMONBYTES(val) (__builtin_ctzll(val) >> 3) +#endif + +#else /* 32-bit */ +#define STEPSIZE 4 + +#define LZ4_COPYSTEP(s, d) \ + do { \ + PUT4(s, d); \ + d += 4; \ + s += 4; \ + } while (0) + +#define LZ4_COPYPACKET(s, d) \ + do { \ + LZ4_COPYSTEP(s, d); \ + LZ4_COPYSTEP(s, d); \ + } while (0) + +#define LZ4_SECURECOPY LZ4_WILDCOPY +#define HTYPE const u8* + +#ifdef __BIG_ENDIAN +#define LZ4_NBCOMMONBYTES(val) (__builtin_clz(val) >> 3) +#else +#define LZ4_NBCOMMONBYTES(val) (__builtin_ctz(val) >> 3) +#endif + +#endif + +#define LZ4_READ_LITTLEENDIAN_16(d, s, p) \ + (d = s - get_unaligned_le16(p)) + +#define LZ4_WILDCOPY(s, d, e) \ + do { \ + LZ4_COPYPACKET(s, d); \ + } while (d < e) + +#define LZ4_BLINDCOPY(s, d, l) \ + do { \ + u8 *e = (d) + l; \ + LZ4_WILDCOPY(s, d, e); \ + d = e; \ + } while (0) --- /dev/null +++ b/xen/common/unlz4.c @@ -0,0 +1,158 @@ +/* + * Wrapper for decompressing LZ4-compressed kernel, initramfs, and initrd + * + * Copyright (C) 2013, LG Electronics, Kyungsik Lee <kyungsik.lee@lge.com> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + */ + +#include "decompress.h" +#include <xen/lz4.h> +#include "lz4/decompress.c" + +/* + * Note: Uncompressed chunk size is used in the compressor side + * (userspace side for compression). + * It is hardcoded because there is not proper way to extract it + * from the binary stream which is generated by the preliminary + * version of LZ4 tool so far. + */ +#define LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE (8 << 20) +#define ARCHIVE_MAGICNUMBER 0x184C2102 + +STATIC int INIT unlz4(unsigned char *input, unsigned int in_len, + int (*fill)(void *, unsigned int), + int (*flush)(void *, unsigned int), + unsigned char *output, + unsigned int *posp, + void (*error)(const char *x)) +{ + int ret = -1; + size_t chunksize = 0; + size_t uncomp_chunksize = LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE; + u8 *inp; + u8 *inp_start; + u8 *outp; + int size = in_len -= 4; + size_t out_len = get_unaligned_le32(input + in_len); + size_t dest_len; + + + if (output) { + outp = output; + } else if (!flush) { + error("NULL output pointer and no flush function provided"); + goto exit_0; + } else { + outp = large_malloc(uncomp_chunksize); + if (!outp) { + error("Could not allocate output buffer"); + goto exit_0; + } + } + + if (input && fill) { + error("Both input pointer and fill function provided,"); + goto exit_1; + } else if (input) { + inp = input; + } else if (!fill) { + error("NULL input pointer and missing fill function"); + goto exit_1; + } else { + inp = large_malloc(lz4_compressbound(uncomp_chunksize)); + if (!inp) { + error("Could not allocate input buffer"); + goto exit_1; + } + } + inp_start = inp; + + if (posp) + *posp = 0; + + if (fill) + fill(inp, 4); + + chunksize = get_unaligned_le32(inp); + if (chunksize == ARCHIVE_MAGICNUMBER) { + inp += 4; + size -= 4; + } else { + error("invalid header"); + goto exit_2; + } + + if (posp) + *posp += 4; + + for (;;) { + + if (fill) + fill(inp, 4); + + chunksize = get_unaligned_le32(inp); + if (chunksize == ARCHIVE_MAGICNUMBER) { + inp += 4; + size -= 4; + if (posp) + *posp += 4; + continue; + } + inp += 4; + size -= 4; + + if (posp) + *posp += 4; + + if (fill) { + if (chunksize > lz4_compressbound(uncomp_chunksize)) { + error("chunk length is longer than allocated"); + goto exit_2; + } + fill(inp, chunksize); + } + if (out_len >= uncomp_chunksize) { + dest_len = uncomp_chunksize; + out_len -= dest_len; + } else + dest_len = out_len; + ret = lz4_decompress(inp, &chunksize, outp, dest_len); + if (ret < 0) { + error("Decoding failed"); + goto exit_2; + } + + if (flush && flush(outp, dest_len) != dest_len) + goto exit_2; + if (output) + outp += dest_len; + if (posp) + *posp += chunksize; + + size -= chunksize; + + if (size == 0) + break; + else if (size < 0) { + error("data corrupted"); + goto exit_2; + } + + inp += chunksize; + if (fill) + inp = inp_start; + } + + ret = 0; +exit_2: + if (!input) + large_free(inp_start); +exit_1: + if (!output) + large_free(outp); +exit_0: + return ret; +} --- a/xen/include/asm-x86/config.h +++ b/xen/include/asm-x86/config.h @@ -25,6 +25,7 @@ #define CONFIG_X86_PM_TIMER 1 #define CONFIG_HPET_TIMER 1 #define CONFIG_X86_MCE_THERMAL 1 +#define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS 1 #define CONFIG_NUMA 1 #define CONFIG_DISCONTIGMEM 1 #define CONFIG_NUMA_EMU 1 --- a/xen/include/xen/decompress.h +++ b/xen/include/xen/decompress.h @@ -31,7 +31,7 @@ typedef int decompress_fn(unsigned char * dependent). */ -decompress_fn bunzip2, unxz, unlzma, unlzo; +decompress_fn bunzip2, unxz, unlzma, unlzo, unlz4; int decompress(void *inbuf, unsigned int len, void *outbuf); --- /dev/null +++ b/xen/include/xen/lz4.h @@ -0,0 +1,90 @@ +#ifndef __LZ4_H__ +#define __LZ4_H__ + +#include <xen/types.h> + +/* + * LZ4 Hypervisor Interface + * + * Copyright (C) 2013, LG Electronics, Kyungsik Lee <kyungsik.lee@lge.com> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + */ +#define LZ4_MEM_COMPRESS (4096 * sizeof(unsigned char *)) +#define LZ4HC_MEM_COMPRESS (65538 * sizeof(unsigned char *)) + +/* + * lz4_compressbound() + * Provides the maximum size that LZ4 may output in a "worst case" scenario + * (input data not compressible) + */ +static inline size_t lz4_compressbound(size_t isize) +{ + return isize + (isize / 255) + 16; +} + +/* + * lz4_compress() + * src : source address of the original data + * src_len : size of the original data + * dst : output buffer address of the compressed data + * This requires ''dst'' of size LZ4_COMPRESSBOUND. + * dst_len : is the output size, which is returned after compress done + * workmem : address of the working memory. + * This requires ''workmem'' of size LZ4_MEM_COMPRESS. + * return : Success if return 0 + * Error if return (< 0) + * note : Destination buffer and workmem must be already allocated with + * the defined size. + */ +int lz4_compress(const unsigned char *src, size_t src_len, + unsigned char *dst, size_t *dst_len, void *wrkmem); + + /* + * lz4hc_compress() + * src : source address of the original data + * src_len : size of the original data + * dst : output buffer address of the compressed data + * This requires ''dst'' of size LZ4_COMPRESSBOUND. + * dst_len : is the output size, which is returned after compress done + * workmem : address of the working memory. + * This requires ''workmem'' of size LZ4HC_MEM_COMPRESS. + * return : Success if return 0 + * Error if return (< 0) + * note : Destination buffer and workmem must be already allocated with + * the defined size. + */ +int lz4hc_compress(const unsigned char *src, size_t src_len, + unsigned char *dst, size_t *dst_len, void *wrkmem); + +/* + * lz4_decompress() + * src : source address of the compressed data + * src_len : is the input size, whcih is returned after decompress done + * dest : output buffer address of the decompressed data + * actual_dest_len: is the size of uncompressed data, supposing it''s known + * return : Success if return 0 + * Error if return (< 0) + * note : Destination buffer must be already allocated. + * slightly faster than lz4_decompress_unknownoutputsize() + */ +int lz4_decompress(const unsigned char *src, size_t *src_len, + unsigned char *dest, size_t actual_dest_len); + +/* + * lz4_decompress_unknownoutputsize() + * src : source address of the compressed data + * src_len : is the input size, therefore the compressed size + * dest : output buffer address of the decompressed data + * dest_len: is the max size of the destination buffer, which is + * returned with actual size of decompressed data after + * decompress done + * return : Success if return 0 + * Error if return (< 0) + * note : Destination buffer must be already allocated. + */ +int lz4_decompress_unknownoutputsize(const char *src, size_t src_len, + char *dest, size_t *dest_len); +#endif _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Since there''s no shared or static library to link against, this simply re-uses the hypervisor side code. However, I only audited the code added here for possible security issues, not the referenced code in the hypervisor tree. Signed-off-by: Jan Beulich <jbeulich@suse.com> --- I intentionally retained the tab indentation in the code cloned from its hypervisor original (which in turn retained it as being a clone of the Linux original), to ease diff-ing. Perhaps this should therefore be considered RFC. --- a/tools/libxc/xc_dom_bzimageloader.c +++ b/tools/libxc/xc_dom_bzimageloader.c @@ -573,6 +573,124 @@ int xc_try_xz_decode(struct xc_dom_image #endif /* !__MINIOS__ */ +#define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS +#define STATIC static +#define u8 uint8_t +#define u16 uint16_t +#define u32 uint32_t +#define u64 uint64_t +#define INIT +#define unlikely(x) (x) + +static inline uint_fast16_t le16_to_cpup(const unsigned char *buf) +{ + return buf[0] | (buf[1] << 8); +} + +static inline uint_fast32_t le32_to_cpup(const unsigned char *buf) +{ + return le16_to_cpup(buf) | ((uint32_t)le16_to_cpup(buf + 2) << 16); +} + +#include "../../xen/common/lz4/decompress.c" + +/* + * Note: Uncompressed chunk size is used in the compressor side + * (userspace side for compression). + * It is hardcoded because there is not proper way to extract it + * from the binary stream which is generated by the preliminary + * version of LZ4 tool so far. + */ +#define LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE (8 << 20) +#define ARCHIVE_MAGICNUMBER 0x184C2102 + +static int xc_try_lz4_decode( + struct xc_dom_image *dom, void **blob, size_t *psize) +{ + int ret = -1; + size_t uncomp_chunksize = LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE; + unsigned char *inp = *blob, *output, *outp; + ssize_t size = *psize - 4; + size_t out_size, out_len, dest_len, chunksize; + const char *msg; + + if (size < 4) { + msg = "input too small"; + goto exit_0; + } + + out_size = out_len = get_unaligned_le32(inp + size); + if (xc_dom_kernel_check_size(dom, out_len)) { + msg = "Decompressed image too large"; + goto exit_0; + } + + output = malloc(out_len); + if (!output) { + msg = "Could not allocate output buffer"; + goto exit_0; + } + outp = output; + + chunksize = get_unaligned_le32(inp); + if (chunksize == ARCHIVE_MAGICNUMBER) { + inp += 4; + size -= 4; + } else { + msg = "invalid header"; + goto exit_2; + } + + for (;;) { + if (size < 4) { + msg = "missing data"; + goto exit_2; + } + chunksize = get_unaligned_le32(inp); + if (chunksize == ARCHIVE_MAGICNUMBER) { + inp += 4; + size -= 4; + continue; + } + inp += 4; + size -= 4; + + if (out_len >= uncomp_chunksize) { + dest_len = uncomp_chunksize; + out_len -= dest_len; + } else + dest_len = out_len; + ret = lz4_decompress(inp, &chunksize, outp, dest_len); + if (ret < 0) { + msg = "decoding failed"; + goto exit_2; + } + + outp += dest_len; + size -= chunksize; + + if (size == 0) + { + *blob = output; + *psize = out_size; + return 0; + } + + if (size < 0) { + msg = "data corrupted"; + goto exit_2; + } + + inp += chunksize; + } + +exit_2: + free(output); + DOMPRINTF("LZ4 decompression error: %s\n", msg); +exit_0: + return ret; +} + struct setup_header { uint8_t _pad0[0x1f1]; /* skip uninteresting stuff */ uint8_t setup_sects; @@ -733,6 +851,17 @@ static int xc_dom_probe_bzimage_kernel(s return -EINVAL; } } + else if ( check_magic(dom, "\x02\x21", 2) ) + { + ret = xc_try_lz4_decode(dom, &dom->kernel_blob, &dom->kernel_size); + if ( ret < 0 ) + { + xc_dom_panic(dom->xch, XC_INVALID_KERNEL, + "%s unable to LZ4 decompress kernel\n", + __FUNCTION__); + return -EINVAL; + } + } else { xc_dom_panic(dom->xch, XC_INVALID_KERNEL, --- a/xen/common/lz4/decompress.c +++ b/xen/common/lz4/decompress.c @@ -151,7 +151,7 @@ _output_error: return (int) (-(((unsigned char *)ip) - source)); } -#ifndef __XEN__ +#if !defined(__XEN__) && !defined(__XEN_TOOLS__) static int lz4_uncompress_unknownoutputsize(const char *source, char *dest, int isize, size_t maxoutputsize) { @@ -294,7 +294,7 @@ exit_0: return ret; } -#ifndef __XEN__ +#if !defined(__XEN__) && !defined(__XEN_TOOLS__) int lz4_decompress_unknownoutputsize(const char *src, size_t src_len, char *dest, size_t *dest_len) { _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
On Mon, 2013-09-23 at 15:56 +0100, Jan Beulich wrote:> Since there''s no shared or static library to link against, this simply > re-uses the hypervisor side code. However, I only audited the code > added here for possible security issues, not the referenced code in > the hypervisor tree.With that in mind, you aren''t adding a tools/libxc/xc_dom_decompress_unsafe_lz4.c which would enable this stuff in pvgrub domains.> > Signed-off-by: Jan Beulich <jbeulich@suse.com> > --- > I intentionally retained the tab indentation in the code cloned from > its hypervisor original (which in turn retained it as being a clone > of the Linux original), to ease diff-ing. Perhaps this should therefore > be considered RFC.That''s fine IMHO.> > --- a/tools/libxc/xc_dom_bzimageloader.c > +++ b/tools/libxc/xc_dom_bzimageloader.c > @@ -573,6 +573,124 @@ int xc_try_xz_decode(struct xc_dom_image > > #endif /* !__MINIOS__ */ > > +#define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS > +#define STATIC static > +#define u8 uint8_t > +#define u16 uint16_t > +#define u32 uint32_t > +#define u64 uint64_t > +#define INIT > +#define unlikely(x) (x) > + > +static inline uint_fast16_t le16_to_cpup(const unsigned char *buf) > +{ > + return buf[0] | (buf[1] << 8); > +} > + > +static inline uint_fast32_t le32_to_cpup(const unsigned char *buf) > +{ > + return le16_to_cpup(buf) | ((uint32_t)le16_to_cpup(buf + 2) << 16); > +} > + > +#include "../../xen/common/lz4/decompress.c" > + > +/* > + * Note: Uncompressed chunk size is used in the compressor side > + * (userspace side for compression). > + * It is hardcoded because there is not proper way to extract it > + * from the binary stream which is generated by the preliminary > + * version of LZ4 tool so far. > + */ > +#define LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE (8 << 20) > +#define ARCHIVE_MAGICNUMBER 0x184C2102 > + > +static int xc_try_lz4_decode( > + struct xc_dom_image *dom, void **blob, size_t *psize) > +{ > + int ret = -1; > + size_t uncomp_chunksize = LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE; > + unsigned char *inp = *blob, *output, *outp; > + ssize_t size = *psize - 4; > + size_t out_size, out_len, dest_len, chunksize; > + const char *msg; > + > + if (size < 4) { > + msg = "input too small"; > + goto exit_0; > + } > + > + out_size = out_len = get_unaligned_le32(inp + size); > + if (xc_dom_kernel_check_size(dom, out_len)) { > + msg = "Decompressed image too large"; > + goto exit_0; > + } > + > + output = malloc(out_len); > + if (!output) { > + msg = "Could not allocate output buffer"; > + goto exit_0; > + } > + outp = output; > + > + chunksize = get_unaligned_le32(inp); > + if (chunksize == ARCHIVE_MAGICNUMBER) { > + inp += 4; > + size -= 4; > + } else { > + msg = "invalid header"; > + goto exit_2; > + } > + > + for (;;) { > + if (size < 4) { > + msg = "missing data"; > + goto exit_2; > + } > + chunksize = get_unaligned_le32(inp); > + if (chunksize == ARCHIVE_MAGICNUMBER) { > + inp += 4; > + size -= 4; > + continue; > + } > + inp += 4; > + size -= 4; > + > + if (out_len >= uncomp_chunksize) { > + dest_len = uncomp_chunksize; > + out_len -= dest_len; > + } else > + dest_len = out_len; > + ret = lz4_decompress(inp, &chunksize, outp, dest_len); > + if (ret < 0) { > + msg = "decoding failed"; > + goto exit_2; > + } > + > + outp += dest_len; > + size -= chunksize; > + > + if (size == 0) > + { > + *blob = output; > + *psize = out_size; > + return 0; > + } > + > + if (size < 0) { > + msg = "data corrupted"; > + goto exit_2; > + } > + > + inp += chunksize; > + } > + > +exit_2: > + free(output); > + DOMPRINTF("LZ4 decompression error: %s\n", msg); > +exit_0: > + return ret; > +} > + > struct setup_header { > uint8_t _pad0[0x1f1]; /* skip uninteresting stuff */ > uint8_t setup_sects; > @@ -733,6 +851,17 @@ static int xc_dom_probe_bzimage_kernel(s > return -EINVAL; > } > } > + else if ( check_magic(dom, "\x02\x21", 2) ) > + { > + ret = xc_try_lz4_decode(dom, &dom->kernel_blob, &dom->kernel_size); > + if ( ret < 0 ) > + { > + xc_dom_panic(dom->xch, XC_INVALID_KERNEL, > + "%s unable to LZ4 decompress kernel\n", > + __FUNCTION__); > + return -EINVAL; > + } > + } > else > { > xc_dom_panic(dom->xch, XC_INVALID_KERNEL, > --- a/xen/common/lz4/decompress.c > +++ b/xen/common/lz4/decompress.c > @@ -151,7 +151,7 @@ _output_error: > return (int) (-(((unsigned char *)ip) - source)); > } > > -#ifndef __XEN__ > +#if !defined(__XEN__) && !defined(__XEN_TOOLS__) > static int lz4_uncompress_unknownoutputsize(const char *source, char *dest, > int isize, size_t maxoutputsize) > { > @@ -294,7 +294,7 @@ exit_0: > return ret; > } > > -#ifndef __XEN__ > +#if !defined(__XEN__) && !defined(__XEN_TOOLS__) > int lz4_decompress_unknownoutputsize(const char *src, size_t src_len, > char *dest, size_t *dest_len) > { > >
On Mon, 2013-09-23 at 15:54 +0100, Jan Beulich wrote:> Add support for LZ4 decompression in Xen. LZ4 Decompression APIs for > Xen are based on LZ4 implementation by Yann Collet. > > Benchmark Results(PATCH v3) > Compiler: Linaro ARM gcc 4.6.2 > > 1. ARMv7, 1.5GHz based board > [...] > 2. ARMv7, 1.7GHz based boardBTW Xen doesn''t use these decompressors on ARM, the kernel self extracts in guest context. I see you aren''t adding this to the ARM build anyway, just thought I''d explain ;-) [...]> Russell King said that we should have: > > - one decompressor which is the fastest > - one decompressor for the highest compression ratio > - one popular decompressor (eg conventional gzip) > > If we have a replacement one for one of these, then it should do exactly > that: replace it.What a shame this logic doesn''t apply to Xen''s use too since we need to support old kernels :-( Ian.
>>> On 23.09.13 at 17:00, Ian Campbell <Ian.Campbell@citrix.com> wrote: > On Mon, 2013-09-23 at 15:54 +0100, Jan Beulich wrote: >> Add support for LZ4 decompression in Xen. LZ4 Decompression APIs for >> Xen are based on LZ4 implementation by Yann Collet. >> >> Benchmark Results(PATCH v3) >> Compiler: Linaro ARM gcc 4.6.2 >> >> 1. ARMv7, 1.5GHz based board >> [...] >> 2. ARMv7, 1.7GHz based board > > BTW Xen doesn''t use these decompressors on ARM, the kernel self extracts > in guest context. I see you aren''t adding this to the ARM build anyway, > just thought I''d explain ;-)Yeah, I just copied the Linux commit message mostly verbatim.>> Russell King said that we should have: >> >> - one decompressor which is the fastest >> - one decompressor for the highest compression ratio >> - one popular decompressor (eg conventional gzip) >> >> If we have a replacement one for one of these, then it should do exactly >> that: replace it. > > What a shame this logic doesn''t apply to Xen''s use too since we need to > support old kernels :-(And they didn''t rip out the LZO one either... Jan
>>> On 23.09.13 at 17:00, Ian Campbell <Ian.Campbell@citrix.com> wrote: > On Mon, 2013-09-23 at 15:56 +0100, Jan Beulich wrote: >> Since there''s no shared or static library to link against, this simply >> re-uses the hypervisor side code. However, I only audited the code >> added here for possible security issues, not the referenced code in >> the hypervisor tree. > > With that in mind, you aren''t adding a > tools/libxc/xc_dom_decompress_unsafe_lz4.c which would enable this stuff > in pvgrub domains.I''d prefer to leave this to someone who knows what he''s doing; I don''t think I even know when the respective build portion gets enabled. Does this hang off the stubdom build (which I try to avoid because of the big number of build warnings it produces, all giving the impression that I''m not doing this right)? Despite it being guarded by __MINIOS__ checks, I don''t think the mini-os build alone triggers this. And then again I specifically placed the newly added code outside the __MINIOS__ guarded region, as from looking at the code I concluded that these "unsafe" decompressors do nothing else than what I did here: Include the hypervisor sources. So I was sort of hoping that the extra stuff wouldn''t even be necessary here. Jan
On Mon, 2013-09-23 at 16:22 +0100, Jan Beulich wrote:> >>> On 23.09.13 at 17:00, Ian Campbell <Ian.Campbell@citrix.com> wrote: > > On Mon, 2013-09-23 at 15:56 +0100, Jan Beulich wrote: > >> Since there''s no shared or static library to link against, this simply > >> re-uses the hypervisor side code. However, I only audited the code > >> added here for possible security issues, not the referenced code in > >> the hypervisor tree. > > > > With that in mind, you aren''t adding a > > tools/libxc/xc_dom_decompress_unsafe_lz4.c which would enable this stuff > > in pvgrub domains. > > I''d prefer to leave this to someone who knows what he''s doing; > I don''t think I even know when the respective build portion gets > enabled. Does this hang off the stubdom buildYes.> (which I try to > avoid because of the big number of build warnings it produces, > all giving the impression that I''m not doing this right)?The warnings are "normal". I think they mostly come from the third party stuff which stubdom builds.> Despite it > being guarded by __MINIOS__ checks, I don''t think the mini-os > build alone triggers this.No, you would need one of the stubdoms which enables libxc.> And then again I specifically placed the newly added code outside > the __MINIOS__ guarded region, as from looking at the code I > concluded that these "unsafe" decompressors do nothing else > than what I did here: Include the hypervisor sources. So I was > sort of hoping that the extra stuff wouldn''t even be necessary > here.I''ll give it a go and see what happens... Ian.
>>> On 24.09.13 at 11:34, Ian Campbell <Ian.Campbell@citrix.com> wrote: > On Mon, 2013-09-23 at 16:22 +0100, Jan Beulich wrote: >> (which I try to >> avoid because of the big number of build warnings it produces, >> all giving the impression that I''m not doing this right)? > > The warnings are "normal". I think they mostly come from the third party > stuff which stubdom builds.I realize they''re "normal", but that way "not normal" ones are generally hidden in the noise of all the existing ones, which results in a build that I personally don''t have much faith in (i.e. seeing a successful build doesn''t really mean much). Jan
On Tue, 2013-09-24 at 10:49 +0100, Jan Beulich wrote:> >>> On 24.09.13 at 11:34, Ian Campbell <Ian.Campbell@citrix.com> wrote: > > On Mon, 2013-09-23 at 16:22 +0100, Jan Beulich wrote: > >> (which I try to > >> avoid because of the big number of build warnings it produces, > >> all giving the impression that I''m not doing this right)? > > > > The warnings are "normal". I think they mostly come from the third party > > stuff which stubdom builds. > > I realize they''re "normal", but that way "not normal" ones are > generally hidden in the noise of all the existing ones, which > results in a build that I personally don''t have much faith in (i.e. > seeing a successful build doesn''t really mean much).Under the circumstances I''d normally diff the before and after logs. Ian.
On Tue, 2013-09-24 at 10:34 +0100, Ian Campbell wrote:> I''ll give it a go and see what happens...gcc [...] -c -o xc_dom_bzimageloader.o xc_dom_bzimageloader.c In file included from ../../xen/common/lz4/decompress.c:40:0, from xc_dom_bzimageloader.c:595: ../../xen/common/lz4/defs.h:26:27: fatal error: asm/unaligned.h: No such file or directory compilation terminated. make[2]: *** [xc_dom_bzimageloader.o] Error 1 make[2]: Leaving directory `/local/scratch/ianc/devel/xen.git/stubdom/libxc-x86_64'' make[1]: *** [build] Error 2 make[1]: Leaving directory `/local/scratch/ianc/devel/xen.git/stubdom/libxc-x86_64'' make: *** [libxc-x86_64/libxenctrl.a] Error 2 make: Leaving directory `/local/scratch/ianc/devel/xen.git/stubdom'' This incremental hunk fixes it. xen/common/unlzo.c uses an uglier #if 1, but this seems correct to me. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Feel free to fold it in. diff --git a/xen/common/lz4/defs.h b/xen/common/lz4/defs.h index f46df08..dca9519 100644 --- a/xen/common/lz4/defs.h +++ b/xen/common/lz4/defs.h @@ -12,6 +12,10 @@ #include <asm/byteorder.h> #endif +#ifdef __MINIOS__ +#include <mini-os/byteorder.h> +#endif + #ifdef __LITTLE_ENDIAN static inline u16 INIT get_unaligned_le16(const void *p) { With that and a hack to not inline xc_try_lz4_decode I see the expected symbols in stubdom/mini-os-x86_64-grub/mini-os (I''ve not actually run it) BTW, I noticed that xc_try_lz4_decode does not prefix its errors with "LZ4" like most of the other such functions do. Ian.
>>> On 24.09.13 at 12:01, Ian Campbell <Ian.Campbell@citrix.com> wrote: > On Tue, 2013-09-24 at 10:49 +0100, Jan Beulich wrote: >> >>> On 24.09.13 at 11:34, Ian Campbell <Ian.Campbell@citrix.com> wrote: >> > On Mon, 2013-09-23 at 16:22 +0100, Jan Beulich wrote: >> >> (which I try to >> >> avoid because of the big number of build warnings it produces, >> >> all giving the impression that I''m not doing this right)? >> > >> > The warnings are "normal". I think they mostly come from the third party >> > stuff which stubdom builds. >> >> I realize they''re "normal", but that way "not normal" ones are >> generally hidden in the noise of all the existing ones, which >> results in a build that I personally don''t have much faith in (i.e. >> seeing a successful build doesn''t really mean much). > > Under the circumstances I''d normally diff the before and after logs.With parallel builds that usually isn''t very helpful. Jan
On Tue, 2013-09-24 at 11:17 +0100, Jan Beulich wrote:> >>> On 24.09.13 at 12:01, Ian Campbell <Ian.Campbell@citrix.com> wrote: > > On Tue, 2013-09-24 at 10:49 +0100, Jan Beulich wrote: > >> >>> On 24.09.13 at 11:34, Ian Campbell <Ian.Campbell@citrix.com> wrote: > >> > On Mon, 2013-09-23 at 16:22 +0100, Jan Beulich wrote: > >> >> (which I try to > >> >> avoid because of the big number of build warnings it produces, > >> >> all giving the impression that I''m not doing this right)? > >> > > >> > The warnings are "normal". I think they mostly come from the third party > >> > stuff which stubdom builds. > >> > >> I realize they''re "normal", but that way "not normal" ones are > >> generally hidden in the noise of all the existing ones, which > >> results in a build that I personally don''t have much faith in (i.e. > >> seeing a successful build doesn''t really mean much). > > > > Under the circumstances I''d normally diff the before and after logs. > > With parallel builds that usually isn''t very helpful.I usually either don''t do a parallel build in such cases or I sort the log first, which is enough to spot any new message... Ian.
>>> On 24.09.13 at 12:05, Ian Campbell <Ian.Campbell@citrix.com> wrote: > On Tue, 2013-09-24 at 10:34 +0100, Ian Campbell wrote: >> I''ll give it a go and see what happens... > > gcc [...] -c -o xc_dom_bzimageloader.o xc_dom_bzimageloader.c > In file included from ../../xen/common/lz4/decompress.c:40:0, > from xc_dom_bzimageloader.c:595: > ../../xen/common/lz4/defs.h:26:27: fatal error: asm/unaligned.h: No such > file or directory > compilation terminated. > make[2]: *** [xc_dom_bzimageloader.o] Error 1 > make[2]: Leaving directory > `/local/scratch/ianc/devel/xen.git/stubdom/libxc-x86_64'' > make[1]: *** [build] Error 2 > make[1]: Leaving directory > `/local/scratch/ianc/devel/xen.git/stubdom/libxc-x86_64'' > make: *** [libxc-x86_64/libxenctrl.a] Error 2 > make: Leaving directory `/local/scratch/ianc/devel/xen.git/stubdom'' > > This incremental hunk fixes it. xen/common/unlzo.c uses an uglier #if > 1, but this seems correct to me. > > Signed-off-by: Ian Campbell <ian.campbell@citrix.com> > > Feel free to fold it in.Done, albeit not in the xen/ subtree but instead in tools/libxc/xc_dom_bzimageloader.c itself. Thanks for helping out with this!> --- a/xen/common/lz4/defs.h > +++ b/xen/common/lz4/defs.h > @@ -12,6 +12,10 @@ > #include <asm/byteorder.h> > #endif > > +#ifdef __MINIOS__ > +#include <mini-os/byteorder.h> > +#endif > + > #ifdef __LITTLE_ENDIAN > static inline u16 INIT get_unaligned_le16(const void *p) > { > > With that and a hack to not inline xc_try_lz4_decode I see the expected > symbols in stubdom/mini-os-x86_64-grub/mini-os (I''ve not actually run > it) > > BTW, I noticed that xc_try_lz4_decode does not prefix its errors with > "LZ4" like most of the other such functions do.Just like the LZO code it simply stores a message in a local variable and has a single point where DOMPRINTF() gets used, and that single point does have a proper prefix. Jan
Since there''s no shared or static library to link against, this simply re-uses the hypervisor side code. However, I only audited the code added here for possible security issues, not the referenced code in the hypervisor tree. Signed-off-by: Jan Beulich <jbeulich@suse.com> --- v2: Fold in the addition of a previously missing include in the Mini-OS case - provided by Ian Campbell - to allow stubdom compilation to succeed. --- I intentionally retained the tab indentation in the code cloned from its hypervisor original (which in turn retained it as being a clone of the Linux original), to ease diff-ing. --- a/tools/libxc/xc_dom_bzimageloader.c +++ b/tools/libxc/xc_dom_bzimageloader.c @@ -566,6 +566,8 @@ static int xc_try_lzo1x_decode( #else /* __MINIOS__ */ +#include <mini-os/byteorder.h> + int xc_try_bzip2_decode(struct xc_dom_image *dom, void **blob, size_t *size); int xc_try_lzma_decode(struct xc_dom_image *dom, void **blob, size_t *size); int xc_try_lzo1x_decode(struct xc_dom_image *dom, void **blob, size_t *size); @@ -573,6 +575,124 @@ int xc_try_xz_decode(struct xc_dom_image #endif /* !__MINIOS__ */ +#define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS +#define STATIC static +#define u8 uint8_t +#define u16 uint16_t +#define u32 uint32_t +#define u64 uint64_t +#define INIT +#define unlikely(x) (x) + +static inline uint_fast16_t le16_to_cpup(const unsigned char *buf) +{ + return buf[0] | (buf[1] << 8); +} + +static inline uint_fast32_t le32_to_cpup(const unsigned char *buf) +{ + return le16_to_cpup(buf) | ((uint32_t)le16_to_cpup(buf + 2) << 16); +} + +#include "../../xen/common/lz4/decompress.c" + +/* + * Note: Uncompressed chunk size is used in the compressor side + * (userspace side for compression). + * It is hardcoded because there is not proper way to extract it + * from the binary stream which is generated by the preliminary + * version of LZ4 tool so far. + */ +#define LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE (8 << 20) +#define ARCHIVE_MAGICNUMBER 0x184C2102 + +static int xc_try_lz4_decode( + struct xc_dom_image *dom, void **blob, size_t *psize) +{ + int ret = -1; + size_t uncomp_chunksize = LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE; + unsigned char *inp = *blob, *output, *outp; + ssize_t size = *psize - 4; + size_t out_size, out_len, dest_len, chunksize; + const char *msg; + + if (size < 4) { + msg = "input too small"; + goto exit_0; + } + + out_size = out_len = get_unaligned_le32(inp + size); + if (xc_dom_kernel_check_size(dom, out_len)) { + msg = "Decompressed image too large"; + goto exit_0; + } + + output = malloc(out_len); + if (!output) { + msg = "Could not allocate output buffer"; + goto exit_0; + } + outp = output; + + chunksize = get_unaligned_le32(inp); + if (chunksize == ARCHIVE_MAGICNUMBER) { + inp += 4; + size -= 4; + } else { + msg = "invalid header"; + goto exit_2; + } + + for (;;) { + if (size < 4) { + msg = "missing data"; + goto exit_2; + } + chunksize = get_unaligned_le32(inp); + if (chunksize == ARCHIVE_MAGICNUMBER) { + inp += 4; + size -= 4; + continue; + } + inp += 4; + size -= 4; + + if (out_len >= uncomp_chunksize) { + dest_len = uncomp_chunksize; + out_len -= dest_len; + } else + dest_len = out_len; + ret = lz4_decompress(inp, &chunksize, outp, dest_len); + if (ret < 0) { + msg = "decoding failed"; + goto exit_2; + } + + outp += dest_len; + size -= chunksize; + + if (size == 0) + { + *blob = output; + *psize = out_size; + return 0; + } + + if (size < 0) { + msg = "data corrupted"; + goto exit_2; + } + + inp += chunksize; + } + +exit_2: + free(output); + DOMPRINTF("LZ4 decompression error: %s\n", msg); +exit_0: + return ret; +} + struct setup_header { uint8_t _pad0[0x1f1]; /* skip uninteresting stuff */ uint8_t setup_sects; @@ -733,6 +853,17 @@ static int xc_dom_probe_bzimage_kernel(s return -EINVAL; } } + else if ( check_magic(dom, "\x02\x21", 2) ) + { + ret = xc_try_lz4_decode(dom, &dom->kernel_blob, &dom->kernel_size); + if ( ret < 0 ) + { + xc_dom_panic(dom->xch, XC_INVALID_KERNEL, + "%s unable to LZ4 decompress kernel\n", + __FUNCTION__); + return -EINVAL; + } + } else { xc_dom_panic(dom->xch, XC_INVALID_KERNEL, --- a/xen/common/lz4/decompress.c +++ b/xen/common/lz4/decompress.c @@ -151,7 +151,7 @@ _output_error: return (int) (-(((unsigned char *)ip) - source)); } -#ifndef __XEN__ +#if !defined(__XEN__) && !defined(__XEN_TOOLS__) static int lz4_uncompress_unknownoutputsize(const char *source, char *dest, int isize, size_t maxoutputsize) { @@ -294,7 +294,7 @@ exit_0: return ret; } -#ifndef __XEN__ +#if !defined(__XEN__) && !defined(__XEN_TOOLS__) int lz4_decompress_unknownoutputsize(const char *src, size_t src_len, char *dest, size_t *dest_len) { _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
On Tue, 2013-09-24 at 17:29 +0100, Jan Beulich wrote:> Since there''s no shared or static library to link against, this simply > re-uses the hypervisor side code. However, I only audited the code > added here for possible security issues, not the referenced code in > the hypervisor tree. > > Signed-off-by: Jan Beulich <jbeulich@suse.com> > --- > v2: Fold in the addition of a previously missing include in the Mini-OS > case - provided by Ian Campbell - to allow stubdom compilation to > succeed. > --- > I intentionally retained the tab indentation in the code cloned from > its hypervisor original (which in turn retained it as being a clone > of the Linux original), to ease diff-ing. > > --- a/tools/libxc/xc_dom_bzimageloader.c > +++ b/tools/libxc/xc_dom_bzimageloader.c > @@ -566,6 +566,8 @@ static int xc_try_lzo1x_decode( > > #else /* __MINIOS__ */ > > +#include <mini-os/byteorder.h> > + > int xc_try_bzip2_decode(struct xc_dom_image *dom, void **blob, size_t *size); > int xc_try_lzma_decode(struct xc_dom_image *dom, void **blob, size_t *size); > int xc_try_lzo1x_decode(struct xc_dom_image *dom, void **blob, size_t *size); > @@ -573,6 +575,124 @@ int xc_try_xz_decode(struct xc_dom_image > > #endif /* !__MINIOS__ */ > > +#define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS > +#define STATIC static > +#define u8 uint8_t > +#define u16 uint16_t > +#define u32 uint32_t > +#define u64 uint64_t > +#define INIT > +#define unlikely(x) (x)I think rather than pollute this file with this sort of thing (which might have unexpected consequences in the future) this would be better of placed in a separate file compiled for both regular and stub use.> + > +static inline uint_fast16_t le16_to_cpup(const unsigned char *buf) > +{ > + return buf[0] | (buf[1] << 8); > +} > + > +static inline uint_fast32_t le32_to_cpup(const unsigned char *buf) > +{ > + return le16_to_cpup(buf) | ((uint32_t)le16_to_cpup(buf + 2) << 16); > +} > + > +#include "../../xen/common/lz4/decompress.c" > + > +/* > + * Note: Uncompressed chunk size is used in the compressor side > + * (userspace side for compression). > + * It is hardcoded because there is not proper way to extract it > + * from the binary stream which is generated by the preliminary > + * version of LZ4 tool so far. > + */ > +#define LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE (8 << 20) > +#define ARCHIVE_MAGICNUMBER 0x184C2102 > + > +static int xc_try_lz4_decode( > + struct xc_dom_image *dom, void **blob, size_t *psize) > +{ > + int ret = -1; > + size_t uncomp_chunksize = LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE; > + unsigned char *inp = *blob, *output, *outp; > + ssize_t size = *psize - 4; > + size_t out_size, out_len, dest_len, chunksize; > + const char *msg; > + > + if (size < 4) { > + msg = "input too small"; > + goto exit_0;The exit_0 loop is after the error print, so these errors don''t get printed. I don''t really like exit_0 and exit_2 as names (what happened to exit_1?). "err" and "err_free" would be more usual.> + } > + > + out_size = out_len = get_unaligned_le32(inp + size); > + if (xc_dom_kernel_check_size(dom, out_len)) { > + msg = "Decompressed image too large"; > + goto exit_0; > + } > + > + output = malloc(out_len); > + if (!output) { > + msg = "Could not allocate output buffer"; > + goto exit_0; > + } > + outp = output; > + > + chunksize = get_unaligned_le32(inp); > + if (chunksize == ARCHIVE_MAGICNUMBER) { > + inp += 4; > + size -= 4; > + } else { > + msg = "invalid header"; > + goto exit_2; > + } > + > + for (;;) { > + if (size < 4) { > + msg = "missing data"; > + goto exit_2; > + } > + chunksize = get_unaligned_le32(inp); > + if (chunksize == ARCHIVE_MAGICNUMBER) { > + inp += 4; > + size -= 4; > + continue; > + } > + inp += 4; > + size -= 4;We know size is at least 4 from the check at the head of the loop, but now we subtracted 4 so size could be 0, but lz4_decompress doesn''t take size as an argument so whatever it does it isn''t basing the return value in &chunksize on it, so I assume it reads over the end of the buffer via inp. I didn''t look to see what the minimum number of bytes which lz4_decompress will consume is, but I think we need a check of some description.> + > + if (out_len >= uncomp_chunksize) { > + dest_len = uncomp_chunksize; > + out_len -= dest_len; > + } else > + dest_len = out_len; > + ret = lz4_decompress(inp, &chunksize, outp, dest_len); > + if (ret < 0) { > + msg = "decoding failed"; > + goto exit_2; > + } > + > + outp += dest_len; > + size -= chunksize; > + > + if (size == 0) > + { > + *blob = output; > + *psize = out_size; > + return 0; > + } > + > + if (size < 0) { > + msg = "data corrupted"; > + goto exit_2; > + } > + > + inp += chunksize; > + } > + > +exit_2: > + free(output); > + DOMPRINTF("LZ4 decompression error: %s\n", msg); > +exit_0: > + return ret; > +} > + > struct setup_header { > uint8_t _pad0[0x1f1]; /* skip uninteresting stuff */ > uint8_t setup_sects; > @@ -733,6 +853,17 @@ static int xc_dom_probe_bzimage_kernel(s > return -EINVAL; > } > } > + else if ( check_magic(dom, "\x02\x21", 2) ) > + { > + ret = xc_try_lz4_decode(dom, &dom->kernel_blob, &dom->kernel_size); > + if ( ret < 0 ) > + { > + xc_dom_panic(dom->xch, XC_INVALID_KERNEL, > + "%s unable to LZ4 decompress kernel\n", > + __FUNCTION__); > + return -EINVAL; > + } > + } > else > { > xc_dom_panic(dom->xch, XC_INVALID_KERNEL, > --- a/xen/common/lz4/decompress.c > +++ b/xen/common/lz4/decompress.c > @@ -151,7 +151,7 @@ _output_error: > return (int) (-(((unsigned char *)ip) - source)); > } > > -#ifndef __XEN__ > +#if !defined(__XEN__) && !defined(__XEN_TOOLS__) > static int lz4_uncompress_unknownoutputsize(const char *source, char *dest, > int isize, size_t maxoutputsize) > { > @@ -294,7 +294,7 @@ exit_0: > return ret; > } > > -#ifndef __XEN__ > +#if !defined(__XEN__) && !defined(__XEN_TOOLS__) > int lz4_decompress_unknownoutputsize(const char *src, size_t src_len, > char *dest, size_t *dest_len) > { > >
Hi, Just for your information, regarding limit case of size 0 : lz4 decompression functions are routinely tested to decode a compressed block of null original size, and they handle this use case correctly. Note that, due to format definition, a block of size 0 is compressed into a compressed-block of size 1 (yeah, I know...). That byte, by the way, will be 0. Consequently, a compressed-block of size 0 is not supposed to exist (break format). For such a situation, the decompression functions should generate an error (negative function result). Now, I realize that what I just said apply to the generic version, hosted at http://code.google.com/p/lz4/ Since you are probably using the Linux kernel version, it could be interesting to check if this corner case still holds. Obviously, compressing a null-sized block is a questionable use case, and most lz4 implementations ensure this never happens . But well, there must be a definition for it, just in case it does nonetheless happen. The linux kernel version, while heavily based on original version, is a slightly modified one, most probably to respect some kernel rule requirement. That work was done by Kyungsik Lee, so you may want to include him in the discussion if you want to ask questions on his version. Best Regards Yann 2013/9/24 Ian Campbell <Ian.Campbell@citrix.com>> On Tue, 2013-09-24 at 17:29 +0100, Jan Beulich wrote: > > Since there''s no shared or static library to link against, this simply > > re-uses the hypervisor side code. However, I only audited the code > > added here for possible security issues, not the referenced code in > > the hypervisor tree. > > > > Signed-off-by: Jan Beulich <jbeulich@suse.com> > > --- > > v2: Fold in the addition of a previously missing include in the Mini-OS > > case - provided by Ian Campbell - to allow stubdom compilation to > > succeed. > > --- > > I intentionally retained the tab indentation in the code cloned from > > its hypervisor original (which in turn retained it as being a clone > > of the Linux original), to ease diff-ing. > > > > --- a/tools/libxc/xc_dom_bzimageloader.c > > +++ b/tools/libxc/xc_dom_bzimageloader.c > > @@ -566,6 +566,8 @@ static int xc_try_lzo1x_decode( > > > > #else /* __MINIOS__ */ > > > > +#include <mini-os/byteorder.h> > > + > > int xc_try_bzip2_decode(struct xc_dom_image *dom, void **blob, size_t > *size); > > int xc_try_lzma_decode(struct xc_dom_image *dom, void **blob, size_t > *size); > > int xc_try_lzo1x_decode(struct xc_dom_image *dom, void **blob, size_t > *size); > > @@ -573,6 +575,124 @@ int xc_try_xz_decode(struct xc_dom_image > > > > #endif /* !__MINIOS__ */ > > > > +#define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS > > +#define STATIC static > > +#define u8 uint8_t > > +#define u16 uint16_t > > +#define u32 uint32_t > > +#define u64 uint64_t > > +#define INIT > > +#define unlikely(x) (x) > > I think rather than pollute this file with this sort of thing (which > might have unexpected consequences in the future) this would be better > of placed in a separate file compiled for both regular and stub use. > > > + > > +static inline uint_fast16_t le16_to_cpup(const unsigned char *buf) > > +{ > > + return buf[0] | (buf[1] << 8); > > +} > > + > > +static inline uint_fast32_t le32_to_cpup(const unsigned char *buf) > > +{ > > + return le16_to_cpup(buf) | ((uint32_t)le16_to_cpup(buf + 2) << 16); > > +} > > + > > +#include "../../xen/common/lz4/decompress.c" > > + > > +/* > > + * Note: Uncompressed chunk size is used in the compressor side > > + * (userspace side for compression). > > + * It is hardcoded because there is not proper way to extract it > > + * from the binary stream which is generated by the preliminary > > + * version of LZ4 tool so far. > > + */ > > +#define LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE (8 << 20) > > +#define ARCHIVE_MAGICNUMBER 0x184C2102 > > + > > +static int xc_try_lz4_decode( > > + struct xc_dom_image *dom, void **blob, size_t *psize) > > +{ > > + int ret = -1; > > + size_t uncomp_chunksize = LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE; > > + unsigned char *inp = *blob, *output, *outp; > > + ssize_t size = *psize - 4; > > + size_t out_size, out_len, dest_len, chunksize; > > + const char *msg; > > + > > + if (size < 4) { > > + msg = "input too small"; > > + goto exit_0; > > The exit_0 loop is after the error print, so these errors don''t get > printed. > > I don''t really like exit_0 and exit_2 as names (what happened to > exit_1?). "err" and "err_free" would be more usual. > > > > + } > > + > > + out_size = out_len = get_unaligned_le32(inp + size); > > + if (xc_dom_kernel_check_size(dom, out_len)) { > > + msg = "Decompressed image too large"; > > + goto exit_0; > > + } > > + > > + output = malloc(out_len); > > + if (!output) { > > + msg = "Could not allocate output buffer"; > > + goto exit_0; > > + } > > + outp = output; > > + > > + chunksize = get_unaligned_le32(inp); > > + if (chunksize == ARCHIVE_MAGICNUMBER) { > > + inp += 4; > > + size -= 4; > > + } else { > > + msg = "invalid header"; > > + goto exit_2; > > + } > > + > > + for (;;) { > > + if (size < 4) { > > + msg = "missing data"; > > + goto exit_2; > > + } > > + chunksize = get_unaligned_le32(inp); > > + if (chunksize == ARCHIVE_MAGICNUMBER) { > > + inp += 4; > > + size -= 4; > > + continue; > > + } > > + inp += 4; > > + size -= 4; > > We know size is at least 4 from the check at the head of the loop, but > now we subtracted 4 so size could be 0, but lz4_decompress doesn''t take > size as an argument so whatever it does it isn''t basing the return value > in &chunksize on it, so I assume it reads over the end of the buffer via > inp. > > I didn''t look to see what the minimum number of bytes which > lz4_decompress will consume is, but I think we need a check of some > description. > > > + > > + if (out_len >= uncomp_chunksize) { > > + dest_len = uncomp_chunksize; > > + out_len -= dest_len; > > + } else > > + dest_len = out_len; > > + ret = lz4_decompress(inp, &chunksize, outp, dest_len); > > + if (ret < 0) { > > + msg = "decoding failed"; > > + goto exit_2; > > + } > > + > > + outp += dest_len; > > + size -= chunksize; > > + > > + if (size == 0) > > + { > > + *blob = output; > > + *psize = out_size; > > + return 0; > > + } > > + > > + if (size < 0) { > > + msg = "data corrupted"; > > + goto exit_2; > > + } > > + > > + inp += chunksize; > > + } > > + > > +exit_2: > > + free(output); > > + DOMPRINTF("LZ4 decompression error: %s\n", msg); > > +exit_0: > > + return ret; > > +} > > + > > struct setup_header { > > uint8_t _pad0[0x1f1]; /* skip uninteresting stuff */ > > uint8_t setup_sects; > > @@ -733,6 +853,17 @@ static int xc_dom_probe_bzimage_kernel(s > > return -EINVAL; > > } > > } > > + else if ( check_magic(dom, "\x02\x21", 2) ) > > + { > > + ret = xc_try_lz4_decode(dom, &dom->kernel_blob, > &dom->kernel_size); > > + if ( ret < 0 ) > > + { > > + xc_dom_panic(dom->xch, XC_INVALID_KERNEL, > > + "%s unable to LZ4 decompress kernel\n", > > + __FUNCTION__); > > + return -EINVAL; > > + } > > + } > > else > > { > > xc_dom_panic(dom->xch, XC_INVALID_KERNEL, > > --- a/xen/common/lz4/decompress.c > > +++ b/xen/common/lz4/decompress.c > > @@ -151,7 +151,7 @@ _output_error: > > return (int) (-(((unsigned char *)ip) - source)); > > } > > > > -#ifndef __XEN__ > > +#if !defined(__XEN__) && !defined(__XEN_TOOLS__) > > static int lz4_uncompress_unknownoutputsize(const char *source, char > *dest, > > int isize, size_t maxoutputsize) > > { > > @@ -294,7 +294,7 @@ exit_0: > > return ret; > > } > > > > -#ifndef __XEN__ > > +#if !defined(__XEN__) && !defined(__XEN_TOOLS__) > > int lz4_decompress_unknownoutputsize(const char *src, size_t src_len, > > char *dest, size_t *dest_len) > > { > > > > > > >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
>>> On 24.09.13 at 18:53, Ian Campbell <Ian.Campbell@citrix.com> wrote: > On Tue, 2013-09-24 at 17:29 +0100, Jan Beulich wrote: >> @@ -573,6 +575,124 @@ int xc_try_xz_decode(struct xc_dom_image >> >> #endif /* !__MINIOS__ */ >> >> +#define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS >> +#define STATIC static >> +#define u8 uint8_t >> +#define u16 uint16_t >> +#define u32 uint32_t >> +#define u64 uint64_t >> +#define INIT >> +#define unlikely(x) (x) > > I think rather than pollute this file with this sort of thing (which > might have unexpected consequences in the future) this would be better > of placed in a separate file compiled for both regular and stub use.So you mean to move the whole thing quoted above down to ...>> + >> +static inline uint_fast16_t le16_to_cpup(const unsigned char *buf) >> +{ >> + return buf[0] | (buf[1] << 8); >> +} >> + >> +static inline uint_fast32_t le32_to_cpup(const unsigned char *buf) >> +{ >> + return le16_to_cpup(buf) | ((uint32_t)le16_to_cpup(buf + 2) << 16); >> +} >> + >> +#include "../../xen/common/lz4/decompress.c"... here out into a separate file. I can surely do that. Any preference for the name (xc_dom_decompress_unsafe_lz4.c would seem the prime candidate, albeit the other similarly named files have a slightly different purpose). The main problem doing so is going to be the uses of get_unaligned_le32() in xc_try_lz4_decode() - they''d need to be adjusted, increasing the amount of change from the original. And lz4_decompress() would the also need to be declared in some (new?) header. All of which didn''t seem very desirable...>> + >> +/* >> + * Note: Uncompressed chunk size is used in the compressor side >> + * (userspace side for compression). >> + * It is hardcoded because there is not proper way to extract it >> + * from the binary stream which is generated by the preliminary >> + * version of LZ4 tool so far. >> + */ >> +#define LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE (8 << 20) >> +#define ARCHIVE_MAGICNUMBER 0x184C2102 >> + >> +static int xc_try_lz4_decode( >> + struct xc_dom_image *dom, void **blob, size_t *psize) >> +{ >> + int ret = -1; >> + size_t uncomp_chunksize = LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE; >> + unsigned char *inp = *blob, *output, *outp; >> + ssize_t size = *psize - 4; >> + size_t out_size, out_len, dest_len, chunksize; >> + const char *msg; >> + >> + if (size < 4) { >> + msg = "input too small"; >> + goto exit_0; > > The exit_0 loop is after the error print, so these errors don''t get > printed.Oops - fixed.> I don''t really like exit_0 and exit_2 as names (what happened to > exit_1?). "err" and "err_free" would be more usual.I didn''t like those too, but tried to modify the original as little as possible. exit_1 disappeared as having got orphaned in the process of the adaptation.>> + } >> + >> + out_size = out_len = get_unaligned_le32(inp + size); >> + if (xc_dom_kernel_check_size(dom, out_len)) { >> + msg = "Decompressed image too large"; >> + goto exit_0; >> + } >> + >> + output = malloc(out_len); >> + if (!output) { >> + msg = "Could not allocate output buffer"; >> + goto exit_0; >> + } >> + outp = output; >> + >> + chunksize = get_unaligned_le32(inp); >> + if (chunksize == ARCHIVE_MAGICNUMBER) { >> + inp += 4; >> + size -= 4; >> + } else { >> + msg = "invalid header"; >> + goto exit_2; >> + } >> + >> + for (;;) { >> + if (size < 4) { >> + msg = "missing data"; >> + goto exit_2; >> + } >> + chunksize = get_unaligned_le32(inp); >> + if (chunksize == ARCHIVE_MAGICNUMBER) { >> + inp += 4; >> + size -= 4; >> + continue; >> + } >> + inp += 4; >> + size -= 4; > > We know size is at least 4 from the check at the head of the loop, but > now we subtracted 4 so size could be 0, but lz4_decompress doesn''t take > size as an argument so whatever it does it isn''t basing the return value > in &chunksize on it, so I assume it reads over the end of the buffer via > inp. > > I didn''t look to see what the minimum number of bytes which > lz4_decompress will consume is, but I think we need a check of some > description.Sadly there doesn''t appear to be a minimum - the main loop in lz4_uncompress() has just a single exit point, which isn''t dependent upon the input size at all (the function doesn''t even know how much input there is). Immediately before the call to lz4_decompress() chunksize gets read, but I can''t tell whether we could expect that to be the required amount of bytes to follow. Yann - can you clarify this? Jan
On Wed, 2013-09-25 at 08:23 +0100, Jan Beulich wrote:> >>> On 24.09.13 at 18:53, Ian Campbell <Ian.Campbell@citrix.com> wrote: > > On Tue, 2013-09-24 at 17:29 +0100, Jan Beulich wrote: > >> @@ -573,6 +575,124 @@ int xc_try_xz_decode(struct xc_dom_image > >> > >> #endif /* !__MINIOS__ */ > >> > >> +#define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS > >> +#define STATIC static > >> +#define u8 uint8_t > >> +#define u16 uint16_t > >> +#define u32 uint32_t > >> +#define u64 uint64_t > >> +#define INIT > >> +#define unlikely(x) (x) > > > > I think rather than pollute this file with this sort of thing (which > > might have unexpected consequences in the future) this would be better > > of placed in a separate file compiled for both regular and stub use. > > So you mean to move the whole thing quoted above down to ... > > >> + > >> +static inline uint_fast16_t le16_to_cpup(const unsigned char *buf) > >> +{ > >> + return buf[0] | (buf[1] << 8); > >> +} > >> + > >> +static inline uint_fast32_t le32_to_cpup(const unsigned char *buf) > >> +{ > >> + return le16_to_cpup(buf) | ((uint32_t)le16_to_cpup(buf + 2) << 16); > >> +} > >> + > >> +#include "../../xen/common/lz4/decompress.c" > > ... here out into a separate file. I can surely do that.I was thinking to include xc_try_lz4_decode too, but this would work well, whichever you prefer.> Any preference for the name (xc_dom_decompress_unsafe_lz4.c > would seem the prime candidate, albeit the other similarly > named files have a slightly different purpose).Following the pattern used by the existing unsafe decompressors would seem reasonable.> The main problem doing so is going to be the uses of > get_unaligned_le32() in xc_try_lz4_decode() - they''d need to > be adjusted, increasing the amount of change from the original. > And lz4_decompress() would the also need to be declared in > some (new?) header. All of which didn''t seem very desirable...In which case maybe moving xc_try.. too is the right answer, there is already a list of prototypes for the others (albeit in a minios ifdef)> > I don''t really like exit_0 and exit_2 as names (what happened to > > exit_1?). "err" and "err_free" would be more usual. > > I didn''t like those too, but tried to modify the original as little > as possible. exit_1 disappeared as having got orphaned in the > process of the adaptation.I didn''t realise this had come from elsewhere rather than being newly written xc glue code. I suppose its fine then.> > >> + } > >> + > >> + out_size = out_len = get_unaligned_le32(inp + size); > >> + if (xc_dom_kernel_check_size(dom, out_len)) { > >> + msg = "Decompressed image too large"; > >> + goto exit_0; > >> + } > >> + > >> + output = malloc(out_len); > >> + if (!output) { > >> + msg = "Could not allocate output buffer"; > >> + goto exit_0; > >> + } > >> + outp = output; > >> + > >> + chunksize = get_unaligned_le32(inp); > >> + if (chunksize == ARCHIVE_MAGICNUMBER) { > >> + inp += 4; > >> + size -= 4; > >> + } else { > >> + msg = "invalid header"; > >> + goto exit_2; > >> + } > >> + > >> + for (;;) { > >> + if (size < 4) { > >> + msg = "missing data"; > >> + goto exit_2; > >> + } > >> + chunksize = get_unaligned_le32(inp); > >> + if (chunksize == ARCHIVE_MAGICNUMBER) { > >> + inp += 4; > >> + size -= 4; > >> + continue; > >> + } > >> + inp += 4; > >> + size -= 4; > > > > We know size is at least 4 from the check at the head of the loop, but > > now we subtracted 4 so size could be 0, but lz4_decompress doesn''t take > > size as an argument so whatever it does it isn''t basing the return value > > in &chunksize on it, so I assume it reads over the end of the buffer via > > inp. > > > > I didn''t look to see what the minimum number of bytes which > > lz4_decompress will consume is, but I think we need a check of some > > description. > > Sadly there doesn''t appear to be a minimum - the main loop in > lz4_uncompress() has just a single exit point, which isn''t > dependent upon the input size at all (the function doesn''t even > know how much input there is). > > Immediately before the call to lz4_decompress() chunksize gets > read, but I can''t tell whether we could expect that to be the > required amount of bytes to follow. Yann - can you clarify this?Whatever expectations we might have from the encoder a malicious attacker can construct a byte stream which violates it and has ARCHIVE_MAGICNUMBER as the last thing. Under the circumstances I don''t see how lz4_decompress() can be safely used on untrusted data since it has no knowledge of the remaining input length and therefore no way to avoid running off the end of the data. Digging down a couple of layers of callchain it seems like this knowledge isn''t use anywhere (i.e. it''s not just lost in the outermost API. This is a blocker for inclusion IMHO. Ian.
On Tue, 2013-09-24 at 21:55 +0200, Yann Collet wrote:> Consequently, a compressed-block of size 0 is not supposed to exist > (break format).Unfortunately an attacker is free to break the format. The question is what will the decoder do when faced with such invalid inputs? It seems like such concerns were not considered at all during implementation? (which is fair enough, since the data stream is implicitly trusted in the original target use case) Ian.
There are 2 families of decoding functions within LZ4 : 1) LZ4_decompress_fast* : These decoding functions must be used with trusted sources only. They only guarantee that they will write exactly the size of output buffer, but cannot guarantee anything regarding input buffer, since its size is unknown. (by the way, the amount of bytes read into input buffer is the result of the function). 2) LZ4_decompress_safe* : These decoding functions are protected against malicious input. It resists fuzzer attack. This is the recommended choice for "general decompression usage". Looking at the kernel code, at https://github.com/torvalds/linux/blob/master/lib/lz4/lz4_decompress.c the naming seems different, but both variants are still there : lz4_decompress is the equivalent of LZ4_decompress_fast. lz4_decompress_unknownoutputsize is the equivalent of LZ4_decompress_safe. I would recommend to use the second one for untrusted sources. Regards 2013/9/25 Ian Campbell <Ian.Campbell@citrix.com>> On Tue, 2013-09-24 at 21:55 +0200, Yann Collet wrote: > > Consequently, a compressed-block of size 0 is not supposed to exist > > (break format). > > Unfortunately an attacker is free to break the format. The question is > what will the decoder do when faced with such invalid inputs? > > It seems like such concerns were not considered at all during > implementation? (which is fair enough, since the data stream is > implicitly trusted in the original target use case) > > Ian. > > >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
>>> On 25.09.13 at 10:06, Yann Collet <yann.collet.73@gmail.com> wrote: > There are 2 families of decoding functions within LZ4 : > > 1) LZ4_decompress_fast* : These decoding functions must be used with > trusted sources only. They only guarantee that they will write exactly the > size of output buffer, but cannot guarantee anything regarding input > buffer, since its size is unknown. (by the way, the amount of bytes read > into input buffer is the result of the function). > > 2) LZ4_decompress_safe* : These decoding functions are protected against > malicious input. It resists fuzzer attack. This is the recommended choice > for "general decompression usage". > > Looking at the kernel code, at > https://github.com/torvalds/linux/blob/master/lib/lz4/lz4_decompress.c > the naming seems different, but both variants are still there : > > lz4_decompress is the equivalent of LZ4_decompress_fast. > lz4_decompress_unknownoutputsize is the equivalent of LZ4_decompress_safe. > > I would recommend to use the second one for untrusted sources.Ah, okay. Will re-do the DomU patch then. Jan
Linux 3.11 added respective support, so I think we should follow suit. 1: xen: add LZ4 decompression support 2: libxc: add LZ4 decompression support Signed-off-by: Jan Beulich <jbeulich@suse.com> --- v3: Tools side adjustments: Fix placement of DOMPRINTF() invocation. Split implementation into its own source file. Use the safe decompressor variant in non-Mini-OS libxc (lz4_decompress_unknownoutputsize()). Minor adjustments to hypervisor side avoid the tools side patch having to touch the former.
xen: add LZ4 decompression support Add support for LZ4 decompression in Xen. LZ4 Decompression APIs for Xen are based on LZ4 implementation by Yann Collet. Benchmark Results(PATCH v3) Compiler: Linaro ARM gcc 4.6.2 1. ARMv7, 1.5GHz based board Kernel: linux 3.4 Uncompressed Kernel Size: 14MB Compressed Size Decompression Speed LZO 6.7MB 20.1MB/s, 25.2MB/s(UA) LZ4 7.3MB 29.1MB/s, 45.6MB/s(UA) 2. ARMv7, 1.7GHz based board Kernel: linux 3.7 Uncompressed Kernel Size: 14MB Compressed Size Decompression Speed LZO 6.0MB 34.1MB/s, 52.2MB/s(UA) LZ4 6.5MB 86.7MB/s - UA: Unaligned memory Access support - Latest patch set for LZO applied This patch set is for adding support for LZ4-compressed Kernel. LZ4 is a very fast lossless compression algorithm and it also features an extremely fast decoder [1]. But we have five of decompressors already and one question which does arise, however, is that of where do we stop adding new ones? This issue had been discussed and came to the conclusion [2]. Russell King said that we should have: - one decompressor which is the fastest - one decompressor for the highest compression ratio - one popular decompressor (eg conventional gzip) If we have a replacement one for one of these, then it should do exactly that: replace it. The benchmark shows that an 8% increase in image size vs a 66% increase in decompression speed compared to LZO(which has been known as the fastest decompressor in the Kernel). Therefore the "fast but may not be small" compression title has clearly been taken by LZ4 [3]. [1] http://code.google.com/p/lz4/ [2] http://thread.gmane.org/gmane.linux.kbuild.devel/9157 [3] http://thread.gmane.org/gmane.linux.kbuild.devel/9347 LZ4 homepage: http://fastcompression.blogspot.com/p/lz4.html LZ4 source repository: http://code.google.com/p/lz4/ Signed-off-by: Kyungsik Lee <kyungsik.lee@lge.com> Signed-off-by: Yann Collet <yann.collet.73@gmail.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> --- v3: Minor adjustments to avoid the tools side patch having to touch hypervisor side sources. --- a/xen/common/Makefile +++ b/xen/common/Makefile @@ -49,7 +49,7 @@ obj-y += radix-tree.o obj-y += rbtree.o obj-y += lzo.o -obj-bin-$(CONFIG_X86) += $(foreach n,decompress bunzip2 unxz unlzma unlzo earlycpio,$(n).init.o) +obj-bin-$(CONFIG_X86) += $(foreach n,decompress bunzip2 unxz unlzma unlzo unlz4 earlycpio,$(n).init.o) obj-$(perfc) += perfc.o obj-$(crash_debug) += gdbstub.o --- a/xen/common/decompress.c +++ b/xen/common/decompress.c @@ -29,5 +29,8 @@ int __init decompress(void *inbuf, unsig if ( len >= 5 && !memcmp(inbuf, "\x89LZO", 5) ) return unlzo(inbuf, len, NULL, NULL, outbuf, NULL, error); + if ( len >= 2 && !memcmp(inbuf, "\x02\x21", 2) ) + return unlz4(inbuf, len, NULL, NULL, outbuf, NULL, error); + return 1; } --- /dev/null +++ b/xen/common/lz4/decompress.c @@ -0,0 +1,323 @@ +/* + * LZ4 Decompressor for Linux kernel + * + * Copyright (C) 2013, LG Electronics, Kyungsik Lee <kyungsik.lee@lge.com> + * + * Based on LZ4 implementation by Yann Collet. + * + * LZ4 - Fast LZ compression algorithm + * Copyright (C) 2011-2012, Yann Collet. + * BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php) + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions are + * met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following disclaimer + * in the documentation and/or other materials provided with the + * distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + * + * You can contact the author at : + * - LZ4 homepage : http://fastcompression.blogspot.com/p/lz4.html + * - LZ4 source repository : http://code.google.com/p/lz4/ + */ + +#include "defs.h" + +#if defined(__XEN__) || defined(__MINIOS__) + +static int INIT lz4_uncompress(const unsigned char *source, unsigned char *dest, + int osize) +{ + const BYTE *ip = (const BYTE *) source; + const BYTE *ref; + BYTE *op = (BYTE *) dest; + BYTE * const oend = op + osize; + BYTE *cpy; + unsigned token; + size_t length; + size_t dec32table[] = {0, 3, 2, 3, 0, 0, 0, 0}; +#if LZ4_ARCH64 + size_t dec64table[] = {0, 0, 0, -1, 0, 1, 2, 3}; +#endif + + while (1) { + + /* get runlength */ + token = *ip++; + length = (token >> ML_BITS); + if (length == RUN_MASK) { + size_t len; + + len = *ip++; + for (; len == 255; length += 255) + len = *ip++; + length += len; + } + + /* copy literals */ + cpy = op + length; + if (unlikely(cpy > oend - COPYLENGTH)) { + /* + * Error: not enough place for another match + * (min 4) + 5 literals + */ + if (cpy != oend) + goto _output_error; + + memcpy(op, ip, length); + ip += length; + break; /* EOF */ + } + LZ4_WILDCOPY(ip, op, cpy); + ip -= (op - cpy); + op = cpy; + + /* get offset */ + LZ4_READ_LITTLEENDIAN_16(ref, cpy, ip); + ip += 2; + + /* Error: offset create reference outside destination buffer */ + if (unlikely(ref < (BYTE *const) dest)) + goto _output_error; + + /* get matchlength */ + length = token & ML_MASK; + if (length == ML_MASK) { + for (; *ip == 255; length += 255) + ip++; + length += *ip++; + } + + /* copy repeated sequence */ + if (unlikely((op - ref) < STEPSIZE)) { +#if LZ4_ARCH64 + size_t dec64 = dec64table[op - ref]; +#else + const int dec64 = 0; +#endif + op[0] = ref[0]; + op[1] = ref[1]; + op[2] = ref[2]; + op[3] = ref[3]; + op += 4; + ref += 4; + ref -= dec32table[op-ref]; + PUT4(ref, op); + op += STEPSIZE - 4; + ref -= dec64; + } else { + LZ4_COPYSTEP(ref, op); + } + cpy = op + length - (STEPSIZE - 4); + if (cpy > (oend - COPYLENGTH)) { + + /* Error: request to write beyond destination buffer */ + if (cpy > oend) + goto _output_error; + LZ4_SECURECOPY(ref, op, (oend - COPYLENGTH)); + while (op < cpy) + *op++ = *ref++; + op = cpy; + /* + * Check EOF (should never happen, since last 5 bytes + * are supposed to be literals) + */ + if (op == oend) + goto _output_error; + continue; + } + LZ4_SECURECOPY(ref, op, cpy); + op = cpy; /* correction */ + } + /* end of decoding */ + return (int) (ip - source); + + /* write overflow error detected */ +_output_error: + return (int) (-(ip - source)); +} + +#else /* defined(__XEN__) || defined(__MINIOS__) */ + +static int lz4_uncompress_unknownoutputsize(const unsigned char *source, + unsigned char *dest, int isize, + size_t maxoutputsize) +{ + const BYTE *ip = (const BYTE *) source; + const BYTE *const iend = ip + isize; + const BYTE *ref; + + + BYTE *op = (BYTE *) dest; + BYTE * const oend = op + maxoutputsize; + BYTE *cpy; + + size_t dec32table[] = {0, 3, 2, 3, 0, 0, 0, 0}; +#if LZ4_ARCH64 + size_t dec64table[] = {0, 0, 0, -1, 0, 1, 2, 3}; +#endif + + /* Main Loop */ + while (ip < iend) { + + unsigned token; + size_t length; + + /* get runlength */ + token = *ip++; + length = (token >> ML_BITS); + if (length == RUN_MASK) { + int s = 255; + while ((ip < iend) && (s == 255)) { + s = *ip++; + length += s; + } + } + /* copy literals */ + cpy = op + length; + if ((cpy > oend - COPYLENGTH) || + (ip + length > iend - COPYLENGTH)) { + + if (cpy > oend) + goto _output_error;/* writes beyond buffer */ + + if (ip + length != iend) + goto _output_error;/* + * Error: LZ4 format requires + * to consume all input + * at this stage + */ + memcpy(op, ip, length); + op += length; + break;/* Necessarily EOF, due to parsing restrictions */ + } + LZ4_WILDCOPY(ip, op, cpy); + ip -= (op - cpy); + op = cpy; + + /* get offset */ + LZ4_READ_LITTLEENDIAN_16(ref, cpy, ip); + ip += 2; + if (ref < (BYTE * const) dest) + goto _output_error; + /* + * Error : offset creates reference + * outside of destination buffer + */ + + /* get matchlength */ + length = (token & ML_MASK); + if (length == ML_MASK) { + while (ip < iend) { + int s = *ip++; + length += s; + if (s == 255) + continue; + break; + } + } + + /* copy repeated sequence */ + if (unlikely((op - ref) < STEPSIZE)) { +#if LZ4_ARCH64 + size_t dec64 = dec64table[op - ref]; +#else + const int dec64 = 0; +#endif + op[0] = ref[0]; + op[1] = ref[1]; + op[2] = ref[2]; + op[3] = ref[3]; + op += 4; + ref += 4; + ref -= dec32table[op - ref]; + PUT4(ref, op); + op += STEPSIZE - 4; + ref -= dec64; + } else { + LZ4_COPYSTEP(ref, op); + } + cpy = op + length - (STEPSIZE-4); + if (cpy > oend - COPYLENGTH) { + if (cpy > oend) + goto _output_error; /* write outside of buf */ + + LZ4_SECURECOPY(ref, op, (oend - COPYLENGTH)); + while (op < cpy) + *op++ = *ref++; + op = cpy; + /* + * Check EOF (should never happen, since last 5 bytes + * are supposed to be literals) + */ + if (op == oend) + goto _output_error; + continue; + } + LZ4_SECURECOPY(ref, op, cpy); + op = cpy; /* correction */ + } + /* end of decoding */ + return (int) (op - dest); + + /* write overflow error detected */ +_output_error: + return (int) (-(ip - source)); +} + +#endif + +#if defined(__XEN__) || defined(__MINIOS__) + +int INIT lz4_decompress(const unsigned char *src, size_t *src_len, + unsigned char *dest, size_t actual_dest_len) +{ + int ret = -1; + int input_len = 0; + + input_len = lz4_uncompress(src, dest, actual_dest_len); + if (input_len < 0) + goto exit_0; + *src_len = input_len; + + return 0; +exit_0: + return ret; +} + +#else /* defined(__XEN__) || defined(__MINIOS__) */ + +int lz4_decompress_unknownoutputsize(const unsigned char *src, size_t src_len, + unsigned char *dest, size_t *dest_len) +{ + int ret = -1; + int out_len = 0; + + out_len = lz4_uncompress_unknownoutputsize(src, dest, src_len, + *dest_len); + if (out_len < 0) + goto exit_0; + *dest_len = out_len; + + return 0; +exit_0: + return ret; +} + +#endif --- /dev/null +++ b/xen/common/lz4/defs.h @@ -0,0 +1,184 @@ +/* + * lz4defs.h -- architecture specific defines + * + * Copyright (C) 2013, LG Electronics, Kyungsik Lee <kyungsik.lee@lge.com> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + */ + +#ifdef __XEN__ +#include <asm/byteorder.h> +#endif + +#ifdef __LITTLE_ENDIAN +static inline u16 INIT get_unaligned_le16(const void *p) +{ + return le16_to_cpup(p); +} + +static inline u32 INIT get_unaligned_le32(const void *p) +{ + return le32_to_cpup(p); +} +#else +#include <asm/unaligned.h> + +static inline u16 INIT get_unaligned_le16(const void *p) +{ + return le16_to_cpu(__get_unaligned(p, 2)); +} + +static inline u32 INIT get_unaligned_le32(void *p) +{ + return le32_to_cpu(__get_unaligned(p, 4)); +} +#endif + +/* + * Detects 64 bits mode + */ +#if (defined(__x86_64__) || defined(__x86_64) || defined(__amd64__) \ + || defined(__ppc64__) || defined(__LP64__)) +#define LZ4_ARCH64 1 +#else +#define LZ4_ARCH64 0 +#endif + +/* + * Architecture-specific macros + */ +#define BYTE u8 +typedef struct _U16_S { u16 v; } U16_S; +typedef struct _U32_S { u32 v; } U32_S; +typedef struct _U64_S { u64 v; } U64_S; +#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) \ + || defined(CONFIG_ARM) && __LINUX_ARM_ARCH__ >= 6 \ + && defined(ARM_EFFICIENT_UNALIGNED_ACCESS) + +#define A16(x) (((U16_S *)(x))->v) +#define A32(x) (((U32_S *)(x))->v) +#define A64(x) (((U64_S *)(x))->v) + +#define PUT4(s, d) (A32(d) = A32(s)) +#define PUT8(s, d) (A64(d) = A64(s)) +#define LZ4_WRITE_LITTLEENDIAN_16(p, v) \ + do { \ + A16(p) = v; \ + p += 2; \ + } while (0) +#else /* CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS */ + +#define A64(x) get_unaligned((u64 *)&(((U16_S *)(x))->v)) +#define A32(x) get_unaligned((u32 *)&(((U16_S *)(x))->v)) +#define A16(x) get_unaligned((u16 *)&(((U16_S *)(x))->v)) + +#define PUT4(s, d) \ + put_unaligned(get_unaligned((const u32 *) s), (u32 *) d) +#define PUT8(s, d) \ + put_unaligned(get_unaligned((const u64 *) s), (u64 *) d) + +#define LZ4_WRITE_LITTLEENDIAN_16(p, v) \ + do { \ + put_unaligned(v, (u16 *)(p)); \ + p += 2; \ + } while (0) +#endif + +#define COPYLENGTH 8 +#define ML_BITS 4 +#define ML_MASK ((1U << ML_BITS) - 1) +#define RUN_BITS (8 - ML_BITS) +#define RUN_MASK ((1U << RUN_BITS) - 1) +#define MEMORY_USAGE 14 +#define MINMATCH 4 +#define SKIPSTRENGTH 6 +#define LASTLITERALS 5 +#define MFLIMIT (COPYLENGTH + MINMATCH) +#define MINLENGTH (MFLIMIT + 1) +#define MAXD_LOG 16 +#define MAXD (1 << MAXD_LOG) +#define MAXD_MASK (u32)(MAXD - 1) +#define MAX_DISTANCE (MAXD - 1) +#define HASH_LOG (MAXD_LOG - 1) +#define HASHTABLESIZE (1 << HASH_LOG) +#define MAX_NB_ATTEMPTS 256 +#define OPTIMAL_ML (int)((ML_MASK-1)+MINMATCH) +#define LZ4_64KLIMIT ((1<<16) + (MFLIMIT - 1)) +#define HASHLOG64K ((MEMORY_USAGE - 2) + 1) +#define HASH64KTABLESIZE (1U << HASHLOG64K) +#define LZ4_HASH_VALUE(p) (((A32(p)) * 2654435761U) >> \ + ((MINMATCH * 8) - (MEMORY_USAGE-2))) +#define LZ4_HASH64K_VALUE(p) (((A32(p)) * 2654435761U) >> \ + ((MINMATCH * 8) - HASHLOG64K)) +#define HASH_VALUE(p) (((A32(p)) * 2654435761U) >> \ + ((MINMATCH * 8) - HASH_LOG)) + +#if LZ4_ARCH64/* 64-bit */ +#define STEPSIZE 8 + +#define LZ4_COPYSTEP(s, d) \ + do { \ + PUT8(s, d); \ + d += 8; \ + s += 8; \ + } while (0) + +#define LZ4_COPYPACKET(s, d) LZ4_COPYSTEP(s, d) + +#define LZ4_SECURECOPY(s, d, e) \ + do { \ + if (d < e) { \ + LZ4_WILDCOPY(s, d, e); \ + } \ + } while (0) +#define HTYPE u32 + +#ifdef __BIG_ENDIAN +#define LZ4_NBCOMMONBYTES(val) (__builtin_clzll(val) >> 3) +#else +#define LZ4_NBCOMMONBYTES(val) (__builtin_ctzll(val) >> 3) +#endif + +#else /* 32-bit */ +#define STEPSIZE 4 + +#define LZ4_COPYSTEP(s, d) \ + do { \ + PUT4(s, d); \ + d += 4; \ + s += 4; \ + } while (0) + +#define LZ4_COPYPACKET(s, d) \ + do { \ + LZ4_COPYSTEP(s, d); \ + LZ4_COPYSTEP(s, d); \ + } while (0) + +#define LZ4_SECURECOPY LZ4_WILDCOPY +#define HTYPE const u8* + +#ifdef __BIG_ENDIAN +#define LZ4_NBCOMMONBYTES(val) (__builtin_clz(val) >> 3) +#else +#define LZ4_NBCOMMONBYTES(val) (__builtin_ctz(val) >> 3) +#endif + +#endif + +#define LZ4_READ_LITTLEENDIAN_16(d, s, p) \ + (d = s - get_unaligned_le16(p)) + +#define LZ4_WILDCOPY(s, d, e) \ + do { \ + LZ4_COPYPACKET(s, d); \ + } while (d < e) + +#define LZ4_BLINDCOPY(s, d, l) \ + do { \ + u8 *e = (d) + l; \ + LZ4_WILDCOPY(s, d, e); \ + d = e; \ + } while (0) --- /dev/null +++ b/xen/common/unlz4.c @@ -0,0 +1,166 @@ +/* + * Wrapper for decompressing LZ4-compressed kernel, initramfs, and initrd + * + * Copyright (C) 2013, LG Electronics, Kyungsik Lee <kyungsik.lee@lge.com> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + */ + +#include "decompress.h" +#include <xen/lz4.h> +#include "lz4/decompress.c" + +/* + * Note: Uncompressed chunk size is used in the compressor side + * (userspace side for compression). + * It is hardcoded because there is not proper way to extract it + * from the binary stream which is generated by the preliminary + * version of LZ4 tool so far. + */ +#define LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE (8 << 20) +#define ARCHIVE_MAGICNUMBER 0x184C2102 + +STATIC int INIT unlz4(unsigned char *input, unsigned int in_len, + int (*fill)(void *, unsigned int), + int (*flush)(void *, unsigned int), + unsigned char *output, + unsigned int *posp, + void (*error)(const char *x)) +{ + int ret = -1; + size_t chunksize = 0; + size_t uncomp_chunksize = LZ4_DEFAULT_UNCOMPRESSED_CHUNK_SIZE; + u8 *inp; + u8 *inp_start; + u8 *outp; + int size = in_len -= 4; +#if defined(__XEN__) || defined(__MINIOS__) + size_t out_len = get_unaligned_le32(input + in_len); +#endif + size_t dest_len; + + + if (output) { + outp = output; + } else if (!flush) { + error("NULL output pointer and no flush function provided"); + goto exit_0; + } else { + outp = large_malloc(uncomp_chunksize); + if (!outp) { + error("Could not allocate output buffer"); + goto exit_0; + } + } + + if (input && fill) { + error("Both input pointer and fill function provided,"); + goto exit_1; + } else if (input) { + inp = input; + } else if (!fill) { + error("NULL input pointer and missing fill function"); + goto exit_1; + } else { + inp = large_malloc(lz4_compressbound(uncomp_chunksize)); + if (!inp) { + error("Could not allocate input buffer"); + goto exit_1; + } + } + inp_start = inp; + + if (posp) + *posp = 0; + + if (fill) + fill(inp, 4); + + chunksize = get_unaligned_le32(inp); + if (chunksize == ARCHIVE_MAGICNUMBER) { + inp += 4; + size -= 4; + } else { + error("invalid header"); + goto exit_2; + } + + if (posp) + *posp += 4; + + for (;;) { + + if (fill) + fill(inp, 4); + + chunksize = get_unaligned_le32(inp); + if (chunksize == ARCHIVE_MAGICNUMBER) { + inp += 4; + size -= 4; + if (posp) + *posp += 4; + continue; + } + inp += 4; + size -= 4; + + if (posp) + *posp += 4; + + if (fill) { + if (chunksize > lz4_compressbound(uncomp_chunksize)) { + error("chunk length is longer than allocated"); + goto exit_2; + } + fill(inp, chunksize); + } +#if defined(__XEN__) || defined(__MINIOS__) + if (out_len >= uncomp_chunksize) { + dest_len = uncomp_chunksize; + out_len -= dest_len; + } else + dest_len = out_len; + ret = lz4_decompress(inp, &chunksize, outp, dest_len); +#else + dest_len = uncomp_chunksize; + ret = lz4_decompress_unknownoutputsize(inp, chunksize, outp, + &dest_len); +#endif + if (ret < 0) { + error("Decoding failed"); + goto exit_2; + } + + if (flush && flush(outp, dest_len) != dest_len) + goto exit_2; + if (output) + outp += dest_len; + if (posp) + *posp += chunksize; + + size -= chunksize; + + if (size == 0) + break; + else if (size < 0) { + error("data corrupted"); + goto exit_2; + } + + inp += chunksize; + if (fill) + inp = inp_start; + } + + ret = 0; +exit_2: + if (!input) + large_free(inp_start); +exit_1: + if (!output) + large_free(outp); +exit_0: + return ret; +} --- a/xen/include/asm-x86/config.h +++ b/xen/include/asm-x86/config.h @@ -25,6 +25,7 @@ #define CONFIG_X86_PM_TIMER 1 #define CONFIG_HPET_TIMER 1 #define CONFIG_X86_MCE_THERMAL 1 +#define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS 1 #define CONFIG_NUMA 1 #define CONFIG_DISCONTIGMEM 1 #define CONFIG_NUMA_EMU 1 --- a/xen/include/xen/decompress.h +++ b/xen/include/xen/decompress.h @@ -31,7 +31,7 @@ typedef int decompress_fn(unsigned char * dependent). */ -decompress_fn bunzip2, unxz, unlzma, unlzo; +decompress_fn bunzip2, unxz, unlzma, unlzo, unlz4; int decompress(void *inbuf, unsigned int len, void *outbuf); --- /dev/null +++ b/xen/include/xen/lz4.h @@ -0,0 +1,88 @@ +#ifndef __LZ4_H__ +#define __LZ4_H__ + +/* + * LZ4 Kernel Interface + * + * Copyright (C) 2013, LG Electronics, Kyungsik Lee <kyungsik.lee@lge.com> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + */ +#define LZ4_MEM_COMPRESS (4096 * sizeof(unsigned char *)) +#define LZ4HC_MEM_COMPRESS (65538 * sizeof(unsigned char *)) + +/* + * lz4_compressbound() + * Provides the maximum size that LZ4 may output in a "worst case" scenario + * (input data not compressible) + */ +static inline size_t lz4_compressbound(size_t isize) +{ + return isize + (isize / 255) + 16; +} + +/* + * lz4_compress() + * src : source address of the original data + * src_len : size of the original data + * dst : output buffer address of the compressed data + * This requires ''dst'' of size LZ4_COMPRESSBOUND. + * dst_len : is the output size, which is returned after compress done + * workmem : address of the working memory. + * This requires ''workmem'' of size LZ4_MEM_COMPRESS. + * return : Success if return 0 + * Error if return (< 0) + * note : Destination buffer and workmem must be already allocated with + * the defined size. + */ +int lz4_compress(const unsigned char *src, size_t src_len, + unsigned char *dst, size_t *dst_len, void *wrkmem); + + /* + * lz4hc_compress() + * src : source address of the original data + * src_len : size of the original data + * dst : output buffer address of the compressed data + * This requires ''dst'' of size LZ4_COMPRESSBOUND. + * dst_len : is the output size, which is returned after compress done + * workmem : address of the working memory. + * This requires ''workmem'' of size LZ4HC_MEM_COMPRESS. + * return : Success if return 0 + * Error if return (< 0) + * note : Destination buffer and workmem must be already allocated with + * the defined size. + */ +int lz4hc_compress(const unsigned char *src, size_t src_len, + unsigned char *dst, size_t *dst_len, void *wrkmem); + +/* + * lz4_decompress() + * src : source address of the compressed data + * src_len : is the input size, whcih is returned after decompress done + * dest : output buffer address of the decompressed data + * actual_dest_len: is the size of uncompressed data, supposing it''s known + * return : Success if return 0 + * Error if return (< 0) + * note : Destination buffer must be already allocated. + * slightly faster than lz4_decompress_unknownoutputsize() + */ +int lz4_decompress(const unsigned char *src, size_t *src_len, + unsigned char *dest, size_t actual_dest_len); + +/* + * lz4_decompress_unknownoutputsize() + * src : source address of the compressed data + * src_len : is the input size, therefore the compressed size + * dest : output buffer address of the decompressed data + * dest_len: is the max size of the destination buffer, which is + * returned with actual size of decompressed data after + * decompress done + * return : Success if return 0 + * Error if return (< 0) + * note : Destination buffer must be already allocated. + */ +int lz4_decompress_unknownoutputsize(const unsigned char *src, size_t src_len, + unsigned char *dest, size_t *dest_len); +#endif _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Since there''s no shared or static library to link against, this simply re-uses the hypervisor side code. However, I only audited the code added here for possible security issues, not the referenced code in the hypervisor tree. Signed-off-by: Jan Beulich <jbeulich@suse.com> --- v3: Fix placement of DOMPRINTF() invocation. Split implementation into its own source file. Use the safe decompressor variant in non- Mini-OS libxc (lz4_decompress_unknownoutputsize()). --- I intentionally retained the tab indentation in the code cloned from its hypervisor original (which in turn retained it as being a clone of the Linux original), to ease diff-ing. --- a/tools/libxc/Makefile +++ b/tools/libxc/Makefile @@ -63,6 +63,7 @@ $(patsubst %.c,%.opic,$(ELF_SRCS-y)): CF GUEST_SRCS-y += xc_dom_core.c xc_dom_boot.c GUEST_SRCS-y += xc_dom_elfloader.c GUEST_SRCS-$(CONFIG_X86) += xc_dom_bzimageloader.c +GUEST_SRCS-$(CONFIG_X86) += xc_dom_decompress_lz4.c GUEST_SRCS-$(CONFIG_ARM) += xc_dom_armzimageloader.c GUEST_SRCS-y += xc_dom_binloader.c GUEST_SRCS-y += xc_dom_compat_linux.c --- a/tools/libxc/xc_dom_bzimageloader.c +++ b/tools/libxc/xc_dom_bzimageloader.c @@ -33,7 +33,7 @@ #include <inttypes.h> #include "xg_private.h" -#include "xc_dom.h" +#include "xc_dom_decompress.h" #ifndef __MINIOS__ @@ -733,6 +733,17 @@ static int xc_dom_probe_bzimage_kernel(s return -EINVAL; } } + else if ( check_magic(dom, "\x02\x21", 2) ) + { + ret = xc_try_lz4_decode(dom, &dom->kernel_blob, &dom->kernel_size); + if ( ret < 0 ) + { + xc_dom_panic(dom->xch, XC_INVALID_KERNEL, + "%s unable to LZ4 decompress kernel\n", + __FUNCTION__); + return -EINVAL; + } + } else { xc_dom_panic(dom->xch, XC_INVALID_KERNEL, --- /dev/null +++ b/tools/libxc/xc_dom_decompress.h @@ -0,0 +1,8 @@ +#ifndef __MINIOS__ +# include "xc_dom.h" +#else +# include "xc_dom_decompress_unsafe.h" +#endif + +int xc_try_lz4_decode(struct xc_dom_image *dom, void **blob, size_t *size); + --- /dev/null +++ b/tools/libxc/xc_dom_decompress_lz4.c @@ -0,0 +1,132 @@ +#include <stdio.h> +#include <stdlib.h> +#include <inttypes.h> +#include <endian.h> +#include <stdint.h> + +#include "xg_private.h" +#include "xc_dom_decompress.h" + +#define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS + +typedef uint8_t u8; +typedef uint16_t u16; +typedef uint32_t u32; +typedef uint64_t u64; + +#define likely(a) a +#define unlikely(a) a + +static inline uint_fast16_t le16_to_cpup(const unsigned char *buf) +{ + return buf[0] | (buf[1] << 8); +} + +static inline uint_fast32_t le32_to_cpup(const unsigned char *buf) +{ + return le16_to_cpup(buf) | ((uint32_t)le16_to_cpup(buf + 2) << 16); +} + +#include "../../xen/include/xen/lz4.h" +#include "../../xen/common/decompress.h" + +#ifndef __MINIOS__ + +#include "../../xen/common/lz4/decompress.c" + +#define ARCHIVE_MAGICNUMBER 0x184C2102 + +int xc_try_lz4_decode( + struct xc_dom_image *dom, void **blob, size_t *psize) +{ + int ret = -1; + unsigned char *inp = *blob, *output, *outp; + ssize_t size = *psize - 4; + size_t out_len, dest_len, chunksize; + const char *msg; + + if (size < 4) { + msg = "input too small"; + goto exit_0; + } + + out_len = get_unaligned_le32(inp + size); + if (xc_dom_kernel_check_size(dom, out_len)) { + msg = "Decompressed image too large"; + goto exit_0; + } + + output = malloc(out_len); + if (!output) { + msg = "Could not allocate output buffer"; + goto exit_0; + } + outp = output; + + chunksize = get_unaligned_le32(inp); + if (chunksize == ARCHIVE_MAGICNUMBER) { + inp += 4; + size -= 4; + } else { + msg = "invalid header"; + goto exit_2; + } + + for (;;) { + if (size < 4) { + msg = "missing data"; + goto exit_2; + } + chunksize = get_unaligned_le32(inp); + if (chunksize == ARCHIVE_MAGICNUMBER) { + inp += 4; + size -= 4; + continue; + } + inp += 4; + size -= 4; + + dest_len = out_len - (outp - output); + ret = lz4_decompress_unknownoutputsize(inp, chunksize, outp, + &dest_len); + if (ret < 0) { + msg = "decoding failed"; + goto exit_2; + } + + outp += dest_len; + size -= chunksize; + + if (size == 0) + { + *blob = output; + *psize = out_len; + return 0; + } + + if (size < 0) { + msg = "data corrupted"; + goto exit_2; + } + + inp += chunksize; + } + +exit_2: + free(output); +exit_0: + DOMPRINTF("LZ4 decompression error: %s\n", msg); + return ret; +} + +#else /* __MINIOS__ */ + +#include "../../xen/common/unlz4.c" + +int xc_try_lz4_decode( + struct xc_dom_image *dom, void **blob, size_t *size) +{ + return xc_dom_decompress_unsafe(unlz4, dom, blob, size); +} + +#endif _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
On 30/09/2013 12:40, "Jan Beulich" <JBeulich@suse.com> wrote:> Linux 3.11 added respective support, so I think we should follow > suit. > > 1: xen: add LZ4 decompression support > 2: libxc: add LZ4 decompression support > > Signed-off-by: Jan Beulich <jbeulich@suse.com>Acked-by: Keir Fraser <keir@xen.org>> --- > v3: Tools side adjustments: Fix placement of DOMPRINTF() > invocation. Split implementation into its own source file. Use the > safe decompressor variant in non-Mini-OS libxc > (lz4_decompress_unknownoutputsize()). Minor adjustments to > hypervisor side avoid the tools side patch having to touch the > former. > >
Ian Campbell
2013-Oct-04 11:16 UTC
Re: [PATCH v3 2/2] libxc: add LZ4 decompression support
On Mon, 2013-09-30 at 12:45 +0100, Jan Beulich wrote:> + chunksize = get_unaligned_le32(inp); > + if (chunksize == ARCHIVE_MAGICNUMBER) { > + inp += 4; > + size -= 4; > + continue; > + } > + inp += 4; > + size -= 4;I think you want a chunksize < size check here, in case a malicious file tries to run off the end. The rest looks good, so with e.g. this inserted: if (chunksize < size) { msg = "insufficient input data"; goto exit_2; } Acked-by: Ian Campbell <ian.campbell@citrix.com>
>>> On 04.10.13 at 13:16, Ian Campbell <Ian.Campbell@citrix.com> wrote: > On Mon, 2013-09-30 at 12:45 +0100, Jan Beulich wrote: > >> + chunksize = get_unaligned_le32(inp); >> + if (chunksize == ARCHIVE_MAGICNUMBER) { >> + inp += 4; >> + size -= 4; >> + continue; >> + } >> + inp += 4; >> + size -= 4; > > I think you want a chunksize < size check here, in case a malicious file > tries to run off the end. > > The rest looks good, so with e.g. this inserted: > if (chunksize < size) {Did you perhaps mean the opposite if (chunksize > size) { Jan> msg = "insufficient input data"; > goto exit_2; > } > > Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Campbell
2013-Oct-04 14:30 UTC
Re: [PATCH v3 2/2] libxc: add LZ4 decompression support
On Fri, 2013-10-04 at 15:19 +0100, Jan Beulich wrote:> >>> On 04.10.13 at 13:16, Ian Campbell <Ian.Campbell@citrix.com> wrote: > > On Mon, 2013-09-30 at 12:45 +0100, Jan Beulich wrote: > > > >> + chunksize = get_unaligned_le32(inp); > >> + if (chunksize == ARCHIVE_MAGICNUMBER) { > >> + inp += 4; > >> + size -= 4; > >> + continue; > >> + } > >> + inp += 4; > >> + size -= 4; > > > > I think you want a chunksize < size check here, in case a malicious file > > tries to run off the end. > > > > The rest looks good, so with e.g. this inserted: > > if (chunksize < size) { > > Did you perhaps mean the opposite > > if (chunksize > size) {Yes. Doh!> > Jan > > > msg = "insufficient input data"; > > goto exit_2; > > } > > > > Acked-by: Ian Campbell <ian.campbell@citrix.com> > > >